The ARM Cortex-A77 is a central processing unit implementing the ARMv8.2-A 64-bit instruction set designed by ARM Holdings' Austin design centre.[1] Released in 2019, ARM claimed an increase of 23% and 35% in integer and floating point performance and 15% higher memory bandwidth over its predecessor, the A76.[1]
Design
The Cortex-A77 serves as the successor of the Cortex-A76. The Cortex-A77 is a 4-wide decode out-of-ordersuperscalar design with a new 1.5K macro-OP (MOPs) cache. It can fetch 4 instructions and 6 Mops per cycle. And rename and dispatch 6 Mops, and 13 μops per cycle. The out-of-order window size has been increased to 160 entries. The backend is 12 execution ports with a 50% increase over Cortex-A76. It has a pipeline depth of 13 stages and the execution latencies of 10 stages.[1][2]
There are six pipelines in the integer cluster – an increase of two additional integer pipelines from Cortex-A76. One of the changes from Cortex-A76 is the unification of the issue queues. Previously each pipeline had its own issue queue. On Cortex-A77, there is now a single unified issue queue which improves efficiency. Cortex-A77 added a new fourth general math ALU with a typical 1-cycle simple math operations and some 2-cycle more complex operations. In total, there are three simple ALUs that perform arithmetic and logical data processing operations and a fourth port which has support for complex arithmetic (e.g. MAC, DIV). Cortex-A77 also added a second branch ALU, doubling the throughput for branches.
There are two ASIMD/FP execution pipelines. This is unchanged from Cortex-A76. What did change is the issue queues. As with the integer cluster, the ASIMD cluster now features a unified issue queue for both pipelines, improving efficiency. As with Cortex-A76, the ASIMD on Cortex-A77 are both 128-bit wide capable of 2 double-precision operations, 4 single-precision, 8 half-precision, or 16 8-bit integer operations. Those pipelines can also execute the cryptographic instructions if the extension is supported (not offered by default and requires an additional license from Arm). Cortex-A77 added a second AES unit in order to improve the throughput of cryptography operations.[3]
Larger ROB, Up to 160-entry, up from 128, Add New L0 MOP cache, can up to 1536-entry.[4]
The core supports unprivileged 32-bit applications, but privileged applications must utilize the 64-bit ARMv8-AISA. It also supports Load acquire (LDAPR) instructions (ARMv8.3-A), Dot Product instructions (ARMv8.4-A), and PSTATE Speculative Store Bypass Safe (SSBS) bit instructions (ARMv8.5-A).
The Cortex-A77 supports ARM's DynamIQ technology, and is expected to be used as high-performance cores in combination with Cortex-A55 power-efficient cores.[1]
Both its predecessor (Cortex-A76) and its successor (Cortex-A78) had automotive variants with Split-Lock capability, the Cortex-A76AE and Cortex-A78AE, but the Cortex-A77 did not, thus not finding its way into security critical applications.