Dipl.-Phys. Ing. Gordon Taft
Code Optimization
[Figure: AMD Phenom II block diagram. Four cores (Core 1 to Core 4), each with its own registers, an L1 cache (64k data, 64k code) and a 512k L2 cache. All four cores share a 6MB L3 cache.]

The diagram shows an example of a state-of-the-art CPU, the AMD Phenom II. To optimize code for a CPU, it is important to know how the CPU works in detail.

The CPU processes data at very high speed, and regular RAM cannot keep up with it. A faster type of memory called cache is therefore used as a buffer (temporary storage) between the two. To get top performance from the CPU, the number of transactions that leave the CPU must be minimized: the more data transfers stay inside the CPU, the better the performance. For this reason, the AMD Phenom II has built-in L1, L2 and L3 caches, which help minimize the data flow in and out of the CPU.
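
As a rough illustration of how the access order affects cache use, the sketch below sums the same matrix in two ways. The function names and the flat row-major layout are assumptions made for this example, not part of the original material. Both functions return the same result; on typical hardware the row-wise version tends to be faster for large matrices, because consecutive accesses fall into a cache line that has already been loaded.

```c
#include <assert.h>
#include <stddef.h>

/* The matrix is stored row by row in one flat array:
   element (r, c) lives at index r * cols + c. */

/* Walks memory in storage order, so neighbouring elements
   are usually already in the cache when they are needed. */
long sum_row_major(const int *m, size_t rows, size_t cols)
{
    assert(m != NULL);
    long total = 0;
    for (size_t r = 0; r < rows; ++r)
        for (size_t c = 0; c < cols; ++c)
            total += m[r * cols + c];
    return total;
}

/* Jumps `cols` elements between accesses. For a large matrix
   nearly every access may touch a different cache line. */
long sum_column_major(const int *m, size_t rows, size_t cols)
{
    assert(m != NULL);
    long total = 0;
    for (size_t c = 0; c < cols; ++c)
        for (size_t r = 0; r < rows; ++r)
            total += m[r * cols + c];
    return total;
}
```

The actual speed difference depends on the matrix size relative to the cache sizes and should be measured on the target machine.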

To write a speed-optimized algorithm, it is necessary to minimize RAM accesses and the data traffic between the cores.
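
One common way to reduce traffic between cores is to let each worker accumulate into a local variable and write its result once, into a slot that does not share a cache line with another worker's slot. The following is a minimal sketch of that data layout, under stated assumptions: the 64-byte cache-line size, the names padded_sum, sum_chunk and sum_partitioned, and the worker count of four (matching the four cores in the diagram) are all chosen for illustration. To keep the example short, the chunks run one after another; in a real program each call to sum_chunk would run on its own thread.

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_LINE 64  /* assumed cache-line size in bytes */
#define WORKERS 4      /* assumed: one worker per core */

/* Each partial result is aligned to its own cache line, so two
   cores never write into the same line. */
struct padded_sum {
    _Alignas(CACHE_LINE) long value;
};

/* Work of one worker: accumulate in a local variable and store
   to shared memory only once at the end. */
void sum_chunk(const int *data, size_t begin, size_t end,
               struct padded_sum *out)
{
    assert(data != NULL && out != NULL);
    long local = 0;
    for (size_t i = begin; i < end; ++i)
        local += data[i];
    out->value = local;
}

/* Splits [0, n) into WORKERS chunks and combines the partial sums.
   The chunks are processed sequentially here (threads omitted). */
long sum_partitioned(const int *data, size_t n)
{
    struct padded_sum partial[WORKERS];
    long total = 0;

    for (size_t w = 0; w < WORKERS; ++w) {
        size_t begin = n * w / WORKERS;
        size_t end = n * (w + 1) / WORKERS;
        sum_chunk(data, begin, end, &partial[w]);
    }
    for (size_t w = 0; w < WORKERS; ++w)
        total += partial[w].value;
    return total;
}
```

The padding costs some memory per worker, which is usually a small price compared to cores repeatedly invalidating each other's cache lines.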