Thread: hardware notes class 11th
Chapter 10. The cache
In the previous chapter, I described two aspects of the ongoing development of new CPUs: increased clock frequencies and the increasing number of transistors being used. Now it is time to look at a very different yet related technology: the processor's connection to the RAM, and the use of the L1 and L2 caches.
The CPU works internally at very high clock frequencies (like 3200 MHz), and no RAM can keep up with these.
The most common RAM speeds are between 266 and 533 MHz, which is just a fraction of the CPU's working speed. So there is a great chasm between the machine (the CPU), which slaves away at perhaps 3200 MHz, and the "conveyor belt", which might only work at 333 MHz and which has to ship the data to and from the RAM. These two subsystems are simply poorly matched to each other.
If nothing could be done about this problem, there would be no reason to develop faster CPUs. If the CPU had to wait for a bus which worked at one sixth of its speed, the CPU would be idle five sixths of the time. And that would be pure waste.
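The "idle five sixths of the time" figure can be checked with a little arithmetic. The sketch below is only an illustration: the function name `idle_fraction` is made up here, and it assumes the simplest possible model, in which the CPU needs one bus transfer per clock tick and there is no cache at all.

```python
def idle_fraction(cpu_mhz: float, bus_mhz: float) -> float:
    """Fraction of CPU clock ticks spent waiting for the bus.

    Simplified model (an assumption, not a measurement): every CPU tick
    needs one bus transfer, and nothing is cached.
    """
    if cpu_mhz <= 0 or bus_mhz <= 0:
        raise ValueError("clock frequencies must be positive")
    # The bus can feed the CPU on at most bus_mhz / cpu_mhz of its ticks.
    busy = min(bus_mhz / cpu_mhz, 1.0)
    return 1.0 - busy


# A bus running at one sixth of the CPU speed leaves the CPU idle
# about five sixths of the time.
print(round(idle_fraction(3200, 3200 / 6), 3))

# The 3200 MHz CPU and 333 MHz bus mentioned in the text.
print(round(idle_fraction(3200, 333), 3))
```

With the 333 MHz bus from the text, this model leaves the CPU waiting on roughly nine ticks out of ten, which is the gap the cache is meant to close.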
The solution is to insert small, intermediate stores of high-speed RAM. These buffers (cache RAM) provide a much more efficient transition between the fast CPU and the slow RAM. Cache RAM operates at higher clock frequencies than normal RAM. Data can therefore be read more quickly from the cache.
Data is constantly being moved
The cache delivers its data to the CPU registers. These are tiny storage units which are placed right inside the processor core, and they are the absolute fastest RAM there is. The size and number of the registers are designed very specifically for each type of CPU.
Fig. 68. Cache RAM is much faster than normal RAM.
The CPU can move data in different sized packets, such as bytes (8 bits), words (16 bits), dwords (32 bits) or blocks (larger groups of bits), and this often involves the registers. The different data packets are constantly moving back and forth:
from the CPU registers to the Level 1 cache.
from the L1 cache to the registers.
from one register to another
from L1 cache to L2 cache, and so on...
The cache stores are a central bridge between the RAM and the registers, which exchange data with the processor's execution units.
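The packet sizes mentioned above (byte, word, dword) can be confirmed with Python's standard `struct` module. This is only a small sketch: the helper name `packet_bits` and the mapping of names to format codes are chosen here for illustration, using the x86 naming in which a word is 16 bits.

```python
import struct

# struct format codes with "<" (standard sizes, little-endian):
# B = unsigned 8-bit, H = unsigned 16-bit, I = unsigned 32-bit.
PACKET_FORMATS = {"byte": "<B", "word": "<H", "dword": "<I"}


def packet_bits(name: str) -> int:
    """Return the size in bits of the named data packet."""
    return struct.calcsize(PACKET_FORMATS[name]) * 8


for name in PACKET_FORMATS:
    print(f"{name}: {packet_bits(name)} bits")
```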
The optimal situation is if the CPU is able to constantly work and fully utilize all clock ticks. This would mean that the registers would have to always be able to fetch the data which the execution units require. But this is not the reality, as the CPU typically only utilizes 35% of its clock ticks. However, without a cache, this utilization would be even lower.
CPU caches are a remedy against a very specific set of "bottleneck" problems. There are lots of "bottlenecks" in the PC: transitions between fast and slower systems, where the fast device has to wait before it can deliver or receive its data. These bottlenecks can have a very detrimental effect on the PC's total performance, so they must be minimised.
Fig. 69. A cache increases the CPU's capacity to fetch the right data from RAM.
The absolute worst bottleneck exists between the CPU and RAM. It is here that we have the heaviest data traffic, and it is in this area that PC manufacturers are expending a lot of energy on new development. Every new generation of CPU brings improvements relating to the front side bus.
The CPU's cache is "intelligent", so that it can reduce the data traffic on the front side bus. The cache controller constantly monitors the CPU's work, and always tries to read in precisely the data the CPU needs. When it is successful, this is called a cache hit. When the cache does not contain the desired data, this is called a cache miss.
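Hits and misses are easier to picture with a toy model. The sketch below assumes a direct-mapped cache with 8 lines of 16 bytes each. The text does not say how the cache is organised, so the mapping scheme, the sizes and the class name `DirectMappedCache` are all illustrative assumptions, not a description of any real CPU.

```python
class DirectMappedCache:
    """Toy direct-mapped cache that only tracks which block each line holds."""

    def __init__(self, num_lines: int = 8, line_size: int = 16):
        self.num_lines = num_lines
        self.line_size = line_size
        self.tags = [None] * num_lines  # None means the line is empty
        self.hits = 0
        self.misses = 0

    def access(self, address: int) -> bool:
        """Look up a byte address; return True on a hit, False on a miss."""
        block = address // self.line_size   # which memory block holds the byte
        index = block % self.num_lines      # the one cache line it may occupy
        tag = block // self.num_lines       # identifies the block in that line
        if self.tags[index] == tag:
            self.hits += 1
            return True
        # Miss: the whole block is fetched from RAM and replaces the old line.
        self.tags[index] = tag
        self.misses += 1
        return False


cache = DirectMappedCache()
for address in range(64):  # read 64 consecutive bytes
    cache.access(address)
print(cache.hits, cache.misses)  # prints "60 4"
```

Reading 64 consecutive bytes touches only four 16-byte blocks, so there are four misses, and every other access is served from the cache. That is the kind of traffic reduction on the front side bus the paragraph above describes.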
Two levels of cache
The idea behind cache is that it should function as a "near store" of fast RAM, a store from which the CPU can always be supplied.
In practice there are always at least two near stores. They are called Level 1, Level 2, and (if applicable) Level 3 cache. Some processors (like the Intel Itanium) have three levels of cache, but these are only used for very special server applications. In standard PCs we find processors with L1 and L2 cache.
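The benefit of having two levels can be estimated with the usual average-access-time calculation. In the sketch below, the function name and every number in the example (hit rates and access times in clock ticks) are made-up values chosen only to show the arithmetic. They are not measurements of any processor.

```python
def average_access_time(l1_hit_rate: float, l1_time: float,
                        l2_hit_rate: float, l2_time: float,
                        ram_time: float) -> float:
    """Average time per memory access with an L1 and an L2 cache.

    Every access pays the L1 time. L1 misses also pay the L2 time,
    and accesses that miss in L2 as well also pay the RAM time.
    """
    l1_miss_rate = 1.0 - l1_hit_rate
    l2_miss_rate = 1.0 - l2_hit_rate
    return l1_time + l1_miss_rate * (l2_time + l2_miss_rate * ram_time)


# Hypothetical figures: L1 hits 90% of the time and costs 1 tick,
# L2 hits 80% of the remainder and costs 10 ticks, RAM costs 100 ticks.
with_caches = average_access_time(0.9, 1, 0.8, 10, 100)
print(f"with caches: about {with_caches:.1f} ticks per access")
print("without caches: 100 ticks per access")
```

Under these assumed figures the average access costs about 4 ticks instead of 100, even though only the small L1 cache is as fast as the core.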
Fig. 70. The cache system tries to ensure that relevant data is constantly being fetched from RAM, so that the CPU (ideally) never has to wait for data.
L1 cache
Level 1 cache is built into the actual processor core. It is a piece of RAM, typically 8, 16, 20, 32, 64 or 128 Kbytes, which operates at the same clock frequency as the rest of the CPU. Thus you could say the L1 cache is part of the processor.
L1 cache is normally divided into two sections, one for data and one for instructions. For example, an Athlon processor may have a 32 KB data cache and a 32 KB instruction cache. If the cache is common for both data and instructions, it is called a unified cache.
Nice sharing good work