Cache Memory | Computer Architecture
YASH PAL, January 28, 2026

Cache Memory – If the parts of the program and data that are heavily used by the processor are placed in a fast, small memory, the average memory access time can be reduced, which in turn reduces the total execution time of the program. Such a fast, small memory is referred to as a cache memory. The cache memory is the fastest memory in the hierarchy, and its speed approaches that of the processor.

What is Cache Memory?

The cache memory is smaller than main memory. It stores the most frequently accessed data and instructions. When data is accessed, it is transferred to the cache memory. Since the cache is faster than main memory and holds the most frequently accessed data and instructions, the average memory access time approaches the access time of the cache.

In a computer system, there is a relatively large and slow main memory together with a smaller, faster cache memory. The cache contains a copy of a portion of main memory. When the processor needs to access memory, the cache is checked first. If the required word is found in the cache, it is delivered to the processor. If not, a block of main memory, consisting of a fixed number of words, is read into the cache, and then the word is transferred to the processor. In this way, a block of data is transferred to the cache so that future references to memory find the required words in the fast cache memory.

Hit Ratio of Cache Memory

The performance of the cache memory is measured in terms of a quantity called the hit ratio. If the desired word is found in the cache, the reference is said to produce a hit; if the word is not found in the cache, it is a miss. The hit ratio is the number of hits divided by the total number of processor references to memory (hits + misses). Mathematically:

Hit Ratio = Hits / (Hits + Misses)

A high hit ratio is desirable to make the system fast. If the hit ratio is high enough, most processor references are satisfied by the cache instead of main memory, so the average access time is close to the access time of the fast cache memory.

Example 7.1: Determine the average access time of a computer system with a hit ratio of 0.95. The cache access time is 100 ns, and the main memory access time is 1000 ns.

Solution: A hit ratio of 0.95 means that out of every 100 memory references, 95 are found in the cache and 5 in main memory. Therefore the average access time is

Tavg = ((95 × 100) + (5 × 1000)) / 100 ns = 145 ns

Mapping Process

The transformation of data from main memory to cache memory is known as the mapping process. The mapping process plays an important role in reducing the search time in cache memory. The three most common mapping processes are:

Associative mapping
Direct mapping
Set-associative mapping

Associative Mapping

An associative mapping cache stores both the address and the content of the memory word. This type of cache organization uses an associative memory as the cache. It stores a 15-bit address and a 12-bit word, both written in octal form as shown in the figure below. The processor places the address in the input register, and the associative memory is searched for a matching address. If the address is found, the corresponding word is read from the cache and sent to the processor. If no match occurs, main memory is accessed for the word.
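To make the associative lookup concrete, here is a minimal Python sketch of the idea, assuming a small software model in which main memory is a dictionary and the cache is a list of address-word pairs; the class and method names are illustrative only and are not taken from the text. The sketch also includes the first-in, first-out replacement discussed next.

```python
# Minimal software model of an associative-mapping cache (illustrative only).
# Main memory is modelled as a dictionary of 15-bit addresses to 12-bit words,
# written in octal to match the description above.
MAIN_MEMORY = {0o01000: 0o3450, 0o02777: 0o6710}

class AssociativeCache:
    def __init__(self, size):
        self.size = size        # number of address-word pairs the cache can hold
        self.lines = []         # each entry is an (address, word) pair

    def read(self, address):
        # Compare the requested address with every stored address; the hardware
        # does this match in parallel, the scan here only models the behaviour.
        for stored_address, word in self.lines:
            if stored_address == address:
                return word, True                # hit: the word comes from the cache
        word = MAIN_MEMORY[address]              # miss: fetch the word from main memory
        if len(self.lines) == self.size:
            self.lines.pop(0)                    # cache full: evict the oldest pair (FIFO)
        self.lines.append((address, word))       # store the new address-word pair
        return word, False

cache = AssociativeCache(size=2)
print(cache.read(0o02777))    # miss on the first reference
print(cache.read(0o02777))    # hit on the second reference
```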
If new data has to be placed in the cache and the cache is full, an existing address-data pair must be replaced to make room for the new one. Many replacement algorithms can be used to decide which address-data pair to remove. One of the most common is to replace the cache cells in round-robin order, which constitutes a first-in, first-out (FIFO) replacement policy.

Figure: Associative mapping cache

Direct Mapping

The basic limitation of the associative mapping cache is that it is expensive compared with random-access memory. Direct mapping is a method that uses a random-access memory for the cache. In direct mapping, the 15-bit address is divided into two fields: the nine least significant bits constitute the index field, and the remaining six bits form the tag field. The mapping shown in the figure indicates that main memory needs the full fifteen-bit address, whereas the cache needs only the index field bits. The number of bits in the index field equals the number of address bits required to access the cache memory; in this case, the index field is nine bits.

Figure: Direct mapping cache

When the processor generates a memory request, the index field of the address is used to access the cache. The tag field of the CPU address is then compared with the tag in the word read from the cache. If the two tags match, there is a hit, and the desired data is available in the cache. If they do not match, there is a miss, and the desired data is read from main memory. The limitation of direct mapping is that the hit ratio can drop considerably if two or more words whose addresses have the same index but different tags are accessed repeatedly.

Set-Associative Mapping

The disadvantage of direct mapping is that two words with the same index but different tag values cannot reside in the cache at the same time. Set-associative mapping is an improvement over direct mapping: each word of cache can store two or more words of memory under the same index address. Each data word is stored together with its tag, and the number of tag-data items in one word of cache is said to form a set. A set-associative cache organization for a set size of two is shown in the figure below. Each index address refers to two data words and their associated tags. In the figure, the words stored at main-memory addresses 01000 and 02000 are placed in the cache at index address 000. Similarly, the words at addresses 02777 and 00777 are placed in the cache at index address 777. The tag field of the processor address is compared with both tags in the cache to determine whether a match occurs. This comparison is done by an associative search of the tags in the set, similar to an associative memory search, which is why this organization is called set-associative mapping.

Figure: Set associative mapping cache

Cache Operations

Initialization, reading, and writing are the three basic operations performed on a cache. The cache read operation is carried out through the mapping schemes discussed above. The other two operations are explained here.

Cache Initialization

Cache initialization is performed when power is applied to the computer or when main memory is loaded with a complete set of programs from auxiliary memory. Just after initialization, the cache is considered empty, but in practice it contains invalid data. Therefore, each word in the cache must carry a bit that indicates whether its data is valid or not. This bit is named the valid bit.
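As a rough illustration of the direct-mapped organization and the valid bit just described, the sketch below splits a 15-bit address into a 9-bit index and a 6-bit tag and clears every valid bit at initialization; the class and field names are hypothetical, chosen only for this example.

```python
# Illustrative model of a direct-mapped cache with valid bits (not a real design).
INDEX_BITS = 9     # nine least significant bits of the 15-bit address
TAG_BITS = 6       # remaining six bits

class DirectMappedCache:
    def __init__(self):
        # One line per index; initialization clears every valid bit,
        # marking all cache contents as invalid.
        self.lines = [{"valid": False, "tag": 0, "word": 0}
                      for _ in range(2 ** INDEX_BITS)]

    def read(self, address, main_memory):
        index = address & ((1 << INDEX_BITS) - 1)   # index field selects the cache line
        tag = address >> INDEX_BITS                 # tag field identifies the memory block
        line = self.lines[index]
        if line["valid"] and line["tag"] == tag:
            return line["word"], True               # tags match: hit
        # Miss: read the word from main memory, store it in the cache,
        # and set the valid bit for this index.
        word = main_memory[address]
        self.lines[index] = {"valid": True, "tag": tag, "word": word}
        return word, False

memory = {0o02000: 0o5670, 0o12000: 0o1234}   # same index 000, different tags 02 and 12
cache = DirectMappedCache()
print(cache.read(0o02000, memory))   # miss: fetched from main memory, valid bit set
print(cache.read(0o02000, memory))   # hit: same index and matching tag
print(cache.read(0o12000, memory))   # miss: same index, different tag replaces the old word
```

The last reference shows the direct-mapping limitation mentioned above: two addresses with the same index but different tags cannot reside in the cache at the same time.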
When the cache memory is initialized, all the valid bits are cleared, indicating that the cache contains no valid data. When a word is subsequently stored in the cache, its valid bit is set, and it remains set until the cache is reinitialized.

Cache Writing

Writing is an important aspect of cache operation. A computer system can handle write operations in two ways:

Write-through method
Write-back method

The write-through method is the simplest and the most commonly used for memory write operations. With this method, when a write operation is performed, main memory is updated in parallel with the cache. Its advantage is that main memory always contains the same data as the cache, which plays an important role in DMA (direct memory access) operation: it guarantees that the data in main memory is valid for input/output device communication.

The write-back method is the second way of handling write operations. Here, when a write operation is performed, only the cache is updated; the data is not copied to main memory as in the previous method. Instead, a flag is set when the cached word is updated. When new data is to be stored in place of the old word, the old word is copied back to main memory if its flag is set. The significance of the write-back method is that during the time a word resides in the cache, it may be updated several times. As long as the word remains in the cache, it does not matter whether the copy in main memory is out of date, since requests for the word are satisfied from the cache. A small sketch contrasting the two policies is given below.

Note: Analytical results indicate that the number of memory writes in a typical program ranges between 10 and 30 percent of the total references to memory.
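The following sketch contrasts the two write policies in the same illustrative style as the earlier examples; the class name and the "dirty" flag are assumptions of this example (the text simply calls it a flag), and eviction is simplified to a single explicit call.

```python
# Simplified sketch of write-through versus write-back (illustrative only).
class WriteDemoCache:
    def __init__(self, main_memory, write_back=False):
        self.main_memory = main_memory
        self.write_back = write_back
        self.lines = {}      # address -> {"word": ..., "dirty": flag}

    def write(self, address, word):
        if self.write_back:
            # Write-back: update only the cache and mark the word with a flag;
            # main memory is updated later, when the word is replaced.
            self.lines[address] = {"word": word, "dirty": True}
        else:
            # Write-through: update the cache and main memory in parallel, so
            # main memory always holds the same data as the cache.
            self.lines[address] = {"word": word, "dirty": False}
            self.main_memory[address] = word

    def evict(self, address):
        line = self.lines.pop(address)
        if line["dirty"]:
            # Only now is the updated word copied back to main memory.
            self.main_memory[address] = line["word"]

memory = {0o01000: 0}
wb = WriteDemoCache(memory, write_back=True)
wb.write(0o01000, 0o7777)
print(memory[0o01000])   # still 0: main memory is out of date until eviction
wb.evict(0o01000)
print(memory[0o01000])   # 4095: the updated word has been written back
```

With write-through, main memory would show the new value immediately after the write, since it is updated in parallel with the cache.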