The simple, bus-based multiprocessor illustrated below represents a commonly implemented symmetric shared-memory architecture. Each processor has a single, private cache, with coherence maintained using the snooping coherence protocol of Figure 4.7. Each cache is direct-mapped, with four blocks each holding two words. To simplify the illustration, the cache address tag contains the full address and each word shows only two hex characters, with the least significant word on the right. The coherence states are denoted M, S, and I for Modified, Shared, and Invalid.
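To make the direct-mapped organization concrete, the following minimal sketch shows how a byte address could be split into a block index, a word offset, and a block base address (which serves as the tag here, since the tag holds the full address). The block count and words per block come from the text; the 4-byte word size and the helper name `decompose` are assumptions made for illustration and are not stated in the exercise.

```python
WORD_BYTES = 4         # assumption: 32-bit words (not stated in the text)
WORDS_PER_BLOCK = 2    # from the text: each block holds two words
NUM_BLOCKS = 4         # from the text: four blocks per cache
BLOCK_BYTES = WORD_BYTES * WORDS_PER_BLOCK


def decompose(addr):
    """Split a byte address into (block index, word offset, block base address)."""
    base = addr - addr % BLOCK_BYTES               # address of the block's first word
    index = (addr // BLOCK_BYTES) % NUM_BLOCKS     # which of the four blocks it maps to
    word = (addr % BLOCK_BYTES) // WORD_BYTES      # which word inside the block
    return index, word, base


# Hypothetical addresses, chosen only to exercise the mapping.
print(decompose(0x128))   # maps to block B1, word 0, base 0x128
print(decompose(0x10C))   # maps to block B1, word 1, base 0x108
```

Under these assumptions, two addresses whose bases differ by a multiple of 32 bytes compete for the same block, which is why a miss can force a replacement.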
For each subproblem below, assume the initial cache and memory state as illustrated in the figure. Each subproblem specifies a sequence of one or more CPU operations of the form:
P#: <op> <address> [<-- <value>]
where P# designates the CPU (e.g., P0), <op> is the CPU operation (e.g., read or write), <address> denotes the memory address, and <value> indicates the new word to be assigned on a write operation.
What is the final state (i.e., coherence state, tags, and data) of the caches and memory after the given sequence of CPU operations has completed? Show only the blocks that change, e.g., P0.B0: (I, 120, 00 01) indicates that CPU P0's block B0 has the final state of I, tag of 120, and data words 00 and 01. Also, what value is returned by each read operation?
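One way to check answers to this kind of question is to trace the operations through a small simulator. The sketch below assumes a conventional write-back, write-invalidate MSI snooping protocol; Figure 4.7 is not reproduced here, so its exact transitions may differ. The class names, the 4-byte word size, and the initial memory contents are all hypothetical and do not reproduce the state shown in the figure. Data lists are indexed by word offset, so `data[0]` is the lower-addressed word.

```python
WORD_BYTES, WORDS_PER_BLOCK, NUM_BLOCKS = 4, 2, 4   # word size is an assumption
BLOCK_BYTES = WORD_BYTES * WORDS_PER_BLOCK


class Block:
    def __init__(self):
        self.state, self.tag, self.data = "I", None, [0] * WORDS_PER_BLOCK


class System:
    def __init__(self, num_cpus, memory):
        self.caches = [[Block() for _ in range(NUM_BLOCKS)] for _ in range(num_cpus)]
        self.memory = memory   # dict: block base address -> list of words

    def _locate(self, addr):
        base = addr - addr % BLOCK_BYTES
        return (addr // BLOCK_BYTES) % NUM_BLOCKS, (addr % BLOCK_BYTES) // WORD_BYTES, base

    def _evict(self, blk):
        if blk.state == "M":                       # dirty victim: write it back first
            self.memory[blk.tag] = list(blk.data)
        blk.state = "I"

    def _snoop(self, cpu, idx, base, invalidate):
        """Other caches react to a miss or invalidate placed on the bus."""
        for other, cache in enumerate(self.caches):
            blk = cache[idx]
            if other == cpu or blk.tag != base or blk.state == "I":
                continue
            if blk.state == "M":                   # owner supplies the data via memory
                self.memory[base] = list(blk.data)
            blk.state = "I" if invalidate else "S"

    def read(self, cpu, addr):
        idx, word, base = self._locate(addr)
        blk = self.caches[cpu][idx]
        if blk.tag != base or blk.state == "I":    # read miss
            self._evict(blk)
            self._snoop(cpu, idx, base, invalidate=False)
            blk.state, blk.tag, blk.data = "S", base, list(self.memory[base])
        return blk.data[word]

    def write(self, cpu, addr, value):
        idx, word, base = self._locate(addr)
        blk = self.caches[cpu][idx]
        hit = blk.tag == base and blk.state != "I"
        if not hit:
            self._evict(blk)
        if not (hit and blk.state == "M"):         # need exclusive ownership
            self._snoop(cpu, idx, base, invalidate=True)
            if not hit:
                blk.tag, blk.data = base, list(self.memory[base])
            blk.state = "M"
        blk.data[word] = value


# Hypothetical initial memory; NOT the contents shown in the exercise's figure.
system = System(2, {0x100: [0x00, 0x10]})
system.write(0, 0x100, 0x48)        # P0 writes 0x48 to address 0x100
value = system.read(1, 0x100)       # P1 reads address 0x100
b0, b1 = system.caches[0][0], system.caches[1][0]
print(hex(value), b0.state, b1.state, system.memory[0x100])
```

In this hypothetical trace, P0's write leaves its block B0 in M; P1's read miss then forces P0 to write the block back and drop to S, so both caches end in S and memory holds the updated word. Loading the figure's actual initial state into `System` would allow each subproblem to be traced the same way.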
Describe the evolution of the GPU pipeline over the last decade.