What is it?
16MB on-chip eDRAM buffer cache
- Provides 128MB of buffer cache for a fully populated POWER8 socket (8 DIMM cards, 16MB each)
- 16-way set associative
- SECDED ECC protected (Both cache and directory).
- Hardware controlled line delete function.
- Robust and highly configurable allocation and replacement policies.
- Including weighted history allocation predictor
Memory Buffer Cache does not participate in the Power Bus coherency protocol
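The per-socket figure and the cache geometry follow from simple arithmetic, sketched below. The 16MB size, the 8 DIMM cards, and the 16 ways come from the list above; the 128-byte line size is an assumption made only to illustrate the set count, and the function names are hypothetical.

```python
# Capacity and geometry arithmetic for the buffer cache described above.
MIB = 1024 * 1024

CACHE_BYTES_PER_BUFFER = 16 * MIB  # from the text: 16MB eDRAM per memory buffer
BUFFERS_PER_SOCKET = 8             # from the text: 8 DIMM cards per socket
WAYS = 16                          # from the text: 16-way set associative
LINE_BYTES = 128                   # ASSUMPTION: 128-byte cache line


def total_cache_bytes(per_buffer=CACHE_BYTES_PER_BUFFER, buffers=BUFFERS_PER_SOCKET):
    """Aggregate buffer cache across all memory buffers of one socket."""
    return per_buffer * buffers


def num_sets(cache_bytes=CACHE_BYTES_PER_BUFFER, ways=WAYS, line_bytes=LINE_BYTES):
    """Number of sets in one cache: capacity / (ways * line size)."""
    return cache_bytes // (ways * line_bytes)


print(total_cache_bytes() // MIB, "MB of buffer cache per socket")
print(num_sets(), "sets per buffer cache (under the assumed line size)")
```

With a different line size the set count changes proportionally; the per-socket total does not depend on it.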
Benefits:
Challenge: rising processor core counts per chip limit how much on-chip cache (L1/L2/L3) can be provided per core
- The Memory Buffer cache helps mitigate this trend
Improves average memory latency
- The higher the Memory Buffer cache hit rate for a particular workload, the lower the average memory latency seen by L2/L3 misses.
- Unloaded MB cache hit latency = ~55% of memory latency
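The relationship between hit rate and average latency can be sketched as a weighted average. This is a minimal model, assuming that a Memory Buffer cache miss costs exactly the full memory latency (no extra lookup penalty) and that the system is unloaded; the 55% ratio is the only figure taken from the text, and `average_latency` is a hypothetical helper.

```python
# Weighted-average latency model for the Memory Buffer (MB) cache.
HIT_LATENCY_RATIO = 0.55  # from the text: unloaded hit latency is ~55% of memory latency


def average_latency(hit_rate, memory_latency=1.0, hit_ratio=HIT_LATENCY_RATIO):
    """Average latency of an L2/L3 miss, relative to memory_latency.

    ASSUMPTION: an MB cache miss costs exactly memory_latency.
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    hit_cost = hit_rate * hit_ratio * memory_latency
    miss_cost = (1.0 - hit_rate) * memory_latency
    return hit_cost + miss_cost


for h in (0.0, 0.25, 0.5, 1.0):
    print(f"hit rate {h:.2f}: relative latency {average_latency(h):.4f}")
```

Under these assumptions a 50% hit rate brings the average latency down to about 77.5% of the memory latency.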
Acts as a huge memory write reorder buffer
- Helps prevent writes to memory from interfering with latency critical reads
- Levels out 'bursty' write traffic with respect to the DRAM bus.
- Beneficial for paging in data from disk (I/O DMA write traffic)
Provides ability to order writes to memory in groups of page mode bursts
- Reduces memory power by decreasing the number of row activate commands per write command.
- Increases DRAM bus utilization (bandwidth) by reducing the number of read-to-write and write-to-read bus turnaround delays.
Memory Buffer cache has very tight proximity to the DRAM memory interface command scheduler
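The effect of grouping writes into page-mode bursts on row activations can be illustrated with a toy model. This is only a sketch, assuming an open-page policy with one open row per bank and writes identified by hypothetical `(bank, row)` pairs; it does not describe the actual scheduler in the memory buffer.

```python
# Toy model: count DRAM row activations for a stream of writes.
def count_activates(writes):
    """Count row activations under an open-page policy.

    writes: iterable of (bank, row) pairs in issue order.
    A row activate is needed whenever the target row is not the
    row currently open in that bank.
    """
    open_row = {}
    activates = 0
    for bank, row in writes:
        if open_row.get(bank) != row:
            activates += 1
            open_row[bank] = row
    return activates


def group_by_row(writes):
    """Reorder buffered writes so that writes to the same bank and row are adjacent."""
    return sorted(writes)


# Writes arriving interleaved between two rows of the same bank.
arrival_order = [(0, 5), (0, 9), (0, 5), (0, 9), (0, 5), (0, 9)]

print("activates, arrival order:", count_activates(arrival_order))
print("activates, grouped:      ", count_activates(group_by_row(arrival_order)))
```

In this example the interleaved stream needs one activate per write, while the grouped stream needs one per row, which is the kind of reduction in activates per write command the bullets above describe.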
Reduces memory power
- Memory Buffer cache hits reduce the number of requests that must access several DRAM chips simultaneously.
Improves memory channel bandwidth
- For access patterns in which DRAM memory cannot keep the upstream memory channel (P8<>Memory link) filled with data, Memory Buffer cache hits help fill the gaps.
Improves the performance of read-modify-write operations
- Partial write operations that target the same cache block will be ‘gathered’ within the Memory Buffer cache before having to be written to memory
- Potentially many read-modify-writes become a single RMW to memory
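Write gathering can be sketched by counting how many read-modify-write operations reach memory with and without it. The model below is an illustration under stated assumptions: partial writes are hypothetical `(address, length)` pairs that never cross a cache block boundary, the block size is assumed to be 128 bytes, and each gathered block is written back exactly once.

```python
# Toy model: read-modify-write (RMW) counts with and without write gathering.
BLOCK_BYTES = 128  # ASSUMPTION: cache block size in bytes


def rmw_without_gathering(partial_writes):
    """Each partial write triggers its own RMW to memory."""
    return len(partial_writes)


def rmw_with_gathering(partial_writes, block_bytes=BLOCK_BYTES):
    """Partial writes to the same block are merged; one RMW per distinct block."""
    blocks = {address // block_bytes for address, _length in partial_writes}
    return len(blocks)


# Five partial writes that touch only two distinct 128-byte blocks.
writes = [(0, 8), (8, 8), (64, 16), (128, 8), (136, 8)]

print("RMWs without gathering:", rmw_without_gathering(writes))
print("RMWs with gathering:   ", rmw_with_gathering(writes))
```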
Extends the prefetch capabilities of the higher level caches
- Generates additional prefetching into the MB cache, which lowers the latency of future demand fetches and prefetches from the L1/L2/L3 caches.