A first-level cache typically has an access time of 1 to 3 processor cycles. A second-level cache access takes 20 to 30 cycles, and a main-memory access takes more than 100 cycles.
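To see how these per-level latencies combine, here is a small sketch of the standard average memory access time (AMAT) calculation. The cycle counts come from the ranges above; the hit rates are illustrative assumptions, not measurements.

```python
# Cycle counts picked from the ranges quoted above; hit rates are assumed.
L1_CYCLES, L2_CYCLES, MEM_CYCLES = 2, 25, 120
L1_HIT, L2_HIT = 0.95, 0.90

def amat(l1, l2, mem, l1_hit, l2_hit):
    # AMAT = L1 time + L1 miss rate * (L2 time + L2 miss rate * memory time)
    return l1 + (1 - l1_hit) * (l2 + (1 - l2_hit) * mem)

print(amat(L1_CYCLES, L2_CYCLES, MEM_CYCLES, L1_HIT, L2_HIT))  # -> 3.85
```

Even with main memory costing over 100 cycles, high hit rates in the upper cache levels keep the average access close to the L1 latency.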
In a multiprocessor system of this kind, each processor (or group of cores) has memory directly attached to it. Access to this local memory is faster than access to memory attached to another processor.
One approach is to interleave memory, commonly at cache-line boundaries, so that an application accesses local and remote memory alternately. On average, the application then sees a memory latency that is the average of the local and remote access times. This design is called uniform memory access (UMA), because all processors see the same memory latency.
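The averaging claim above can be written down directly: with cache-line interleaving, roughly half of all accesses hit local memory and half hit remote memory. The latency figures below are assumptions for illustration.

```python
def interleaved_latency(local_cycles, remote_cycles):
    # With cache-line interleaving, accesses alternate between local and
    # remote memory, so the expected latency is the simple average.
    return (local_cycles + remote_cycles) / 2

# Assumed figures: 100 cycles local, 180 cycles remote.
print(interleaved_latency(100, 180))  # -> 140.0
```

Every processor pays this same blended cost, which is what makes the latency "uniform".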
Another approach is to accept that different memory regions have different access latencies and to make the OS aware of this. This architecture is called cache-coherent nonuniform memory access (ccNUMA). The OS will attempt to keep each process running on the same processor, to maximize local memory accesses.
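On Linux, an application can cooperate with the OS scheduler by pinning itself to a CPU, so that (under the kernel's first-touch policy) its allocations tend to land on that CPU's local NUMA node. A minimal sketch, assuming a Linux host; `pin_to_cpu` is a hypothetical helper name, and the code falls back gracefully where `sched_setaffinity` is unavailable.

```python
import os

def pin_to_cpu(cpu):
    # Restrict the calling process to a single CPU. On a ccNUMA machine,
    # staying on one CPU keeps subsequent memory accesses mostly local.
    # os.sched_setaffinity is Linux-only, so report False elsewhere.
    if not hasattr(os, "sched_setaffinity"):
        return False
    os.sched_setaffinity(0, {cpu})
    # Confirm the affinity mask now contains only the requested CPU.
    return os.sched_getaffinity(0) == {cpu}

print(pin_to_cpu(0))
```

Tools such as `numactl` expose the same idea from the command line, binding both the CPU and the memory policy of a process to a chosen node.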