Vishal Gupta, Hyesoon Kim, Karsten Schwan,
Evaluating Scalability of Multi-threaded Applications on a Many-core Platform
Multicore processors have been effective in scaling application performance by dividing computation among multiple threads running in parallel. However, application performance does not necessarily improve as more cores are added. Application performance can be limited due to multiple bottlenecks including contention for shared resources such as caches and memory.
In this paper, we perform a scalability analysis of parallel applications on a 64-threaded Intel Nehalem-EX based system. We find that applications which scale well on small number of cores, exhibit poor scalability on large number of cores. Using hardware performance counters, we show that many performance limited applications are limited by memory bandwidth on manycore platforms and exhibit improved scalability when provisioned with higher memory bandwidth. By regulating the number of threads used and applying dynamic voltage and frequency scaling for memory bandwidth limited benchmarks, significant energy savings can be achieved.