This score edges out the previous champion - the HP ProLiant DL785 G5 with eight 4-core Opteron 8393SE processors - which reigned at 31.56@21 tiles. Compared with the 4-socket, 24-core IBM System x3850 M2 Xeon that leads the 24-core category, doubling the socket/core count yielded only a 50% increase in capacity. This scaling inefficiency is less typical of the 2P-to-4P transition but seems to plague the 4P-to-8P segment.
"The x3950 M2 is based on the fourth generation of IBM Enterprise X-Architecture®, and is designed to deliver innovation with enhanced reliability and availability features that enable optimal performance for databases, enterprise applications and virtualized environments."
"I'm really looking forward to even more virtualization benchmarks which are coming very soon."
- Elisabeth Stahl, IBM Benchmarking and Systems Performance Blog
Looking at the virtualization notes, we discover what it takes to keep 48 cores fed to achieve such a benchmark:
- 4 QLogic QLE2462 HBAs (dual-port, 4Gbps FC)
- 1 IBM DS4800 with 4GB cache
- 19 EXP 810 storage expansion units for
  - 1.8TB in 49 LUNs
  - 280 15K disks total
- 21 IBM x336 clients
  - DP 3.2GHz Xeon
  - 3GB RAM
  - Server 2003 R2
- 2 IBM x335 clients
  - DP 3.06GHz Xeon
  - 2.5GB RAM
  - Server 2003 R2
- Eight vSwitches
  - 120 ports total
- 4 Intel PRO 1000PT dual-port 1Gb Ethernet controllers
  - one per vSwitch
While the Dunnington tops the list by sheer brute force, it's safe to assume that - given the 32-core Opteron is nipping at its heels - the 48-core Istanbul results will displace it soon (possibly alluded to in the quote from Elisabeth Stahl's "Benchmarking and Systems Performance Blog" above). More interestingly, will AMD's much-touted "HT Assist" allow the 8P Istanbul to break the 4P-to-8P "curse" of scaling inefficiency? If not, it would show that much work is needed before the relatively "massive" core counts of 2010 are upon us.
Here's an example of linear scaling with socket/core count:
2P, 8-core (64GB)
HP ProLiant DL385 G5 - 11.28@8 tiles (1 tile per core)
4P, 16-core (128GB)
Dell PowerEdge R905 - 22.70@16 tiles (1 tile per core)
And non-linear scaling...
8P, 32-core (256GB)
HP ProLiant DL785 G5 - 31.56@21 tiles (0.66 tile per core)
Here, the bump from 4P to 8P yields only about 31% more tiles (16 to 21) and about 39% more score (22.70 to 31.56), even though RAM, socket count and core count all double. AMD's HT Assist was designed to cure the memory-access issues that restrict scaling, but scheduling overhead may play a role as well. It will be good to see whether Istanbul's HT Assist - along with better NUMA scheduling - has a measurable effect on 8P VMmark.
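For anyone who wants to check the arithmetic, here is a minimal Python sketch that works out the step-to-step scaling from the three Opteron results quoted above. The function name `scaling_steps` and the "efficiency" figure (score ratio divided by core ratio) are my own shorthand for illustration, not anything VMmark defines.

```python
# VMmark results quoted above: (system, cores, score, tiles)
results = [
    ("2P HP ProLiant DL385 G5", 8, 11.28, 8),
    ("4P Dell PowerEdge R905", 16, 22.70, 16),
    ("8P HP ProLiant DL785 G5", 32, 31.56, 21),
]


def scaling_steps(results):
    """Compare each configuration with the next smaller one in the list."""
    steps = []
    for (_, c0, s0, t0), (name, c1, s1, t1) in zip(results, results[1:]):
        core_ratio = c1 / c0
        steps.append({
            "name": name,
            "score_gain_pct": (s1 / s0 - 1) * 100,
            "tile_gain_pct": (t1 / t0 - 1) * 100,
            # 1.0 means the score grew exactly as fast as the core count
            # (an illustrative metric, not part of VMmark itself)
            "efficiency": (s1 / s0) / core_ratio,
            "tiles_per_core": t1 / c1,
        })
    return steps


for step in scaling_steps(results):
    print(
        f"{step['name']}: score +{step['score_gain_pct']:.1f}%, "
        f"tiles +{step['tile_gain_pct']:.1f}%, "
        f"efficiency {step['efficiency']:.2f}, "
        f"{step['tiles_per_core']:.2f} tiles/core"
    )
```

By this measure the 2P-to-4P step lands at roughly 1.0, while the 4P-to-8P step drops to about 0.7. The same function could be fed the 48-core Istanbul numbers once they are published.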