Blue Gene/Q transactional memory in software

May 23, 2012. Blue Gene/Q represents the third generation in the IBM Blue Gene line of supercomputer systems. Transactional memory: WikiMili, the free encyclopedia. In June 2012, Blue Gene/Q installations took the top positions in all three lists. This document and the information it contains are provided on an as-is basis. Technically, it's not true SMT (thanks, Jim Dinan), but that's the most reasonable approximate term I've found. The other avenue for Intel is proliferating transactional memory throughout the x86 product line. It continues to expand and enhance the Blue Gene/L and Blue Gene/P architectures. Led architectural definition and performance modeling of watch instruction set extensions, incorporated into the A2. In this paper, we argue that the problem lies with the. Most research so far has been done on software-based transactional memory implementations. However, AMD has not announced whether ASF will be used in products, and if so, in what timeframe.

So n years more till we see software transactional memory with dedicated. Quantitative comparison of hardware transactional memory. Other refinements might include partial aborts for nested transactions and programmer control over handling conflicting transactions. In Proceedings of the 42nd Annual International Symposium on Computer Architecture (ISCA '15). Amy Wang, Matthew Gaudet, Peng Wu, Martin Ohmacht, Jose Nelson Amaral, Christopher Barton, Raul Silvera, and Maged M.

Performance measurement and analysis of transactional. At the Hot Chips conference in Santa Clara last week, IBM lifted the curtain on its Blue Gene/Q SoC, which will soon power some of the highest-performing supercomputers in the world. Blue Gene is an IBM project aimed at designing supercomputers that can reach operating speeds in the petaflops (PFLOPS) range, with low power consumption. The project created three generations of supercomputers: Blue Gene/L, Blue Gene/P, and Blue Gene/Q. Keywords: software transactional memory, concurrency control, biased reader-writer locks, strong atomicity, managed languages. Computer architecture and operating system co-design (CAOS). Blue Gene/Q is the first commercially available platform that. Blue Gene/Q's (BG/Q) unique transactional memory system provides hardware isolation, atomicity and consistency for memory locations while leaving the details of the transactional programming system to software layers above the hardware [22]. Hardware transactional memory (HTM) is becoming standard in modern processors because it provides lower overhead than software-based implementations of TM [25]. Hardware transactional memory also isn't a new idea. The Intel Haswell processor includes restricted transactional memory (RTM), which is the first commodity-based hardware transactional memory (HTM) to become publicly available. PowerPC A2 (Blue Gene/Q) and POWER8 support hardware transactional memory as well. Software support and evaluation of hardware transactional. The Blue Gene/Q will be the first to commercially use transactional memory, which is a way of organizing related tasks into one big job for more efficient processing. Blue Gene/Q architecture: Blue Gene/Q is the third supercomputer design in the Blue Gene series.

System overview: the Blue Gene/Q system organization is similar to Blue Gene/L and Blue Gene/P. The Blue Gene/Q chip is manufactured on IBM's copper SOI process at 45 nm. IBM Blue Gene/Q was the first to provide an accessible HTM implementation [1]. IBM's Blue Gene/Q has supported it for a little while, and Sun's Rock CPU was intended to do so before Oracle canceled it, not to mention a large number of research CPUs over the years. In computer science and engineering, transactional memory attempts to simplify concurrent. STM, which is a software-based implementation of TM. The Blue Gene/Q compute chip will be the building block for a power-efficient supercomputing system that will be able to scale to tens of petaflops. This paper describes an end-to-end system implementation of the transactional memory (TM) programming model on top of the hardware transactional memory (HTM) of the Blue Gene/Q (BG/Q) machine. IBM Blue Gene/Q architecture and system software overview. What scalable programs need from transactional memory.

Transactional memory (TM) is a promising technique to ease the burden on the programmer, but only recently has become available on commercial hardware in the new Blue Gene/Q system and hence the. Performance characteristics of hardware transactional. Perhaps the most poorly kept secret at SC11 was IBM's official unveiling of its next-generation Blue Gene/Q (BG/Q) supercomputer, the third generation in its Blue Gene family, with 16 multi-processing core technology and a scalable peak performance of up to 100 petaflops. If you need more memory, you have to allocate more compute nodes. The Blue Gene/Q powered supercomputer will allow a much more extensive real-world testing of the. The goal was to provide hardware primitives that could be used for higher-level synchronization, such as software transactional memory or lock-free algorithms. This means that 4 hardware threads share a single L1 cache. Evaluation of Blue Gene/Q hardware support for transactional memories (PACT '12). Blue Gene/Q contains innovative technology including hardware transactional memory and speculative execution, as well as mechanisms such as scalable atomic operations and a wakeup unit to help us better exploit the 17 cores and 68 threads per node.

Quantitative comparison of hardware transactional memory for. It avoids memory conflicts by monitoring a transaction, a set of speculative operations in a defined code section. Best practice guide: Blue Gene/Q (PRACE Research Infrastructure). IBM will implement transactional memory within the confines of a single chip using a tagging scheme on the chip's level-two cache memory. You cannot ask for a particular quantity of memory in a Blue Gene/Q job. You can, however, access all the memory of the nodes in your block (16 GiB per node). Performance measurement and analysis of transactional memory. The L2 cache is multiversioned, supporting transactional memory and.
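Since memory can only be obtained by allocating nodes, the node count for a given memory footprint is a simple ceiling division. The sketch below is illustrative only: the function name `nodes_for_memory` is hypothetical, the 16 GiB figure is the per-node value quoted above, and it ignores any minimum block size a site scheduler may impose as well as memory used by the system itself.

```python
import math

GIB_PER_NODE = 16  # per-node memory quoted in the text above


def nodes_for_memory(total_gib):
    """Smallest node count whose combined memory covers total_gib.

    Hypothetical helper: a real job request may be rounded up further
    to whatever block sizes the site allows.
    """
    if total_gib <= 0:
        raise ValueError("total_gib must be positive")
    return math.ceil(total_gib / GIB_PER_NODE)


if __name__ == "__main__":
    for need in (16, 17, 100):
        print(need, "GiB ->", nodes_for_memory(need), "node(s)")
```

For example, a 100 GiB working set needs at least 7 nodes (112 GiB in total), which is why poorly scaling, memory-hungry jobs fit this architecture badly.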

Blue Gene/Q systems also topped the Green500 list of most energy-efficient supercomputers with up to 2. This paper describes an end-to-end system implementation of a transactional memory (TM). Blue Gene systems have often led the TOP500 and Green500 rankings of the most powerful and most power-efficient supercomputers. However, like other real HTMs, such as IBM's Blue Gene/Q, Haswell's RTM is best-effort, meaning it provides no transactional forward progress guarantees. The following is an incomplete list of Blue Gene/Q installations. Errata prompt Intel to disable TSX in Haswell, early.

IBM's Blue Gene/Q super chip grows 18th core (insideHPC). Intel's Haswell and IBM's Blue Gene/Q and System z are the. IBM Blue Gene/Q supercomputer: Blue Gene is an IBM project aimed at designing supercomputers that can reach operating speeds in the PFLOPS (petaflops) range, with low power consumption.

There are now four commercial systems, IBM Blue Gene/Q. The communication software stack on Blue Gene/Q is described in the following image. Takuya Nakaike, Rei Odaira, Matthew Gaudet, Maged M. Although transactional memory programs cannot produce a deadlock, programs may. A description of Blue Gene/Q [1] and ARMCI [12] is provided as follows.

However, they are all best-effort, meaning that every hardware transaction must have an alternative software fallback path that guarantees forward progress. To deliver a system that enables users to fully exploit the promise of high-performance computing for both traditional HPC applications and new commercial application areas, the Blue Gene/Q system architecture combines hardware and software innovations to overcome traditional bottlenecks, most famously the memory and power walls which have. Its ordered and unordered transaction modes implement both speculative execution. Transactional memory (TM) has been the focus of numerous studies, and it is supported in processors such as the IBM Blue Gene/Q and Intel Haswell. Jobs with high memory requirements but with poor parallel scalability are not suitable for the Blue Gene/Q architecture. Blue Gene/Q continues to expand and enhance the Blue Gene/L and Blue Gene/P architectures. Robust architectural support for transactional memory in the Power architecture. Harold W.
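The best-effort requirement above (retry in hardware, then take a software path that always makes progress) can be sketched as follows. This is a simulation under stated assumptions, not a real HTM interface: `try_hardware_transaction`, `run_atomically`, `MAX_RETRIES` and the random abort model are all hypothetical names invented for illustration, and a single global lock is just one common choice of fallback.

```python
import random
import threading

MAX_RETRIES = 3                    # hypothetical retry budget before giving up on HTM
fallback_lock = threading.Lock()   # single global lock: the software fallback path


def try_hardware_transaction(body, abort_probability, rng):
    """Simulated best-effort transaction (a stand-in, not a real HTM API).

    Returns True on commit and False on abort. An abort leaves no side
    effects because the body only runs on the commit path.
    """
    # Real HTM would run speculatively and abort if the fallback lock is held.
    # The simulation models that by trying to take the lock without blocking.
    if not fallback_lock.acquire(blocking=False):
        return False
    try:
        if rng.random() < abort_probability:   # simulated conflict or capacity abort
            return False
        body()
        return True
    finally:
        fallback_lock.release()


def run_atomically(body, abort_probability=0.5, rng=None):
    """Run body exactly once, atomically; return which path committed it."""
    rng = rng or random.Random()
    for _ in range(MAX_RETRIES):
        if try_hardware_transaction(body, abort_probability, rng):
            return "hardware"
    # Best-effort hardware gives no progress guarantee, so the lock must.
    with fallback_lock:
        body()
    return "fallback"


if __name__ == "__main__":
    counter = {"value": 0}

    def increment():
        counter["value"] += 1

    def worker(seed):
        rng = random.Random(seed)
        for _ in range(1000):
            run_atomically(increment, rng=rng)

    threads = [threading.Thread(target=worker, args=(s,)) for s in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print(counter["value"])  # 4 workers x 1000 increments = 4000
```

The point of the design is that correctness never depends on the hardware path succeeding: even with an abort probability of 1.0, every call still completes through the lock.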

TM-based synchronization has recently been included in IBM's Blue Gene/Q. The transactional memory could be configured in two modes. Low-overhead software transactional memory with progress. Perhaps IBM will be able to demonstrate enough benefits with Blue Gene/Q to motivate more sophisticated TM systems. Improved single global lock fallback for best-effort hardware. Quantitative comparison of hardware transactional memory for Blue Gene/Q, zEnterprise EC12, Intel Core, and POWER8. Hardware transactional memory on Blue Gene/Q: TM is an opportunistic concurrency control mechanism. Next year, two DOE labs are slated to boot up the most powerful Blue Gene systems ever deployed. Aug 22, 2011. At the Hot Chips conference in Santa Clara last week, IBM lifted the curtain on its Blue Gene/Q SoC, which will soon power some of the highest-performing supercomputers in the world.

More recently, IBM announced in 2011 that Blue Gene/Q had hardware support for both transactional memory and speculative multithreading. BG/Q: 209 TF/rack, 2000 MF/W, 45 nm ASIC, early 2012 GA, scales to 256 racks, 53. Transactional memory (TM) [45, 81] provides an alternative synchronization mechanism that is non-blocking, composable, and easier to write than lock-based code [64]. Trey Cain, principal architect, Marvell Semiconductor. In terms of energy efficiency, scalability, reliability and overall TCO, the IBM Blue Gene/Q clearly leads the pack and has an edge over the other systems, as highlighted by the TCO analysis presented in this paper. Serialization management driven performance in best-effort. Understanding hardware transactional memory in Intel's. There is no plan for providing future updates and corrections to this document.

So n years more till we see software transactional memory with dedicated hardware support. To allow multiple programs to run concurrently, a Blue Gene/L system could. Software support and evaluation of hardware transactional memory on Blue Gene/Q, IEEE Transactions on Computers. Strong LL/SC provides the ability to do real software transactional memory. Performance characteristics of hardware transactional memory. However, the speedups obtained for the STAMP benchmarks on all TM systems we know of are quite limited. Communications of the Association for Computing Machinery, 51(11). IBM POWER8, the newest IBM Power Systems processor at the time of this publication, contains support for hardware transactional memory as an important performance feature. International Technical Support Organization: IBM System Blue Gene Solution. The third supercomputer design in the Blue Gene series, Blue Gene/Q has a peak performance of 20 petaflops, reaching a LINPACK benchmark performance of 17 petaflops. What limits the performance of these benchmarks on TMs?

This edition applies to Version 1, Release 1, Modification 2 of IBM Blue Gene/Q (product number 5733-BGQ). Although the Azul [4] and Rock [9] processors implemented HTM before Blue Gene/Q. Before using this information and the product it supports, read the information in notices on. Aug 31, 2011. Most research so far has been done on software-based transactional memory implementations. Blue Gene/Q follows the same philosophy as the earlier Blue Gene/L and Blue Gene/P systems, namely to build a massively parallel and highly reliable high-performance computing (HPC) system out of power-efficient processor chips.

Make it easier for HTM to gracefully scale to using software TM. Enable tagging non-transaction accesses to avoid false con. IBM to power 20-petaflop supercomputer (Data Center Knowledge). IBM XL compiler hardware transactional memory built-in.

Michael, in IEEE Transactions on Computers, Jan 2015. Building scalable PGAS communication subsystem on Blue. Software support and evaluation of hardware transactional memory on Blue Gene/Q. IBM also said at the time that this chip would be a variant of the PowerA2 wire-speed processor, but. This design allows for complex systems implemented as part of the software runtime. Architecturally, BG/Q installations can scale to over 100 petaflops.
