Seminars on "Multicore and manycore architecture at MIT"

DEI-Conference Room
January 28th, 2010

Talk 1
Speaker: David Wentzlaff
Title: A Unified Operating System for Clouds and Manycore: fos
Single-chip processors with thousands of cores will be available in the next ten years, and clouds of multicore processors afford the operating system designer thousands of cores today. Constructing operating systems for manycore and cloud systems poses similar challenges. This work identifies these shared challenges and introduces our solution: a factored operating system (fos) designed to meet the scalability, fault tolerance, variability of demand, and programming challenges of OSes for single-chip thousand-core manycore systems as well as current-day cloud computers. Current monolithic operating systems are not well suited for manycores and clouds, as they have taken an evolutionary approach to scaling, such as adding fine-grain locks and redesigning subsystems; however, these approaches do not increase scalability quickly enough. fos addresses the OS scalability challenge by using a message-passing design and is composed of a collection of Internet-inspired servers. Each operating system service is factored into a set of communicating servers which in aggregate implement a system service. These servers are designed much in the way that distributed Internet services are designed, but provide traditional kernel services instead of Internet services. fos also embraces the elasticity of cloud and manycore platforms by adapting resource utilization to match demand. fos facilitates writing applications across the cloud by providing a single system image across both future 1000+ core manycores and current-day Infrastructure-as-a-Service cloud computers. In contrast, current cloud environments do not provide a single system image and introduce complexity for the user by requiring different programming models for intra- vs. inter-machine communication, and by requiring the use of non-OS-standard management tools.
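The factoring idea described above can be illustrated with a toy sketch. This is not fos code; it is a minimal Python analogy, assuming a hypothetical `ServiceFleet` class, of how one OS service might be implemented as a fleet of communicating servers that share no state and are reached only through messages:

```python
import queue
import threading

# Hypothetical sketch: one "system service" factored into several worker
# servers. Each server conceptually occupies its own core; clients interact
# with the service only by message passing, never through shared memory.

class ServiceFleet:
    def __init__(self, num_servers, handler):
        self.inbox = queue.Queue()              # shared request channel
        self.handler = handler                  # the service's behavior
        for _ in range(num_servers):
            threading.Thread(target=self._serve, daemon=True).start()

    def _serve(self):
        # Each server loops forever: receive a request, reply by message.
        while True:
            request, reply_to = self.inbox.get()
            reply_to.put(self.handler(request))

    def call(self, request):
        # A client call is a send followed by a blocking receive.
        reply_to = queue.Queue()
        self.inbox.put((request, reply_to))
        return reply_to.get()

# A toy name-lookup "service" factored across four servers.
fleet = ServiceFleet(4, handler=lambda name: name.upper())
```

In aggregate the workers implement the service; any of them can answer any request, which is what lets the fleet grow or shrink with demand.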
Short Bio
David Wentzlaff is a PhD candidate at MIT and a co-founder of Tilera Corporation. At Tilera, he was lead architect and helped design the scalable TILE processor architecture. He worked on the Raw project at MIT, and is now working on designing scalable operating systems for thousand-core multicores and cloud computers. David has an MS in EECS from MIT and a BS in EE from UIUC. He enjoys hiking and mountaineering when not designing multicore processors or operating systems.

Talk 2
Speaker: Charles Gruenwald III
Title: Graphite: A Distributed Parallel Simulator for Multicores
This talk introduces the Graphite open-source distributed parallel multicore simulator infrastructure. Graphite is designed from the ground up for exploration of future multicore processors containing dozens, hundreds, or even thousands of cores. It provides high performance for fast design-space exploration and software development. Several techniques are used to achieve this, including direct execution, seamless multicore and multi-machine distribution, and lax synchronization. Graphite is capable of accelerating simulations by distributing them across multiple commodity Linux machines. When using multiple machines, it provides the illusion of a single process with a single, shared address space, allowing it to run off-the-shelf pthread applications with no source code modification. Our results demonstrate that Graphite can simulate target architectures containing over 1000 cores on ten 8-core servers. Performance scales well as more machines are added, with near-linear speedup in many cases. Simulation slowdown is as low as 41x versus native execution.
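Lax synchronization, one of the techniques named above, can be sketched in a few lines. This is a simplified illustration, not Graphite's implementation: assume each simulated core keeps its own local cycle clock and is allowed to run ahead of the slowest core, but only within a fixed skew bound, instead of barriering every cycle:

```python
# Hypothetical sketch of lax synchronization: simulated cores advance
# independent local clocks and only throttle when their lead over the
# slowest core would exceed a skew threshold.

class LaxCores:
    def __init__(self, num_cores, max_skew):
        self.clocks = [0] * num_cores   # local cycle count per simulated core
        self.max_skew = max_skew        # maximum permitted clock skew (cycles)

    def advance(self, core, cycles):
        # A core may run ahead of the slowest core, but only up to max_skew.
        allowed = min(self.clocks) + self.max_skew
        self.clocks[core] = min(self.clocks[core] + cycles, allowed)
        return self.clocks[core]
```

Relaxing the cycle-by-cycle barrier is what makes distribution across machines profitable: cores mostly simulate independently and synchronize only occasionally, at the cost of some timing accuracy.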
Short Bio
Charles Gruenwald III is a 2nd-year graduate student at MIT CSAIL. While at the University of Colorado he worked on wireless sensor networks, embedded operating systems, and reconfigurable computing. Following his master's at CU, he spent some time working at a software contracting company called BoulderLabs. Since coming to MIT, Charles has focused his research efforts on large-scale parallel programming systems for multicore and distributed computation. While not working on research he enjoys outdoor activities such as snowboarding, mountain biking, climbing, and scuba diving.

Talk 3
Speaker: Hank Hoffmann
Title: Using Dynamic Loop Perforation to Adapt to Changing Computing Environments
This talk presents SpeedGuard, a runtime system that allows applications to meet performance goals in challenging execution environments such as multicore machines that suffer core failures or machines that dynamically adjust the clock speed to reduce power consumption or to protect the machine from overheating. SpeedGuard automatically monitors and maintains real-time application performance by dynamically switching to an alternative implementation that trades accuracy for performance. If SpeedGuard detects a drop in performance below the desired threshold, it dynamically switches to another version of the application that increases performance while minimizing accuracy loss. If performance later recovers, the runtime dynamically adapts to switch the application back to the original version. Our experimental results show how we combine SpeedGuard with loop perforation (which discards iterations of time-consuming loops) to meet real-time performance goals in the face of core failures during the execution of a multithreaded application running on a multicore machine. Our results also show that the combination of SpeedGuard and loop perforation enables applications to dynamically adapt to a variety of environmental changes that reduce computational capacity, including core failures and clock frequency changes, which may occur in response to overheating (e.g. due to fan failure) or because of energy constraints (e.g. low battery).
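The core trade made by loop perforation can be shown in a few lines. This is a minimal sketch, not SpeedGuard itself; the function names and the simple rescaling policy are illustrative assumptions:

```python
# Hypothetical sketch of loop perforation: process only every
# skip_factor-th iteration of a time-consuming loop, then rescale the
# partial result to approximate the full computation.

def perforated_sum(data, skip_factor):
    """skip_factor=1 is the exact version; larger values trade accuracy for speed."""
    kept = data[::skip_factor]
    return sum(kept) * (len(data) / len(kept))

def choose_skip_factor(measured_rate, target_rate):
    """Toy SpeedGuard-style policy (assumption): perforate harder as
    measured throughput falls below the real-time target, and return to
    the exact version once throughput recovers."""
    if measured_rate >= target_rate:
        return 1
    return max(2, round(target_rate / measured_rate))
```

A runtime in this style would periodically re-measure throughput and re-run the policy, so the application drops back to the full-accuracy version as soon as capacity (cores, clock frequency) returns.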
Short Bio
Henry Hoffmann received a B.S. degree in mathematical sciences from the University of North Carolina at Chapel Hill in 1999 and an S.M. degree in electrical engineering and computer science from the Massachusetts Institute of Technology in 2003. He spent several years at a startup company, Tilera Corporation, building multicore processors for embedded video, DSP, and networking applications. He is currently a Ph.D. student at MIT where he is doing research to make multicore programming more accessible.

Talk 4
Speaker: Jonathan Eastep
Title: Smartlocks: Self-Aware Synchronization through Lock Acquisition Scheduling
As multicore processors become increasingly prevalent, system complexity is skyrocketing. The advent of asymmetric multicores compounds this: it is no longer practical for an average programmer to balance the system constraints associated with today's multicores and worry about new problems like asymmetric partitioning and thread interference. Adaptive, or self-aware, computing has been proposed as one method to help application and system programmers confront this complexity. These systems take some of the burden off of programmers by monitoring themselves and optimizing or adapting to meet their goals. This talk presents an open-source self-aware synchronization library for multicores and asymmetric multicores called Smartlocks. Smartlocks is a spin-lock library that adapts its internal implementation during execution using heuristics and machine learning to optimize toward a user-defined goal, which may relate to performance, power, or other problem-specific criteria. Smartlocks builds upon adaptation techniques from prior work like reactive locks, but introduces a novel form of adaptation designed for asymmetric multicores that we term lock acquisition scheduling. Lock acquisition scheduling optimizes which waiter will get the lock next, for the best long-term effect, when multiple threads (or processes) are spinning on a lock. Our results demonstrate empirically that lock scheduling is important for asymmetric multicores and that Smartlocks significantly outperforms conventional and reactive locks for asymmetries like dynamic variations in processor clock frequencies caused by thermal throttling events.
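The lock acquisition scheduling idea can be sketched as a lock that hands off to the "best" waiter rather than an arbitrary one. This is a toy Python illustration under assumed semantics, not the Smartlocks implementation: each waiter registers a weight (e.g. its core's current clock speed), and on release the highest-weight waiter wins:

```python
import threading

class ScheduledLock:
    """Hypothetical sketch: a lock that grants acquisition to the waiting
    thread with the highest weight (e.g. the one on the fastest core),
    rather than in arrival order."""

    def __init__(self):
        self._cond = threading.Condition()
        self._holder = None
        self._waiters = {}          # thread id -> scheduling weight

    def acquire(self, weight):
        me = threading.get_ident()
        with self._cond:
            self._waiters[me] = weight
            # Proceed only when the lock is free AND we are the
            # highest-weight waiter; otherwise keep waiting.
            while not (self._holder is None and
                       me == max(self._waiters, key=self._waiters.get)):
                self._cond.wait()
            del self._waiters[me]
            self._holder = me

    def release(self):
        with self._cond:
            self._holder = None
            self._cond.notify_all()  # wake waiters; the best one proceeds
```

On an asymmetric machine the weights would come from measured core speeds and be retuned online (which is where the heuristics and machine learning described above come in); the sketch only shows the handoff mechanism itself.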
Short Bio
Jonathan Eastep is a PhD student at the Massachusetts Institute of Technology in the Computer Science and Artificial Intelligence Laboratory, working under Professor Anant Agarwal in the area of computer architecture and systems. He completed his BS at the University of Texas at Austin in 2004 in Electrical and Computer Engineering, his MS at the Massachusetts Institute of Technology in 2007 in Computer Science, and will complete his PhD in 2011 in Computer Science with a minor concentration in Quantum Information and Computation. While at MIT, his research projects have included the Raw Microprocessor, one of the first multicore architectures; Graphite, a parallel simulator for architectural and programming studies of kilo-core multicores; and Smartlocks, an adaptive synchronization library for improving performance and programmability of asymmetric multicores. His future research interests are in designing adaptive computing software and hardware systems.