[Crib-list] SPEAKER: Gene Cooperman [Computational Research in Boston Seminar -- Friday, September 5, 2008]

Shirley Entzminger daisymae at math.mit.edu
Fri Sep 5 11:12:24 EDT 2008


T O D A Y . . .

 		COMPUTATIONAL RESEARCH in BOSTON SEMINAR


DATE:		FRIDAY, SEPTEMBER 5, 2008
TIME:		12:30 PM
LOCATION:	Building 32, Room 144  (Stata Center)

Pizza and beverages will be provided by 12:15 PM.


Title:		DISK-BASED PARALLEL COMPUTATION AND CHECKPOINT-RESTART

Speaker:	GENE COOPERMAN  (Northeastern University)


ABSTRACT:

This talk presents joint work from the speaker's High Performance 
Computing Laboratory.  It highlights two loosely related topics.  First, a 
vision for disk-based parallel computation is presented, based on our catch 
phrase "Disk is the New RAM".  As the number of CPU cores grows, the RAM per 
CPU core tends to diminish on commodity computers.  The solution is to use the 
many local disks of a computer cluster.  Such a solution has been used to find 
a lower bound on solutions to Rubik's cube, among other applications.

Fifty local disks have approximately the bandwidth of a _single_ RAM subsystem. 
Thus, 50 local disks of a cluster have the potential to emulate a single 
50-terabyte RAM subsystem.  The obvious flaw in this argument is disk latency.  The 
solution is a new run-time library with an abstraction for many data structures 
and access methods.  The library hides the awkward low-level plumbing of the 
data parallelism.  Appropriate language design principles (simpler language 
constructs are more efficient than complex constructs) then bias the end-user 
application toward good latency tolerance.
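
The bandwidth and capacity argument above can be checked with a short 
back-of-the-envelope sketch.  Only the count of 50 disks comes from the 
abstract; the per-disk throughput, per-disk capacity, RAM bandwidth, seek 
time, and RAM access time below are illustrative assumptions, not figures 
from the talk.

```python
# Back-of-the-envelope check of the "Disk is the New RAM" arithmetic.
# Only DISKS comes from the abstract; every other constant is an assumption.

DISKS = 50                    # from the abstract
DISK_BANDWIDTH_MB_S = 100.0   # assumed sequential throughput of one disk
DISK_CAPACITY_TB = 1.0        # assumed capacity of one disk
RAM_BANDWIDTH_MB_S = 5000.0   # assumed bandwidth of one RAM subsystem
DISK_SEEK_MS = 10.0           # assumed latency of one random disk access
RAM_ACCESS_NS = 100.0         # assumed latency of one random RAM access


def aggregate_bandwidth_mb_s(disks, per_disk_mb_s):
    """Total streaming bandwidth when all disks are read in parallel."""
    return disks * per_disk_mb_s


def aggregate_capacity_tb(disks, per_disk_tb):
    """Total storage across all local disks of the cluster."""
    return disks * per_disk_tb


def latency_ratio(seek_ms, ram_ns):
    """How many times slower one random disk access is than one RAM access."""
    return (seek_ms * 1e6) / ram_ns


if __name__ == "__main__":
    bw = aggregate_bandwidth_mb_s(DISKS, DISK_BANDWIDTH_MB_S)
    print("aggregate disk bandwidth (MB/s):", bw)
    print("assumed RAM bandwidth (MB/s):   ", RAM_BANDWIDTH_MB_S)
    print("aggregate capacity (TB):        ",
          aggregate_capacity_tb(DISKS, DISK_CAPACITY_TB))
    print("disk/RAM latency ratio:         ",
          latency_ratio(DISK_SEEK_MS, RAM_ACCESS_NS))
```

Under these assumed numbers the aggregate bandwidth matches a single RAM 
subsystem, while each random access remains several orders of magnitude 
slower, which is why access patterns must be streaming rather than random.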

The second part of this talk then describes a mature user-space 
checkpoint-restart system, DMTCP, that transparently supports distributed, 
multi-threaded applications.  Naturally, DMTCP suffices to checkpoint and 
restart our disk-based applications.

**********************************************************************************

Massachusetts Institute of Technology
Department of Mathematics
Cambridge, MA  02139



http://www-math.mit.edu/crib

For information on CRiB, contact:

Alan Edelman:  edelman at math.mit.edu
Steven G. Johnson:  stevenj at math.mit.edu
Jeremy Kepner:  kepner at ll.mit.edu
Patrick Dreher:  dreher at mit.edu


