[Crib-list] SPEAKER: Gene Cooperman [Computational Research in Boston Seminar -- Friday, September 5, 2008]
Shirley Entzminger
daisymae at math.mit.edu
Thu Aug 28 16:01:58 EDT 2008
COMPUTATIONAL RESEARCH in BOSTON SEMINAR
DATE: FRIDAY, SEPTEMBER 5, 2008
TIME: 12:30 PM
LOCATION: Building 32, Room 144 (Stata Center)
Pizza and beverages will be provided by 12:15 PM.
Title: DISK-BASED PARALLEL COMPUTATION AND CHECKPOINT-RESTART
Speaker: GENE COOPERMAN (Northeastern University)
ABSTRACT:
This talk presents joint work from the speaker's High Performance
Computing Laboratory. It highlights two loosely related topics. First, a
vision for disk-based parallel computation is presented, based on our
catch phrase "Disk is the New RAM". As the number of CPU cores grows, the
RAM per CPU core tends to diminish on commodity computers. The solution
is to use the many local disks of a computer cluster. Such a solution has
been used to find a lower bound on solutions to Rubik's cube, among other
applications.
Fifty local disks have approximately the bandwidth of a _single_ RAM
subsystem. Thus, 50 local disks of a cluster have the potential to
emulate a single 50-terabyte RAM subsystem. The obvious flaw in this
argument is disk latency. The solution is a new run-time library with an
abstraction for many data structures and access methods. The library
hides the awkward low-level plumbing of the data parallelism. Appropriate
language design principles (simpler language constructs are more efficient
than complex constructs) then bias the end user application toward good
latency tolerance.
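
The bandwidth and capacity argument above can be checked with a short
back-of-the-envelope calculation. The sketch below is illustrative only:
the count of 50 disks comes from the abstract, while the per-disk capacity
(1 TB, implied by the 50 terabyte total), the per-disk streaming bandwidth
(100 MB/s), and the seek time (10 ms) are assumed round numbers, not
figures from the talk. The function names are hypothetical.

```python
# Back-of-the-envelope model of "Disk is the New RAM".
# Only DISKS comes from the abstract; every other figure is an assumption.
DISKS = 50                    # number of local disks in the cluster (from the abstract)
DISK_CAPACITY_TB = 1.0        # assumed capacity per disk, in terabytes
DISK_BANDWIDTH_MB_S = 100.0   # assumed sequential bandwidth per disk, in MB/s
DISK_SEEK_MS = 10.0           # assumed random-access (seek) latency per disk, in ms


def aggregate_capacity_tb(disks=DISKS, per_disk_tb=DISK_CAPACITY_TB):
    """Total storage across all disks, in terabytes."""
    return disks * per_disk_tb


def aggregate_bandwidth_gb_s(disks=DISKS, per_disk_mb_s=DISK_BANDWIDTH_MB_S):
    """Total streaming bandwidth when all disks are read in parallel, in GB/s."""
    return disks * per_disk_mb_s / 1000.0


def bytes_forgone_per_seek(per_disk_mb_s=DISK_BANDWIDTH_MB_S, seek_ms=DISK_SEEK_MS):
    """Bytes one disk could have streamed in the time a single seek takes."""
    return per_disk_mb_s * 1e6 * seek_ms / 1e3


if __name__ == "__main__":
    print("aggregate capacity (TB):", aggregate_capacity_tb())
    print("aggregate bandwidth (GB/s):", aggregate_bandwidth_gb_s())
    print("bytes forgone per seek:", bytes_forgone_per_seek())
```

Under these assumed figures, each random access costs about as much time as
streaming a megabyte, which is why the access methods need to favor large
sequential transfers over scattered small reads.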
The second part of this talk then describes a mature user-space
checkpoint-restart system, DMTCP, that transparently supports distributed,
multi-threaded applications. Naturally, DMTCP is fully sufficient to
checkpoint and restart our disk-based applications.
**********************************************************************************
Massachusetts Institute of Technology
Department of Mathematics
Cambridge, MA 02139
http://www-math.mit.edu/crib
For information on CRiB, contact:
Alan Edelman: edelman at math.mit.edu
Steven G. Johnson: stevenj at math.mit.edu
Jeremy Kepner: kepner at ll.mit.edu