[Crib-list] SPEAKER: Gene Cooperman [Computational Research in Boston Seminar -- Friday, September 5, 2008]
Shirley Entzminger
daisymae at math.mit.edu
Thu Sep 4 14:06:03 EDT 2008
COMPUTATIONAL RESEARCH in BOSTON SEMINAR
DATE: FRIDAY, SEPTEMBER 5, 2008
TIME: 12:30 PM
LOCATION: Building 32, Room 144 (Stata Center)
Pizza and beverages will be provided by 12:15 PM.
Title: DISK BASED PARALLEL COMPUTATION AND CHECKPOINT RESTART
Speaker: GENE COOPERMAN (Northeastern University)
ABSTRACT:
This talk represents some joint work of the speaker's High Performance
Computing Laboratory. It highlights two loosely related topics. First, a
vision for disk-based parallel computation is presented, based on our catch
phrase "Disk is the New RAM". As the number of CPU cores grows, the RAM per
CPU core tends to diminish on commodity computers. The solution is to use the
many local disks of a computer cluster. Such a solution has been used to find
an upper bound on the number of moves needed to solve Rubik's cube, among other
applications.
Fifty local disks have approximately the bandwidth of a _single_ RAM subsystem.
Thus, 50 local disks of a cluster, at roughly one terabyte each, have the
potential to emulate a single 50-terabyte RAM subsystem. The obvious flaw in
this argument is disk latency. The
solution is a new run-time library with an abstraction for many data structures
and access methods. The library hides the awkward low-level plumbing of the
data parallelism. Appropriate language design principles (simpler language
constructs are more efficient than complex constructs) then bias the end user
application toward good latency tolerance.
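
As a rough illustration of the arithmetic behind "Disk is the New RAM", the
following Python sketch may help. Every figure in it is an assumption chosen
for illustration (roughly typical of commodity hardware around 2008), not a
measurement from the talk, and the function names are hypothetical. It shows
how 50 disks add up in bandwidth and capacity, and why latency tolerance
depends on reading large sequential chunks rather than small random ones.

```python
# Illustrative assumptions only; none of these numbers come from the talk.
DISK_BANDWIDTH_MB_S = 100   # assumed sequential throughput of one disk
DISK_CAPACITY_TB = 1        # assumed capacity of one disk
DISK_SEEK_MS = 10           # assumed random-access latency of one disk
NUM_DISKS = 50


def aggregate_bandwidth_gb_s(num_disks=NUM_DISKS):
    """Combined sequential bandwidth of all disks, in GB/s."""
    return num_disks * DISK_BANDWIDTH_MB_S / 1000


def aggregate_capacity_tb(num_disks=NUM_DISKS):
    """Combined capacity of all disks, in TB."""
    return num_disks * DISK_CAPACITY_TB


def effective_bandwidth_mb_s(chunk_mb):
    """Throughput of one disk when every seek is followed by one
    sequential read of chunk_mb megabytes."""
    seek_s = DISK_SEEK_MS / 1000
    transfer_s = chunk_mb / DISK_BANDWIDTH_MB_S
    return chunk_mb / (seek_s + transfer_s)


if __name__ == "__main__":
    print("aggregate bandwidth (GB/s):", aggregate_bandwidth_gb_s())
    print("aggregate capacity (TB):", aggregate_capacity_tb())
    # Small random reads are dominated by seek time ...
    print("4 KB chunks (MB/s):", effective_bandwidth_mb_s(0.004))
    # ... while large streaming reads approach full disk bandwidth.
    print("100 MB chunks (MB/s):", effective_bandwidth_mb_s(100))
```

Under these assumptions, the aggregate bandwidth is comparable to a single RAM
subsystem only when each disk is accessed in large streaming chunks, which is
the access pattern a latency-tolerant library would need to encourage.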
The second part of this talk then describes a mature user-space
checkpoint-restart system, DMTCP, that transparently supports distributed,
multi-threaded applications. Naturally, DMTCP is fully sufficient to
checkpoint and restart our disk-based applications.
**********************************************************************************
Massachusetts Institute of Technology
Department of Mathematics
Cambridge, MA 02139
http://www-math.mit.edu/crib
For information on CRiB, contact:
Alan Edelman: edelman at math.mit.edu
Steven G. Johnson: stevenj at math.mit.edu
Jeremy Kepner: kepner at ll.mit.edu
Patrick Dreher: dreher at mit.edu