[Crib-list] SPEAKER: Constantinos Evangelinos (MIT) --- Computational Research in Boston Seminar --- Friday, February 5, 2010 --- 12:30 PM -- Room 32-124 (new location)
Shirley Entzminger
daisymae at math.mit.edu
Fri Feb 5 09:46:40 EST 2010
IMPORTANT NOTE: Due to an academic class, the Schedules Office changed the
room for the CRiB seminar this Spring 2010 -- the new location is Room 32-124.
***********************************************************************
COMPUTATIONAL RESEARCH in BOSTON SEMINAR
DATE: Friday, February 5, 2010
TIME: 12:30 PM
LOCATION: Building 32, Room 124 (new location)
Refreshments will be provided outside Room 32-124 at 12:15 PM.
TITLE: Scientific Computing on the Cloud: Many Task Computing
and other opportunities
SPEAKER: Constantinos Evangelinos (MIT)
Abstract:
Over the past few years, Cloud Computing has been applied to scientific
(rather than purely business) problems mainly in bioinformatics. The
typical Cloud science application is essentially embarrassingly parallel
(parameter studies etc.) and in many cases is expressed in the map-reduce
paradigm that the Cloud has made so popular.
We set out to explore a wider class of applications that can benefit from
the type of resources a commercial Cloud provider such as Amazon EC2
offers, starting with the low-hanging fruit of loosely coupled
applications.
Error Subspace Statistical Estimation (ESSE), an uncertainty prediction
and data assimilation methodology employed for real-time ocean forecasts,
is based on a characterization and prediction of the largest
uncertainties. This is carried out by evolving an error subspace of
variable size. We use an ensemble of stochastic model simulations,
initialized based on an estimate of the dominant initial uncertainties, to
predict the error subspace of the model fields. The ESSE procedure is a
classic case of Many Task Computing: its codes are managed through
dynamic workflows for (i) the perturbation of the initial mean state, (ii)
the subsequent ensemble of stochastic PE model runs, (iii) the continuous
generation of the covariance matrix, (iv) the successive computations of
the SVD of the ensemble spread until a convergence criterion is satisfied,
and (v) the data assimilation. Its ensemble nature makes it a many-task,
data-intensive application, and its dynamic workflow makes it
heterogeneous. Subsequent acoustics propagation modeling involves a very
large ensemble of acoustics runs, each of very short duration.
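As a rough illustration of steps (i)-(iv) above, and not the speaker's
actual code, the following Python sketch grows an ensemble in batches and
recomputes the SVD of the ensemble spread until the dominant subspace stops
changing. Everything in it is an assumption made for illustration: the toy
model standing in for a PE model run, the function names, the batch size,
and the subspace-overlap convergence test. Step (v), the data assimilation,
is omitted.

```python
import numpy as np


def toy_model(state, rng, steps=10, noise=0.01):
    """Hypothetical stand-in for one stochastic model run (not a PE model)."""
    for _ in range(steps):
        state = 0.99 * state + noise * rng.standard_normal(state.shape)
    return state


def dominant_subspace(forecasts, central, rank):
    """(iii)+(iv): SVD of the normalized ensemble spread about the central run."""
    members = forecasts.shape[0]
    spread = (forecasts - central).T / np.sqrt(max(members - 1, 1))
    u, s, _ = np.linalg.svd(spread, full_matrices=False)
    return u[:, :rank], s[:rank]


def esse_sketch(mean_state, init_std=0.1, batch=8, max_members=64,
                rank=3, tol=0.95, seed=0):
    """Grow the ensemble until the error subspace converges (assumed test)."""
    rng = np.random.default_rng(seed)
    central = toy_model(mean_state, rng, noise=0.0)  # unperturbed forecast
    forecasts = np.empty((0, mean_state.size))
    u_prev = None
    while forecasts.shape[0] < max_members:
        # (i) perturb the initial mean state, (ii) run one batch of members.
        # In a real workflow each member would be an independent task.
        new = [
            toy_model(mean_state
                      + init_std * rng.standard_normal(mean_state.size), rng)
            for _ in range(batch)
        ]
        forecasts = np.vstack([forecasts, np.array(new)])
        u, s = dominant_subspace(forecasts, central, rank)
        if u_prev is not None:
            # Assumed criterion: fraction of the previous subspace that is
            # captured by the new one (1.0 means identical subspaces).
            overlap = np.linalg.norm(u.T @ u_prev) ** 2 / rank
            if overlap >= tol:
                return u, s, forecasts.shape[0], True
        u_prev = u
    return u, s, forecasts.shape[0], False


if __name__ == "__main__":
    u, s, members, converged = esse_sketch(np.ones(20))
    print(f"members={members} converged={converged}")
```

Because the ensemble members are independent of one another, each batch can
be farmed out as separate tasks, which is what makes the procedure a
natural fit for Many Task Computing.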
We study the execution characteristics and challenges of a distributed
ESSE workflow on a large dedicated cluster, the usability of augmenting
it with runs on Amazon EC2 and the TeraGrid, and the I/O challenges
faced.
We then turn to more tightly coupled applications and the issues they
face on Amazon.
************************************************************************
Massachusetts Institute of Technology
Cambridge, MA 02139