[Crib-list] CCI Talk- Wed. Feb 6, 2013: The Convergence of HPC, BigData, and the Cloud
Shirley Entzminger
daisymae at math.mit.edu
Mon Feb 4 15:22:17 EST 2013
CCI is the Center for Cloud Innovation.
Seminar is at the Hariri Institute at Boston University
URL: http://www.bu.edu/hic/directions/
Title: The Convergence of HPC, BigData, and the Cloud
Speaker: David Cohen
Speaker Affiliation: EMC
DATE: February 6, 2013
TIME: 1:00 PM - 3:00 PM
LOCATION: Hariri Conference Room
Over the past decade or more, the SuperComputing community has refined the
notion of a “Scalable Unit (SU).” This consists of a data center
rack/frame that comes preconfigured with network, storage, and compute
resources in well-defined, balanced ratios. Many of these SUs are
aggregated into larger resource pools via an aggregation layer of
switching infrastructure. The resulting “cluster” provides a partitioning
scheme with supporting software so that per-node resources are
disaggregated and treated independently. Disaggregated resources are
scaled into fabric-wide pools, managed by cluster resource managers that
work in conjunction with job schedulers.
More recently, and emerging in parallel, Cloud Computing has refined the
notions of an SU, fabric-based scaling, and resource disaggregation via
virtualization. Certainly, Amazon Web Services (AWS) stands as the
trailblazer. Of note is that AWS fielded an HPC cluster that entered the
Top500 list at 42nd at the 2011 annual SuperComputing event. Competing
directly with AWS, Google Compute Engine (GCE) and Microsoft Windows Azure
are fast followers. On the HPC front, Windows Azure’s Big Compute entered
the Top500 at 165th at last year’s annual SuperComputing event. Clearly,
these Cloud operators are building infrastructure that can support HPC
workloads.
However, the respective infrastructures of Amazon, Google, and Microsoft
are proprietary systems, closed to innovation from the outside. The
emergence of the OpenStack project is enabling others to transform their
data centers into Cloud infrastructure. Rackspace stands as an
example, while other, smaller entrants include DreamHost and Endurance
International Group. It is in this context that we pose the questions: Can
this so-called Cloud architecture be employed by the Massachusetts Open
Cloud (MOC) initiative? And if so, can such an architecture satisfy the
demands of HPC and BigData workloads?
Dave Cohen’s Bio
Dave Cohen is a Director at EMC, reporting to John Roese, EMC’s CTO. Dave
is responsible for a variety of activities in the area of network
virtualization, especially as it relates to storage and data management.
He is the consummate technologist with a diverse set of skills and
experiences. Over the course of his tenure at EMC, Dave has served as the
Atmos Cloud Storage product group’s acting CTO and most recently has provided
technical leadership in the areas of OpenStack, OpenCompute, and Software
Defined Networking. His efforts in these areas have been key to EMC
joining the OpenStack and OpenCompute communities as well as instrumental
to VMware’s acquisition of Nicira.
Dave joined EMC from Wall Street, where he worked most recently for
Goldman Sachs and previously Merrill Lynch. Over the course of his
30-year career, he has designed, engineered, and successfully
delivered large-scale, distributed systems for numerous enterprises
across industries. Dave is a published author, a sought-after speaker,
and a widely respected practitioner in the field of distributed
computing.
References
DOE/Sandia Cplant – Concepts (see “Scalable Units”)
http://www.cs.sandia.gov/cplant/project/concepts.html
Barney, “Linux Clusters Overview,” 2013 (see “Cluster Configurations
and Scalable Units”)
https://computing.llnl.gov/tutorials/linux_clusters/
Winett, “Building Fast, Scalable I/O Infrastructures for
High-Performance Computing Clusters,” 2005
http://www.dell.com/downloads/global/power/ps4q05-20050332-DataDirect.pdf
Greenberg et al., “Enabling Department-Scale SuperComputing,” 1997
http://dakota.sandia.gov/papers/DeptScaleSC.pdf