[Crib-list] SPEAKER: Brett Smith (Curoverse, Inc.) -- Computational Research in Boston and Beyond Seminar (CRIBB) -- Friday, September 11, 2015 -- TIME: 12Noon in Building 32, Room 141 (Stata)

Shirley Entzminger daisymae at math.mit.edu
Tue Sep 8 14:23:26 EDT 2015



 			COMPUTATIONAL RESEARCH in BOSTON and BEYOND Seminar


DATE:		Friday, September 11, 2015
TIME:		12 Noon
LOCATION:	Building 32, Room 141
 		(Stata Center - 32 Vassar Street, Cambridge)

Pizza and beverages will be provided at 11:45 AM outside Room 32-141.


TITLE:		Arvados: A Free Software Platform for Big Data Science


SPEAKER:	BRETT SMITH   (Curoverse, Inc.)


ABSTRACT:


Large-scale bioinformatics such as genomics requires the application of 
cluster computing, with many nodes working in parallel to produce results 
in a reasonable amount of time.  When a compute job draws on terabytes of 
data, uses days of compute time, and produces thousands of files, robust 
management of data sets and the analysis tools used on them is essential 
to avoid errors that may lead to wasted effort or invalid results. 
To best serve the needs of science, computing platforms should be designed 
from the ground up to achieve data integrity, provenance, and 
computational reproducibility.

This talk will introduce the Arvados (http://arvados.org) platform for 
data science.  Arvados is a software system for managing compute 
clusters, built around a scale-out, content-addressed distributed file 
system (Arvados Keep) for storage, a cluster job queuing system 
designed for reproducibility (Arvados Crunch), and a user and group 
permission system for controlling and sharing access to those resources. 
Arvados provides web-based and command-line tools for transferring, 
managing, sharing, and computing on very large data sets.
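
As a rough illustration of what "content-addressed" storage means, the 
minimal Python sketch below files each block under a hash of its own 
contents, so identical data is stored only once and corruption can be 
detected on read.  The class ToyContentStore and its methods are 
hypothetical names invented for this illustration; they are not the 
Arvados Keep API, and SHA-256 is an arbitrary choice for the sketch.

```python
import hashlib


class ToyContentStore:
    """Hypothetical in-memory content-addressed store (not the Keep API)."""

    def __init__(self):
        self._blocks = {}

    def put(self, data: bytes) -> str:
        # The address is derived from the content itself, so storing the
        # same bytes twice yields the same address and a single copy.
        address = hashlib.sha256(data).hexdigest()
        self._blocks[address] = data
        return address

    def get(self, address: str) -> bytes:
        data = self._blocks[address]
        # Re-hash on read: a mismatch means the stored block was altered.
        if hashlib.sha256(data).hexdigest() != address:
            raise ValueError("block content does not match its address")
        return data

    def __len__(self) -> int:
        return len(self._blocks)


store = ToyContentStore()
first = store.put(b"ACGTACGT")
second = store.put(b"ACGTACGT")  # same content, same address
```

Because an address depends only on the data, recording the addresses of 
a job's inputs is enough to pin down exactly which bytes it read, which 
is one way such a design can support provenance and reproducibility.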

Arvados is designed to scale from a single laptop to cluster and 
cloud-based deployments with dozens of nodes.  Arvados is also designed to 
federate with other Arvados instances, with easy transfer of data and 
computation between instances.  For example, a single command, 
arv-copy, is all that is required to copy a complex computation pipeline 
from a laptop to a cluster or cloud instance (or between instances), where 
that computation can be run immediately with no additional provisioning or 
configuration on the target system.  The Arvados project is also a 
founding member of the Common Workflow Language working group, and 
provides robust support for running computational workflows that are 
portable across multiple vendor platforms.

This talk will describe the Arvados architecture, how Arvados has been 
used successfully in research, and how interested participants can 
download and try Arvados for themselves and join the community.  Arvados 
is free software, with services licensed under the GNU Affero General 
Public License version 3 and SDKs under the Apache License 2.0.

=========================================================

Massachusetts Institute of Technology
Cambridge, MA


For more information about the 'Computational Research in Boston and 
Beyond Seminar' (CRIBB), please visit...

                                 http://math.mit.edu/crib/






