[Crib-list] TODAY: SPEAKER: Brett Smith (Curoverse, Inc.) -- Computational Research in Boston and Beyond Seminar (CRIBB) -- Friday, September 11, 2015 -- TIME: 12Noon in Building 32, Room 141 (Stata)

Shirley Entzminger daisymae at math.mit.edu
Fri Sep 11 10:12:27 EDT 2015

 	T O D A Y . . .


DATE:		Friday, September 11, 2015
TIME:		12 Noon
LOCATION:	Building 32, Room 141
 		(Stata Center - 32 Vassar Street, Cambridge)

Pizza and beverages will be provided at 11:45 AM outside Room 32-141.

TITLE:		Arvados: A Free Software Platform for Big Data Science

SPEAKER:	BRETT SMITH   (Curoverse, Inc.)


Large-scale bioinformatics such as genomics requires the application of cluster 
computing, with many nodes working in parallel to produce results in a 
reasonable amount of time.  When a compute job draws on terabytes of data, uses 
days of compute time, and produces thousands of files, robust management of 
data sets and the analysis tools used on them is essential to avoid errors that 
may lead to wasted effort or invalid results. To best serve the needs of 
science, computing platforms should be designed from the ground up to achieve 
data integrity, provenance, and computational reproducibility.

This talk will introduce the Arvados (http://arvados.org) platform for data 
science.  Arvados is a software system for managing compute clusters, built 
around a scale-out, content-addressed distributed file system (Arvados Keep) 
for storage, a cluster job queuing system designed for reproducibility 
(Arvados Crunch), and a user and group permission system for controlling and 
sharing access to those resources. Arvados provides web-based and command-line 
tools for transferring, managing, sharing, and computing on very large data 
sets.
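
As a rough illustration of what "content-addressed" storage means, here is a 
minimal Python sketch.  It is not Arvados Keep's implementation or API: the 
class name, the in-memory dictionary, and the choice of SHA-256 are all 
assumptions made for the example.  It only shows the general idea that a block 
is stored under an address derived from its own bytes, which gives 
deduplication and integrity checking on read.

```python
import hashlib


class ContentAddressedStore:
    """Toy in-memory store (hypothetical): blocks are keyed by their hash."""

    def __init__(self):
        self._blocks = {}

    def put(self, data: bytes) -> str:
        # The address is derived from the content itself, so identical
        # data always maps to the same address and is stored only once.
        address = hashlib.sha256(data).hexdigest()
        self._blocks[address] = data
        return address

    def get(self, address: str) -> bytes:
        data = self._blocks[address]
        # Re-hash on read: a mismatch means the stored block was corrupted.
        if hashlib.sha256(data).hexdigest() != address:
            raise ValueError("block does not match its address")
        return data


store = ContentAddressedStore()
addr = store.put(b"ACGT" * 4)
same_addr = store.put(b"ACGT" * 4)   # same bytes, same address
other_addr = store.put(b"TTTT")      # different bytes, different address
```

Because an address can only ever refer to one exact sequence of bytes, a 
record of which addresses a job read and wrote also serves as a simple 
provenance trail for that job.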

Arvados is designed to scale from a single laptop to cluster and cloud-based 
deployments with dozens of nodes.  Arvados is also designed to federate with 
other Arvados instances, with easy transfer of data and computation between 
instances.  For example, a single command, arv-copy, is all that is required 
to copy a complex computation pipeline from a laptop to a cluster or cloud 
instance (or between instances), where that computation can be run immediately 
with no additional provisioning or configuration on the target system.  The 
Arvados project is also a founding member of the Common Workflow Language 
working group, and provides robust support for running computational workflows 
that are portable across multiple vendor platforms.

This talk will describe the Arvados architecture, show how Arvados has been 
used successfully in research, and explain how interested participants can 
download and try Arvados for themselves and join the community.  Arvados is 
free software, with services licensed under the GNU Affero General Public 
License version 3 and SDKs under the Apache License 2.0.


Massachusetts Institute of Technology
Cambridge, MA

For more information about the 'Computational Research in Boston and Beyond 
Seminar' (CRIBB), please visit...

