[Crib-list] SPEAKERS: Jeremy Kepner, Vijay Gadepally, et al -- Computational Research in Boston and Beyond Seminar (CRIBB) -- Friday, Feb. 7, 2014 -- TIME: 12:00 Noon in Bldg. 32, Room 141 (Stata Center)

Shirley Entzminger daisymae at math.mit.edu
Tue Feb 4 10:48:05 EST 2014



 		   COMPUTATIONAL RESEARCH in BOSTON and BEYOND SEMINAR


DATE:		FRIDAY, FEBRUARY 7, 2014
TIME:		12:00 Noon
LOCATION:	Building 32, Room 141   (Stata Center)

 	Pizza will be provided at 11:45 AM outside Room 32-141


TITLE:		Computing on Masked Big Data


SPEAKERS:	Jeremy Kepner, Vijay Gadepally, Pete Michaleas,
 		Nabil Schear, Mayank Varia   (MIT-Lincoln Laboratory)


ABSTRACT:

The growing gap between data and users calls for innovative tools that 
address the challenges posed by big data volume, velocity, and variety. 
Along with these three Vs of big data, an increasingly important fourth 
challenge is veracity.  Big data volume stresses the storage, memory, and 
compute capacity of a computing system and requires access to a computing 
cloud.  The velocity of big data stresses the rate at which data can be 
absorbed and meaningful answers produced.  Big data variety requires vast 
quantities of highly diverse data (text, computer logs, social media 
data, etc.) to be automatically ingested.  Traditional techniques for 
assuring the veracity of data incur overheads that are often too large to 
apply to big data, and there is increasing interest in investigating 
alternative techniques.  Computing on Masked Data (CMD) is one such low 
overhead technique that allows data to be masked, operated on, and then 
unmasked when the answers are desired.  CMD relies on the sparse linear 
algebra of associative arrays to transform computations from a space where 
+ and * are the primary low-level operations to one where =, >, and < are 
the primary low-level operations.  Databases with strong support for sparse 
operations (such as SciDB or Apache Accumulo) are ideally suited to this 
technique.  A demonstration of the technique on DNA sequence data shows 
how DNA data can be masked, a complex DNA matching algorithm can be 
performed on the masked DNA data, and the result can be unmasked to reveal 
the true answer.  CMD can be performed with significantly less overhead 
than other approaches while also supporting a full range of linear 
algebraic operations on the masked data.
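
As a rough illustration of the mask / operate / unmask flow described 
above, the following minimal Python sketch may help.  It is not the 
speakers' implementation, and everything in it is an assumption made for 
illustration: the masking scheme (a deterministic keyed hash via HMAC, 
chosen only because it preserves equality), the dictionary standing in for 
a sparse associative array, the helper names, and the toy sequences.  DNA 
strings are split into short substrings (k-mers), each k-mer is masked, 
and two sequences are matched using only equality tests on masked values.

```python
import hashlib
import hmac


def kmers(seq, k):
    """Return the set of length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def mask(token, key):
    """Deterministic keyed mask (assumed scheme): equal tokens give equal masks."""
    return hmac.new(key, token.encode(), hashlib.sha256).hexdigest()


def build_masked_array(sequences, k, key):
    """Build a sparse associative array {(sequence_id, masked_kmer): 1}.

    Also returns the table the data owner keeps to unmask results later.
    """
    array = {}
    unmask_table = {}
    for seq_id, seq in sequences.items():
        for kmer in kmers(seq, k):
            masked = mask(kmer, key)
            array[(seq_id, masked)] = 1
            unmask_table[masked] = kmer
    return array, unmask_table


def shared_columns(array, row_a, row_b):
    """Columns present in both rows, found using only equality on masked keys."""
    cols_a = {col for (row, col) in array if row == row_a}
    cols_b = {col for (row, col) in array if row == row_b}
    return cols_a & cols_b


# Hypothetical toy data and key, for illustration only.
key = b"owner-secret-key"
sequences = {"sample": "ACGTAC", "reference": "CGTACG"}

# The owner masks the data; the matching step sees only masked k-mers.
masked_array, unmask_table = build_masked_array(sequences, 4, key)
masked_matches = shared_columns(masked_array, "sample", "reference")

# The owner unmasks the answer.
matches = sorted(unmask_table[m] for m in masked_matches)
print(matches)
```

The number of shared masked columns equals the ("sample", "reference") 
entry of the product of this 0/1 array with its transpose, which is one 
way to read the abstract's shift from + and * to = as the low-level 
operation.  Range comparisons (> and <) would need an order-preserving 
mask, which this sketch does not attempt.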

*********************************************************************************

Massachusetts Institute of Technology
Cambridge, MA

For more information, please visit...

 			http://math.mit.edu/crib/




