[Crib-list] Speaker: NOBUAKI TOUNAKA : CRIBB Seminar : Friday, August 2, 2013 -- TIME: 12:00 Noon : Building 66, Room 168

Shirley Entzminger daisymae at math.mit.edu
Fri Jul 26 21:01:11 EDT 2013



 	COMPUTATIONAL RESEARCH in BOSTON and BEYOND SEMINAR (CRIBB)

NOTE Location
-------------

DATE:		FRIDAY, AUGUST 2, 2013
TIME:		12:00 Noon
LOCATION:	MIT Building 66, Room 168  -- (Landau Building)
ADDRESS:	25 Ames Street, Cambridge

Pizza and beverages will be provided at 11:45 AM outside Room 66-168


TITLE:	  How to Analyze 50 Billion Records in Less than a
 	  Second without Hadoop or Big Iron


SPEAKER:  NOBUAKI TOUNAKA
 	  Developer of the Unicage Shell-Based Data Analytics Framework
 	  President and CEO  Universal Shell Programming Laboratory Ltd.


ABSTRACT:

Today, if you need to perform complex analytics on datasets of tens to hundreds 
of billions of records in a reasonable amount of time, you either need to set 
up a very large Hadoop cluster or use expensive big iron. Not only does this 
increase the cost of your project, but it also slows you down significantly 
both in the programming phase and in the execution phase.

Nobuaki Tounaka is visiting us from Japan to present the Unicage framework, 
accompanied by live demonstrations. Unicage is a complete high-performance 
data analytics package implemented entirely in a Unix shell. 
It consists of a customized shell called the Unicage Shell (ush) based on the 
Bourne shell but with much more robust error handling and better pipelining 
performance. It also includes over 200 Unicage Commands that implement the 
database and analytics functionality. Alongside traditional SQL-equivalent 
commands, it also provides import/export, data formatting and a complete set of 
statistical tools based on R. However, all of these commands have been 
optimized for high performance and compiled to take maximum advantage of system 
resources.

Unicage is fully consistent with the Unix philosophy; all programs are written 
in shell script so they are extremely easy and fast to develop, making ad-hoc 
analysis from the command line a reality. Unicage's clustering technology (BOA, 
BigData Oriented Architecture) scales linearly using the Bubun File System 
developed by USP Lab, and therefore does not suffer from the diminishing 
performance at scale caused by the overhead of other cluster technologies such 
as Hadoop. Unicage 
has native support for parallel processing through the optimized pipelining 
support included in the Unicage Shell, making it ideal for applications such as 
real-time ETL of huge amounts of spatio-temporal data points or real-time 
mathematical analysis of billions of data points using the R toolkit.

Mr. Tounaka will share benchmarks comparing the processing speed of Unicage 
with Hadoop and other technologies.  He will also show live demonstrations of 
several large-scale projects he has conducted jointly with Japanese 
universities, including genomics research and traffic engineering projects 
involving enormous datasets. Free evaluation copies of Unicage will be offered 
to all CRIBB participants starting on July 2, 2013. Simply visit en.usp-lab.com 
and click on "Get Unicage Now".

****************************************************************************

Massachusetts Institute of Technology
Cambridge, MA  02139

For more information, please visit...

 			http://math.mit.edu/crib


