[Crib-list] Speaker: NADYA BLISS -- Computational Research in Boston Seminar -- Friday, May 2, 2008 -- TIME: 12:30 PM -- Room 32-144

Wed Apr 23 18:35:46 EDT 2008

 		COMPUTATIONAL RESEARCH in BOSTON SEMINAR

DATE:		Friday, MAY 2, 2008
TIME:		12:30 PM
LOCATION:	Building 32, Room 144  (Stata Center)

(Pizza and beverages will be provided at 12:15 PM outside room.)

TITLE:		SMaRT: Sparse Mapping and Routing Toolbox

SPEAKER:	NADYA BLISS  (MIT - Lincoln Lab)

ABSTRACT:

Over the last few years, MIT Lincoln Laboratory has been developing 
techniques for automatic algorithm-to-architecture mapping of signal 
processing applications.  Due to the nature of signal processing 
operations, modeling, program analysis, and mapping techniques have been 
limited to dense matrix computations.  As more post-processing algorithms 
move to the sensor front end, efficient parallel implementations of those 
algorithms for sparse data are becoming necessary.  Post-processing 
algorithms are typically represented in graphical form, but can also be 
formulated using sparse matrix notation.  The efficiency of the graph 
algorithms is often only a small fraction (< 0.01) of the peak throughput 
of a conventional architecture, and the efficiency deteriorates as the 
problem size increases.  Furthermore, graph algorithms are not well suited 
for parallelization.  Sparse matrix formulations expose intrinsic 
algorithm parallelism; however, efficient mappings are still needed to 
exploit the parallelism and deliver desirable efficiency.

This talk will motivate the need for efficient sparse matrix processing 
with an example from the graph algorithm community and highlight a key 
computational kernel: sparse matrix-matrix multiply.  We will first 
introduce a high-level overview of our dense mapping framework and 
highlight its limitations with regards to parallelization of sparse 
computations.  We will then show how SMaRT overcomes these limitations 
using fine-grain architecture modeling, dependency analysis, mapping and 
routing techniques.  Finally, we will present results for a number of 
implementations of the matrix multiply kernel.  Our results show over an 
order of magnitude performance improvement over traditional sparse 2D 
block-cyclic mappings.  This performance improvement can be attributed to 
detailed runtime analysis of the computation, sensitivity to the sparsity 
of the data, co-optimized map and route determination, and the detailed 
hardware modeling capability.

This is joint work with Una-May O'Reilly at CSAIL and Sanjeev Mohindra at 
Lincoln Laboratory.

This work is sponsored by the Department of the Air Force under Air Force 
contract FA8721-05-C-0002. Opinions, interpretations, conclusions and 
recommendations are those of the author and are not necessarily endorsed 
by the United States Government

*****************************************************************************

Massachusetts Institute of Technology
Cambridge, MA  02139

http://www-math. mit.edu/crib

For information on CRiB, contact:

Alan Edelman:  edelman at math.mit.edu
Steven G. Johnson:  stevenj at math.mit.edu
Jeremy Kepner:  kepner at ll.mit.edu
Patrick Dreher:  dreher at mit.edu