Massive Graphs in Clusters
Many of today's data-intensive application domains,
including searches on social networks like Facebook and protein
matching in bioinformatics, require us to answer complex queries on
highly connected data. This data is often represented as a graph of
data objects densely connected by edges. The UCSB Massive Graphs in
Clusters (MAGIC) project is focused on developing software
infrastructure that can efficiently answer queries on extremely
large graph datasets. The MAGIC software will provide an easy-to-use
interface for searching and analyzing data, and will manage the
processing of these queries to make efficient use of computing
resources such as large data centers.
Advances from this project will include techniques to distribute and manage data across large data centers, a high-level interface for writing graph queries, and a software infrastructure that builds on and extends cluster-based software systems such as MapReduce and Dryad.
The MAGIC project is funded by the National Science Foundation under the CLuE (Cluster Exploratory) program.