DARPA HIVE – GraphChallenge

The DARPA HIVE Program: Providing Fuel for the Rocket

Large amounts of data are collected from numerous sources, such as social media, sensor feeds, and scientific data. Graph analytics has emerged as a way to understand the relationships between these types of data, allowing analysts to draw conclusions from patterns in the data and to ask hard questions that previously could not have been answered.

By understanding the complex relationships between different data feeds, analysts can form a more complete picture of the problem. With lessons learned from innovations in the expanding realm of deep neural networks, the Defense Advanced Research Projects Agency’s (DARPA) Hierarchical Identify Verify Exploit (HIVE) program seeks to advance graph analytics.

The DARPA HIVE program is looking to build a graph analytics processor that can process streaming graphs 1,000 times faster and at much lower power than current processing technology. This will provide the rocket for the advancement of graph analytics. In parallel with the development of the HIVE processor, DARPA is hosting the HIVE Challenge to develop a trillion-edge dataset, along with solutions, that will provide the fuel for that rocket. The goal is to accelerate innovation in graph analytics and open new pathways for meeting the challenge of understanding an ever-increasing torrent of data.

Organizers will provide participants with specifications, datasets, data generators, and serial implementations in various languages. As part of the Challenge, AWS and DARPA have entered into a collaborative agreement, making DARPA the first Department of Defense (DoD) agency to participate in the AWS Public Datasets program. Additionally, eligible researchers working on the DARPA HIVE Challenge are encouraged to apply for AWS usage credits via the AWS Cloud Credits for Research program.

“The right dataset is crucial to innovation. Our participation in the AWS Public Datasets program will allow researchers to test algorithms with real and simulated data made available at no cost via AWS,” said Trung Tran, DARPA MTO Program Manager. “This is particularly valuable to researchers who may not otherwise have been able to compete. With AWS, researchers will now be able to deploy their own clusters in the cloud without needing access to a physical data center.”

There are two initial challenges.

The first is a static graph problem focused on subgraph isomorphism: searching a large graph to identify a particular subsection of that graph.
The second is a streaming graph problem focused on finding the optimal partition of the graph at each stage of observation. Both challenges will begin with the goal of processing large graphs with billions of edges and move towards even larger graphs with trillions of edges.
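
To make the first challenge concrete, here is a minimal sketch of a subgraph search in Python. It is not the Challenge's reference implementation: the function names (build_adjacency, find_subgraph_matches) and the toy edge lists are hypothetical, and the brute-force backtracking shown here is for illustration only and would not scale to graphs with billions of edges. It looks for every place where a small pattern graph (here, a triangle) appears inside a larger undirected graph, requiring only that each pattern edge is present in the larger graph.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map {vertex: set(neighbors)} from edge pairs."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return dict(adj)


def find_subgraph_matches(pattern_edges, graph_edges):
    """Return every injective mapping of pattern vertices onto graph vertices
    such that each pattern edge is also an edge in the graph.

    Hypothetical helper for illustration; plain backtracking, no pruning heuristics.
    """
    pattern = build_adjacency(pattern_edges)
    graph = build_adjacency(graph_edges)
    order = sorted(pattern)  # fixed order in which pattern vertices are assigned
    matches = []

    def extend(mapping):
        if len(mapping) == len(order):
            matches.append(dict(mapping))
            return
        p = order[len(mapping)]  # next pattern vertex to place
        for g in graph:
            if g in mapping.values():
                continue  # keep the mapping injective
            # Every already-placed neighbor of p must land on a neighbor of g.
            if all(mapping[q] in graph[g] for q in pattern[p] if q in mapping):
                mapping[p] = g
                extend(mapping)
                del mapping[p]

    extend({})
    return matches


# Toy data (assumed for this example): a triangle 1-2-3 with a tail 3-4.
graph_edges = [(1, 2), (2, 3), (1, 3), (3, 4)]
triangle = [("a", "b"), ("b", "c"), ("a", "c")]

matches = find_subgraph_matches(triangle, graph_edges)
distinct_vertex_sets = {frozenset(m.values()) for m in matches}
print(len(matches), distinct_vertex_sets)
```

The same triangle is reported once per vertex ordering, so collapsing the mappings to distinct vertex sets gives the number of places the pattern actually occurs.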