The Office of the Director of National Intelligence (ODNI) is building a computer system capable of automatically analyzing the massive quantities of data gathered across the intelligence community and extracting information on specific entities and their relationships to one another. The system, called Catalyst, is part of a larger ODNI effort to create software and computer systems for knowledge management, entity extraction, and semantic integration, enabling greater analysis and understanding of complex, multi-source intelligence throughout the government.
Catalyst, a component of DDNI/A’s Analytical Transformation Program, will process unstructured, semistructured, and structured data to produce a knowledge base of entities (people, organizations, places, events, …) with associated attributes and the relationships among them. It will perform entity extraction, relationship extraction, semantic integration, persistent storage of entities, disambiguation, and related functions (these are defined in the body of the report). The objective of this study is to assess the state of the art and the state of the practice in these areas.
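To make the idea of a knowledge base of entities, attributes, and relationships concrete, here is a minimal illustrative sketch. It is not Catalyst's design, which the text above does not specify. Every name in it (Entity, Relationship, KnowledgeBase, the identifiers, and the sample values) is hypothetical, and the "disambiguation" shown is a deliberately naive stand-in that treats records sharing an identifier as the same entity and merges their attributes.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """A hypothetical entity record: an id, a type, and free-form attributes."""
    entity_id: str
    entity_type: str  # e.g. "person", "organization", "place", "event"
    attributes: dict = field(default_factory=dict)


@dataclass(frozen=True)
class Relationship:
    """A hypothetical directed, labeled link between two entity ids."""
    source_id: str
    relation: str
    target_id: str


class KnowledgeBase:
    """Toy in-memory store; a real system would use persistent storage."""

    def __init__(self):
        self.entities = {}
        self.relationships = set()

    def add_entity(self, entity):
        # Naive disambiguation (an assumption for illustration only):
        # records with the same id are merged into one entity.
        existing = self.entities.get(entity.entity_id)
        if existing is None:
            self.entities[entity.entity_id] = entity
        else:
            existing.attributes.update(entity.attributes)

    def relate(self, source_id, relation, target_id):
        # Both endpoints must already be known entities.
        if source_id not in self.entities or target_id not in self.entities:
            raise KeyError("both entities must exist before relating them")
        self.relationships.add(Relationship(source_id, relation, target_id))

    def neighbors(self, entity_id):
        # Outgoing (relation, target) pairs for one entity, sorted for stability.
        return sorted(
            (r.relation, r.target_id)
            for r in self.relationships
            if r.source_id == entity_id
        )


kb = KnowledgeBase()
kb.add_entity(Entity("p1", "person", {"name": "A. Example"}))
kb.add_entity(Entity("o1", "organization", {"name": "Example Corp"}))
# A second mention of the same person, arriving from another source.
kb.add_entity(Entity("p1", "person", {"role": "analyst"}))
kb.relate("p1", "member_of", "o1")
```

In this sketch the two mentions of "p1" collapse into a single entity carrying both attributes, which is the effect disambiguation and semantic integration aim for. Actual systems match entities on far weaker evidence than a shared identifier.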