“The Defense Advanced Research Projects Agency (DARPA) is soliciting proposals for innovative research to maintain technological superiority in the area of content indexing and web search on the Internet,” the agency announced last week.
Dubbed Memex (a combination of “memory” and “index”) after a hypothetical device described in a 1945 article by Vannevar Bush, director of the US Office of Scientific Research and Development during World War II, the project’s goal is to develop software that will enable domain-specific indexing of public web content and domain-specific search capabilities.
“Today’s web search is limited by a one-size-fits-all approach offered by web-scale commercial providers. They provide a centralized search, which has limitations in the scope of what gets indexed and the richness of available details,” the agency explained. “For example, common practice misses information in the deep web and ignores shared content across pages. Today’s largely manual search process does not save sessions or allow sharing, requires nearly exact input with one at a time entry, and doesn’t organize or aggregate results beyond a list of links.”
Initially, the program aims to help with a key Defense Department mission: fighting human trafficking.
“Human trafficking is a line of business with significant web presence to attract customers and is relevant to many types of military, law enforcement, and intelligence investigations,” they noted. “The use of forums, chats, advertisements, job postings, hidden services, etc., continues to enable a growing industry of modern slavery. An index curated for the counter trafficking domain (which includes labor and sex trafficking), along with configurable interfaces for search and analysis will enable a new opportunity to defeat trafficking enterprises.”
This project has already helped in at least one human trafficking investigation.
“Memex plans to explore three technical areas of interest: domain-specific indexing, domain-specific search, and DoD-specified applications,” DARPA noted, adding that the program is specifically not interested in proposals for attributing anonymous services, deanonymizing or attributing identity to servers or IP addresses, or accessing information not intended to be publicly available.