RetrievalWare

Developer(s): Fast Search & Transfer, Convera, Excalibur Technologies, ConQuest Software, Microsoft
Stable release: 8.2 / October 13, 2006
Written in: C, C++, Java
Operating system: Cross-platform
Type: Search and Index

RetrievalWare is an enterprise search engine emphasizing natural language processing and semantic networks. It was commercially available from 1992 to 2007 and is especially known for its use by government intelligence agencies.[1]

History

RetrievalWare was initially created by Paul Nelson,[2] Kenneth Clark,[3] and Edwin Addison[4] as part of ConQuest Software. Development began in 1989, but the software was not commercially available on a wide scale until 1992. Early funding was provided by Rome Laboratory via a Small Business Innovation Research grant.[5]

On July 6, 1995, ConQuest Software merged with the NASDAQ-listed company Excalibur Technologies,[6] and the product was rebranded as RetrievalWare. On December 21, 2000, Excalibur Technologies was combined with Intel Corporation's Interactive Media Services division to form the Convera Corporation.[7] Finally, on April 9, 2007, the RetrievalWare software and business were purchased by Fast Search & Transfer, at which point the product was officially retired.[8] Microsoft Corporation continues to maintain the product for its existing customer base.

Annual revenues for RetrievalWare peaked in 2001 at around US$40 million.[9]

Use of natural language techniques

RetrievalWare is a relevancy-ranking text search system with processing enhancements drawn from the fields of natural language processing (NLP) and semantic networks. Its NLP algorithms include dictionary-based stemming (also known as lemmatisation) and dictionary-based phrase identification. RetrievalWare uses semantic networks to expand the query words entered by the user to related terms, with term weights determined by the distance from the user's original terms. In addition to automatic expansion, a feedback mode was available in which users could choose the intended meaning of a word before the expansion was performed. The first semantic networks were built using WordNet.
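The distance-weighted expansion described above can be illustrated with a minimal sketch. It is not RetrievalWare's implementation: the toy network, the function name expand_query, the max_distance cutoff, and the exponential decay of weight with distance are all assumptions chosen for illustration, since the source does not give the actual weighting function.

```python
from collections import deque

# Hypothetical toy semantic network: each word maps to directly related words.
# Illustrative data only; not taken from RetrievalWare or WordNet.
NETWORK = {
    "car": ["automobile", "vehicle"],
    "automobile": ["car"],
    "vehicle": ["car", "truck"],
    "truck": ["vehicle"],
}


def expand_query(terms, network, max_distance=2, decay=0.5):
    """Return {term: weight} for the query terms and their related terms.

    Assumption: weight = decay ** distance, where distance is the number
    of links from the nearest original query term.
    """
    weights = {}
    queue = deque()
    for term in terms:
        weights[term] = 1.0          # original terms keep full weight
        queue.append((term, 0))
    # Breadth-first search reaches every word by its shortest path first.
    while queue:
        word, dist = queue.popleft()
        if dist == max_distance:
            continue                 # do not expand beyond the cutoff
        for neighbor in network.get(word, []):
            if neighbor not in weights:
                weights[neighbor] = decay ** (dist + 1)
                queue.append((neighbor, dist + 1))
    return weights


expanded = expand_query(["car"], NETWORK)
```

With this toy network, a query for "car" adds "automobile" and "vehicle" at distance one and "truck" at distance two, each with a smaller weight than the original term.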

In addition, RetrievalWare implemented a form of n-gram search (branded as APRP, Adaptive Pattern Recognition Processing[10]), designed to search over documents with OCR errors. Query terms are divided into sets of 2-grams, which are used to locate similar terms in the inverted index. The resulting matches are weighted based on similarity measures and then used to search for documents.
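A minimal sketch of 2-gram term matching follows. It is not the APRP algorithm itself: the function names, the underscore padding of terms, the threshold, and the use of the Dice coefficient as the similarity measure are assumptions made for illustration, because the source does not specify which similarity measures were used.

```python
def bigrams(term):
    """Return the set of 2-grams of a term, padded to mark word boundaries."""
    padded = f"_{term.lower()}_"
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def build_bigram_index(vocabulary):
    """Map each 2-gram to the set of indexed terms that contain it."""
    index = {}
    for term in vocabulary:
        for gram in bigrams(term):
            index.setdefault(gram, set()).add(term)
    return index


def similar_terms(query_term, index, threshold=0.4):
    """Return (term, score) pairs whose Dice similarity meets the threshold.

    Assumption: Dice coefficient = 2 * |A & B| / (|A| + |B|) over 2-gram sets.
    """
    query_grams = bigrams(query_term)
    # Only terms sharing at least one 2-gram with the query are candidates.
    candidates = set()
    for gram in query_grams:
        candidates |= index.get(gram, set())
    scored = []
    for term in candidates:
        term_grams = bigrams(term)
        dice = 2 * len(query_grams & term_grams) / (len(query_grams) + len(term_grams))
        if dice >= threshold:
            scored.append((term, dice))
    # Best matches first; ties broken alphabetically for a stable order.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


# "retrieva1" simulates an OCR error (digit 1 in place of the letter l).
index = build_bigram_index(["retrieval", "retrieva1", "network"])
matches = similar_terms("retrieval", index)
```

Here the misrecognized term "retrieva1" still shares most of its 2-grams with the query "retrieval" and is returned as a weighted match, while "network" falls below the threshold.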

All of these features were available no later than 1993,[11] and ConQuest Software has claimed that it was the first commercial text-search system to implement these techniques.[12]

Other notable features

Other notable features of RetrievalWare include distributed search servers,[11] synchronizers for indexing external content management systems and relational databases,[13] a heterogeneous security model,[13] document categorization,[13] real-time document-query matching (profiling),[11] multi-lingual searches (queries containing terms from multiple languages searching for documents containing terms from multiple languages), and cross-lingual searches (queries in one language searching for documents in a different language).[14]

Participation in TREC

RetrievalWare participated in the Text REtrieval Conference in 1992 (TREC-1), 1993 (TREC-2), and 1995 (TREC-4).[15]

In TREC-1[16] and TREC-4,[17] the RetrievalWare runs for manually entered queries produced the best results, based on 11-point averages, of all search engines that participated in the ad hoc category. In this category, search engines are allowed a single opportunity to process previously unknown queries against an existing database.
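For readers unfamiliar with the metric, the sketch below computes an 11-point interpolated average precision for a single query using the common textbook definition: precision is interpolated at the recall levels 0.0, 0.1, ..., 1.0 and the eleven values are averaged. The function name and the sample ranking are hypothetical, and the exact evaluation procedure used at TREC-1 and TREC-4 (including how scores were averaged across queries) is not described in the source.

```python
def eleven_point_average(ranked_relevance, total_relevant):
    """11-point interpolated average precision for one ranked result list.

    ranked_relevance: booleans in rank order (True = relevant document).
    total_relevant:   number of relevant documents that exist for the query.
    """
    # Record (recall, precision) each time a relevant document is retrieved.
    points = []
    hits = 0
    for rank, is_relevant in enumerate(ranked_relevance, start=1):
        if is_relevant:
            hits += 1
            points.append((hits / total_relevant, hits / rank))

    interpolated = []
    for i in range(11):
        level = i / 10
        # Interpolated precision: best precision at any recall >= this level.
        # The small tolerance guards against floating-point rounding.
        reachable = [p for r, p in points if r >= level - 1e-12]
        interpolated.append(max(reachable, default=0.0))
    return sum(interpolated) / 11


# Hypothetical ranking: relevant documents at ranks 1 and 3, two relevant overall.
score = eleven_point_average([True, False, True, False], total_relevant=2)
```

Recall levels that the ranking never reaches contribute a precision of zero, so a system that misses relevant documents is penalized at the high-recall end of the average.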

References

  1. ^ Vise, David A. (2004-12-03). "Agencies Find What They're Looking For". The Washington Post. Retrieved 2010-05-22.[dead link]
  2. ^ "Paul Nelson, Innovation Lead, Content Analytics at Accenture Analytics". Retrieved 1 December 2020.
  3. ^ "Arden & Ken". comcast.net. 23 July 2011. Archived from the original on 2011-07-23.
  4. ^ "Ed Addison, Serial Entrepreneur, Venture Capitalist, Business Executive, Professor".
  5. ^ FY 1991 SBIR SOLICITATION - PHASE I AWARD ABSTRACTS - AIR FORCE PROJECTS - VOLUME III (PDF), 1992-07-06, pp. 70–71, archived from the original (PDF) on June 4, 2011 - Note that "Synchronetics" was the original name for ConQuest Software Incorporated. John McGrath joined the company in 1993 as VP of Sales and Marketing. The company quickly grew revenue from U.S. federal contracts, publishers, and enterprise customers requiring advanced text retrieval accuracy and performance.
  6. ^ "Excalibur Technologies to merge with ConQuest Software; text and multimedia information retrieval leaders join forces to expand products, channels and markets" (Press release). Business Wire. 1995-07-06.
  7. ^ "Intel and Excalibur Form Convera Corporation". Silicon Valley / San Jose Business Journal. 2000-12-21.
  8. ^ "FAST Acquires Convera's RetrievalWare Business". Information Today, Inc. 2007-04-09. While FAST will continue to support the RetrievalWare platform, it will not continue development on it or add new features. RetrievalWare customers will be offered an upgrade path to FAST's own offering.
  9. ^ Convera Corp · 10-K · For 1/1/01, 2001-01-01 - Indicates that Convera products accounted for 85% of the total revenue of $51.5 million.
  10. ^ Excalibur Announces Excalibur RetrievalWare 6.5 Featuring RetrievalWare FileRoom - Contains a description of APRP
  11. ^ a b c Site Report for the Text REtrieval Conference by ConQuest Software Inc. (TREC-2), in the TREC-2 proceedings.
  12. ^ "Homework Helper debuts on Prodigy using ConQuest search engine" (Press release). Business Wire. 1995-02-09. ConQuest is the only search engine which uses dictionaries, thesauri and other lexical resources to build in a semantic knowledgebase of over 440,000 word meanings, and 1.6 million word relationships.
  13. ^ a b c "Excalibur RetrievalWare: more than information retrieval". KMWorld. 1999-10-01.
  14. ^ "Multimedia search, retrieval, categorization". KMWorld. 2002-03-25.
  15. ^ Flank, Sharon (1998). "A Layered Approach to NLP-Based Information Retrieval". Proceedings of the 36th annual meeting on Association for Computational Linguistics. Vol. 1. dl.acm.org. p. 397. doi:10.3115/980845.980913. S2CID 581537. Retrieved 1 December 2020.
  16. ^ Site Report for the Text REtrieval Conference by ConQuest Software Inc. (TREC-1), in the TREC-1 proceedings.
  17. ^ The Excalibur TREC-4 System, Preparations, and Results (PDF), in the TREC-4 proceedings. Archived 2010-11-27 at the Wayback Machine.