A thesaurus is a book that lists words grouped together according to similarity of meaning (containing synonyms and sometimes antonyms), in contrast to a dictionary, which contains definitions and pronunciations. The largest thesaurus in the world is the Historical Thesaurus of the Oxford English Dictionary, which contains more than 920,000 words. (Wikipedia)
Semantic assistance by thesaurus
A thesaurus is defined by a lexicon and the relations between its words. It is used to clarify the query, to fight noise and silence, and to handle vocabulary specific to a user or an application.
Tightly coupling the thesaurus management functions (development, consultation, autoposting (semantic expansion), updating, and so on) with the full-text indexing and search functions improves overall system performance at all levels.
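To illustrate the coupling between thesaurus and full-text search, here is a minimal sketch, assuming a toy in-memory inverted index. The names `THESAURUS`, `build_index` and `search` are hypothetical and do not come from any particular product; the point is only that the query term is expanded with its thesaurus entries before the index lookup.

```python
from collections import defaultdict

# Hypothetical thesaurus: each term maps to equivalent terms (illustrative data).
THESAURUS = {
    "car": {"automobile"},
    "automobile": {"car"},
}

def build_index(docs):
    """Build a toy inverted index: word -> set of document ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

def search(index, term, expand=True):
    """Look up a term, optionally expanded with its thesaurus entries."""
    terms = {term.lower()}
    if expand:
        terms |= THESAURUS.get(term.lower(), set())
    hits = set()
    for t in terms:
        hits |= index.get(t, set())
    return hits

docs = {1: "red car for sale", 2: "used automobile dealer", 3: "bicycle repair"}
index = build_index(docs)
plain = search(index, "car", expand=False)  # literal match only
expanded = search(index, "car")             # match after semantic expansion
```

Without expansion the query "car" misses the document that only says "automobile"; with expansion both documents are found.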
A thesaurus can define several kinds of relations:
- abbreviations and acronyms
- hierarchy and polyhierarchy
- orthographical variants
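The relation types listed above can be sketched as a small data structure. This is an illustrative assumption, not a standard format: the `Thesaurus` class, its field names and the sample entries are all hypothetical. A term may have several broader terms, which is what polyhierarchy means here.

```python
from dataclasses import dataclass, field

@dataclass
class Thesaurus:
    """Toy thesaurus holding the three relation types listed above."""
    abbreviations: dict = field(default_factory=dict)  # short form -> full form
    variants: dict = field(default_factory=dict)       # spelling variant -> preferred spelling
    broader: dict = field(default_factory=dict)        # term -> set of broader terms

    def normalize(self, term):
        """Map an abbreviation or spelling variant to its preferred term."""
        term = term.lower()
        term = self.abbreviations.get(term, term)
        return self.variants.get(term, term)

    def ancestors(self, term):
        """Collect all broader terms; a term may have several parents (polyhierarchy)."""
        seen = set()
        stack = [self.normalize(term)]
        while stack:
            for parent in self.broader.get(stack.pop(), set()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen

# Illustrative sample data only.
th = Thesaurus(
    abbreviations={"ir": "information retrieval"},
    variants={"colour": "color"},
    broader={
        "information retrieval": {"computer science", "library science"},
        "computer science": {"science"},
        "library science": {"science"},
    },
)
```

Calling `th.normalize("IR")` resolves the acronym to its full form, and `th.ancestors("IR")` walks up both parent branches of the hierarchy.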
What are noise and silence in information retrieval?
When searching for information, noise corresponds to the irrelevant information that you get.
Silence corresponds to relevant information that you don't get.
Technologies such as the thesaurus are used to limit noise and silence in search engine solutions.
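Noise and silence can be quantified for a given query. The sketch below assumes they are expressed as rates: noise as the share of returned results that are irrelevant, silence as the share of relevant documents that were not returned (that is, 1 minus precision and 1 minus recall). The function name `noise_and_silence` and the sample document ids are hypothetical.

```python
def noise_and_silence(retrieved, relevant):
    """Return (noise rate, silence rate) for one query.

    noise   = irrelevant documents returned / documents returned
    silence = relevant documents missed     / relevant documents
    """
    retrieved, relevant = set(retrieved), set(relevant)
    noise = len(retrieved - relevant) / len(retrieved) if retrieved else 0.0
    silence = len(relevant - retrieved) / len(relevant) if relevant else 0.0
    return noise, silence

# Illustrative data: 4 documents returned, 3 documents actually relevant.
noise, silence = noise_and_silence(retrieved={1, 2, 3, 4}, relevant={2, 4, 5})
```

In this example two of the four returned documents are irrelevant (noise), and one of the three relevant documents is missing from the results (silence).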