Expanding Lexicons with Graph Manifolds: A New Path for Semantic Discovery
A novel method called Local Graph-based Dictionary Expansion (LGDE) leverages the geometry of word embeddings and network science to significantly improve keyword discovery for information retrieval tasks. The technique constructs a semantic similarity graph from word embeddings and uses local community detection via graph diffusion to explore complex, nonlinear semantic relationships. This approach captures nuanced word associations beyond simple pairwise similarity, enabling the effective expansion of seed dictionaries. Validated on user-generated English corpora, LGDE outperformed traditional methods based on direct word similarities or co-occurrences, proving particularly useful in a real-world case expanding a conspiracy-related dictionary for communication science research.
Study Significance: For professionals in natural language processing and information retrieval, this method offers a more robust, data-driven tool for tasks like query expansion and online data collection. By moving beyond static word embeddings to capture dynamic semantic neighborhoods, it addresses a key limitation in current text mining and corpus linguistics workflows. This advancement could refine how large language models and other AI systems understand and retrieve contextually relevant information, directly impacting the accuracy of downstream applications like semantic search and topic modeling.
Source →Stay curious. Stay informed — with Science Briefing.
Always double check the original article for accuracy.
