DOI: 10.1145/1242572.1242675

Measuring semantic similarity between words using web search engines

Published: 08 May 2007

Published In

WWW '07: Proceedings of the 16th international conference on World Wide Web
May 2007
1382 pages
ISBN: 9781595936547
DOI: 10.1145/1242572
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Article

Conference

WWW'07: 16th International World Wide Web Conference
May 8 - 12, 2007
Banff, Alberta, Canada

Acceptance Rates

Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

Article Metrics

  • Downloads (Last 12 months): 64
  • Downloads (Last 6 weeks): 4
Reflects downloads up to 15 Feb 2025

Cited By

  • (2024) A study of concept similarity in Wikidata. Semantic Web 15:3 (877-896). DOI: 10.3233/SW-233520. Online publication date: 14-May-2024
  • (2024) Shaping the Future of Content-based News Recommenders: Insights from Evaluating Feature-Specific Similarity Metrics. Proceedings of the 32nd ACM Conference on User Modeling, Adaptation and Personalization (201-211). DOI: 10.1145/3627043.3659560. Online publication date: 22-Jun-2024
  • (2023) Semantic Similarity Measures. Phenotropic Interaction (49-69). DOI: 10.1007/978-3-031-42819-7_4. Online publication date: 4-Nov-2023
  • (2021) AMFF. Knowledge-Based Systems 233:C. DOI: 10.1016/j.knosys.2021.107525. Online publication date: 5-Dec-2021
  • (2020) Comorbidities, clinical signs and symptoms, laboratory findings, imaging features, treatment strategies, and outcomes in adult and pediatric patients with COVID-19: A systematic review and meta-analysis. Travel Medicine and Infectious Disease (101825). DOI: 10.1016/j.tmaid.2020.101825. Online publication date: Aug-2020
  • (2019) DIS-C. Knowledge and Information Systems 59:1 (33-65). DOI: 10.1007/s10115-018-1200-3. Online publication date: 1-Apr-2019
  • (2019) A probabilistic model for semantic advertising. Knowledge and Information Systems 59:2 (387-412). DOI: 10.1007/s10115-018-1160-7. Online publication date: 1-May-2019
  • (2018) Similarity analysis of product-line variants. Proceedings of the 22nd International Systems and Software Product Line Conference - Volume 1 (226-235). DOI: 10.1145/3233027.3233044. Online publication date: 10-Sep-2018
  • (2018) ClassiNet -- Predicting Missing Features for Short-Text Classification. ACM Transactions on Knowledge Discovery from Data 12:5 (1-29). DOI: 10.1145/3201578. Online publication date: 27-Jun-2018
  • (2018) Incorporating Prior Knowledge into Word Embedding for Chinese Word Similarity Measurement. ACM Transactions on Asian and Low-Resource Language Information Processing 17:3 (1-21). DOI: 10.1145/3182622. Online publication date: 2-Apr-2018
