GermaNet

GermaNet is a lexical-semantic net for the German language. It relates nouns, verbs, and adjectives semantically by grouping lexical units that express the same concept into synsets and by defining semantic relations between these synsets.[1] GermaNet has much in common with the English WordNet and can be viewed as an online thesaurus or a light-weight ontology. It has been developed and maintained since 1997 within various projects at the research group for General and Computational Linguistics, University of Tübingen. It has been integrated into EuroWordNet, a multilingual lexical-semantic database.[2]

Database

Contents

GermaNet partitions the lexical space into a set of concepts that are interlinked by semantic relations. A semantic concept is modeled by a synset. A synset is a set of words (called lexical units) that are all taken to have the same or almost the same meaning. Thus a synset is a set of synonyms grouped under one definition, or "gloss".

In addition to the gloss, synsets are labeled with their syntactic function and accompanied by example sentences for each distinct meaning in the synset.[3] Just as in WordNet, for each word category the semantic space is divided into a number of semantic fields closely related to major nodes in the semantic network: Ort, or "location", Körper, or "body", etc.[2]
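The structure described above can be illustrated with a minimal sketch. The class names, field names, identifiers, and example words below are assumptions chosen for illustration; they are not taken from the GermaNet data or from any GermaNet API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class LexicalUnit:
    """One word form that belongs to a synset (hypothetical model)."""
    orth_form: str


@dataclass
class Synset:
    """A set of lexical units sharing one meaning (hypothetical model)."""
    synset_id: str
    word_category: str   # "noun", "verb", or "adjective"
    semantic_field: str  # illustrative label, e.g. "Ort" or "Körper"
    gloss: str
    lexical_units: List[LexicalUnit] = field(default_factory=list)
    hypernyms: List["Synset"] = field(default_factory=list)

    def synonyms(self) -> List[str]:
        """Return the word forms grouped in this synset."""
        return [lu.orth_form for lu in self.lexical_units]


def hypernym_chain(synset: Synset) -> List[str]:
    """Follow the first hypernym link upward and collect synset ids."""
    chain = []
    current = synset
    while current.hypernyms:
        current = current.hypernyms[0]
        chain.append(current.synset_id)
    return chain


# Illustrative data only; not copied from GermaNet.
citrus = Synset("s2", "noun", "Nahrung", "citrus fruit",
                [LexicalUnit("Zitrusfrucht")])
orange = Synset("s1", "noun", "Nahrung", "orange (fruit)",
                [LexicalUnit("Apfelsine"), LexicalUnit("Orange")],
                hypernyms=[citrus])

print(orange.synonyms())
print(hypernym_chain(orange))
```

In this sketch the synonymy relation is implicit in synset membership, while a relation between concepts (here, hypernymy) is stored as a link between synsets.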

The following statistics describe the contents of GermaNet version 11.0 (released May 2016):

  • Number of synsets: 110167
  • Number of Lexical Units: 142814
  • Number of Literals: 126348
  • Number of Conceptual Relations: 123678
  • Number of Lexical Relations (synonymy excluded): 4203
  • Number of Split Compounds: 66047
  • Number of Interlingual Index (ILI) Records: 28567
  • Number of Wiktionary Sense Descriptions: 28552[2]
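As a rough reading aid, a few ratios can be derived from the figures listed above. The following sketch only divides the published counts; treating the results as per-synset averages is a simplifying assumption, since the source does not say how relations are counted.

```python
# Counts copied from the version 11.0 statistics listed above.
STATS = {
    "synsets": 110167,
    "lexical_units": 142814,
    "conceptual_relations": 123678,
}


def ratio(numerator_key: str, denominator_key: str, stats=STATS) -> float:
    """Divide one published count by another."""
    return stats[numerator_key] / stats[denominator_key]


units_per_synset = ratio("lexical_units", "synsets")
relations_per_synset = ratio("conceptual_relations", "synsets")

print(f"lexical units per synset: {units_per_synset:.2f}")
print(f"conceptual relations per synset: {relations_per_synset:.2f}")
```

On these figures a synset contains about 1.3 lexical units on average, which suggests that many synsets hold a single lexical unit.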

Format

All GermaNet data is stored in a relational PostgreSQL database. The database model follows the internal structure of GermaNet: there are tables to store synsets, lexical units, conceptual and lexical relations, etc.[3] The distribution format of all GermaNet data is XML. Two types of files, one for synsets and the other for relations, represent all data that is available in the GermaNet database.
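A minimal sketch of how the two kinds of XML files might be read is given below. The element names, attribute names, and sample content are illustrative assumptions and may not match the actual GermaNet XML schema; the sketch only shows the general idea of loading synsets from one document and relations from another, using Python's standard library.

```python
import xml.etree.ElementTree as ET

# Hypothetical synset file content (schema assumed for illustration).
SYNSETS_XML = """
<synsets>
  <synset id="s1" category="noun">
    <lexUnit id="l1"><orthForm>Apfelsine</orthForm></lexUnit>
    <lexUnit id="l2"><orthForm>Orange</orthForm></lexUnit>
  </synset>
  <synset id="s2" category="noun">
    <lexUnit id="l3"><orthForm>Zitrusfrucht</orthForm></lexUnit>
  </synset>
</synsets>
"""

# Hypothetical relation file content (schema assumed for illustration).
RELATIONS_XML = """
<relations>
  <con_rel name="has_hypernym" from="s1" to="s2"/>
</relations>
"""


def load_synsets(xml_text):
    """Map each synset id to the word forms of its lexical units."""
    root = ET.fromstring(xml_text.strip())
    return {
        synset.get("id"): [lu.findtext("orthForm")
                           for lu in synset.findall("lexUnit")]
        for synset in root.findall("synset")
    }


def load_relations(xml_text):
    """Return conceptual relations as (source, name, target) triples."""
    root = ET.fromstring(xml_text.strip())
    return [(rel.get("from"), rel.get("name"), rel.get("to"))
            for rel in root.findall("con_rel")]


synsets = load_synsets(SYNSETS_XML)
relations = load_relations(RELATIONS_XML)

# Resolve each relation's endpoints to their word forms.
for source, name, target in relations:
    print(synsets[source], name, synsets[target])
```

Keeping synsets and relations in separate documents mirrors the split between the synset tables and the relation tables in the database model described above.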

Interfaces

Several application programming interfaces (APIs) are available for Java[4] and for Perl. These APIs are distributed freely and provide access to all information in various versions of GermaNet.

Licenses

GermaNet 11.0 (released May 2016) is free for academic use. It can be distributed under one of the following types of license agreements:

  • Academic Research Agreement: free for the research purposes of academic institutions. Licenses are not given to individuals, and those seeking a license are required to talk to an academic advisor.
  • Research and Development Agreement: applies to non-academic institutions and research consortia. To be used strictly for technology development and internal research.
  • Commercial Agreement: applies to non-academic institutions and commercial enterprises. It permits technology development and internal research, as well as giving the non-exclusive right to distribute and market any derived product or service.[5]

Applications

GermaNet has been used for a variety of applications, including semantic analysis, shallow recognition of implicit document structure, and compound analysis,[6] as well as for analyzing selectional preferences[7] and for word sense disambiguation.[8]

References

  1. ^ Petra Storjohann (23 June 2010). Lexical-semantic relations: theoretical and practical perspectives. John Benjamins Publishing Company. pp. 165–. ISBN 978-90-272-3138-3. Retrieved 16 November 2011. 
  2. ^ a b c GermaNet homepage
  3. ^ a b V. Henrich, E. Hinrichs. 2010. GernEdiT - The GermaNet Editing Tool. In: Proceedings of the Seventh Conference on International Language Resources and Evaluation.
  4. ^ GermaNet APIs in Java
  5. ^ "Licenses". www.sfs.uni-tuebingen.de. Retrieved 2017-03-26. 
  6. ^ Manuela Kunze and Dietmar Rösner. 2004. Issues in Exploiting GermaNet as a Resource in Real Applications.
  7. ^ Sabine Schulte im Walde, 2004. GermaNet Synsets as Selectional Preferences in Semantic Verb Clustering.
  8. ^ Saito et al., 2002. Evaluation of GermanNet: Problems Using GermaNet for Automatic Word Sense Disambiguation.