Metadata of Linguistic Resources: History and Current State

 
PII S160578800018917-4-1
DOI 10.31857/S160578800018917-4
Publication type Article
Status Published
Authors
Occupation: Head Researcher
Affiliation: The Institute of Scientific Information for Social Sciences of the RAS (INION RAN)
Address: 51-21 Nakhimovskiy Prospect, Moscow, 117997, Russia
Journal name Izvestiia Rossiiskoi akademii nauk. Seriia literatury i iazyka
Edition Volume 81 Issue 1
Pages 21-36
Abstract

This article describes the main metadata projects for linguistic (language) resources developed over the past 20 years. These include the IMDI initiative, the OLAC metadata system, the META-SHARE metadata model, the International Standard Language Resource Number (ISLRN), the LRE Map of language resources, and the CLARIN component metadata model. The content of the ISO metadata standard is outlined, along with projects for creating dictionaries, ontologies, and lexical databases for the metadata of language resources.

Keywords Metadata, linguistic resources, language resources, standards, dictionaries, ontologies
Received 15.06.2021
Publication date 11.03.2022
Number of characters 48720

1. A Proposal for a Meta Description Standard for Language Resources https://www.mpi.nl/ISLE/documents/papers/white_paper_11.pdf

2. Metadata Elements for Lexicon Descriptions https://www.mpi.nl/ISLE/documents/draft/ISLE_Lexicon_1.0.pdf

3. IMDI Team, (August 2001), Vocabulary Taxonomy and Structure, Version 1.1, MPI Nijmegen

4. Mapping IMDI Session Descriptions with OLAC Draft Proposal Version 1.0 August, 2001 IMDI Technical Report Max-Planck-Institute for Psycholinguistics NL, Nijmegen

5. Arbil for editing and managing IMDI metadata. Version 2.6. https://www.mpi.nl/corpus/html/arbil-imdi/index.html

6. IMDI Documents https://www.mpi.nl/ISLE/documents/docs_frame.html

7. OLAC Metadata http://olac.ldc.upenn.edu/OLAC/metadata.html

8. OLAC Metadata Usage Guidelines http://olac.ldc.upenn.edu/NOTE/usage.html

9. Dublin Core XML https://dcxml.readthedocs.io/en/latest/

10. Documentation and User Manual of the META-SHARE Metadata Model http://www.meta-net.eu/public_documents/t4me/META-NET-D7.2.4-Final.pdf

11. Gavrilidou, M., Labropoulou, P., Piperidis, S., Speranza, M., Monachini, M., Arranz, V., Francopoulo, G. META-NET Deliverable D7.2.1 – Specification of Metadata-Based Descriptions for Language Resources and Technologies, 2011, http://t4me.dfki.de/intranet/document_repository/deliverables/wp07-infrastructure-functional-and-technical-specification/meta-net-d7.2.1-final.pdf/view

12. Technologies for the Multilingual European Information Society. Specification of metadata-based descriptions for language resources and technologies. Penny Labropoulou, Maria Gavrilidou, Elina Desipri, Stelios Piperidis (R.C. Athena, ILSP), Francesca Frontini, Monica Monachini (ILC, CNR), Victoria Arranz (ELDA), Gil Francopoulo (LIMSI). Final Report, 2012 http://www.meta-net.eu/public_documents/t4me/META-NET-D7.2.2-Final.pdf

13. International Standard Language Resource Number http://www.islrn.org/

14. LRE map http://www.elra.info/en/catalogues/lre-map/

15. Component Metadata https://www.clarin.eu/content/component-metadata

16. CMDI 1.2 specification Version 1 Date 2016-10-20 https://office.clarin.eu/v/CE-2016-0880-CMDI_12_specification.pdf

17. CMDI 1.2 https://www.clarin.eu/cmdi1.2

18. CMDI Best Practices Guide https://www.clarin.eu/content/cmdi-best-practices-guide

19. AP3-007-CMDI_and_granularity.pdf https://www.clarin.eu/media/1790

20. CMDI-first-aid-kit.pdf https://www.clarin.eu/sites/default/files/CMDI-first-aid-kit.pdf

21. Component Registry Documentation. Component Registry, Browser and Editor Reference Manual https://www.clarin.eu/content/component-registry-documentation

22. CLARIN Concept Registry https://www.clarin.eu/ccr

23. Virtual Language Observatory (VLO) https://www.clarin.eu/content/virtual-language-observatory-vlo

24. Poiskovye servisy i instrumenty Instituta Meertensa [Search Services and Tools of the Meertens Institute] https://www.meertens.knaw.nl/cmdi/search/#q=*%3A* (In Russ.)

25. Fedora_OAI_Konfiguration_v3.pdf https://www.clarin-d.net/images/leipzig/Fedora_OAI_Konfiguration_v3.pdf

26. IDS Repository Architecture and Ingest Pipelines http://repos.ids-mannheim.de/reposdescription.html

27. Linguistic Data and NLP Tools. About metadata https://lindat.mff.cuni.cz/repository/xmlui/page/metadata

28. ISO 24622-1:2015 Language resource management – Component Metadata Infrastructure (CMDI) – Part 1: The Component Metadata Model https://www.iso.org/ru/standard/37336.html

29. ISO 24622-2:2019 Language resource management – Component metadata infrastructure (CMDI) – Part 2: Component metadata specification language https://www.iso.org/obp/ui/#iso:std:iso:24622:-2:ed-1:v1:en

30. ISO 12620:2009 Terminology and other language and content resources – Specification of data categories and management of a Data Category Registry for language resources https://www.iso.org/standard/37243.html

31. ISO 12620:2019 Management of terminology resources – Data category specifications https://www.iso.org/standard/69550.html

32. GOST R ISO 12620-2012 Terminologiya, drugie yazykovye resursy i resursy soderzhaniya. Spetsifikatsiya kategorij dannykh i vedenie reestra kategorij dannykh dlya yazykovykh resursov http://docs.cntd.ru/document/1200104401 [GOST R ISO 12620-2012 Terminologiya, drugie yazykovye resursy i resursy soderzhaniya. Specifikaciya kategorij dannyh i vedenie reestra kategorij dannyh dlya yazykovyh resursov [GOST R ISO 12620-2012 Terminology, Other Language Resources and Content Resources. Specification of Data Categories and Maintaining a Register of Data Categories for Language Resources] http://docs.cntd.ru/document/1200104401 (In Russ.)].

33. The Center for Sustainability of Linguistic Data (NaLiDa) http://www.sfs.uni-tuebingen.de/nalida/en/

34. Rational Reconstruction for TDG Metadata http://www.sfs.uni-tuebingen.de/nalida/images/isocat/isocat_hierarchy.html

35. Data Category Repository (DCR) http://datcatinfo.net/

36. TERMWEB https://datcatinfo.termweb.se/termweb/app

37. CLARIN Concept Registry Browser https://concepts.clarin.eu/ccr/browser/

38. Linguistic Metadata (LIME) vocabulary https://lod-cloud.net/dataset/lime

39. About the ontology. What is LexInfo? https://lexinfo.net/

40. Antopol'skij A.B., Savchuk S.O., Tameev A.A. O razrabotke ontologii poiskovykh terminov po lingvistike // Informatsionnye resursy Rossii. 2020. № 4. S. 2–7. [Antopolsky, A.B., Savchuk, S.O., Tameev, A.A. O razrabotke ontologii poiskovyh terminov po lingvistike [On the Development of an Ontology of Search Terms in Linguistics] Informacionnye resursy Rossii [Information Resources of Russia]. 2020, No. 4, pp. 2–7. (In Russ.)].

41. Ontologiya poiskovykh terminov po lingvistike http://db.inion.ru/optel/ [Ontologiya poiskovyh terminov po lingvistike [Ontology of Search Terms in Linguistics] http://db.inion.ru/optel/ (In Russ.)].

42. Antopol'skij A.B., Maksimov N.V., Tameev A.A. Ehksperimental'naya baza dannykh istochnikov dlya sozdaniya ontologii po lingvistike // Informatsionnye resursy Rossii. 2021. № 3. S. 24–30. DOI: 10.46920/0204-3653_2021_03181_24 [Antopolsky, A.B., Maksimov, N.V., Tameev, A.A. Eksperimentalnaya baza dannyh istochnikov dlya sozdaniya ontologii po lingvistike [Experimental Database of Sources for Creating an Ontology on Linguistics]. Informacionnye resursy Rossii [Information Resources of Russia]. 2021, No. 3, pp. 24–30. DOI: 10.46920/0204-3653_2021_03181_24 (In Russ.)].
