Aims of the Tutorial
- Give an overview of Ontology Learning techniques as well as a synthesis of approaches
- Provide a ‘start kit’ for Ontology Learning
- Highlight interdisciplinary aspects and opportunities for a combination of techniques
Structure of the Tutorial
- Part I Introduction – Philipp Cimiano
- Part II Ontologies in Knowledge Management & Ontology Life Cycle – Michael Sintek
- Part III Methods in Ontology Learning from Text – Paul Buitelaar & Philipp Cimiano
- Part IV Ontology Evaluation – Marko Grobelnik
- Part V Tools for Ontology Learning from Text – All
- Wrap-up – Paul Buitelaar
- AI: Knowledge Acquisition. Since 60s/70s: Semantic Network Extraction and similar for Story Understanding. Systems: e.g. MARGIE (Schank et al., 1973), LUNAR (Woods, 1973)
- NLP: Lexical Knowledge Extraction. 70s/80s: Extraction of Lexical Semantic Representations from Machine-Readable Dictionaries. Systems: e.g. ACQUILEX LKB (Copestake et al.). 80s/90s: Extraction of Semantic Lexicons from Corpora for Information Extraction. Systems: e.g. AutoSlog (Riloff, 1993), CRYSTAL (Soderland et al., 1995)
- IR: Thesaurus Extraction. Since 60s: Extraction of Keywords, Thesauri and Controlled Vocabularies. Based on construction and use of thesauri in IR (Sparck-Jones, 1966/1986, 1971). Systems: e.g. Sextant (Grefenstette, 1992), DR-Link (Liddy, 1994)
Ontologies in Computer Science
- Ontology refers to an engineering artifact:
- It is constituted by a specific vocabulary used to describe a certain reality, as well as a set of explicit assumptions regarding the intended meaning of the vocabulary.
- An ontology is an explicit specification of a conceptualization. ([Gruber 93])
- An ontology is a shared understanding of some domain of interest. ([Uschold & Gruninger 96])
Why Develop an Ontology?
- To make domain assumptions explicit
- Easier to change domain assumptions
- Easier to understand and update legacy data
- To separate domain knowledge from operational knowledge
- Re-use domain and operational knowledge separately
- A community reference for applications
- To share a consistent understanding of what information means
Tools for Ontology Learning from Text
SEKTbar: User profiling
Jožef Stefan Institute
A Web-based user profile is automatically generated while the user is browsing the Web.
It is represented in the form of a user-interest-hierarchy (UIH).
The root node holds the user’s general interest, while the leaves hold more specific interests.
The UIH is generated using a hierarchical k-means clustering algorithm.
Nodes of current interest are determined by comparing UIH node centroids to the centroid computed out of the m most recently visited pages.
The user profile is visualized on the SEKTbar (Internet Explorer Toolbar).
The user can select a node in the hierarchy to see its specific keywords and associated pages (documents).
Availability: open source (C++, .NET)
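The UIH construction and the matching of current interests described above for SEKTbar can be illustrated with a minimal Python sketch. It is not the SEKTbar implementation (which is C++/.NET); it only shows the general idea under several assumptions: pages are already given as term-weight vectors, the hierarchy is built by recursive 2-means splits (one common form of hierarchical k-means), similarity is cosine similarity, and a node counts as "of current interest" when its similarity to the recent-pages centroid reaches a threshold. The names build_uih, nodes_of_interest, min_split and threshold, the seeding rule, and the sample data are all hypothetical.

```python
import math

def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def cosine(a, b):
    """Cosine similarity; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def two_means(vectors, iters=10):
    """Split vectors into two groups with a plain 2-means loop.

    Assumption: seeds are the first vector and the vector least similar
    to it, which keeps the sketch deterministic.
    """
    seeds = [vectors[0], min(vectors, key=lambda v: cosine(v, vectors[0]))]
    groups = ([], [])
    for _ in range(iters):
        groups = ([], [])
        for v in vectors:
            nearest = 0 if cosine(v, seeds[0]) >= cosine(v, seeds[1]) else 1
            groups[nearest].append(v)
        if not groups[0] or not groups[1]:
            break  # degenerate split, e.g. all vectors identical
        seeds = [centroid(groups[0]), centroid(groups[1])]
    return groups

def build_uih(vectors, min_split=4):
    """Recursively bisect page vectors into a tree; each node keeps its centroid."""
    node = {"centroid": centroid(vectors), "size": len(vectors), "children": []}
    if len(vectors) >= min_split:
        left, right = two_means(vectors)
        if left and right:
            node["children"] = [build_uih(left, min_split),
                                build_uih(right, min_split)]
    return node

def nodes_of_interest(root, recent_vectors, threshold=0.9):
    """Return (similarity, node) pairs, best first, for nodes whose centroid
    is close to the centroid of the m most recently visited pages."""
    target = centroid(recent_vectors)
    hits, stack = [], [root]
    while stack:
        node = stack.pop()
        sim = cosine(node["centroid"], target)
        if sim >= threshold:
            hits.append((sim, node))
        stack.extend(node["children"])
    return sorted(hits, key=lambda h: h[0], reverse=True)

# Hypothetical data: term weights over the vocabulary (ontology, football, recipe).
pages = [[5, 0, 0], [4, 1, 0], [0, 5, 0], [0, 4, 1], [6, 0, 1], [1, 6, 0]]
recent = [[5, 0, 0], [4, 0, 1]]  # the m = 2 most recently visited pages

uih = build_uih(pages)
for sim, node in nodes_of_interest(uih, recent):
    print(round(sim, 3), node["size"], node["centroid"])
```

In this toy data the root mixes both topics, so only the more specific "ontology" child is close enough to the recent-pages centroid to be reported. A real system would derive the page vectors from text (for example TF-IDF weights) and would likely tune the threshold or simply rank nodes by similarity.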
© Paul Buitelaar, Philipp Cimiano, Marko Grobelnik, Michael Sintek: Ontology Learning from Text. Tutorial at ECML/PKDD, Oct. 2005, Porto, Portugal.