Simulating Human Associations with Linked Data

In recent years, enormous progress has been made in the field of Artificial Intelligence (AI). In particular, the introduction of Deep Learning and end-to-end learning, the availability of large datasets, and the necessary computational power in the form of specialised hardware have allowed researchers to build systems with previously unseen performance in areas such as computer vision, machine translation and machine gaming. In parallel, the Semantic Web and its Linked Data movement have published many interlinked RDF datasets, forming the world’s largest decentralised and publicly available knowledge base.

Despite these scientific successes, all current systems are still narrow AI systems. Each of them is specialised for a specific task and cannot easily be adapted to all other human intelligence tasks, as would be necessary for Artificial General Intelligence (AGI). Furthermore, most currently developed systems cannot learn by making use of freely available knowledge such as that provided by the Semantic Web. Autonomous incorporation of new knowledge is, however, one of the preconditions for human-like problem solving.

This work provides a small step towards teaching machines such human-like reasoning on freely available knowledge from the Semantic Web. We investigate how human associations, one of the building blocks of our thinking, can be simulated with Linked Data. The two main results of these investigations are a ground truth dataset of semantic associations and a machine learning algorithm that is able to identify patterns for them in huge knowledge bases. The ground truth dataset consists of DBpedia entities that are known to be strongly associated by humans. It is published as RDF and can be used for future research.

The developed machine learning algorithm is an evolutionary algorithm that can learn SPARQL queries from a given SPARQL endpoint based on a given list of exemplary source-target entity pairs. The algorithm operates in an end-to-end learning fashion, extracting features in the form of graph patterns without the need for human intervention. The learned patterns form a feature space adapted to the given list of examples and can be used to predict target candidates from the SPARQL endpoint for new source nodes. On our semantic association ground truth dataset, our evolutionary graph pattern learner reaches a Recall@10 of > 63% and an MRR (and MAP) of > 43%, outperforming all baselines. With a Recall@1 of > 34%, it even reaches average human top response prediction performance. We also demonstrate how the graph pattern learner can be applied to other interesting areas without modification.
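The Recall@k and MRR figures quoted in the abstract are standard ranking metrics. The following minimal sketch shows how they are conventionally computed for source-target pairs. It is not the thesis's evaluation code: the function names, the dictionary layout and the toy `dbr:` entities are assumptions made purely for illustration. When each source has exactly one relevant target, average precision reduces to the reciprocal rank, which is presumably why the abstract reports MRR and MAP together.

```python
from typing import Dict, Sequence


def recall_at_k(ranked: Sequence[str], target: str, k: int) -> float:
    """Return 1.0 if the target is among the first k ranked candidates, else 0.0."""
    return 1.0 if target in ranked[:k] else 0.0


def reciprocal_rank(ranked: Sequence[str], target: str) -> float:
    """Return 1/rank of the target (1-based), or 0.0 if it was not predicted."""
    for rank, candidate in enumerate(ranked, start=1):
        if candidate == target:
            return 1.0 / rank
    return 0.0


def evaluate(
    predictions: Dict[str, Sequence[str]],
    ground_truth: Dict[str, str],
    k: int = 10,
) -> Dict[str, float]:
    """Average Recall@k and reciprocal rank over all ground truth sources.

    Sources without any prediction count as misses.
    """
    n = len(ground_truth)
    recall = sum(
        recall_at_k(predictions.get(s, []), t, k) for s, t in ground_truth.items()
    )
    mrr = sum(
        reciprocal_rank(predictions.get(s, []), t) for s, t in ground_truth.items()
    )
    return {f"recall@{k}": recall / n, "mrr": mrr / n}


# Hypothetical toy data (not taken from the thesis dataset).
TOY_GROUND_TRUTH = {
    "dbr:Cat": "dbr:Dog",
    "dbr:Sun": "dbr:Star",
    "dbr:Bread": "dbr:Butter",
}
TOY_PREDICTIONS = {
    "dbr:Cat": ["dbr:Dog", "dbr:Mouse"],                   # target at rank 1
    "dbr:Sun": ["dbr:Moon"],                               # target missing
    "dbr:Bread": ["dbr:Knife", "dbr:Toast", "dbr:Butter"]  # target at rank 3
}

if __name__ == "__main__":
    print(evaluate(TOY_PREDICTIONS, TOY_GROUND_TRUTH, k=10))
    print(evaluate(TOY_PREDICTIONS, TOY_GROUND_TRUTH, k=1))
```

On the toy data, two of the three targets appear within the top 10 and one appears at rank 1, so Recall@10 is 2/3, Recall@1 is 1/3, and MRR is (1 + 0 + 1/3) / 3 = 4/9.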

Metadata
Author:Jörn Hees
URN:urn:nbn:de:hbz:386-kluedo-54309
Advisor:Andreas Dengel, Heiko Paulheim
Document Type:Doctoral Thesis
Language of publication:English
Date of Publication (online):2018/12/08
Year of first Publication:2018
Publishing Institution:Technische Universität Kaiserslautern
Granting Institution:Technische Universität Kaiserslautern
Acceptance Date of the Thesis:2018/04/09
Date of the Publication (Server):2018/12/10
Tag:SPARQL query learning; artificial intelligence; associations; dataset; embedding; end-to-end learning; evolutionary algorithm; graph embedding; linked data; machine learning; semantic web
GND Keyword:Evolutionary Algorithm; Association; Semantic Web; Linked Data; Artificial Intelligence; SPARQL; Machine Learning
Page Number:XIII, 216
Faculties / Organisational entities:Kaiserslautern - Fachbereich Informatik
CCS-Classification (computer science):G. Mathematics of Computing / G.3 PROBABILITY AND STATISTICS / Probabilistic algorithms (including Monte Carlo)
I. Computing Methodologies / I.2 ARTIFICIAL INTELLIGENCE / I.2.0 General
I. Computing Methodologies / I.2 ARTIFICIAL INTELLIGENCE / I.2.6 Learning (K.3.2)
I. Computing Methodologies / I.5 PATTERN RECOGNITION / I.5.0 General
J. Computer Applications / J.4 SOCIAL AND BEHAVIORAL SCIENCES / Psychology
DDC-Classification:0 General, computer science, information science / 004 Computer science
MSC-Classification (mathematics):05-XX COMBINATORICS (For finite fields, see 11Txx) / 05Cxx Graph theory (For applications of graphs, see 68R10, 81Q30, 81T15, 82B20, 82C20, 90C35, 92E10, 94C15) / 05C60 Isomorphism problems (reconstruction conjecture, etc.) and homomorphisms (subgraph embedding, etc.)
68-XX COMPUTER SCIENCE (For papers involving machine computations and programs in a specific mathematical area, see Section -04 in that area) 68-00 General reference works (handbooks, dictionaries, bibliographies, etc.) / 68Txx Artificial intelligence / 68T01 General
68-XX COMPUTER SCIENCE (For papers involving machine computations and programs in a specific mathematical area, see Section -04 in that area) 68-00 General reference works (handbooks, dictionaries, bibliographies, etc.) / 68Txx Artificial intelligence / 68T05 Learning and adaptive systems [See also 68Q32, 91E40]
68-XX COMPUTER SCIENCE (For papers involving machine computations and programs in a specific mathematical area, see Section -04 in that area) 68-00 General reference works (handbooks, dictionaries, bibliographies, etc.) / 68Txx Artificial intelligence / 68T10 Pattern recognition, speech recognition (For cluster analysis, see 62H30)
68-XX COMPUTER SCIENCE (For papers involving machine computations and programs in a specific mathematical area, see Section -04 in that area) 68-00 General reference works (handbooks, dictionaries, bibliographies, etc.) / 68Txx Artificial intelligence / 68T30 Knowledge representation
68-XX COMPUTER SCIENCE (For papers involving machine computations and programs in a specific mathematical area, see Section -04 in that area) 68-00 General reference works (handbooks, dictionaries, bibliographies, etc.) / 68Txx Artificial intelligence / 68T37 Reasoning under uncertainty
68-XX COMPUTER SCIENCE (For papers involving machine computations and programs in a specific mathematical area, see Section -04 in that area) 68-00 General reference works (handbooks, dictionaries, bibliographies, etc.) / 68Txx Artificial intelligence
90-XX OPERATIONS RESEARCH, MATHEMATICAL PROGRAMMING / 90Cxx Mathematical programming [See also 49Mxx, 65Kxx] / 90C06 Large-scale problems
90-XX OPERATIONS RESEARCH, MATHEMATICAL PROGRAMMING / 90Cxx Mathematical programming [See also 49Mxx, 65Kxx] / 90C35 Programming involving graphs or networks [See also 90C27]
91-XX GAME THEORY, ECONOMICS, SOCIAL AND BEHAVIORAL SCIENCES / 91Exx Mathematical psychology / 91E40 Memory and learning [See also 68T05]
PACS-Classification (physics):00.00.00 GENERAL / 07.00.00 Instruments, apparatus, and components common to several branches of physics and astronomy (see also each subdiscipline for specialized instrumentation and techniques) / 07.05.-t Computers in experimental physics; Computers in education, see 01.50.H- and 01.50.Lc; Computational techniques, see 02.70.-c; Quantum computation architectures and implementations, see 03.67.Lx; Optical computers, see 42.79.Ta / 07.05.Mh Neural networks, fuzzy logic, artificial intelligence
Licence:Creative Commons 4.0 - Attribution, NonCommercial, NoDerivatives (CC BY-NC-ND 4.0)