Analogy between Concepts (slides)
Abstract: Analogical reasoning exploits parallels between situations. It enables us to state analogies for explanation purposes, to draw plausible conclusions, or to create new devices by transposing old ones into new contexts. As is widely acknowledged, reasoning by analogy therefore plays an important role in human thinking. For this reason, it has long been studied in philosophy, linguistics, cognitive psychology and artificial intelligence, under various approaches. The classical view of analogy describes the parallel between two situations, which are described in terms of objects, properties of the objects, and relations linking the objects. An analogical situation may also be expressed by proportions. In that respect, a key pattern is a statement of the form ‘A is to B as C is to D’, as in the examples “a calf is to a bull as a foal is to a stallion”, or “gills are to fishes as lungs are to mammals”. In the first example, the four items A, B, C, D belong to the same category, and we speak of analogical proportion. In the second example, the four items belong to two different categories: here A and C are organs, and B and D are classes of animals. In this second type of analogical statement, we speak of relational analogy or relational proportion.
It is only recently that these proportions have been systematically studied in terms of mathematical properties and of applications in AI tasks. As important examples, an exhaustive study of the “logical” proportions (in propositional logic) has been produced by H. Prade and G. Richard; proportions between formal structures, including words over finite alphabets, have been studied by Y. Lepage, N. Stroppa and F. Yvon; the use of proportions for Pattern Recognition has been shown to be useful by S. Bayoud and L. Miclet. In this spirit, we present in this talk our results on proportions in lattices, with a focus on concept lattices.
Hence, the goal of this talk is to give a formalization of the analogical proportion between four elements of a general lattice, to see how it applies in Formal Concept Lattices, and to give operational definitions of an analogical proportion between formal concepts and of a relational proportion in a formal context. More precisely, concerning lattices of formal concepts, we will describe how an analogical proportion can be defined between concepts and how such proportions are deeply related to subcontexts of a special structure, which we call analogical complexes. We will also illustrate the fact that the set of these complexes is itself structured as a lattice. In that way, we will demonstrate how analogical proportions between concepts can be constructed from the formal context, without constructing the whole lattice of concepts. We will also give an acceptable definition of what a relational proportion can be in terms of formal concept analysis, and illustrate it with some linguistic examples.
The talk will be mainly illustrative. No purely formal results or proofs will be given; instead, the talk offers examples and illustrations of the new notions and of how the algorithms run.
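As a small illustration of the “logical” proportions mentioned in the abstract, the sketch below checks the Boolean analogical proportion 'A is to B as C is to D' and applies it attribute by attribute to the calf/bull/foal/stallion example. It is not the speaker's lattice-based formalization: the function names and the attribute encoding (bovine, equine, young, adult) are assumptions made here for illustration only.

```python
from itertools import product


def analogical_proportion(a: bool, b: bool, c: bool, d: bool) -> bool:
    """Boolean analogical proportion a : b :: c : d.

    It holds when a differs from b exactly as c differs from d:
    (a and not b) == (c and not d) and (not a and b) == (not c and d).
    """
    return ((a and not b) == (c and not d)) and ((not a and b) == (not c and d))


def vector_proportion(a, b, c, d) -> bool:
    """Apply the Boolean proportion componentwise to attribute vectors."""
    return all(analogical_proportion(w, x, y, z) for w, x, y, z in zip(a, b, c, d))


# Hypothetical attribute encoding: (bovine, equine, young, adult).
calf = (True, False, True, False)
bull = (True, False, False, True)
foal = (False, True, True, False)
stallion = (False, True, False, True)

# The Boolean 4-tuples (a, b, c, d) for which the proportion holds.
valid_patterns = [p for p in product([False, True], repeat=4)
                  if analogical_proportion(*p)]

if __name__ == "__main__":
    print(vector_proportion(calf, bull, foal, stallion))
    print(len(valid_patterns))
```

Under this encoding the proportion holds on every attribute, and exactly six of the sixteen Boolean 4-tuples satisfy the definition.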
Bio: Laurent Miclet is Professor Emeritus at IRISA. His research field is Artificial Intelligence. He has been especially working on Speech Recognition, Grammar Induction and Formal Analogy. He has recently co-authored the book “Conception d’Algorithmes” (Eyrolles, 2016).
Abstract: Semantic web and information extraction technologies are enabling the creation of vast information and knowledge repositories, in the form of knowledge graphs comprising entities and the relationships between them. As the volume of such graph-structured data continues to grow, it has the potential to enable users’ knowledge expansion in application areas such as web information retrieval, formal and informal learning, scientific research, health informatics, entertainment, and cultural heritage. However, users are unlikely to be familiar with the complex structures and vast content of such datasets and hence need to be assisted by tools that support interactive exploration and flexible querying.
Recent work has proposed techniques for automatic approximation and relaxation of users’ queries over knowledge graphs, allowing query answers to be incrementally returned in order of their distance from the original form of the query. In this context, approximating a query means applying an edit operation to the query so that it can return possibly different answers, while relaxing a query means applying a relaxation operation to it so that it can return possibly more answers. Edit operations include insertion, deletion or substitution of a property label, while relaxation operations include replacing a class by a superclass, or a property by a superproperty.
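The relaxation operation described above (replacing a class by a superclass) can be sketched on a toy example. The ontology, the instance data and the function names below are hypothetical, and the unit cost per relaxation step is an assumption; the sketch covers only class relaxation, not the edit operations used for approximation.

```python
# Hypothetical toy ontology: each class maps to its direct superclass.
SUPERCLASS = {
    "Novelist": "Writer",
    "Poet": "Writer",
    "Writer": "Person",
}

# Hypothetical instance data: entity -> asserted class.
INSTANCE_OF = {
    "austen": "Novelist",
    "keats": "Poet",
    "orwell": "Writer",
    "turing": "Person",
}


def ancestors(cls):
    """Yield (distance, class) for cls and each superclass above it."""
    distance = 0
    while cls is not None:
        yield distance, cls
        cls = SUPERCLASS.get(cls)
        distance += 1


def is_subclass_of(cls, target):
    """True if cls equals target or lies below it in the hierarchy."""
    return any(c == target for _, c in ancestors(cls))


def relaxed_answers(query_class):
    """Return (distance, entity) pairs, nearest relaxation first.

    Distance 0 is the original query; each further step replaces the class
    by its direct superclass, so later steps can only add answers.
    """
    seen = set()
    results = []
    for distance, cls in ancestors(query_class):
        matches = sorted(e for e, c in INSTANCE_OF.items()
                         if is_subclass_of(c, cls))
        for entity in matches:
            if entity not in seen:
                seen.add(entity)
                results.append((distance, entity))
    return results


if __name__ == "__main__":
    print(relaxed_answers("Novelist"))
```

For the query class "Novelist", the answers come back as austen at distance 0, keats and orwell at distance 1, and turing at distance 2, which mirrors the incremental, distance-ordered return of answers.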
The benefits of supporting such flexible query processing over knowledge graphs include: (i) correcting users’ erroneous queries; (ii) finding additional relevant answers that the user may be unaware of; and (iii) generating new queries which may return unexpected results and bring new insights. However, although this kind of flexible querying can increase a user’s understanding of the knowledge graph and underlying knowledge domain, it can return a large number of query results, all at the same distance away from the user’s original query. Therefore, a key challenge is how to facilitate users’ meaning making from flexible query results.
Meaning making is related to users’ domain knowledge and their ability to make sense of the entities that they encounter during their interactions with the knowledge graph. Empirical studies have suggested that paths which start with familiar entities (knowledge anchors) and then add new, possibly unfamiliar, entities can be beneficial for making sense of complex knowledge graphs. Recent work has proposed an approach to identifying knowledge anchors that adopts the cognitive science concept of basic-level objects in a domain taxonomy, with the development of a formal framework for deriving a set of knowledge anchors from a knowledge graph.
In this talk we discuss how a combination of these two directions of work – namely, flexible querying of graph-structured data, and identification of knowledge anchors in a knowledge graph – can be used to support users in incrementally querying, exploring and learning from large, complex knowledge graphs. Our hybrid approach combines flexible graph querying and knowledge anchors by including in the query results paths to the nearest knowledge anchor. This makes more evident the relationships between the entities returned within the query results, and allows the user to explore increasingly larger fragments of the domain taxonomy.
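The idea of attaching to each query result a path to its nearest knowledge anchor can be sketched with a breadth-first search. The taxonomy, the anchor set and the function names are hypothetical, and treating taxonomy links as undirected edges is an assumption of this sketch rather than a detail taken from the talk.

```python
from collections import deque

# Hypothetical taxonomy links (child, parent), treated as undirected edges.
EDGES = [
    ("Granny Smith", "Apple"),
    ("Apple", "Fruit"),
    ("Banana", "Fruit"),
    ("Fruit", "Food"),
]

# Assumed set of familiar, basic-level entities (knowledge anchors).
ANCHORS = {"Apple", "Banana"}


def neighbours(node):
    """Yield every entity directly linked to node."""
    for child, parent in EDGES:
        if child == node:
            yield parent
        elif parent == node:
            yield child


def path_to_nearest_anchor(start):
    """Breadth-first search: shortest path from start to an anchor, or None."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        if path[-1] in ANCHORS:
            return path
        for nxt in neighbours(path[-1]):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None


if __name__ == "__main__":
    print(path_to_nearest_anchor("Granny Smith"))
    print(path_to_nearest_anchor("Food"))
```

Breadth-first search is used here because it returns a shortest path in an unweighted graph; a weighted notion of distance would need a different search.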
Bio: Alexandra Poulovassilis is Professor of Computer Science at the Department of Computer Science and Information Systems, Birkbeck, University of London, and Director of the Birkbeck Knowledge Lab. Her long-term research interests are in data integration, querying, visualisation and personalisation and she has published widely in these areas. She has held numerous research grants, many of them pursuing interdisciplinary research in collaboration with domain experts from education, the arts, and the sciences. The goal of such research is to effectively support learning communities in capturing, organising, discovering and sharing knowledge. Her doctoral and postdoctoral research was in data models and languages for graph-structured data. She joined Birkbeck in 1999 with the award of a Readership (Associate Professorship) under the college’s 175th Anniversary Chairs and Readers scheme, and she was promoted to full Professor in 2001. She served on the UK RAE 2008 and REF 2014 sub-panel for Computer Science and Informatics, and is currently Deputy Dean with a Research Enhancement remit in the School of Business, Economics and Informatics at Birkbeck.
Patterns, Sets of Patterns, and Pattern Compositions (slides)
Abstract: The goal of exploratory data analysis, or data mining, is to make sense of data. We develop theory and algorithms that help you understand your data better, with the lofty goal that this helps in formulating (better) hypotheses. More specifically, our methods give detailed insight into how data is structured: characterising distributions in easily understandable terms, showing the most informative patterns, associations, correlations, etc.
My talk will consist of two parts. I will start by explaining what a pattern composition is. Patterns, such as formal concepts, can give valuable insight into data. Mining all potentially interesting patterns is a useless exercise, however: the result is cumbersome, sensitive to noise, and highly redundant. Mining a small set of patterns that together describe the data well leads to much more useful results. Databases, however, typically consist of different parts, or components. Each such component is best characterised by a different set of patterns. Young parents, for example, exhibit different buying behaviour than elderly couples. Both, however, buy bread and milk. A pattern composition models exactly this. It jointly characterises the similarities and differences between such components of a database, without redundancy or noise, by including only patterns that are descriptive for the data, and assigning those patterns only to the relevant components of the data. Knowing what a pattern composition is leads to the question of how we can discover one from data. I answer this question in the second part of my talk.
Bio: Jilles Vreeken is the leader of the Independent Research Group on Exploratory Data Analysis at the DFG cluster of excellence on Multimodal Computing and Interaction at Saarland University, Saarbrücken, Germany. In addition, he is a Senior Researcher in D5, the Databases and Information Systems group at the Max Planck Institute for Informatics. His research interests include data mining and machine learning, exploratory data analysis, causal inference, and pattern mining. He is particularly interested in developing well-founded theory and efficient methods for extracting informative models and characteristic patterns from large data, and putting these to good use. He has authored over 60 conference and journal papers and 3 book chapters, and has won the 2010 ACM SIGKDD Doctoral Dissertation Runner-Up Award and two best (student) paper awards. He is tutorial chair for SIAM SDM 2017, and was program co-chair for ECML PKDD 2016, publicity co-chair for IUI 2015, sponsorship co-chair for ECML PKDD 2014, and workshop co-chair of IEEE ICDM 2012. He co-organised eight workshops and four tutorials. He is a member of the editorial board of Data Mining and Knowledge Discovery (DAMI) and of the ECML PKDD Journal Track Guest Editorial Board. In addition, he regularly reviews for TKDD, KAIS and TKDE, as well as for KDD, ICDM, SDM and ECML PKDD. He obtained his M.Sc. in Computer Science from Universiteit Utrecht, the Netherlands. He pursued his Ph.D. at the same university under the supervision of Arno Siebes, and defended his thesis ‘Making Pattern Mining Useful’ in 2009. Between 2009 and 2013 he was a post-doctoral researcher at the University of Antwerp, supported by a Post-doctoral Fellowship of the Research Foundation – Flanders (FWO).
Semantic Web: Big Data, some Knowledge and a bit of Reasoning (slides)
Abstract: Linked Data provides access to huge, continuously growing amounts of open data and ontologies in RDF format that describe entities, links and properties on those entities. Equipping Linked Data with reasoning paves the way to making the Semantic Web a reality. In this presentation, I will describe a unifying framework for RDF ontologies and databases that we call deductive RDF triplestores. It consists of equipping RDF triplestores with inference rules. This rule language makes it possible to capture, in a uniform manner, OWL constraints that are useful in practice, such as property transitivity or symmetry, as well as domain-specific rules with practical relevance for users in many domains of interest. I will illustrate the expressivity of this framework for Semantic Web applications and its genericity for developing inference algorithms with good behaviour in data complexity. In particular, we will show how it allows the problem of data linkage to be modelled as a reasoning problem on possibly decentralized data. We will also explain how it makes it possible to efficiently extract expressive modules from Semantic Web ontologies and databases with formal guarantees, whilst effectively controlling their succinctness.
Experiments conducted on real-world datasets have demonstrated the feasibility of this approach and its usefulness in practice for data integration and information extraction.
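To make the notion of a triplestore equipped with inference rules concrete, the sketch below saturates a small set of triples under the two kinds of OWL constraints named in the abstract, property transitivity and symmetry. It is a naive fixpoint computation written for illustration, not the speaker's rule language or inference algorithm; the function name `saturate` and the property names are hypothetical.

```python
def saturate(triples, transitive=(), symmetric=()):
    """Forward-chain until no new triple can be derived (a fixpoint).

    triples    -- iterable of (subject, property, object) tuples
    transitive -- properties p with rule: (x p y), (y p z) => (x p z)
    symmetric  -- properties p with rule: (x p y) => (y p x)
    """
    facts = set(triples)
    while True:
        derived = set()
        for s, p, o in facts:
            if p in symmetric:
                derived.add((o, p, s))
            if p in transitive:
                # Join on the shared middle node: (s p o) and (o p o2).
                for s2, p2, o2 in facts:
                    if p2 == p and s2 == o:
                        derived.add((s, p, o2))
        if derived <= facts:
            return facts
        facts |= derived


# Hypothetical data and property names.
data = {
    ("a", "partOf", "b"),
    ("b", "partOf", "c"),
    ("x", "sameAs", "y"),
}

closure = saturate(data, transitive={"partOf"}, symmetric={"sameAs"})

if __name__ == "__main__":
    for triple in sorted(closure):
        print(triple)
```

The nested loop makes each round quadratic in the number of facts, which is acceptable for a toy example; an implementation meant for large triplestores would index triples by property and subject.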
Bio: Marie-Christine Rousset is Professor of Computer Science at the University of Grenoble Alpes and a senior member of the Institut Universitaire de France. Her areas of research are Knowledge Representation, Information Integration, Pattern Mining and the Semantic Web. She has published around 100 refereed international journal articles and conference papers, and participated in several cooperative industry-university projects. She received a best paper award from AAAI in 1996, and was nominated ECCAI Fellow in 2005. She has served on many program committees of international conferences and workshops and on the editorial boards of several journals.