Collective Entity Resolution in Familial Networks

Title: Collective Entity Resolution in Familial Networks
Publication Type: Conference Paper
Year of Publication: 2017
Authors: Kouki, P, Pujara, J, Marcum, C, Koehly, L, Getoor, L
Conference Name: IEEE International Conference on Data Mining (ICDM)

Entity resolution in settings with rich relational structure often introduces complex dependencies between coreferences. Exploiting these dependencies is challenging – it requires seamlessly combining statistical, relational, and logical dependencies. One task of particular interest is entity resolution in familial networks. In this setting, multiple partial representations of a family tree are provided, from the perspective of different family members, and the challenge is to reconstruct a family tree from these multiple, noisy, partial views. This reconstruction is crucial for applications such as understanding genetic inheritance, tracking disease contagion, and performing census surveys. Here, we design a model that incorporates statistical signals, such as name similarity, relational information, such as sibling overlap, and logical constraints, such as transitivity and bijective matching, in a collective model. We show how to integrate these features using probabilistic soft logic, a scalable probabilistic programming framework. In experiments on real-world data, our model significantly outperforms state-of-the-art classifiers that use relational features but are incapable of collective reasoning.
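
To make the combination of statistical, relational, and logical signals more concrete, the following is a minimal sketch of how soft-logic rules of the kind used in probabilistic soft logic can be scored. It relies on the Lukasiewicz relaxation, under which a rule's violation becomes a hinge penalty. The predicate names, truth values, and weights are hypothetical illustrations, not the rules or parameters from the paper.

```python
def lukasiewicz_and(*truths):
    """Lukasiewicz conjunction of soft truth values in [0, 1]."""
    return max(0.0, sum(truths) - (len(truths) - 1))


def distance_to_satisfaction(body, head):
    """Hinge penalty for a rule body -> head; zero whenever head >= body."""
    return max(0.0, body - head)


# Hypothetical soft truth values for three person mentions a, b, c.
sim_name_ab = 0.9         # statistical signal: name similarity of a and b
sibling_overlap_ab = 0.8  # relational signal: overlap of their sibling sets
same_ab = 0.4             # current belief that a and b are the same person
same_bc = 0.9
same_ac = 0.2

# Assumed rule: SimName(a, b) & SiblingOverlap(a, b) -> Same(a, b)
rule_evidence = distance_to_satisfaction(
    lukasiewicz_and(sim_name_ab, sibling_overlap_ab), same_ab
)

# Transitivity constraint: Same(a, b) & Same(b, c) -> Same(a, c)
rule_transitivity = distance_to_satisfaction(
    lukasiewicz_and(same_ab, same_bc), same_ac
)

# Hypothetical rule weights; inference would search for the Same(...) values
# that minimize this weighted sum of penalties over all ground rules.
weights = {"evidence": 2.0, "transitivity": 1.0}
total_penalty = (
    weights["evidence"] * rule_evidence
    + weights["transitivity"] * rule_transitivity
)

print(rule_evidence, rule_transitivity, total_penalty)
```

Because every penalty is a convex function of the unknown truth values, the match decisions for all pairs can be optimized jointly rather than one pair at a time, which is the sense in which such a model reasons collectively.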