One Long Paper from HIT-SCIR Accepted at KDD 2019


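ACM SIGKDD (the International Conference on Knowledge Discovery and Data Mining, KDD for short) is the top international conference in data mining. It will be held in Anchorage, Alaska, USA, from August 4 to August 8, 2019. The conference has run for more than twenty years since 1995 and is highly selective, with an annual acceptance rate of no more than 20%. KDD 2019 has two tracks: the Research track and the Applied Data Science track. The Research track received about 1,200 submissions, of which about 110 were accepted as oral papers and 60 as poster papers. The resulting acceptance rate is only 14%, down nearly 4 percentage points from last year.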

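This year KDD also tightened its submission requirements and adopted double-blind review for the first time. All submissions must strictly follow the submission guidelines and must not contain author names or affiliations.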

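The Research Center for Social Computing and Information Retrieval at Harbin Institute of Technology (HIT-SCIR) has one long paper accepted at KDD 2019. The paper's details and abstract follow: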

• The Role of “Condition”: A Novel Scientific Knowledge Graph Representation and Construction Model

Authors: Tianwen Jiang, Tong Zhao, Bing Qin, Ting Liu, Nitesh V. Chawla, Meng Jiang



Abstract: Conditions play an essential role in scientific observations, hypotheses, and statements. Unfortunately, existing scientific knowledge graphs (SciKGs) represent factual knowledge as a flat relational network of concepts, as same as the KGs in general domain, without considering the conditions of the facts being valid, which loses important contexts for inference and exploration. In this work, we propose a novel representation of SciKG, which has three layers. The first layer has concept nodes, attribute nodes, as well as the attaching links from attribute to concept. The second layer represents both fact tuples and condition tuples. Each tuple is a node of the relation name, connecting to the subject and object that are concept or attribute nodes in the first layer. The third layer has nodes of statement sentences traceable to the original paper and authors. Each statement node connects to a set of fact tuples and/or condition tuples in the second layer. Inspired by a recent work that considers open information extraction as a sequence labeling task, we design a semi-supervised Multi-Input Multi-Output (MIMO) sequence labeling model that learns complex dependencies between the sequence tags from multiple signals and generates output sequences for fact and condition tuples. It has a new self-training module of multiple strategies to leverage the massive scientific data for better performance when manual annotation is limited. Experiments on a data set of 141M sentences show that our model outperforms existing methods and the SciKGs we constructed provide a good understanding of the scientific statements.
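The three-layer representation described in the abstract can be pictured as a small data structure. The Python sketch below is only an illustration under stated assumptions, not the authors' implementation: the class names (`Node`, `RelationTuple`, `Statement`), the field names, and the example sentence are all hypothetical and do not come from the paper.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Node:
    """Layer 1: a concept node or an attribute node (hypothetical class)."""
    name: str
    kind: str  # "concept" or "attribute"
    # Attaching link from an attribute to its concept; None for concepts.
    attached_to: Optional["Node"] = None


@dataclass(frozen=True)
class RelationTuple:
    """Layer 2: a fact tuple or a condition tuple, keyed by relation name."""
    subject: Node
    relation: str
    obj: Node
    role: str  # "fact" or "condition"


@dataclass
class Statement:
    """Layer 3: a statement sentence traceable to its paper and authors."""
    sentence: str
    paper: str
    authors: List[str]
    tuples: List[RelationTuple] = field(default_factory=list)

    def facts(self) -> List[RelationTuple]:
        # Tuples that assert what the sentence claims.
        return [t for t in self.tuples if t.role == "fact"]

    def conditions(self) -> List[RelationTuple]:
        # Tuples that state when the facts are claimed to hold.
        return [t for t in self.tuples if t.role == "condition"]


# Invented example sentence, for illustration only.
drug = Node("drug A", "concept")
patients = Node("elderly patients", "concept")
pressure = Node("blood pressure", "attribute", attached_to=patients)

stmt = Statement(
    sentence="Drug A reduces blood pressure in elderly patients.",
    paper="(hypothetical paper)",
    authors=["(hypothetical author)"],
    tuples=[
        RelationTuple(drug, "reduces", pressure, role="fact"),
        RelationTuple(pressure, "in", patients, role="condition"),
    ],
)

for t in stmt.facts() + stmt.conditions():
    print(t.role, (t.subject.name, t.relation, t.obj.name))
```

Keeping the condition tuple alongside the fact tuple under one statement node is what preserves the context that a flat relational network would drop. How tuples are actually extracted (the MIMO sequence labeling model) is outside the scope of this sketch.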
