
知识图谱综述.pdf (A Survey of Knowledge Graphs)
Introduction to Knowledge Graphs
Yanghua Xiao (肖仰华)
Knowledge Works Laboratory, Fudan University (复旦大学知识工场实验室)
shawyh@
2017-7-13

Outline
• Origins
• Meaning
• Value
• Classification

Origins
• Historical context
• Background
• Core advantages: large scale, semantically rich, high quality, friendly structure

Historical context
Artificial intelligence, knowledge engineering, knowledge representation, knowledge graph
• AI (Artificial Intelligence): thinking or acting, humanly or rationally
  • “The exciting new effort to make computers think … machines with minds, in the full and literal sense.” (Haugeland, 1985)
  • “AI … is concerned with intelligent behavior in artifacts.” (Nilsson, 1998)
• KE (Knowledge engineering) is an engineering discipline that involves integrating knowledge into computer systems in order to solve complex problems normally requiring a high level of human expertise.
• KR (Knowledge representation) is dedicated to representing information about the world in a form that a computer system can utilize to solve complex tasks such as diagnosing a medical condition or having a dialog in a natural language.
• KG (Knowledge graph) is a large-scale semantic network consisting of entities/concepts as well as the semantic relationships among them.

Background
• May 2012: Google officially announced its Knowledge Graph
• Core need of search: let search lead to answers
  • Keyword search cannot understand the query keywords
  • Keyword search cannot give precise answers
• Root problem
  • Lack of large-scale background knowledge
  • Traditional knowledge representation cannot meet the need

KG advantage 1: large scale
• Higher coverage over entities and concepts

KG            # of Entities/Concepts   # of Relations
YAGO          10 million               120 million
DBpedia       28 million               9.5 billion
Probase       2.7 million              70 billion
BabelNet      14 million               5 billion
CN-DBpedia    17 million               200 million

KG advantage 2: semantically rich
• Higher coverage over numerous semantic relationships

KG            # of Relations
DBpedia       1,650
YAGO1         14
YAGO3         74
CN-DBpedia    100 thousand

KG advantage 3: high quality
• High quality
  • Big data: cross-validation by multiple sources
  • Crowdsourcing: quality guarantee
[Yin et al., Truth Discovery with Multiple Conflicting Information Providers on the Web, KDD 2007]

KG advantage 4: friendly structure
• Structured organization
  • By RDF
  • By graph

More and more knowledge graphs are emerging

Date          Number of knowledge graphs
2017-03-16    1,139
2014-08-30    570
2011-09-19    295
2010-09-22    203
2009-07-14    95
2008-09-18    45
2007-11-07    28
2007-05-01    12

“Linking Open Data cloud diagram 2017, by Andrejs Abele, John P. McCrae, Paul Buitelaar, Anja Jentzsch and Richard Cyganiak. http://lod-
YAGO, WordNet, Freebase, Probase, NELL, Cyc, DBpedia, …

Meaning
• KG components
  • Nodes: entities, concepts, values
  • Edges
• KG representation
  • Logical representation
  • Physical representation

KG components - Node - Entity
• Entity/Objects/Instances
• Wikipedia: An entity is something that exists as itself, as a subject or as an object, actually or potentially, concretely or abstractly, physically or not.
• Hegel, Shorter Logic (《小逻辑》): that which can exist independently and serves as the basis of all attributes and the origin of all things

KG components - Node - Concept
• Concept
  • In metaphysics, and especially ontology, a concept is a fundamental category of existence.
  • (Mental) representations of categories
• Category
  • Groups of entities which have something in common
• Type/class
  • Wiktionary: A grouping based on shared characteristics; a class.
• Categorization
  1. The process of formation of categories
  2. The process of identifying X as a member of a particular category Y
[Figures: DBpedia types; Probase categories]

KG components - Node - Value
• Date
  • Trump (特朗普), date of birth, June 14, 1946
• String
  • Trump, description, “Donald Trump (唐纳德·特朗普), 45th President of the United States, born June 14, 1946 in New York; a politician of the U.S. Republican Party”
• Numeric
  • Trump, age, 71

KG components - Edge
• Relation
  • Focuses on relationships between entities (individuals)
  • Examples:
    • Sitting-On: an apple sitting on a table
    • Taller-than: the Washington Monument is taller than the White House
• Property/Attribute/Quality
  • A characteristic/quality that describes an object
  • Examples: size, color, weight, composition, and so forth, of an object

Models of Knowledge Graph
• Entities
  • Concepts
  • Instances
  • Values
• Relationships
  • IsA
  • Co-occurrence
  • Synonyms
  • Others…
• Knowledge graph

What is a graph?
• A collection of entities and the relationships between them
  • Entities
  • Relationships
• Euler
  • Seven Bridges of Königsberg
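The Seven Bridges of Königsberg problem named on this slide can be sketched as a small multigraph. This is a minimal illustration, not material from the slides: the single-letter labels for the land masses and the helper names `degrees` and `has_eulerian_walk` are our own, and the check assumes the graph is connected. It applies Euler's criterion that a walk crossing every edge exactly once can exist only when zero or two vertices have odd degree.

```python
from collections import Counter

# Land masses are entities (vertices); bridges are relationships (edges).
# Labels are our own naming: A = central island, B and C = the two river
# banks, D = the eastern land mass. Parallel bridges appear as repeated pairs.
bridges = [
    ("A", "B"), ("A", "B"),
    ("A", "C"), ("A", "C"),
    ("A", "D"), ("B", "D"), ("C", "D"),
]


def degrees(edges):
    """Count how many edge endpoints touch each vertex."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return degree


def has_eulerian_walk(edges):
    """Euler's criterion, assuming the graph is connected:
    a walk using every edge exactly once exists only if the number of
    odd-degree vertices is 0 or 2."""
    odd = [v for v, d in degrees(edges).items() if d % 2 == 1]
    return len(odd) in (0, 2)


print(dict(degrees(bridges)))      # {'A': 5, 'B': 3, 'C': 3, 'D': 3}
print(has_eulerian_walk(bridges))  # False
```

All four land masses have odd degree, so no walk crosses each bridge exactly once, which is the result Euler proved.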
[Figure labels: Entities, Relationships, Graph]

Models of graphs
• Weighted graphs
• Directed graphs
• Probabilistic graphs
• Evolving graphs

Notations
• Vertices/nodes
• Edges/arcs
• Neighbors of a vertex
• Degree of a vertex
• Subgraph
• Shortest path
• Example graph

Representation of a graph
• Adjacency list
  • Space-efficient on sparse graphs
• Matrix

RDF: Resource Description Framework
• A framework (not a language) for describing resources, recommended by the W3C
• Facilitates reading and correct use of information by computers, not necessarily by people
• Resource, Property, Property Value = Subject, Predicate, Object of a statement
• RDF identifies resources with URIs
• RDF offers only binary predicates
  • Think of them as P(x, y), where P is the relationship between the objects x and y
  • From the example:
    • X =
    • Y = Jan Egil Refsnes
    • P = author

RDF representations
Egil Refsnes    author

Bob Dylan       USA    Columbia       10.90    1985
Bonnie Tyler    UK     CBS Records    9.90     1988
…

Root element of RDF documents
S。
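The P(x, y) reading of an RDF statement and the adjacency-list representation of a graph can be combined in a short sketch. This is only an illustration under stated assumptions: the `http://example.org/...` URIs are placeholders, the predicate names `artist`, `country`, and `price` are guesses at labels the slide does not show, and `build_adjacency_list` and `objects_of` are hypothetical helper names. Plain tuples are used rather than any RDF library.

```python
from collections import defaultdict

# Each statement is a (subject, predicate, object) triple, i.e. P(x, y)
# with x = subject, P = predicate, y = object.
# The example.org URIs and the CD predicate names are assumptions.
triples = [
    ("http://example.org/page", "author", "Jan Egil Refsnes"),
    ("http://example.org/cd/1", "artist", "Bob Dylan"),
    ("http://example.org/cd/1", "country", "USA"),
    ("http://example.org/cd/1", "price", "10.90"),
]


def build_adjacency_list(statements):
    """Group outgoing edges by subject: subject -> [(predicate, object), ...].
    Only subjects that actually have edges take up space, which is why this
    layout suits sparse graphs."""
    adjacency = defaultdict(list)
    for subject, predicate, obj in statements:
        adjacency[subject].append((predicate, obj))
    return dict(adjacency)


def objects_of(adjacency, subject, predicate):
    """Return every y such that predicate(subject, y) was stated."""
    return [o for p, o in adjacency.get(subject, []) if p == predicate]


adjacency = build_adjacency_list(triples)
print(objects_of(adjacency, "http://example.org/page", "author"))
```

A matrix representation would instead reserve one cell per pair of nodes, which wastes space when most pairs are unrelated.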
