Cyc/FAQ/1.0

From whitten@netcom.com Thu Feb  9 16:25:50 EST 1995
Article: 27197 of comp.ai
Newsgroups: comp.ai,comp.ai.nat-lang,comp.ai.philosophy
Path: cantaloupe.srv.cs.cmu.edu!das-news2.harvard.edu!news2.near.net!news.mathworks.com!udel!gatech!howland.reston.ans.net!ix.netcom.com!netcom.com!whitten
From: whitten@netcom.com (David Whitten)
Subject: CYC FAQ
Message-ID: [whittenD3nnGp.Bw4@netcom.com]
Summary: CYC Frequently Asked Questions  
Keywords: CYC FAQ
Organization: AOT incorporated
X-Newsreader: TIN [version 1.2 PL1]
Date: Wed, 8 Feb 1995 00:17:12 GMT
Lines: 740
Xref: glinda.oz.cs.cmu.edu comp.ai:27197 comp.ai.nat-lang:2803 comp.ai.philosophy:25292

The Unofficial, Unauthorized CYC Frequently Asked Questions Information Sheet.

Written by David Whitten, with various input from other net citizens

If you think of questions that are appropriate for this FAQ, or would
like to improve an answer, please send email to me at whitten@netcom.com

*** Copyright:

Copyright (c) 1994 by David Whitten. All rights reserved.
Portions copyright (c) by MCC and Cycorp.

This FAQ may be freely redistributed in its entirety without
modification provided that this copyright notice is not removed.  It
may not be sold for profit or incorporated in commercial documents
(e.g., published for sale on CD-ROM, floppy disks, books, magazines,
or other print form) without the prior written permission of the
copyright holder.  Permission is expressly granted for this document
to be made available for file transfer from installations offering
unrestricted anonymous file transfer on the Internet.

This article is provided AS IS without any express or implied warranty.

*** Topics Covered:

  [1]     Introduction 
  [1-1]   What is CYC ?
  [1-2]   How do I find this FAQ ?
  [2]     Who is doing CYC ?
  [2-1]     Who is sponsoring it ?
  [2-2]     Who is Douglas Lenat ?
  [2-3]     Who else was previously or is currently working on CYC ?
  [2-4]     How do I contact the authors myself ?
  [3]     When did they start CYC and when will it be completed ?
  [4]     Why are they creating CYC ?
  [5]     Where can I get the source code ?
  [6]     What do they use to make it work ?
  [6-1]     What programming language is it written in ?
  [6-2]     What machines does it run on ?
  [7]     How does it work ?
  [7-1]     How do they store common sense in a computer ?
  [7-2]     How do they input common sense ?
  [7-3]     What theoretical foundation is behind CYC ?
  [7-4]     What is the difference between CYC and an Expert System ?
  [7-5]     What are they doing in Natural Language Processing ?
  [8]     Can CYC reason about information not stored in CYC format databases?
  [9]     What are the details of the functional interface to CYCL ?
  [A]     Acknowledgements
  [B]     Bibliography

Search for [#] to get to question number # quickly.

*** Recent changes:

;;; 1.0: 15-DEC-94 djw   Initial release

----------------------------------------------------------------
Subject: [1]  Introduction


Certain questions and topics come up frequently in the various artificial
intelligence discussion groups about the CYC program.

This file/article is an attempt to gather these questions and their answers
together as a convenient reference for AI researchers, students, hobbyists,
and practitioners. I post it whenever I notice a newbie asking about CYC on
one of the newsgroups I read.  I hope this will cut down on network traffic,
and increase the enjoyment of these newsgroups for the regular readers
by eliminating the necessity to read and respond to the same questions over 
and over again.  It may even answer some questions that readers may not have 
thought about yet, and hopefully will stimulate new discussions by 
increasing the total amount of information available.

Currently this FAQ covers the obvious questions and answers, but I plan to
add new questions and answers as they become common.

----------------------------------------------------------------
Subject: [1-1] What is CYC ?

CYC is the name of a very large, multi-contextual knowledge base and
inference engine, the development of which started at the Microelectronics
and Computer Technology Corporation (MCC) in Austin, Texas during the early
1980s.

Over the past eleven years the members of the CYC team have added to the
knowledge base a huge amount of fundamental human knowledge:  facts, rules
of thumb, and heuristics for reasoning about the objects and events of
modern everyday life.

CYC is an attempt to do symbolic AI on a massive scale.  It is not based on
numerical methods such as statistical probabilities, nor is it based on
neural networks or fuzzy logic.  All of the knowledge in CYC is represented
declaratively in the form of logical assertions.  CYC presently contains
approximately 400,000 significant assertions, which include simple
statements of fact, rules about what conclusions to draw if certain
statements of fact are satisfied (true), and rules about how to reason with
certain types of facts and rules.  New conclusions are derived by the
inference engine using deductive reasoning.

Its avowed purpose is to break the software brittleness bottleneck once and
for all by constructing a foundation of basic "common sense" knowledge -- a
sort of semantic substratum of terms, rules, and relations -- that will
enable a variety of knowledge-intensive products and services.  CYC is
intended to provide a "deep" layer of definitions/understanding that can be
used by other programs (such as domain-specific expert systems) to make
them more flexible.  To date, CYC has made possible ground-breaking pilot
applications in the areas of heterogeneous database browsing and
integration, captioned image retrieval, and natural language processing.

----------------------------------------------------------------
Subject: [1-2]   How do I find this FAQ ?

This FAQ is posted semi-regularly on the comp.ai and comp.ai.nat-lang
newsgroups, as well as being available from its author. (whitten@netcom.com)
It is not currently available from archive sites, WWW sites, or ftp sites.

----------------------------------------------------------------
Subject: [2]     Who is doing CYC ?

Much of the CYC work has been done at the Microelectronics and Computer
Technology Corporation in Austin, Texas.  As of January 1, 1995, a new
independent company named Cycorp has been created to further
the work done on the CYC project. Cycorp continues to be based in Austin,
Texas.

----------------------------------------------------------------
Subject: [2-1]   Who is sponsoring it ?

The development of CYC has been supported by several organizations,
including Apple, Bellcore, DEC, DoD, Interval, Kodak, and Microsoft.

----------------------------------------------------------------
Subject: [2-2]   Who is Douglas Lenat ?

Doug Lenat is one of the world's leading computer scientists and is head of
the CYC Project at MCC and President of Cycorp.  He has been a Professor of
Computer Science at Carnegie-Mellon University and Stanford University.

He is a prolific author, whose publications include the books

	Knowledge Based Systems in Artificial Intelligence (1982, McGraw-Hill) 
	Building Expert Systems (1983, Addison-Wesley)
	Knowledge Representation (1988, Addison-Wesley)
	Building Large Knowledge Based Systems (1989, Addison-Wesley) 

His 1976 Stanford thesis earned him the biennial IJCAI Computers and 
Thought Award in 1977.

He was named one of America's brightest scientists under the age of 40 in
the December 1984 Science Digest. 

In 1986, he was elected as Councilor of AAAI.

----------------------------------------------------------------
Subject: [2-3]     Who else was previously or is currently working on CYC ?

Here is a complete list of everyone who has worked with CYC for more than
one continuous year since 1988.  Many other persons have made sporadic or
episodic contributions to CYC.  Present employees of Cycorp -- full-time,
part-time, occasional, or currently on leave -- are marked with *.


Paul Blair (1991-1993) is a graduate student in Philosophy who worked on
CYC as a knowledge enterer.  His contributions to the project included
writing a clear introduction to CYCL, CYC's representation language.  Paul
is currently continuing his graduate studies in New York City.

Judy Bowman (1986-1994) was the secretary for the CYC project during most of
its existence at MCC.

Rupert Brauch (1992-1994) holds an undergraduate degree in Philosophy and
contributed to CYC as a knowledge enterer.  He is now pursuing a graduate
degree in Computer Science at Stanford.

*Kathy Burns (1993-present) is a member of the Cycorp Technical Board, and
directs Cycorp's natural language processing development effort. She holds
an undergraduate degree in Linguistics from the University of Texas at
Austin, and has done graduate study in Linguistics at McGill University.
Kathy has written most of the semantic rules that allow translation from
English to CYCL (CYC's formal representation language).  She played an
important role in developing the new CYC database browsing and retrieval
pilot application.  She has composed CYC Mosaic interface and help pages,
and has done a significant amount of knowledge entry.

Mark Derthick (1988-1994) holds a PhD in Computer Science from
Carnegie-Mellon University.  He developed and maintained a variety of
interface, browsing, and knowledge entry tools, including the Heuristic
Level to Epistemological Level translator that proved to be a crucial
addition to the earlier (pre-1991) frame-based version of CYC.  With Karen
Pittman, he played a major role in developing the Cyccess pilot
application, which demonstrated the value of using CYC to integrate the
data contained in disparate structured information sources such as database
tables and spreadsheets.  Mark is currently working on a data
representation and visualization project at CMU.

*Lila Ghemri (1992-present) holds a PhD in Computational Linguistics from
the University of Bristol, England.  She works on the CYC natural language
processing effort, has done a significant amount of knowledge entry, and
has played an important part in expanding CYC's English language lexicon.
She is responsible for augmenting and maintaining the syntax templates that
enable translation from English to CYCL (CYC's representation language),
and for using corpora to test CYC's NL system.

*Keith Goolsbey (1990-present) is a member of Cycorp's Technical Board.  He
holds a Bachelor's degree in Electrical Engineering and a Master's degree
in Computer Science, both from the University of Texas at Austin.  Keith
has played a dominant role in porting CYC from Common Lisp to C, including
work on a Common Lisp to C translator that allows comparable
implementations of CYC to exist in both languages simultaneously.  He has
done a significant amount of knowledge entry, especially concerning CYC's
treatment of quantities and scalar intervals, and was a key participant in
building an early version of CYC's captioned image retrieval pilot
application.  He has worked to make possible aesthetically pleasing and
practical Mosaic interfaces to CYC, including the construction of a new
browsing/editing interface.

Ramanathan V. Guha (1987-1994) holds an MS in Mechanical Engineering from
UC-Berkeley, and a PhD in Computer Science from Stanford.  From 1990 to
1994 he was the technical director of the CYC project.  He has written
several important papers and technical reports, including (as co-author)
the book Building Large Knowledge Based Systems (1989, Addison-Wesley).
Many features of CYC as it exists today are due directly to R. V. Guha's
vision and hard work.  His most notable contributions include converting
CYC from a frame-based system to one in which the fundamental data objects
are logical assertions, and the implementation of logical contexts
(microtheories) as a method for structuring and effectively reasoning with
the knowledge in the CYC knowledge base.  More recently, he was the
principal developer of an innovative CYC database browsing and retrieval
pilot application.  Guha left the CYC project at the end of 1994 to pursue
other interests.

*Bill Jarrold (1990-present) holds a BS in Cognitive Science from MIT.  He
has done a great deal of knowledge entry work for CYC, most notably in the
domains of weather and naive spatial relations.  He is involved in the
design and implementation of test suites for the CYC knowledge base, the
inference engine, and the underlying source code.  Bill is currently
pursuing a PhD degree in Counseling Psychology at the University of Texas
at Austin.

Kate Joly (1992-1994) holds a Bachelor's degree in Linguistics from UC-Santa
Cruz.  She made important contributions to CYC's natural language
processing effort, including writing most of the syntactic parsing
templates for translating from English to CYCL.

Liz Lempert (1992-1994) holds a Bachelor's degree in Symbolic Systems from
Stanford.  She has done a great deal of knowledge entry for CYC in a wide
variety of domains.  She is now pursuing graduate studies in Boston.

*Bill MacCartney (1994-present) holds a Bachelor's degree in Philosophy
from Princeton.  In addition to doing knowledge entry, he has helped with
the process of porting CYC from Common Lisp to C, and has worked on CYC
interface development in C++ and Mosaic.

Alan McKendree (1991-1994) holds a BS in Mathematics from the University of
Texas at Austin.  He worked on CYC as a knowledge enterer and a system
support specialist.

*Kathy Mitchell (1991-present) holds a Bachelor's degree in Computer Science
from Texas A & M and an MS in Computer Science from the University of Texas
at Austin.  She has done much knowledge entry for CYC in just about every
domain.  She played an important part in testing and debugging the CYC
captioned image retrieval pilot application, and more recently has worked
on the CYC natural language processing effort.

*Kenneth Murray (1987-present) is pursuing a PhD in Computer Science at the
University of Texas at Austin, with a focus on developing and applying
large foundational knowledge bases.  He has contributed to CYC as a
knowledge enterer in a wide variety of domains.  In addition to knowledge
entry, he is responsible for helping to debug and maintain the CYC source
code.

Deborah Nichols (1992-1994) did knowledge entry for CYC in several domains.
She is now pursuing a PhD in Philosophy at the University of Texas at
Austin. 

*Karen Pittman (1987-present) is a member of Cycorp's Technical Board, and
plays a prominent role in directing general knowledge entry work on CYC,
training new knowledge enterers, scoping out new knowledge domains, and
planning application-specific knowledge entry tasks.  She is responsible
for a great deal of the knowledge presently in the CYC knowledge base.  She
holds an MS in Botany from the University of Texas at Austin, and is
currently pursuing an MS in Computer Science (also at UT).  With Mark
Derthick, she played a major role in developing the Cyccess pilot
application, which demonstrated the value of using CYC to integrate the
data contained in disparate structured information sources such as database
tables and spreadsheets.

*Dexter Pratt (1989-present) holds a BS in Chemistry from Yale.  Now an
independent software developer and consultant, he was a pioneer in the
development of the Lisp machine workstation in the early 1980s, when he
worked for LMI.  He has contributed to CYC in many ways, including writing
and maintaining interfaces, system software development and maintenance,
helping to port CYC from Common Lisp to C, and knowledge entry in a variety
of domains.  With Nick Siegel, he played a major role in developing CYC's
captioned image retrieval pilot application.  He now works for Cycorp on an
occasional basis.

Wanda Pratt (1990-1993) holds an MS in Computer Science from the University
of Texas at Austin.  She contributed to CYC as a knowledge enterer and a
system software developer and maintainer.  Her knowledge entry work covered
many different domains.  She is now pursuing a PhD in Computer Science at
Stanford.

Wei-Min Shen (1989-1991) holds a PhD in Computer Science from
Carnegie-Mellon University.  He did knowledge entry for CYC and explored
his interest in machine learning.  He is now pursuing a career in
university teaching and research.

*Mary Shepherd (1984-present) is a Sociologist by training.  She has
contributed to CYC as a knowledge enterer, and for much of the past eleven
years has dealt with the onerous administrative and personnel tasks
necessary to keep the CYC team functioning.  She is presently the
administrative manager of Cycorp.

*Nick Siegel (1988-present) is a member of Cycorp's Technical Board.  He
does knowledge entry on CYC, helps to plan and direct knowledge entry,
trains new knowledge enterers, and has been actively involved in testing
the system and adding knowledge-entry capability to the Mosaic user
interface.  He holds a BA in History of Religions from Creighton University
and an MA in Cultural Anthropology from the University of Texas at Austin.
With Dexter Pratt, he played a major role in developing CYC's captioned
image retrieval pilot application.

*Srinija Srinivasan (1993-present) holds a Bachelor's degree in Symbolic
Systems from Stanford.  She has done a great deal of knowledge entry for
CYC in a wide variety of domains, most notably in the area of human
emotional states.

Jamie Stephens (1992-1994) worked on CYC as a knowledge enterer and system
support specialist.  He is now pursuing graduate studies in Mathematics at
the University of Texas at Austin.

Dan Torosian (1990-1993) worked on CYC as a knowledge enterer.  A
professional jazz musician (clarinet, saxophone), he is now pursuing his
musical career full-time in Austin, Texas.

Ginger Webb (1990-1991) holds a Master's degree in French Linguistics from
the University of Texas at Austin.  She worked on CYC as a knowledge
enterer, contributing to several different domains.


Alan Kay, Michael Lesk, John McCarthy, Marvin Minsky, Tom Murphy, Bob
Simpson, Marvin Weinberger, and Steve Chenoweth have all provided useful
comments and met with the CYC team at various times.

----------------------------------------------------------------
Subject: [2-4]   How do I contact the authors myself ?

The CYC group maintains a low profile on the Internet. As it would be easy
to be deluged by email, they have chosen not to publicize their addresses.
If you wish to discuss the ideas behind CYC, or the philosophy of the CYC
project, you probably will get a faster response by posting to the comp.ai,
comp.ai.philosophy, or comp.ai.nat-lang newsgroups.

There are many talented people who read these groups, and it is possible
that someone not affiliated with the CYC project will be able to answer
your questions.

----------------------------------------------------------------
Subject: [3]     When did they start CYC and when will it be completed ?

The CYC project began as a dream to create a computerized encyclopedia.
When Alan Kay, one of computing's legendary figures, was at Atari's research
center, he asked Doug Lenat for something original to add to this project.
After Atari hit financial difficulties, Doug Lenat relocated his idea to MCC.
Initially, CYC was based on a re-implementation of RLL (the frame-based
language underlying Eurisko) which was similar to the simultaneously but
independently developed KRL.

The original ten year funding period for the CYC project at MCC was
supposed to end in 1994, but was extended for one year to the close of
1995.

From the perspective of some people on the CYC team, asking when CYC will
be completed is sort of like asking of a person, "When will s/he be
finished?"  A more pertinent question is, when will it be useful?  The
practical answer is, when it can solve the problems and perform the tasks
its builders would like for it to perform.  The CYC team believes that CYC
is now ready to be used in some interesting and useful applications.

----------------------------------------------------------------
Subject: [4]     Why are they creating CYC ?

The CYC team doesn't believe there is any short cut toward being 
intelligent or creating an artificial intelligence based agent.
Addressing the need for a large body of knowledge complete with 
content and context may only be done by manually organizing and 
collating information.  This knowledge includes heuristic, rule-of-thumb
problem-solving strategies, as well as facts that can only
be known to a machine if it is told.

Much of the useful common sense knowledge needed for life is 
prescientific and has therefore not been analyzed in detail.  
Thus a large part of the work of the CYC project is to formalize 
common relationships and fill in the gaps between the highly 
systematized knowledge used by specialists in the modern world.

----------------------------------------------------------------
Subject: [5]     Where can I get the source code ?

It is not free, nor is it freely available.  If you or your company are
willing to become a corporate sponsor of the CYC development effort, you
will be able to have the same access to the internal details and internal
documentation available to the other sponsors (excluding, of course,
information that is proprietary to particular sponsors).  This is not an
option for everyone, and Cycorp reserves the right to determine whether it
will accept sponsorship by your company.  As the current sponsors have
invested a considerable sum of money in developing CYC, please do not
pursue this option unless you or your company are willing to make a similar
contribution.  Serious inquiries regarding collaboration or sponsorship may
be sent to:

	Doug Lenat
	Cycorp, Inc.
	3500 West Balcones Center Drive
	Austin, Texas  78759

While the intent is to make CYC widely available (so that it will become
the standard representation and reasoning system), Cycorp is committed to
protecting the intellectual property rights of those who have invested in
CYC's development.

----------------------------------------------------------------
Subject: [6]     What do they use to make it work ?

The CYC system itself (the knowledge base, inference engine, interface
modules, etc.) is a fairly large, complex piece of software.  However, it
now runs on what is basically stock, off-the-shelf hardware.  CYC can be
run in a networked mode (information provided on one machine is available
to all the other machines) or on a stand-alone workstation serving several
users at once.

----------------------------------------------------------------
Subject: [6-1]     What programming language is it written in ?

There are both Common Lisp and C versions of CYC.  Most development is
currently done in Common Lisp running on Symbolics Lisp machines.  Lisp
source code is translated into C, using a Common Lisp to C translator
developed by the CYC team, to produce source code that can be compiled by a
variety of standard ANSI C compilers.  The CYC team expects that by
mid-1996, most development will be done entirely in C.

----------------------------------------------------------------
Subject: [6-2]     What machines does it run on ?

The C version of CYC is intended to run on any system that provides an
ANSI C compiler, virtual memory with at least 150 MB of swap space, and
at least a 32-bit flat virtual address space.

As of January 1995, C versions of CYC have been compiled and tested on the
following OS/hardware combinations:

UNIX OS:
 Sun Sparc
 DEC Alpha
Apple System 7 OS:
 Macintosh Powerbook
 Macintosh Quadra
 Power Macintosh

The CYC team expects to have a C version of CYC running under Microsoft
Windows NT on a 486-based IBM PC Compatible shortly.

The Common Lisp version runs on Symbolics Lisp machines, and under Lucid on
Sparc 10s (with memory requirements similar to those for the C version).

----------------------------------------------------------------
Subject: [7]     How does it work ?

In the old days (before 1991), CYC's representation language (CYCL) was
primarily a frame-based language, the CYC KB was thought of as a set of
unit/slot/entry triples, and inferencing was done pretty much by
inheritance.  This led to a set of increasingly baroque add-ons and
work-arounds, such as encoding higher-arity predicates as entries which
were tuples, having variant forms of predicates (in which the only
difference was the order of the arguments), and placing more and more stress
on frame-oriented editing interfaces to navigate around in the knowledge
base.
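
To make the older unit/slot/entry view concrete, here is a small Python
sketch.  It is only an illustration: the class FrameKB, its put/get
methods, the "parent" slot used for inheritance, and the sample units are
all invented for this example and are not taken from CYC.

```python
# Hypothetical sketch of a frame-style KB: unit/slot/entry triples plus
# lookup by inheritance.  None of these names come from CYC itself.

class FrameKB:
    def __init__(self):
        self.triples = {}   # (unit, slot) -> list of entries

    def put(self, unit, slot, entry):
        self.triples.setdefault((unit, slot), []).append(entry)

    def get(self, unit, slot, parent_slot="parent"):
        """Return entries for unit.slot, inheriting from parents if absent."""
        seen = set()
        frontier = [unit]
        while frontier:
            current = frontier.pop(0)
            if current in seen:
                continue
            seen.add(current)
            entries = self.triples.get((current, slot))
            if entries:
                return list(entries)
            # Nothing stored locally: look at the more general units.
            frontier.extend(self.triples.get((current, parent_slot), []))
        return []


kb = FrameKB()
kb.put("Bird", "parent", "Animal")
kb.put("Animal", "needsToEat", True)
kb.put("Tweety", "parent", "Bird")
# A higher-arity fact has to be squeezed into a tuple-valued entry,
# which is the kind of work-around the text describes.
kb.put("Fred", "gaveTo", ("Sally", "Book1"))

print(kb.get("Tweety", "needsToEat"))
print(kb.get("Fred", "gaveTo"))
```

Note how the three-place "gaveTo" fact only fits by packing two arguments
into one entry, and how it is filed under Fred alone.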

The CYC team now thinks of the CYC KB as a "sea" of assertions, with each
assertion being no more "about" its first argument than its last one.  For
example, if one says that Fred is Sally's father, this is now regarded as
being just as much a statement "about" Sally as Fred.  Inference has
broadened out into general logical deduction, with AI's well-known named
inference mechanisms (such as inheritance, automatic classification, etc.)
being just special cases that might or might not get treated specially in any
particular implementation of the CYC system; but in any event the persons
entering knowledge do not need to cater to that, or even know about it.  So
one way to visualize the CYC KB is as a circle filled with assertions; a
circular "assertion sea".  Above this sea (or outside it, from a
two-dimensional perspective) sit all the "constants".  Attached to each
constant is a bundle of thin wires or strings.   The other ends are
attached to all the assertions, in the sea, that mention that constant
anywhere.  Moreover, each of the assertions in the sea can itself be
treated as a constant, if you want, and have its own wires reaching to
other assertions which mention IT.
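
The "sea and wires" picture can be sketched as an index from each constant
to every assertion that mentions it.  The Python below is a toy model under
my own assumptions (assertions as tuples, a class called AssertionSea, a
dictionary called wires); it says nothing about how CYC is really stored.

```python
from collections import defaultdict

# Hypothetical model: an assertion is a tuple such as
# ("father", "Fred", "Sally"); a nested tuple is an assertion used as a term.

class AssertionSea:
    def __init__(self):
        self.assertions = []            # the "sea"
        self.wires = defaultdict(set)   # term -> indices of mentioning assertions

    def add(self, assertion):
        """Store an assertion and wire it to every term it mentions."""
        if assertion in self.assertions:
            return self.assertions.index(assertion)
        idx = len(self.assertions)
        self.assertions.append(assertion)
        for term in self._terms(assertion):
            self.wires[term].add(idx)
        return idx

    def _terms(self, assertion):
        for part in assertion:
            if isinstance(part, tuple):
                yield part                      # the nested assertion itself
                yield from self._terms(part)    # and everything inside it
            else:
                yield part

    def mentioning(self, term):
        return [self.assertions[i] for i in sorted(self.wires[term])]


sea = AssertionSea()
fact = ("father", "Fred", "Sally")
sea.add(fact)
sea.add(("believes", "Sally", fact))

print(sea.mentioning("Sally"))   # found regardless of argument position
print(sea.mentioning(fact))      # assertions about the assertion itself
```

Looking up "Fred" and looking up "Sally" return the same father assertion,
which is the point of the paragraph above: no argument position is
privileged.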

Inference rules in CYC can now be thought of as ways of saying that if you
have certain assertions in the sea (a set of them, that match a certain
pattern) then you are justified in adding a particular new assertion.  Each
time an assertion is added, wires are automatically strung to all the
constants that are mentioned anywhere inside the assertion, and "ripples"
of its adding may cause yet other inferences to occur, yet other new
assertions to get dumped into the sea, etc.  Sometimes one of the new
assertions is the answer someone was waiting for, for some problem;
sometimes one of the inference procedures reaches a contradiction and has
to cope with that.
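
As a rough sketch of the "ripples" idea, the Python below runs simple
if-then rules over a set of assertions until nothing new can be added.
The "?x" variable syntax, the function names, and the family rules are
assumptions made for this example; contradiction handling is left out.

```python
# Hypothetical forward-chaining sketch.  Assertions are tuples; terms that
# start with "?" are variables.  This is not CYC's actual rule format.

def match(pattern, fact, bindings):
    """Try to extend bindings so that pattern equals fact; None on failure."""
    if len(pattern) != len(fact):
        return None
    bindings = dict(bindings)
    for p, f in zip(pattern, fact):
        if isinstance(p, str) and p.startswith("?"):
            if bindings.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return bindings


def match_all(patterns, facts, bindings=None):
    """Yield every binding set that satisfies all patterns against facts."""
    bindings = bindings or {}
    if not patterns:
        yield bindings
        return
    for fact in facts:
        extended = match(patterns[0], fact, bindings)
        if extended is not None:
            yield from match_all(patterns[1:], facts, extended)


def substitute(pattern, bindings):
    return tuple(bindings.get(term, term) for term in pattern)


def forward_chain(facts, rules):
    """Apply rules until no new assertion is produced."""
    sea = set(facts)
    changed = True
    while changed:
        changed = False
        for antecedents, consequent in rules:
            # Collect matches against a snapshot before adding anything.
            for b in list(match_all(antecedents, list(sea))):
                new = substitute(consequent, b)
                if new not in sea:
                    sea.add(new)
                    changed = True
    return sea


rules = [
    ([("father", "?x", "?y")], ("parent", "?x", "?y")),
    ([("parent", "?x", "?y"), ("parent", "?y", "?z")],
     ("grandparent", "?x", "?z")),
]
facts = [("father", "Abe", "Fred"), ("father", "Fred", "Sally")]
result = forward_chain(facts, rules)
print(sorted(result))
```

Adding the two father assertions causes parent assertions to appear, and
those in turn trigger the grandparent rule.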

CYCL, the CYC representation language, is essentially a form of First Order
Predicate Calculus (FOPC) with equality, augmentations for default
reasoning, skolemization, and some second-order features (e.g.,
quantification over predicates is allowed in some circumstances).  It uses
a form of circumscription, includes the unique names assumption, and can
make use of the closed world assumption where appropriate.

CYC currently does not store most of the information you would find in a
dictionary, encyclopedia, or an almanac.  For example, CYC may not know
that Birendra Bir Vikram Shah Dev is the current king of Nepal, or that
Kathmandu is its capital city.  It does know what the characteristics of a
capital city are, and it knows the significance of being a head of state.

----------------------------------------------------------------
Subject: [7-1]     How do they store common sense in a computer ?

See sections [1-1] and [7], above.
 
Each assertion in CYC (a statement of fact or a "rule-of-thumb") is
located in (or associated with) a specific microtheory or context.  Each
microtheory captures one "fairly adequate" solution to some knowledge
representation area (knowledge domain). These solutions may address general
areas like representing and reasoning about space, common devices, time,
substances, agents, and causality or specific areas like weather,
manufacturing a particular thing, and walking.

Different areas may have several different microtheories, since the way an
area is perceived or modeled may be different.  Different points of view,
different assumptions, different levels of granularity, and even what
distinctions are important or not important may be significant enough to
require creating a separate microtheory. A microtheory may be considered to
be a smaller and more modular knowledge-base within CYC, which is
specialized on a particular topic.
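
One way to picture microtheories is as named buckets of assertions, so
that the same question can get different answers in different contexts.
The Python sketch below is illustrative only; the class MicrotheoryKB and
the microtheory names are invented for this example.

```python
from collections import defaultdict

# Hypothetical sketch: each assertion is stored in exactly one named context.

class MicrotheoryKB:
    def __init__(self):
        self.contexts = defaultdict(set)   # microtheory name -> assertions

    def assert_in(self, mt, assertion):
        self.contexts[mt].add(assertion)

    def holds(self, mt, assertion):
        return assertion in self.contexts.get(mt, set())

    def ask(self, mt, predicate, subject):
        """Values v such that (predicate subject v) is asserted in mt."""
        return sorted(
            a[2] for a in self.contexts.get(mt, set())
            if a[:2] == (predicate, subject)
        )


kb = MicrotheoryKB()
# Two models of the same area at different levels of granularity.
kb.assert_in("EverydayWalkingMt", ("shape", "Ground", "flat"))
kb.assert_in("GlobalNavigationMt", ("shape", "Ground", "curved"))

print(kb.ask("EverydayWalkingMt", "shape", "Ground"))
print(kb.ask("GlobalNavigationMt", "shape", "Ground"))
```

Neither context contradicts the other, because each assertion is only
claimed to hold inside its own microtheory.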

The important thing to realize is that neither the CYC team, nor CYC itself
claims to have a unified theory of time, space, and the universe. Nor does
it embody some great master Laws of Thought.  What they do have is a suite
of specialized microtheories whose union covers the most common cases.

----------------------------------------------------------------
Subject: [7-2]     How do they input common sense ?

The CYC team's basic knowledge editing tool is called the NUE (New Unit
Editor).  The NUE is basically a full screen assertion editor that allows
the user to view assertions in the knowledge base and perform a variety of
operations, including adding assertions, removing assertions, creating new
constants, killing constants, renaming constants, setting inference
performance parameters (e.g., forward or backward propagation for rules),
asking for conclusions to be derived (if possible), and viewing the
inference chains that resulted in particular conclusions.

Over the past few months (late 1994 - early 1995) the CYC team has written
an entirely new set of interface tools for knowledge browsing and knowledge
editing, based on Mosaic and HTML.  These tools duplicate or exceed the
functionality of the NUE, with the added benefit of complete
standardization and portability across platforms.

As these tools are for in-house development, there is no public World Wide
Web site available.  There currently is some discussion of providing a 
subset of the CYC database on an example Web site. Should this happen, the
address will be publicized.  This may be available before the end of 1995.

There is also a variety of test suites that are run periodically to test
the integrity of the knowledge base and the functioning of the inference
engine.  The CYC team expects to give more emphasis to regular, automatic
testing of the system now that product development is on the horizon.

----------------------------------------------------------------
Subject: [7-3]     What theoretical foundation is behind CYC ?

CYC is not a theoretical effort, although there has been a lot of theory
used in its construction.  The primary focus of the CYC project is to
actually start consolidating a cohesive knowledge bank.  Any theoretical
issues which have been addressed have been directly motivated by the
requirements of solving specific problems.

Like First Order Predicate Calculus, CYCL allows using ForAll (universal
quantification), ThereExists (existential quantification), and
LogImplication (material implication), as well as the other common ways of
combining variables and logical expressions such as LogAnd (conjunction),
LogOr (disjunction), and LogNot (negation).
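
For readers who have not met these connectives before, the Python sketch
below gives their standard truth-functional meaning over a small finite
domain.  The function names merely echo the CYCL names above, and the
sample data is made up; checking every member of a domain is a teaching
device, not a description of how CYC does inference.

```python
# Hypothetical sketch of the usual semantics of the connectives.

def log_and(*values):
    return all(values)

def log_or(*values):
    return any(values)

def log_not(value):
    return not value

def log_implication(antecedent, consequent):
    # Material implication: false only when the antecedent holds
    # and the consequent does not.
    return (not antecedent) or consequent

def for_all(domain, predicate):
    return all(predicate(x) for x in domain)

def there_exists(domain, predicate):
    return any(predicate(x) for x in domain)


things = ["Fido", "Rex", "Broom1"]
organisms = {"Fido", "Rex"}
eats = {"Fido", "Rex"}
made_of_wood = {"Broom1"}

# ForAll x: organism(x) implies eats(x)
all_organisms_eat = for_all(
    things, lambda x: log_implication(x in organisms, x in eats))

# ThereExists x: madeOfWood(x) and not organism(x)
some_wooden_non_organism = there_exists(
    things, lambda x: log_and(x in made_of_wood, log_not(x in organisms)))

print(all_organisms_eat, some_wooden_non_organism)
```

The broom satisfies the implication vacuously: it is not an organism, so
nothing is required of it.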

The CYC team believes that a hand-encoded effort using symbolic logic may
express a significant fraction of the fundamental human knowledge typically
shared by most people.  This bootstrap process is greatly enhanced by
the redundant nature of knowledge.  Most knowledge uses and re-uses the
same basic ideas and relationships in many different ways.  The day to day
entering of knowledge is not based on ethereal definitions of elaborate
Causality and Time-Space-Intelligence collections.  Most data is as plebeian
as 'living organisms have to eat to stay alive' or 'broom handles tend to
be made of wood'.

It is hoped that as the Natural Language effort continues (see [7-5],
below), more knowledge may be entered by persons typing in assertions in
English, and eventually by having CYC 'read' source materials for itself,
bothering its human attendants only when disambiguation is required.

----------------------------------------------------------------
Subject: [7-4]     What is the difference between CYC and an Expert System ?

The knowledge in CYC is more densely interrelated, CYC has more information
about the common attributes of the world, and CYC has a broader focus than
any individual expert system.

A typical expert system uses highly detailed knowledge about a single,
tightly-focused domain.  CYC encodes general knowledge about many different
domains, viewed from a variety of perspectives.  Depending on which bodies
of information (microtheories) it uses in inferencing, CYC may draw
differing conclusions.
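
One way to picture context-dependent conclusions is to answer the same
question against different sets of assertions.  The Python sketch below
assumes a made-up dictionary of microtheories; the names (NaivePhysicsMt,
FictionMt) and the facts in them are hypothetical, and real microtheories
are far richer than flat sets of tuples.

```python
# Hypothetical microtheories: each is just a set of
# (predicate, subject, value) assertions.
MICROTHEORIES = {
    "NaivePhysicsMt": {("canFly", "Human", False)},
    "FictionMt": {("canFly", "Human", True)},
}

def ask_in(mt_name, predicate, subject):
    """Return the value asserted in one microtheory, or None if absent."""
    for pred, subj, value in MICROTHEORIES[mt_name]:
        if pred == predicate and subj == subject:
            return value
    return None

# The same query, posed in each context, gets a different answer.
answers = {mt: ask_in(mt, "canFly", "Human") for mt in MICROTHEORIES}
```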

CYC may be thought of as a tool for building expert systems and other
programs that use a rule-based knowledge representation.  It supports and
uses both forward and backward chaining.  CYC has an integrated
argumentation-based Truth Maintenance System that provides logical
reasoning and supports non-monotonicity.  It can dynamically create terms,
and has several conflict resolution strategies.
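
For readers unfamiliar with the two chaining directions, here is a minimal
Python sketch over propositional rules.  It is a generic textbook-style
illustration, not CYC's inference engine: the rule set and fact names are
invented, and truth maintenance, non-monotonicity, and conflict resolution
are all left out.

```python
# Rules are (premises, conclusion) pairs over plain strings.
RULES = [
    (("isOrganism",), "mustEat"),
    (("mustEat", "hasNoFood"), "willStarve"),
]

def forward_chain(facts, rules):
    """Data-driven: keep firing rules until no new facts appear."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and all(p in known for p in premises):
                known.add(conclusion)
                changed = True
    return known

def backward_chain(goal, facts, rules, seen=frozenset()):
    """Goal-driven: prove the goal from facts, or via a rule that concludes it."""
    if goal in facts:
        return True
    if goal in seen:  # guard against circular rules
        return False
    seen = seen | {goal}
    return any(
        conclusion == goal
        and all(backward_chain(p, facts, rules, seen) for p in premises)
        for premises, conclusion in rules
    )

derived = forward_chain({"isOrganism", "hasNoFood"}, RULES)
provable = backward_chain("willStarve", {"isOrganism", "hasNoFood"}, RULES)
```

Forward chaining computes everything that follows from the facts at hand,
while backward chaining explores only the rules relevant to one goal.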

----------------------------------------------------------------
Subject: [7-5]     What are they doing in Natural Language Processing ?

The CYC NL system is unique in having access to a very large, declaratively
represented common sense knowledge base.  CYC helps the natural language
system handle word/phrase disambiguation, and also provides a target
internal representation language (CYCL) that can be used to do interesting
things, such as inference.  A substantial portion of the CYC natural
language processing system (the lexicon and many semantic rules) is
actually represented in the CYC knowledge base; CYC "knows about" words
just like it "knows about" cars or trees.  Syntactic parsing is carried out
by a template-matching procedure.  Semantic rules are applied to the output
of the syntax module.  It is in the application of the semantic rules that
the knowledge in the knowledge base is proving especially advantageous.

Most of the CYC pilot applications now under development have some NL
component in their interfaces.  The captioned image retrieval application,
for example, accepts queries in English, and allows captioners to describe
new images to the system using English sentences.  The CYC NL team is
currently expanding the lexicon, extending the parser, and adding new
semantic capabilities to the system.

----------------------------------------------------------------
Subject: [8]       Can CYC reason about information not stored in 
                       CYC-format databases ?

Yes. There is an application named Cyccess which is used to interface
CYC to structured information sources (SIS) such as databases 
or spreadsheets.

Cyccess uses CYC to understand the contents of structured sources, to 
retrieve information, and to pose queries that depend on a combination
of CYC knowledge and the data in the SIS.

After the information in an SIS is appropriately linked to assertions in
CYC, all the CYC inferencing, guessing, and consistency checking
capabilities are available. An interesting implication of this is that CYC
may use specific facts or time-sensitive information without duplicating it
within the CYC knowledge base.
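
The idea of answering a query from a mix of knowledge-base assertions and
external data can be sketched as follows.  This is only an illustration
in Python and says nothing about how Cyccess is actually built; the table,
the column-to-predicate mapping, and every name below are hypothetical.

```python
# Hypothetical external table (the SIS); its rows stay outside the KB.
EMPLOYEE_ROWS = [
    {"name": "Ann", "dept": "Sales"},
    {"name": "Bob", "dept": "Research"},
]

# Hypothetical knowledge-base side: general facts, plus a mapping that says
# which table column corresponds to which predicate.
KB_FACTS = {
    ("locatedIn", "Sales", "Austin"),
    ("locatedIn", "Research", "Dallas"),
}
COLUMN_TO_PREDICATE = {"dept": "worksInDept"}

def sis_facts(rows, mapping):
    """Yield (predicate, subject, value) triples on demand from the table."""
    for row in rows:
        for column, predicate in mapping.items():
            yield (predicate, row["name"], row[column])

def city_of(person):
    """Combine one SIS fact (department) with one KB fact (its location)."""
    for pred, subj, dept in sis_facts(EMPLOYEE_ROWS, COLUMN_TO_PREDICATE):
        if pred == "worksInDept" and subj == person:
            for kb_pred, kb_dept, city in KB_FACTS:
                if kb_pred == "locatedIn" and kb_dept == dept:
                    return city
    return None
```

Because the table rows are read when the query runs, they are never
copied into KB_FACTS, which mirrors the point about not duplicating
time-sensitive information.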

----------------------------------------------------------------
Subject: [9]   What are the details of the functional interface to CYCL ?

The Functional Interface of CYC is a method that can be used by external
programs to query CYC for conclusions or general information. It currently
consists of six operations.

ASK      given  a logical formula, possibly including free variables,
         given  a knowledge base subset
         return a binding list of variables that will make the formula true

ASSERT   given  a logical formula
         given  a knowledge base subset
         return a modified knowledge base with the formula as an axiom

UNASSERT given  a logical formula
         given  a knowledge base subset
         return a modified knowledge base in which the formula is no longer
                an axiom (the formula may still be a theorem, that is,
                follow from other axioms)

JUSTIFY  given  a logical formula
         given  a knowledge base subset
         return a minimal subset of the knowledge base sufficient
                to justify derivation of the formula as a theorem

CREATE   given  a name string
         given  a collection (set) term
         create and return a constant that is a member of the collection

KILL     given  a term
         return a modified knowledge base in which the term no longer
                exists and any assertion of which it was a component
                is no longer true
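
To make the shape of these six operations concrete, here is a toy Python
class that mimics them over ground tuples.  It is a sketch under heavy
assumptions, not the real interface: the class and method names are
invented, the "knowledge base subset" argument is omitted, the knowledge
base is modified in place rather than returned, and there is no inference,
so JUSTIFY can only point at the axiom itself.

```python
class ToyKB:
    """Toy store of ground (predicate, arg1, arg2) axioms; no inference."""

    def __init__(self):
        self.axioms = set()
        self.constants = set()

    def create(self, name, collection):
        # CREATE: make a constant and record its membership in the collection.
        self.constants.add(name)
        self.axioms.add(("isa", name, collection))
        return name

    def assert_(self, formula):
        # ASSERT: add the formula as an axiom.
        self.axioms.add(formula)

    def unassert(self, formula):
        # UNASSERT: remove the formula from the axioms if present.
        self.axioms.discard(formula)

    def ask(self, pattern):
        # ASK: one binding dict per matching axiom; "?x" marks a variable.
        results = []
        for axiom in self.axioms:
            if len(axiom) != len(pattern):
                continue
            bindings = {}
            for p, a in zip(pattern, axiom):
                if isinstance(p, str) and p.startswith("?"):
                    if bindings.setdefault(p, a) != a:
                        break
                elif p != a:
                    break
            else:
                results.append(bindings)
        return results

    def justify(self, formula):
        # JUSTIFY: without inference, only the axiom itself can justify it.
        return {formula} if formula in self.axioms else set()

    def kill(self, term):
        # KILL: drop the term and every axiom that mentions it.
        self.constants.discard(term)
        self.axioms = {ax for ax in self.axioms if term not in ax}

kb = ToyKB()
kb.create("Fido", "Dog")
kb.assert_(("likes", "Fido", "Bones"))
dogs = kb.ask(("isa", "?x", "Dog"))
```

After these calls, dogs holds a single binding of ?x to "Fido"; calling
kb.kill("Fido") would then remove both axioms that mention Fido.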

----------------------------------------------------------------
Subject: [A] Acknowledgements

This FAQ has been based on magazine articles and books published by the CYC
team, notably Doug Lenat and R.V. Guha, and personal communication with Doug
Lenat and Nick Siegel.

Any mistakes in it are my sole responsibility, although this is not a
warranty.  I would appreciate a note about any inaccuracies or
misrepresentations herein.  This FAQ could not have been created without
the generosity of the CYC team in sharing information with the computing
community about the methods and philosophy they have been using.

----------------------------------------------------------------
Subject: [B] Bibliography of Expert Systems books, introductions,
               documentation, periodicals, and conference proceedings.

Lenat, D.B. and Guha, R.V.
	Building Large Knowledge Based Systems, Addison Wesley,
	Reading, Mass., 1990

Guha, R.V., Lenat, D.B., Pittman, K., Pratt, D., and Shepherd, M.
	CYC: a midterm report, Communications of the ACM,
	July 1990, Vol. 33, No. 8

Guha, R.V. and Lenat, D.B.
	CYC: a midterm report, A.I. Magazine, Fall 1990

Lenat, D.B. and Guha, R.V.
	Enabling Agents to Work Together, Communications of the ACM,
	July 1994, Vol. 37, No. 7

Davidson, Clive
	Common Sense and the Computer, New Scientist,
	April 2, 1994
   
----------------------------------------------------------------