Cyc Administrator Handbook/Backup and Restore




This page is based on the original document Enterprise Cyc Administrator Handbook at https://www.cyc.com/documentation/enterprise-cyc-administrator-handbook/ There is no intent to infringe on Cycorp's Copyright.
It is Copyright (c) Cycorp 2019
Cycorp's address is 7718 Wood Hollow Drive Suite 250 Austin, TX 78731 USA
mailto:info@cyc.com   Main Phone: 512.342.4000

Dumping Units Files

Akin to the data dump in an SQL server, the Cyc server can write its knowledge out in a binary format that is suitable for backup and restore operations.
For the duration of a dump, the Cyc server must be isolated from all user interactions by disabling all network services; see the discussion of starting and stopping all TCP/IP services below.

The Basic Dump Command

The basic SubL command for dumping a set of units files is (DUMP-KB path), where path is a slash-terminated directory path that is interpreted relative to ${CYC_BASE_DIR}.
Since all network services have been stopped, the command must be issued either from the startup scripts, or at the read prompt.
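
Because the path must end in a slash, it can help to build the form with a small shell helper before pasting it at the read prompt or adding it to a startup script. The following is a minimal sketch: the function name dump_kb_form is hypothetical and not part of the Cyc distribution, and it only prints the SubL form rather than talking to a Cyc image.

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not shipped with Cyc): print the
# SubL dump form for a directory, adding the required trailing slash.
dump_kb_form() {
    dkf_dir=${1:-}
    if [ -z "$dkf_dir" ]; then
        echo "usage: dump_kb_form <directory>" >&2
        return 1
    fi
    case "$dkf_dir" in
        */) ;;                           # already slash-terminated
        *)  dkf_dir="$dkf_dir/" ;;       # append the missing slash
    esac
    printf '(dump-kb "%s")\n' "$dkf_dir"
}
```

For example, calling dump_kb_form units/2101 prints (dump-kb "units/2101/"), which can then be issued at the read prompt.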

Note that dumping a KB always increments the KB number for the dumped units; loading them will result in a KB number higher by one.
The resulting units are directly usable by the CYC-LOAD-KB command, for example by editing the ${CYC_BASE_DIR}/init/jrtl-init.lisp startup file.
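
Since each dump yields a KB number one higher, a target directory can be named after the next number. The sketch below assumes the units directories are named by KB number (as in the RAM disk example on this page, e.g. units/2100); the function name next_kb_number is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not shipped with Cyc): print the
# KB number that follows the highest numbered directory under a units
# directory, e.g. 2101 when units/2100 is the newest.
next_kb_number() {
    nkn_units_dir=$1
    nkn_max=
    for nkn_path in "$nkn_units_dir"/*/; do
        [ -d "$nkn_path" ] || continue
        nkn_name=$(basename "$nkn_path")
        case "$nkn_name" in
            # skip empty, zero-prefixed, and non-numeric names
            ''|0?*|*[!0-9]*) continue ;;
        esac
        if [ -z "$nkn_max" ] || [ "$nkn_name" -gt "$nkn_max" ]; then
            nkn_max=$nkn_name
        fi
    done
    [ -n "$nkn_max" ] || return 1        # no numbered directory found
    echo $((nkn_max + 1))
}
```

The function fails with a non-zero status when no numbered directory exists, so a calling script can stop before creating a misnamed target directory.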

The Clean-up Dump Command

In addition, the process of dumping units files can perform various forms of cleanup and compaction, which is most useful after significant changes to the knowledge. The dump first performs a variety of cleanups and then behaves as DUMP-KB described above.
The command is (DUMP-STANDARD-KB path), where path is a slash-terminated directory path that is interpreted relative to ${CYC_BASE_DIR}.
The resulting units are directly usable by the CYC-LOAD-KB command, for example by editing the ${CYC_BASE_DIR}/init/jrtl-init.lisp startup file.

Dumping Units when using Transcript Servers

To be completed

Dump Performance

Because the Cyc server has to revisit every piece of knowledge in order to produce a proper representation of the dump, it is advantageous to read the base units and to write the target units to the fastest file systems available, including but not limited to RAM disk.

Using RAM disks under UN*X

The following example Linux shell script shows how to set up a RAM disk and use it for dumping units.

# determine how much room is needed; assume KB number is 2100
cd ${CYC_BASE_DIR}
du -sh units/2100

# set up a RAM disk of at least twice that size, e.g. 600MB
export RAMDISK=/var/local/ramdisk
sudo mkdir -p ${RAMDISK}
sudo mount -t tmpfs none ${RAMDISK} -o size=600m
sudo chown cyc ${RAMDISK}

# prepare the units directories, one for source & one for target
mkdir ${RAMDISK}/units
mkdir ${RAMDISK}/units/2100
mkdir ${RAMDISK}/units/2101

# copy the units there and launch the CYC image, without TCP servers
# (the inner double quotes are escaped so the shell passes them to SubL)
cp --archive --verbose units/2100 ${RAMDISK}/units
bin/run-cyc.sh -f "(progn (load \"init/cyc-init.lisp\")
 (cyc-load-kb \"${RAMDISK}/units/2100/\")
 (load \"init/release-specific-init.lisp\"))"

# use the CYC image, e.g. load transcript files, etc.
# dump the units back to the RAM disk;
# all the following commands are issued at the Cyc read loop,
# which has a prompt of CYC(###), where ### is the interaction number
(dump-kb "/var/local/ramdisk/units/2101/")
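
The "at least twice that size" rule from the script above can be computed rather than read off by eye. The following is a minimal sketch; the function names double_kb_to_mb and ramdisk_size_mb are hypothetical, and the result is a lower bound for the tmpfs size option, not a tuned value.

```shell
#!/bin/sh
# Hypothetical helpers (assumptions, not shipped with Cyc).

# Double a size given in KB and convert it to MB, rounding up.
double_kb_to_mb() {
    echo $(( ($1 * 2 + 1023) / 1024 ))
}

# Print the suggested RAM disk size in MB for a units directory:
# twice its current disk usage as reported by du.
ramdisk_size_mb() {
    rsm_kb=$(du -sk "$1" | cut -f1)
    case "$rsm_kb" in
        ''|*[!0-9]*) return 1 ;;         # du failed or gave no number
    esac
    double_kb_to_mb "$rsm_kb"
}
```

The printed number can be used in the mount step, for instance as size=$(ramdisk_size_mb units/2100)m in place of the fixed size=600m.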

Creating KB Subsets

CYC can extract a subset of knowledge that is useful in its own right and turn it into a full-fledged knowledge base. These sets of knowledge are called KB subsets.

The process of creating a KB subset consists of two key steps:

1) Identifying the KB subset to serialize out.

2) Serializing out the knowledge and expanding that knowledge into KB units in their own right.

Identifying a KB subset

The identification of KB subsets is done via the following set of KB vocabulary. Users are advised to refer to the comment assertions in the KB on these terms for their proper use.

#$KBSubsetProfile

#$kbsProfileRemoveAssertion

#$kbsProfileRemoveCollectionExtent

#$kbsProfileRemoveFORT

#$kbsProfileRemovePredicateExtent

#$kbsProfileRetainAssertion

#$kbsProfileRetainTerm

The basic process is to introduce a new term of type KBSubsetProfile, which is then associated via the relations mentioned above to the assertions, terms, predicate extents or collection extents that are to be retained or removed.

Debugging a KBS Profile

Though the actual process of KB subset construction is quite complicated, the CYC browser can simulate the expected result of a KB subset process with a high degree of accuracy.
In order to test a specific KBS profile, evaluate the following form:

(progn
(set-kbs-definition-from-kbs-profile <kbs-profile-term>)
(identify-kbs-forts-and-assertions))

Now, when browsing the KB, terms and assertions that will be in the KBS partition will be marked with a green circle; terms and assertions that will be missing will be marked with red cross-marks.

Creating units from a KB Subset Profile Term

The following sequence of steps will produce a KB subset as a set of units files, as defined by a KBS profile term.

Generate the KBS Partition

This sequence of code steps will first identify the partition information and then serialize essential KB information into the partition file. The step also produces additional information about the utility of rules and the caching policies for SBHL, which are written (with standard names) to the partition directory provided.

(progn
(set-kbs-definition-from-kbs-profile <ProfileTerm>)
(make-kbs-partition <partition-dir> <partition-filename>)
)

Expanding the KBS Partition into Preliminary Units

This sequence of code steps will load a partition, the rule utility information, and the SBHL caching policies into an empty KB; the code expects to be started from a CYC image without a KB (e.g. bin/run-cyc-no-init.sh -i init/cyc-init.lisp).

(progn
(load-partition-into-empty-kb <partition-dir> <partition-filename>)
(dump-standard-kb <pre-units-directory-path>)
)


Cleanly Rebuilding Preliminary into Final Units

This sequence of code steps will boot into the preliminary units KB, perform another round of standard cleanups, and then dump the final units to their directory path. Since the sequence explicitly loads the KB units, the code expects to be started without a KB, e.g. with the Linux command:

${CYC_BASE_DIR}/bin/run-cyc-no-init.sh -i ${CYC_BASE_DIR}/init/cyc-init.lisp

(progn
(cyc-load-kb <pre-units-directory-path>)
(kbs-image-cleanup)
(dump-standard-kb <final-units-directory-path>)
)
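
The three stages above use the same few placeholders, so the forms can be generated from one set of values to keep the paths consistent between stages. The following is a minimal sketch: the function names are hypothetical, the script only prints SubL forms and does not start a Cyc image, and it assumes that directory and file arguments are passed as SubL strings, as in the cyc-load-kb and dump-kb examples earlier on this page.

```shell
#!/bin/sh
# Hypothetical helpers (assumptions, not shipped with Cyc): print the
# SubL form for each stage of building units from a KBS profile term.

# Stage 1 args: profile-term partition-dir partition-filename
kbs_stage1_form() {
    printf '(progn\n  (set-kbs-definition-from-kbs-profile %s)\n' "$1"
    printf '  (make-kbs-partition "%s" "%s"))\n' "$2" "$3"
}

# Stage 2 args: partition-dir partition-filename pre-units-dir
kbs_stage2_form() {
    printf '(progn\n  (load-partition-into-empty-kb "%s" "%s")\n' "$1" "$2"
    printf '  (dump-standard-kb "%s"))\n' "$3"
}

# Stage 3 args: pre-units-dir final-units-dir
kbs_stage3_form() {
    printf '(progn\n  (cyc-load-kb "%s")\n  (kbs-image-cleanup)\n' "$1"
    printf '  (dump-standard-kb "%s"))\n' "$2"
}
```

Each stage still has to be evaluated in its own Cyc session as described above; for stages 2 and 3 that means an image started without a KB. A profile term would be passed in single quotes so that the shell leaves it untouched, e.g. '#$MyKBSProfile', which is a made-up name.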