Makefile DTD

The content collection makefile DTD defines the format to which a makefile must conform.

<!-- 
	DTD for NXT 4 make files
	Version 4.0
	Copyright (c) 2005-2012, Rocket Software, Inc.
-->

<!-- 
The following parameter entities may be overridden in the make file's
internal DTD subset. For example, to change the default value for document
encodings from #IMPLIED (auto detect) to iso-8859-1, the internal
DTD subset in the make file would look like this (though the path to the
DTD may be different):

<!DOCTYPE makefile SYSTEM "file:///v:/production/dtd/makefile.dtd" [
	<!ENTITY % default-encoding '"iso-8859-1"'>
]>

-->
<!ENTITY % default-dse-id            "'file-system-dse'">
<!ENTITY % default-dse-id-metadata   "'file-system-dse'">
<!ENTITY % default-hidden            "'no'">
<!ENTITY % default-content-type      "#IMPLIED">
<!ENTITY % default-encoding          "#IMPLIED">
<!ENTITY % default-compression       "'none'">
<!ENTITY % default-indexsheet        "#IMPLIED">
<!ENTITY % fsysdse-class-id-ansi     "{023933C0-CFCF-11d1-B139-00C04F932EC0}">
<!ENTITY % fsysdse-class-id-unicode  "{DCFC3BDD-5125-4734-B422-81B65E50611A}">


<!-- Pre-defined DSE for easy use. -->
<!ENTITY fsysdse "<dse id='file-system-dse' class-id='%fsysdse-class-id-unicode;'/>">
<!ENTITY fsysdse-class-id "%fsysdse-class-id-unicode;">

<!ELEMENT makefile (dse+, (infobase|content-collection))>
<!ATTLIST makefile
	version	CDATA #FIXED "4.0">

<!ELEMENT dse (param*)>
<!ATTLIST dse
	id		ID #REQUIRED
	class-id	CDATA #REQUIRED
	chain		IDREF #IMPLIED>

<!ELEMENT param EMPTY>
<!ATTLIST param
	name	CDATA #REQUIRED
	value	CDATA #REQUIRED>	

<!ELEMENT content-collection ((field|field-def)*, indexsheet*, property*, document+)>
<!ATTLIST content-collection
	id			CDATA #REQUIRED
	title			CDATA #REQUIRED
	filename		CDATA #REQUIRED
	update-filename		CDATA "update.upd"
	encryption		(none|exportable|best) "none"
	password		CDATA #IMPLIED
	old-password		CDATA #IMPLIED
	stop-words		(yes|no) "no"
	index-headings		(yes|no) "no"
	lang-module		CDATA "NextPage US English Server Extension Module. Version 2.01">

<!-- infobase is deprecated in favor of content-collection -->
<!ELEMENT infobase ((field|field-def)*, indexsheet*, property*, document+)>
<!ATTLIST infobase
	id			CDATA #REQUIRED
	title			CDATA #REQUIRED
	filename		CDATA #REQUIRED
	update-filename		CDATA "update.upd"
	encryption		(none|exportable|best) "none"
	password		CDATA #IMPLIED
	old-password		CDATA #IMPLIED
	stop-words		(yes|no) "no"
	index-headings		(yes|no) "no"
	lang-module		CDATA "NextPage US English Server Extension Module. Version 2.01">

<!ELEMENT field EMPTY>
<!ATTLIST field
	name		CDATA #REQUIRED
	type		(text|long|double|date|time) "text"
	relevance	(normal|high|higher|highest) "normal"
	picture		CDATA #IMPLIED
	index		(yes|no) "yes"
	exclusive	(yes|no) "no"
	term-list	(yes|no) "no"
	phrase		(yes|no) "no"
	toc-section	(yes|no) "no"
	stop-words	(yes|no) "no"
	proximity	(yes|no) "yes"
	date-2000	(yes|no) "no">

<!-- field-def is deprecated in favor of field -->
<!ELEMENT field-def EMPTY>
<!ATTLIST field-def
	name		CDATA #REQUIRED
	type		(text|long|double|date|time) "text"
	relevance	(normal|high|higher|highest) "normal"
	picture		CDATA #IMPLIED
	index		(yes|no) "yes"
	exclusive	(yes|no) "no"
	term-list	(yes|no) "no"
	phrase		(yes|no) "no"
	toc-section	(yes|no) "no"
	stop-words	(yes|no) "no"
	proximity	(yes|no) "yes"
	date-2000	(yes|no) "no">

<!ELEMENT property EMPTY>
<!ATTLIST property
	name		CDATA #REQUIRED
	value		CDATA #REQUIRED>

<!-- compression best is deprecated in favor of fast -->
<!ELEMENT document (metadata?,(document)*)>
<!ATTLIST document
	id			CDATA #IMPLIED
	name			CDATA #REQUIRED
	title			CDATA #IMPLIED
	hidden			(yes|no) %default-hidden;
	index			(yes|no) "yes"
	location		CDATA #IMPLIED
	dse			IDREF %default-dse-id;
	content-type		CDATA %default-content-type;
	encoding		CDATA %default-encoding;
	compression		(none|fast|best) %default-compression;
	indexsheet		IDREF %default-indexsheet;
	first-child-content	(yes|no) "no">

<!ELEMENT metadata EMPTY>
<!ATTLIST metadata
	location		CDATA #IMPLIED
	dse			IDREF %default-dse-id-metadata;>

<!ELEMENT indexsheet EMPTY>
<!ATTLIST indexsheet
	id			ID #REQUIRED
	src			CDATA #REQUIRED
	default-for-content-type CDATA #IMPLIED>


Makefile Elements

The makefile can contain the following elements.

Element Name Description
content-collection Marks the beginning of the content within a content collection. This is the root content collection element.
DOCTYPE Specifies that a file's document type is "makefile." It also gives the path to the makefile.dtd for the content collection.
document Identifies a document or heading to store in a content collection.
dse Declares the Document Source Extension (DSE) to be used to import documents into a content collection.
field Defines a field for which to create an index.
indexsheet Declares an indexsheet that is stored in a file separate from the content collection makefile. The src attribute gives a full or relative path (or a URL using the file protocol) to the indexsheet file. Indexsheets can be referenced only as external files; they cannot be included inline in the makefile.
makefile Marks the beginning of makefile information. This is the root makefile element. The makefile version was updated to 4.0 to coincide with the current product version.
metadata Metadata file to include in the content collection. Metadata is stored as a document property and describes the document.
param Specifies a parameter (named value) to pass to a DSE.
property Specifies the value of a content collection property.
xml Specifies that a file contains a document conforming to an XML specification.

content-collection

Marks the beginning of the content within a content collection. This is the root content collection element.

Note: The term infobase has been deprecated in favor of content-collection. Infobase is maintained for backwards compatibility.

Definition

<!ELEMENT content-collection 
     ((field | field-def)*, indexsheet*, property*, document+)>
<!ATTLIST content-collection
     id                  CDATA #REQUIRED
     title               CDATA #REQUIRED
     filename            CDATA #REQUIRED
     update-filename     CDATA "update.upd"
     encryption          (none|exportable|best) "none"
     password            CDATA #IMPLIED
     old-password        CDATA #IMPLIED
     stop-words          (yes|no) "no"
     index-headings      (yes|no) "no"
     lang-module         CDATA "NextPage US English Server Extension Module. Version 2.01"
>

Attributes

Attribute Description
id [Required] ID assigned to the content collection. Content collection IDs are used to locate content collections when executing ID-based hypertext links.

This ID is a string of numbers and letters (maximum length is 127 characters) that is stored as a content collection property. The ID specified with this attribute is not the same as the ID generated by the NXT 4 server when a content collection is created.

A content collection's ID should be different from the ID of any other content collection that might be mounted on the same web site. You might want to investigate Digital Object Identifiers (DOI) as a method of developing a unique ID for your content collections.

You can change the ID of a content collection when updating it; however, changing the ID will break any links to the content collection.
title [Required] Title assigned to the content collection. This is an alphanumeric string (maximum length is 255 characters). When you use the Content Network Manager to add a content collection to an NXT 4 site, the title attribute is copied from the content collection to the Site Definition File (SDF). The NXT 4 site table of contents displays a content collection's title attribute as the root heading for the content collection. You can change the heading displayed for a content collection by modifying the title attribute assigned to that content collection in the Site Definition File.
filename [Required] Path and filename of the content collection to be created or updated. This is an alphanumeric string (maximum length is 255 characters).

If the specified content collection exists and you run ccBuild with the /A command line option, ccBuild deletes it and creates a new content collection with the specified filename.

If the specified content collection exists and you run ccBuild without the /A command line option, ccBuild modifies the content collection. ccBuild replaces, adds, and removes all field, property, and document attributes to make the content collection contain the items specified in the makefile.
encryption Level at which to encrypt content collection data. You can specify none, exportable, or best. The default is none, which creates a content collection that is not encrypted. You cannot specify a password for a content collection that is not encrypted.

Exportable creates a content collection with the best encryption allowed to be exported from the USA (40-bit).

Best creates a content collection with RSA encryption. You may not export RSA encrypted content collections from the USA. You must specify a password for a content collection that uses encryption level best.

When updating a content collection, you may not change the encryption level unless you use the /A command line option to replace the content collection.
password Password to assign to the content collection. If you assign a password to a content collection, the password must be specified in the Site Definition File for the site on which the content collection is hosted. The password must also be specified when updating the content collection. You cannot specify a password for a content collection that uses encryption level none. You must specify a password for a content collection that uses encryption level best.

When updating a content collection that has a password, ccBuild tries to open the content collection using the password specified in the password attribute. If that fails, ccBuild tries to open it using old-password. If it successfully opens the content collection, the password is changed from old-password to password.

You do not need to specify a password when using the /A command line option to replace a content collection.
old-password Specifies the password of the content collection being updated.
stop-words Flag indicating whether or not to index stop words in the content collection. You can specify yes or no. The default is no, which decreases the size of a content collection by reducing the size of the index used for fast phrase searches. The language module used to build a content collection defines the stop words for the language. The stop words used in the English-US version of NXT 4 are:
a about after all an and are as be but by can for from had has have
he his I if in is it its no not of on or out said than that
the their they this to up was we were when which who will with would    
lang-module Name of the language module to use to create the content collection. The language module determines how the content collection's text is parsed and indexed. The default value specified in makefile.dtd is NextPage US English Server Extension Module. Version 2.01.

NXT 4 provides three sets of language modules. Version 1.00 language modules are the original language modules (they are provided for backward compatibility only). Version 2.00 language modules have been updated to take advantage of Unicode, and version 2.01 language modules are the latest update. Use version 2.01 language modules for new content collections.

NXT 4 Language Modules

Version 1.00 Language Modules
UK English "Folio UK English Server Extension Module. Version 1.00"
US English "Folio US English Server Extension Module. Version 1.00"
Spanish "Modulo de Folio para el idioma espanol. Version 1.00."
French "Module Folio de gestion de la langue francaise. Version 1.00"
Canadian French "Module Folio de gestion de la langue francaise(Canada). Version 1.00"
Dutch "Folio Nederlandse taalmodule. Versie 1.00"
Brazilian Portuguese "Modulo do Folio em portugues. Versao 1.00"
German "Folio-Sprachmodul Deutsch. Version 1.00"
Version 2.00 Language Modules
UK English "NextPage UK English Server Extension Module. Version 2.00"
US English "NextPage US English Server Extension Module. Version 2.00"
Spanish "Modulo de NextPage para el idioma espanol. Version 2.00."
French "Module NextPage de gestion de la langue francaise. Version 2.00"
Canadian French "Module NextPage de gestion de la langue francaise(Canada). Version 2.00"
Dutch "NextPage Nederlandse taalmodule. Versie 2.00"
Brazilian Portuguese "Modulo do NextPage em portugues. Versao 2.00"
German "NextPage-Sprachmodul Deutsch. Version 2.00"
Version 2.01 Language Modules
UK English "NextPage UK English Server Extension Module. Version 2.01"
US English "NextPage US English Server Extension Module. Version 2.01"
Spanish "Modulo de NextPage para el idioma espanol. Version 2.01."
French "Module NextPage de gestion de la langue francaise. Version 2.01"
Canadian French "Module NextPage de gestion de la langue francaise(Canada). Version 2.01"
Dutch "NextPage Nederlandse taalmodule. Versie 2.01"
Brazilian Portuguese "Modulo do NextPage em portugues. Versao 2.01"
German "NextPage-Sprachmodul Deutsch. Version 2.01"
New Version 1.00 Language Modules (These are newly supported languages; not compatible with NXT 3)

Note: The [Server\LanguageModules] section of the Publish.ini file contains mappings between these names and the corresponding DLL file names.

Remarks

You can specify only one content collection in a makefile. In addition, only one language can be specified for each content collection.

Example

<?xml version="1.0"?>
<!DOCTYPE makefile SYSTEM 
"file:///v:/production/dtd/makefile.dtd" [
     <!ENTITY prjdir "c:\Project\data\government">
]>
<makefile>
&fsysdse;

<content-collection
   id="gov"
   title="Government: A Sampling of Documents"
   filename="&prjdir;\government.nfo"
   update-filename="&prjdir;\government.upd">

<document location="&prjdir;\titlepage.html" 
    content-type="text/html"/>
<document location="&prjdir;\govtlogo.gif" 
    content-type="image/gif" hidden="yes" index="no"/>

<document name="Acts-Bills">
   <document location="&prjdir;\Acts-Bills\brady.html"  
       content-type="text/html"/>
   <document location="&prjdir;\Acts-Bills\civil91.html"  
   content-type="text/html"/>
</document>

<document name="Cases">
   <document location="&prjdir;\Cases\meinhold.html" 
   content-type="text/html"/>
</document>

</content-collection>
</makefile>
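
The password behavior described under Attributes can be sketched as a makefile that rotates a password during an update. This is only an illustration under stated assumptions: the collection ID, title, file paths, and both password strings are placeholders, and the collection is assumed to have been built earlier with encryption level exportable (the level cannot change during an update).

```xml
<?xml version="1.0"?>
<!DOCTYPE makefile SYSTEM "file:///c:/Rocket/makefile.dtd">
<makefile>
&fsysdse;
<!-- Hypothetical collection: every name, path, and password is a placeholder. -->
<content-collection
   id="policies"
   title="Policy Manual"
   filename="c:\Project\data\policies.nfo"
   encryption="exportable"
   password="new-secret"
   old-password="previous-secret">

   <!-- A single document so that the document+ content model is satisfied. -->
   <document name="intro" location="c:\Project\data\intro.html"
      content-type="text/html"/>

</content-collection>
</makefile>
```

With a makefile like this, ccBuild would first try to open the collection with the value of password and, if that fails, with old-password, after which the stored password becomes the value of password. The new password would also need to be entered in the Site Definition File.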

DOCTYPE

Specifies a file's document type. It also gives the path to the document type definition (DTD) specifying how to parse and validate the XML file.

Definition

<!DOCTYPE {file type} SYSTEM "file:///{path to makefile.dtd}">

or

<!DOCTYPE {file type} SYSTEM "file:///{path to makefile.dtd}"
[
   <!ENTITY {entity name} "{entity value}">
]>

Remarks

A DTD can have two parts: an external subset, stored in the file specified using the DOCTYPE tag, and an internal subset, declared inside of the DOCTYPE tag. Entities declared in the DTD's internal subset take precedence over entities declared in the external subset.

The syntax for specifying the DTD is:

protocol://host/path/file.ext

The following examples use a shortened form of this syntax. The host is omitted because it is assumed to be localhost. The resulting syntax is:

file:///c:/myfile.txt
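
As a sketch of how the internal subset takes precedence, the following makefile overrides two of the default parameter entities declared in makefile.dtd and also declares a general entity for a project path. The DTD location, the prjdir value, and all collection and document names are assumptions made for illustration.

```xml
<?xml version="1.0"?>
<!DOCTYPE makefile SYSTEM "file:///c:/Rocket/makefile.dtd" [
   <!-- Parameter entities: override defaults declared in makefile.dtd. -->
   <!ENTITY % default-content-type '"text/html"'>
   <!ENTITY % default-compression  '"fast"'>
   <!-- General entity: a path shorthand for use in attribute values. -->
   <!ENTITY prjdir "c:\Project\data\handbook">
]>
<makefile>
&fsysdse;
<content-collection id="handbook" title="Employee Handbook"
   filename="&prjdir;\handbook.nfo">
   <!-- Inherits content-type="text/html" and compression="fast". -->
   <document name="welcome" location="&prjdir;\welcome.html"/>
   <!-- Attributes set on the element still win over the defaults. -->
   <document name="logo" location="&prjdir;\logo.gif"
      content-type="image/gif" compression="none"
      hidden="yes" index="no"/>
</content-collection>
</makefile>
```

Note the nested quoting in the parameter entity values: the outer single quotes delimit the entity value, and the inner double quotes become the quoted attribute default once the entity is expanded inside the DTD.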

Example

<!DOCTYPE makefile SYSTEM "file:///c:/Rocket/makefile.dtd">

<!DOCTYPE makefile SYSTEM "file:///c:/Rocket/makefile.dtd"
[
   <!ENTITY prjdir "c:\Rocket\data\sports">
   <!ENTITY % default-hidden "'no'">
]>

document

Specifies the document or heading to store in a content collection.

Definition

<!ELEMENT document (metadata?,(document)*)>
<!ATTLIST document
   id             CDATA #IMPLIED
   name           CDATA #REQUIRED
   title          CDATA #IMPLIED
   hidden         (yes|no) %default-hidden;
   index          (yes|no) "yes"
   location       CDATA #IMPLIED
   dse            IDREF %default-dse-id;
   content-type   CDATA %default-content-type;
   encoding       CDATA %default-encoding;
   compression    (none|fast|best) %default-compression;
   indexsheet     IDREF %default-indexsheet;
   first-child-content  (yes|no) "no"
>

Attributes

Attribute Description
id ID of the document. Document IDs are used to locate documents when executing ID-based hypertext links. If you do not specify an ID for a document, the NXT 4 server generates an ID for it.

ccBuild assigns an ID to a document when it is first stored in a content collection. When updating a document, you can change the document ID, but changing the ID invalidates links to the document unless they are updated with the new ID.

Because document IDs are strings (maximum length is 127 characters), they can hold IDs that meet the DOI specification. The NXT 4 server does not enforce strict compliance with DOI syntax. However, Rocket recommends the DOI system as a good means of ensuring unique IDs across all documents from all publishers that coexist in an intranet library or a global Internet library. See http://www.doi.org/ for more information on DOI.
name Name of the document. Programs such as the NXT 4 server use document names internally to list and find documents.

Document paths in a content collection are based on document names. The last name in a document's path is the document's name. The maximum length of a document path is 2047 characters.

Because of the length of the URL, you should make the document name as short as possible. Moreover, Rocket strongly suggests that the name not be the same as the title of the document.

A name is assigned to a document when it is created. When updating a content collection, changing the name or path of a document causes it to be deleted and re-added in the new location.

If you do not specify a location for a document in the makefile you must specify a name. If a name is not specified in the makefile, ccBuild assigns the document the name provided by the DSE being used. The File System DSE specifies the name of a document as the name of the source file from which it was created.

Maximum length of the name is 127 characters. Characters that are valid for naming files, folders, content collections, and so forth include the following:

A-Z Any combination of uppercase letters
a-z Any combination of lowercase letters
0-9 Any combination of numbers
` Accent
~ Tilde
! Exclamation mark
@ At sign
# Pound sign
$ Dollar sign
% Percent
^ Caret
& Ampersand
- Hyphen
_ Underscore
= Equal sign
+ Plus sign
{ Left curly brace
} Right curly brace
' Apostrophe (single quotation mark)
. Period
(space) Space


Characters that are not valid for naming files, folders, content collections, and so forth include the following:

( Left parenthesis
) Right parenthesis
[ Left bracket
] Right bracket
\ Back slash
/ Forward slash
: Colon
; Semicolon
* Asterisk
? Question mark
" Quotation mark
< Less than
> Greater than
| Pipe
, Comma
title Title of the document as it appears in the content collection's table of contents. The maximum length of a document title is 32767 bytes. If you do not specify a title for a document in the makefile, ccBuild assigns the title provided by the DSE being used. The File System DSE does not provide a title for documents. If a title is not specified in the makefile or by the DSE, the document is assigned the title specified in the body of the document. For example, inside an HTML document, the <H1> element can specify the title of the document.

You may modify this attribute when updating a document.
hidden Flag indicating whether or not to display the document title in the content collection's table of contents. You can specify yes or no. The default is no.

Redefine the default-hidden entity as #IMPLIED in a makefile's internal DTD to cause ccBuild to request the hidden value from the DSE when the hidden flag is not specified for a document.

You can change the default value of this attribute by defining the default-hidden entity in the internal subset of a makefile's DTD. You can modify this attribute when updating a document.

A hidden document is not accessible through navigation options such as the Next or Prev Match links. It can be retrieved only through direct links to the document.
index Flag indicating whether or not to index the document. You can specify yes or no. The default is yes. Setting the attribute to no turns off indexing for the document, which means searches will not find hits in the document. This attribute applies only to documents with a content type that can be indexed. For example, if you specify yes for a GIF image, ccBuild assumes you meant no and does not index the GIF document.

location Location of the source data from which to create the document. Locations must follow the format supported by the specified DSE.

If you do not specify the location attribute for a document, an empty document is created in the content collection. This is how you create headings without content. When a user clicks on an empty document in the NXT 4 table of contents, the displayed document does not change. To avoid this, you could specify a document for headings.

The File System DSE requires locations to be standard file names. Entities are frequently used in makefiles to shorten the text required to specify the path to a document. See Makefile Entities for File Paths for more information.
dse Document Source Extension (DSE) to use to retrieve the source document. The default value specified in makefile.dtd is file-system-dse. You can change the default value by defining the default-dse-id entity in the internal subset of the makefile DTD.

DSEs must be defined using the dse element before they are specified as the value of a document's dse attribute.
content-type Name of the document's MIME type. Web browsers use a document's MIME type to determine how to render it. The NXT 4 server uses a document's MIME type to determine whether or not to index it.

The NXT 4 server indexes documents of the following MIME types: text/html, text/xml, application/pdf, application/msexcel, application/msword, application/mspowerpoint, application/x-html-body-text, and text/plain. The engine uses the IFilter COM interface to extract terms from Microsoft Office and Acrobat PDF files.

If the content-type does not specify a MIME type, the NXT 4 server treats the document as binary data and does not index it. You can specify a default MIME type for documents by defining the default-content-type entity in the internal subset of the makefile DTD.

You can modify this attribute when updating a content collection.
encoding Specifies the character encoding for the document, such as ISO-8859-1 or UTF-8. The encoding attribute in the document element overrides the default encoding specified as a parameter entity in the makefile.

If the encoding attribute is not specified, the default encoding for the makefile is used. If no encoding is specified in the makefile, ccBuild attempts to detect the encoding by reading part of the document. If detection fails, the encoding is assumed to be ISO-8859-1.
compression Level at which to compress content collection data. You can specify none, fast, or best. The default is none. Specifying a higher compression level makes a content collection smaller but also makes it take longer to build. The compression is applied to each document, not to a content collection as a whole. You should specify none for document types which are already compressed such as GIF and JPEG images.

When updating a content collection, you may not change the compression level unless you use the /A command line option to replace the content collection, or the /U command line option to create an update file (see ccBuild for more information).
indexsheet ID of the indexsheet defining how the NXT 4 server should index the document. The ID cannot include the slash (/) character.

If the indexsheet attribute is not specified for a document, the NXT 4 server indexes all terms in the document. You can specify a different default by defining the default-indexsheet entity in the internal subset of the makefile DTD.

Indexsheets must be defined using the indexsheet element before they are specified as the value of a document's indexsheet attribute.

first-child-content Redirects a request for a document to the document's first child. This attribute is primarily for documents which are used to create headings in the table of contents, but which do not contain content. This simulates the behavior seen with default documents for a directory on a web server. Set this attribute to yes on the document and make sure that a document giving an overview or containing links to the section is the first child.
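
A minimal sketch of the first-child-content pattern follows. It assumes a prjdir entity declared in the makefile's internal subset; the document names, titles, and the overview.html file are hypothetical.

```xml
<!-- Heading with no content of its own (no location attribute). -->
<document name="Acts-Bills" title="Acts and Bills"
   first-child-content="yes">
   <!-- First child: the overview served when the heading is requested. -->
   <document name="overview"
      location="&prjdir;\Acts-Bills\overview.html"
      title="Overview" content-type="text/html"/>
   <document name="brady"
      location="&prjdir;\Acts-Bills\brady.html"
      title="Brady Bill" content-type="text/html"/>
</document>
```

Because the redirect always targets the first child, the overview document has to stay first in the makefile order for this arrangement to keep working.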

Remarks

Establish headings in a content collection using document titles. Establish the order and hierarchy of a content collection through the order and hierarchy of documents listed in the makefile. Make one document the child of a second by nesting the first between the begin and end tags of the second.

Fields, levels, and HTML heading tags may be applied within a document to create sub-headings. You should not include subheadings in documents that have child documents. The resulting table of contents puts the subheadings below the child documents.

When documents reference secondary files for their content, such as with graphics, the secondary files may be stored within the same content collection or externally. An HTML path uses the document's name to specify the location of a document in a content collection. References from one document to another document in the same content collection should be relative URLs. For example, an HTML file could contain the following:

<IMG SRC="../images/myimage.gif">

As shown here, you may specify relative paths using the ".." and "/" characters commonly used in URLs. However, you may not specify a URL relative to the root by beginning the URL with "/". Instead you should use an absolute URL from the server by adding a replacement string as shown in the following example:

<IMG SRC="#!--#EXECUTIVE:HOME_PATH --#/images/myimage.gif">

This results in a reference that looks like:

<IMG SRC="http://localhost/NXT/gateway.dll/images/myimage.gif">

Example

<document name="Acts-Bills">
   <document location="&prjdir;\Acts-Bills\brady.html" 
      title="Brady Bill" content-type="text/html"/>
   <document location="&prjdir;\Acts-Bills\civil91.html" 
      title="Civil Rights Bill 191" content-type="text/html"/>
   <document location="&prjdir;\Acts-Bills\compfraud.html" 
      title="Compensation Fraud" 
      content-type="text/html" 
      encoding="UTF-8"/>
</document>
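
The following fragment is a sketch of how a document might reference an indexsheet and carry a metadata child, based only on the element declarations in makefile.dtd. The indexsheet ID, both file names, and their file extensions are assumptions; the formats of the indexsheet and metadata files themselves are not shown here.

```xml
<content-collection id="gov"
   title="Government: A Sampling of Documents"
   filename="&prjdir;\government.nfo">

   <!-- Declared before any document that refers to it by ID. -->
   <indexsheet id="bills-sheet" src="&prjdir;\indexsheets\bills.xml"/>

   <document name="brady"
      location="&prjdir;\Acts-Bills\brady.html"
      title="Brady Bill" content-type="text/html"
      indexsheet="bills-sheet">
      <!-- Optional metadata file, listed before any child documents. -->
      <metadata location="&prjdir;\Acts-Bills\brady-meta.xml"/>
   </document>

</content-collection>
```

The indexsheet attribute is an IDREF, so its value has to match the id of an indexsheet element in the same makefile.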

dse

Declares the Document Source Extension (DSE) to be used to import documents into a content collection.

Definition

<!ELEMENT dse (param*)>
<!ATTLIST dse
      id          ID #REQUIRED
      class-id    CDATA #REQUIRED
      chain       IDREF #IMPLIED>

Attributes

Attribute Description
id ID used to refer to the DSE when specifying it in the dse attribute of a document or in the chain attribute of a DSE.
class-id COM class ID uniquely identifying the DSE. ccBuild uses this ID to load and access the DSE. The class IDs of the NXT DSE are specified in makefile.dtd. DSE class IDs should be documented by DSE creators.
chain (Optional) ID of a DSE that the current DSE should chain to. You must declare a DSE before specifying that another DSE chain to it.

Remarks

DSEs must be defined inside the makefile element, before the content-collection element.

Use the param element to specify options for a DSE to use.

A DSE abstracts a source of documents to store in a content collection. ccBuild uses DSEs to read documents for storing in a content collection. DSEs may preprocess documents before handing them to ccBuild.

DSEs can be chained so that the output of one DSE is the input of another. In order to establish a chain, a DSE's chain attribute must be set to the ID of another DSE.

Note: The NXT File System DSE does not support chaining.

The following table shows the entity the makefile.dtd defines for using the NXT DSE.

Entity DSE ID Supports Chaining to Other DSEs Default DSE it Chains To
fsysdse (File System DSE) file-system-dse No none

Before specifying the NXT DSE in a makefile, you must include its entity. You must declare a DSE before declaring another DSE that chains to it.

Both Unicode and ANSI DSEs are supported in NXT 4. ccBuild tries to use the Unicode interface first and then switches to the ANSI interface. Each set of DSE interfaces has a different interface identifier (GUID) to distinguish between Unicode and ANSI. Because of the differences in the DSE interfaces, chaining between the two interfaces is not recommended and in most situations will not work. A DSE can be written to handle the string translations itself, which would allow chaining between Unicode and ANSI DSEs, but this does not occur automatically.

Example

The following example demonstrates defining and using a custom DSE called my-dse.

<?xml version="1.0"?>
<!DOCTYPE makefile SYSTEM 
"file:///c:/Program Files/Rocket/NXT 4/Builder/makefile.dtd">

<makefile>
<dse id='my-dse' 
class-id='{025933C0-CFCF-11d1-B139-00C04F932EC0}'/>
<content-collection
     id="fisha"
     title="Fish Almanac"
     filename="c:\Project\content collection\Almanac.nfo">

     <document   name="Trout Records"
                 location="c:\Project\data\trout.mtf"
                 dse="my-dse"
                 content-type="text/html"
     />
</content-collection>

</makefile>
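
The chaining and param rules from the Remarks can be sketched as follows. Both DSEs are hypothetical: the IDs, the all-zero class IDs, and the param name and value are placeholders, not real components or documented options. Which DSE feeds which is determined by the DSE implementations; the sketch shows only the declaration order and the chain reference.

```xml
<?xml version="1.0"?>
<!DOCTYPE makefile SYSTEM "file:///c:/Rocket/makefile.dtd">
<makefile>
<!-- Declared first so that the next DSE can chain to it. -->
<dse id="first-dse"
   class-id="{00000000-0000-0000-0000-000000000001}">
   <!-- Hypothetical option passed to the DSE as a named value. -->
   <param name="root" value="c:\Project\data"/>
</dse>
<!-- chain is an IDREF to a DSE that has already been declared. -->
<dse id="second-dse"
   class-id="{00000000-0000-0000-0000-000000000002}"
   chain="first-dse"/>

<content-collection id="fisha" title="Fish Almanac"
   filename="c:\Project\content collection\Almanac.nfo">
   <document name="Trout Records"
      location="c:\Project\data\trout.mtf"
      dse="second-dse" content-type="text/html"/>
</content-collection>
</makefile>
```

Both DSEs in a chain like this would normally implement the same interface set (both Unicode or both ANSI), for the reasons given in the Remarks.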

field

Defines a field for which to create an index.

Definition

<!ELEMENT field EMPTY>
<!ATTLIST field
   name          CDATA #REQUIRED
   type          (text|long|double|date|time) "text"
   relevance     (normal|high|higher|highest) "normal"
   picture       CDATA #IMPLIED
   index         (yes|no) "yes"
   exclusive     (yes|no) "no"
   term-list     (yes|no) "no"
   phrase        (yes|no) "no"
   toc-section   (yes|no) "no"
   stop-words    (yes|no) "no"
   proximity     (yes|no) "yes"
   date-2000     (yes|no) "no"
>

Attributes

Attribute Description
name Name of the field to define. Field names can be a maximum of 127 characters and must be unique within a content collection. If you define more than one field with the same name, ccBuild reports errors and all but the first definition are ignored.
type Data type to assign to the field. A field's data type determines how the NXT 4 server indexes the terms to which the field is applied. You can specify text, long, double, date, or time. The default is text.
relevance Adjusts the relevance weight of a field. Allowed values are normal, high, higher, and highest. The default is normal.
picture Picture string that specifies how to render the field's terms. See Picture Strings for a list of picture strings supported by the various language modules.
index Flag indicating whether or not to index terms to which the field is applied. Terms which are indexed can be searched separately from the remainder of the content collection. Fielded terms which are not indexed are not searchable. You can specify yes or no. The default is yes.

You should specify yes if yes is also specified for any of the following attributes: toc-section, stop-words, or date-2000.
exclusive Flag indicating whether a field's terms can be found only when searching the field. If you specify yes, the field's terms can be found when searching the field, but not when searching the general index. The default is no. For Folio 4.x users, this is the same as choosing "Field Only" for the field.
term-list Used in conjunction with a term iterator such as a word-wheel component. When set to yes, a list of the terms in this field is generated. When set to no, no term list is generated for this field.
phrase Specifies that the terms in a field should be indexed as a phrase instead of individual terms. yes indexes terms as a phrase and no indexes the terms individually. no is the default setting.
toc-section Flag indicating whether or not the field creates table of contents structure. You can specify yes or no. The default is no. Fields of this type are normally not needed for HTML and therefore, only used when you want to apply fields to create hierarchy for XML or custom HTML structure. When using toc-section fields, they must be used with an indexsheet to create headings (see np:index for information on including toc-heading in an indexsheet).

If you specify yes, the field's index attribute must also be set to yes. Using toc-section is the preferred method of specifying structure for a table of contents.
stop-words Flag indicating whether or not to use stop words when building the index for the field. You can specify yes or no. The default is no.

Using stop words decreases the size of a content collection by reducing the size of the index used for fast phrase searches. The language module used to build a content collection defines the stop words for the language.

If you specify yes, the field's index attribute must also be set to yes.
proximity Flag indicating whether or not the field is a proximity field. You can specify yes or no. The default is yes. Rather than use a proximity field, set term-list=yes and proximity=no to generate a separate term list for each field, which enables you to perform an efficient field search and still perform a general search.
date-2000 Flag indicating whether or not to allow two-digit years past the year 2000. You can specify yes or no. The default is no.

If you specify yes, two-digit years greater than 50 are treated as though they are in the 1900s. Two-digit years less than 50 are treated as though they are in the 2000s. For example, the date 4/5/96 would be interpreted as April 5, 1996, while the date 4/5/05 would be interpreted as April 5, 2005.

This attribute is ignored if the field's data type is not "date." If you specify yes, the field's index attribute must also be set to yes.

Remarks

To apply a field, you must use the indexsheet element to define rules which specify indexing for the field. You must also specify that a document use the indexsheet.

Example

<field name="Birth" type="date" 
   picture="mm/dd/yy" index="yes"
   term-list="no" toc-section="no" 
   stop-words="no" date-2000="yes"/>
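
As a further sketch of the recommendation given for the proximity attribute, a field intended for term-list lookups might be declared as follows. The field name Author is hypothetical and used only for illustration.

```xml
<!-- Hypothetical field: a separate term list is generated and proximity is turned off -->
<field name="Author" type="text"
   index="yes" term-list="yes"
   proximity="no"/>
```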

indexsheet

Specifies an indexsheet which determines how the indexer should index data.

Definition

<!ELEMENT indexsheet ANY>
<!ATTLIST indexsheet
   id                        ID     #REQUIRED
   src                       CDATA  #IMPLIED
   default-for-content-type  CDATA  #IMPLIED>

Attributes

Attribute Description
id ID used to reference the indexsheet. IDs cannot contain the slash (/) character.
src File name of the indexsheet.
default-for-content-type Specifies the content type to use the indexsheet with. For example, specifying text/xml would use the specified indexsheet for all documents of content type XML.

Remarks

To use an indexsheet for a document, specify the ID of the indexsheet in the document's indexsheet attribute. An indexsheet must be defined before it is used. To specify the default indexsheet to use for documents in a content collection, redefine the default-indexsheet entity in the internal portion of a Makefile's DTD.

See Indexsheet DTD for the format of indexsheets.
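
The entity override described above might look like the following sketch. It assumes that an indexsheet with the ID my-index-rules is defined in the makefile and that the path to the DTD matches your installation; both are illustrative.

```xml
<?xml version="1.0"?>
<!DOCTYPE makefile SYSTEM
"file:///c:/Program Files/Rocket/NXT 4/Builder/makefile.dtd"
[
   <!-- Assumed override: documents that do not name an indexsheet use my-index-rules -->
   <!ENTITY % default-indexsheet '"my-index-rules"'>
]>
```

The nested quotes follow the pattern the DTD itself documents for overriding default-encoding: the outer single quotes delimit the entity value, and the inner double quotes become the attribute default.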

Example

<?xml version="1.0"?>
<!DOCTYPE makefile SYSTEM 
"file:///c:/Program Files/Rocket/NXT 4/Builder/makefile.dtd">
<makefile>
&fsysdse;

<content-collection id="mycontent" title="my content" 
   filename="C:\Project\content collection\mycontent.nfo">

     <field-def name="Creator"/>

     <indexsheet id="my-index-rules"
      src="c:\Project\indexsheet\myindex.htm"/>
     <indexsheet id="index-html"
      src="c:\Project\indexsheet\default-html.htm"
      default-for-content-type="text/html">
        <xsl:stylesheet>

           <xsl:template match='Creator'>
              <np:index field="Creator">
                 <xsl:process-children/>
              </np:index>
           </xsl:template>

           <xsl:template match='Editor'>
              <np:index field="Editor">
                 <xsl:process-children/>
              </np:index>
           </xsl:template>

        </xsl:stylesheet>
     </indexsheet>

     <document location="c:\Project\docs\mydoc1.html"
indexsheet="my-index-rules" 
content-type="text/html"/>

</content-collection>
</makefile>

makefile

Marks the beginning of makefile information. This is the root makefile element.

Definition

<!ELEMENT makefile (dse+, (infobase | content-collection))>
<!ATTLIST makefile version CDATA #FIXED "4.0">

Attributes

Attribute Description
version Specifies the makefile version number. The value is fixed at 4.0. Because the version attribute is defined in the DTD, you should not specify the makefile version in a makefile.

Remarks

A makefile contains one or more dse elements followed by a single infobase or content-collection element.

Examples

<?xml version="1.0"?>
<!DOCTYPE makefile SYSTEM 
"file:///c:/Program Files/Rocket/NXT 4/Builder/makefile.dtd"
[
   <!ENTITY prjdir "c:\Project\Data\Government">
]>
<makefile>
&fsysdse;
<content-collection
     id="gov"
     title="Government: A Sampling of Documents"
     filename="&prjdir;\government.nfo">
     <document location="&prjdir;\titlepage.html"
     name="1"
     content-type="text/html"/>
</content-collection>
</makefile>

metadata

Specifies the metadata file to include in the content collection. Metadata is stored as a document property and describes the document.

Definition

<!ELEMENT metadata EMPTY>
<!ATTLIST metadata
   location	CDATA #IMPLIED
   dse	IDREF %default-dse-id-metadata;>

Attributes

Attribute Description
location Location of the metadata file to include in the content collection. The default location is the directory containing the parent document. The location must conform to the requirements of the DSE. For example, with the File System DSE (FSysDSE), the full path would be used. Another DSE might use an http:// URL path.

dse Document Source Extension file used with the metadata file. The dse attribute allows you to use a separate DSE for extracting metadata. By default, the file system DSE is used. For example,

<document name="doc.pdf" location="doc.pdf"
content-type="application/pdf">
<metadata location="metadata.rdf"/>
</document>

Remarks

If the location of the metadata file is the same as the document being referenced, then the metadata DSE must be different from the document DSE. If both location and DSEs are identical, then NXT 4 will index the document again instead of indexing the metadata file.
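
A minimal sketch of the case described above, in which the metadata shares the document's location and is therefore read through a different DSE. The DSE ID metadata-dse, the class-id placeholder, and the file path are all hypothetical; a real makefile needs the class ID of an installed DSE, and the dse element belongs with the other dse elements ahead of the content collection.

```xml
<!-- Hypothetical second DSE; the class-id is a placeholder, not a real class ID -->
<dse id="metadata-dse" class-id="PLACEHOLDER-METADATA-DSE-CLASS-ID"/>

<document location="c:\Project\docs\mydoc.htm" content-type="text/html">
   <!-- Same location as the document, so a different DSE is named -->
   <metadata location="c:\Project\docs\mydoc.htm" dse="metadata-dse"/>
</document>
```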

To distinguish the separate metadata file, either a distinguishing location must be provided, or the DSE itself must contain instructions for locating the RDF file based on the main document's location. For example, one DSE might contain instructions to look for a file with the same name but a different extension. That is, mydoc.htm might be tied to a metadata file named mydoc.rdf. Another DSE might simply append the extension, so that mydoc.htm is tied to a metadata file named mydoc.htm.rdf.

The default indexsheet for metadata is metadata.xil.

Example

<document location="&prjdir;\Acts-Bills\brady.html" 
   title="Brady Bill" content-type="text/html">
  <metadata location="bradymetadata.rdf"/>
</document>

param

Specifies a parameter argument (named value) to pass to a DSE. Use parameters to customize how a DSE operates.

Definition

<!ELEMENT param EMPTY>
<!ATTLIST param
     name    CDATA #REQUIRED
     value   CDATA #REQUIRED>

Attributes

Attribute Description
name Name of the parameter.
value Value of the parameter.

Remarks

It is possible to specify one set of parameters for a DSE to use when parsing one group of files and another set of parameters for the DSE to use when parsing another group of files. To do so, define two dse elements for the DSE, each with a different ID. Then specify the desired DSE in the dse attribute for the document.

Example

<dse id="my-dse1" 
class-id="023933C0-CFCF-11d1-B139-00C04F932EC0">
     <param name="log-errors" value="yes"/>
     <param name="default-font" value="Arial"/>
</dse>

<dse id="my-dse2" 
class-id="023933C0-CFCF-11d1-B139-00C04F932EC0">
     <param name="log-errors" value="no"/>
     <param name="default-font" value="Times"/>
</dse>
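
Documents might then select one configuration or the other through the dse attribute, as in this sketch. The document locations are hypothetical.

```xml
<!-- Parsed with the my-dse1 parameter set -->
<document location="c:\Project\docs\report1.html"
   dse="my-dse1" content-type="text/html"/>

<!-- Parsed with the my-dse2 parameter set -->
<document location="c:\Project\docs\report2.html"
   dse="my-dse2" content-type="text/html"/>
```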

property

Sets the value of a content collection property.

Definition

<!ELEMENT property EMPTY>
<!ATTLIST property
   name    CDATA #REQUIRED
   value   CDATA #REQUIRED>

Attributes

Attribute Description
name Name of the content collection property.
value Value for the content collection property.

Remarks

The following table shows commonly defined content collection properties:

Name Usage
Author Specifies the author of a content collection. Usually, the department or company producing the content collection goes here. The author property can be any length.
Subject Sets a subject for the content collection. Typically, the subject is only one or two lines which describe the subject (or contents) of the content collection. The subject property can, however, be any length.
Abstract Provides a more detailed summary of the content collection than the subject. The abstract property can be any length.
Comment Provides additional information about the content collection or the person/organization which produced the content collection. Often used for copyright information. The comment property can be any length.

Note: ccBuild does not currently require or check any content collection properties.

Example

<property name="Version" value="1.0"/>
<property name="Creator" value="Rocket Development"/>
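
The commonly defined properties listed in the Remarks could be set in the same way. The values in this sketch are hypothetical.

```xml
<property name="Author" value="Rocket Development"/>
<property name="Subject" value="A sampling of government documents"/>
<property name="Abstract" value="Selected acts and bills with title pages and metadata"/>
<property name="Comment" value="Copyright notice for the content collection"/>
```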

xml

Specifies that the file contains a document conforming to a specified version of XML.

Definition

<?xml version="{version number}" encoding="{encoding type}"?>

Remarks

This must be the first line of an XML file. The xml element informs the XML parser that the file is an XML version 1.0 file. If the version does not match what the parser expects, it will not continue to parse the file. The XML parser used by the NXT products parses only XML version 1.0 files.

The encoding parameter is an optional parameter used to specify the encoding for the makefile. Currently, UTF-8 is the only format supported. If the encoding parameter is not included in the XML element, the encoding type is assumed to be UTF-8. To ensure that the makefile is saved with UTF-8 encoding, use a text editor that is UTF-8 aware.

Note: Notepad is UTF-8 aware, but you must select UTF-8 from the Save As dialog.

Example

<?xml version="1.0"?>
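
With the optional encoding parameter stated explicitly, the first line of a makefile might read as follows. This is a sketch; omitting the parameter gives the same result, because UTF-8 is assumed.

```xml
<?xml version="1.0" encoding="UTF-8"?>
```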