
     _________________________________________________________________
                                      
                                    dtdtree
                                       
   dtdtree outputs the content hierarchy tree (in ASCII) of SGML elements
   defined in a DTD.
   
     _________________________________________________________________
                                      
Usage

   dtdtree is invoked from the command-line as follows:
   
   % dtdtree [options] elementname elementname ...
   
   Any strings after, and not part of, command-line options are treated
   as the elements (elementname) to output trees for. If no elements are
   specified, then the tree(s) for the top-most element(s) defined in the
   DTD are printed.
   
   The following options are available:
   
   -catalog filename
          Use filename as the file for mapping public identifiers and
          external entities to system files. If -catalog is not
          specified, "catalog" is used as the default filename. See
          Resolving External Entities for more information.
          
   -dtd filename
          Use filename as the SGML DTD to parse. Otherwise, read from
          standard input.
          
   -help
          Print a brief usage description. No other action is performed.
          
   -level #
          Set the prune level of the content hierarchy tree to #.
          Defaults to 15.
          
   -treefile filename
          Output element content tree(s) to filename. Otherwise, dtdtree
          prints to standard output.
          
   -verbose
          Output messages to standard error describing what dtdtree is
          doing. This option is mainly for debugging purposes.
          
     _________________________________________________________________
                                      
dtdtree Output

   The tree shows the overall content hierarchy for an element. Content
   hierarchies of descendants are also shown. An element is pruned if it
   already exists at a higher (or equal) level, or if the maximum depth
   has been reached. The string "..." is appended to an element that has
   been pruned because it already exists at a higher (or equal) level.
   The content of the pruned element can be determined by searching for
   the complete tree of the element (i.e., the entry without "...").
   Elements pruned because the maximum depth has been reached do not have
   "..." appended.
   
   Example:
   
     |__section+)
         |_(effect?, ...
         |__title, ...
         |__toc?, ...
         |__epc-fig*,
         |   |_(effect?, ...
         |   |__figure,
         |   |   |_(effect?, ...
         |   |   |__title, ...
         |   |   |__graphic+, ...
         |   |   |__assoc-text?)

   Note
          Pruning must be done to avoid a combinatorial explosion. It is
          common for DTDs to define content hierarchies of infinite
          depth. Even with a predefined maximum depth, the generated tree
          can become very large.
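
   The pruning rules described above can be illustrated with a minimal
   Python sketch. This is not dtdtree's actual implementation: the
   children mapping and the names shallowest_levels and render are
   hypothetical, occurrence indicators and group connectors are ignored,
   and treating the prune level as the deepest level printed is an
   assumption. The sketch expands an element only at its shallowest
   level, appends "..." to repeats, and appends nothing when the depth
   limit cuts an element off.

```python
from collections import deque


def shallowest_levels(children, root):
    """Breadth-first pass: record the shallowest level of each element."""
    level = {root: 1}
    queue = deque([root])
    while queue:
        elem = queue.popleft()
        for child in children.get(elem, []):
            if child not in level:
                level[child] = level[elem] + 1
                queue.append(child)
    return level


def render(children, root, max_depth=15):
    """Return the lines of a pruned content tree (illustrative only)."""
    level = shallowest_levels(children, root)
    expanded = set()
    lines = []

    def walk(elem, depth):
        indent = "    " * (depth - 1)
        kids = children.get(elem, [])
        if not kids:
            # Assumption: an element with no subelements is never marked.
            lines.append(indent + elem)
        elif depth > level[elem] or elem in expanded:
            # Already shown at a higher (or equal) level: prune with "...".
            lines.append(indent + elem + " ...")
        elif depth >= max_depth:
            # Pruned by the depth limit: no "..." is appended.
            lines.append(indent + elem)
        else:
            expanded.add(elem)
            lines.append(indent + elem)
            for kid in kids:
                walk(kid, depth + 1)

    walk(root, 1)
    return lines


# Hypothetical content models, loosely modeled on the example above.
example = {
    "doc": ["section", "figure"],
    "section": ["title", "figure"],
    "figure": ["title", "graphic"],
}
print("\n".join(render(example, "doc")))
```

   Because each element is expanded at most once, the walk terminates
   even when the content models are recursive.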
          
   Since the tree output is static, the inclusion and exclusion sets of
   elements are treated specially. Inclusion and exclusion elements
   inherited from ancestors are not propagated down to determine what
   elements are printed, but special markup is presented at a given
   element if inclusion or exclusion elements from ancestors exist.
   Inclusions and exclusions are not propagated down because of the
   pruning. Since an element may occur in multiple contexts, with
   different ancestral inclusions and exclusions in effect in each, an
   element without "..." may be the only place of reference to see the
   content hierarchy of the element.
   
   Example:
   
    D1
     |  {+} idx needbegin needend newline
     |
     |_(head,
     |   | {A+} idx needbegin needend newline
     |   |  {-} needbegin needend
     |   |
     |   |_(((#PCDATA |
     |   |____((acro |
     |   |       | {A+} idx needbegin needend newline
     |   |       | {A-} needbegin needend
     |   |       |
     |   |       |_(((#PCDATA |
     |   |       |____((super | ...
     |   |       |______sub)))*)) ...

   Ignoring the lines starting with {}'s, one gets the content hierarchy
   of an element as defined by the DTD, without regard to where it may
   occur in the overall structure. The {} lines give additional
   information about the element with respect to its existence within
   a specific context. For example, when an ACRO element occurs within
   D1,HEAD it can contain, along with its normal content, IDX and
   NEWLINE elements due to inclusions from ancestors. However, it cannot
   contain NEEDBEGIN and NEEDEND, regardless of its defined content,
   since one or more ancestors exclude them.
   
   Note
          Exclusions override inclusions. If an element occurs in both an
          inclusion set and an exclusion set, the exclusion takes
          precedence. Therefore, in the above example, NEEDBEGIN and
          NEEDEND are excluded from ACRO.
          
   Explanation of the {} keys:
   
   {+}
          The list of inclusion elements defined by the current element.
          Since this is part of the content model of the element, the
          inclusion subelements are printed as part of the content
          hierarchy of the current element after the base content model.
          Subelements that are inclusions will have {+} appended to the
          subelement entry.
          
   {A+}
          The list of inclusion elements due to ancestors. This is listed
          as a reference to determine the content of an element within a
          given context. None of the ancestral inclusion elements are
          printed as part of the content hierarchy of the element.
          
   {-}
          The list of exclusion elements defined by the current element.
          Since this is part of the content model of the element, any
          subelement in the content model that would be excluded will
          have {-} appended to the subelement listing.
          
   {A-}
          The list of exclusion elements due to ancestors. This is listed
          as a reference to determine the content of an element within a
          given context. None of the ancestral exclusion elements have
          any effect on the printing of the content hierarchy of the
          current element.
          
     _________________________________________________________________
                                      
Resolving External Entities

   The mapping of external entities to system files may be defined via
   the -catalog command-line option. The catalog lets you map public
   identifiers to system identifiers (files) or map entity names to
   system identifiers.
   
   Catalog Syntax
   
   The syntax of a catalog is a subset of SGML catalogs (as defined in
   SGML Open Draft Technical Resolution 9401:1994).
   
   A catalog contains a sequence of the following types of entries:
   
   PUBLIC public_id system_id
          This maps public_id to system_id.
          
   ENTITY name system_id
          This maps a general entity whose name is name to system_id.
          
   ENTITY %name system_id
          This maps a parameter entity whose name is name to system_id.
          
   Syntax Notes
   
     * A system_id string cannot contain any spaces. The system_id is
       treated as the pathname of a file.
     * Any line in a catalog file that does not match one of the entry
       types listed above is ignored.
     * In case of duplicate entries, the first entry defined is used.
       
   Example catalog file:
   
        -- ISO public identifiers --
PUBLIC "ISO 8879-1986//ENTITIES General Technical//EN"            iso-tech.ent
PUBLIC "ISO 8879-1986//ENTITIES Publishing//EN"                   iso-pub.ent
PUBLIC "ISO 8879-1986//ENTITIES Numeric and Special Graphic//EN"  iso-num.ent
PUBLIC "ISO 8879-1986//ENTITIES Greek Letters//EN"                iso-grk1.ent
PUBLIC "ISO 8879-1986//ENTITIES Diacritical Marks//EN"            iso-dia.ent
PUBLIC "ISO 8879-1986//ENTITIES Added Latin 1//EN"                iso-lat1.ent
PUBLIC "ISO 8879-1986//ENTITIES Greek Symbols//EN"                iso-grk3.ent
PUBLIC "ISO 8879-1986//ENTITIES Added Latin 2//EN"                ISOlat2
PUBLIC "ISO 8879-1986//ENTITIES Added Math Symbols: Ordinary//EN" ISOamso

        -- HTML public identifiers and entities --
PUBLIC "-//IETF//DTD HTML//EN"                                    html.dtd
PUBLIC "ISO 8879-1986//ENTITIES Added Latin 1//EN//HTML"          ISOlat1.ent
ENTITY "%html-0"                                                  html-0.dtd
ENTITY "%html-1"                                                  html-1.dtd
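
   The catalog rules above (three entry types, unmatched lines ignored,
   first entry wins) can be sketched in Python as follows. This is an
   illustration, not the parser used by dtdtree: the name parse_catalog
   and the regular expression are hypothetical, and the handling of
   quoting (double-quoted or bare identifiers) and of keyword case
   (uppercase only) is an assumption based on the example catalog.

```python
import re

# One entry per line: keyword, quoted or bare identifier, system_id.
ENTRY = re.compile(
    r'''^\s*(PUBLIC|ENTITY)\s+
        (?:"([^"]*)"|([^"\s]\S*))   # double-quoted or bare identifier
        \s+(\S+)\s*$                # system_id: no spaces allowed
    ''',
    re.VERBOSE,
)


def parse_catalog(text):
    """Return (public, general, parameter) mappings to system ids."""
    public, general, parameter = {}, {}, {}
    for line in text.splitlines():
        match = ENTRY.match(line)
        if not match:
            continue  # lines that are not catalog entries are ignored
        keyword, quoted, bare, system_id = match.groups()
        key = quoted if quoted is not None else bare
        if keyword == "PUBLIC":
            table = public
        elif key.startswith("%"):
            table, key = parameter, key[1:]  # parameter entity
        else:
            table = general
        table.setdefault(key, system_id)  # first entry defined is used
    return public, general, parameter
```

   Comment lines such as "-- ISO public identifiers --" are skipped
   simply because they do not match any entry type.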

   Environment Variables
   
   The following envariables (i.e., environment variables) are supported:
   
   P_SGML_PATH
          This is a colon (semi-colon for MSDOS users) separated list of
          paths for finding catalog files or system identifiers. For
          example, if a system identifier is not an absolute pathname,
          then the paths listed in P_SGML_PATH are used to find the file.
          
   SGML_CATALOG_FILES
          This envariable is a colon (semi-colon for MSDOS users)
          separated list of catalog files to read. If a file in the list
          is not an absolute path, then the file is searched for in the
          paths listed in P_SGML_PATH and SGML_SEARCH_PATH.
          
   SGML_SEARCH_PATH
          This is a colon (semi-colon for MSDOS users) separated list of
          paths for finding catalog files or system identifiers. This
          envariable serves the same function as P_SGML_PATH. If both are
          defined, paths listed in P_SGML_PATH are searched before any
          paths in SGML_SEARCH_PATH.
          
   P_SGML_PATH is supported for compatibility with earlier versions.
   SGML_CATALOG_FILES and SGML_SEARCH_PATH are supported for
   compatibility with James Clark's nsgmls(1).
   
   Note
          When searching for a file via P_SGML_PATH and/or
          SGML_SEARCH_PATH, if the file is not found in any of the paths,
          then the current working directory is searched.
          
   Note
          The file specified by -catalog is read before any files
          specified by SGML_CATALOG_FILES.
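
   The search order described in this section can be summarized with a
   Python sketch. The function names search_dirs, resolve, and
   catalog_order are hypothetical and this is not the code used by
   dtdtree. Python's os.pathsep is assumed to stand in for the
   separator, since it is a colon on Unix and a semicolon on Windows.

```python
import os


def search_dirs(environ=os.environ):
    """Directories to search: P_SGML_PATH, SGML_SEARCH_PATH, then cwd."""
    dirs = []
    for var in ("P_SGML_PATH", "SGML_SEARCH_PATH"):
        value = environ.get(var, "")
        dirs.extend(path for path in value.split(os.pathsep) if path)
    dirs.append(os.curdir)  # fallback: current working directory
    return dirs


def resolve(system_id, environ=os.environ):
    """Return the first existing file for system_id, or None."""
    if os.path.isabs(system_id):
        return system_id if os.path.isfile(system_id) else None
    for directory in search_dirs(environ):
        candidate = os.path.join(directory, system_id)
        if os.path.isfile(candidate):
            return candidate
    return None


def catalog_order(cli_catalog="catalog", environ=os.environ):
    """Catalogs in read order: the -catalog file, then SGML_CATALOG_FILES."""
    files = [cli_catalog]
    value = environ.get("SGML_CATALOG_FILES", "")
    files.extend(name for name in value.split(os.pathsep) if name)
    return files
```

   Since the first entry defined is the one used, reading the -catalog
   file first means its entries take precedence over duplicates in the
   files named by SGML_CATALOG_FILES.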
          
     _________________________________________________________________
                                      
Availability

   This program is part of the perlSGML package; see
   <URL:http://www.oac.uci.edu/indiv/ehood/perlSGML.html>
   
     _________________________________________________________________
                                      
Author

   
    Earl Hood <ehood@medusa.acs.uci.edu>
    
     _________________________________________________________________
