<?xml version="1.0"?>
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
        "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd" [
        
<!-- Translated to XML; the original SGML version uses
     "-//Davenport//DTD DocBook V3.0//EN" -->

<!--ArborText, Inc., 1988-1995, v.4001-->

<!-- It would be nice if we had these images to test the @image
handling, but...
<!NOTATION drw SYSTEM "DRW">
<!ENTITY markups SYSTEM "markups.eps" ndata eps>
<!ENTITY generic SYSTEM "generic.eps" ndata eps>
<!ENTITY sgmlexa SYSTEM "sgmlexa.drw" ndata drw>
<!ENTITY atilogo SYSTEM "atilogo.gif" ndata gif>
<!ENTITY gloss SYSTEM "gloss.sgml">
-->

<!ENTITY www "World Wide Web">
]>

<book>
<bookinfo>
<bookbiblio>
<title>Getting started with SGML</title>
<subtitle>A guide to the Standard Generalized Markup Language and its role
in information management</subtitle>
<authorgroup><corpauthor>ArborText, Inc.</corpauthor>
<othercredit><authorblurb>
<para>Ann Arbor, Michigan</para></authorblurb></othercredit>
</authorgroup>
<pubdate>18 October 1995</pubdate>
<abstract>
<para>As the world standard for textual information, SGML has gained prominence
in many industries. Hundreds of companies have adopted SGML and thousands
are considering it. If your organization produces a high volume of technical
or business information of significant value, and if that information lends
itself to a regular structure, then SGML probably offers significant benefits
to you and your organization.</para>
<para>This white paper examines the factors that led to the development of
SGML, the basic knowledge you need to understand SGML, and the reasons for
adopting it. It also lists the industries where SGML use is already widespread
and resources for more information and training.</para>
</abstract>
</bookbiblio>

<!-- Added for docbook2X -->
<titleabbrev role="texinfo-file">at1</titleabbrev>

</bookinfo>
<chapter>
<title>The Business Challenge</title>

<para>The explosive success of the Internet is an obvious example of an information
revolution that's well under way. Companies that realize the tremendous cost
and value of information management are reengineering their processes for
creating, distributing and accessing information. The opportunities in each
of these areas can be enormous:</para>
<sect1>
<title>Information Creation</title>
<para>By some estimates, 20% of our GNP is spent on generating new information.
And over 90% of that information is in documents, not databases. When was
the last time you took a close look at how much your organization invests
in the creation of information?</para>
<para>In conventional word processing and desktop publishing systems, your
authors spend up to 30% of their time searching for information, and another
30% of their time applying styles and squeezing paragraphs so that each printed
page looks nice. Plus, nearly every 18 months, technology changes completely,
so you're continually paying for data conversions as software and hardware
become obsolete.</para>
</sect1>
<sect1>
<title>Information Distribution</title>
<para>A few years ago, you could provide your information on paper alone.
Then CD-ROM technology became low-cost and widespread, so you've either already
faced or soon expect to face the massive re-publishing effort needed to make
all your information available electronically. And in just the last year,
the &www; has thundered out of nowhere, creating yet another new format for
your information.</para>
<para>At the same time, your customers want your information tuned to their
needs: they don't want to wade through huge technical manuals that describe
all system variations and all possible uses for all possible users&mdash;they
want information tailored to their own needs, so they can get to it and use
it fast.</para>
</sect1>
<sect1>
<title>Information Access</title>
<para>In the U.S. alone, businesses produce 92 billion documents every year&mdash;and
that number is skyrocketing. Can your people easily access the information
you create in your own company? How about the information you receive from
other companies?</para>
<para>An organization's future can depend on how effectively it identifies,
manages, and uses its information. The latest thinking in information management
takes an enterprise-wide approach to the creation, distribution and maintenance
of information. Organizations that have taken this broad view have realized
enormous improvements in the cost, accuracy, timeliness, accessibility, and
variety of the information they create and use.</para>
<para>As part of this movement, companies in some industries are joining together
to develop standards for exchanging information with each other and with their
customers. Companies that keep up-to-date with these standards will be able
to do business more efficiently and compete more effectively in global markets.
This white paper describes how one such standard, the Standard Generalized
Markup Language (SGML), works as part of an overall information management
strategy.</para>
</sect1>
</chapter>
<chapter>
<title>Unleashing the Power of Information</title>
<para>Traditional documents and the methods for handling them suffer many
limitations. The printed document is often the result of a sophisticated information
process. Once it's printed, however, the document represents a dead-end in
the information flow because it has no link to the electronic information
base.</para>
<para>Raw data may start in the form of technical specifications or engineering
data. This information must be gathered, sorted, organized, and then manually
assembled into hard copy documents. With each step in the documentation process,
the information may be altered by mistake. The further removed the result
is from the original source of information, the greater the risk of erroneous
data. The problem can become so large that a majority of documents go out
of date as soon as they are printed.</para>
<para>A systematic approach to information management treats text and graphics
as part of an organization's electronic information base. This gives everyone
access to the information. By taking a broad view of the information creation
and delivery process, you can see documents as any composition of information&mdash;the
output from a database query, a printed document, an on-line diagnostic manual,
an illustrated parts catalog, a collection of video clips, or a home page
on the Internet's &www;.</para>
<para>SGML allows you to manage information as data objects instead of characters
on a page. Rather than a stream of indistinguishable bits and bytes, the data
is <quote>chunked</quote> into identifiable discrete elements of information.
This technology enables you to store and reuse the information efficiently,
share it with many users, and maintain it in a database.</para>
</chapter>
<chapter>
<title>Getting to Know SGML</title>
<para>This white paper provides an introduction to existing SGML technology,
its advantages and benefits, as well as an overview of some related standards
and how they fit into an overall approach to managing information. We also
define some of the terminology and acronyms to familiarize you with the language
associated with SGML. While SGML is a fairly recent technology, the use of
<quote>markup</quote> in computer-based documents has existed for a while. Let's
first look at earlier markup schemes that led to SGML.</para>
<sect1>
<title>What is markup?</title>
<para>Markup is everything in a document that is not content. Markup originally
referred to the handwritten notations that a designer would add to typewritten
text; these notations contained instructions to a typesetter about how to
lay out the copy and what typeface to use. This kind of markup is known as
<firstterm>procedural markup</firstterm>.</para>
<sect2>
<title>Procedural markup</title>
<!--<graphic entityref="markups"></graphic>-->
<para>Most electronic publishing systems today, such as word processing software
and desktop publishing software, use procedural markup. Procedural markup
is typically unique to a specific software package such as <trademark>Microsoft</trademark>
Word and <trademark>QuarkXPress</trademark>. Each has its own
set of markup codes that make sense only to itself. This markup usually takes
the form of formatting codes that are mixed in with the text of the document.
Procedural markup codes apply to a single way of presenting the information,
such as a printed page, and provide no capability to define appearance for
other media, such as CD-ROM and the Internet.</para>
</sect2>
<sect2>
<title>Descriptive markup</title>
<!--<graphic entityref="generic"></graphic>-->
<para>Descriptive markup, also known as <quote>generic markup,</quote> describes
the purpose of the text in a document, rather than its physical appearance
on the page. The basic concept of descriptive markup is that the content of
a document should remain separate from its style. Descriptive markup is based
on the <firstterm>structure</firstterm> or <firstterm>content</firstterm>
of a document and identifies elements accordingly&mdash;such as a chapter,
a section, or a table of contents&mdash;using notations that describe what
the element is, not how it appears. By separating presentation information
(<foreignphrase>i.e.</foreignphrase>, style) from the structure and content,
descriptive markup allows for multiple presentations of the same information.
For example, you can publish on paper, on-line, on CD-ROM and on the &www;
(Internet), all from the same set of source files with descriptive markup.
</para>
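<para>As a rough illustration of the difference, the first fragment below
uses invented formatting codes (they do not belong to any real product), and
the second tags the same text descriptively. The element names
<sgmltag class="element">chapter</sgmltag>, <sgmltag class="element">title</sgmltag>,
and <sgmltag class="element">par</sgmltag> are assumptions chosen for this
sketch:<programlisting>.font Helvetica 18pt bold
.center
Installing the Pump
.font Times 10pt
.indent 2em
Switch off the power before you begin.

&lt;chapter>
&lt;title>Installing the Pump&lt;/title>
&lt;par>Switch off the power before you begin.&lt;/par>
&lt;/chapter></programlisting></para>
<para>The first fragment can produce only one printed appearance. The second
records what each piece of text is, so a separate style sheet can decide how
a title looks on paper, on CD-ROM, or on the &www;.</para>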
</sect2>
<sect2>
<title>Drawbacks of procedural markup</title>
<para>Producers of technical documentation increasingly prefer descriptive
markup over procedural markup. Procedural markup is tedious and expensive;
authors can spend 15% to 50% of their time on the appearance of each page.
If style guidelines change, or if you need to present the same information
in a different format, massive re-formatting is usually required. When a company
changes software or hardware systems, enormous data translation tasks arise,
often resulting in errors. Because procedural markup is tied to one final
printed product, you cannot change formats easily. Interchanging documents
based on procedural markup works easily only if both parties have the same
hardware and software system.</para>
</sect2>
</sect1>
<sect1>
<title>What is SGML?</title>
<para>The Standard Generalized Markup Language, or SGML, is an international
standard (ISO 8879) published in 1986. SGML prescribes a standard format for
embedding descriptive markup within a document. More importantly, and crucial
to its real value and power, SGML also specifies a standard method for describing
the structure of a document.</para>
<para>In other words, SGML allows you to set up structural rules for each
type of document you produce. SGML ensures that each element, which is labeled
with descriptive markup such as <quote>chapter,</quote> <quote>title,</quote>
and <quote>paragraph,</quote> fits in the logical, predictable structure of
your document type.</para>
<para>SGML supports an infinite variety of document structures. Users typically
create a different document structure for each category of information they
produce: information bulletins, technical manuals, parts catalogs, design
specifications, reports, letters and memos.</para>
<para>SGML allows you to create documents that are independent of any specific
hardware or software. Since SGML documents conform to an international standard,
they are portable. You can exchange them seamlessly with users who have different
systems.</para>
<para>The world of photography demonstrates the power of standards: SGML is
to documents as standardized film speed is to cameras. Today you can purchase
a roll of film marked <quote>ISO 100,</quote> put the film in your camera,
set the camera's film speed to 100 (which many cameras do automatically),
and you're ready to shoot. You don't have to worry that the brand of film
is not compatible with your particular make of camera. The film and camera
manufacturing industries&mdash;through the International Organization for
Standardization (ISO) and American Standards Association (ASA)&mdash;have
agreed on standards for film speeds. Many industries plan to use SGML so that
their documents work as easily on different computers as film works in different
cameras.</para>
</sect1>
<sect1>
<title>How does SGML work?</title>
<para>To understand SGML we must look at the three layers of a typical document:
 structure, content, and style. SGML separates these three aspects, but deals
mainly with the relationship between structure and content.</para>
<sect2>
<title>Structure</title>
<para>At the heart of an SGML application is a file called the
<firstterm>DTD</firstterm>, or <firstterm>Document Type Definition</firstterm>. The DTD
sets up the structure of a document, much as a database schema describes
the types of information it handles. A DTD provides a framework for the types
of elements (such as chapters and chapter headings, sections, and topics)
that constitute a document.</para>
<para>A DTD also specifies rules for the relationships between elements; for
example, <quote>a chapter heading must be the first element after the start
of a chapter</quote>; or <quote>each list must contain at least two items.</quote>
These rules, which the DTD defines, help ensure that documents have
a consistent, logical structure. A DTD accompanies an SGML document wherever
it goes. A <quote>document instance</quote> is a document whose content has
been tagged in conformance with a particular DTD.</para>
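<para>The two rules quoted above might be written in a DTD roughly as follows.
This is a minimal sketch rather than a production DTD, and the element names
(<sgmltag class="element">chapter</sgmltag>, <sgmltag class="element">heading</sgmltag>,
<sgmltag class="element">par</sgmltag>, <sgmltag class="element">list</sgmltag>,
<sgmltag class="element">item</sgmltag>) are assumptions made for the example.
The declarations appear inside the document type declaration, followed by a
document instance that conforms to them:<programlisting>&lt;!DOCTYPE chapter [
&lt;!-- A heading must come first; paragraphs and lists may follow. -->
&lt;!ELEMENT chapter - - (heading, (par | list)*)>
&lt;!ELEMENT heading - - (#PCDATA)>
&lt;!ELEMENT par     - - (#PCDATA)>
&lt;!-- One item followed by one or more items: at least two in all. -->
&lt;!ELEMENT list    - - (item, item+)>
&lt;!ELEMENT item    - - (#PCDATA)>
]>
&lt;chapter>
&lt;heading>Installing the Pump&lt;/heading>
&lt;par>Gather these tools first:&lt;/par>
&lt;list>
&lt;item>Wrench&lt;/item>
&lt;item>Gasket&lt;/item>
&lt;/list>
&lt;/chapter></programlisting></para>
<para>With these declarations, a validating parser would report an error for
a list that holds a single item or for a chapter that begins with a paragraph
instead of a heading.</para>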
</sect2>
<sect2>
<title>Content</title>
<para>Content is the information itself: content includes titles, paragraphs,
lists, tables, graphics, and audio. The method for identifying the content's
position within the DTD structure is called <quote>tagging.</quote> Creating
an SGML document involves inserting tags around content. These tags mark the
beginning and end of each part of the structure and identify the type of contents
they enclose. In the following example, <sgmltag class="starttag">par</sgmltag>
indicates the start of a paragraph, and <sgmltag class="endtag">par</sgmltag>
indicates the end of the paragraph:<programlisting>&lt;par>Paragraph content.&lt;/par></programlisting></para>
<para>You can nest elements within other elements; in the following example,
the paragraph (<sgmltag class="element">par</sgmltag>) is an element within
the topic (<sgmltag class="element">topic</sgmltag>):<programlisting>&lt;topic>&lt;par>Content.&lt;/par>&lt;/topic></programlisting></para>
<para>The structure of a particular document is revealed by the nesting of
tags:<programlisting>&lt;section>&lt;subhead>Content&lt;/subhead>
&lt;par>Content is the information 
itself.&lt;/par>&lt;/section></programlisting></para>
<para>Fortunately, human beings usually don't have to deal with manually typing
in tags and checking to make sure all the tags are there. Some SGML-based
authoring software programs make it easy to enter tags by clicking on pull-down
menus that guide you by listing only those tags that are valid at the cursor's
current position in the document. These programs rely on a software module
called a <quote>parser</quote> that verifies that the document follows the
rules of the DTD. (The parser also verifies that the DTD itself is structurally
correct.) The following illustration shows how an SGML-based authoring program
would display the tags for the previous ASCII example:</para>
<!--<graphic entityref="sgmlexa"></graphic>-->
</sect2>
<sect2>
<title>Style</title>
<para>SGML itself has nothing to do with setting standards for style, so most
systems still rely on proprietary methods of setting style. It is the style
that determines the final appearance of the document information. Some efforts
are being made to develop standards-based style sheets; two of these efforts
have resulted in the mature OS standard and the still unreleased DSSSL standard. 
</para>
<para>The U.S. Department of Defense CALS initiative developed its own style
standard, known as the Output Specification (OS). The OS is in the form of
a particular DTD that allows the user to create a Formatting Output Specification
Instance, or FOSI (usually pronounced <quote>fossy</quote>), that is well
suited to both print and electronic output.</para>
<para>A FOSI is essentially a powerful style sheet that specifies the formatting
for each tag in a DTD. With the FOSI, the document, and the DTD, you have
a complete interchange package for printed documents that maintains its format
and style as it is interchanged among systems. In early 1995, an ISO committee
released a draft of the Document Style Semantics and Specification Language
(DSSSL), which is on its way to becoming an international standard for presenting
SGML-based documents. Official release is expected later this year.</para>
<para>The complete DSSSL standard covers a broad scope, so subsets are being
developed to handle varying levels of functionality. A subset whose functionality
is approximately equivalent to FOSIs is expected, and work on tools to convert
FOSIs to and from DSSSL is under way.</para>
<para>Many military contracts currently require FOSIs, and many non-defense
firms have also embraced the Department of Defense's OS standard because it's
a mature and supported standard. It is expected that both DSSSL and FOSIs
will remain important standards for the foreseeable future.</para>
</sect2>
</sect1>
</chapter>
<chapter>
<title>What Does SGML Give Me?</title>
<para>SGML has become mainstream technology that you can use with confidence.
Your adoption of SGML will allow your organization to gain the maximum value
from your generation and use of information:</para>
<sect1>
<title>Increased productivity</title>
<para>A structured approach to documents helps writers organize the information
as they are creating it, and keeps content separate from style. This separation
enables you to set up centrally-controlled style guidelines, so authors can
focus on generating the content rather than adjusting each document's appearance.
That change alone can as much as double your authors' productivity.</para>
<para>You can also improve efficiency by keeping a central information base
so that authors don't have to recreate the same information in order to use
it. This also ensures that the most current information is made available
to all. And, a single update to the information base ensures that all documents
created from that information base will automatically be updated.</para>
</sect1>
<sect1>
<title>Reusability</title>
<para>A printed document is just one of many possible products from SGML-based
information. For example, a technical publications group can use tags to identify
a procedure as a sequence of steps. In this case, you identify the beginning
and end of the procedure, and each step within the procedure. The same procedure
can now appear in several forms: maintenance and operational manuals, on-line
technical manuals, training guides, etc. More importantly, since the tags
are machine-readable, the computer can manage and maintain the many different
uses of the same single source of information, so no re-keying is required
to produce this information in new document formats.</para>
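<para>A sketch of such a tagged procedure follows. The element names
(<sgmltag class="element">procedure</sgmltag>, <sgmltag class="element">title</sgmltag>,
<sgmltag class="element">step</sgmltag>) are assumptions for illustration and
are not taken from any particular DTD:<programlisting>&lt;procedure>
&lt;title>Replacing the Filter&lt;/title>
&lt;step>Switch off the power.&lt;/step>
&lt;step>Remove the access panel.&lt;/step>
&lt;step>Slide out the old filter and insert the new one.&lt;/step>
&lt;/procedure></programlisting></para>
<para>Because each step is marked explicitly, one application can number the
steps for a printed manual while another presents them one at a time on-line,
both from this single source.</para>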
</sect1>
<sect1>
<title>Information longevity</title>
<para>SGML is a simple, standard file format with an indefinite shelf life;
you'll never again have to convert your documents when a hardware or software
system becomes obsolete. Once you set up your SGML information base, the information
will always be available, because it carries everything needed to create a
document. So even when your hardware or software becomes obsolete, your information
remains usable, portable, and available.</para>
</sect1>
<sect1>
<title>Improved data integrity</title>
<para>Defining a document's structure helps ensure that the right information
is in the right place, which improves the organization of your information.
Because SGML eliminates the need to convert data as it passes between
systems, you also avoid the risk of losing information when data is filtered
from one format to another.</para>
</sect1>
<sect1>
<title>Better data control</title>
<para>With SGML, you can define and manipulate information elements at any
level of detail. A tagged element can have attributes that provide characteristics
or properties about the element. This attribute information is useful for
managing and manipulating the information elements. For example, an ID (identifier)
attribute can uniquely identify a single paragraph, a whole section, a legal
notice, an illustration, a task, or any element that you may want to use repeatedly.
The following example shows a paragraph with an ID attribute:<programlisting>&lt;para id="p431">Content.&lt;/para></programlisting></para>
<para>By simply referencing the ID, you can include this information in
your document in as many places as you need. This eliminates re-typing and
ensures that the information is identical in every instance.</para>
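<para>For a parser to treat an attribute as a unique identifier, the DTD must
declare it with the ID type; an attribute declared with the IDREF type can
then point at it. The sketch below assumes a hypothetical
<sgmltag class="element">xref</sgmltag> element. How a reference is resolved
(as a link, or by copying in the referenced content) is up to the application
rather than SGML itself:<programlisting>&lt;!ELEMENT para - - (#PCDATA | xref)*>
&lt;!ATTLIST para id      ID    #IMPLIED>
&lt;!ELEMENT xref - O EMPTY>
&lt;!ATTLIST xref linkend IDREF #REQUIRED>

&lt;para id="caution1">Disconnect the power cord first.&lt;/para>
&lt;para>Before servicing, see &lt;xref linkend="caution1">.&lt;/para></programlisting></para>
<para>A validating parser reports an error if two elements share the same ID
value, or if an IDREF names an ID that does not exist in the document.</para>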
<para>Plus, the IDs you set are machine readable so that the computer can
find and link related information. This allows you to use IDs for a variety
of information management controls. These controls can help you:<itemizedlist>
<listitem><para>Manage the security of information by allowing only certain
people to view or change information with selected IDs.</para>
</listitem>
<listitem><para>Automate the information flow&mdash;for example, updating
the data in one place can trigger the update of the same information in other
places within the same document and in other documents.</para>
</listitem>
</itemizedlist></para>
</sect1>
<sect1>
<title>Shareability</title>
<para>Since SGML is aware of the individual components of a document, you
can easily build entirely new documents out of existing information. This
capability enables users to share the latest information without duplicating
it. An example of this might be a standard legal notice or copyright statement
appearing in documents throughout a company. The legal department maintains
this module of information, updating it on occasion. A single tag in your
document can pull in the current legal notice each time you access or output
your document, eliminating needless duplication of information and ensuring
the accuracy of your information.</para>
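<para>One common way to do this in SGML is an external entity: the notice
lives in its own file, and each document declares the entity once and refers
to it by name. The sketch below is illustrative only; the file names
<filename>manual.dtd</filename> and <filename>legal.sgml</filename> and the
element names are assumptions:<programlisting>&lt;!DOCTYPE manual SYSTEM "manual.dtd" [
&lt;!-- The legal department maintains legal.sgml -->
&lt;!ENTITY legal SYSTEM "legal.sgml">
]>
&lt;manual>
&lt;title>Pump Service Manual&lt;/title>
&amp;legal;
&lt;par>Switch off the power before you begin.&lt;/par>
&lt;/manual></programlisting></para>
<para>Each time the document is parsed, the reference
<literal>&amp;legal;</literal> is replaced by the current contents of the
file, so every document that uses the entity picks up the latest notice.</para>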
</sect1>
<sect1>
<title>Portability of information</title>
<para>Today, information networks proliferate where different computers, operating
systems, and applications must share information. In this sort of network,
portability becomes the key in making sure all who need it can access the
information. Thanks to the hardware and software independence of SGML, you
can easily exchange SGML documents among different environments.</para>
</sect1>
<sect1>
<title>Flexibility beyond traditional publishing</title>
<para>The information you create today may be used a year from now in ways
you haven't yet anticipated. Just last year, the need to publish on the &www;
did not even exist! The spectacular growth of the Web serves as dramatic proof
that we simply cannot anticipate all the purposes for which our information
may eventually be used.</para>
<para>SGML permits you to use your information for applications beyond traditional
publishing. For example:<itemizedlist>
<listitem><para>&www; pages</para>
</listitem>
<listitem><para>information databases</para>
</listitem>
<listitem><para>diagnostic/expert systems</para>
</listitem>
<listitem><para>electronic mail</para>
</listitem>
<listitem><para>hypermedia and hypertext documents</para>
</listitem>
<listitem><para>database publishing</para>
</listitem>
<listitem><para>CD-ROM publishing</para>
</listitem>
<listitem><para>Interactive Electronic Technical Manuals (IETMs)</para>
</listitem>
<listitem><para>electronic review</para>
</listitem>
</itemizedlist></para>
</sect1>
</chapter>
<chapter>
<title>Is SGML Right for Me?</title>
<para>In the life cycle of a product, the cost of gathering, producing, and
maintaining the necessary technical information can exceed the initial hardware
cost. For many industries, technical information is part of a deliverable
product, or a product in itself. Any industry whose product line is heavily
dependent on information can benefit from SGML.</para>
<para>In evaluating how SGML can help your organization, you may wish to consider
some strategic business issues to help in your information management plan.
A strategic approach should prompt you to examine your current information
needs and your current document management methodology. Some questions to
consider include:<itemizedlist>
<listitem><para>Does your information require a long life-span? (For example,
technical information related to airplanes often needs to be maintained for
over 20 years.)</para>
</listitem>
<listitem><para>Do you need to exchange documents across mixed hardware environments?
</para>
</listitem>
<listitem><para>Do you need to produce large documents with a disciplined
structure?</para>
</listitem>
<listitem><para>Do your documents contain information common to other documents
within a department, across corporate divisions, or even across separate organizations?
</para>
</listitem>
<listitem><para>Do you have information that's used for different purposes?
(For example, a part number may appear in a maintenance manual as well as
a parts inventory database.)</para>
</listitem>
<listitem><para>Does your information change frequently and get used often?
</para>
</listitem>
<listitem><para>Do you produce information that needs to comply with industry
or company guidelines?</para>
</listitem>
</itemizedlist></para>
<para>By examining your requirements, you can evaluate how SGML fits into
your information management strategy. Standardizing on SGML doesn't mean you
need to use it for all documents; SGML is most useful for documents with a
definable structure. Since SGML handles documents as collections of distinguishable
data elements, it is useful to think in terms of modules of information, rather
than complete printed documents.</para>
<para>SGML is most useful as a tool in an integrated information management
strategy. A company's high-level management should make this strategic choice
and plan its implementation. There will be initial implementation
costs in moving to SGML. But the payback comes from benefits that accrue over
time and enhance your investment in information. Any organization that exchanges
information between systems, applications, departments, and companies will
realize these benefits.</para>
</chapter>
<chapter>
<title>What Is a Good SGML System?</title>
<para>By design, SGML applications are meant to be customized. Just as there's
no out-of-box database application that can serve all the needs of an organization,
there is no one-size-fits-all SGML application. Since each organization's
information requirements are different, there are many DTDs. More organizations
are also looking at industry-wide information needs and developing standards
for handling that information.</para>
<para>A number of products on the market handle SGML to some degree. But not
all products handle all the features of the SGML standard. The sections that
follow describe some basic requirements.</para>
<sect1>
<title>Provides real-time interactive parsing</title>
<para>An invaluable feature in an SGML system is real-time, interactive SGML
validation. This feature allows the software to provide context-sensitive
editing assistance based on the cursor's current position in the document.
For example, if the cursor is immediately after the beginning tag for a section,
and all sections must have a section heading, the software allows you to insert
only a section heading tag. Because the software checks each tag as the author
inserts it, the author creates a valid SGML document the first time.</para>
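<para>To make this concrete, suppose the DTD contains the declaration shown
below (the element names are assumptions for this sketch). Immediately after
the <sgmltag class="starttag">section</sgmltag> tag, the only element the
content model permits is <sgmltag class="element">subhead</sgmltag>, so that
is the only tag an interactive editor would offer:<programlisting>&lt;!ELEMENT section - - (subhead, par+)>

&lt;!-- Valid: the subhead comes first -->
&lt;section>&lt;subhead>Safety&lt;/subhead>
&lt;par>Wear eye protection.&lt;/par>&lt;/section>

&lt;!-- Invalid: par appears where subhead is required -->
&lt;section>&lt;par>Wear eye protection.&lt;/par>&lt;/section></programlisting></para>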
<para>By contrast, systems that use batch parsing allow authors to insert
tags and text without checking each action against the DTD. In this approach,
authors create documents in one format, then filter parts of the document
into SGML, and then run the SGML through a validating parser. When the parser
finds errors, the author must correct the original document, then filter and
parse the changes again. The author must repeat this cycle until the entire
document parses successfully. This approach adds steps to the publishing process
that add no value. Time saved by authoring in a familiar format is lost in
the filtering and validating process. A system that creates native SGML information
eliminates the costly, time-consuming, and often error-prone process of retrofitting
documents into valid SGML.</para>
</sect1>
<sect1>
<title>Uses real SGML</title>
<para>If your authoring software merely produces SGML as output, then your
information is still tied to a proprietary format, and still at the mercy
of software and hardware obsolescence. A publishing system that uses SGML
as its native file format allows your information to remain accessible and
usable regardless of hardware and software changes. If you need your information
to remain accessible as you grow into new systems and new technologies, then
using a native SGML file format provides a distinct advantage over a system
that filters the data into SGML. Here's an acid test to identify a real SGML
system: can the software accept any SGML document, display that document,
and then save that document, leaving it unchanged?</para>
</sect1>
<sect1>
<title>Supports any DTD</title>
<para>To be fully usable, a good SGML product allows you to create a variety
of new document types in addition to accepting existing DTDs used in some
industries. This feature is sometimes called the ability to handle
<firstterm>arbitrary</firstterm> or user-defined DTDs. With arbitrary DTDs you are free
to create any document type.</para>
</sect1>
<sect1>
<title>Supports SGML features</title>
<para>The developers of SGML built into the standard a number of features
that facilitate automated publishing and document reuse. A fully-featured
SGML publishing package should support this functionality. Some of the basic
features to look for include:<itemizedlist>
<listitem><para><firstterm>Marked sections.</firstterm> Marked sections let
you create multiple versions from a single master document using regions of
conditional text that only appear in specified versions. For example, you
might want to build a single source document that describes two variations
of your product. You simply write the source document with marked sections
for the areas that differ. The system can then identify these areas and produce
two different versions of your information from the same source file.</para>
</listitem>
<listitem><para><firstterm>External file entities.</firstterm> A file entity
is simply a pointer to a separate document file. You can use file entities
to break a large document into subdocuments. You can also use a file entity
to reference frequently repeated boilerplate information such as an electrical
caution.</para>
</listitem>
<listitem><para><firstterm>Graphic entities.</firstterm> A graphic entity
is a pointer to a separate graphic file.</para>
</listitem>
<listitem><para><firstterm>Text entities.</firstterm> A text entity is a short
name that stands for a common phrase repeated throughout a document. This allows
you to reference the entity instead of re-keying the phrase each time you need
to use it.</para>
</listitem>
</itemizedlist></para>
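<para>The declarations and references below sketch how these four features
might look together in one document. All of the names are assumptions made
for the example: the entity names, the file names, and the
<sgmltag class="element">manual</sgmltag>, <sgmltag class="element">par</sgmltag>,
and <sgmltag class="element">graphic</sgmltag> elements, which are taken to
be declared in a hypothetical <filename>manual.dtd</filename> (with
<sgmltag class="element">graphic</sgmltag> as an empty element whose
<sgmltag class="attribute">entityref</sgmltag> attribute has the ENTITY type):
<programlisting>&lt;!DOCTYPE manual SYSTEM "manual.dtd" [
&lt;!-- Marked-section switches: swap the keywords to build the other version -->
&lt;!ENTITY % standard "IGNORE">
&lt;!ENTITY % deluxe   "INCLUDE">
&lt;!-- External file entity: boilerplate kept in its own file -->
&lt;!ENTITY caution SYSTEM "caution.sgml">
&lt;!-- Graphic entity: a pointer to a non-SGML file in a declared notation -->
&lt;!NOTATION eps SYSTEM "EPS">
&lt;!ENTITY wiring SYSTEM "wiring.eps" NDATA eps>
&lt;!-- Text entity: a phrase typed once and referenced by name -->
&lt;!ENTITY prod "Model 400 Pump">
]>
&lt;manual>
&lt;par>The &amp;prod; ships with one filter.&lt;/par>
&lt;![ %standard; [
&lt;par>Order spare filters separately.&lt;/par>
]]&gt;
&lt;![ %deluxe; [
&lt;par>The deluxe &amp;prod; also includes a spare filter.&lt;/par>
]]&gt;
&amp;caution;
&lt;graphic entityref="wiring">
&lt;/manual></programlisting></para>
<para>With the switches set as shown, the parser keeps the deluxe paragraph
and skips the standard one; reversing the two keywords produces the other
version from the same source file.</para>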
</sect1>
</chapter>
<chapter>
<title>Who Uses SGML Now?</title>
<para>Early in its history, the primary adopters of SGML were defense contractors.
In the last two years, however, the trickle of commercial users has turned
into a torrent. Many leading industrial groups recognize the benefits SGML
offers and have adopted it for information management and exchange among their
members, and between members and their vendors and customers.</para>
<para>Several industries have developed standards for information exchange:
</para>
<variablelist>
<varlistentry><term>AAP</term>
<listitem>
<para>The Association of American Publishers developed the American National
Standard for Electronic Manuscript Preparation and Markup, a general-purpose
book DTD for publishers, authors, and editors.</para>
</listitem>
</varlistentry>
<varlistentry><term>ATA (airlines)</term>
<listitem>
<para>The Air Transport Association, a consortium representing the commercial
airline industry, developed several DTDs under the ATA-100 specification.
The ATA's European counterpart, AECMA, is also adopting standards based on
SGML.</para>
</listitem>
</varlistentry>
<varlistentry><term>ATA (trucking)</term>
<listitem>
<para>The Maintenance Council of the American Trucking Associations has initiated
a task force with the mission of <quote>Establishing the Standard for Electronic
Service Information.</quote> This task force represents large truck manufacturers
and fleet operators interested in standardizing the interchange of service
information. It is developing the T2008 DTD, modeled after the SAE's
J2008 DTD for automobiles and light trucks. The first release of the standard
is expected in 1996.</para>
</listitem>
</varlistentry>
<varlistentry><term>DocBook</term>
<listitem>
<para>Founded by ten major producers and consumers of technical documentation
for computer systems, the Davenport Group has developed the DocBook DTD for
exchanging and delivering computer documentation. Founding members included
Novell, O'Reilly &amp; Associates, Fujitsu OSSI, Hewlett-Packard, Digital
Equipment Corporation, SCO, Hal Computer Systems, Hitachi Computer Products,
SunSoft and Unisys.</para>
</listitem>
</varlistentry>
<varlistentry><term>DoD</term>
<listitem>
<para>The U.S. Department of Defense created the Continuous Acquisition and
Life-Cycle Support (CALS) initiative (recently renamed from Computer-aided
Acquisition and Logistic Support). The next section describes CALS in more
detail.</para>
</listitem>
</varlistentry>
<varlistentry><term>Pinnacles</term>
<listitem>
<para>Led by Intel, National Semiconductor, Texas Instruments, Philips, and
Hitachi, the Pinnacles Group is developing the Pinnacles Component Information
Standard (PCIS) to allow reusability of component data by semiconductor customers
and vendors. This data can include descriptions, specifications, physical
diagrams, code fragments, behavior models, and other text, tables, graphics,
and technical data.</para>
</listitem>
</varlistentry>
<varlistentry><term>SAE</term>
<listitem>
<para>The Society of Automotive Engineers is developing the J2008 DTD for
electronic interchange of service and diagnostic information. The J2008 Task
Force is part of the Vehicle Electronic/Electrical Systems Committee, whose
mission is to increase customer satisfaction and lower product life cycle
costs by recommending standards that promote more effective diagnosis of vehicle
systems. The DTD is expected to be released for approval as a Technical Draft
Standard in 1995. After three years, it will be voted upon again to determine
if it should become a Recommended Practice.</para>
</listitem>
</varlistentry>
<varlistentry><term>TCIF</term>
<listitem>
<para>The Telecommunications Industry Forum is an international association
of carriers and major vendors of telecommunications products and services.
The TCIF initiative is focused on the re-use of technical information across
multiple applications and different environments.</para>
</listitem>
</varlistentry>
</variablelist>
<para>Many SGML applications are in commercial use. Other industries moving
to SGML include pharmaceuticals, publishing, and manufacturing.</para>
<para>Overseas, SGML is gaining wide acceptance. Airbus, a European consortium
of aircraft manufacturers, has adopted SGML. Telecommunications,
aerospace, manufacturing, and other commercial and military interests throughout
Europe are also using SGML.</para>
</chapter>
<chapter>
<title>What Is CALS?</title>
<para>CALS stands for Continuous Acquisition and Life-Cycle Support (recently
renamed from Computer-aided Acquisition and Logistic Support). It is a large-scale,
long-term information management project initiated by the U.S. Department
of Defense (DoD). Since the DoD receives goods and services from a wide range
of suppliers, contractors and subcontractors, it constantly handles massive
quantities of technical information. Today's weapon systems are technologically
complex and can have a life span of 20 years or more. As a result, the amount
of technical data needed to support and maintain these systems is overwhelming.
</para>
<para>The CALS standards that apply to maintaining technical information include:<itemizedlist>
<listitem><para>MIL-STD-1840: Automated Interchange of Technical Information.
This is the umbrella standard specifying overall guidelines for electronic
data storage and exchange of CALS documents on magnetic tape.</para>
</listitem>
<listitem><para>MIL-M-28001: SGML (Standard Generalized Markup Language) for
exchanging text.</para>
</listitem>
<listitem><para>MIL-D-28000: IGES (Initial Graphics Exchange Specification),
an object-oriented format for technical drawings.</para>
</listitem>
<listitem><para>MIL-R-28002: CCITT (International Telegraph and Telephone
Consultative Committee) Group 4 for raster images.</para>
</listitem>
<listitem><para>MIL-D-28003: CGM (Computer Graphics Metafile) for object-oriented
graphics.</para>
</listitem>
</itemizedlist></para>
</chapter>
<chapter>
<title>Resources</title>
<para>Here are a few resources for more information on SGML.</para>
<sect1>
<title>Conferences, tutorials, and training</title>
<para>The Graphic Communications Association (GCA) was instrumental in the
development of SGML. The GCA provides conferences, tutorials, newsletters,
and publication sales for both members and non-members.<literallayout>Graphic Communications Association
100 Daingerfield Road
Alexandria, Virginia 22314&ndash;2804 USA

+1 703.519.8160</literallayout></para>
<para>SGML Open is a non-profit, international consortium of providers of
SGML products and services dedicated to accelerating the further adoption,
application, and implementation of SGML.<literallayout>SGML Open
910 Beaver Grade Road, #3008
Coraopolis, Pennsylvania 15108 USA

+1 412.264.4258</literallayout></para>
<para>ArborText also offers a range of introductory to advanced level SGML
training courses, including DTD and FOSI training. For further information
on ArborText's training services, schedules, and course descriptions, please
contact ATI's Training Team at +1 313.996.3566.</para>
<bridgehead>Books on SGML</bridgehead>
<para><citation>SGML: An Author's Guide to the Standard Generalized Markup
Language</citation>, Martin Bryan, Addison-Wesley, 1988, ISBN 0&ndash;201&ndash;17537&ndash;5
</para>
<para><citation>The SGML Handbook</citation>, Charles Goldfarb, Oxford University
Press, 1990, ISBN 0&ndash;19&ndash;863737&ndash;9</para>
<para><citation>Practical SGML</citation>, Eric van Herwijnen, Kluwer Academic
Publishers, 1994, ISBN 0&ndash;7923&ndash;9434&ndash;8</para>
</sect1>
</chapter>
<glossary>
<title>Glossary</title>
<glossentry><glossterm>ASCII</glossterm>
<glossdef>
<para>(American Standard Code for Information Interchange) This standard character
encoding scheme is used extensively in data transmission.</para>
</glossdef>
</glossentry>
<glossentry><glossterm>ANSI</glossterm>
<glossdef>
<para>(American National Standards Institute) This group is the U.S. member
organization of the ISO, the International Organization for Standardization.
</para>
</glossdef>
</glossentry>
<glossentry><glossterm>attribute</glossterm>
<glossdef>
<para>An attribute provides more information about an element such as classification
level, unique reference identifiers, or formatting information.</para>
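<para>As a minimal sketch, an attribute might be declared in the DTD and then
used on a tag as follows. The <literal>para</literal> element and the
<literal>id</literal> and <literal>security</literal> attribute names are
hypothetical and chosen only for illustration.</para>
<programlisting>&lt;!-- In the DTD: an optional unique identifier and a classification level --&gt;
&lt;!ATTLIST para
          id        ID                       #IMPLIED
          security  (public | confidential)  "public"&gt;

&lt;!-- In the document: attribute values appear in the start tag --&gt;
&lt;para id="intro" security="confidential"&gt;This paragraph is restricted.&lt;/para&gt;</programlisting>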
</glossdef>
</glossentry>
<glossentry><glossterm>CCITT Group 4</glossterm>
<glossdef>
<para>(International Telegraph and Telephone Consultative Committee) This
CALS standard for raster graphics incorporates tiling, which divides a large
image into smaller tiles. You can exchange graphic files in CCITT/4 format
in a compressed state so they take up much less file space.</para>
</glossdef>
</glossentry>
<glossentry><glossterm>CITIS</glossterm>
<glossdef>
<para>(Contractor Integrated Technical Information Service) As part of CALS
Phase II, CITIS is a draft functional specification for services. DoD acquisition
managers designed CITIS as a plan to gain access to product-related digital
technical information.</para>
</glossdef>
</glossentry>
<glossentry><glossterm>CGM</glossterm>
<glossdef>
<para>(Computer Graphics Metafile) CGM is one of the CALS standard formats
for representing 2&ndash;D technical illustrations. CGM is an object-oriented
graphic format.</para>
</glossdef>
</glossentry>
<glossentry><glossterm>DSSSL</glossterm>
<glossdef>
<para>(Document Style Semantics and Specification Language) This draft international
standard (DIS 10179) applies to the specification of processing information
for SGML documents. DSSSL is expected to become an international standard.
</para>
</glossdef>
</glossentry>
<glossentry><glossterm>DTD</glossterm>
<glossdef>
<para>(Document Type Definition) A DTD is the formal definition of the elements,
structures, and rules for marking up a given type of SGML document. You can
store a DTD at the beginning of a document or externally in a separate file.
</para>
</glossdef>
</glossentry>
<glossentry><glossterm>EDI</glossterm>
<glossdef>
<para>(Electronic Data Interchange) This is a set of computer interchange
standards for business documents such as invoices, bills, and purchase orders.
</para>
</glossdef>
</glossentry>
<glossentry><glossterm>element</glossterm>
<glossdef>
<para>An element is a piece of data within a document, such as a paragraph
or a chapter, that may contain either text or other subelements.</para>
</glossdef>
</glossentry>
<glossentry><glossterm>element declaration</glossterm>
<glossdef>
<para>An element declaration is a statement in the DTD that names an element
and declares which other elements it may contain and in what order they may appear.
</para>
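<para>For example, a small set of element declarations might look like the
following sketch. The element names are hypothetical. The two hyphens after
each name indicate that both the start tag and the end tag are required.</para>
<programlisting>&lt;!-- A chapter contains one title followed by one or more paragraphs --&gt;
&lt;!ELEMENT chapter - - (title, para+)&gt;
&lt;!-- Titles and paragraphs contain only text --&gt;
&lt;!ELEMENT title   - - (#PCDATA)&gt;
&lt;!ELEMENT para    - - (#PCDATA)&gt;</programlisting>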
</glossdef>
</glossentry>
<glossentry><glossterm>entity</glossterm>
<glossdef>
<para>An entity is a self-contained piece of data that can be referenced as
a unit. You can refer to an entity by a symbolic name in the DTD or the document.
An entity can be a string of characters, a symbol character (unavailable on
a standard keyboard), a separate text file, or a separate graphic file.</para>
</glossdef>
</glossentry>
<glossentry><glossterm>entity declaration</glossterm>
<glossdef>
<para>An entity declaration is a statement in the DTD or document that assigns
an SGML name to an entity so you can reference it.</para>
</glossdef>
</glossentry>
<glossentry><glossterm>FOSI</glossterm>
<glossdef>
<para>(Formatting Output Specification Instance) A FOSI is used for formatting
SGML documents for printing and other outputs. It is a separate file that
contains formatting information for each element in a document.</para>
</glossdef>
</glossentry>
<glossentry><glossterm>HTML</glossterm>
<glossdef>
<para>(HyperText Markup Language) This is the format of files published on
the &www;. HTML is an application of SGML; to author in HTML using SGML-based
authoring software, you simply need the HTML DTD.</para>
</glossdef>
</glossentry>
<glossentry><glossterm>IGES</glossterm>
<glossdef>
<para>(Initial Graphics Exchange Specification) The IGES standard for engineering,
product design, and manufacturing drawings is one of the CALS standard graphics
formats.</para>
</glossdef>
</glossentry>
<glossentry><glossterm>Internet</glossterm>
<glossdef>
<para>The Internet is a worldwide communications network originally developed
by the U.S. Department of Defense as a distributed system with no single point
of failure. Commercial use of the Internet has exploded since the
development of easy-to-use software for accessing it.</para>
</glossdef>
</glossentry>
<glossentry><glossterm>ISO</glossterm>
<glossdef>
<para>(International Organization for Standardization) The ISO is an industry-supported
organization that establishes worldwide standards for everything from data
interchange formats to film speed specifications.</para>
</glossdef>
</glossentry>
<glossentry><glossterm>markup</glossterm>
<glossdef>
<para>Markup is anything added to the content of the document that describes
the text.</para>
</glossdef>
</glossentry>
<glossentry><glossterm>parser</glossterm>
<glossdef>
<para>A parser is a specialized software program that recognizes SGML markup
in a document. A parser that reads a DTD and checks and reports on markup
errors is a validating SGML parser. A parser can be built into an SGML editor
to prevent incorrect tagging and to check whether a document contains all
the required elements.</para>
</glossdef>
</glossentry>
<glossentry><glossterm>PDES/STEP</glossterm>
<glossdef>
<para>(Product Data Exchange Standard/Standard for the Exchange of Product
Model Data) PDES/STEP are standards under development for communicating a
complete product model with sufficient information content that advanced CAD/CAM
applications can interpret. PDES is under development as a national standard
and STEP is under development as its international counterpart.</para>
</glossdef>
</glossentry>
<glossentry><glossterm>tag</glossterm>
<glossdef>
<para>In the world of SGML, a tag is a marker embedded in a document that
indicates the purpose or function of the element. Each element has a beginning
tag and an end tag.</para>
</glossdef>
</glossentry>
<glossentry><glossterm>&www;</glossterm>
<glossdef>
<para>Often called WWW or the Web, this term refers to information
available on the Internet that can be easily accessed with software known
as a <quote>browser.</quote> Organizations publish their information on
the Web in a format known as HTML; this information is usually referred to
as their <quote>home page</quote> or <quote>web site</quote>.</para>
</glossdef>
</glossentry>
</glossary>
</book>