Introduction to XML

XML (eXtensible Markup Language) is a configurable format for storing and organizing any kind of information. Although the name suggests otherwise, XML is not itself a markup language; it is a set of rules for building markup languages.

So, what is a markup language? Markup is information added to a document in order to enhance its meaning in certain ways. For example, when reading a web page, the various segments can be differentiated by content, design, and possibly by placement. A markup language defines how that markup annotates the different segments of the data in a document.

Anatomy of an XML Document

The fundamental building blocks of an XML document and all XML-derived languages are elements, attributes, entities, and processing instructions.

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE document SYSTEM "">
          <meta name="author" content="John Doe" />
          New Page
          <p>Hello World</p>
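
Because the sample above is abbreviated, the following is a minimal, self-contained sketch that shows all four building blocks in one well-formed document. The element names (document, title, content), the "site" entity, and the stylesheet file name are illustrative assumptions, not a required OU Campus structure.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Processing instruction: an instruction aimed at the application, not part of the data -->
<?xml-stylesheet type="text/xsl" href="example.xsl"?>
<!-- Entity declaration (hypothetical "site" entity) in an internal DTD subset -->
<!DOCTYPE document [
  <!ENTITY site "Example University">
]>
<!-- Element: "document" is the root element -->
<document>
  <!-- Attributes: name and content describe the meta element -->
  <meta name="author" content="John Doe" />
  <title>New Page</title>
  <content>
    <!-- Entity reference: expands to the declared text -->
    <p>Hello World from &site;</p>
  </content>
</document>
```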

XML Tree Structure

Document Prolog

In its simplest form, a document prolog identifies the document as an XML document and declares the XML version and the character encoding being used. The XML declaration is optional in XML 1.0, but when it is present it must be the very first thing in the document. For purposes of OU Campus, always begin any XML document with the following XML declaration:

<?xml version="1.0" encoding="UTF-8"?>

Declaration Attributes

version: The version of XML that the document conforms to. Use 1.0; XML 1.1 exists but is rarely used.

encoding: This specifies the encoding of the document. When working with XML in OU Campus, please use UTF-8.

Document Type Definition (DTD)

A DTD (document type definition) is a set of declarations used to model an XML document, a practice also referred to as document modeling. A DTD allows developers to declare the set of allowed elements and attributes and the structure of the final XML document. The document type declaration (the <!DOCTYPE> line) names the root element and points to the DTD:

<!DOCTYPE document SYSTEM 
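
To illustrate what a DTD can declare, here is a small sketch that places the rules in an internal subset instead of an external file. The note, to, and body elements and the priority attribute are hypothetical and are not part of any OU Campus DTD.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE note [
  <!-- note must contain exactly one to element followed by one body element -->
  <!ELEMENT note (to, body)>
  <!-- to and body may contain only text -->
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
  <!-- note may carry an optional priority attribute -->
  <!ATTLIST note priority CDATA #IMPLIED>
]>
<note priority="high">
  <to>John Doe</to>
  <body>Hello World</body>
</note>
```

A validating parser would reject this document if, for example, the body element were missing, because the content model for note requires it.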


Elements

Elements are the building blocks of an XML document. An element is a node that consists of a tag set, along with any required attributes, attribute values, and content.

     <p>Hello World</p>
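
As a rough sketch of the forms an element can take, the fragment below shows an element with text content, nested elements, and an empty element. All element names here are hypothetical.

```xml
<!-- Hypothetical element names, for illustration only -->
<article>
  <!-- Start tag, content, end tag -->
  <title>Hello World</title>
  <!-- Elements can nest: p contains text and an em element -->
  <p>Elements can hold text and <em>other elements</em>.</p>
  <!-- Empty element: no content, closed with a trailing slash -->
  <br />
</article>
```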


Attributes

An attribute is part of an element, often used for data that identifies or describes the element. An attribute can be used to give an element a unique identity so that it can be easily located, or it can describe a property of the element.

<meta name="author" content="John Doe" />


Namespaces

XML namespaces allow developers to mix vocabularies in an XML document. They provide a simple method of qualifying the element and attribute names used in XML documents by associating them with a namespace identified by a URI. The following is an example of a namespace declaration.

Namespace Declaration Syntax

xmlns:prefix="namespaceURI"


Namespace Declaration Example

xmlns:xsl="http://www.w3.org/1999/XSL/Transform"


Namespaces are useful for preventing name clashes, and they also allow the processor to treat elements from different namespaces differently. An example of such processing is the transformation language XSLT, which relies on its namespace to recognize XSL-specific instructions.
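
As a minimal sketch of how XSLT relies on its namespace, the stylesheet below binds the xsl prefix to the XSLT namespace URI. The source element names matched here (document and title) are assumptions made for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- xsl:-prefixed elements are instructions for the XSLT processor -->
  <xsl:template match="/document">
    <!-- Elements with no xsl: prefix are copied to the output as literal results -->
    <html>
      <body>
        <h1><xsl:value-of select="title" /></h1>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
```

The processor decides what to execute by the namespace URI, not by the prefix itself, so any prefix bound to the same URI would behave identically.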


Entities

Entities are placeholders for content that are declared once and can be used many times almost anywhere in the document. They don't add anything semantically to the markup. Rather, they are a convenience that makes XML easier to write, maintain, and read.

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE document SYSTEM "" [
  <!ENTITY fullname "John Doe">
  <!ENTITY address "123 ABC Road, USA">
  <!ENTITY phone "<number>123-456-7890</number>">
]>
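
To show how declared entities are then referenced, here is a self-contained sketch. It assumes an internal subset only (no external DTD), and the contact element is hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE document [
  <!ENTITY fullname "John Doe">
  <!ENTITY phone "<number>123-456-7890</number>">
]>
<document>
  <!-- &fullname; expands to the text John Doe -->
  <contact>&fullname;</contact>
  <!-- &phone; expands to a number element, because entity text may contain markup -->
  <contact>&phone;</contact>
</document>
```

Changing the declaration once updates every place the entity is referenced.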

Additional XML Definitions

Tag: A piece of text that describes the semantics or structure of a unit of data.

Node: Smallest unit of structure in an XML document.

Parent node: A node in a tree data structure that links to one or more child nodes.

Child node: A node in a tree data structure that is linked to by a parent node.

Root node: The first opened and last closed element in an XML document. This node contains all other nodes as child nodes.
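
The terms above can be mapped onto a small tree. This is an illustrative sketch only; the catalog, book, title, and author names are hypothetical.

```xml
<!-- catalog is the root node: first opened, last closed -->
<catalog>
  <!-- book is a child of catalog and the parent of title and author -->
  <book id="b1">
    <!-- <title> is the start tag, </title> the end tag -->
    <title>XML Basics</title>
    <author>John Doe</author>
  </book>
</catalog>
```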

Publish Control File (PCF)

pcf-stylesheet declaration: The processing instruction at the top of an XML document used to instruct OU Campus which XSL file to use for a transformation.

Document Type Definition (DTD): The formal definition of the elements, structures, and rules for marking up an XML document. A DTD can be stored at the beginning of a document or externally in a separate file.

Character Entity: A reference to a named entity that has been predefined or explicitly declared in a DTD.
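
As a brief sketch of the difference between predefined and declared entities: XML itself predefines only five named entities, and numeric character references work without any DTD, while other named entities (such as the HTML nbsp entity) must be declared in a DTD before use. The element names below are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<examples>
  <!-- Predefined in XML itself; no DTD required -->
  <predefined>&lt; &gt; &amp; &apos; &quot;</predefined>
  <!-- Numeric character reference for a non-breaking space; no DTD required -->
  <numeric>A&#160;B</numeric>
</examples>
```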

PCF Stylesheet Declaration

  1. The pcf-stylesheet declaration instructs OU Campus which XSL file to use to transform the PCF file. The pcf-stylesheet declaration follows the XML declaration. For example:

    <?pcf-stylesheet path="/_resources/xsl/default.xsl" title="web" extension="html"?>
    Declaration Attributes

    path: Defines the location of the XSL file. Must be a root-relative path from the root of the staging server. This attribute is required.

    title: Defines the label on the OU Campus multi-output preview tab.

    extension: Specifies the file extension to be used for the file output by the transformation. This attribute is required.
  2. By default, the WYSIWYG Editor in OU Campus uses an XHTML-compliant schema; HTML5 can be enabled for a site instead. When the XHTML-compliant WYSIWYG Editor is being used within OU Campus, it is necessary to ensure that most, if not all, HTML character entities can be used. This is done with a DTD file, as described in the example below. Link to the DTD in the headers of both the PCF and the XSL, after all other declarations at the start of the file. The doctype declaration is added to the XSL so that character entities can also be used within the XSL file.
<?xml version="1.0" encoding="utf-8"?>
<?pcf-stylesheet path="/_resources/xsl/default.xsl" extension="html" title="Web"?>
<?pcf-stylesheet path="/_resources/xsl/page2pdf.xsl" extension="pdf" title="PDF" alternate="yes"?>
<!DOCTYPE document SYSTEM "
      <meta name="pagetype" content="3" />
      <!-- -->
      <title>Gallena Contacts</title>
      <meta name="keywords" content="Contacts" />
      <meta name="description" content="College Contact Information." />
      <meta name="show-pdf-link" content="false" />
<!-- / -->
      <!-- com.omniupdate.div label="content" group="Everyone" button="787" break="break" -->
      <!-- com.omniupdate.editor csspath="/_resources/ou/editor/controller.css" cssmenu="/_resources/ou/editor/controller.txt" width="776" -->
      <p><a name="top"> </a><br /><strong>Gallena University, California</strong></p>
<p>2314 Running Springs Rd<br />Gallena, CA 91255-4901 USA<br />Tel: (818) 555-9401</p>
      <p><a href="">OmniUpdate</a></p>
      <p><a href="/about/test9.html">test</a></p>
<!-- /com.omniupdate.div -->
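
The example above links to an external DTD. As a sketch of the same idea in an XSL file, the stylesheet below declares a single entity in an internal subset instead. This is an assumption made to keep the example self-contained; a linked DTD would typically declare the full set of HTML character entities.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Internal subset declaring one HTML-style entity (an assumption; a linked DTD would declare many) -->
<!DOCTYPE xsl:stylesheet [
  <!ENTITY nbsp "&#160;">
]>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <!-- &nbsp; now parses, because the entity is declared above -->
    <p>Gallena&nbsp;University</p>
  </xsl:template>
</xsl:stylesheet>
```

Without the doctype declaration, an XML parser would reject the nbsp reference as an undeclared entity before the transformation ever runs.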



