Visual Quickstart Guide XML (Second Edition)
Kevin Howard Goldberg
US $34.99, CDN $37.99
XML (eXtensible Markup Language) has become a standard medium for moving data in efficient and predictable ways. Derived from an older markup language, SGML (Standard Generalized Markup Language), XML is structured, but not as highly as SGML. Structure is what it's all about. The very loosely structured HTML (HyperText Markup Language) is also derived from SGML. XML markup even looks amazingly like HTML, except that, as the author explains, HTML defines how information will look, while XML defines how the information is structured.
Here is a portion of an XML file:
If you analyze the code sample above, you should be able to see that three siblings are defined. Each sibling's information is contained, or wrapped, between the <sibling> and </sibling> tags, and the information on all three siblings is wrapped between the <my_siblings> and </my_siblings> tags. Taking this one step further, you can think of these sibling “chunks” as parts of a database: the content between the <sibling> and </sibling> tags would be defined as a record, while the <name></name>, <gender></gender>, and <age></age> tags define fields within a record. This content can then be transformed into a different format and reused in many different ways.
In XML, as in HTML, you can also see that each chunk of information is tagged with an opening and closing tag.
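To make the record-and-field analogy concrete, here is a minimal sketch in Python using the standard library's ElementTree parser. The element names come from the description above; the names, genders, and ages are invented for illustration and are not taken from the book. It reads each <sibling> as a record, treats its child tags as fields, and then re-emits the same data as JSON to show one possible "different format."

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical data: the tag names follow the review's description,
# but every value below is made up for this example.
SIBLINGS_XML = """
<my_siblings>
  <sibling>
    <name>Alice</name>
    <gender>female</gender>
    <age>34</age>
  </sibling>
  <sibling>
    <name>Ben</name>
    <gender>male</gender>
    <age>31</age>
  </sibling>
  <sibling>
    <name>Cara</name>
    <gender>female</gender>
    <age>27</age>
  </sibling>
</my_siblings>
"""


def siblings_to_records(xml_text):
    """Return one dict per <sibling>, keyed by the child tag names."""
    root = ET.fromstring(xml_text.strip())
    records = []
    for sibling in root.findall("sibling"):
        # Each child element (<name>, <gender>, <age>) acts as a field.
        record = {field.tag: (field.text or "").strip() for field in sibling}
        records.append(record)
    return records


records = siblings_to_records(SIBLINGS_XML)

# Reuse the same content in a different format.
print(json.dumps(records, indent=2))
```

Note that every value comes back as text; XML by itself does not say that an age is a number, which is part of what DTDs and XML Schema are for.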
Why structure? With a standardized method of defining chunks of information, the information can be easily shared, re-used, translated, and manipulated in countless ways, yet retain its integrity and its overall definition. XML, being a plain-text format, is universal: it can be shared among multiple platforms without modification, save for some minor file system issues that are beyond the scope of this review.
One major use of XML is in content management systems (CMS), where XML content can be searched, selectively extracted, and assembled into larger documents that can then be transformed into final deliverables, such as a PDF file, Help files, or a set of HTML files. Sure, you could probably do this with plain text, but without the underlying required structure, it would be a lot harder, and would probably require a large amount of post-assembly editing before even attempting to create the deliverables.
Another popular use of XML is in Adobe Flash animations and programs. By building the text content in external files formatted as XML that the Flash file points to, dealing with localized (translated) content becomes an extremely simple matter. Often, just changing the filename links in the main Flash file can transform an English-language document into a Spanish, French, or other-language document in moments. And by maintaining the master files in a database-driven content management system, you can translate content that might be used in multiple documents, or even multiple times in the same document, once and only once, which, I can assure you, results in huge cost savings.
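The same idea, sketched loosely in Python rather than Flash: the program's logic stays fixed, and only the XML string table it points to changes. The `<strings>` layout, the `id` attribute, and the language keys are hypothetical choices for this example, and the XML is held in memory here where a real project would keep one file per language.

```python
import xml.etree.ElementTree as ET

# Hypothetical string tables, one per language. In practice each would
# live in its own external XML file that the program points to.
STRING_FILES = {
    "en": (
        "<strings>"
        "<string id='greeting'>Welcome</string>"
        "<string id='exit'>Quit</string>"
        "</strings>"
    ),
    "es": (
        "<strings>"
        "<string id='greeting'>Bienvenido</string>"
        "<string id='exit'>Salir</string>"
        "</strings>"
    ),
}


def load_strings(language):
    """Parse the string table for one language into an id -> text dict."""
    root = ET.fromstring(STRING_FILES[language])
    return {s.get("id"): s.text for s in root.findall("string")}


def render_menu(strings):
    """The 'application': identical code regardless of language."""
    return f"{strings['greeting']} | {strings['exit']}"


# Switching languages means pointing at a different XML source, nothing more.
print(render_menu(load_strings("en")))
print(render_menu(load_strings("es")))
```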
Kevin Howard Goldberg has put together an excellent primer on the multifaceted alphabet soup that is XML. He updated the first edition of this book, originally authored by Elizabeth Castro, with Ms. Castro's assistance, adding information on some of the newer applications of XML: XSL-FO, XSLT 2.0, XPath 2.0, and XQuery 1.0.
The book is divided into the following sections, each of which builds on the previous one:
* XML – The basics of writing XML code, and the underlying structure.
* XSL – How to transform XML into multiple deliverables (HTML, XML, etc.). It also covers XSLT, XPath, and XSL-FO. XSL-FO is most widely used to transform XML files into PDF deliverables.
* DTD – Document Type Definition. DTDs are the underlying glue that holds the XML together. How? By defining and detailing the rules under which valid XML files function. Separate sections discuss entities and notations, as well as validation (ensuring the XML file follows the rules defined in the DTD).
* XML Schema – Developed to overcome some of the shortcomings of DTDs, the XML Schema is a more powerful document, designed to give the author even more control over how the XML content is structured and defined.
* XML Namespaces – A method of combining XML from multiple sources, even if there are identical element names. XML Namespaces provides a method to merge the content while retaining the definitions of each independent element (I hope I got that right...).
* Recent W3C (World Wide Web Consortium) Recommendations – Discusses some of the newest enhancements to the XML specifications, including XSLT 2.0, XPath 2.0, and XQuery 1.0.
* XML in Practice – Applications of XML, especially in Web 2.0 usage. Topics and examples include Ajax, RSS, SOAP, WSDL, KML, ODF, OOXML, eBooks, ePub, and more. I told you it was an alphabet soup!
* Appendices – Discusses XML editors and tools. Full character set and entity tables.
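The XML Namespaces item above is the easiest one to show in a few lines. The following is a minimal sketch in Python's standard ElementTree module, not an example from the book: two vocabularies both define a <title> element, and the namespace each one belongs to keeps them distinct. The namespace URIs, prefixes, and values are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document mixing two vocabularies that both use <title>.
# The URIs are invented identifiers; they only need to be unique.
CATALOG_XML = """
<catalog xmlns:bk="http://example.com/ns/book"
         xmlns:p="http://example.com/ns/person">
  <bk:title>A Field Guide to Markup</bk:title>
  <p:title>Dr.</p:title>
</catalog>
"""

# Prefix-to-URI map used for lookups. The prefixes here are local to this
# script; the URIs are what actually identify each vocabulary.
NS = {
    "bk": "http://example.com/ns/book",
    "p": "http://example.com/ns/person",
}

root = ET.fromstring(CATALOG_XML.strip())

book_title = root.find("bk:title", NS)
person_title = root.find("p:title", NS)

print(book_title.text)
print(person_title.text)

# Internally, each tag carries its full namespace, so the two <title>
# elements never collide even though their local names are identical.
print(book_title.tag)
print(person_title.tag)
```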
This book is a great introduction to XML. It's loaded with sample code and examples to get you started. It's well illustrated and makes great use of color. Peachpit Press also offers a companion website with sample code, updates, etc.
XML is not for the faint of heart. So many pieces make up the XML specification that it can be confusing, even with this Visual Quickstart Guide. The only thing I didn't see in this book, most likely because of its inherent specialization, is the DITA (Darwin Information Typing Architecture) specification. DITA is a highly specialized, topic-based, XML-based markup language, mainly used for creating instructional materials (user documentation, educational texts, and so on). I recommend this book highly.