XML Data Management: Native XML and XML-Enabled Database Systems

Editorial Reviews
Book Description
The past few years have seen a dramatic increase in the popularity and adoption of XML, the Extensible Markup Language. This explosive growth is driven by its ability to provide a standardized, extensible means of including semantic information within documents describing semi-structured data. This makes it possible to address the shortcomings of existing markup languages such as HTML and support data exchange in e-business environments.

Consider, for instance, the simple HTML document in Listing P.1. The data contained in the document is intertwined with information about its presentation. In fact, the tags describe only how the data is to be formatted. There is no semantic information that the data represents a person's name and address. Consequently, an interpreter cannot make any sound judgments about the semantics, as the tags could as well have enclosed information about a car and its parts. Systems such as WIRE (Aggarwal et al. 1998) can interpret the information by using search templates based on the structure of HTML files and the importance of information enclosed in tags defining headings and so forth. However, such interpretation lacks soundness, and its accuracy is context dependent.

Listing P.1 An HTML Document with Data about a Person
[HTML markup not preserved; rendered text only]
Person Information
Name: John Doe
Address: 10 Church Street, Lancaster LAX 2YZ, UK

Dynamic Web pages, where the data resides in a backend database and is served using predefined templates, reduce the coupling between the data and its representation. However, the semantics of the data can still be confusing when exchanging information in an e-business environment. A particular item could be represented using different names (in the simplest case) in two systems in a business-to-business transaction. This enforces adherence to complex, often proprietary, document standards. XML provides inherent support for addressing the above problems, as the data in an XML document is self-describing.
However, the increasing adoption of XML has also raised new challenges. One of the key issues is the management of large collections of XML documents. There is a need for tools and techniques for effective storage, retrieval, and manipulation of XML data. The aim of this book is to discuss the state of the art in such tools and techniques. This preface introduces the basics of XML and some related technologies before moving on to an overview of issues relating to XML data management and approaches addressing these issues. Only an overview of XML and related technologies is provided because several other sources cover these concepts in depth.

P.1 What Is XML?

XML is a W3C standard for document markup. It makes it possible to define custom tags describing the data enclosed by them. An example XML document containing data about a person is shown in Listing P.2. Note that tags in XML can have attributes. However, for simplicity, they have not been used in this example.

Listing P.2 An XML Document with Data about a Person
[XML markup not preserved; element content only: Doe, John, 10 Church Street, Lancaster, LAX 2YZ, UK]

Unlike the HTML document in Listing P.1, the document in Listing P.2 contains only the data about the person and no representational information. The data and its meaning can be read from the document, and the document can be formatted in a range of fashions as desired. One standard approach is to use XSL, the eXtensible Stylesheet Language.

The flexible nature of XML makes it an ideal basis for defining arbitrary languages. One such example is WML, the Wireless Markup Language. Similarly, the XML schema language used to describe the structure of XML documents is based on XML itself.

P.1.1 Well-Formed and Valid XML

Although XML syntax is flexible, it is constrained by a grammar that governs the permitted tag names, attachment of attributes to tags, and so on. All XML documents must conform to these basic grammar rules.
Such conformant documents are said to be well formed and can be interpreted by an XML interpreter, which means it is not necessary to write an interpreter for each XML document instance. In addition to being well formed, the structure of a particular XML document can be validated against a Document Type Definition (DTD) or an XML schema. An XML document conforming to a given DTD or schema is said to be valid.

P.1.2 Data-Centric and Document-Centric XML

XML documents can be classified on the basis of the data they contain. Data-centric documents capture structured data such as that pertaining to a product catalog, an order, or an invoice. Document-centric documents, on the other hand, capture unstructured data as in articles, books, or e-mails. Of course, the two types can be combined to form hybrid documents that are both data-centric and document-centric. Listings P.3 and P.4 provide examples of data-centric and document-centric XML, respectively.

Listing P.3 Data-Centric XML
[XML markup not preserved; element content only: Doe, 1-234-56789-0, 2, 30.00]

Listing P.4 Document-Centric XML
[XML markup not preserved; element content only]
XML builds on the principles of two existing languages, HTML and SGML, to create a simple mechanism . . .
The generalized markup concept . . .

P.2 XML Concepts

This section provides an overview of basic XML concepts: DTDs, XML schemas, DOM, and SAX.

P.2.1 DTDs and XML Schemas

Both DTDs and XML schemas are mechanisms used to define the structure of XML documents. They determine what elements can be contained within the XML document, how they are to be used, what default values their attributes can have, and so on. Given a DTD or XML schema and its corresponding XML document, a parser can validate whether the document conforms to the desired structure and constraints. This is particularly useful in data exchange scenarios, as DTDs and XML schemas provide and enforce a common vocabulary for the data to be exchanged. XML DTDs are subsets of SGML (Standard Generalized Markup Language) DTDs.
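The well-formedness rule from Section P.1.1 can be illustrated with a minimal sketch, assuming Python's standard xml.etree.ElementTree parser. The two sample documents and their element names are hypothetical and are not taken from the book's listings. Note that this parser checks well-formedness only; checking validity against a DTD or XML schema would need a validating parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample documents; element names are illustrative only.
WELL_FORMED = "<person><name>John Doe</name><country>UK</country></person>"
MALFORMED = "<person><name>John Doe</person>"  # <name> is never closed


def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text obeys the basic XML grammar rules."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        # The generic parser rejects the document; no per-document
        # interpreter is needed to detect the problem.
        return False
    return True


if __name__ == "__main__":
    print(is_well_formed(WELL_FORMED))
    print(is_well_formed(MALFORMED))
```

The same generic parser handles both documents, which is the practical benefit of well-formedness that the preface describes.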
An XML DTD lists the various elements and attributes in a document and the context in which they are to be used. It can also list any elements a document cannot contain. However, it does not define constraints such as the number of instances of a particular element within a document, the type of data within each element, and so on. Consequently, DTDs are inherently suitable for document-centric XML as compared to data-centric XML because data-typing and instantiation constraints are less critical in the former case. However, they can be and are being used for both types of documents.

Listing P.5 shows a DTD for the simple XML document in Listing P.2. It describes which primitive elements form valid components for the three composite ones: person, name, and address. The keyword #PCDATA signifies that the element does not contain any tags or child elements and only parsed character data.

Listing P.5 A DTD for the Simple XML Document in Listing P.2
[listing not preserved]

XML schemas differ from DTDs in that the XML schema definition language is based on XML itself. As a result, unlike DTDs, the set of constructs available for defining an XML document is extensible. XML schemas also support namespaces and richer and more complex structures than DTDs. In addition, stronger typing constraints on the data enclosed by a tag can be described because a range of primitive data types such as string, decimal, and integer are supported. This makes XML schemas highly suitable for defining data-centric documents. Another significant advantage is that XML schema definitions can exploit the same data management mechanisms as designed for XML; an XML schema is an XML document itself. This is in direct contrast with DTDs, which require specific support to be built into an XML data management system.

Listing P.6 shows an XML schema for the simple XML document in Listing P.2. The sequence tag is a compositor indicating an ordered sequence of subelements. There are other compositors for choice and all.
Also, note that, as shown for the address element, it is possible to constrain the minimum and maximum instances of an element within a document. Although not shown in the example, it is possible to define custom complex and simple types. For instance, a complex type Address could have been defined for the address element.

Listing P.6 An XML Schema for the Simple XML Document in Listing P.2
[listing not preserved]

P.2.2 DOM and SAX

DOM and SAX are the two main APIs for manipulating XML documents in an application. They are now part of the Java API for XML Processing (JAXP version 1.1). DOM is the W3C standard Document Object Model, an operating system- and programming language-independent model for storing and manipulating hierarchical documents in memory. A DOM parser parses an XML document and builds a DOM tree, which can then be used to traverse the various nodes. However, the tree has to be constructed before traversal can commence. As a result, memory management is an issue when manipulating large XML documents. T...
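The trade-off between DOM and SAX in Section P.2.2 can be sketched as follows. This is an illustration only: it uses Python's standard xml.dom.minidom and xml.sax modules rather than the JAXP API the preface mentions, and the sample document, element names, and helper names (names_via_dom, NameHandler, names_via_sax) are assumptions made for the example. The DOM version builds the whole tree before reading anything from it, whereas the SAX version reacts to parse events as they arrive and never holds the full document structure.

```python
import xml.sax
from xml.dom import minidom

# Hypothetical sample document; element names are illustrative only.
DOC = (
    "<people>"
    "<person><name>John Doe</name></person>"
    "<person><name>Jane Roe</name></person>"
    "</people>"
)


def names_via_dom(xml_text):
    """Build the complete tree in memory, then traverse it."""
    tree = minidom.parseString(xml_text)
    return [node.firstChild.data for node in tree.getElementsByTagName("name")]


class NameHandler(xml.sax.ContentHandler):
    """Collect the text of <name> elements from parse events; no tree is built."""

    def __init__(self):
        super().__init__()
        self.names = []
        self._buffer = None  # holds text chunks while inside a <name> element

    def startElement(self, tag, attrs):
        if tag == "name":
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so accumulate it.
        if self._buffer is not None:
            self._buffer.append(content)

    def endElement(self, tag):
        if tag == "name":
            self.names.append("".join(self._buffer))
            self._buffer = None


def names_via_sax(xml_text):
    """Stream through the document, keeping only the collected names."""
    handler = NameHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.names


if __name__ == "__main__":
    print(names_via_dom(DOC))
    print(names_via_sax(DOC))
```

Both functions return the same names for this small document; the difference matters for large inputs, where the DOM tree must fit in memory while the event handler keeps only what it chooses to retain.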

Book Info
Provides a discussion of the various XML data management approaches employed in a range of products and applications. Topics covered range from using XML with Oracle9i or SQL Server to embedded XML databases to Tamino. Softcover.

Authors: Akmal B. Chaudhri, Awais Rashid, Roberto Zicari
Publisher: Addison-Wesley Professional
ISBN: 0201844524
Subjects: Computer Books: General; Computers; Computers - Languages / Programming; Data Processing - General; Database Management - General; Database management; Programming Languages - XML; XML (Document markup language); Computers / Database Management / General

English Books:

  1. XML for Web Designers Using Macromedia Studio MX 2004 (Internet Series)
  2. XML in 60 Minutes a Day
  3. XML in Flash
  4. XML Internationalization and Localization
  5. XML Pocket Reference
  6. XML Step by Step, Second Edition
  7. XML Web Services in the Organization
  8. XSL Formatting Objects Developer's Handbook
  9. XSLT
  10. XSLT and XPath On The Edge, Unlimited Edition
