Comp 207: Tutorials and lab classes: Lab class 5

XML processing using Java

The aim of this lab class is to familiarise ourselves with the most common programming interfaces for working with XML documents in Java. In particular, this lab focuses on two tasks: parsing and validation of XML documents.

Parsing allows you to read an XML document to determine its structure and contents. There are many open-source, no-cost XML parsers available. This lab focuses on creating parser objects, asking those parsers to process XML files, and handling the results. Validation lets you check whether an XML document conforms to a given DTD or XML schema.

The task in this lab is to write a Java program that reads an XML document and prints out its tree structure.

XML parsing

A number of different programming interfaces have been defined by companies, standards bodies and user groups to support the processing of XML documents in Java. In this lab we will cover the Document Object Model (DOM). The DOM interface is a W3C recommendation that defines how to access and represent the content of an XML document.

An XML parser is a piece of code that reads the content and analyses the structure of an XML document. A parser supporting the DOM implements the interfaces defined in the DOM standard. A DOM parser returns a tree structure that represents the content of the input XML document. All the components of the document (elements, attributes, text content, etc.) are represented as nodes in the tree structure. The DOM also provides a number of functions that can be used to inspect and manipulate the content of the tree.

A DOM tree is made up of nodes, each of which implements the Java interface Node. Node is the base datatype of the DOM: everything in a DOM tree is a node. In order to create or manipulate a DOM it helps to have an idea of how the nodes in a DOM are structured.
The table below summarises the specification for Node. In this lab we will mainly need the following subinterfaces:

•Element: represents an XML element in the input document.
•Attr: represents an attribute of an XML element.
•Text: represents the text content of an element. An element with text contains Text node children; the text of the element is not a property of the element itself.
•Document: represents the entire XML document. For each parsed XML document there is exactly one Document object. Given a Document object, one can find the root of the DOM tree and, from the root, use DOM functions to recursively read the tree.

In this lab class we will write a piece of code that performs the following tasks:

1. Reads the name of an XML file from the command line;
2. Creates a parser object;
3. Uses the parser object to parse the input XML file;
4. Traverses the resulting DOM tree to print the content of the various nodes on standard output.

However, before starting, we need to make sure that the required classes are imported. In the snippets of code below the classes are individually named, so that you can check the API documentation.

From the Java XML API (JAXP):

    import javax.xml.parsers.DocumentBuilder;
    import javax.xml.parsers.DocumentBuilderFactory;

Classes to handle exceptions thrown when the XML document is parsed:

    import javax.xml.parsers.ParserConfigurationException;
    import org.xml.sax.SAXException;
    import org.xml.sax.SAXParseException;

Finally, the W3C definitions for a DOM, DOM exceptions, entities and nodes:

    import org.w3c.dom.Document;
    import org.w3c.dom.DocumentType;
    import org.w3c.dom.Entity;
    import org.w3c.dom.NamedNodeMap;
    import org.w3c.dom.Node;
    import org.w3c.dom.NodeList;   // needed when iterating over child nodes
    import org.w3c.dom.Attr;

The first step consists of reading the file name from the command line. This step is relatively easy: the program just looks for an argument that is supposed to be a filename or a URI (Uniform Resource Identifier, similar to a URL).
If no argument is specified, the program prints a usage message and exits.

    public static void main(String argv[]) {
        if (argv.length == 0 || (argv.length == 1 && argv[0].equals("-help"))) {
            System.out.println("\n Usage: java readAndPrintXML URI");
            System.out.println("   where URI refers to the URI of the XML document to print");
            System.exit(1);
        }
        readAndPrintXML r = new readAndPrintXML();
        r.parseAndPrint(argv[0]);
    }

If a file name has been successfully read in, then we can create a parser object. By using factory classes, one does not need to know which parser class is being used, so the parser can be changed easily without modifying the source code. In order to make use of a parser we need to create a factory object, and then have the factory object create the parser itself:

    public void parseAndPrint(String uri) {
        Document doc = null;
        try {
            DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
            // Use setValidating(true) if this needs to be a validating parser.
            domFactory.setValidating(false);
            // Ignore whitespace in element content. This relies on the content
            // model, so it only takes effect when the parser is validating.
            domFactory.setIgnoringElementContentWhitespace(true);
            DocumentBuilder db = domFactory.newDocumentBuilder();

The DocumentBuilder object is the parser: it reads the XML input and produces a Document. The main methods for working with the resulting DOM tree are:

•Document.getDocumentElement(): Returns the root element of the DOM tree. (This is a method of the Document interface and is not defined for other subtypes of Node.)
•Node.getChildNodes(): Returns the child nodes of a given Node.
•Node.getNodeName(): Returns the name associated with a given Node.
•Element.getAttribute(String attrName): For a given Element, returns the value of the attribute named attrName. If you want the value of the "id" attribute, use Element.getAttribute("id"). If that attribute doesn't exist, the method returns an empty string ("").

For more methods consult the Node documentation.
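To see the four methods listed above working together, here is a minimal sketch that parses a small XML string held in memory rather than a file. The class name DomMethodsSketch, the helper methods parse and describe, and the sample document are assumptions made for illustration only; they are not part of the lab code.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class DomMethodsSketch {
    // Hypothetical sample document; it is not the lab's sample file.
    static final String XML =
        "<library><book id=\"b1\">Dune</book><book>Emma</book></library>";

    // Parses an XML string with a non-validating DOM parser.
    static Document parse(String xml) throws Exception {
        DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return db.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    // Returns the root name, then one line per child element with its "id" value.
    static String describe(Document doc) {
        StringBuilder sb = new StringBuilder();
        Element root = doc.getDocumentElement();
        sb.append("root=").append(root.getNodeName()).append('\n');
        NodeList children = root.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            // Child nodes can also be text or comments, so check the type first.
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                Element e = (Element) child;
                sb.append(e.getNodeName())
                  .append(" id=[").append(e.getAttribute("id")).append("]\n");
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.print(describe(parse(XML)));
    }
}
```

Running the sketch prints root=library, then book id=[b1], then book id=[]. The last line shows getAttribute returning an empty string for the second book, which has no id attribute.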
Now we're ready to parse the file whose name was read in step 1.

            doc = db.parse(uri);

doc is an instance of the Document interface, and thus allows us to navigate the structure in order to analyse the content of the document.

Finally we can print the content of the DOM tree. Since everything in the DOM tree is a Node, we can recursively traverse the tree and print everything in it. The idea is to call a method to print a node; after printing the relevant information, the method calls itself to print each of the node's children.

            if (doc != null)
                printDomTree(doc);
        } catch (SAXParseException spe) {
            ...
        }
    }

    public void printDomTree(Node node) {
        int type = node.getNodeType();
        switch (type) {
            // print the document element node
            case Node.DOCUMENT_NODE: {
                // print the XML declaration (version 1.0 assumed)
                System.out.println("<?xml version=\"1.0\"?>");
                printDomTree(((Document) node).getDocumentElement());
                break;
            }

printDomTree(Node node) takes a node as an argument. If the node is a document, it prints the XML declaration, and then calls printDomTree for the document element, which contains the rest of the document. Since the document element will almost certainly have children, printDomTree is used recursively to print the children to the command line.

In order to handle an Element node, we need to enclose the node name in "<" and ">". If the node also has attributes, we need to call printDomTree for each attribute. After processing the node attributes, we can look for children of the Element node by calling printDomTree for each child node. Then we need to print the closing tag. Finally, processing text nodes just prints their value to output.
            case Node.ELEMENT_NODE: {
                // opening tag: name followed by the attributes
                System.out.print("<");
                System.out.print(node.getNodeName());
                NamedNodeMap attrs = node.getAttributes();
                for (int i = 0; i < attrs.getLength(); i++)
                    printDomTree(attrs.item(i));
                System.out.print(">");
                // children, printed recursively
                if (node.hasChildNodes()) {
                    NodeList children = node.getChildNodes();
                    for (int j = 0; j < children.getLength(); j++)
                        printDomTree(children.item(j));
                }
                // closing tag
                System.out.print("</");
                System.out.print(node.getNodeName());
                System.out.print(">");
                break;
            }
            case Node.ATTRIBUTE_NODE: {
                System.out.print(" " + node.getNodeName() + "=\"" +
                                 ((Attr) node).getValue() + "\"");
                break;
            }
            case Node.TEXT_NODE: {
                System.out.print(node.getNodeValue());
                break;
            }
        }
    }

If you're running this program from Eclipse, you will need to choose Run Configurations from the Run menu, and specify the name of the XML file you want to parse as a program argument. Please make sure the file is in the same directory as your source code or in the src directory, depending on the structure you chose for your project. Here is a sample XML file to use for your convenience.

The code samples above do not consider the issues related to exception handling and validation. As an optional exercise to do in your own time as preparation for the assignment, you might want to add the code for validating the file with respect to a DTD or an XML schema, and for handling exceptions correctly. You can find more information on handling exceptions in the JAXP Sun tutorial. You can use the same tutorial to read an overview of the validation process. Finally, here is the schema corresponding to the XML file given above to try out validation yourselves (some browsers do not render XML Schema, so you may need to click on view source in order to see the document; alternatively, save the XML schema directly in the directory with your program).
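As a possible starting point for the optional exercise, the sketch below shows one way to validate against a DTD with a validating DOM parser and to collect validity errors through a SAX ErrorHandler. It is an illustration under stated assumptions, not the assignment solution: the class name DtdValidationSketch, the validate helper and the two sample documents (which carry their DTD inline) are all hypothetical, and validation against an XML schema needs different set-up.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

public class DtdValidationSketch {
    // Hypothetical documents with an internal DTD; not the lab's sample file.
    static final String DTD =
        "<!DOCTYPE note [<!ELEMENT note (to)><!ELEMENT to (#PCDATA)>]>";
    static final String VALID = DTD + "<note><to>Ann</to></note>";
    static final String INVALID = DTD + "<note><from>Bob</from></note>";

    // Parses xml with a validating parser and returns the validity errors found.
    static List<String> validate(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true);   // check the document against its DTD
        DocumentBuilder db = factory.newDocumentBuilder();
        List<String> problems = new ArrayList<>();
        db.setErrorHandler(new ErrorHandler() {
            public void warning(SAXParseException e) {
                // warnings are ignored in this sketch
            }
            public void error(SAXParseException e) {
                // validity errors are recoverable: record them and keep parsing
                problems.add("line " + e.getLineNumber() + ": " + e.getMessage());
            }
            public void fatalError(SAXParseException e) throws SAXParseException {
                throw e;   // the document is not well-formed, so give up
            }
        });
        db.parse(new InputSource(new StringReader(xml)));
        return problems;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("valid document:   " + validate(VALID));
        System.out.println("invalid document: " + validate(INVALID));
    }
}
```

The design choice here is to treat validity errors and well-formedness errors differently: a document that breaks the DTD can still be parsed, so those errors are collected and reported together, whereas a fatal error stops the parse and reaches the caller as an exception.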