31284 WSD: Lab 4 Exercise 1 - Using the DOM parser

31268 Web Services Development
Lab 4: XML parsing
Exercise 1 - Using the DOM parser

This exercise uses the DOM parser to parse an XML document and print out the document in a JSP.

Level of Difficulty: Difficult (new/difficult concepts)
Estimated time: 1 hour
Pre-requisites:
- Created directory structure in your public_html directory as indicated in lab 2.
- Read one or more articles introducing DOM.

Copying this lab's files

As in the previous labs, copy this lab's files into your public_html/dca directory from:

    /pub/dca/lab04

Running the sample file

There are two sample files related to the DOM parsing example for this lab exercise:
- dom.jsp
- cd.xml

The dom.jsp file is the code that we will be working on during the exercise. The cd.xml file is a sample data file that we will read in.

First, you need to tell the JSP to read the XML data file from YOUR home directory. Open up the JSP file in a text editor:

    gedit dom.jsp &

Look for the line that contains the path name to cd.xml, e.g. the path will look like:

    /home/USERNAME/public_html/dca/lab04/cd.xml

Change this path so that instead of USERNAME, it says your own Faculty login name.

Then, open the JSP file in your browser, making sure that the URL contains charlie.it.uts.edu.au or sally.it.uts.edu.au, e.g.

    http://charlie.it.uts.edu.au/~username/dca/lab04/dom.jsp

When this file executes, it prints out the contents of the cd.xml file, as seen by the DOM parser. To see what it is doing, take a look at cd.xml in a text editor.

Understanding the code

In the section below, we will walk through the code provided and give an explanation of what is happening.

    <%@ page import="javax.xml.parsers.*" %>
    <%@ page import="org.w3c.dom.*" %>
    <%@ page import="java.io.*" %>

These three lines appear at the top of the file. Here we are importing the Java class libraries that will be needed by the JSP code.
Note that they are JSP directives because they are enclosed by the delimiters "<%@" and "%>". The javax.xml.parsers package contains some basic classes for working with XML parsers (either DOM or SAX). The second package, org.w3c.dom, contains DOM-specific objects and methods. There is also a related package, org.xml.sax, that we will use in another exercise. Finally, the java.io package is needed because we will be using the Java File class to read in a file.

    <%
    // Create the file object we will read from
    File file = new File("/home/USERNAME/public_html/dca/lab04/cd.xml");

    // Create an instance of the DOM parser and parse the document
    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
    DocumentBuilder db = dbf.newDocumentBuilder();
    Document doc = db.parse(file);

    // Begin traversing the document
    traverseTree(doc, out);
    %>

This section of code is where we set up the DOM parser to parse the document. Note that the code is a JSP scriptlet, as it is contained within the delimiters "<%" and "%>". The steps involved are:

1. Create a File object that refers to the particular XML file we want to open.

2. Get a reference to a DocumentBuilderFactory and a DocumentBuilder object. This step is needed because there are potentially many different implementations of DOM parsers available. For example, the implementation that we will be using is called Xerces, and is part of the Apache project. Another implementation of a DOM parser comes from IBM. From an application programmer's perspective, you aren't usually interested in which implementation of the DOM parser is being used. You just want to get access to whichever DOM parser happens to be installed on the system you are using. The DocumentBuilderFactory class provides a generic way of locating the "default" DOM parser implementation that is installed on any system. When you call DocumentBuilderFactory.newInstance() and then newDocumentBuilder() on the factory, you get back a reference to some implementation of a DOM-compliant parser. The DocumentBuilder object refers to the actual DOM parser itself.

3. Parse the XML file, by calling the parse() method on the DocumentBuilder object. With DOM, whenever you call the parse() method, in return you get back a reference to a Document object that is the starting point for the parsed DOM tree. If there is a syntax error during parsing and the DOM tree cannot be built, a Java exception is thrown and an error message appears in the browser. This error message will look like the message generated when testing well-formedness of XML documents in an earlier exercise.

Finally, as a result of parsing we have a Document object which represents a DOM tree that we can traverse. In this exercise, there is a specific Java method for performing the traversal, called traverseTree(). We call the traverseTree() method and pass to it a reference to the Document, and also to the pre-defined JSP object called out, which is used for printing data into the HTML code that is sent back to the user's web browser.

    <%!
    private void traverseTree(Node currnode, JspWriter out) throws Exception {

Here we declare a Java method that will be used to perform the traversal. We will call this method to handle each node in the DOM tree that has been built in memory by the parser. Some points to note:

- The method is enclosed in a block that is delimited by the symbols "<%!" and "%>". Note the extra exclamation mark (!) that you may not have encountered before - this kind of code block in a JSP file is called a declaration. In a declaration block, the only kinds of Java code you are allowed to have are variable declarations and method declarations.

- This method declares that it throws an exception. In the event that anything at all goes wrong, the method will just generate an exception that will be displayed as an error message in the browser.
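
The parser set-up steps described earlier in this walkthrough (get a DocumentBuilderFactory, get a DocumentBuilder, call parse()) can also be tried outside a JSP. The following is a minimal sketch in plain Java and is not part of the lab files. The class name DomParseSketch and its two helper methods are hypothetical, and it parses a small in-memory string rather than cd.xml so that it runs without any file on disk. It also illustrates the point about syntax errors: a document that is not well-formed makes parse() throw an exception instead of returning a Document.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

public class DomParseSketch {

    // Hypothetical helper: parse an XML string with whichever DOM parser
    // implementation is the default on this system.
    public static Document parse(String xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        DocumentBuilder db = dbf.newDocumentBuilder();
        byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
        return db.parse(new ByteArrayInputStream(bytes));
    }

    // Hypothetical helper: returns true if the parser rejects the input
    // because it is not well-formed XML.
    public static boolean isRejected(String xml) throws Exception {
        try {
            parse(xml);
            return false;
        } catch (SAXException e) {
            return true;
        }
    }

    public static void main(String[] args) throws Exception {
        // A well-formed document: parse() returns the Document node.
        Document doc = parse("<cd><title>Sample</title></cd>");
        System.out.println(doc.getDocumentElement().getNodeName());

        // Mismatched tags: parse() throws instead of building a tree.
        System.out.println(isRejected("<cd><title>Sample</cd>"));
    }
}
```

In the JSP the same exception is simply allowed to propagate, which is why the error ends up in the browser. The default parser may also print a diagnostic line to standard error when it rejects a document.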
The overall structure of the traverseTree() method is shown below:

    int type = currnode.getNodeType();
    switch (type) {
        case Node.DOCUMENT_NODE: {
            // handle a document node
            break;
        }
        case Node.ELEMENT_NODE: {
            // handle an element node
            break;
        }
        case Node.ATTRIBUTE_NODE: {
            // handle an attribute node
            break;
        }
        case Node.TEXT_NODE: {
            // handle a text node
            break;
        }
    }

This shows an outline of the traverseTree() method without all of the details filled in yet. Notice that for the current node we are processing, we first find out the node type, and then use a switch statement to branch to a block of code to handle that particular type of node.

Now we will examine each of the different handlers in turn.

    case Node.DOCUMENT_NODE: {
        out.println("DOCUMENT");
        traverseTree(((Document)currnode).getDocumentElement(), out);
        break;
    }

There is only one "document" node for each XML document. In this case, first we just print a message to indicate that we have encountered a document node. Secondly, we call the getDocumentElement() method to retrieve the root node of the document. With that root node, we then call the traverseTree() method to handle it. Note that from within the traverseTree() method, we are calling the same method again. This is an example of recursion in programming.

    case Node.ELEMENT_NODE: {
        String elementName = currnode.getNodeName();
        out.println("ELEMENT: [" + elementName + "]");
        if (currnode.hasAttributes()) {
            NamedNodeMap attributes = currnode.getAttributes();
            for (int i = 0; i < attributes.getLength(); i++) {
                Node currattr = attributes.item(i);
                traverseTree(currattr, out);
            }
        }
        NodeList childNodes = currnode.getChildNodes();
        if (childNodes != null) {
            for (int i = 0; i < childNodes.getLength(); i++) {
                traverseTree(childNodes.item(i), out);
            }
        }
        break;
    }

This is the most complex of the handlers. There are three main parts to it:

1. Find out the name of this element (elementName) and print it out.

2. Check to see if this element has any attributes associated with it. If it does, then we retrieve them (attributes) and loop through them one by one using a for loop. In DOM, every attribute is treated as a Node as well. So in this example, for each attribute, we simply call the traverseTree() method to handle it.

3. Process any child nodes of this element. We retrieve a list of all the child nodes, and use a for loop to process each one in turn, using the traverseTree() method to do the processing. Note that children of element nodes are typically either text nodes (if the element contains text) or further element nodes (if the element contains other XML elements nested inside it).

Note that this is where we decide the traversal algorithm to use. In this case, we are using a preorder traversal (each node is handled before its children), which is the most common kind of traversal for processing documents with DOM.

    case Node.ATTRIBUTE_NODE: {
        String attributeName = currnode.getNodeName();
        String attributeValue = currnode.getNodeValue();
        out.println("ATTRIBUTE: name=[" + attributeName + "], value=[" + attributeValue + "]");
        break;
    }

In the case of attribute nodes, we just retrieve the attribute name and value, and print them out. Attribute nodes are leaf nodes in the DOM tree. They have no children to process.

    case Node.TEXT_NODE: {
        String text = currnode.getNodeValue().trim();
        if (text.length() > 0) {
            out.println("TEXT: [" + text + "]");
        }
        break;
    }

In the case of text nodes, we retrieve the value, and "trim" it. Trimming it means that we remove whitespace from either end of the string. If the resulting string has any characters left after trimming, then we print it out. This avoids printing text nodes that consist entirely of whitespace. Text nodes are leaf nodes in the DOM tree. They have no children to process.

Adding indenting to show nesting level

First, copy your dom.jsp file to a new file named dom1.jsp. Make the following changes to dom1.jsp.

At the moment, the sample JSP prints all nodes at the same level of indenting (against the left-hand margin). The first goal of this exercise is to modify the code so that each time the traversal algorithm enters a new level of "depth" in the DOM tree, we indent the output one level further, and each time the traversal algorithm goes up one level in the DOM tree, we remove the indenting.

The easiest way to achieve indenting is to use an HTML tag that indents its contents. When you want to increase the indenting by one level, print out the opening tag. When you want to decrease the indenting by one level, print out the corresponding closing tag.

Think about how the code works. Each time you process a node, the traverseTree() method is called. Another way to think of it is that the start of the traverseTree() method is the time at which you "enter" (i.e. start processing) a node, and the end of the traverseTree() method is when you "exit" (i.e. finish processing) the node.

The solution is quite short - it can be done by adding only two lines of code - but it does require you to think about and understand how the code works (particularly the traverseTree() method).

Printing a subset of the data

The next exercise is to selectively print data from the DOM tree. Copy the original dom.jsp file to become dom2.jsp, and make your changes to dom2.jsp.

Suppose that using the cd.xml file, we only want to print out a list of track titles, and none of the other information. Modify the code so that only the <title> element values are printed. It's not as easy as it sounds - remember that the actual value isn't stored in the DOM "element" node, it is stored in a "text" node that is a child of the element.

What about the fact that the same element name (<title>) is used to represent both the CD title and the track title, depending upon where it appears in the XML document? Don't worry about this in your first attempt at a solution, but see if you can find a way to solve it.

Formatting the data in a table

The final exercise with the DOM parser is to print out the data from the cd.xml file in an HTML table. Your resulting output should look something like the following:

    A Funk Odyssey
    Jamiroquai

    Track Num  Title                  Time  Rating
    1          Feels So Good          4:38  2
    2          Little L               4:10  5
    3          You Give Me Something  5:02  3
    4          Corner of the Earth    3:57  1

© 2002 University of Technology, Sydney. All Rights Reserved.
Redistribution without permission prohibited.

<!-- https://learn.it.uts.edu.au/aip/enrolled/08-xml/lab04-ex01.html -->