This exercise uses the DOM parser to parse an XML document and print out the document in a JSP.

Level of Difficulty: Difficult (new/difficult concepts)
Estimated time: 1 hour
Pre-requisites:

Created the directory structure in your public_html directory as indicated in lab 2.
Read one or more articles introducing DOM.

Copying this lab's files

As in the previous labs, copy this lab's files into your public_html/dca directory from:

  /pub/dca/lab04

Running the sample file

There are two sample files related to the DOM parsing example for this lab exercise:

dom.jsp
cd.xml

The dom.jsp file is the code that we will be working on during the exercise. The cd.xml file is a sample data file that we will read in.

First, you need to tell the JSP to read the XML data file from YOUR home directory. Open up the JSP file in a text editor:

  gedit dom.jsp &

Look for the line that contains the path name to cd.xml. The path will look like:

    /home/USERNAME/public_html/dca/lab04/cd.xml

Change this path so that instead of USERNAME, it says your own Faculty login name.

Then, open the JSP file in your browser, making sure that the URL contains charlie.it.uts.edu.au or sally.it.uts.edu.au, e.g.

  http://charlie.it.uts.edu.au/~username/dca/lab04/dom.jsp

When this file executes, it prints out the contents of the cd.xml file, as seen by the DOM parser.

To see what it is doing, take a look at cd.xml in a text editor.

Understanding the code

In the section below, we will walk through the code provided and give an explanation of what is happening.

  <%@ page import="javax.xml.parsers.*" %>
  <%@ page import="org.w3c.dom.*" %>
  <%@ page import="java.io.*" %>

These three lines appear at the top of the file. Here we are importing the Java class libraries that will be needed by the JSP code.

Note that they are JSP directives because they are enclosed by the delimiters "<%@" and "%>".

The javax.xml.parsers package contains some basic classes for working with XML parsers (either DOM or SAX).
The second package, org.w3c.dom, contains DOM-specific objects and methods. There is also a related package, org.xml.sax, that we will use in another exercise.
Finally, the java.io package is needed because we will be using the Java File class to read in a file.

<%
  // Create the file object we will read from
  File file = new File("/home/USERNAME/public_html/dca/lab04/cd.xml");

  // Create an instance of the DOM parser and parse the document
  DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
  DocumentBuilder db = dbf.newDocumentBuilder();
  Document doc = db.parse(file);

  // Begin traversing the document
  traverseTree(doc, out);
%>

This section of code is where we set up the DOM parser to parse the document. Note that the code is a JSP scriptlet, as it is contained within the delimiters "<%" and "%>". The steps involved are:

Create a File object that refers to the particular XML file we want to open.
Get a reference to a DocumentBuilderFactory and a DocumentBuilder object. This step is needed because there are potentially many different implementations of DOM parsers available. For example, the implementation that we will be using is called Xerces, and is part of the Apache project. Another implementation of a DOM parser comes from IBM.
From an application programmer's perspective, you aren't usually interested in which implementation of the DOM parser is being used. You just want to get access to whichever DOM parser happens to be installed on the system you are using. The DocumentBuilderFactory class provides a generic way of locating the "default" DOM parser implementation that is installed on any system. When you call DocumentBuilderFactory.newInstance(), it returns a factory for that implementation, and calling newDocumentBuilder() on the factory returns a DocumentBuilder. The DocumentBuilder object refers to the actual DOM parser itself.
Parse the XML file, by calling the parse() method on the DocumentBuilder object. With DOM, whenever you call the parse() method, you get back a reference to a Document object that is the starting point for the parsed DOM tree.
If there is a syntax error during parsing and the DOM tree cannot be built, then a Java exception is thrown and an error message appears in the browser. This error message will look like the message generated when testing well-formedness of XML documents in an earlier exercise.
Finally, as a result of parsing we have a Document object which represents a DOM tree that we can traverse. In this exercise, there is a specific Java method for performing the traversal, called traverseTree(). We call the traverseTree() method and pass to it a reference to the Document, and also to the pre-defined JSP object called out, which is used for printing data into the HTML code that is sent back to the user's web browser.
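
The parsing steps above can also be tried outside a JSP. The following is a minimal sketch, not the lab's own code: the class name DomParseSketch and the helper methods parse() and isWellFormed() are made up for illustration, and the XML is read from an in-memory string rather than from cd.xml. Checked exceptions are wrapped only to keep the sketch short.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

// Hypothetical stand-alone class; the lab itself does this inside dom.jsp.
class DomParseSketch {

    // Parses an XML string into a DOM Document using whichever parser
    // implementation the factory locates on this system.
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            DocumentBuilder db = dbf.newDocumentBuilder();
            return db.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            // A document that is not well-formed ends up here (as a SAXException).
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Returns true if the parser accepts the input as well-formed XML.
    static boolean isWellFormed(String xml) {
        try {
            parse(xml);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<cd><title>Example</title></cd>");
        System.out.println(doc.getDocumentElement().getNodeName());
        // The closing </title> tag is missing, so parsing fails.
        System.out.println(isWellFormed("<cd><title>Example</cd>"));
    }
}
```

In the JSP the exception is not caught, which is why a parse error shows up as an error message in the browser.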

<%!
  private void traverseTree(Node currnode, JspWriter out) throws Exception {

Here we declare a Java method that will be used to perform the traversal. We will call this method to handle each node in the DOM tree that has been built in memory by the parser.

Some points to note:

The method is enclosed in a block that is delimited by the symbols "<%!" and "%>". Note the extra exclamation mark (!) that you may not have encountered before - this kind of code block in a JSP file is called a declaration.
In a declaration block, the only kinds of Java code you are allowed to have are:
- variable declarations; and
- method declarations
This method declares that it throws an exception. If anything at all goes wrong, the method will just generate an exception that will be displayed as an error message in the browser.

The overall structure of the traverseTree() method is shown below:

  int type = currnode.getNodeType();
  switch (type) {
    case Node.DOCUMENT_NODE:
    {
      // handle a document node
      break;
    }
    case Node.ELEMENT_NODE:
    {
      // handle an element node
      break;
    }
    case Node.ATTRIBUTE_NODE:
    {
      // handle an attribute node
      break;
    }
    case Node.TEXT_NODE:
    {
      // handle a text node
      break;
    }
  }

This shows an outline of the traverseTree() method without all of the details filled in yet. Notice that for the current node we are processing, we first find out the node type, and then use a switch statement to branch to a block of code to handle that particular type of node.
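
To see the node type constants in isolation, here is a small sketch that maps a node to a label with the same kind of switch statement. The class NodeTypeSketch, the methods label() and sample(), and the id attribute are all made up for this illustration; the tiny document is built in memory rather than parsed from cd.xml.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical helper class, not part of the lab files.
class NodeTypeSketch {

    // Branches on the node type, as the traverseTree() outline does.
    static String label(Node node) {
        switch (node.getNodeType()) {
            case Node.DOCUMENT_NODE:
                return "DOCUMENT";
            case Node.ELEMENT_NODE:
                return "ELEMENT";
            case Node.ATTRIBUTE_NODE:
                return "ATTRIBUTE";
            case Node.TEXT_NODE:
                return "TEXT";
            default:
                // Comments, processing instructions and so on.
                return "OTHER";
        }
    }

    // Builds a tiny in-memory document: <cd id="1">A Funk Odyssey</cd>
    // (the id attribute is invented for the example).
    static Document sample() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element cd = doc.createElement("cd");
            cd.setAttribute("id", "1");
            cd.appendChild(doc.createTextNode("A Funk Odyssey"));
            doc.appendChild(cd);
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("No DOM implementation available", e);
        }
    }

    public static void main(String[] args) {
        Document doc = sample();
        Element cd = doc.getDocumentElement();
        System.out.println(label(doc));
        System.out.println(label(cd));
        System.out.println(label(cd.getAttributeNode("id")));
        System.out.println(label(cd.getFirstChild()));
    }
}
```

Node types that the switch does not list, such as comments, simply fall through without being handled, which is also what happens in the lab's outline.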

Now we will examine each of the different handlers in turn.

    case Node.DOCUMENT_NODE:
    {
      out.println("<p>DOCUMENT</p>");
      traverseTree(((Document)currnode).getDocumentElement(), out);
      break;
    }

There is only one "document" node for each XML document. In this case, first we just print a message to indicate that we have encountered a document node. Secondly, we call the getDocumentElement() method to retrieve the root element of the document. With that root element, we then call the traverseTree() method to handle it. Note that from within the traverseTree() method, we are calling the same method again. This is an example of recursion in programming.

    case Node.ELEMENT_NODE:
    {
      String elementName = currnode.getNodeName();
      out.println("<p>ELEMENT: [" + elementName + "]</p>");
      if (currnode.hasAttributes()) {
        NamedNodeMap attributes = currnode.getAttributes();
        for (int i = 0; i < attributes.getLength(); i++) {
          Node currattr = attributes.item(i);
          traverseTree(currattr, out);
        }
      }
      NodeList childNodes = currnode.getChildNodes();
      if (childNodes != null) {
        for (int i = 0; i < childNodes.getLength(); i++) {
          traverseTree(childNodes.item(i), out);
        }
      }
      break;
    }

This is the most complex of the handlers. There are three main parts to it:

Find out the name of this element (elementName) and print it out.
Check to see if this element has any attributes associated with it. If it does, then we retrieve them (attributes) and then loop through them one by one using a for loop. In DOM, every attribute is treated as a Node as well. So in this example, for each attribute, we simply call the traverseTree() method to handle it.
The final step in this example is to process any child nodes of this element. We retrieve a list of all the child nodes, and use a for loop to process each one in turn, using the traverseTree() method to do the processing. Note that children of element nodes are typically either text nodes (if the element contains text) or further element nodes (if the element contains other XML elements nested inside it).

Note that this is where we decide the traversal algorithm to use. In this case, we are using a preorder traversal (each element is printed before its attributes and children are visited), which is the most common kind of traversal for processing documents with DOM.
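
The visiting order can be checked with a stand-alone sketch that records each node in a list instead of printing HTML. This is an illustration under assumptions, not the lab's dom.jsp: the class PreorderSketch, the methods visit() and traverse(), and the sample XML (including its id attribute) are invented for the example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical stand-alone version of the traversal, collecting labels in a list.
class PreorderSketch {

    // Records the current node first, then recurses into attributes and children.
    static void visit(Node node, List<String> out) {
        switch (node.getNodeType()) {
            case Node.DOCUMENT_NODE: {
                out.add("DOCUMENT");
                visit(((Document) node).getDocumentElement(), out);
                break;
            }
            case Node.ELEMENT_NODE: {
                out.add("ELEMENT " + node.getNodeName());
                NamedNodeMap attributes = node.getAttributes();
                for (int i = 0; i < attributes.getLength(); i++) {
                    visit(attributes.item(i), out);
                }
                NodeList children = node.getChildNodes();
                for (int i = 0; i < children.getLength(); i++) {
                    visit(children.item(i), out);
                }
                break;
            }
            case Node.ATTRIBUTE_NODE: {
                out.add("ATTRIBUTE " + node.getNodeName() + "=" + node.getNodeValue());
                break;
            }
            case Node.TEXT_NODE: {
                String text = node.getNodeValue().trim();
                if (!text.isEmpty()) {
                    out.add("TEXT " + text);
                }
                break;
            }
            default:
                break;
        }
    }

    // Parses the XML string and returns the nodes in the order they were visited.
    static List<String> traverse(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            List<String> out = new ArrayList<>();
            visit(doc, out);
            return out;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample data, much smaller than cd.xml.
        String xml = "<cd id=\"1\"><title>A</title><track><title>B</title></track></cd>";
        for (String line : traverse(xml)) {
            System.out.println(line);
        }
    }
}
```

For this sample the cd element is recorded before its id attribute, its title child, and the nested track, which is the parent-before-children order that defines a preorder traversal.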

    case Node.ATTRIBUTE_NODE:
    {
      String attributeName = currnode.getNodeName();
      String attributeValue = currnode.getNodeValue();
      out.println("<p>ATTRIBUTE: name=[" + attributeName +
          "], value=[" + attributeValue + "]</p>");
      break;
    }

In the case of attribute nodes, we just retrieve the attribute name and value, and print them out.

In this traversal, attribute nodes are treated as leaf nodes. Once the name and value have been printed, there is nothing more to process.
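
As a rough illustration of reading attribute names and values, the sketch below copies the attributes of a root element into a sorted map. The class AttributeSketch, the method rootAttributes(), and the attribute names num and rating are assumptions made for the example; they are not taken from cd.xml. A sorted map is used because a NamedNodeMap is not maintained in any particular order.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;

// Hypothetical helper class, not part of the lab files.
class AttributeSketch {

    // Returns the attributes of the root element as a map sorted by attribute name.
    static Map<String, String> rootAttributes(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
            Map<String, String> result = new TreeMap<>();
            NamedNodeMap attributes = root.getAttributes();
            for (int i = 0; i < attributes.getLength(); i++) {
                Node attr = attributes.item(i);
                // Same two calls as the ATTRIBUTE_NODE handler.
                result.put(attr.getNodeName(), attr.getNodeValue());
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Attribute names here are invented for the example.
        System.out.println(rootAttributes("<track rating=\"2\" num=\"1\"/>"));
    }
}
```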

    case Node.TEXT_NODE:
    {
      String text = currnode.getNodeValue().trim();
      if (text.length() > 0) {
        out.println("<p>TEXT: [" + text + "]</p>");
      }
      break;
    }

In the case of text nodes, we retrieve the value, and "trim" it. Trimming it means that we remove whitespace from either end of the string.

If the resulting string has any characters left after trimming, then we print it out. This avoids printing text nodes that consist entirely of whitespace, such as the line breaks and indentation between elements in the XML file.

Text nodes are leaf nodes in the DOM tree. They have no children to process.
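
To see why the trimming matters, the sketch below lists the children of a root element whose source is indented across several lines. The class WhitespaceSketch and the method childKinds() are invented for the example, and it assumes a parser created with the default factory settings, which keeps whitespace between elements as text nodes.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper class, not part of the lab files.
class WhitespaceSketch {

    // Describes each direct child of the root element.
    static List<String> childKinds(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
            List<String> kinds = new ArrayList<>();
            NodeList children = root.getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                if (child.getNodeType() == Node.TEXT_NODE) {
                    // A text node that is empty after trimming held only whitespace.
                    boolean blank = child.getNodeValue().trim().isEmpty();
                    kinds.add(blank ? "blank text" : "text");
                } else if (child.getNodeType() == Node.ELEMENT_NODE) {
                    kinds.add("element " + child.getNodeName());
                }
            }
            return kinds;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<cd>\n  <title>A Funk Odyssey</title>\n</cd>";
        System.out.println(childKinds(xml));
    }
}
```

The single title element is surrounded by two whitespace-only text nodes, one for the line break and indentation before it and one for the line break after it. Without the length check, each of these would be printed as an empty TEXT line.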

Adding indenting to show nesting level

First, copy your dom.jsp file to a new file named dom1.jsp. Make the following changes to dom1.jsp.

At the moment, the sample JSP prints all nodes at the same level of indenting (against the left-hand margin). The first goal of this exercise is to modify the code so that each time the traversal algorithm enters a new level of "depth" in the DOM tree, we indent the output one level further, and each time the traversal algorithm goes up one level in the DOM tree, we remove the indenting.

The easiest way to achieve indenting is to use the HTML <blockquote> tag. When you want to increase the indenting by one level, print out the following line of HTML:

  <blockquote>

When you want to decrease the indenting by one level, print out the corresponding closing tag:

  </blockquote>

Think about how the code works. Each time you process a node, the traverseTree() method is called. Another way to think of it is that the start of the traverseTree() method is the time at which you "enter" (i.e. start processing) a node, and the end of the traverseTree() method is when you "exit" (i.e. finish processing) the node.

The solution is quite short - it can be done by adding only two lines of code - but it does require you to think about and understand how the code works (particularly the traverseTree() method).

Printing a subset of the data

The next exercise is to selectively print data from the DOM tree. Copy the original dom.jsp file to dom2.jsp, and make your changes to dom2.jsp.

Suppose that using the cd.xml file, we only want to print out a list of track titles, and none of the other information.

Modify the code so that only the <title> element values are printed. It's not as easy as it sounds - remember that the actual value isn't stored in the DOM "element" node, it is stored in a "text" node that is a child of the element.

What about the fact that the same element name (<title>) is used to represent both the CD title and the track title, depending upon where it appears in the XML document? Don't worry about this in your first attempt at a solution, but see if you can find a way to solve it.

Formatting the data in a table

The final exercise with the DOM parser is to print out the data from the cd.xml file in an HTML table. Your resulting output should look something like the following:

Track Num	Title	Time	Rating
A Funk Odyssey		Jamiroquai
1	Feels So Good	4:38	2
2	Little L	4:10	5
3	You Give Me Something	5:02	3
4	Corner of the Earth	3:57	1