CP4044 – Programming for Systems and Networks 
Mel Ralph Page 1 25/10/2007 
Assignment 1 
 
Your task is to develop a word frequency counting program in Java. 
 
The program will read text from an input file, and produce a list of the words found in the file 
with a count of the number of times each word has been encountered.  For the purpose of this 
assessment the input files will be plain ASCII text files, not word processor files.   
 
The program will be executed from the command line, with the filename and any other 
required data being passed as command-line arguments.   
 
You may only use an array data structure (covered in the module) to implement the program.  
In particular, you cannot use Vector or JCF classes. 
 
 
 
Basic Solution (Grade D) 
 
The program will produce as output a list of words with counts of the number of times each 
word has been encountered in the input document.  The list should be in either alphabetical 
word order or in order of word frequency.  You may assume that words are delimited by 
white-space characters (i.e. newline, carriage return, tab or space). 
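
As an illustration only, the basic solution could be sketched with two parallel arrays, one holding the distinct words and one holding their counts. The class and method names here (WordCounter, add, addLine, countOf) are hypothetical and are not prescribed by this brief. The sketch assumes that words are compared case-sensitively and that the arrays are doubled in size when they fill up; both are design choices, not requirements.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

// Hypothetical class name; the brief does not prescribe any names.
class WordCounter {
    String[] words = new String[16]; // distinct words seen so far
    int[] counts = new int[16];      // counts[i] belongs to words[i]
    int size = 0;                    // number of array slots in use

    // Record one occurrence of a word, using a linear search.
    void add(String word) {
        for (int i = 0; i < size; i++) {
            if (words[i].equals(word)) {
                counts[i]++;
                return;
            }
        }
        if (size == words.length) {
            grow();
        }
        words[size] = word;
        counts[size] = 1;
        size++;
    }

    // Double the capacity of both arrays by copying into larger ones.
    private void grow() {
        String[] newWords = new String[words.length * 2];
        int[] newCounts = new int[counts.length * 2];
        System.arraycopy(words, 0, newWords, 0, size);
        System.arraycopy(counts, 0, newCounts, 0, size);
        words = newWords;
        counts = newCounts;
    }

    // Split a line on runs of white space and count each token.
    void addLine(String line) {
        String[] tokens = line.trim().split("\\s+");
        for (int i = 0; i < tokens.length; i++) {
            if (tokens[i].length() > 0) { // an empty line yields one empty token
                add(tokens[i]);
            }
        }
    }

    // Return the count for a word, or 0 if it has not been seen.
    int countOf(String word) {
        for (int i = 0; i < size; i++) {
            if (words[i].equals(word)) {
                return counts[i];
            }
        }
        return 0;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.err.println("usage: java WordCounter <file>");
            return;
        }
        WordCounter counter = new WordCounter();
        BufferedReader in = new BufferedReader(new FileReader(args[0]));
        String line;
        while ((line = in.readLine()) != null) {
            counter.addLine(line);
        }
        in.close();
        // Output is in order of first appearance; sorting is a separate step.
        for (int i = 0; i < counter.size; i++) {
            System.out.println(counter.counts[i] + " " + counter.words[i]);
        }
    }
}
```

The linear search makes each insertion proportional to the number of distinct words already stored, which is worth bearing in mind when evaluating performance on large files.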
 
 
The following are suggestions for improving on the basic solution, with indicative grades: 
 
Grade C 
 
The program does not crash when run with invalid input. 
 
The order of words in the output will be determined by a command-line flag: -a lists the 
words in alphabetical order, while -n and -nr list the words with the least (-n) or most (-nr) 
frequent first.  Use alphabetical order within groups of words with equal frequency. 
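
One possible way to order the output, sketched here under assumptions, is an insertion sort over a pair of parallel arrays (words and counts) with the comparison chosen by the flag. The names WordSorter, before and sort are hypothetical. Alphabetical order is taken to mean the order given by String.compareTo, which places upper-case letters before lower-case ones; any flag other than -n or -nr is treated as -a.

```java
// Hypothetical helper; sorts parallel arrays in place using only arrays.
class WordSorter {
    // True if entry (w1, c1) should be listed before entry (w2, c2).
    static boolean before(String w1, int c1, String w2, int c2, String flag) {
        if (flag.equals("-n") && c1 != c2) {
            return c1 < c2;          // least frequent first
        }
        if (flag.equals("-nr") && c1 != c2) {
            return c1 > c2;          // most frequent first
        }
        return w1.compareTo(w2) < 0; // -a, or a tie on frequency
    }

    // Insertion sort over the first `size` entries of both arrays.
    static void sort(String[] words, int[] counts, int size, String flag) {
        for (int i = 1; i < size; i++) {
            String w = words[i];
            int c = counts[i];
            int j = i - 1;
            // Shift later-ordered entries right to make room for (w, c).
            while (j >= 0 && before(w, c, words[j], counts[j], flag)) {
                words[j + 1] = words[j];
                counts[j + 1] = counts[j];
                j--;
            }
            words[j + 1] = w;
            counts[j + 1] = c;
        }
    }
}
```

Insertion sort is easy to write with plain arrays but takes time proportional to the square of the number of distinct words in the worst case; a faster algorithm may be preferable for large inputs.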
 
Grade B 
 
Care will be taken to remove punctuation from input text at the start and end of words (i.e. 
input such as “the end” will be seen as the two words the and end  with the quotes removed), 
but punctuation in the middle of a word (e.g. St.John) may be retained.  A further command 
line flag (-cn) will produce output in n columns equally spaced across the screen. 
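
The two requirements above might be approached as in the following sketch. It assumes that "punctuation" means any character that is neither a letter nor a digit, and that the screen width is passed in by the caller (80 characters is a common assumption, but the brief does not state one). The names GradeBHelpers, trimPunctuation and formatColumns are hypothetical.

```java
// Hypothetical helpers for punctuation trimming and column output.
class GradeBHelpers {
    // Remove non-alphanumeric characters from both ends of a token,
    // leaving any punctuation in the middle untouched.
    static String trimPunctuation(String token) {
        int start = 0;
        int end = token.length();
        while (start < end && !Character.isLetterOrDigit(token.charAt(start))) {
            start++;
        }
        while (end > start && !Character.isLetterOrDigit(token.charAt(end - 1))) {
            end--;
        }
        return token.substring(start, end); // empty if nothing is left
    }

    // Lay out the first `size` entries in rows of `columns` cells, each
    // cell padded with spaces to screenWidth / columns characters.
    static String formatColumns(String[] entries, int size, int columns, int screenWidth) {
        int cellWidth = screenWidth / columns;
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < size; i++) {
            out.append(entries[i]);
            boolean endOfRow = (i % columns == columns - 1) || (i == size - 1);
            if (endOfRow) {
                out.append('\n');
            } else {
                for (int pad = entries[i].length(); pad < cellWidth; pad++) {
                    out.append(' ');
                }
            }
        }
        return out.toString();
    }
}
```

A token made up entirely of punctuation trims to the empty string and should be skipped rather than counted. An entry longer than the cell width is not truncated in this sketch, so it will run into the next column.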
 
For extra credit the program will be capable of processing a number of files listed on the 
command line.  The quality of the command-line interface is comparable to that of DOS and 
Unix/Linux utilities. 
 
Grade A 
 
The program will handle files written in HTML.  All “mark-up” will be ignored in preparing the 
counts.  For extra credit the program will be capable of fetching HTML files “off-the-net” and 
will be capable of producing the output as an HTML formatted document.  
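
A very simple way to ignore mark-up, shown here only as a sketch, is to discard every character between a '<' and the next '>'. The class name MarkupStripper and its strip method are hypothetical. The sketch keeps its state between calls so that a tag split across two lines is still removed, and it replaces each tag with a space so that words on either side of a tag are not joined together. It does not decode character entities such as &amp;amp;, and it does not skip the contents of script or style elements; a fuller solution would need to handle both.

```java
// Hypothetical helper that removes <...> tags from text, line by line.
class MarkupStripper {
    private boolean insideTag = false; // remembered across lines

    // Return the line with all tag characters removed.
    String strip(String line) {
        StringBuilder text = new StringBuilder();
        for (int i = 0; i < line.length(); i++) {
            char ch = line.charAt(i);
            if (insideTag) {
                if (ch == '>') {
                    insideTag = false;
                    text.append(' '); // keep neighbouring words apart
                }
            } else if (ch == '<') {
                insideTag = true;
            } else {
                text.append(ch);
            }
        }
        return text.toString();
    }
}
```

The stripped text can then be split on white space and counted in the same way as plain text input.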
 
For extra credit attempt to determine how the processing time of your program is related to the 
size of the input file.  Compare the performance and results of your program with the 
Unix/Linux command: 
 
tr -cs '[:alnum:]' '[\n*]' < data-file | sort | uniq -c | sort -nr 
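
To relate processing time to input size, one approach is to time the counting step on inputs of increasing size and see how the elapsed time grows. The sketch below is an assumption-laden illustration, not a prescribed method: the class TimingSketch and its methods are hypothetical, the input is synthetic rather than read from a file, and the counting step is a plain linear search over an array. System.nanoTime is a standard Java call (available from Java 5) that returns a time value suitable for measuring elapsed intervals.

```java
// Hypothetical timing experiment using synthetic input.
class TimingSketch {
    // Count distinct words with a linear search over an array;
    // returns the number of distinct words found.
    static int countDistinct(String[] tokens) {
        String[] seen = new String[tokens.length];
        int size = 0;
        for (int i = 0; i < tokens.length; i++) {
            boolean found = false;
            for (int j = 0; j < size && !found; j++) {
                found = seen[j].equals(tokens[i]);
            }
            if (!found) {
                seen[size++] = tokens[i];
            }
        }
        return size;
    }

    // Build n synthetic tokens drawn from `distinct` different words.
    static String[] synthetic(int n, int distinct) {
        String[] tokens = new String[n];
        for (int i = 0; i < n; i++) {
            tokens[i] = "w" + (i % distinct);
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Double the input size each time and report the elapsed time.
        for (int n = 10000; n <= 80000; n *= 2) {
            String[] tokens = synthetic(n, n / 10);
            long start = System.nanoTime();
            int distinct = countDistinct(tokens);
            long elapsedMs = (System.nanoTime() - start) / 1000000L;
            System.out.println(n + " words, " + distinct + " distinct: " + elapsedMs + " ms");
        }
    }
}
```

If the elapsed time roughly quadruples each time the input doubles, the processing time is growing with the square of the input size. Single measurements are noisy, so repeating each run several times gives a more reliable picture.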
 
 
 
Submitting the Assignment 
 
The deadline for handing this assignment in to Registry is Friday, 23 November (week 9). 
 
You should submit a printed copy of the program documentation with a floppy disk or CD 
securely attached containing the program source and executable files. 
 
All programs must be written in standard Java and any non-core classes used should be 
documented.  
 
The documentation should include a commented listing of the source code (printed in a non-
proportional font) and the output produced by the program processing various test files.  You 
should make an attempt to process at least one file of substantial size, although full output for 
processing a file is not required.  (You can obtain a suitable large test file from Project 
Gutenberg.  For example, ‘The Count of Monte Cristo’ is over 
2MB). 
 
Include a critical evaluation of the program, commenting on the program performance, i.e. 
execution time, accuracy of the parsing, etc.  Reference any sources used other than the 
lecture and workshop notes. 
 
You may be required to demonstrate your program. 
 
In assessing your program, attention will be paid to the observation of good programming 
practice.