Java程序辅导

C C++ Java Python Processing编程在线培训 程序编写 软件开发 视频讲解

客服在线QQ:2653320439 微信:ittutor Email:itutor@qq.com
wx: cjtutor
QQ: 2653320439
CS 3114 Data Structures & Algorithms  Project 1: File Operations and Parsing 
 
Version 7.10 This is a purely individual assignment! 1 
File Operations and Text Parsing in Java 
 
This assignment involves implementing a smallish Java program that performs some basic file parsing and navigation tasks, 
and parsing of character strings. 
 
The program will deal with two input files.  The first input file, whose name will be supplied from the command line, contains 
a collection of data records pertaining to geographical features, obtained from the website for the USGS Board on Geographic 
Names (www.usgs.gov/core-science-systems/ngp/board-on-geographic-names/download-gnis-data)[1].  The file begins with a 
descriptive header line, followed by a sequence of GIS records, one per line, which contain the following fields in the 
indicated order: 
 
Figure 1: Geographic Data Record Format 
 
Name Type 
Length/ 
Decimals 
Short Description 
Feature ID  Integer 10 
Permanent, unique feature record identifier and official feature name  Feature 
Name 
String 120 
Feature 
Class  
String 50 See Figure 3 later in this specification 
State Alpha String 2 
The unique two letter alphabetic code and the unique two number code for a US State  State 
Numeric 
String 2 
County 
Name  
String 100 
The name and unique three number code for a county or county equivalent 
County 
Numeric 
String 3 
Primary 
Latitude 
DMS 
String 7 
The official feature location 
 DMS-degrees/minutes/seconds 
 DEC-decimal degrees.  
 
Note: Records showing "Unknown" and zeros for the latitude and longitude DMS and 
decimal fields, respectively, indicate that the coordinates of the feature are unknown. 
They are recorded in the database as zeros to satisfy the format requirements of a 
numerical data type. They are not errors and do not reference the actual geographic 
coordinates at 0 latitude, 0 longitude. 
Primary 
Longitude 
DMS 
String 8 
Primary 
Latitude 
DEC 
Real Number 11/7 
Primary 
Longitude 
DEC 
Real Number 12/7 
Source 
Latitude 
DMS 
String 7 
Source coordinates of linear feature only (Class = Stream, Valley, Arroyo) 
 DMS-degrees/minutes/seconds 
 DEC-decimal degrees.  
 
Note: Records showing "Unknown" and zeros for the latitude and longitude DMS and 
decimal fields, respectively, indicate that the coordinates of the feature are unknown. 
They are recorded in the database as zeros to satisfy the format requirements of a 
numerical data type. They are not errors and do not reference the actual geographic 
coordinates at 0 latitude, 0 longitude. 
Source 
Longitude 
DMS 
String 8 
Source 
Latitude 
DEC 
Real Number 11/7 
Source 
Longitude 
DEC 
Real Number 12/7 
Elevation 
(meters)   
Integer 5 Elevation in meters above (-below) sea level of the surface at the primary coordinates  
Elevation Integer 6 Elevation in feet above (-below) sea level of the surface at the primary coordinates  
CS 3114 Data Structures & Algorithms  Project 1: File Operations and Parsing 
 
Version 7.10 This is a purely individual assignment! 2 
(feet)   
Map Name String 100 Name of USGS base series topographic map containing the primary coordinates. 
Date 
Created 
String 
 
The date the feature was initially committed to the database. 
Date Edited String 
 
The date any attribute of an existing feature was last edited. 
 
Notes: 
 
 See https://geonames.usgs.gov/docs/metadata/gnis.xml for far more information than you need. 
 The type specifications used here have been modified from the source (URL above) to better reflect the realities of 
your programming environment.   
 Latitude and longitude may be expressed in DMS (degrees/minutes/seconds, 0820830W) format, or DEC (real 
number, -82.1417975) format.  In DMS format, latitude will always be expressed using 6 digits followed by a single 
character specifying the hemisphere, and longitude will always be expressed using 7 digits followed by a hemisphere 
designator. 
 Although some fields are mandatory, some may be omitted altogether.  Best practice is to treat every field as if it 
may be left unspecified.  Certain fields are necessary in order to index a record:  the feature name and the primary 
latitude and primary longitude.  If a record omits any of those fields, you may discard the record, or index it as far as 
possible. 
 
In the GIS record file, each record will occur on a single line, and the fields will be separated by pipe ('|') symbols.  Empty 
fields will be indicated by a pair of pipe symbols with no characters between them.  See the posted VA_Monterey.txt file 
for many examples. 
  
GIS record files are guaranteed to conform to the syntax described above, so there is no explicit requirement that you validate 
the files.  On the other hand, some error-checking during parsing may help you detect errors in your parsing logic. 
 
A file can be thought of as a sequence of bytes, each at a unique offset from the beginning of the file, just like the cells of an 
array.  So, each GIS record begins at a unique offset from the beginning of the file. 
 
File Formats and Line Termination   
 
Each input and output file will be plain ASCII text. 
 
Each line of a text file ends with a particular marker (known as the line terminator).  In MS-DOS/Windows file systems, the 
line terminator is a sequence of two ASCII characters (CR + LF, 0X0D0A).  In Unix systems, the line terminator is a single 
ASCII character (LF).  Other systems may use other line termination conventions. 
 
Why should you care?  Which line termination is used has an effect on the file offsets for all but the first record in the data 
file.  As long as we’re all testing with files that use the same line termination, we should all get the same file offsets.  But if 
you change the file format (of the posted data files) to use different line termination, you will get different file offsets than are 
shown in the posted log files.  That would mean that a command script prepared for use with GIS record files in one format 
will not work correctly with GIS record files in another format.  Most good text editors will tell you what line termination is 
used in an opened file, and also let you change the line termination scheme. 
 
All that being said, as project is auto-graded, all input files will have the same line termination, and the grading of correctness 
of your searches will depend on whether you report the correct search results, not on the file offsets you report.  On the other 
hand, if your indexing logic does not produce the correct offsets, you will definitely have difficulties with a future 
assignment. 
 
 
 
CS 3114 Data Structures & Algorithms  Project 1: File Operations and Parsing 
 
Version 7.10 This is a purely individual assignment! 3 
Figure 2: Sample Geographic Data 
Records 
 
Note that some record fields are optional, and 
that when there is no given value for a field, there 
are still delimiter symbols for it. 
 
Also, some of the lines are "wrapped" to fit into 
the text box; lines are never "wrapped" in the 
actual data files. 
  
F
E
A
T
U
R
E
_
I
D
|
F
E
A
T
U
R
E
_
N
A
M
E
|
F
E
A
T
U
R
E
_
C
L
A
S
S
|
S
T
A
T
E
_
A
L
P
H
A
|
S
T
A
T
E
_
N
U
M
E
R
I
C
|
C
O
U
N
T
Y
_
N
A
M
E
|
C
O
U
N
T
Y
_
N
U
M
E
R
I
C
|
P
R
I
M
A
R
Y
_
L
A
T
_
D
M
S
|
P
R
I
M
_
L
O
N
G
_
D
M
S
|
P
R
I
M
_
L
A
T
_
D
E
C
|
P
R
I
M
_
L
O
N
G
_
D
E
C
|
S
O
U
R
C
E
_
L
A
T
_
D
M
S
|
S
O
U
R
C
E
_
L
O
N
G
_
D
M
S
|
S
O
U
R
C
E
_
L
A
T
_
D
E
C
|
S
O
U
R
C
E
_
L
O
N
G
_
D
E
C
|
E
L
E
V
_
I
N
_
M
|
E
L
E
V
_
I
N
_
F
T
|
M
A
P
_
N
A
M
E
|
D
A
T
E
_
C
R
E
A
T
E
D
|
D
A
T
E
_
E
D
I
T
E
D
 
1
4
7
9
1
1
6
|
M
o
n
t
e
r
e
y
 
E
l
e
m
e
n
t
a
r
y
 
S
c
h
o
o
l
|
S
c
h
o
o
l
|
V
A
|
5
1
|
R
o
a
n
o
k
e
 
(
c
i
t
y
)
|
7
7
0
|
3
7
1
9
0
6
N
|
0
7
9
5
6
0
8
W
|
3
7
.
3
1
8
3
7
5
3
|
-
7
9
.
9
3
5
5
8
5
7
|
|
|
|
|
3
2
3
|
1
0
6
0
|
R
o
a
n
o
k
e
|
0
9
/
2
8
/
1
9
7
9
|
0
9
/
1
5
/
2
0
1
0
 
1
4
8
1
3
4
5
|
A
s
b
u
r
y
 
C
h
u
r
c
h
|
C
h
u
r
c
h
|
V
A
|
5
1
|
H
i
g
h
l
a
n
d
|
0
9
1
|
3
8
2
6
0
7
N
|
0
7
9
3
3
1
2
W
|
3
8
.
4
3
5
3
9
8
1
|
-
7
9
.
5
5
3
3
8
0
7
|
|
|
|
|
8
1
8
|
2
6
8
4
|
M
o
n
t
e
r
e
y
|
0
9
/
2
8
/
1
9
7
9
|
 
1
4
8
1
8
5
2
|
B
l
u
e
 
G
r
a
s
s
|
P
o
p
u
l
a
t
e
d
 
P
l
a
c
e
|
V
A
|
5
1
|
H
i
g
h
l
a
n
d
|
0
9
1
|
3
8
3
0
0
0
N
|
0
7
9
3
2
5
9
W
|
3
8
.
5
0
0
1
1
8
8
|
-
7
9
.
5
4
9
7
7
0
2
|
|
|
|
|
7
7
7
|
2
5
4
9
|
M
o
n
t
e
r
e
y
|
0
9
/
2
8
/
1
9
7
9
|
 
1
4
8
1
8
7
8
|
B
l
u
e
g
r
a
s
s
 
V
a
l
l
e
y
|
V
a
l
l
e
y
|
V
A
|
5
1
|
H
i
g
h
l
a
n
d
|
0
9
1
|
3
8
2
9
5
3
N
|
0
7
9
3
2
2
2
W
|
3
8
.
4
9
8
1
7
4
5
|
-
7
9
.
5
3
9
4
9
2
|
3
8
2
6
0
1
N
|
0
7
9
3
8
0
0
W
|
3
8
.
4
3
3
7
3
0
9
|
-
7
9
.
6
3
3
3
8
3
3
|
7
5
9
|
2
4
9
0
|
M
o
n
t
e
r
e
y
|
0
9
/
2
8
/
1
9
7
9
|
 
1
4
8
2
1
1
0
|
B
u
c
k
 
H
i
l
l
|
S
u
m
m
i
t
|
V
A
|
5
1
|
H
i
g
h
l
a
n
d
|
0
9
1
|
3
8
1
9
0
2
N
|
0
7
9
3
3
5
8
W
|
3
8
.
3
1
7
3
4
5
2
|
-
7
9
.
5
6
6
1
5
7
7
|
|
|
|
|
1
0
0
3
|
3
2
9
1
|
M
o
n
t
e
r
e
y
 
S
E
|
0
9
/
2
8
/
1
9
7
9
|
 
1
4
8
2
1
7
6
|
B
u
r
n
e
r
s
 
R
u
n
|
S
t
r
e
a
m
|
V
A
|
5
1
|
H
i
g
h
l
a
n
d
|
0
9
1
|
3
8
2
5
0
9
N
|
0
7
9
3
4
0
9
W
|
3
8
.
4
1
9
2
8
7
3
|
-
7
9
.
5
6
9
2
1
4
4
|
3
8
2
5
3
1
N
|
0
7
9
3
5
3
8
W
|
3
8
.
4
2
5
2
7
7
8
|
-
7
9
.
5
9
3
8
8
8
9
|
8
4
8
|
2
7
8
2
|
M
o
n
t
e
r
e
y
|
0
9
/
2
8
/
1
9
7
9
|
 
1
4
8
2
3
2
4
|
M
o
u
n
t
 
C
a
r
l
y
l
e
|
S
u
m
m
i
t
|
V
A
|
5
1
|
H
i
g
h
l
a
n
d
|
0
9
1
|
3
8
1
5
5
6
N
|
0
7
9
3
3
5
3
W
|
3
8
.
2
6
5
6
7
9
9
|
-
7
9
.
5
6
4
7
6
8
2
|
|
|
|
|
6
9
8
|
2
2
9
0
|
M
o
n
t
e
r
e
y
 
S
E
|
0
9
/
2
8
/
1
9
7
9
|
 
1
4
8
2
4
3
4
|
C
e
n
t
r
a
l
 
C
h
u
r
c
h
|
C
h
u
r
c
h
|
V
A
|
5
1
|
H
i
g
h
l
a
n
d
|
0
9
1
|
3
8
2
9
5
3
N
|
0
7
9
3
3
2
3
W
|
3
8
.
4
9
8
1
7
4
4
|
-
7
9
.
5
5
6
4
3
7
1
|
|
|
|
|
7
7
3
|
2
5
3
6
|
M
o
n
t
e
r
e
y
|
0
9
/
2
8
/
1
9
7
9
|
 
1
4
8
2
5
5
7
|
C
l
a
y
l
i
c
k
 
H
o
l
l
o
w
|
V
a
l
l
e
y
|
V
A
|
5
1
|
H
i
g
h
l
a
n
d
|
0
9
1
|
3
8
1
6
1
3
N
|
0
7
9
3
2
3
8
W
|
3
8
.
2
7
0
4
0
2
1
|
-
7
9
.
5
4
3
9
3
4
3
|
3
8
1
7
3
3
N
|
0
7
9
3
3
2
4
W
|
3
8
.
2
9
2
5
|
-
7
9
.
5
5
6
6
6
6
7
|
5
7
3
|
1
8
8
0
|
M
o
n
t
e
r
e
y
 
S
E
|
0
9
/
2
8
/
1
9
7
9
|
 
1
4
8
2
7
8
5
|
C
r
a
b
 
R
u
n
|
S
t
r
e
a
m
|
V
A
|
5
1
|
H
i
g
h
l
a
n
d
|
0
9
1
|
3
8
1
7
0
7
N
|
0
7
9
3
1
4
4
W
|
3
8
.
2
8
5
4
0
1
8
|
-
7
9
.
5
2
8
9
3
4
|
3
8
1
9
0
3
N
|
0
7
9
3
4
1
5
W
|
3
8
.
3
1
7
5
|
-
7
9
.
5
7
0
8
3
3
3
|
5
7
9
|
1
9
0
0
|
M
o
n
t
e
r
e
y
 
S
E
|
0
9
/
2
8
/
1
9
7
9
|
 
1
4
8
2
9
5
0
|
D
a
v
i
s
 
R
u
n
|
S
t
r
e
a
m
|
V
A
|
5
1
|
H
i
g
h
l
a
n
d
|
0
9
1
|
3
8
1
8
2
4
N
|
0
7
9
3
0
5
3
W
|
3
8
.
3
0
6
7
9
0
3
|
-
7
9
.
5
1
4
7
6
7
1
|
3
8
2
0
5
7
N
|
0
7
9
3
5
0
5
W
|
3
8
.
3
4
9
1
6
6
7
|
-
7
9
.
5
8
4
7
2
2
2
|
6
0
1
|
1
9
7
2
|
M
o
n
t
e
r
e
y
 
S
E
|
0
9
/
2
8
/
1
9
7
9
|
 
1
4
8
3
2
8
1
|
E
l
k
 
R
u
n
|
S
t
r
e
a
m
|
V
A
|
5
1
|
H
i
g
h
l
a
n
d
|
0
9
1
|
3
8
2
9
3
6
N
|
0
7
9
3
1
5
3
W
|
3
8
.
4
9
3
4
5
2
4
|
-
7
9
.
5
3
1
4
3
6
2
|
3
8
3
1
2
1
N
|
0
7
9
3
0
5
6
W
|
3
8
.
5
2
2
6
1
8
5
|
-
7
9
.
5
1
5
6
0
2
7
|
7
5
7
|
2
4
8
4
|
M
o
n
t
e
r
e
y
|
0
9
/
2
8
/
1
9
7
9
|
 
1
4
8
3
4
9
2
|
F
o
r
k
s
 
o
f
 
W
a
t
e
r
s
|
L
o
c
a
l
e
|
V
A
|
5
1
|
H
i
g
h
l
a
n
d
|
0
9
1
|
3
8
2
8
5
6
N
|
0
7
9
3
0
3
1
W
|
3
8
.
4
8
2
3
4
1
7
|
-
7
9
.
5
0
8
6
5
7
5
|
|
|
|
|
7
0
5
|
2
3
1
3
|
M
o
n
t
e
r
e
y
|
0
9
/
2
8
/
1
9
7
9
|
 
1
4
8
3
5
2
7
|
F
r
a
n
k
 
R
u
n
|
S
t
r
e
a
m
|
V
A
|
5
1
|
H
i
g
h
l
a
n
d
|
0
9
1
|
3
8
2
9
5
3
N
|
0
7
9
3
3
1
0
W
|
3
8
.
4
9
8
1
7
4
4
|
-
7
9
.
5
5
2
8
2
5
8
|
3
8
3
3
0
4
N
|
0
7
9
3
3
4
1
W
|
3
8
.
5
5
1
2
2
8
5
|
-
7
9
.
5
6
1
4
3
8
1
|
7
8
0
|
2
5
5
9
|
M
o
n
t
e
r
e
y
|
0
9
/
2
8
/
1
9
7
9
|
 
1
4
8
3
6
4
7
|
G
i
n
s
e
n
g
 
M
o
u
n
t
a
i
n
|
S
u
m
m
i
t
|
V
A
|
5
1
|
H
i
g
h
l
a
n
d
|
0
9
1
|
3
8
2
8
5
0
N
|
0
7
9
3
1
3
9
W
|
3
8
.
4
8
0
6
7
5
|
-
7
9
.
5
2
7
5
4
7
|
|
|
|
|
9
7
8
|
3
2
0
9
|
M
o
n
t
e
r
e
y
|
0
9
/
2
8
/
1
9
7
9
|
 
1
4
8
3
8
6
0
|
G
u
l
f
 
M
o
u
n
t
a
i
n
|
S
u
m
m
i
t
|
V
A
|
5
1
|
H
i
g
h
l
a
n
d
|
0
9
1
|
3
8
2
9
4
0
N
|
0
7
9
3
1
0
3
W
|
3
8
.
4
9
4
5
6
3
6
|
-
7
9
.
5
1
7
5
4
6
8
|
|
|
|
|
1
0
0
6
|
3
3
0
0
|
M
o
n
t
e
r
e
y
|
0
9
/
2
8
/
1
9
7
9
|
 
1
4
8
3
9
1
6
|
H
a
m
i
l
t
o
n
 
C
h
a
p
e
l
|
C
h
u
r
c
h
|
V
A
|
5
1
|
H
i
g
h
l
a
n
d
|
0
9
1
|
3
8
1
7
4
0
N
|
0
7
9
3
7
0
7
W
|
3
8
.
2
9
4
5
6
7
7
|
-
7
9
.
6
1
8
6
5
9
1
|
|
|
|
|
8
2
3
|
2
7
0
0
|
M
o
n
t
e
r
e
y
 
S
E
|
0
9
/
2
8
/
1
9
7
9
|
 
1
4
8
4
0
9
7
|
H
i
g
h
l
a
n
d
 
H
i
g
h
 
S
c
h
o
o
l
|
S
c
h
o
o
l
|
V
A
|
5
1
|
H
i
g
h
l
a
n
d
|
0
9
1
|
3
8
2
4
2
6
N
|
0
7
9
3
4
4
4
W
|
3
8
.
4
0
7
1
3
8
7
|
-
7
9
.
5
7
8
9
3
3
3
|
|
|
|
|
8
7
9
|
2
8
8
4
|
M
o
n
t
e
r
e
y
|
0
9
/
2
8
/
1
9
7
9
|
0
9
/
1
5
/
2
0
1
0
 
1
4
8
4
0
9
9
|
H
i
g
h
l
a
n
d
 
W
i
l
d
l
i
f
e
 
M
a
n
a
g
e
m
e
n
t
 
A
r
e
a
|
P
a
r
k
|
V
A
|
5
1
|
H
i
g
h
l
a
n
d
|
0
9
1
|
3
8
1
9
0
5
N
|
0
7
9
3
4
3
9
W
|
3
8
.
3
1
8
1
7
8
5
|
-
7
9
.
5
7
7
5
4
7
|
|
|
|
|
9
5
4
|
3
1
3
0
|
M
o
n
t
e
r
e
y
 
S
E
|
0
9
/
2
8
/
1
9
7
9
|
 
.
 
.
 
.
 
CS 3114 Data Structures & Algorithms  Project 1: File Operations and Parsing 
 
Version 7.10 This is a purely individual assignment! 4 
Your program may be invoked from the command line, in either of the following ways: 
 
java GISParser -index   
java GISParser -search    
 
Note the instructions given above imply your main class (the one that implements public static void main()) must 
be named GISParser, and the GISParser class must not be in a package (Eclipse calls this the default package).  If you 
execute your program from within Eclipse, you can still specify command-line parameters; see the Eclipse for 3114 notes 
for an example. 
 
Computing Record File Offsets [30%] 
 
java GISParser -index   
 
Your program will parse the given GIS record file and write, to the file named by the final parameter, the file offset and 
Feature Name field for each of the records found in the file, listed in the order the records occur in the input file. Here's the 
beginning of a sample input file (lines are truncated for the purpose of this display): 
 
 
 
 
 
 
 
 
 
 
Here's the beginning of the corresponding output file: 
 
 
 
 
 
 
 
 
 
 
 
The offset of a record is the position of the first character of that record within the GIS record file.  Implementing this 
functionality will require solving some small puzzles that are related to the rest of the assignment: 
 
 how to determine whether a file exists and can be opened for reading 
 how to determine the offset of the current character in a text file 
 how to read a whole line of text at once from a text file 
 how to break a delimited string into its components 
 how to properly close a file when you are finished processing it[2] 
 
The following Java classes provide functionality that was useful in the reference solution: 
 
 File 
 RandomAccessFile 
 FileWriter 
 String 
 Scanner 
 Formatter 
 
That is not to say that other Java classes could not play a role, or that alternative approaches might not use these classes.  It is 
absolutely not necessary to parse the input data character-by-character in order to achieve the specified results; avoiding that 
approach will not only lead you to a more efficient solution, but will also leave you with a better understanding of how to 
make use of the Java library. 
 
FEATURE_ID|FEATURE_NAME|FEATURE_CLASS|STATE_ALPHA|STATE_NUMERIC|COUNTY_ . . . 
885513|Siegrest Draw|Valley|NM|35|Eddy|015|323815N|1043256W|32.6376116| . . . 
885526|AAA Tank|Reservoir|NM|35|Eddy|015|321043N|1041456W|32.1786543|-1 . . . 
885566|Adobe Draw|Valley|NM|35|Eddy|015|322820N|1042141W|32.4723375|-10 . . . 
885567|Adobe Flat|Flat|NM|35|Eddy|015|322849N|1042119W|32.4803932|-104. . . . 
885607|Alacran Hills|Range|NM|35|Eddy|015|322812N|1041055W|32.4701183|- . . . 
885684|Alkali Lake|Lake|NM|35|Eddy|015|323039N|1041133W|32.5109371|-104 . . . 
 
NM_EddyKnown.txt contains the following records: 
 
     265   Siegrest Draw    
     425   AAA Tank 
     553   Adobe Draw 
     710   Adobe Flat 
     829   Alacran Hills 
     952   Alkali Lake 
 
CS 3114 Data Structures & Algorithms  Project 1: File Operations and Parsing 
 
Version 7.10 This is a purely individual assignment! 5 
It's not necessary to align your output neatly, but it's good practice, and not difficult, to do so. 
 
 
Performing Record Searches [70%] 
 
java GISParser -search    
 
Invoked in the second way, your program will process the given commands file (and NOT write out a list file offsets and 
Feature Name values), and write its output to a file named by the final parameter.  Each command must be echoed into the 
output file, on a line by itself, numbered as shown in the posted log files.  Following each echoed command, your program 
will write one line reporting the results of carrying out that command, as described below. 
 
The first input file will be a GIS record file, as described above.  The second input file, whose name will also be supplied on 
the command line, contains a sequence of search commands that must be processed.  The only types of search that must be 
supported are: 
 
show_name 
 
If the offset is valid (see below), write the Feature Name field for the record that occurs at that offset 
 
show_latitude 
show_longitude 
 
If the offset is valid (see below), write the primary latitude or primary longitude, as specified, for the record that 
occurs at that offset.  The specified fields should be separated by whitespace (your choice as to what).  If the 
specified field is not included in the record, write "Coordinate is not given". 
 
The Primary Latitude and Longitude fields are given in the GIS records in two formats, DMS and DEC.  You should 
parse the DMS version.  Your output must reformat the latitude or longitude in a human-friendly manner.  
Specifically: 
 
1090224W    109d 2m 24s West 
 
Note that latitude values will always consist of 6 decimal digits, followed by a hemisphere designator ('N' or 'S'), and 
longitude values will always consist of 7 decimal digits, followed by a hemisphere designator ('W' or 'E').  The Java 
String and Scanner classes both have useful methods for breaking out the relevant parts. 
 
show_elevation 
 
If the offset is valid (see below), write the elevation in feet field for the record that occurs at that offset.  One 
complication is that the elevation fields are optional; if the elevation is not given, write "Elevation is not given". 
 
What about invalid offsets? 
 
 If the offset is not positive, write the error message “Offset is not positive”. 
 If the offset is larger than the length of the data file, write the error message “Offset is too large”. 
 If the offset is non-negative but does not correspond to the first character on a line of the file that contains a 
GIS record, write the error message “Offset is unaligned”. 
 
The only other command is: 
 
quit 
 
Cease processing the commands file, log the message “Exiting”, close all files and exit the program. 
 
Each command will occur on a line by itself.  Lines beginning with a semi-colon character ';' are comments and should be 
ignored.  Blank lines are possible.  The command file is guaranteed to conform to this specification, so you do not need to 
worry about error-checking when reading it.  Here is a short sample commands file: 
  
CS 3114 Data Structures & Algorithms  Project 1: File Operations and Parsing 
 
Version 7.10 This is a purely individual assignment! 6 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
Here's a corresponding output sample: 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
Additional sample input and output files will be supplied on the course website, along with a program to generate additional 
samples, and a grading harness. 
 
Under no circumstances may your program store more than one complete GIS record in memory at any given instant, 
nor is your program allowed to store a collection of file offsets for the records in the file. 
 
 
A Look to the Future 
 
You may have noticed (at least) one bit of weirdness in the commands:  it's unrealistic to think that a user who wanted to find 
a specific record in a data file would already know the offset at which the record occurred in the file.  It's much more realistic 
to expect a user will want to find a record that matches a given criterion.  For example: 
 
 Show me a record that contains the feature name "Blacksburg". 
 Show me a record that matches the coordinates (32 28 49N, 104 25 15W). 
 
In order to satisfy queries like these, we would want to have indexing capabilities.  All and index does is map a key value to a 
location (or set of locations) where matching records can be found.  We could answer the first sort of query if we had an index 
that supplied a file offset (or set of file offsets) that matched the feature name "Blacksburg". 
 
A future project in this course will involve building indexing features. 
 
 
  
; CS 3114:  test script for GIS file parsing project. 
; 
; Report feature name: 
show_name 634 
;   
show_longitude 9778 
;   
; Report feature elevation: 
show_elevation 1749 
;   
; Report feature latitude: 
show_latitude 11976 
; 
; Exit program 
quit 
 
1: show_name   634 
   Carlsbad 
2: show_longitude   9778 
   104d 13m 34s West 
3: show_elevation   1749 
   Unaligned offset 
4: show_latitude   11976 
   Offset is too large 
5: quit 
   Exiting 
CS 3114 Data Structures & Algorithms  Project 1: File Operations and Parsing 
 
Version 7.10 This is a purely individual assignment! 7 
Test Data Generator and Output Comparator 
 
The course website supplies a tar file with two useful tools for testing your solution.  Both tools are packaged as executable 
Java jar files, and require that you have installed the Java JDK[3] version 11.   
 
The first tool, gisGenerator.jar, can be used to create test data and reference results.  The tool is designed to be 
executed from a command-line environment.  Here is a sample session, using a Windows 10 terminal window, that shows the 
tool at work.  The same approach works on a Linux system.  First, put the jar file, and the supplied GIS data master files into 
a directory: 
 
 
 
 
 
 
 
The generator has hard-coded dependencies on the two data files shown in the display above, so those files must be present in 
the same directory when you run the generator. 
 
You can execute the jar file by invoking the Java VM from the command line; in this case, the tool will display help regarding 
the command-line interface: 
 
 
 
 
 
 
 
Running the tool with valid command-line parameters will produce test data and reference solution files: 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
The created files are: 
 
gisDB01.txt GIS records extracted from the two master files 
gisCommands01.txt search commands suitable for using with the previous file 
RefOffsets.txt (annotated) correct offsets for records in gisDB01.txt 
Refresults.txt (annotated) correct results from performing the generated searches 
 
The reference files are annotated with point values used by the comparison tool that is described next. 
 
  
Z:\J1.GISParser\testing > dir 
 
06/16/2020  10:23 PM            17,831 GISGenerator.jar 
06/09/2020  08:21 PM           218,541 NM_EddyKnown.txt 
06/09/2020  08:30 PM             7,021 NM_EddyUnknown.txt 
Z:\J1.GISParser\testing > java -jar GISGenerator.jar 
Invocation:  java -jar GISGenerator.jar dbfilename commandsfilename [-repeat] 
   Creates database file and commands file for testing GISParser. 
   Creates reference file of record offsets and feature names. 
   Creates reference log file showing command processing results. 
Z:\J1.GISParser\testing > java -jar GISGenerator.jar gisDB01.txt gisCommands01.txt 
 
 
Z:\J1.GISParser\testing >dir 
 
06/17/2020  07:37 PM             1,191 gisCommands01.txt 
06/17/2020  07:37 PM            12,427 gisDB01.txt 
06/16/2020  10:23 PM            17,831 GISGenerator.jar 
06/09/2020  08:21 PM           218,541 NM_EddyKnown.txt 
06/09/2020  08:30 PM             7,021 NM_EddyUnknown.txt 
06/17/2020  07:37 PM             3,373 RefOffsets.txt 
06/17/2020  07:37 PM             1,489 RefResults.txt 
CS 3114 Data Structures & Algorithms  Project 1: File Operations and Parsing 
 
Version 7.10 This is a purely individual assignment! 8 
The second tool, LogComparator.jar, can be used to compare output files produced by your solution to reference 
results, and generate scores.  The tool is designed to be executed from a command-line environment.  Here is a sample 
session, using a Windows 10 terminal window, that shows the tool at work.  The same approach works on a Linux system.  
First, put the jar file, and the relevant GIS reference files and your output files into a directory: 
 
 
 
 
 
 
 
 
 
 
 
In this case, the stu* files were produced by my solution, using the test data files produced earlier by the generator tool.  
Executing the comparison tool with no parameters displays instructions: 
 
 
 
 
 
 
 
 
Invoking the comparison tool on the two offsets files[4]: 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
The tool writes its results to the terminal window (which simplifies use in a script); you can use redirection to save the output 
to a file and then open the file in a text editor.  If there were any deductions, those would be shown in the output, next to the 
lines that did not match.  You can then examine the failures and use those to debug your solution. 
 
The reference output files are annotated with a point value for each line: 
 
 
 
 
 
 
 
 
 
 
 
Your output should NOT include point annotations.  Sample GIS and command files will be supplied on the course website, 
along with the corresponding correct results.  You should test your solution with each of those samples.  There is no guarantee 
these will cover all the logical cases, so you should use the supplied tools to create additional test files. 
 
 
[ 0] RefOffsets.txt contains the following records: 
[ 0]  
[40]      265   10231 Water Well 
[10]      389   Cole Place 
[10]      508   Hughes Ranch 
[10]      628   10216 Water Well 
[10]      752   05036 Water Well 
. . . 
Z:\J1.GISParser\testing > dir 
 
06/03/2017  10:38 PM             5,140 LogComparator.jar 
06/17/2020  07:37 PM             3,373 RefOffsets.txt 
06/17/2020  07:37 PM             1,489 RefResults.txt 
06/17/2020  08:33 PM             2,870 stuOffsets01.txt 
06/17/2020  08:34 PM             1,227 stuSearches01.txt 
              11 File(s)        286,942 bytes 
               2 Dir(s)  45,703,204,864 bytes free 
Z:\Fall2020\3114\Projects\J1.GISParser\testing > java -jar LogComparator.jar 
Invoke:  java -jar LogComparator.jar    
   The reference results file should be created with the posted generator tool. 
   The student results file should be created by your solution, using the. 
       the input file corresponding to the reference results file. 
   The order of the file names on the command line does matter. 
Z:\Fall2020\3114\Projects\J1.GISParser\testing > java -jar LogComparator.jar 1 
RefOffsets.txt stuOffsets01.txt 
Maximum score 700 
           gisDB01.txt contains the following records: 
 
                265   10231 Water Well 
                389   Cole Place 
                508   Hughes Ranch 
                628   10216 Water Well 
                752   05036 Water Well 
. . . 
 
 1 >> Score: 700.00 / 700 
CS 3114 Data Structures & Algorithms  Project 1: File Operations and Parsing 
 
Version 7.10 This is a purely individual assignment! 9 
Compiling and Running Your Solution from the Command Line 
 
It's easy to execute your Java program from the command line (terminal window, if you prefer).  The example here was 
created using the Windows 10 terminal, but the procedure is essentially identical in Linux.  This does assume that you have 
installed the Java JDK. 
 
To compile, simply invoke the Java compiler (javac) on your source files.  I produced one solution that did not use 
packages; in Eclipse terminology, each of my classes is in the default package.  I can then compile with the command:  
javac *.java: 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
I then copied the test input files into the same directory and invoked the Java VM on my (compiled) top-level class: 
 
 
 
 
 
 
 
 
 
 
 
 
Now, I could use the comparison tool described earlier to compare my output to the reference output. 
 
 
Grading 
 
Solutions to this assignment will be graded using Java 11, as installed on the rlogin cluster.  The evaluation will be performed 
using command-line tools only.  You should consult the tutorials on the course website, as well as the course staff, when you 
have questions. 
 
You may develop your solution on Windows or Linux, as you like, and you may use Eclipse or any other IDE you like.  But, 
it is your responsibility to ensure that you have tested your code with the posted testing/grading code on Windows or CentOS 
8 using Java 11.  Failure to do that is likely to result in a score of zero on this assignment. 
 
 
  
Z:\J1.GISParser\soln>dir /W *.java 
 
LongitudeHemi.java      Longitude.java          cmdParser.java 
. . . 
GISParser.java 
 
Z:\J1.GISParser\soln>javac *.java 
Z:\J1.GISParser\soln>dir /W /A-D 
 
LongitudeHemi.java       Longitude.java           cmdParser.java  
GISParser.java           GISParser.class          LatitudeHemi.class 
Longitude.class          cmdParser.class          . . . 
Z:\J1.GISParser\soln>java GISParser -index gisDB01.txt stuOffsets01.txt 
 
Z:\J1.GISParser\soln>java GISParser -search gisDB01.txt gisCmds01.txt stuSearches01.txt 
 
Z:\J1.GISParser\soln>dir *.txt 
 
01/11/2022  01:45 PM            12,030 gisDB01.txt 
01/11/2022  01:45 PM               949 gisCmds01.txt 
01/11/2022  02:12 PM             2,836 stuOffsets01.txt 
01/11/2022  02:13 PM             1,089 stuSearches01.txt 
CS 3114 Data Structures & Algorithms  Project 1: File Operations and Parsing 
 
Version 7.10 This is a purely individual assignment! 10 
Grading Code 
 
The website has a link to a zip file containing tools you can use to test and grade the correctness of your solution.  In fact, 
these are the same tools that we will use to do the grading of your solution (as described earlier). 
 
readme.txt explains how this all works 
gradeJ1.py Python script that runs the grading tools[2] 
J1Tools.zip zip file containing: 
   GISGenerator.jar  Java test data generator tool 
   NM_EddyKnown.txt  GIS record input files used by the generator tool 
   NM_EddyUnknown.txt  
   LogComparator.jar  comparison tool that scores your results 
 
Since this is the exact code that will be used when we grade your solution, it is imperative that you make good use of this 
code in your own testing.  See the readme.txt file for detailed instructions. 
 
 
Documentation 
 
You should document your implementation in accordance with the Programming Standards page on the course website.  It is 
possible that your implementation will be evaluated for documentation, as well as for correctness of results.  If so, your 
submission will be evaluated by one of the TAs, who will assess a deduction (ideally zero) against your score from the 
grading code. 
 
Note that the evaluation of your project may depend substantially on the quality of your code and documentation. 
 
 
Design Considerations  
 
You should apply good object-oriented design principles in your project. Think through object responsibilities and 
interactions, and sketch out your design before you start coding.  The most common design shortcomings with an assignment 
like this are to identify a too-small set of candidate classes, or to adopt a minimal design in order to reduce coding time.  As 
inspiration, I will tell you that my solution incorporates 8 distinct classes and 2 enumerated types, all of which play important 
roles within the requirements of the assignment, and also in my solution for a future assignment. 
 
Keep in mind that later projects in this course may build on this one. For example, it is likely that in a later project some other 
part of the GIS database system will need to actually do something with various fields of the GIS records that are retrieved in 
this assignment. 
 
You should consider what possible errors might be encountered, and which ones you should check for.  One common 
example is the validation of command-line parameters.  We will not test your solution with invalid parameters, but in reality 
it's fairly common for a user to supply incorrect, or no, parameters.  It's simple to test for the existence of files, and log an 
error message and exit cleanly if expected files do not exist.  As mentioned earlier, a GIS record may not specify a value for 
every field.  In other cases, but not this assignment, logically invalid values may be supplied; for example, a character string 
(e.g., "Fred") where a numeric value is expected.  A program that crashes, or even just computes nonsensical results, in such a 
situation looks very unprofessional. 
 
Finally, you should be careful about designing your solution to catch exceptions and write useful diagnostics to standard 
output if an exception is caught, especially if it cannot be handled internally.   
 
Checking for unexpected input, even if not strictly required, is simply good practice.  So is catching exceptions, and handling 
them internally; writing a program that crashes makes you look unprofessional.  Providing meaningful error messages when 
catching an exception, or detecting input-related problems, also helps you when debugging. 
 
 
  
CS 3114 Data Structures & Algorithms  Project 1: File Operations and Parsing 
 
Version 7.10 This is a purely individual assignment! 11 
What to Submit 
 
For this assignment, you must submit a zip file containing all the Java source code files for your implementation (i.e., .java 
files).  Submit only the Java source files.  Do not submit Java bytecode (i.e., .class) files.  If you use packages in your 
implementation (and that's good practice), your zip file must include the correct directory structure for those packages.  That's 
easy to verify by running the supplied grading script on the zip file you are planning to submit. 
 
This assignment will be auto-graded using Java SE 11, and the posted testing/grading code.  Using a different version of the 
Java compiler/libraries, whether older or newer, may very well result in your solution failing testing. 
 
Your submitted zip file will be placed in the appropriate subdirectory with the packaged test code, and will then be evaluated 
by running the supplied testing script.  We will make no accommodations for submissions that do not work with that script. 
 
Warning:  the requirement here is for a zip file, based on the fact that there is a standard utility for creating zip files on Linux 
systems.  See "man zip" and "man unzip" for details, and that it's trivial to create a zip file in Windows.  We will not accept 
files in any other format (e.g., tar files, 7-zip'd files, gzip'd files, jar files, rar files, …).  Such submissions will NOT work with 
the supplied script, and will be discarded when we run the grading code ourselves. 
 
Instructions, and the appropriate link, for submitting to the Curator are given in the Student Guide at the Curator website:  
 
http://www.cs.vt.edu/curator/ 
 
You will be allowed to submit your solution several times, in order to make corrections.  Your score will be determined by 
testing your last submission.  The Curator will not be used to grade your submissions in real time, but you will have already 
done the grading using the posted code. 
 
 
Some Java Gotchas 
 
A character may not be a 1-byte character 
 
Be careful of how Java deals with the notion of a "character".  Some methods do not work in the manner you might expect.  
For example, here's the beginning of the Oracle documentation for the method readChar() in the very useful 
RandomAccessFile class: 
 
public final char readChar() 
                    throws IOException 
Reads a character from this file. 
 
So far, so good; looks like you could use this to read a single character from a text file… but here's the rest of the 
documentation: 
 
This method reads two bytes from the file, starting at the current file pointer. If the 
bytes read, in order, are b1 and b2, where 0 <= b1, b2 <= 255, then the result is equal 
to: 
     (char)((b1 << 8) | b2) 
 
So, this method deals with two-byte representations of characters, which is certainly not what we have with an ASCII input 
file.  The writeChar() method writes two-byte representations of a character.  Neither is appropriate for this project. 
 
A quick examination of the other methods in RandomAccessFile will identify some alternatives that are more suitable for 
your needs: 
 read() in its various incarnations 
 writeByte() and writeBytes() 
 readLine() 
 
Do not assume that there are no other, equally useful, methods.  Similar comments apply to other useful classes, like 
FileWriter.  Moral:  be careful of the methods supplied by classes in the Java library.  Always check the documentation, 
carefully. 
 
CS 3114 Data Structures & Algorithms  Project 1: File Operations and Parsing 
 
Version 7.10 This is a purely individual assignment! 12 
Escaping the pipe 
 
Each GIS record is represented as a single line in the GIS data files.  The most efficient approach is to grab the whole line as a 
String, and then decompose the String.  That can be accomplished in several ways, including using the various next() 
methods in the Scanner class and using the split() method in the String class.  Either way, you'd need to specify that 
the delimiter between fields is a pipe character '|'.  That turns out to be a little subtle. 
 
Scanner objects use delimiters specified by a Pattern object set by a String; so does split().  In the syntax of the 
Pattern class, the pipe character has special meaning (logical OR if you care).  Therefore "\" will not work.  An "escape" 
is required.  But, "\" is also special in String literals, so "\|" won't work either.  The correct syntax would be something 
like "\\|".  Note the double backslashes! 
 
 
Pledge 
  
Each of your program submissions must be pledged to conform to the Honor Code requirements for this course.  Specifically, 
you must include the following pledge statement at the beginning of the file that contains main(): 
 
//    On my honor: 
//    
//    - I have not discussed the Java language code in my program with 
//      anyone other than my instructor or the teaching assistants 
//      assigned to this course. 
//    
//    - I have not used Java language code obtained from another student, 
//      or any other unauthorized source, including the Internet, either 
//      modified or unmodified.  
//    
//    - If any Java language code or documentation used in my program 
//      was obtained from another source, such as a text book or course 
//      notes, that has been clearly noted with a proper citation in 
//      the comments of my program. 
//    
//    - I have not designed this program in such a way as to defeat or 
//      interfere with the normal operation of the supplied grading code. 
// 
//     
//     
 
I reserve the option of assigning a score of zero to any submission that is undocumented or 
does not contain this statement. 
 
Change Log 
 
Version Date Page Change(s) 
7.00 Jan 18  Base version. 
7.10 Jan 20 7 Corrected invocations of GISGenerator.jar in two shell displays. 
    
 
  
CS 3114 Data Structures & Algorithms  Project 1: File Operations and Parsing 
 
Version 7.10 This is a purely individual assignment! 13 
Appendix: Using Packages 
 
I also produced a solution using packages for the logical components of my design, and not placing my top-level class 
(GISParser) into a package.  In Eclipse, GISParser would be in the default package.  Since all the components in the 
subdirectories are imported by my top-level class, GISParser, I just need to invoke the Java compiler on that file: 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
Then, I can copy the test input files into the same directory and invoke the Java VM on my (compiled) top-level class: 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
Finally, I can use the comparison tool described earlier to compare my output to the reference output, as shown before. 
 
 
  
Z:\J1.GISParser\soln > dir 
 
06/11/2020  08:19 PM              DS 
06/11/2020  08:24 PM             6,604 GISParser.java 
 
Z:\J1.GISParser\soln > tree 
 
Z:. 
└───DS 
    └───J1 
        ├───Components 
        └───Types 
 
Z:\J1.GISParser\soln > javac GISParser.java 
 
Z:\J1.GISParser\soln > dir 
06/11/2020  08:19 PM              DS 
06/17/2020  09:40 PM             5,385 GISParser.class 
06/11/2020  08:24 PM             6,604 GISParser.java 
Z:\J1.GISParser\soln >dir 
 
06/11/2020  08:19 PM              DS 
06/17/2020  09:03 PM             1,191 gisCommands01.txt 
06/17/2020  09:03 PM            12,427 gisDB01.txt 
06/17/2020  09:40 PM             5,385 GISParser.class 
06/11/2020  08:24 PM             6,604 GISParser.java 
 
Z:\J1.GISParser\soln >java GISParser -index gisDB01.txt stuOffsets01.txt 
 
Z:\J1.GISParser\soln >java GISParser -search gisDB01.txt gisCommands01.txt 
stuSearches01.txt 
 
Z:\J1.GISParser\soln >dir 
 
06/11/2020  08:19 PM              DS 
06/17/2020  09:03 PM             1,191 gisCommands01.txt 
06/17/2020  09:03 PM            12,427 gisDB01.txt 
06/17/2020  09:40 PM             5,385 GISParser.class 
06/11/2020  08:24 PM             6,604 GISParser.java 
06/17/2020  09:53 PM             2,870 stuOffsets01.txt 
06/17/2020  09:53 PM             1,227 stuSearches01.txt 
CS 3114 Data Structures & Algorithms  Project 1: File Operations and Parsing 
 
Version 7.10 This is a purely individual assignment! 14 
Notes 
 
[1] The file format used on this site has changed a number of times since I first began using them.  For the purposes of this 
assignment, we will be using files that were selected a few years ago, and there is no guarantee these files will 
correspond to the current format. 
 
[2] Output written to a file stream using Library classes is typically buffered (stored in memory), not written to the file 
immediately.  This improves performance, and is invisible to your code.  But, if your program crashes, or fails to close 
an output file properly, it is possible that some output will be lost and never written to the file. 
 
[3] The JDK can be obtained from:  www.oracle.com/java/technologies/downloads/archive/ 
 
[4] The first parameter to the comparison tool is an integer that the automated grading script uses in labelling results.  
When running the tool manually, it doesn't matter what you use for that value, but you must use something. 
 
[2] If Python is properly installed on your Windows installation, you can run the supplied script this way: 
 
 
 
 
On CentOS, I installed Python as follows: 
 
 
 
 
 
 
I then ran the script this way: 
 
 
 
I could make the script executable, without invoking Python directly by adding the following lines at the beginning of 
the script (but that causes problems if I update to a new version of Python): 
 
#! /usr/bin/python3.9 
# 
 
 
 
 
Z:\J1.GISParser\soln > gradeJ1.py -all wmcquain.J1.1.zip J1tools.zip 
#1043 wmcquain: ~> su 
Password:  
[root@localhost wmcquain]# yum install python39 
. . . 
#1066 wmcquain: testing> python3.9 gradeJ1.py -all wmcquain.J1.1.zip J1tools.zip