CSCU9T4: Managing Information
Assignment 1, 2017

This assignment is to exercise your skills in programming in Java. You will read a file containing some information, and use this information to create a set of objects. In addition, in response to some user interaction, you may produce some output files. The assignment aims to test you on the material in the module CSCU9T4: that is, on the structuring and use of objects, and on the writing of resilient programs. The marking will concentrate on the structure of the program, on the documentation, and on whether it actually works! At the end of this document we provide submission information, as well as important general information about the assignment.

Here is a more detailed specification of the problem:

We have a set of files which contain the start and end times of phonemes [1] inside some speech. The files are ASCII files, and each line should (and for part 1, will) contain a start sample number, an end sample number, and a short string identifying the phoneme identity. There may be any number (including 0) of such lines. It is permissible for a line to contain additional information after the string identifying the phoneme. The strings identifying the phonemes are divided into a number of types, as shown in the table below.

Stops:        b d g p t k dx q
Closures:     bcl dcl gcl pcl tck kcl tcl
Fricatives:   s sh z zh f th v dh
Affricatives: jh ch
Nasals:       m n ng em en eng nx
Semivowels:   l r w y hh hv el
Vowels:       iy ih eh ey ae aa aw ay ah ao oy ow uh uw ux er ax ix axr ax-r
Others:       pau epi h#

Part 1:

Write a command-line based Java program that reads in one file, and creates a new instance of a class for each line read, storing the information in each line. You should create a number of classes of phoneme types, corresponding to the eight types of phonemes above. The program then creates a number of output files. Lines that have other phoneme strings should be placed in the “Others” class.
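As an illustration only, the mapping from a phoneme string to its type in the table above could be sketched as a simple lookup. The class name PhonemeTypes and the method typeOf are hypothetical, and this sketch deliberately does not show the class-per-type structure that the assignment asks you to design; it only shows how unknown strings might fall through to "Others".

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: maps a phoneme string to the name of its type. */
final class PhonemeTypes {
    private static final Map<String, String> TYPE_BY_PHONEME = new HashMap<>();

    /** Registers every space-separated phoneme string under the given type name. */
    private static void register(String type, String phonemes) {
        for (String p : phonemes.split(" ")) {
            TYPE_BY_PHONEME.put(p, type);
        }
    }

    static {
        // Contents copied from the table in the assignment text.
        register("Stops", "b d g p t k dx q");
        register("Closures", "bcl dcl gcl pcl tck kcl tcl");
        register("Fricatives", "s sh z zh f th v dh");
        register("Affricatives", "jh ch");
        register("Nasals", "m n ng em en eng nx");
        register("Semivowels", "l r w y hh hv el");
        register("Vowels", "iy ih eh ey ae aa aw ay ah ao oy ow uh uw ux er ax ix axr ax-r");
        register("Others", "pau epi h#");
    }

    /** Returns the type name for a phoneme string; unlisted strings fall into "Others". */
    static String typeOf(String phoneme) {
        return TYPE_BY_PHONEME.getOrDefault(phoneme, "Others");
    }

    public static void main(String[] args) {
        System.out.println(typeOf("sh"));
        System.out.println(typeOf("ax-r"));
        System.out.println(typeOf("xyz"));
    }
}
```

A lookup table like this keeps the list of strings in one place; how the resulting type is turned into an object of the right class is left to your own design.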
The command line should have the form

Command -s samplerate input_file output_file_stem

where samplerate is the sampling rate for the original sound (probably 16,000), input_file is the file containing the data, and output_file_stem is used as the prefix for the names of the output files. All the files should end in .dat. Thus, a sample command line might be

Command -s 16000 file4653.dat outfile4653

This would read in the file file4653.dat, use a sampling rate of 16000, and create (some of) outfile4653Stops.dat, outfile4653Closures.dat, outfile4653Fricatives.dat, outfile4653Affricatives.dat, outfile4653Nasals.dat, outfile4653Semivowels.dat, outfile4653Vowels.dat, outfile4653Others.dat. Where there are no phonemes of a particular type, no file should be produced.

It should then write out a set of (up to) 8 files with one line for each instance of an object in each class. The output files should replace the start sample number and end sample number by start time and end time in seconds (simply divide the sample number by the sample rate), using the sample rate supplied on the command line. It should also write out the number of each type of phoneme present, and the total number of phonemes.

The program should check the validity of the command line, and should cope in an appropriate manner with the input file being absent and with the output files not being creatable (i.e. produce an appropriate error message, and then terminate). You may assume that the data inside the file is valid.

In addition to a well commented program, you should provide a document briefly describing the classes you have used, and why you have used this structure. These should be placed in a subdirectory called part1.

[1] Although for this assignment what a phoneme is is not important, it might help you to have a very basic understanding of what a phoneme is: for that see https://en.wikipedia.org/wiki/Phoneme or any book on linguistics.
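The sample-to-seconds conversion and the output file naming described in Part 1 can be sketched roughly as follows. The class name SampleTimes, its method names, the whitespace-separated input format and the four-decimal output format are all assumptions made for illustration; the assignment does not prescribe them. Any additional information after the phoneme string is simply dropped in this sketch.

```java
import java.util.Locale;

/** Hypothetical helper illustrating the arithmetic and naming used in Part 1. */
final class SampleTimes {
    /** Converts a sample number to seconds by dividing by the sample rate. */
    static double toSeconds(long sampleNumber, double sampleRate) {
        return sampleNumber / sampleRate;
    }

    /** Builds an output file name from the stem and the type name, e.g. outfile4653Stops.dat. */
    static String outputFileName(String stem, String typeName) {
        return stem + typeName + ".dat";
    }

    /**
     * Rewrites one input line with times in seconds.
     * Assumes fields are separated by whitespace; extra trailing fields are ignored.
     */
    static String convertLine(String line, double sampleRate) {
        String[] fields = line.trim().split("\\s+");
        long start = Long.parseLong(fields[0]);
        long end = Long.parseLong(fields[1]);
        String phoneme = fields[2];
        return String.format(Locale.ROOT, "%.4f %.4f %s",
                toSeconds(start, sampleRate), toSeconds(end, sampleRate), phoneme);
    }

    public static void main(String[] args) {
        // 3200 / 16000 = 0.2 s and 4800 / 16000 = 0.3 s
        System.out.println(convertLine("3200 4800 sh", 16000));
        System.out.println(outputFileName("outfile4653", "Stops"));
    }
}
```

This sketch does no error handling: a malformed line would throw an exception, which is acceptable only under the Part 1 assumption that the data in the file is valid.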
Part 2:

2.1: In reality, there may be errors in the data file. Modify your program so that it does not crash when the input lines of the file are invalid, but produces an error message, and simply ignores that line.

2.2: Extend your program so that after creating the files from part 1, the program asks the user to provide a time offset. It then reads that time, checks that it is a valid time (that is, a positive number), asking for a new time if not, and then prints out the beginning and end time and the string representing the phoneme that included that time. If the time is beyond the end of the phoneme times stored, an appropriate message should be output.

Provide a 1 page description of what you have altered. The new programs and the documentation should be placed in a subdirectory called part2.

Part 3:

The people who wanted the program now want you to develop a new version for simple single-note-at-a-time musical instruments (like a flute, or a trumpet). The file then holds the beginning of the musical note, the end of the musical note, and a string representing the note played [2] (it might also hold the beginning of a silence, the end of a silence, and a string representing silence).

Without actually writing the program, discuss how you might modify and/or reuse the code from your original program. This document should be placed in a subdirectory called part3.

[2] The usual way of doing this is to provide the name of the note and the octave number, for example C4 is middle C, and C#4 is the note one semitone higher: again this is not information you actually need to do the assignment. See https://en.wikipedia.org/wiki/Piano_key_frequencies for a table (I am referring to what is called the Scientific names in this table, but without using super- or subscripts).

Submission information:

On 13 February, the automated submission system will copy some assignment files to the folder CSCU9T4/Assignment1 in your home folder. These will include some test files. Some relevant files are already in the Assignment2017 folder in groups on Wide.

This assignment is worth 40% of the final marks for the module. As well as technical accuracy (e.g. correct reading and parsing of the input file, correct writing of output file, correct implementation of the required functionality for searching, correct encryption and decryption), good programming style will also be taken into account (e.g. appropriate use of object oriented design, appropriate use of Java constructs, effective use of comment text, consistency, legibility and tidiness of program layout, suitably informative choices of variable and method names).

At 5PM on the 3rd March 2017, the automated assignment system will collect everything in CSCU9T4\Assignment1 under your home folder, i.e. your programs and documents, in the subdirectories part1, part2 and part3. If you decide to work away from the University on the assignment, make sure to place your current work back into this folder prior to collection. Although everything in this folder will be copied, any irrelevant files will be ignored.

If you cannot meet the assignment hand in deadline and have good cause, please see Professor Leslie Smith (4B85, lss@cs.stir.ac.uk) to explain your situation and ask for an extension.

University Regulations state that coursework will be accepted up to seven days after the hand in deadline (or expiry of any agreed extension) but the grade will be lowered by three grade marks per calendar day or part thereof. After seven days the work will be deemed a non-submission and will receive no grade. The further consequence of this will be a no grade for the module overall.

General Information:

During the assignment, a FAQ (Frequently Asked Questions list) will be maintained. Questions posed by the class will be answered for everyone on this page. Consult the following URL periodically in case it helps with your own problems: http://www.cs.stir.ac.uk/courses/CSCU9T4/assignments/

Work which is submitted for assessment must be your own work. You are permitted to adapt the code you worked on for the Java practicals. The penalties for plagiarism can be severe. The University has a formal policy on plagiarism which can be found at: http://www.stir.ac.uk/academicpolicy/handbook/assessmentincludingacademicmisconduct/

last updated 9 Feb 2017, LSS
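As an illustration of the time check and lookup described in Part 2.2, one possible sketch is given here. The names TimeLookup, Segment, isValidTime and find are hypothetical, and treating each phoneme as covering the half-open interval from its start time up to (but not including) its end time is an assumption, not something the specification states.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of the Part 2.2 time validation and lookup. */
final class TimeLookup {
    /** Hypothetical holder for one parsed line: times in seconds plus the phoneme string. */
    static final class Segment {
        final double start;
        final double end;
        final String label;

        Segment(double start, double end, String label) {
            this.start = start;
            this.end = end;
            this.label = label;
        }
    }

    /** Returns true if the text parses as a finite number greater than zero. */
    static boolean isValidTime(String text) {
        try {
            double value = Double.parseDouble(text.trim());
            return Double.isFinite(value) && value > 0;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    /** Returns the segment whose interval [start, end) contains the time, or null if none does. */
    static Segment find(List<Segment> segments, double time) {
        for (Segment s : segments) {
            if (time >= s.start && time < s.end) {
                return s;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<Segment> segments = Arrays.asList(
                new Segment(0.0, 0.2, "h#"),
                new Segment(0.2, 0.3, "sh"));
        Segment hit = find(segments, 0.25);
        System.out.println(hit == null
                ? "No phoneme at that time"
                : hit.start + " " + hit.end + " " + hit.label);
    }
}
```

A null result covers both a time beyond the last stored phoneme and a time falling in a gap between phonemes; a full program would report these to the user and would loop until isValidTime accepts the input.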