More Input File Processing

Using the Scanner Class
• Review Token by Token Processing
• Line by Line Processing
• String Scanner
Exception Handling
• throws clause
• try catch block

Scanner Review – Token Based Processing
• We can use the Scanner class to read from (and tokenize) the standard input stream, System.in:
  – Scanner console = new Scanner(System.in);
  – Prompt user to enter data
• We can use the Scanner class to read (and tokenize) a file:
  – Scanner input = new Scanner(new File("data.txt"));
• The same Scanner methods are used in both cases to look for and read in tokens

Calculate the Average of all doubles in a file

    import java.util.*;
    import java.io.*;

    public class DoubleAverage {
        public static void main(String[] args) throws FileNotFoundException {
            userInterface();
        }

        public static void userInterface() throws FileNotFoundException {
            Scanner console = new Scanner(System.in);
            Scanner fileScanner = getInputScanner(console);
            System.out.println("Average: " + processFile(fileScanner));
            fileScanner.close();
        }
        …
        …
    }

    // getInputScanner method – option 1 – exception thrown if file not found
    public static Scanner getInputScanner(Scanner console) throws FileNotFoundException {
        System.out.print("input file name? ");
        String name = console.next();
        Scanner fileScanner = new Scanner(new File(name));
        return fileScanner;
    }

    // getInputScanner method – option 2 – exit without exception if file not found
    public static Scanner getInputScanner(Scanner console) throws FileNotFoundException {
        System.out.print("input file name? ");
        String name = console.next();
        Scanner fileScanner = null;
        File f = new File(name);
        if (f.exists()) {
            fileScanner = new Scanner(f);
        } else {
            System.out.println("File: " + name + " not found");
            System.exit(1);
        }
        return fileScanner;
    }

Calculate the double average (cont.)
getInputScanner – reprompt

    public static Scanner getInputScanner(Scanner console) throws FileNotFoundException {
        Scanner fileScanner = null;
        while (fileScanner == null) {
            System.out.print("input file name? ");
            File f = new File(console.next());
            if (f.exists()) {
                fileScanner = new Scanner(f);
            } else {
                System.out.println("File not found. Please try again.");
            }
        }
        return fileScanner;
    }

    public static Scanner getInputScanner(Scanner console) throws FileNotFoundException {
        System.out.print("Enter a file name to process: ");
        File file = new File(console.next());
        while (!file.exists()) {
            System.out.print("File doesn't exist. " + "Enter a file name to process: ");
            file = new File(console.next());
        }
        Scanner fileScanner = new Scanner(file);
        return fileScanner;
    }

Calculate the double average (cont.)

    public static double processFile(Scanner input) {
        double sum = 0;
        int count = 0;
        while (input.hasNext()) {
            if (input.hasNextDouble()) {
                sum += input.nextDouble();
                count++;
            } else {
                input.next();   // skip any token that is not a double
            }
        }
        return sum / count;     // NaN if the file contains no doubles
    }

Another Token-Based File Processing Example
Let’s write a program HoursWorked1 that processes a file hours1.txt, which contains employees and the hours they worked each week. We want to calculate the total hours worked by each employee.

Input file hours1.txt:

    Erica 7.5 8.5 10.25 8 8.5
    Erin 10.5 11.5 12 11 10.75
    Simone 8 8 8
    Ryan 6.5 8 9.25 8
    Kendall 2.5 3

Solution

Modified Input File

    101 Erica 7.5 8.5 10.25 8 8.5
    783 Erin 10.5 11.5 12 11 10.75
    114 Simone 8 8 8
    238 Ryan 6.5 8 9.25 8
    156 Kendall 2.5 3

What if our input file is now hours2.txt and it also contains an employee id as the first token?
Flawed Solution
Applying the same token-based loop to hours2.txt produces flawed output.

Reason for Flawed Solution
The inner while loop is grabbing the next person's ID. An ID such as 783 is itself a valid double, so hasNextDouble() returns true and the ID is added to the previous employee's hours.

    101 Erica 7.5 8.5 10.25 8 8.5\n783 Erin 10.5 11.5 12 11 10.75\n…

• The token-based Scanner methods look at the file as one continuous stream of input; line breaks are treated like any other whitespace.
• We want to process the tokens, but we also care about the line breaks (they mark the end of a person's data).

Better Solution

    101 Erica 7.5 8.5 10.25 8 8.5      <- Read this line first and process it token by token
    783 Erin 10.5 11.5 12 11 10.75
    114 Simone 8 8 8
    238 Ryan 6.5 8 9.25 8
    156 Kendall 2.5 3

A hybrid approach:
1. First, break the overall input into lines.
2. Then break each line into tokens.

Line-Based Processing
• Processing input line by line
  – Use the nextLine() and hasNextLine() methods
  – nextLine() takes all of the text up to the new line character (\n)
• Preserves white space and line breaks of the text being processed:
  – Processing a poem or formatted text
  – Counting lines in a file, words in a line

Counting Lines

    // Counts the number of lines in the input.txt file
    Scanner input = new Scanner(new File("input.txt"));
    int count = 0;
    while (input.hasNextLine()) {
        String line = input.nextLine();
        count++;
    }
    System.out.println("File has " + count + " lines");

• What if we want to count the words on a line?
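One way to answer this question, and to apply the hybrid approach to records like those in hours2.txt, is to give each line its own Scanner. The sketch below is illustrative only: the class name HybridLineSketch, the helpers countWords and totalHours, and the printed format are assumptions and do not come from the slides. A String stands in for the file so the example runs on its own.

```java
import java.util.*;

public class HybridLineSketch {
    // Counts the whitespace-separated tokens on a single line.
    public static int countWords(String line) {
        Scanner lineScanner = new Scanner(line);
        int count = 0;
        while (lineScanner.hasNext()) {
            lineScanner.next();
            count++;
        }
        lineScanner.close();
        return count;
    }

    // Sums the hours on one "id name hours..." record (the hours2.txt layout).
    // The Scanner only sees this one line, so it cannot run into the next id.
    public static double totalHours(String line) {
        // Locale.US is set so that "7.5" parses regardless of the default locale.
        Scanner lineScanner = new Scanner(line).useLocale(Locale.US);
        lineScanner.nextInt();   // employee id (not used in this sketch)
        lineScanner.next();      // employee name (not used in this sketch)
        double total = 0.0;
        while (lineScanner.hasNextDouble()) {
            total += lineScanner.nextDouble();
        }
        lineScanner.close();
        return total;
    }

    public static void main(String[] args) {
        // Hypothetical in-memory data standing in for the first lines of hours2.txt.
        String data = "101 Erica 7.5 8.5 10.25 8 8.5\n783 Erin 10.5 11.5 12 11 10.75\n";
        Scanner input = new Scanner(data);
        while (input.hasNextLine()) {           // step 1: break the input into lines
            String line = input.nextLine();
            // step 2: break each line into tokens with a second Scanner
            System.out.println(countWords(line) + " tokens, " + totalHours(line) + " hours");
        }
        input.close();
    }
}
```

Both helpers follow the same pattern: the outer Scanner decides where a record ends, and the inner Scanner only ever sees one record.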
Scanners on Strings
A Scanner can tokenize the contents of a String:

    Scanner name = new Scanner(String);

Example:

    String text = "15 3.2 hello 9 27.5";
    Scanner scan = new Scanner(text);
    int num = scan.nextInt();
    System.out.println(num);       // 15
    double num2 = scan.nextDouble();
    System.out.println(num2);      // 3.2
    String word = scan.next();
    System.out.println(word);      // hello

Copyright 2008 by Pearson Education

Tokenizing Lines of a File
Input file input.txt:

    The quick brown fox jumps over
    the lazy dog.

Output to console:

    Line has 6 words
    Line has 3 words

• Scanner to read the file
• Scanner to tokenize the line
• Close the Scanner for the line
• Close the Scanner for the file

Return to Hours Worked Example
So how would we process hours2.txt?

    101 Erica 7.5 8.5 10.25 8 8.5
    783 Erin 10.5 11.5 12 11 10.75
    114 Simone 8 8 8
    238 Ryan 6.5 8 9.25 8
    156 Kendall 2.5 3

Corrected Solution
• Close the Scanner for the line

Line-by-Line File Processing Template

    public static void userInterface() throws FileNotFoundException {
        Scanner console = new Scanner(System.in);
        Scanner input = getInputScanner(console);
        processFile(input);
        input.close();
    }

    public static void processFile(Scanner fileScanner) {
        while (fileScanner.hasNextLine()) {
            String line = fileScanner.nextLine();
            processLine(line);
        }
    }

    public static void processLine(String line) {
        Scanner lineScanner = new Scanner(line);
        while (lineScanner.hasNext()) {
            String token = lineScanner.next();
            // process the token
        }
        lineScanner.close();
    }

Line-by-Line File Processing Template (cont.)

    public static void userInterface() throws FileNotFoundException {
        Scanner console = new Scanner(System.in);
        Scanner input = getInputScanner(console);
        processFile(input);
        input.close();
    }

    // Same template with the line processing done inside processFile
    public static void processFile(Scanner fileScanner) {
        while (fileScanner.hasNextLine()) {
            String line = fileScanner.nextLine();
            Scanner lineScanner = new Scanner(line);
            while (lineScanner.hasNext()) {
                String token = lineScanner.next();
                // process the token
            }
            lineScanner.close();
        }
    }

In-class Exercise
• Go to the Moodle page and work on the LineProcessing.java assignment.
• Your LineProcessing class must contain the following methods:
• public static void main(String[] args)
  – call the userInterface() method
• public static void userInterface()
  – create a Scanner for console input
  – pass this Scanner to the getInputScanner method
  – pass the Scanner returned from the getInputScanner method to the processFile method
• public static Scanner getInputScanner(Scanner console)
  – prompt the user for the file name of a file to process (use the console Scanner parameter)
  – create a Scanner to read from this file and return it
  – if the file does not exist, print a "file does not exist" message and reprompt
• public static void processFile(Scanner input)
  – Read each line of the input file (using the input Scanner parameter).
  – For each line:
    » Create a Scanner that gets its input from the line.
    » Calculate and print the number of tokens and the length of the longest token in the line (use the output format shown below).
• If the file contains the following text:

    Beware the Jabberwock, my son,
    the jaws that bite, the claws that catch,
    Beware the JubJub bird and shun
    the frumious bandersnatch.

  Your method should produce the following output:

    Line 1 has 5 tokens (longest = 11)
    Line 2 has 8 tokens (longest = 6)
    Line 3 has 6 tokens (longest = 6)
    Line 4 has 3 tokens (longest = 13)

Exception Handling
• Review
  – A FileNotFoundException is a “checked exception”
  – The compiler “checks” to make sure we are aware that it could happen.
  – We can choose not to handle the FileNotFoundException ourselves and declare it with a throws clause instead:

    public static void main(String[] args) throws FileNotFoundException {
        Scanner input = new Scanner(new File("data.txt"));
    }

Exception Handling (cont.)
• We can use a “try catch” block to handle the FileNotFoundException (a better idea ☺):
• See Appendix C in Textbook

    public static void main(String[] args) {
        try {
            Scanner input = new Scanner(new File("data.txt"));
            // do our processing
        } catch (FileNotFoundException e) {
            System.out.println("File: data.txt does not exist!");
        }
    }

Exception Handling (cont.)
NOTE: If method X throws a checked exception, then any method that calls X must handle the exception with either a “throws clause” or a “try catch block.”

Recall: Calculate the Average of all doubles in a file

    import java.util.*;
    import java.io.*;

    public class DoubleAverage {
        public static void main(String[] args) throws FileNotFoundException {
            userInterface();
        }

        public static void userInterface() throws FileNotFoundException {
            Scanner console = new Scanner(System.in);
            Scanner fileScanner = getInputScanner(console);
            System.out.println("Average: " + processFile(fileScanner));
            fileScanner.close();
        }
        …
        …
    }

Recall: Calculate the double average (getInputScanner)

    // getInputScanner method – reprompt option
    public static Scanner getInputScanner(Scanner console) throws FileNotFoundException {
        Scanner fileScanner = null;
        while (fileScanner == null) {
            System.out.print("input file name? ");
            String name = console.next();
            File f = new File(name);
            if (f.exists()) {
                fileScanner = new Scanner(f);
            } else {
                System.out.println("File: " + name + " not found. Please try again.");
            }
        }
        return fileScanner;
    }

Redo: Calculate the double average (getInputScanner – try/catch)

    // getInputScanner method – try/catch
    public static Scanner getInputScanner(Scanner console) {   // remove throws
        Scanner fileScanner = null;
        while (fileScanner == null) {
            System.out.print("input file name? ");
            String name = console.next();
            try {                                               // add try catch
                fileScanner = new Scanner(new File(name));
            } catch (FileNotFoundException e) {
                System.out.println("File: " + name + " not found. Please try again.");
            }
        }
        return fileScanner;
    }

Redo: Calculate the Average – try/catch

    import java.util.*;
    import java.io.*;

    public class DoubleAverage {
        public static void main(String[] args) {   // remove throws
            userInterface();
        }

        public static void userInterface() {       // remove throws
            Scanner console = new Scanner(System.in);
            Scanner fileScanner = getInputScanner(console);
            System.out.println("Average: " + processFile(fileScanner));
            fileScanner.close();
        }
        …
        …
    }

In-class Group Exercise
• Go to GitHub
• Clone the csc116-xxx-Lab14-yy repository, where yy is your team number and xxx is your section number
  – Under "Clone or download", copy the URL
  – In the exercises folder on your local machine: git clone url
• Go to the Moodle page and work on the WordCount.java assignment.
• Write a WordCount class that counts the number of words, lines, and total characters (not including whitespace) in a paper, assuming that consecutive words are separated by whitespace.
• Use try/catch (do not use the throws clause).
• Push to GitHub; do not submit to Moodle.

For example, a file with the following content:

    Students are often asked to write term papers containing a certain number of words. Counting words in a long paper is a tedious task, but the computer can help. Write a program that counts the number of words, lines, and total characters (not including whitespace) in a paper, assuming that consecutive words are separated either by spaces or end-of-line characters. Your program should be named WordCount.java.

Should result in the following output:

    Total lines = 5
    Total words = 66
    Total chars = 346
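The "use try/catch, do not use the throws clause" requirement can be seen in isolation in the small sketch below. It is not a solution to the exercise. The class name TryCatchSketch, the helper openOrNull, and the file name are hypothetical and chosen only for illustration. Because the FileNotFoundException is caught inside the helper, neither the helper nor main needs a throws clause.

```java
import java.io.*;
import java.util.*;

public class TryCatchSketch {
    // Tries to open the named file. Returns null instead of propagating the
    // checked FileNotFoundException, so callers need no throws clause.
    public static Scanner openOrNull(String name) {
        try {
            return new Scanner(new File(name));
        } catch (FileNotFoundException e) {
            System.out.println("File: " + name + " not found.");
            return null;
        }
    }

    public static void main(String[] args) {   // no throws clause needed
        // "no-such-file.txt" is a placeholder name assumed not to exist.
        Scanner input = openOrNull("no-such-file.txt");
        if (input == null) {
            System.out.println("Nothing to process.");
        } else {
            input.close();
        }
    }
}
```

Returning null is only one possible design. The reprompt loop in getInputScanner handles the same exception by asking the user again instead.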