Assignment: Probabilistic Text Generation with HashMap and java.util.ArrayList
You are to implement the class RandomWriterWithHashMap (in RandomWriterWithHashMap.java), a random writing application
that uses your HashMap class to map every possible seed to the list of all characters that follow it. We will be grading these live
on Wed 4-May 10:00-10:50 in a Gould Simpson lab. If you cannot attend, email the program to your section leader.
While testing your code, use any input file you want from Project Gutenberg. Use your HashMap class,
java.util.ArrayList, and the following algorithm.
1. Read all file input into one big string (already done in RandomWriterWithHashMap.java)
2. Create a HashMap object that has all possible seeds of the given length as the key and an ArrayList of
followers as the value. (You do this)
3. Pick a random seed from the original text. (already done in RandomWriterWithHashMap.java)
4. For each character you need to print (You do this)
   Randomly select one of the characters in the list of followers
   Print that random character
   Change the seed so the first character is gone and the just printed random character is appended
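The seed update in step 4 is a one-character sliding window over the generated text. A minimal sketch, assuming the seed is kept as a String; the class name SeedShiftSketch and the method nextSeed are made up for illustration and are not part of the starter code:

```java
class SeedShiftSketch {
    // Drop the first character of the seed and append the character just printed,
    // so the seed keeps the same length.
    static String nextSeed(String seed, char printed) {
        return seed.substring(1) + printed;
    }

    public static void main(String[] args) {
        String seed = "li";
        seed = nextSeed(seed, 'k');
        System.out.println(seed); // prints ik
        seed = nextSeed(seed, 'e');
        System.out.println(seed); // prints ke
    }
}
```

Because the new seed always ends with a character that really followed the old seed somewhere in the text, each printed character continues a sequence that occurs in the original.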
Here is a class that uses our familiar HashMap class. It is intended to provide a start to the second part of the
Project: RandomWriterWithHashMap. The method makeTheText reads all of the text from an input file into one big
StringBuilder object. The StringBuilder class has many of the methods of the String class plus a very efficient append
method that we recommend you use. The class also has code to pick the initial random seed. Feel free to use this
code as a start: http://www.cs.arizona.edu/people/mercer/Assignments/RandomWriterWithHashMap.java
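// The beginning of the probabilistic text generation
import java.io.File;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.Random;
import java.util.Scanner;

// HashMap here is your own HashMap class, so it is not imported from java.util
public class RandomWriterWithHashMap {

  public static void main(String[] args) {
    // Assume there is a file named alice with the text: "Alice likes icy olives"
    RandomWriterWithHashMap rw = new RandomWriterWithHashMap("alice", 2);
    rw.printRandom(100);
  }

  // Maps each seed to the list of all characters that follow it in the text
  private HashMap<String, ArrayList<Character>> all;
  private int seedLength;
  private String fileName;
  private StringBuilder theText;
  private static Random generator;
  private String seed;

  public RandomWriterWithHashMap(String fileName, int seedLength) {
    this.fileName = fileName;
    this.seedLength = seedLength;
    generator = new Random();
    makeTheText();
    setRandomSeed();
    setUpMap(); // Algorithm to be considered during class
  }

  // Read the whole input file into theText, with one space after each line
  private void makeTheText() {
    Scanner inFile = null;
    try {
      inFile = new Scanner(new File(fileName));
    }
    catch (FileNotFoundException e) {
      // Nothing can be generated without the input file, so stop here rather
      // than fail below with a NullPointerException
      throw new IllegalArgumentException("Cannot open input file: " + fileName, e);
    }
    theText = new StringBuilder();
    while (inFile.hasNextLine()) {
      theText.append(inFile.nextLine().trim());
      theText.append(' ');
    }
    inFile.close();
  }

  // Pick a seed of seedLength characters that has at least one follower
  public void setRandomSeed() {
    generator = new Random();
    int start = generator.nextInt(theText.length() - seedLength);
    seed = theText.substring(start, start + seedLength);
  }

  // setUpMap() and printRandom(int n) are the two methods you write
}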
Consider algorithms for these two methods, both of which use the put and get methods of HashMap.
public void setUpMap() (example text: "Alice likes icy olives")
Added after lecture: Something like this was drawn on the whiteboard to explain what must be done in
setUpMap. Using "Alice likes icy olives", create a HashMap of seed/list mappings where the seed is a string of
length 2 and the value mapped to each key is a list of ALL characters that follow that seed in the original text.
Seed    List toString()
"Al"    [i]
"li"    [c, k]   // 'v' to be added later
"ic"    [e, y]
"ce"    [ ]      This ArrayList has one element (that you cannot see): a space ' '
"e "    [l]
" l"    [i]
"li"             "li" is already a key, add follower 'k' to the list already mapped to the key "li"
"ik"    [e]
"ke"    [s]
"es"    [ ]      This ArrayList has one element: a space ' '. Another ' ' should be added later
"s "    [i]
" i"    [c]
"ic"             "ic" is already a key, add follower 'y' to the list already mapped to the key "ic"
...     ...      8 more possible seeds to map
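The table above can be turned into code roughly as follows. This is only a sketch under stated assumptions: it uses java.util.HashMap as a stand-in for your own HashMap class and calls nothing but get and put on it, and the names FollowerMapSketch and buildMap are hypothetical. The example text ends with a space because makeTheText appends one after every line.

```java
import java.util.ArrayList;
import java.util.HashMap; // stand-in for your own HashMap class (assumption)

class FollowerMapSketch {
    // Hypothetical helper: map every seed of the given length to the list of
    // ALL characters that follow it in the text, duplicates included.
    static HashMap<String, ArrayList<Character>> buildMap(CharSequence text, int seedLength) {
        HashMap<String, ArrayList<Character>> all = new HashMap<>();
        // Stop early enough that every seed still has one character after it.
        for (int i = 0; i + seedLength < text.length(); i++) {
            String seed = text.subSequence(i, i + seedLength).toString();
            char follower = text.charAt(i + seedLength);
            ArrayList<Character> followers = all.get(seed);
            if (followers == null) {
                // First time this seed is seen: map it to a new, empty list.
                followers = new ArrayList<>();
                all.put(seed, followers);
            }
            // New or existing key, the follower goes on the end of its list.
            followers.add(follower);
        }
        return all;
    }

    public static void main(String[] args) {
        HashMap<String, ArrayList<Character>> all = buildMap("Alice likes icy olives ", 2);
        System.out.println(all.get("li"));        // prints [c, k, v]
        System.out.println(all.get("ic"));        // prints [e, y]
        System.out.println(all.get("es").size()); // prints 2 (two spaces)
    }
}
```

Keeping duplicate followers in the list is what makes the later random choice probabilistic: a character that follows a seed twice as often is twice as likely to be picked.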
public void printRandom(int n)
Use this algorithm from page 1
For each character you need to print
   Randomly select one of the characters in the list of followers (using current seed such as " i")
   Print that random character
   Change the seed so the first character is gone and the just printed random character is appended
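One possible shape for this loop is sketched below. The assumptions: java.util.HashMap stands in for your own HashMap class, the method returns the generated text instead of printing it so that it can be checked, and the names PrintRandomSketch, buildFollowers, and generate are hypothetical. The helper buildFollowers exists only so the example runs on its own; it uses computeIfAbsent, which your HashMap class may not provide.

```java
import java.util.ArrayList;
import java.util.HashMap; // stand-in for your own HashMap class (assumption)
import java.util.List;
import java.util.Map;
import java.util.Random;

class PrintRandomSketch {
    // Setup helper for this example only: seed -> list of all followers.
    static Map<String, List<Character>> buildFollowers(String text, int k) {
        Map<String, List<Character>> all = new HashMap<>();
        for (int i = 0; i + k < text.length(); i++) {
            all.computeIfAbsent(text.substring(i, i + k), s -> new ArrayList<>())
               .add(text.charAt(i + k));
        }
        return all;
    }

    // Generate up to n characters, starting from the given seed.
    static String generate(Map<String, List<Character>> all, String seed, int n, Random generator) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < n; i++) {
            List<Character> followers = all.get(seed);
            if (followers == null) {
                // The seed only occurs at the very end of the text, so nothing
                // follows it. This sketch stops; picking a new random seed is
                // another option.
                break;
            }
            // Randomly select one follower, record it, then slide the seed.
            char next = followers.get(generator.nextInt(followers.size()));
            out.append(next);
            seed = seed.substring(1) + next;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String text = "Alice likes icy olives ";
        Map<String, List<Character>> all = buildFollowers(text, 2);
        System.out.println(generate(all, " i", 100, new Random()));
    }
}
```

The null check matters for real input files: the last seedLength characters of the text may form a seed that never appears with a follower, and the generated text can wander into it.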
Sample output (does it look a bit like the original text?):
cy olikes ice lice lives ice likes icy olice lice
lives icy olives ice lice lives icy olives icy oli
Grading Criteria
___/ +20 Generates text that gets closer to the original as the seed length increases (subjective). For example, when
seed length = 2, a few words may appear; but when it is 12, some sentences appear close to the original text
-20 If no text is generated with a printRandom(400) message
-20 If you did not use a HashMap and the algorithm presented in the project that uses a Map to set up all
seeds and the list of followers for each
-19 If text has no apparent difference with different seed lengths or file input