CSC263 Fall 2006 Assignment 2 October 18, 2006 Due Date : Thursday, October 26th at 10am Late Due Date ( - 25%) : Friday, October 29th at 10am Your work must be shown for all questions unless otherwise stated. 1 Written Question 1 (20 marks) (a) Can the black-heights of nodes in a red-black tree be maintained as fields in the nodes of the tree without affecting the asymptotic performance of any of the red-black tree operations? Show how or argue why not. (b) Can the depths of nodes in a red-black tree be efficiently maintained as fields in the nodes of the tree? Show how or argue why not (c) We wish to augrment red-black trees with an opertaion RB-ENUMERATE(x, a, b) that outputs all the keys k such that a ≤ k ≤ b in a red-black tree rooted at x. Describe how RB-ENUMERATE can be implemented in Θ(m + log n) time where m is the number of keys that are output and n is the number of internal nodes in the tree. 2 Written Question 2 (15 marks) Consider a hash table using open addressing with linear probing. Give an efficient algorithm for deletion so that deleting a key leaves the table exactly as it would have been had that key not been inserted. Prove that your algorithm is correct. Your algorithm should be as efficient as possible, although correctness is most important. 3 Written Question 3 (20 marks) Consider the following data structure for representing a set. The elements of the set are stored in a singly linked list of sorted arrays, where the number of elements in each array is a power of 2 and the sizes of all the arrays in the list are different. Each element in the set occurs in exactly one array. The arrays in the linked list are kept in order of increasing size. To perform SEARCH, perform binary search separately on each array in the list until either the desired element is found or all arrays have been considered. To insert a new element x into the set (given the precondition that x is not already in the set), perform the following algorithm: 1 create a new array of size 1 containing x insert this new array at the beginning of the linked list while the linked list contains 2 arrays of the same size merge the 2 arrays into one array of twice the size (a) What is the worst case time, using asymptotic notation, for performing SEARCH when the set has size n? Justify your answer. (b) What is the worst case time, in asymptotic notation, for performing INSERT when the set has size n? Justify your answer. (c) Use the accounting method to prove that the amortized insertion time of a sequence of m insertions, starting with an initially empty set, is O(log m). 4 Written Question 4 (20 marks) Consider two hash tables, T1 and T2, with the same number of buckets. With T1, we use chaining to resolve collisions, where each chain is a doubly-linked list and insertions are done at the front of the list. With T2, we use linear probing to resolve collisions, and deletions are done by replacing the item with a special “deleted” item. For both tables, we use the same hash function. (a) Let S = S1, S2, ..., Sn be a sequence of INSERT and DELETE operations. Suppose that we perform S on both T1 and T2. For 1 ≤ i ≤ n, let C1,i be the number of item comparisons made when performing Si on T1, and let C2,i be the number of item comparisons made when performing Si on T2. Show that for 1 ≤ i ≤ n, C1,i ≤ C2,i. (b) Explain why the claim in part (a) is no longer true if the sequence S also contains SEARCH operations. (c) Let S be as in part (a). Suppose we perform S on both T1 and T2, then do a SEARCH operation where each item in the table is equally likely to be searched for. Let E1 be the expected number of item comparisons made when performing the SEARCH operation on T1, and let E2 be the expected number of item comparisons made when performing the SEARCH operation on T2. Prove that E1 ≤ E2. 5 Programming Question (25 marks) Your task for this question is to implement a hash table using an open addressing scheme. First, download the file HashTable.tar.gz from the course website. In it, you will find 6 files: • HashTable.java - This class is mostly empty as you receive it. You are responsible for filling in the insert, search and delete functions. This class has a single constructor which takes a ProbeSequence which dictates what the hash function and the probe sequence are for this hash function. • Record.java - This class is the type of object that is stored in the hash table. This class should be extended in order to store other data in the hash table. 2 • Employee.java - This class extends the Record class and stores a String variable. • ProbeSequence.java - This interface contains a single function with signature int probe(int i, int key, int m). Any class that implements this interface must provide such a function which given a key, an index in the probe sequence and a table size m, returns the table index to probe. • MyLinearProbeSequence.java - This class implements the ProbeSequence interface and provides a probe sequence using the multiplication method for the hash function and linear probing for the probe sequence. • HashTableTest.java - This class shows how to use the HashTable class. You should begin by implementing the insert and search functions. The hash table should use a dynamic array such that when an element is inserted and the hash table is already full, the array size is doubled. Keep in mind that these functions must support arbitrary hash functions and probe sequences. However, if the probe sequence does not cover all entries in the table, the behaviour is undefined (i.e. don’t worry about it). When you are finished implementing the insert and search functions, you should implement the delete function that you described in Question 4. This delete function will only ever be called when the ProbeSequence that is used to instantiate the HashTable uses the linear probing technique. Note: You should assume throughout this assignment that duplicate keys are not allowed. You should build a test suite to test your code and put these tests in Tester.java such that when a command such as java Tester is run, your test suite is run. Write a very short report outlining why you chose the tests that you chose and which of your tests your code failed (if any). Your tests should include classes which implement ProbeSequence other than MyLinearProbeSequence. You should include one probe sequence that uses quadratic probing and one that uses double hash- ing in MyQuadraticProbeSequence.java and MyDoubleHashingProbeSequence.java. You may use any good hash functions for the hash functions in these classes. You are responsible for handing in the following files: • HashTable.java - Contains your HashTable implementation • MyQuadraticProbeSequence.java - Contains your implementation of a probe sequence that uses quadratic probing • MyDoubleHashingProbeSequence.java - Contains your implementation of a probe sequence using double hashing • Tester.java - Contains your test suite 6 Handing In Handing in your assignment will take place in two parts: 3 • Drop box handin in BA 2220 Your first page should be the cover sheet for this assignment found on the course web page under the assignments. Then, your answers to the written questions should follow. Next, you should include a copy of your report for the Programming Question. Finally, you should include a printout of the files that you modified. • Electronic submission on cdf You should submit your code on the cdf systems using the submit -c csc263h -a A2 files.ext command. 4