CS 220 - Lab 8 CS 220 - Lab 8: The Instruction Set Architecture Level: The Full JVM Due December 10, 2003 General Lab Goals This lab examines the instruction set architecture of a contemporary machine (the Java Virtual Machine). Here, we are interested in how the JVM provides supports for several different Java programming constructs, such as arrays, floats, and input/output. We are also interested in the details of implementing certain JVM instructions at the MicroArchitecture level, thus extending the IJVM instruction set. Read those sections of Chapter 5 in your text that emphasize the JVM realization of an instruction set architecture. Available on-line is a document from Sun describing the full JVM instruction set in detail. You may refer to this manual as a supplement to Chapter 5. Part 1 - The Full JVM Instruction Set The full Java Virtual Machine has instructions that support a wide range of programming options. These options include the manipulation of various data types beyond int (long, float, char, and double), the dynamic allocation and manipulation of arrays, and the management of Java objects and classes. Consider the following Java program, which is similar in some ways to the one discussed in Lab 7. public class maxCalc2 { public static void main (String[] args) { float[] A = new float[10]; int n = 0; System.out.println("Enter a few numbers, ended by 0: "); A[0] = Keyboard.readFloat(); while (A[n] != 0) { n = n + 1; A[n] = Keyboard.readFloat(); } System.out.println("Their maximum is " + MAX(A, n)); } public static float MAX (float[] A, int n) { float result = A[0]; for (int i=0; i result) result = A[i]; return result; } } This program computes the maximum value of a series of 10 or fewer real (float) values entered at the keyboard and stored in the array A. It differs from the program maxCalc in Lab 7 in significant ways. First, it uses the float data type, part of the JVM but not the IJVM. Second, it uses arrays, and finally it uses input and output of decimal numbers, which is not part of IJVM. This simple example exposes the need for a more robust machine language and memory management scheme to support many common programming needs. The full JVM provides that support, and we shall look at some of its features in the following sections. Part 2 - Data Types The IJVM instruction set includes arithmetic, load, and store operations for 2s complement 32-bit integers. This is type int in Java. The full JVM includes instructions for Java types long (64-bit integers), float (32-bit floating point), and double (64-bit floating-point) values. Some of these are summarized below: Java int Java long Java float Java double ILOAD ISTORE IADD ISUB IMUL (product) IDIV (quotient) IREM (remainder) INEG (complement) I2F (convert int to float) ICONST_0, ICONST_1 IFLT IFEQ IF_ICMPEQ IRETURN LLOAD LSTORE LADD LSUB LMUL LDIV LREM LNEG L2F LCONST_0 LCMP LRETURN FLOAD FSTORE FADD FSUB FMUL FDIV FREM FNEG F2I FCONST_0, FCONST_1 FCMPL FRETURN DLOAD DSTORE DADD DSUB DMUL DDIV DREM DNEG F2D DCONST_0 DCMPL DRETURN The details of these instructions can be found in the full JVM instruction set on-line document. The important point here is that the code for, say, summing two integers is similar to that of summing two floats. E.g., consider the Java statement n = n + 1 in the above program. If n is an int variable, the code on the left would be used; however, if n is a float variable, the code on the right would be used. ILOAD n FLOAD n ICONST_1 FCONST_1 IADD FADD ISTORE n FSTORE n Part 3 - Arrays The above Java program also has an array A in it. Arrays in Java are objects, which means that they can be created either locally or globally and their memory allocation takes place in a separate part of memory called the "heap." Consider the following heap layout, which shows a 10-element array A storing a series of Java int's beginning at heap address 2400. Initial values are also shown (the int's are in decimal, and the addresses are in hex), so that A[0] = 7, A[1] = 3, and so forth. A -> | 7 | 3 | 83 | 45 | 156 | -250 | 1 | 3 | 12 | -2 | [2400][2401] ... [2409] SP -> | 2400 | a | 0A | b Also shown here is the stack, with local variables a and b, which contain the heap address of the array A and A's length, respectively. The full JVM provides instructions that deal directly with arrays. For instance, the instruction IALOAD (see text page 367) pushes onto the stack an integer from an array whose base address and index are the top two words on the stack. For instance, to push the integer 45 from the array entry A[3] onto the stack, the address 2400 and the index 3 should first be pushed onto the stack, and then the IALOAD instruction is issued to retrieve the half-word at address [2403]. That is, we would write: LDC_W a BIPUSH 3 IALOAD which would transform the stack in the following three steps: SP -> | 2400 | SP -> | 3 | SP -> | 45 | | 2400 | | 2400 | | 2400 | | 0A | | 2400 | | 0A | | 0A | If we want to loop through an array and compute, say, the sum of the elements, we would embed these three statements inside a loop which repeated them 10 times. Upon each repetition, a new element would be loaded from A onto the stack and then added to a running sum (another local variable) in the usual way. In addition to IALOAD, the following JVM instructions are available for arrays: LALOAD, FALOAD, DALOAD -- pushes onto the stack an element from an array of longs , floats, or doubles. IASTORE -- Stores the top value on the stack into the array element which is addressed by the next two words on the stack (index and base address, as for IALOAD), and then pops all three from the stack. LASTORE, FASTORE, DASTORE -- ditto for longs, floats, and doubles. To declare an array and activate it in the heap, the size of the array must first be pushed onto the stack, followed by: NEWARRAY where is either int, long, float, double, boolean, char, byte, or short. This instruction pops the size off the stack and replaces it with the heap address of the first element. So to create the array A declared in the above program, we would write the following JVM code: BIPUSH 10 NEWARRAY int Assuming the address of A[0] is returned as 2400, the stack will be transformed as follows by each of these instructions: SP -> | 0A | SP -> | 2400 | | | | | More detailed descriptions of array allocation and referencing can be found in the full JVM instruction set on-line document. Part 4 - Input/Output Input and output in the IJVM are pretty feeble. All you can do is input or output a single keystroke, using IN and OUT respectively. The methods getnum and print, provided with the add.jas program, allow user input and output of a hexadecimal integer. Unfortunately, to do more than this requires writing additional methods or implementing new JVM instructions in microcode for the Mic-1 simulator. A sketch of an algorithm for converting a series of keystrokes k to a binary integer sum is as follows: sum = 0; read a keystroke k as an ASCII character k while (k != the return character) { convert k to a binary integer i sum = 10*sum + i read a keystroke k as an ASCII character k } For instance, if the keystrokes are 235 The value of sum will accomulate as follows: sum = 0 sum = 10*0 + 2 = 2 sum = 10*2 + 3 = 23 sum = 10*23 + 5 = 235 The full JVM also provides some support for converting values among the related types, in the form of the conversion instructions x2y, where x and y are either I, L, F, or D (denoting int, long, float, or double). For instance, the instruction I2F converts the top word on the stack from int to float. More detailed descriptions of these conversions in the full JVM instruction set on-line documentation. Part 5 - The Dreaded IMUL Instruction The previous example provides more evidence that the IMUL instruction really needs to be implemented in microcode for the Mic-1 simulator. Here is a sketch of what's involved in such a design. First, the product a*b can be computed as the sum of b instances of the integer a, provided that b is a nonnegative integer. (If b is negative, then its complement can be used to count the number of times the sum is accumulated.) Second, efficiency should be taken into account in the design of IMUL. For instance, the product 27*45 can be computed either by adding 27 to itself 45 times, or else it can be computed by the following alternative method. Using 8-bit notation, suppose we have: 27 = 00011011 45 = 00101101 The product 27x45 can be computed by repeatedly adding 27 into a special 16-bit "accumulator" register and shifting that register. The addition is done only for bits in the number 45 that are 1. That is: 00011011 <- since bit 0 = 1 00000000 <- since bit 1 = 0 00011011 <- since bit 2 = 1 00011011 <- since bit 3 = 1 00000000 ... 00011011 00000000 00000000 <- since bit 7 = 0 ================= 0000010010111111 The resulting product (121510) can be computed by using a shift-and-mask strategy similar to that used in the print method inside add.jas. That is, a 1-bit mask can isolate the rightmost bit of the 45. If that bit is 1, 27 is added to the sum (initially 0) and then the 27 is left-shifted in its own register. That is, 27 becomes 27*2 = 54. Now the mask shifts left and isolates the next bit in the 45, and if it is 1 the shifted value 54 is added to the sum. If it is not 1, the 1-bit mask and the 54 are left-shifted again without an add. If this strategy is followed, the maximum number of adds required to compute this product is 8, which is much more efficient than the original method (45 adds). Part 6 - Notes on Microprogramming the Mic-1 Here is some information that will help you get good results out of the micro-assembler and simulator. 1. Statement numbers. Generally, there is one line for each microinstruction, and one or more actions on that line (separated by semicolons). For instance, the following microinstruction has two actions, an assignment statement and an if statement: ifgt4 N = -H; if (N) goto T; else goto F // Branch on N bit For details on the syntax of micro-assembly code, consult the on-line documentation. 2. Designing and testing larger microprograms. It really does make sense to carefully design and hand-test your microprogram before assembling and debugging it on the system. Most errors will be logic errors and are not easily found at run time. 3. Loops. If you have looping microcode, you need to define the two alternative branch points using a strict discipline, since the only conditional branching statements available are: if (Z) goto A; else goto B if (N) goto A; else goto B where A is located 0x100 words higher in microstorage than B. For example, suppose you want to define two branch points for one of these conditional branching statements within the microcode for IMUL. Then you could define the branch points imul7T and imul7f at the beginning of .label imul1 0x68 .label imul7F 0x70 .label imul7T 0x170 The number 68 is the IMUL opcode in JVM. The number 70 is rather arbitrary, though it must not be the same as the value of any other defined label (op code) in this list, since it names an absolute microstorage address. Now within the microcode for IMUL, you can write either of the following statements: if (Z) goto imul7T; else goto imul7F if (N) goto imul7T; else goto imul7F where imul7T and imul7F are labels on any other statements within the IMUL microcode. This gives you arbitrary branching and looping capability. Hand In: 1. Convert the Java program at the beginning of this lab to a complete JVM program, in the spirit of the IJVM programs you have written but assuming the larger JVM instruction set described above. Assume that the methods readFloat and println already exist and can be called using INVOKEVIRTUAL and the usual parameter passing methods. Assume also that a String value (such as "Their maximum is ") or a float value (such as 0.5) can be declared in the .constant area of the program by giving it a name. E.g., .constant msg1 "Their maximum is " onehalf 0.5 .end-constant 2. Design a complete IJVM input method IIN, based on the discussion in Part 4 , which receives a decimal integer from the user and leaves its 2s complement integer representation on the top of the stack. 3. Do problems 21, 23, and 38 (in IJVM) on page 398. The method in question 38 should be called IOUT, it should have one parameter, and it should show the result in decimal in the Mic-1 display area. 4. Implement IMUL in Mic-1 microcode. It should assume the same operands on the stack as does IADD, and should replace them by their 32-bit product as the top word on the stack when it is done. For details on the syntax of the Mic-1 microprogramming assembly language, see the on-line Micro-Assembly Language (MAL) Specification . To implement a new microprogram, you need to add it to all the microprograms for the other IJVM instructions that are in the file mic1ijvm.mal in the Mic-1 folder, and then reassemble that file using the assembler mic1asm. To assemble a test IJVM program that contains the IMUL instruction, you need to alter the file ijvm.conf in the ijvmasm folder so that this instruction and its hex opcode are added to the others already in that file. 5. Multiplication, as well as addition and subtraction, of 32-bit 2s complement integers can cause overflow. How would you define overflow when a multiplication takes place? How is overflow signaled and detected in the full JVM instruction set? You should turn in hard copies of your answers to all these questions (including hard copies of the programs), and also leave the following files in the CS220 -> Drop Box. mic1ijvm.mal with microcode added for the IMUL instruction. add.jas with new method IIN, and test code added for IIN, IOUT, and IMUL.