CS2504, Spring'2007 ©Dimitris Nikolopoulos 50

MIPS floating point example

    float f2c (float fahr)
    {
        return ((5.0/9.0) * (fahr - 32.0));
    }
    /* Assume fahr passed in $f12, three constants located
       within reach of $gp */

MIPS floating point example

    f2c: lwc1  $f16,const5($gp)   # $f16 = 5.0
         lwc1  $f18,const9($gp)   # $f18 = 9.0
         div.s $f16,$f16,$f18     # $f16 = 5.0/9.0
         lwc1  $f18,const32($gp)  # $f18 = 32.0
         sub.s $f18,$f12,$f18     # $f18 = fahr - 32.0
         mul.s $f0,$f16,$f18      # $f0 = (5.0/9.0)*(fahr - 32.0)
         jr    $ra                # return, result in $f0

MIPS floating point array example

    void mm (double x[][32], double y[][32], double z[][32])
    {
        int i, j, k;
        for (i=0;i!=32;i++)
            for (j=0;j!=32;j++)
                for (k=0;k!=32;k++)
                    x[i][j] = x[i][j]+y[i][k]*z[k][j];
    }
    /* Array addresses in $a0,$a1,$a2, integer counters in
       $s0,$s1,$s2. x[i][j] can be kept in a register
       for k=0,...,31 */

Arrays in C

- Row-major order
  - Adjacent row elements in adjacent memory locations
- Formula to locate element [i][j] of an r by c array
  - Element index: i*c + j
- Byte offset
  - 4*i*c + 4*j = 4*(i*c + j) for single-precision floating point numbers
  - 8*i*c + 8*j = 8*(i*c + j) for double-precision floating point numbers

MIPS floating point array example

    mm:  ...                    # save callee-saved registers to stack
         li    $t1,32           # $t1 = 32 (row size and loop bound)
         li    $s0,0            # i = 0
    L1:  li    $s1,0            # j = 0
    L2:  li    $s2,0            # k = 0
         sll   $t2,$s0,5        # i*32 (size(row))
         addu  $t2,$t2,$s1      # $t2 = i*size(row)+j
         sll   $t2,$t2,3        # byte offset of [i][j]
         addu  $t2,$a0,$t2      # address of x[i][j]
         l.d   $f4,0($t2)       # $f4,$f5 hold x[i][j]
    L3:  sll   $t0,$s2,5        # k*32 (size(row))
         addu  $t0,$t0,$s1      # k*size(row)+j
         sll   $t0,$t0,3        # byte offset of [k][j]
         addu  $t0,$a2,$t0      # address of z[k][j]
         l.d   $f16,0($t0)      # $f16,$f17 hold z[k][j]

MIPS floating point array example

    mm:  ...                    # continued...
         sll   $t0,$s0,5        # $t0 = i*32
         addu  $t0,$t0,$s2      # $t0 = i*32 + k
         sll   $t0,$t0,3        # $t0 = byte offset of [i][k]
         addu  $t0,$a1,$t0      # $t0 = address of y[i][k]
         l.d   $f18,0($t0)      # $f18,$f19 hold y[i][k]
         mul.d $f16,$f18,$f16   # y[i][k]*z[k][j]
         add.d $f4,$f4,$f16     # x[i][j] += y[i][k]*z[k][j]
         addiu $s2,$s2,1        # k = k + 1
         bne   $s2,$t1,L3       # next k iteration
         s.d   $f4,0($t2)       # update x[i][j]
         addiu $s1,$s1,1        # j = j + 1
         bne   $s1,$t1,L2       # next j iteration
         addiu $s0,$s0,1        # i = i + 1
         bne   $s0,$t1,L1       # next i iteration
         ...                    # pop saved registers and return

Elaboration

- MIPS uses pairs of fp registers for doubles, e.g. $f0,$f1
- MIPS later introduced paired-single operations

      add.ps $f0,$f2,$f4

  behaves like

      add.s  $f0,$f2,$f4
      add.s  $f1,$f3,$f5

- Limited form of vector execution
  - Also known as SIMD execution
  - Single instruction operating on multiple data

Floating point accuracy

- Floating point numbers necessarily lose accuracy
  - Infinitely many real numbers between 0 and 1
  - Only finitely many can be represented in double precision
    (at most 2^53 distinct significands)
- Rounding is not free in hardware
  - Think decimal: suppose we can afford 4 digits
  - 0.69849 is rounded to 0.6985
  - What if we had 0.99998?
  - What if we had 0.69845?

Floating point accuracy

- IEEE 754 standard specifies guard bits
  - Need two because multiplication can generate an extra bit
- Rounding modes
  - always up
  - always down
  - to nearest even (IEEE 754 default)

Summary

- You learned:
  - Integer arithmetic
    - Binary addition, multiplication, division
    - Hardware implementations
  - Floating point representation (IEEE 754)
  - Floating point arithmetic
  - Assembly using floating point instructions
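
The slide examples can be cross-checked in C. The following is a minimal sketch, not part of the original slides: the helper names (row_major_offset, mm_offset_shifts, add_to_two_pow_53, slides_self_check) are hypothetical, and it assumes IEEE 754 doubles with the default round-to-nearest-even mode. It compares the shift/add address arithmetic from the mm slides against the row-major formula 8*(i*c + j), runs the mm loop nest with the x[i][j] accumulator held in a local variable (the role $f4 plays in the assembly), and shows a rounding tie at 2^53.

```c
#include <assert.h>
#include <stddef.h>

#define N 32  /* matrix dimension used on the mm slides */

/* Same computation as the f2c slide, done in single precision
   like the MIPS version (lwc1/div.s/sub.s/mul.s). */
float f2c(float fahr) {
    return (5.0f / 9.0f) * (fahr - 32.0f);
}

/* Byte offset of element [i][j] in a row-major array with c columns.
   elem_size is 4 for float and 8 for double. */
size_t row_major_offset(size_t i, size_t j, size_t c, size_t elem_size) {
    return elem_size * (i * c + j);
}

/* The sll/addu/sll sequence from the mm slides: (i*32 + j) * 8. */
size_t mm_offset_shifts(size_t i, size_t j) {
    return ((i << 5) + j) << 3;
}

/* x = x + y*z for 32x32 matrices. x[i][j] is accumulated in a
   local and stored once after the k loop, as in the assembly. */
void mm(double x[][N], double y[][N], double z[][N]) {
    for (int i = 0; i != N; i++)
        for (int j = 0; j != N; j++) {
            double acc = x[i][j];
            for (int k = 0; k != N; k++)
                acc += y[i][k] * z[k][j];
            x[i][j] = acc;
        }
}

/* Adds delta to 2^53 in double precision. Above 2^53 the spacing
   between doubles is 2, so odd sums must be rounded. The volatile
   qualifiers force each value to be stored as a double. */
double add_to_two_pow_53(double delta) {
    volatile double base = 9007199254740992.0; /* 2^53 */
    volatile double sum = base + delta;
    return sum;
}

/* Hypothetical self-check tying the pieces together. */
void slides_self_check(void) {
    static double x[N][N], y[N][N], z[N][N];

    for (size_t i = 0; i < N; i++)
        for (size_t j = 0; j < N; j++) {
            /* Shift sequence must agree with 8*(i*c + j), c = 32. */
            assert(mm_offset_shifts(i, j) == row_major_offset(i, j, N, 8));
            x[i][j] = 0.0;
            y[i][j] = (i == j) ? 1.0 : 0.0;  /* identity matrix */
            z[i][j] = (double)(i * N + j);
        }

    mm(x, y, z);  /* 0 + I*z should reproduce z exactly */
    for (size_t i = 0; i < N; i++)
        for (size_t j = 0; j < N; j++)
            assert(x[i][j] == z[i][j]);

    /* Ties round to the neighbor with an even significand. */
    assert(add_to_two_pow_53(1.0) == 9007199254740992.0);
    assert(add_to_two_pow_53(3.0) == 9007199254740996.0);
}
```

Calling slides_self_check() from a main function runs all the checks. Note that f2c(212.0f) comes out close to 100 but is not guaranteed to be exactly 100, because 5.0/9.0 has no exact binary representation; this is the accuracy loss the slides describe.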