T H E U N I V E R S I T Y O F E D I N B U R G H AI Large Practical: Assignment 1 Alan Smaill School of Informatics Sep 28 2011 Alan Smaill AI Large Practical: Assignment 1 1/1 T H E U N I V E R S I T Y O F E D I N B U R G H AILP: First Assignment The handout describes the first of the two assignments for this course. We will look at offline isolated handwritten character recognition Some code is provided for you (java, and (shell/perl) scripts). In particular: Java code to compute PIXEL distance between two patterns, according to one metric. Shell scripts for perform recognition experiment, Scripts (in perl) to evaluate system performance Alan Smaill AI Large Practical: Assignment 1 2/1 T H E U N I V E R S I T Y O F E D I N B U R G H Directory structure You should make your own copy of the initial programs, by copying the directory: /group/teaching/ailp/programs/ This directory contains: 0.README CV-LISTS/ files for k-fold Cross Validation tests PIXEL/ Pixel-based system bin/ script files src/ java source code Have a look at README file in each directory. Alan Smaill AI Large Practical: Assignment 1 3/1 T H E U N I V E R S I T Y O F E D I N B U R G H The pixel data for individual characters The characters are stored in an ascii representation of a 2-dimensional 64 × 64 array of 0s and 1s: > WIDTH=64 HEIGHT=64 > x[][] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ... ............... 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 0 ... 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 1 0 ... 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 ... 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 ... 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 ... 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 ... 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 ... ............... Alan Smaill AI Large Practical: Assignment 1 4/1 Such files can by displayed with disp ascii.py: T H E U N I V E R S I T Y O F E D I N B U R G H Example code static float computeDist(float a[][], float b[][]) { int width = a.length; int height = a[0].length; float sum = 0; for (int i = 0; i < width; ++i) { for (int j = 0; j < height; ++j) { sum += (a[i][j]-b[i][j])*(a[i][j]-b[i][j]); } } return sum/(width*height); } Pen data are treated as a simple 2D float array – elsewhere have probability values, not just 0/1. Alan Smaill AI Large Practical: Assignment 1 6/1 T H E U N I V E R S I T Y O F E D I N B U R G H Key scripts PIXEL/run-pixel-train.sh compute average pixel values for given patterns PIXEL/run-pixel-dist.sh perform simple recognition experiment PIXEL/cv-ana-test.pl evaluate system performance Alan Smaill AI Large Practical: Assignment 1 7/1 T H E U N I V E R S I T Y O F E D I N B U R G H Simple experiment For given digits 0–9, after computing average values for each pixel and storing this as the reference pattern, 80 pen samples from different writers are compared against these 10 reference patterns. The closest-distance pattern will be used as recognition result. Total distances to compute: 10 80 10 = 8000 ./run-pixel-dist.sh | tee out Alan Smaill AI Large Practical: Assignment 1 8/1 T H E U N I V E R S I T Y O F E D I N B U R G H Sample output # REFDIR= REF64-d10-cv2-0 # NWRITERS= 80 # CLASSES= d01 d02 d03 d04 d05 d06 d07 d08 d09 d10 #>> CLASS= 0 0 0 0.015086275 0 1 0.01769532 0 2 0.016683647 0 3 0.016455308 0 4 0.017667066 0 5 0.017037405 0 6 0.016136223 0 7 0.018054256 0 8 0.016939178 0 9 0.018400097 1 0 0.020108137 1 1 0.023403272 1 2 0.02261407 Alan Smaill AI Large Practical: Assignment 1 9/1 T H E U N I V E R S I T Y O F E D I N B U R G H Analysis of output ./cv-ana-test.pl < out cid 1 57 / 80 = 71.2 % cid 2 70 / 80 = 87.5 % cid 3 54 / 80 = 67.5 % cid 4 50 / 80 = 62.5 % cid 5 61 / 80 = 76.2 % cid 6 38 / 80 = 47.5 % cid 7 46 / 80 = 57.5 % cid 8 59 / 80 = 73.8 % cid 9 49 / 80 = 61.3 % cid 10 51 / 80 = 63.7 % Total: 535 / 800 = 66.88 % Alan Smaill AI Large Practical: Assignment 1 10/1 T H E U N I V E R S I T Y O F E D I N B U R G H Analysis of output (confusion matrix) I\O 1 2 3 4 5 6 7 8 9 10 1 71.2 8.8 1.2 0.0 2.5 3.8 7.5 0.0 5.0 0.0 2 0.0 87.5 1.2 0.0 2.5 2.5 0.0 5.0 0.0 1.2 3 0.0 0.0 67.5 5.0 1.2 11.2 3.8 7.5 3.8 0.0 4 0.0 0.0 1.2 62.5 3.8 22.5 0.0 6.2 2.5 1.2 5 0.0 2.5 0.0 0.0 76.2 0.0 1.2 6.2 0.0 13.8 6 5.0 1.2 6.2 21.2 5.0 47.5 1.2 5.0 1.2 6.2 7 17.5 6.2 3.8 0.0 5.0 0.0 57.5 0.0 10.0 0.0 8 0.0 0.0 2.5 1.2 8.8 0.0 0.0 73.8 2.5 11.2 9 11.2 0.0 2.5 8.8 2.5 3.8 5.0 2.5 61.3 2.5 10 1.2 0.0 0.0 2.5 20.0 0.0 0.0 12.5 0.0 63.7 Alan Smaill AI Large Practical: Assignment 1 11/1 T H E U N I V E R S I T Y O F E D I N B U R G H Legacy code The code supplied is not in a unified form; it is common to have to deal with code that combines different tools. Note that: There may be bugs! Let us know, it is in everyone’s interests to sort this out. The scripts rely on details of the format used in the output of some routines (on how the output file is formatted – use of headings, and blank lines, for example). This is documented, but will be hard to spot if you break such conventions! There are alternatives . . . Also note that names for file locations appear in the scripts, and may need to be changed. Alan Smaill AI Large Practical: Assignment 1 12/1 T H E U N I V E R S I T Y O F E D I N B U R G H Reasons for Misclassification There are several ways in which things can do wrong: Mismatch between input pattern and reference pattern (model) – eg deformation (pattern transformation) approx affine transformation (linear transformation + translation) other non-linear transformation degeneration (due to noise, digitising) Alan Smaill AI Large Practical: Assignment 1 13/1 T H E U N I V E R S I T Y O F E D I N B U R G H Maybe it’s the data . . . Maybe the problem is with the data: (outlier) data collection error data labelling error Alan Smaill AI Large Practical: Assignment 1 14/1 T H E U N I V E R S I T Y O F E D I N B U R G H Pre-processing Aim to deal with aspects of the first sort of problem by aligning the characters similarly before doing anything else. We could nomalise the centre of gravity (centroid) – move every point (x , y) to (x ′, y ′) by: x ′ = (x − xc) + x ′c y ′ = (y − yc) + y ′c where (xc , yc) is the centroid of the given character, and (x ′ c , y ′ c) is an arbitrarily chosen fixed position. Alan Smaill AI Large Practical: Assignment 1 15/1 T H E U N I V E R S I T Y O F E D I N B U R G H Computing the centroid xc = 1 S H∑ y=1 W∑ x=1 x .o(x , y) yc = 1 S H∑ y=1 W∑ x=1 y .o(x , y) S = H∑ y=1 W∑ x=1 o(x , y) This transformation does not change the size or slant of the figure, but does move it. (What if it goes outside the 64 × 64 grid?) Alan Smaill AI Large Practical: Assignment 1 16/1 T H E U N I V E R S I T Y O F E D I N B U R G H Other normalisations Size: A linear transformation that stretches or shrinks uniformly to reach some nomalised dimensions: work out the right α, β and use: x ′ = αx y ′ = βy Alan Smaill AI Large Practical: Assignment 1 17/1 T H E U N I V E R S I T Y O F E D I N B U R G H Other normalisations Slant: x ′ = x − (y − yc) tan θ y ′ = y where tan θ = u11 u02 upq = H∑ y=1 W∑ x=1 (x − xc)p(y − yc)qo(x , y) Alan Smaill AI Large Practical: Assignment 1 18/1 T H E U N I V E R S I T Y O F E D I N B U R G H Why this formula for slant normalisation? There are several different ways of normalising the slant. Notice that in this version, the centroid stays fixed. An example may help: notice: the size of the character changes no good to compare slash and backslash this way! x y θ ⇒ x y Alan Smaill AI Large Practical: Assignment 1 19/1 T H E U N I V E R S I T Y O F E D I N B U R G H Summary Overview of starting point for first assignment. Some ideas on how to do pre-processing. Alan Smaill AI Large Practical: Assignment 1 20/1