Introduction to Neural Networks
September 14, 2011
Professor Daniel Kersten
Brain & Cognitive Engineering, KU & University of Minnesota
Overview and “catch-up” lecture

Outline
• Syllabus
• Overview of lectures 1-3
  • Introduction (Lecture 1)
  • The Neuron (Lecture 2)
  • Neural models (Lecture 3)
• Generic neuron model (Lecture 4)
• Lateral inhibition (Lecture 5)

Syllabus
Information
• courses.kersten.org
• kersten@umn.edu
• TA: Tae-Eui Kam, tekam@image.korea.ac.kr
Prerequisites
• linear algebra, multivariate calculus
• some programming experience and probability/statistics also help

Introduction to neural processing models in brain and cognitive science. Topics include: linear and non-linear feedforward models, learning, self-organization, probabilistic inference, and representation of neural information. Applications to sensory processing, perception, learning, and memory, with a strong emphasis on biological vision.

Goals
• Learn about research by developing thinking, programming, and presentation skills, in addition to acquiring facts
Requirements
• Programming exercises, 50%
• Final independent project, 50%
  • involves writing and an oral presentation
Clear writing is the result of clear thinking. And clear writing is more important than perfect grammar.

Reading materials
• Lecture notes will be online
  • Mathematica and PDF format
  • Updated on the day of lecture
• Supplemental readings to augment lectures and provide inspiration for final projects

Why Mathematica?
It is useful to know more than one language (I have used Matlab, Mathematica, C++, Lisp, and Java in my research).
Mathematica pros:
• Good for quickly prototyping an idea, e.g. http://demonstrations.wolfram.com/
• “Notebooks” are useful for documenting the development of ideas
• GUI tools are quick and easy
• Functional programming
• Symbolic processing
Mathematica cons:
• The default is symbolic evaluation, so you need to be careful (see the short example below)
• Variables are not strongly typed, so you need to be careful
• Less predictable
• Not as many third-party applications

Why Mathematica?
• Lectures are in Mathematica format, so you can execute and experiment with the demonstrations embedded in the lecture notes
• Programming assignments are given in Mathematica format
• Use the assignments as templates that you fill in and then email to the TA
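To make the two cautions above concrete, here is a minimal sketch (not from the lecture notes) of Mathematica's symbolic-by-default and untyped behavior:

  (* Mathematica evaluates symbolically by default: *)
  Integrate[Sin[x]^2, x]   (* an exact symbolic antiderivative, not a number *)
  N[Pi, 10]                (* N[] forces a numeric result: 3.141592654 *)

  (* Variables are untyped; one definition applies to symbols and numbers alike: *)
  f[x_] := x^2 + 1
  f[a]                     (* stays symbolic: 1 + a^2 *)
  f[2.]                    (* numeric: 5. *)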
Final Project
• Written in the form of a Mathematica notebook
• Explore and extend what you’ve learned in the course and readings
• An opportunity to pursue a question that interests you
• But I can suggest topics too

Your final project will involve: 1) a computer simulation/program, and 2) a 1000 to 3000 word final paper in English describing your simulation. You will do one of the following:
1) Devise a novel application for a neural network model studied in the course (could be presented as a “tutorial”), or
2) Simulate/apply a model from the neural network literature in the readings, or
3) Program a method for solving a problem in perception, cognition, or motor control, or
4) Write a program that tests a behavioral hypothesis about human perception and/or cognition, and whose results you can interpret in terms of underlying neural computations.

Lecture 1: Introduction
• Understand the functioning of the brain as a computational device.
• Cognitive Science: the interdisciplinary study of the acquisition, storage, retrieval, and utilization of knowledge.
• Problems: perception, learning, memory, planning, action

Three primary disciplines that influence current neural network research

Neuroscience, computational neuroscience
• The basic building blocks or “hardware” of the nervous system: neurons, and their connections, the synapses.
• Note: the emphasis in our course will be on large-scale neural networks. This requires great simplification in the model of the neuron, in order to compute and theorize about what large numbers of them can do. Our modeling will lack detail, but it is driven by a curiosity about how the complex processes of perception and memory work.

Computational theory, mathematics, statistical pattern recognition
• Statistical inference, information and communication theory, statistical physics, and computer science.
• These provide the tools to abstract and formalize neural theory for analysis and simulation. They relate neural models to statistical methods of inference, to understand the computational principles behind a neural implementation. How can a system be designed to get from input to output representations? These methods are also useful for tying theory to behavioral tests.

Behavioral sciences, psychology, cognitive science, and ethology
• Understand what subsystems (e.g. vision, memory, motor control, ...) are supposed to do for a functioning organism in its environment. Examples from biological/human vision. What information is useful, and how should it be represented?

Levels of explanation

Functional/Behavioral level
• Psychology, cognitive science, and ethology tell us what is actually solved by functioning, behaving organisms. Descriptions of behavior.

Statistical inference level
• Theories of pattern recognition, inference, estimation, learning, and control.
• The functionalities supported by neural network computing provide a useful way of categorizing models in terms of the computational tasks required (a small code sketch of task 1 appears after this section):
1. Learning input/output mappings from examples (learning as regression, classification boundaries)
2. Inferring outputs from inputs (continuous estimation, discrete classification): memory recall, perceptual inference, optimization or constraint satisfaction
3. Modeling data (learning as probability density estimation): self-organization of sensory data into useful representations or classes (e.g. principal components analysis, clustering). (Task 1, learning input/output mappings, can be viewed as a special case.)

Neural network level: algorithms and implementation
• Relationship to algorithms: the mathematics of computation tells us what is computable and how. Practical limits. Parallel vs. serial. Input and output representation, and algorithms for getting from input to output. Programming rules, data structures.
• Implementation: wetware, hardware. Neuroscience, neurophysiology, and anatomy tell us the adequacies and inadequacies of our modeling assumptions.
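As a concrete (and purely illustrative) instance of task 1, learning an input/output mapping from examples can be as simple as least-squares regression. The data points below are invented for the example:

  (* Hypothetical example pairs {input, output}, made up for illustration *)
  data = {{0., 0.1}, {1., 0.9}, {2., 2.2}, {3., 2.9}, {4., 4.1}};

  (* Learning: fit a linear input/output mapping by least squares *)
  model = Fit[data, {1, x}, x]   (* -> 0.04 + 1. x for these points *)

  (* Inference: predict the output for a new input *)
  model /. x -> 2.5              (* -> 2.54 *)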
Levels of explanation
• This course will emphasize bridging levels.

...a quick introduction to Mathematica

Lecture 2: The neuron
• Passive properties
• Active properties
• the “generic” neuron

Lecture 2: The neuron
• Passive properties
  • Simple RC circuit model of the neuron membrane
  • Cable equation

The neuron
• Active properties
  • action potentials
  • Hodgkin-Huxley

Take-home message for Lecture 2: the slow potential neuron

Lecture 3: Neural models
• Types of neural models
  • Structure-less or “point” models
  • Structured models
• McCulloch-Pitts
• Integrate-and-fire

Structure-less models: Discrete (binary) signals -- discrete time
• The action potential is the key characteristic in these models. Signals are discrete (on or off), and time is discrete. These models have a close analog with digital logic circuits (e.g. McCulloch-Pitts).
• At each time unit, the neuron sums its (excitatory and inhibitory) inputs, and turns on the output if the sum exceeds a threshold (see the first code sketch at the end of this section). Examples: McCulloch-Pitts, elements of the Perceptron, Hopfield discrete nets. A gross simplification... but the collective computational power of a large network of these simple model neurons can be great.
• When the model neuron is made more realistic, the computational properties of these networks can be preserved (Hopfield, 1984).
• Below, we’ll briefly discuss the computational properties of networks of McCulloch-Pitts neurons.

Structure-less models: Continuous signals -- discrete time
• Action potential responses can also be interpreted in terms of a single scalar continuous value -- the spike frequency -- at the i-th time interval. Here we ignore the fine-scale temporal structure of a neuron’s voltage changes.
• The continuous signal, discrete time model gives us the basic building block for the majority of networks considered in this course. It is useful for large-scale models (thousands of neurons), as in visual processing.
• The continuous signal model is an approximation of the “leaky integrate-and-fire” model (see the Lecture 3 notes and the second code sketch at the end of this section).

Structure-less models: Continuous signals -- continuous time
• Quantifies the “slow potential model”. These are analog models, more realistic than discrete models. They emphasize nonlinear dynamics, dynamic thresholds, refractory periods, and membrane voltage oscillations. Behavior in networks is represented by systems of coupled differential equations.
• The “integrate-and-fire” model takes membrane capacitance into account. The threshold is a free parameter.
• The Hodgkin-Huxley model is a realistic characterization of action potential generation at a point. The parameters defining the model have a physical interpretation (e.g. various sodium and potassium ion currents) that can be used to account for the threshold. It models the form and timing of action potentials.
• (See the course syllabus web page for the simulation Hodgkin-Huxley.nb.)

Structured models
• Passive -- cable, compartments
• Dynamic -- compartmental models

Structured models: Passive -- cable, compartments
• Dendritic structure shows what a single neuron can compute, e.g. Rall’s motion selectivity example.
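Two minimal Mathematica sketches of the structure-less models above (all weights, thresholds, and parameter values are arbitrary illustrative choices, not taken from the lecture notes). First, a McCulloch-Pitts unit: sum the binary inputs and turn the output on when the weighted sum reaches threshold.

  (* McCulloch-Pitts unit: binary output 1 if the weighted input sum reaches threshold *)
  mcCullochPitts[weights_, threshold_][inputs_] :=
    If[weights . inputs >= threshold, 1, 0]

  (* Example: with weights {1, 1} and threshold 2 the unit computes logical AND *)
  andUnit = mcCullochPitts[{1, 1}, 2];
  andUnit /@ {{0, 0}, {0, 1}, {1, 0}, {1, 1}}   (* -> {0, 0, 0, 1} *)

Second, a leaky integrate-and-fire unit: the membrane voltage leaks toward rest while integrating its input, and is reset after each threshold crossing.

  (* Leaky integrate-and-fire: Euler integration of dV/dt = -V/tau + input,
     with reset to vReset whenever V reaches threshold *)
  lifSpikes[input_, tau_, threshold_, vReset_, dt_, tMax_] :=
    Module[{v = vReset, t = 0., spikes = {}},
      While[t < tMax,
        v += dt (-v/tau + input);
        If[v >= threshold, AppendTo[spikes, t]; v = vReset];
        t += dt];
      spikes]

  (* Spike counts over 100 time units: a stronger input gives a higher firing rate *)
  Length /@ {lifSpikes[0.15, 10., 1., 0., 0.1, 100.],
             lifSpikes[0.30, 10., 1., 0., 0.1, 100.]}

Summarizing the spike train by its rate as a function of input is exactly the reduction that takes us from this model to the continuous-signal, discrete-time units used in most of the course.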
Structured models
• Dynamic -- compartmental models
  • Add spike-generation components to the morphological description

Lecture 3: Neural models
• Types of neural models
  • Structure-less or “point” models
  • Structured models
• McCulloch-Pitts
  • go to the Mathematica notebook for Lecture 3
• Integrate-and-fire

Take-home message for Lecture 3
For the purposes of this course, much of our modeling will involve structure-less neuron models with continuous signals and discrete time steps. But it is important to understand how this simplification is related to more realistic models, such as the leaky integrate-and-fire model.

Lecture 4: Generic neuron model
• Go to the Mathematica notebook for Lecture 4
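Since the Lecture 4 notebook isn’t reproduced here, a minimal sketch of a generic continuous-signal neuron may help orient you: a weighted sum of inputs passed through a saturating “squashing” nonlinearity. The logistic function and the particular weights are illustrative assumptions, not necessarily those used in the notebook.

  (* Generic neuron: continuous output = squashing function of the weighted input sum *)
  squash[x_] := 1/(1 + Exp[-x])   (* logistic nonlinearity; output lies in (0, 1) *)
  genericNeuron[weights_, bias_][inputs_] := squash[weights . inputs + bias]

  (* Example with arbitrary weights: output rises smoothly with the input drive *)
  unit = genericNeuron[{0.5, -0.3, 1.2}, -0.1];
  unit /@ {{0., 0., 0.}, {1., 0., 0.}, {1., 0., 1.}}   (* -> roughly {0.48, 0.60, 0.83} *)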