Advanced Operating Systems COMP9242 Introduction 2COMP9242 03s2 Staff • Lecturer in Charge – Gernot Heiser • Lecturer – Kevin Elphinstone • Various Support Staff – TBA 3COMP9242 03s2 Why are you here? • You’ve done comp3231 – Did well (minimum credit) – You would like delve deeper into issues of modern OS construction – You’d like more challenging projects to really get your hands dirty • Curious about where research is heading in the field of operating systems. • Thinking about doing a thesis project in operating systems 4COMP9242 03s2 What can you expect? • Challenging Project • Lectures in general: – Background required for project – Exposure to local research projects – An in-depth look at OS issues • Building upon the background in COMP3231 – Exposure to recent and seminal research papers – Guest lectures by active researchers (PhD students and local researchers) 5COMP9242 03s2 Project Goal Provide students with a deeper understanding of Operating Systems through practical experience. • Approach: Participate in the design and implementation of a simple operating system (SOS). 6COMP9242 03s2 Project Aims • Provide experience in OS design and development, including: – Microkernel based systems (L4::Pistachio). • User-level OS servers. • User-level page fault handlers. – Device drivers – Performance evaluation – Implications of cache architectures – Exposure to alternative OS Designs • Demonstrate the importance of design • Provide experience of being a team member in a large software project. 7COMP9242 03s2 Project Aims • Expose students to a mostly realistic OS development environment. – Similar to professional OS and or embedded systems developer. • Give an understanding of what’s involved in constructing an entire OS on bare hardware. • Give an understanding of the interaction between low-level software and hardware. • Encourage you to undertake a thesis, or do research within Distributed Systems Group. 8COMP9242 03s2 Prerequisites • Students are expected to be very competent C programmers. • Students are expected to be familiar with – basic computer architecture concepts. – Assembly language (read-only) – Basic RISC processor characteristics (we’ll use a MIPS R4600 for the project) 9COMP9242 03s2 Lectures (subject to change) • Introduction and Overview • Introduction to the L4 Microkernel – L4 system calls and usage (to get you started on the project) • A close look at selected OS issues – Protection, capabilities – Caching, and its implications for OS – Page tables for wide address spaces – SMP issues: locking, cache coherence, scheduling – File systems 10COMP9242 03s2 Lectures (subject to change) • Microkernels and User-level Servers – History and motivation for microkernel systems, Hydra, Mach, discussion, experiences; second- generation microkernel systems, L4, Exokernel, Spin; design and implementation of microkernel-based systems, including user-level page fault handling and device drivers • Microkernel Implementation – A detailed look at a real microkernel (L4Ka::Pistachio). • Persistent systems and Single-address-space operating systems – Concepts and examples; UNSW Mungi project 11COMP9242 03s2 Project/Lab Work • Build a simple operating system (SOS) from the ground up • Major component of the course • Use L4Ka::Pistachio – ported to MIPS here by Carl van Schaik • Develop and test on U4600 computers – R4600 based machines design and built by Kevin Elphinstone and Dave Johnson – “Clean” machine to get your hands dirty • Can also use CPU simulator Simular – Developed locally – Demos must be on real hardware 12COMP9242 03s2 Project • Some warm-up experiments • Students will work in groups of two • End goal: – To produce a small efficient operating system • Project will have a series of due milestones – Demo to pass the milestone and be awarded marks – Help you manage your time – Avoid major problems 13COMP9242 03s2 Milestones • Details released RSN (week 2) • Late milestones – Less than one week late, 25% of the milestone mark is lost – More than one week late, but still less than two weeks late, 50% of the milestone mark lost 14COMP9242 03s2 Alternative Projects • Special arrangements might be made for particular student to do alternative projects – Must be at least as challenging as the original project – Must convince us that you can actually do it. 15COMP9242 03s2 Assessment • 65% for project work • 35% for final exam – A minimum of 14 (40%) required in final exam to pass • Final Exam – 24hr take-home exam – Read and analyse two recent research papers and submit a critical report 16COMP9242 03s2 Textbook • No particular textbook for course – See course web page for useful reference books • Selected research papers referred to in the course L4 and Microkernels Background 18COMP9242 03s2HW Bit Byte Word Register Instructions Application documents windows symbols stacks & heaps arrays & structures variables threads coroutines modules procedures statements File Socket PipeAddress Space PageACLSegment Process TaskThread Event IPCSemaphore Monitor Mutex Priority Schedule Monolithic Kernel 19COMP9242 03s2 Monolithic Kernels - Advantages • Kernel has access to everything, potentially: – All optimizations are possible – All techniques/mechanisms/concepts are implementable • Can be extended by simply adding more code to the kernel 20COMP9242 03s2 Linux Kernel Evolution Linux Kernel Size (.tar.gz) 0 5 10 15 20 25 30 35 31-Jan-93 28-Oct-95 24-Jul-98 19-Apr-01 14-Jan-04 Date S i z e ( M b ) For reference: Linux 2.4.18 = 2.7 million lines of code 21COMP9242 03s2 Approaches to tackling complexity • Monolithic approaches – Layered Kernels – Modular Kernels – Object Oriented Kernels • Alternatives – Extensible Kernels – Microkernels 22COMP9242 03s2 HW Bit Byte Word Register Instructions µ-kernel Address Space Thread Application documents windows symbols stacks & heaps arrays & structures variables threads coroutines modules procedures statements Server File 23COMP9242 03s2 History • monolithic kernels 24COMP9242 03s2 History • monolithic kernels • 1st- generation µ-kernels – Mach – Chorus – Amoeba – (L3) External Pager User-Level Driver CMU, OSF Inria, Chorus Vrije Universiteit GMD 25COMP9242 03s2 Brief History • monolithic kernels • 1st- generation µ-kernels – Mach – Chorus – Amoeba – (L3) • 2nd- generation µ-kernels – (Spin) – Exokernel – L4 CMU, OSF Inria, Chorus Vrije Universiteit GMD U Washington MIT GMD / IBM / UKa User-Level Address Space External Pager User-Level Driver COMP9242 03s2 HW L4 classic OS Security RT MM classic + HW L4 native Java em- bedded app thin HW L4 highly-specialized component specialized COMP9242 03s2 HW L4 SOS Security sosh Project sosh 28COMP9242 03s2 The Great Promise • coexistence of different • APIs • file systems • OS personalities • flexibility • extensibility • simplicity • maintainability • security • safety 29COMP9242 03s2 The Great PromiseThe Big Disaster • SLOW • UNFLEXIBLE • LARGE • coexistence of different • APIs • file systems • OS personalities • flexibility • extensibility • simplicity • maintainability • security • safety Macro µ Performance Flexibilty Simplicity rf r l i ilit Macro µ The 100-µs Disaster 25 MHz 386 50 MHz 486 90 MHz Pentium 133 MHz Alpha The 100-µs Disaster COMP9242 03s2 0 100 200 300 400 0 1000 2000 3000 4000 5000msg len raw move µs / ipcIPC Costs (486, 50 MHz) COMP9242 03s2 0 50 100 0 50 100 150msg len Mac 0 100 200 300 400 0 1000 2000 3000 4000 5000msg len raw move IPC Costs (486, 50 MHz) 36COMP9242 03s2 L4 IPC 0.73 µs (Pentium 166 MHz) 0.91 µs (R4600 100 MHz) 0.10 µs (21164 433 MHz) [cycles] 236 180 20 0 50 100 150 200 250 Pent III P3 Sysops P3 Lipc ? 0.47 µs (P III 500 MHz) 0.36 µs (P III 500 MHz) 0.04 µs (P III 500 MHz) (hopefully) 82 16 23 36 55 7 38 0 20 40 60 80 100 120 140 Pentium R4600 Alpha 37COMP9242 03s2 Minimality Elegance Architectural Integration Efficiency Flexibility • A µ-Kernel does the Job • if Properly Designed • if Carefully Implemented Thesis: 38COMP9242 03s2 When analyzing IPC performance, Cycles are not the only the to consider!! 39COMP9242 03s2 Processor-DRAM Gap (latency) µProc 60%/yr. DRAM 7%/yr. 1 10 100 1000 1 9 8 0 1 9 8 1 1 9 8 3 1 9 8 4 1 9 8 5 1 9 8 6 1 9 8 7 1 9 8 8 1 9 8 9 1 9 9 0 1 9 9 1 1 9 9 2 1 9 9 3 1 9 9 4 1 9 9 5 1 9 9 6 1 9 9 7 1 9 9 8 1 9 9 9 2 0 0 0 DRAM CPU 1 9 8 2 Processor-Memory Performance Gap: (grows 50% / year) P e r f o r m a n c e Time “Moore’s Law” Slide originally from Dave Patterson, Parcon 98 40COMP9242 03s2 Today’s Situation: Microprocessor • Microprocessor-DRAM performance gap – time of a full cache miss in instructions executed 1st Alpha (7000): 340 ns/5.0 ns = 68 clks x 2 or 136 2nd Alpha (8400): 266 ns/3.3 ns = 80 clks x 4 or 320 3rd Alpha (t.b.d.): 180 ns/1.7 ns =108 clks x 6 or 648 – 1/2X latency x 3X clock rate x 3X Instr/clock ⇒ ≈5X Slide originally from Dave Patterson, Parcon 98 41COMP9242 03s2 99.85% 0.15% L2 cache • 8192 cache lines (256K) • 12 lines used for IPC 98.83% 1.17% L1 cache • 1024 cache lines (16K + 16K) • 12 lines used for IPC Cache Working Sets 42COMP9242 03s2 Other Complications • P4 trace cache – A cache of recently translated µ-ops – Flushed on every page-table switch • Virtual Caches – Need to be flushed on address space switch COMP9242 03s2 0% 5% 10% 15% 20% 25% 30% 35% 40% 45% 50% 100 200 400 800 1600 3200 6400 12800 25600 51200 102400 204800 Mach 486 Mach Alpha Chorus 486 Spin Alpha L4 Pentium average cycles between successive IPCs o v e r h e a d d u e t o I P C 44COMP9242 03s2 Α µ-kernel does no real work. µ-Kernel services are only required to overcome µ-kernel constraints. Therefore, µ-kernels have to be infinitely fast! Minimality is the key! • Threads • Address Spaces IPC Mapping 45COMP9242 03s2 Threads, CommunicationAddress Spaces Threads 46COMP9242 03s2 Drivers at User Level User Driver Device INTR = ipc • IO ports: part of the user address space • interrupts: messages from hardware 47COMP9242 03s2 Address Spaces 48COMP9242 03s2 Address Spaces • map • unmap • grant 49COMP9242 03s2 Address Spaces • map • unmap • grant 50COMP9242 03s2 Address Spaces • map • unmap • grant 51COMP9242 03s2 Page Fault Handling Pager "PF" msg map msg Application 52COMP9242 03s2 Page Fault Handling Pager "PF" msg map msg Application PF IPC res IPC 53COMP9242 03s2 Address Spaces Physical Memory 54COMP9242 03s2 Address Spaces Physical Memory Initial AS 55COMP9242 03s2 Address Spaces Physical Memory Initial AS Pager 1 Pager 2 56COMP9242 03s2 Address Spaces Physical Memory Initial AS Pager 1 Pager 2 Pager 3 Pager 4 57COMP9242 03s2 Address Spaces Physical Memory Initial AS Pager 1 Pager 2 Pager 3 Pager 4 Application Application Application Application 58COMP9242 03s2 Address Spaces Physical Memory Initial AS Pager 1 Pager 2 Pager 3 Pager 4 Application Application Application Application Driver Driver 59COMP9242 03s2 Mach Virtual Memory In comparison Physical Memory Paging Policy Mach Application Application Application External Pager 60COMP9242 03s2 Size Comparison Linux (all platforms): 2.7 Million lines Mach 4 x86: 90,000 lines L4Ka::Pistachio/ia32 10,000 lines COMP9242 03s2 HW LN classic OS Security RT MM classic + HW LN native Java em- bedded app thin HW LN highly-specialized component specialized