Technical Report
Number 625
Computer Laboratory
UCAM-CL-TR-625
ISSN 1476-2986

TCP, UDP, and Sockets: rigorous and experimentally-validated behavioural specification
Volume 2: The Specification

Steve Bishop, Matthew Fairbairn, Michael Norrish, Peter Sewell, Michael Smith, Keith Wansbrough

March 2005

15 JJ Thomson Avenue
Cambridge CB3 0FD
United Kingdom
phone +44 1223 763500
http://www.cl.cam.ac.uk/

© 2005 Steve Bishop, Matthew Fairbairn, Michael Norrish, Peter Sewell, Michael Smith, Keith Wansbrough

Technical reports published by the University of Cambridge Computer Laboratory are freely available via the Internet: http://www.cl.cam.ac.uk/TechReports/

Steve Bishop∗, Matthew Fairbairn∗, Michael Norrish†, Peter Sewell∗, Michael Smith∗, Keith Wansbrough∗
∗ University of Cambridge Computer Laboratory
† NICTA, Canberra

March 18, 2005

Brief Contents

Brief Contents
How to read this document
Full Contents
1 Utility functions
2 Error codes
3 Signal names
4 Base types
5 Network datagram types
6 System call types
7 Host LTS labels and rule categories
8 Rule names
9 Timers
10 Host types
11 Host behavioural parameters
12 Auxiliary functions
13 Relational monad
14 Auxiliary functions for TCP segment creation and drop
15 Host LTS: Socket Calls
  15.1 accept() (TCP only)
  15.2 bind() (TCP and UDP)
  15.3 close() (TCP and UDP)
  15.4 connect() (TCP and UDP)
  15.5 disconnect() (TCP and UDP)
  15.6 dup() (TCP and UDP)
  15.7 dupfd() (TCP and UDP)
  15.8 getfileflags() (TCP and UDP)
  15.9 getifaddrs() (TCP and UDP)
  15.10 getpeername() (TCP and UDP)
  15.11 getsockbopt() (TCP and UDP)
  15.12 getsockerr() (TCP and UDP)
  15.13 getsocklistening() (TCP and UDP)
  15.14 getsockname() (TCP and UDP)
  15.15 getsocknopt() (TCP and UDP)
  15.16 getsocktopt() (TCP and UDP)
  15.17 listen() (TCP only)
  15.18 pselect() (TCP and UDP)
  15.19 recv() (TCP only)
  15.20 recv() (UDP only)
  15.21 send() (TCP only)
  15.22 send() (UDP only)
  15.23 setfileflags() (TCP and UDP)
  15.24 setsockbopt() (TCP and UDP)
  15.25 setsocknopt() (TCP and UDP)
  15.26 setsocktopt() (TCP and UDP)
  15.27 shutdown() (TCP and UDP)
  15.28 sockatmark() (TCP only)
  15.29 socket() (TCP and UDP)
16 Host LTS: TCP Input Processing
  – deliver_in_1
  – deliver_in_1b
  – deliver_in_2
  – deliver_in_2a
  – deliver_in_3
  – deliver_in_3a
  – deliver_in_3b
  – deliver_in_3c
  – deliver_in_4
  – deliver_in_5
  – deliver_in_6
  – deliver_in_7
  – deliver_in_7a
  – deliver_in_7b
  – deliver_in_7c
  – deliver_in_7d
  – deliver_in_8
  – deliver_in_9
17 Host LTS: TCP Output
  – deliver_out_1
18 Host LTS: TCP Timers
  – timer_tt_rexmtsyn_1
  – timer_tt_rexmt_1
  – timer_tt_persist_1
  – timer_tt_keep_1
  – timer_tt_2msl_1
  – timer_tt_delack_1
  – timer_tt_conn_est_1
  – timer_tt_fin_wait_2_1
19 Host LTS: UDP Input Processing
  – deliver_in_udp_1
  – deliver_in_udp_2
  – deliver_in_udp_3
20 Host LTS: ICMP Input Processing
  – deliver_in_icmp_1
  – deliver_in_icmp_2
  – deliver_in_icmp_3
  – deliver_in_icmp_4
  – deliver_in_icmp_5
  – deliver_in_icmp_6
  – deliver_in_icmp_7
21 Host LTS: Network Input and Output
  – deliver_in_99
  – deliver_in_99a
  – deliver_out_99
  – deliver_loop_99
22 Host LTS: BSD Trace Records and Interface State Changes
23 Host LTS: Time Passage
24 Initial state
Index

How to read this document

This document is a rigorous specification of the behaviour of TCP, UDP, and the Sockets interface, experimentally validated against the behaviour of several implementations. It is written in the higher order logic of the HOL system. For a full discussion of the specification we refer the reader to the companion Volume 1: Overview and especially to the section there titled “The Specification — Introduction”, which gives a brief introduction to the HOL language and to the structure of the model.

The specification is organised as a reference (in approximately the logical order in which it is presented to the HOL system), not as a tutorial. To read it one should first look at the key types used (base types, network datagram types, and host types) and then browse the Host LTS Socket Call rules and TCP and UDP input and output processing rules.

Full Contents

Brief Contents
How to read this document
Full Contents

I TCP1_utils
1 Utility functions
  1.1 Basic utilities
    1.1.1 Summary
    1.1.2 Rules
      – funupd
      – funupd_list
      – clip_int_to_num
      – left_shift_num
      – right_shift_num
      – rounddown
      – roundup
      – real_of_int
      – num_floor
      – num_floor_and_frac
      – fm_exists
      – onlywhen
  1.2 List utilities
    1.2.1 Summary
    1.2.2 Rules
      – SPLIT_REV_0
      – SPLIT_REV
      – SPLIT
      – TAKE
      – DROP
      – TAKEWHILE_REV
      – TAKEWHILE
      – REPLICATE
      – decr_list
      – NOTIN′
      – MAP_OPTIONAL
      – CONCAT_OPTIONAL
      – ORDERINGS
      – INSERT_ORDERED
  1.3 Assertions
    1.3.1 Summary
    1.3.2 Rules
      – ASSERTION_FAILURE

II TCP1_errors
2 Error codes
  2.1 The type of errors
    2.1.1 Summary
    2.1.2 Rules
      – error

III TCP1_signals
3 Signal names
  3.1 The type of signals
    3.1.1 Summary
    3.1.2 Rules
      – signal

IV TCP1_baseTypes
4 Base types
  4.1 Network and OS-related types (TCP and UDP)
    4.1.1 Summary
    4.1.2 Rules
      – port
      – ip
      – ifid
      – netmask
      – fd
  4.2 File and socket flags (TCP and UDP)
    4.2.1 Summary
    4.2.2 Rules
      – filebflag
      – sockbflag
      – socknflag
      – socktflag
      – msgbflag
      – socktype
  4.3 Language interaction types
    4.3.1 Summary
    4.3.2 Rules
      – tid
      – err
      – TLang_type
      – TLang
      – tlang_typing
  4.4 Time types
    4.4.1 Summary
    4.4.2 Rules
      – time
      – type_abbrev_duration
      – time_lt
      – time_lte
      – time_gt
      – time_gte
      – time_min
      – time_max
      – time_plus_dur
      – time_minus_dur
      – real_mult_time
      – time_zero
      – duration
      – abstime
      – realopt_of_time
      – the_time
  4.5 Basic network types: sequence numbers (TCP only)
    4.5.1 Summary
    4.5.2 Rules
      – type_abbrev_byte
      – seq32
      – seq32_plus
      – seq32_minus
      – seq32_plus′
      – seq32_minus′
      – seq32_diff
      – seq32_lt
      – seq32_leq
      – seq32_gt
      – seq32_geq
      – seq32_fromto
      – seq32_coerce
      – seq32_min
      – seq32_max
      – tcpLocal
      – tcpForeign
      – type_abbrev_tcp_seq_local
      – type_abbrev_tcp_seq_foreign
      – tcp_seq_local
      – tcp_seq_foreign
      – tcp_seq_local_to_foreign
      – tcp_seq_foreign_to_local
      – tstamp
      – type_abbrev_ts_seq
      – ts_seq

V TCP1_netTypes
5 Network datagram types
  5.1 TCP segments (TCP only)
    5.1.1 Summary
    5.1.2 Rules
      – tcpSegment
      – sane_seg
  5.2 UDP datagrams (UDP only)
    5.2.1 Summary
    5.2.2 Rules
      – udpDatagram
      – sane_udpdgm
  5.3 ICMP datagrams (TCP and UDP)
    5.3.1 Summary
    5.3.2 Rules
      – protocol
      – icmp_unreach_code
      – icmp_source_quench_code
      – icmp_redirect_code
      – icmp_time_exceeded_code
      – icmp_paramprob_code
      – icmpType
      – icmpDatagram
  5.4 IP messages (TCP and UDP)
    5.4.1 Summary
    5.4.2 Rules
      – msg
      – sane_msg
      – msg_is1
      – msg_is2

VI TCP1_LIBinterface
6 System call types
  6.1 The interface (TCP and UDP)
    6.1.1 Summary
    6.1.2 Rules
      – LIB_interface
      – retType
  6.2 Useful groups of calls (TCP and UDP)
    6.2.1 Summary
    6.2.2 Rules
      – fd_op
      – fd_sockop

VII TCP1_host0
7 Host LTS labels and rule categories
  7.1 Transition labels (TCP and UDP)
    7.1.1 Summary
    7.1.2 Rules
      – Lhost0
  7.2 Rule categories (TCP and UDP)
    7.2.1 Summary
    7.2.2 Rules
      – rule_proto
      – rule_status
      – rule_cat
      – urgent
      – nonurgent
      – is_urgent

VIII TCP1_ruleids
8 Rule names
  8.1 names (Rule only)
    8.1.1 Summary
    8.1.2 Rules
      – rule_ids

IX TCP1_timers
9 Timers
  9.1 Properties (TCP and UDP)
    9.1.1 Summary
    9.1.2 Rules
      – time_pass_additive
      – time_pass_trajectory
      – opttorel
  9.2 Basic timer timer (TCP and UDP)
    9.2.1 Summary
    9.2.2 Rules
      – timer
      – fuzzy_timer
      – sharp_timer
      – never_timer
      – upper_timer
      – timer_expires
      – Time_Pass_timer
  9.3 Deadline timer timed (TCP and UDP)
    9.3.1 Summary
    9.3.2 Rules
      – timed
      – timed_val_of
      – timed_timer_of
      – timed_expires
      – Time_Pass_timed
  9.4 Time-window timer timewindow (TCP and UDP)
    9.4.1 Summary
    9.4.2 Rules
      – timewindow
      – timewindow_val_of
      – timewindow_open
      – Time_Pass_timewindow
  9.5 Ticker ticker (TCP and UDP)
    9.5.1 Summary
    9.5.2 Rules
      – ticker
      – ticks_of
      – Time_Pass_ticker
      – ticker_ok
      – tick_imin
      – tick_imax
  9.6 Stopwatch stopwatch (TCP and UDP)
    9.6.1 Summary
    9.6.2 Rules
      – stopwatch
      – stopwatch_val_of
      – Time_Pass_stopwatch

X TCP1_hostTypes
10 Host types
  10.1 Files (TCP and UDP)
    10.1.1 Summary
53 10.1.2 Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53 – fid . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53 – sid . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53 – filetype . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53 – fileflags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53 – file . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53 – File . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53 10.2 TCP states (TCP only) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54 10.2.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54 10.2.2 Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54 Rule version: FULL CONTENTS x – tcpstate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54 10.3 The TCP control block (TCP only) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54 10.3.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54 10.3.2 Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54 – tcpReassSegment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54 – rexmtmode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55 – rttinf . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55 – tcpcb . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55 10.4 Sockets (TCP and UDP) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
57 10.4.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57 10.4.2 Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57 – iobc . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57 – socket listen . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57 – tcp socket . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58 – dgram msg . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58 – dgram error . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58 – dgram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58 – udp socket . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58 – sockflags . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58 – protocol info . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58 – socket . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59 – TCP Sock0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59 – TCP Sock . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59 – UDP Sock0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59 – UDP Sock . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59 – Sock . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59 – tcp sock of . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59 – udp sock of . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59 – proto of . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
. . . . . 59 – proto eq . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59 10.5 The host (TCP and UDP) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60 10.5.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60 10.5.2 Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60 – arch . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60 – ifd . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60 – routing table entry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60 – type abbrev routing table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61 – bandlim reason . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61 – type abbrev bandlim state . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61 – hostThreadState . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61 – host . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61 10.6 Trace records (TCP and UDP) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62 10.6.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62 10.6.2 Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62 – traceflavour . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62 – type abbrev tracerecord . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62 – tracecb eq . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62 – tracesock eq . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
63 XI TCP1 params 65 11 Host behavioural parameters 66 11.1 Model parameters (TCP and UDP) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66 11.1.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66 11.1.2 Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66 – INFINITE RESOURCES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66 – BSD RTTVAR BUG . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66 11.2 Scheduling parameters (TCP and UDP) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67 Rule version: FULL CONTENTS xi 11.2.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67 11.2.2 Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67 – dschedmax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67 – diqmax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67 – doqmax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67 11.3 Timers (TCP and UDP) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67 11.3.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67 11.3.2 Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67 – HZ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68 – tickintvlmin . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68 – tickintvlmax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68 – stopwatchfuzz . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68 – stopwatch zero . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
. . . . . 68 – SLOW TIMER INTVL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68 – SLOW TIMER MODEL INTVL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68 – FAST TIMER INTVL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68 – FAST TIMER MODEL INTVL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68 – KERN TIMER INTVL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68 – KERN TIMER MODEL INTVL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68 11.4 Ports, sockets, and files (TCP and UDP) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69 11.4.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69 11.4.2 Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69 – privileged ports . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69 – ephemeral ports . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69 – OPEN MAX . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69 – OPEN MAX FD . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69 – FD SETSIZE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69 – SOMAXCONN . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70 11.5 UDP parameters (UDP only) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70 11.5.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70 11.5.2 Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70 – UDPpayloadMax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70 11.6 Buffers (TCP and UDP) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70 11.6.1 Summary . . . . . . . . . 
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70 11.6.2 Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70 – MCLBYTES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70 – MSIZE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70 – SB MAX . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70 – oob extra sndbuf . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70 11.7 File and socket flag defaults (TCP and UDP) . . . . . . . . . . . . . . . . . . . . . . . . . . . 71 11.7.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71 11.7.2 Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71 – ff default b . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71 – ff default . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71 – sf default b . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71 – sf default n . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71 – sf default t . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72 – sf default . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72 – sf min n . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72 – sf max n . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72 – sndrcv timeo t max . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73 – pselect timeo t max . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73 11.8 RFC-specified limits (TCP only) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
73 11.8.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73 11.8.2 Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73 – dtsinval . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73 – TCP MAXWIN . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73 – TCP MAXWINSCALE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73 Rule version: FULL CONTENTS xii 11.9 Protocol parameters (TCP only) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74 11.9.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74 11.9.2 Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74 – MSSDFLT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74 – SS FLTSZ LOCAL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74 – SS FLTSZ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74 – TCP DO NEWRENO . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74 – TCP Q0MINLIMIT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74 – TCP Q0MAXLIMIT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74 – backlog fudge . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75 11.10 Time values (TCP only) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75 11.10.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75 11.10.2 Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75 – TCPTV DELACK . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75 – TCPTV RTOBASE . . . . . . . . . . . . . . . . . . . . . . . . . . 
. . . . . . . . . . 75 – TCPTV RTTVARBASE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75 – TCPTV MIN . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75 – TCPTV REXMTMAX . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75 – TCPTV MSL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76 – TCPTV PERSMIN . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76 – TCPTV PERSMAX . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76 – TCPTV KEEP INIT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76 – TCPTV KEEP IDLE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76 – TCPTV KEEPINTVL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76 – TCPTV KEEPCNT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76 – TCPTV MAXIDLE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76 11.11 Timing-related parameters (TCP only) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76 11.11.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76 11.11.2 Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76 – TCP BSD BACKOFFS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76 – TCP LINUX BACKOFFS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76 – TCP WINXP BACKOFFS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76 – TCP MAXRXTSHIFT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77 – TCP SYNACKMAXRXTSHIFT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77 – TCP SYN BSD BACKOFFS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77 – TCP SYN LINUX BACKOFFS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77 – TCP SYN WINXP BACKOFFS . 
. . . . . . . . . . . . . . . . . . . . . . . . . . . . 77 XII TCP1 auxFns 78 12 Auxiliary functions 79 12.1 Architecture handling (TCP and UDP) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79 12.1.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79 12.1.2 Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79 – windows arch . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79 – bsd arch . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79 – linux arch . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79 – unix arch . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79 12.2 Interfaces and IP addresses (TCP and UDP) . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79 12.2.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79 12.2.2 Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80 – mask . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80 – mask bits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80 – IP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80 – IN MULTICAST . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80 – INADDR BROADCAST . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80 – LOOPBACK ADDRS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80 – ip localhost . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80 Rule version: FULL CONTENTS xiii – in loopback . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80 – in local . . . . . . . . . . . . . . . . . . . . 
. . . . . . . . . . . . . . . . . . . . . . . . 80 – local ips . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80 – local primary ips . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80 – is localnet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80 – if broadcast . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80 – if any . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80 – is broadormulticast . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81 – routeable . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81 – outroute ifids . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81 – ifid up . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82 – outroute . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82 – auto outroute . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82 – test outroute ip . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82 – test outroute . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82 – loopback on wire . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83 12.3 Files, file descriptors, and sockets (TCP and UDP) . . . . . . . . . . . . . . . . . . . . . . . . 83 12.3.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83 12.3.2 Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83 – fdlt . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83 – fdle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83 – leastfd . . . . . . 
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83 – nextfd . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83 – fid ref count . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84 – sane socket . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84 12.4 Binding (TCP and UDP) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84 12.4.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84 12.4.2 Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85 – bound ports protocol autobind . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85 – bound port allowed . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85 – autobind . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85 – bound after . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85 – match score . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85 – lookup udp . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86 – tcp socket best match . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86 – lookup icmp . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87 12.5 Timers (TCP and UDP) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88 12.5.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88 12.5.2 Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88 – slow timer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88 – fast timer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
. . . . . 88 – kern timer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88 – sched timer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88 – inqueue timer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88 – outqueue timer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88 12.6 Time values for socket options (TCP and UDP) . . . . . . . . . . . . . . . . . . . . . . . . . . 89 12.6.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89 12.6.2 Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89 – time of tltime . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89 – time of tltimeopt . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89 – tltimeopt wf . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89 – tltimeopt of time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89 12.7 Queues (TCP and UDP) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89 12.7.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90 12.7.2 Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90 – enqueue . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90 – enqueue iq . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90 – enqueue oq . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90 Rule version: FULL CONTENTS xiv – dequeue . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90 – dequeue iq . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90 – dequeue oq . . . . . . 
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90 – route and enqueue oq . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91 – enqueue list qinfo . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91 – enqueue list . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91 – enqueue oq list qinfo . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91 – enqueue oq list . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91 – accept incoming q0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91 – accept incoming q . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91 – drop from q0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91 12.8 TCP Options (TCP only) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92 12.8.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92 12.8.2 Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92 – do tcp options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92 – calculate tcp options len . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92 12.9 Buffers, windows, and queues (TCP and UDP) . . . . . . . . . . . . . . . . . . . . . . . . . . . 92 12.9.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92 12.9.2 Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93 – calculate buf sizes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93 – calculate bsd rcv wnd . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93 – send queue space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
93 12.10 Band limiting (TCP and UDP) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94 12.10.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94 12.10.2 Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94 – bandlim state init . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94 – bandlim rst ok always . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94 – simple limit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94 – bandlim rst ok simple . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94 – bandlim rst ok . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95 – enqueue oq bndlim rst . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95 12.11 UDP support (UDP only) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95 12.11.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95 12.11.2 Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95 – dosend . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96 12.12 TCP timing and RTT (TCP only) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96 12.12.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96 12.12.2 Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96 – tcp backoffs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96 – tcp syn backoffs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96 – mode of . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97 – shift of . . . . . . . . . . . . . . . . . . . 
. . . . . . . . . . . . . . . . . . . . . . . . . 97 – computed rto . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97 – computed rxtcur . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97 – start tt rexmt gen . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97 – start tt rexmt . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97 – start tt rexmtsyn . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97 – start tt persist . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97 – update rtt . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98 – expand cwnd . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99 12.13 Path MTU Discovery (TCP only) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99 12.13.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99 12.13.2 Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99 – next smaller . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99 – mtu tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99 12.14 Reassembly (TCP only) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100 12.14.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100 12.14.2 Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100 Rule version: FULL CONTENTS xv – tcp reass . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100 – tcp reass prune . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101 12.15 The initial TCP control block (TCP only) . . . . . . . . . . . 
. . . . . . . . . . . . . . . . . . 101 12.15.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101 12.15.2 Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101 – initial cb . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101 13 Relational monad 103 13.1 Relational monad (TCP only) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103 13.1.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103 13.1.2 Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104 – andThen . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104 – cont . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104 – stop . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104 – assert . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104 – assert failure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104 – chooseM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104 – get sock . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104 – get tcp sock . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104 – get cb . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104 – modify sock . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104 – modify tcp sock . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104 – modify cb . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104 – emit segs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
. . . . . . . . . . . 105 – emit segs pred . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105 – mliftc . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105 – mliftc bndlm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105 14 Auxiliary functions for TCP segment creation and drop 106 14.1 SYN and RST Segment Creation (TCP only) . . . . . . . . . . . . . . . . . . . . . . . . . . . 106 14.1.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106 14.1.2 Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106 – make syn segment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106 – make syn ack segment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107 – make ack segment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108 – bsd make phantom segment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109 – make rst segment from cb . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109 – make rst segment from seg . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110 14.2 General Segment Creation (TCP only) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111 14.2.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111 14.2.2 Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111 – tcp output required . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111 – tcp output really . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113 – tcp output perhaps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116 14.3 Segment Queueing (TCP only) . . . . . . . . . . . . . . . . . . . . . . . . . . . 
. . . . . . . . . 116 14.3.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116 14.3.2 Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117 – rollback tcp output . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117 – enqueue or fail . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118 – enqueue or fail sock . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118 – enqueue and ignore fail . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118 – enqueue each and ignore fail . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118 – mlift tcp output perhaps or fail . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118 14.4 Incoming Segment Functions (TCP only) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119 14.4.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119 14.4.2 Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119 – update idle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119 14.5 Drop Segment Functions (TCP only) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119 14.5.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119 14.5.2 Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120 – dropwithreset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120 – mlift dropafterack or fail . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120 – dropwithreset ignore fail . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120 14.6 Close Functions (TCP only) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
. . . . 121 14.6.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121 14.6.2 Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121 – tcp close . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121 – tcp drop and close . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121 XIII TCP1 hostLTS 123 15 Host LTS: Socket Calls 124 15.1 accept() (TCP only) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124 15.1.1 Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124 15.1.2 Common cases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125 15.1.3 API . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125 15.1.4 Model details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125 15.1.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125 15.1.6 Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126 – accept 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126 – accept 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127 – accept 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127 – accept 4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128 – accept 5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129 – accept 6 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129 – accept 7 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130 15.2 bind() (TCP and UDP) . . . . . . . . . . . . . . . . . . . 
. . . . . . . . . . . . . . . . . . . . . 130 15.2.1 Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131 15.2.2 Common cases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131 15.2.3 API . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131 15.2.4 Model details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132 15.2.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132 15.2.6 Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133 – bind 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133 – bind 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134 – bind 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134 – bind 5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135 – bind 7 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135 – bind 9 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135 15.3 close() (TCP and UDP) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136 15.3.1 Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137 15.3.2 Common cases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137 15.3.3 API . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137 15.3.4 Model details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137 15.3.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137 15.3.6 Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
. . . . . . . . . . . . . . . . . 138 – close 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138 – close 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138 – close 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139 – close 4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140 – close 5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141 – close 6 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142 – close 7 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142 – close 8 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143 – close 10 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144 15.4 connect() (TCP and UDP) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145 15.4.1 Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146 15.4.2 Common cases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147 15.4.3 API . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147 15.4.4 Model details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147 15.4.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148 15.4.6 Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148 – connect 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148 – connect 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152 – connect 3 . . . . . . . . . . . . . . . . . . . . . . . . 
. . . . . . . . . . . . . . . . . . . 152 – connect 4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153 – connect 4a . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154 – connect 5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154 – connect 5a . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155 – connect 5b . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156 – connect 5c . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157 – connect 5d . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157 – connect 6 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158 – connect 7 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158 – connect 8 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159 – connect 9 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160 – connect 10 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161 15.5 disconnect() (TCP and UDP) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161 15.5.1 Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162 15.5.2 Common cases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162 15.5.3 API . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162 15.5.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163 15.5.5 Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163 – disconnect 4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
163 – disconnect 5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164 – disconnect 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164 – disconnect 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165 – disconnect 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166 15.6 dup() (TCP and UDP) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166 15.6.1 Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166 15.6.2 Common cases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167 15.6.3 API . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167 15.6.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167 15.6.5 Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167 – dup 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167 – dup 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167 15.7 dupfd() (TCP and UDP) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168 15.7.1 Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168 15.7.2 Common cases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168 15.7.3 API . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168 15.7.4 Model details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169 15.7.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169 15.7.6 Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
169 – dupfd 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169 – dupfd 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170 – dupfd 4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170 15.8 getfileflags() (TCP and UDP) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170 15.8.1 Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171 15.8.2 Common cases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171 15.8.3 API . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171 15.8.4 Model details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171 15.8.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171 15.8.6 Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171 – getfileflags 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171 15.9 getifaddrs() (TCP and UDP) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172 15.9.1 Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172 15.9.2 Common cases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172 15.9.3 API . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172 15.9.4 Model details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173 15.9.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173 15.9.6 Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173 – getifaddrs 1 . . . . . . . . . . . . . . . . . . . . . 
. . . . . . . . . . . . . . . . . . . . 173 15.10 getpeername() (TCP and UDP) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173 15.10.1 Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174 15.10.2 Common cases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174 15.10.3 API . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174 15.10.4 Model details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174 15.10.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175 15.10.6 Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175 – getpeername 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175 – getpeername 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 176 15.11 getsockbopt() (TCP and UDP) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 176 15.11.1 Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177 15.11.2 Common cases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177 15.11.3 API . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177 15.11.4 Model details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177 15.11.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 178 15.11.6 Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 178 – getsockbopt 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 178 – getsockbopt 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 178 15.12 getsockerr() (TCP and UDP) . . . . . . . 
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179 15.12.1 Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179 15.12.2 Common cases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179 15.12.3 API . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179 15.12.4 Model details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 180 15.12.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 180 15.12.6 Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 180 – getsockerr 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 180 – getsockerr 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 180 15.13 getsocklistening() (TCP and UDP) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181 15.13.1 Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181 15.13.2 Common cases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181 15.13.3 API . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181 15.13.4 Model details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182 15.13.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182 15.13.6 Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182 – getsocklistening 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182 – getsocklistening 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182 – getsocklistening 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183 15.14 getsockname() (TCP and UDP) . 
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183 15.14.1 Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184 15.14.2 Common cases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184 15.14.3 API . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184 15.14.4 Model details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184 15.14.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185 15.14.6 Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185 – getsockname 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185 – getsockname 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185 – getsockname 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 186 15.15 getsocknopt() (TCP and UDP) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187 15.15.1 Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187 15.15.2 Common cases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187 15.15.3 API . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187 15.15.4 Model details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188 15.15.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188 15.15.6 Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188 – getsocknopt 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188 – getsocknopt 4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
188 15.16 getsocktopt() (TCP and UDP) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189 15.16.1 Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189 15.16.2 Common cases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189 15.16.3 API . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189 15.16.4 Model details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 190 15.16.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 190 15.16.6 Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 190 – getsocktopt 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 190 – getsocktopt 4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191 15.17 listen() (TCP only) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191 15.17.1 Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192 15.17.2 Common cases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192 15.17.3 API . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192 15.17.4 Model details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192 15.17.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193 15.17.6 Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193 – listen 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193 – listen 1b . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 194 – listen 1c . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
. . . . . . . . . . . . 194 – listen 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 195 – listen 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 195 – listen 4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 196 – listen 5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 197 – listen 7 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 197 15.18 pselect() (TCP and UDP) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 198 15.18.1 Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 198 15.18.2 Common cases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 198 15.18.3 API . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199 15.18.4 Model details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 200 15.18.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 200 15.18.6 Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 200 – pselect 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 200 – soreadable . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202 – sowriteable . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202 – soexceptional . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203 – pselect 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203 – pselect 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203 – pselect 4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
. . . 204 – pselect 5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205 – pselect 6 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205 15.19 recv() (TCP only) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 206 15.19.1 Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207 15.19.2 Common cases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 208 15.19.3 API . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 208 15.19.4 Model details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209 15.19.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209 15.19.6 Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209 – recv 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209 – recv 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 211 – recv 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 211 – recv 4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 213 – recv 5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 214 – recv 6 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 214 – recv 7 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 215 – recv 8 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 215 – recv 8a . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 216 – recv 9 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
. . . . . . . . . . . . . . 217 15.20 recv() (UDP only) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 218 15.20.1 Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 218 15.20.2 Common cases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219 15.20.3 API . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219 15.20.4 Model details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 220 15.20.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 220 15.20.6 Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 221 – recv 11 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 221 – recv 12 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222 – recv 13 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222 – recv 14 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 223 – recv 15 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 224 – recv 16 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 224 – recv 17 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 225 – recv 20 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 225 – recv 21 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 227 – recv 22 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 227 – recv 23 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 228 – recv 24 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
. . . . . 228 15.21 send() (TCP only) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 229 15.21.1 Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 229 15.21.2 Common cases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 230 15.21.3 API . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 230 15.21.4 Model details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 230 15.21.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 231 15.21.6 Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 231 – send 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 231 – send 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 234 – send 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 235 – send 3a . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 235 – send 4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 236 – send 5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 237 – send 5a . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 237 – send 6 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 237 – send 7 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 238 – send 8 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 239 15.22 send() (UDP only) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 239 15.22.1 Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
. 240 15.22.2 Common cases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 241 15.22.3 API . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 241 15.22.4 Model details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 241 15.22.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 242 15.22.6 Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 243 – send 9 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 243 – send 10 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 244 – send 11 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 245 – send 12 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 246 – send 13 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 247 – send 14 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 247 – send 15 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 248 – send 16 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 249 – send 17 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 249 – send 18 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 250 – send 19 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 250 – send 21 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 251 – send 22 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 252 – send 23 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
15.23 setfileflags() (TCP and UDP)
  15.23.1 Errors
  15.23.2 Common cases
  15.23.3 API
  15.23.4 Model details
  15.23.5 Summary
  15.23.6 Rules
    – setfileflags_1
15.24 setsockbopt() (TCP and UDP)
  15.24.1 Errors
  15.24.2 Common cases
  15.24.3 API
  15.24.4 Model details
  15.24.5 Summary
  15.24.6 Rules
    – setsockbopt_1
    – setsockbopt_2
15.25 setsocknopt() (TCP and UDP)
  15.25.1 Errors
  15.25.2 Common cases
  15.25.3 API
  15.25.4 Model details
  15.25.5 Summary
  15.25.6 Rules
    – setsocknopt_1
    – setsocknopt_2
    – setsocknopt_4
15.26 setsocktopt() (TCP and UDP)
  15.26.1 Errors
  15.26.2 Common cases
  15.26.3 API
  15.26.4 Model details
  15.26.5 Summary
  15.26.6 Rules
    – setsocktopt_1
    – setsocktopt_4
    – setsocktopt_5
15.27 shutdown() (TCP and UDP)
  15.27.1 Errors
  15.27.2 Common cases
  15.27.3 API
  15.27.4 Model details
  15.27.5 Summary
  15.27.6 Rules
    – shutdown_1
    – shutdown_2
    – shutdown_3
    – shutdown_4
15.28 sockatmark() (TCP only)
  15.28.1 Errors
  15.28.2 Common cases
  15.28.3 API
  15.28.4 Model details
  15.28.5 Summary
  15.28.6 Rules
    – sockatmark_1
    – sockatmark_2
15.29 socket() (TCP and UDP)
  15.29.1 Errors
  15.29.2 Common cases
  15.29.3 API
  15.29.4 Model details
  15.29.5 Summary
  15.29.6 Rules
    – socket_1
    – socket_2
15.30 Miscellaneous (TCP and UDP)
  15.30.1 Errors
  15.30.2 Summary
  15.30.3 Rules
    – return_1
    – badf_1
    – notsock_1
    – intr_1
    – resourcefail_1
    – resourcefail_2

16 Host LTS: TCP Input Processing
16.1 Input Processing (TCP only)
  16.1.1 Summary
  16.1.2 Rules
    – deliver_in_1
    – deliver_in_1b
    – deliver_in_2
    – deliver_in_2a
    – deliver_in_3
    – di3_topstuff
    – di3_newackstuff
    – di3_ackstuff
    – di3_datastuff_really
    – di3_datastuff
    – di3_ststuff
    – di3_socks_update
    – deliver_in_3a
    – deliver_in_3b
    – deliver_in_3c
    – deliver_in_4
    – deliver_in_5
    – deliver_in_6
    – deliver_in_7
    – deliver_in_7a
    – deliver_in_7b
    – deliver_in_7c
    – deliver_in_7d
    – deliver_in_8
    – deliver_in_9

17 Host LTS: TCP Output
17.1 Output (TCP only)
  17.1.1 Summary
  17.1.2 Rules
    – deliver_out_1

18 Host LTS: TCP Timers
18.1 Timers (TCP only)
  18.1.1 Summary
  18.1.2 Rules
    – timer_tt_rexmtsyn_1
    – timer_tt_rexmt_1
    – timer_tt_persist_1
    – timer_tt_keep_1
    – timer_tt_2msl_1
    – timer_tt_delack_1
    – timer_tt_conn_est_1
    – timer_tt_fin_wait_2_1

19 Host LTS: UDP Input Processing
19.1 Input Processing (UDP only)
  19.1.1 Summary
  19.1.2 Rules
    – deliver_in_udp_1
    – deliver_in_udp_2
    – deliver_in_udp_3

20 Host LTS: ICMP Input Processing
20.1 Input Processing (ICMP only)
  20.1.1 Summary
  20.1.2 Rules
    – deliver_in_icmp_1
    – deliver_in_icmp_2
    – deliver_in_icmp_3
    – deliver_in_icmp_4
    – deliver_in_icmp_5
    – deliver_in_icmp_6
    – deliver_in_icmp_7

21 Host LTS: Network Input and Output
21.1 Input and Output (Network only)
  21.1.1 Summary
  21.1.2 Rules
    – deliver_in_99
    – deliver_in_99a
    – deliver_out_99
    – deliver_loop_99

22 Host LTS: BSD Trace Records and Interface State Changes
22.1 Trace Records and Interface State Changes (BSD only)
  22.1.1 Summary
  22.1.2 Rules
    – trace_1
    – trace_2
    – interface_1

23 Host LTS: Time Passage
23.1 Time Passage auxiliaries (TCP and UDP)
  23.1.1 Summary
  23.1.2 Rules
    – Time_Pass_timedoption
    – Time_Pass_tcpcb
    – Time_Pass_socket
    – fmap_every
    – fmap_every_pred
    – Time_Pass_host
23.2 Host transitions with time (TCP and UDP)
  23.2.1 Summary
  23.2.2 Rules
    – epsilon_1
    – epsilon_2
    – rn

XIV TCP1_evalSupport

24 Initial state
24.1 Initial state (TCP and UDP)
  24.1.1 Summary
  24.1.2 Rules
    – simple_ifd_eth
    – simple_ifd_lo
    – simple_rttab
    – tid_initial
    – simple_host
    – dummy_cb
    – dummy_socket
    – dummy_sockets
    – initial_host

Index

Part I
TCP1_utils

Chapter 1
Utility functions

This file contains various utility functions and definitions, for functions, lists, and numeric types, that are used throughout the specification.

1.1 Basic utilities

Basic utilities for functions, numbers, maps, and records.

1.1.1 Summary

funupd               update one point of a function
funupd_list          update multiple points of a function
clip_int_to_num      clip int to num
left_shift_num       left shift, written
right_shift_num      right shift, written
rounddown            round v down to multiple of bs, unless v < bs already
roundup              round v up to next multiple of bs; if v = k ∗ bs then no change
real_of_int          inject int into real
num_floor            num floor of real
num_floor_and_frac   num floor and fractional part of real
fm_exists            finite map exists, written ∃(k, v) :: fm. P(k, v)
onlywhen             used for conditional record updates

1.1.2 Rules

– update one point of a function:
    f ⊕ (x ↦ y) = λx′. if x′ = x then y else f x′

– update multiple points of a function:
    funupd_list f xys = foldl (λf (x, y). f ⊕ (x ↦ y)) f xys

– clip int to num:
    clip_int_to_num (i : int) = if i < 0 then 0 else num i

– left shift, written :
    left_shift_num (n : num) (i : num) = n ∗ 2 ∗∗ i

– right shift, written :
    right_shift_num (n : num) (i : num) = n div 2 ∗∗ i

– round v down to multiple of bs, unless v < bs
already:
    rounddown bs v = if v < bs then v else (v div bs) ∗ bs

– round v up to next multiple of bs; if v = k ∗ bs then no change:
    roundup bs v = ((v + (bs − 1)) div bs) ∗ bs

– inject int into real:
    real_of_int (i : int) = if i < 0 then −(real_of_num (num (−i))) else real_of_num (num i)

– num floor of real:
    num_floor (x : real) = least (n : num). real_of_num (n + 1) > x

– num floor and fractional part of real:
    num_floor_and_frac (x : real) =
      let n = least (n : num). real_of_num (n + 1) > x in (n, x − real_of_num n)

– finite map exists, written ∃(k, v) :: fm. P(k, v):
    fm_exists fm P = ∃k. k ∈ dom(fm) ∧ P(k, fm[k])

– used for conditional record updates:
    (x onlywhen b) = if b then K x else I

1.2 List utilities

This section contains a number of basic functions for manipulating lists.

1.2.1 Summary

SPLIT_REV_0       split worker function
SPLIT_REV         split a list after n elements, returning the reversed prefix and the remainder
SPLIT             split a list after n elements, returning the prefix and the remainder
TAKE              take the first n elements of a list
DROP              drop the first n elements of a list
TAKEWHILE_REV     split a list at first element not satisfying p, returning reversed prefix and remainder
TAKEWHILE         split a list at first element not satisfying p, returning prefix and remainder
REPLICATE         make a list of n copies of x
decr_list         decrement a list of nums by a num, dropping any that count below zero
NOTIN′            not in
MAP_OPTIONAL      map with optional result
CONCAT_OPTIONAL   concatenation of option list that drops all ∗s
ORDERINGS         the set of all orderings of a set
INSERT_ORDERED    insert ordered

1.2.2 Rules

– split worker function:
    (SPLIT_REV_0 0 ls rs = (ls, rs)) ∧
    (SPLIT_REV_0 (SUC n) ls (r :: rs) = SPLIT_REV_0 n (r :: ls) rs) ∧
    (SPLIT_REV_0 (SUC n) ls [ ] = (ls, [ ]))

– split a list after n elements, returning the reversed prefix and the remainder:
    SPLIT_REV n rs = SPLIT_REV_0 n [ ] rs

Rule version: $Id: TCP1_utilsScript.sml,v 1.69 2005/02/07 15:12:27 kw217 Exp $

– split a
list after n elements, returning the prefix and the remainder:
    SPLIT n rs = let (ls, rs) = SPLIT_REV n rs in (REVERSE ls, rs)

– take the first n elements of a list:
    TAKE n rs = let (ls, rs) = SPLIT_REV n rs in REVERSE ls

– drop the first n elements of a list:
    DROP n rs = let (ls, rs) = SPLIT_REV n rs in rs

– split a list at first element not satisfying p, returning reversed prefix and remainder:
    TAKEWHILE_REV p ls (r :: rs) = TAKEWHILE_REV p (if p r then (r :: ls) else ls) rs ∧
    TAKEWHILE_REV p ls [ ] = ls

– split a list at first element not satisfying p, returning prefix and remainder:
    TAKEWHILE p rs = REVERSE (TAKEWHILE_REV p [ ] rs)

– make a list of n copies of x:
    (REPLICATE 0 x = [ ]) ∧
    (REPLICATE (SUC n) x = x :: REPLICATE n x)

– decrement a list of nums by a num, dropping any that count below zero:
    ((decr_list : num → num list → num list) d [ ] = [ ]) ∧
    (decr_list d (n :: ns) = (if n < d then I else CONS (n − d)) (decr_list d ns))

– not in:
    (x ∉ y) = ¬(mem x y)

– map with optional result:
    MAP_OPTIONAL f (x :: xs) = append (case f x of ∗ → [ ] ‖ ↑ y → [y]) (MAP_OPTIONAL f xs) ∧
    MAP_OPTIONAL f [ ] = [ ]

– concatenation of option list that drops all ∗s:
    CONCAT_OPTIONAL xs = MAP_OPTIONAL I xs

– the set of all orderings of a set:
    ORDERINGS s l = (list_to_set l = s ∧ length l = card s)

– insert ordered:
    INSERT_ORDERED new old bad = filter (λfd. fd ∈ new ∨ fd ∈ bad) old

1.3 Assertions

This definition is an alias for false, which induces the checker to emit a special message indicating an assertion failure.

1.3.1 Summary

ASSERTION_FAILURE   assertion failure (causes checker to halt)

1.3.2 Rules

– assertion failure (causes checker to halt):
    ASSERTION_FAILURE (s : string) = F

Part II
TCP1_errors

Chapter 2
Error codes

This file contains the datatype of all possible error codes.
The names are generally the common Unix ones; in the case of Winsock, the obvious mapping is used. Not all error codes are used in the body of the specification; those that are used are described in the ‘Errors’ section of each socket call.

2.1 The type of errors

The union of all (relevant) errors on the supported architectures.

2.1.1 Summary

error

2.1.2 Rules

– :
    error = E2BIG
      | EACCES
      | EADDRINUSE
      | EADDRNOTAVAIL
      | EAFNOSUPPORT
      | EAGAIN
      | EWOULDBLOCK (* only used if EWOULDBLOCK ≠ EAGAIN *)
      | EALREADY
      | EBADF
      | EBADMSG
      | EBUSY
      | ECANCELED
      | ECHILD
      | ECONNABORTED
      | ECONNREFUSED
      | ECONNRESET
      | EDEADLK
      | EDESTADDRREQ
      | EDOM
      | EDQUOT
      | EEXIST
      | EFAULT
      | EFBIG
      | EHOSTUNREACH
      | EIDRM
      | EILSEQ
      | EINPROGRESS
      | EINTR
      | EINVAL
      | EIO
      | EISCONN
      | EISDIR
      | ELOOP
      | EMFILE
      | EMLINK
      | EMSGSIZE
      | EMULTIHOP
      | ENAMETOOLONG
      | ENETDOWN
      | ENETRESET
      | ENETUNREACH
      | ENFILE
      | ENOBUFS
      | ENODATA
      | ENODEV
      | ENOENT
      | ENOEXEC
      | ENOLCK
      | ENOLINK
      | ENOMEM
      | ENOMSG
      | ENOPROTOOPT
      | ENOSPC
      | ENOSR
      | ENOSTR
      | ENOSYS
      | ENOTCONN
      | ENOTDIR
      | ENOTEMPTY
      | ENOTSOCK
      | ENOTSUP
      | ENOTTY
      | ENXIO
      | EOPNOTSUPP
      | EOVERFLOW
      | EPERM
      | EPIPE
      | EPROTO
      | EPROTONOSUPPORT
      | EPROTOTYPE
      | ERANGE
      | EROFS
      | ESPIPE
      | ESRCH
      | ESTALE
      | ETIME
      | ETIMEDOUT
      | ETXTBSY
      | EXDEV
      | ESHUTDOWN
      | EHOSTDOWN

Rule version: $Id: TCP1_errorsScript.sml,v 1.16 2004/12/09 15:43:08 kw217 Exp $

Part III
TCP1_signals

Chapter 3
Signal names

This file contains the datatype of signal names, with all the signals known to POSIX, Linux, and BSD. The specification does not model signal behaviour in detail, however: it treats them very nondeterministically.

3.1 The type of signals

The union of the signals supported by the target architectures. Names based on POSIX.
3.1.1 Summary

signal

3.1.2 Rules

– :
    signal = SIGABRT
      | SIGALRM
      | SIGBUS
      | SIGCHLD
      | SIGCONT
      | SIGFPE
      | SIGHUP
      | SIGILL
      | SIGINT
      | SIGKILL
      | SIGPIPE
      | SIGQUIT
      | SIGSEGV
      | SIGSTOP
      | SIGTERM
      | SIGTSTP
      | SIGTTIN
      | SIGTTOU
      | SIGUSR1
      | SIGUSR2
      | SIGPOLL (* XSI only *)
      | SIGPROF (* XSI only *)
      | SIGSYS (* XSI only *)
      | SIGTRAP (* XSI only *)
      | SIGURG
      | SIGVTALRM (* XSI only *)
      | SIGXCPU (* XSI only *)
      | SIGXFSZ (* XSI only *)

Rule version: $Id: TCP1_signalsScript.sml,v 1.12 2004/12/09 16:09:34 kw217 Exp $

Part IV
TCP1_baseTypes

Chapter 4
Base types

This file defines basic types used throughout the specification.

4.1 Network and OS-related types (TCP and UDP)

The specification distinguishes between the types port and ip, for which we do not use the zero values, and option types port option and ip option, with values ∗ (modelling the zero values) and ↑ p and ↑ i, modelling the non-zero values. Zero values are used as wildcards in some places and are forbidden in others; this typing lets that be captured explicitly.

4.1.1 Summary

port
ip
ifid
netmask
fd

4.1.2 Rules

– :
    port = Port of num (* really 16 bits, non-zero *)

  Description
  TCP or UDP port number, non-zero.

– :
    ip = ip of num (* really 32 bits, non-zero *)

  Description
  IPv4 address, non-zero.

– :
    ifid = LO | ETH of num

  Description
  Interface ID: either the loopback interface, or a numbered Ethernet interface.

– :
    netmask = NETMASK of num

  Description
  Network mask, represented as the number of 1 bits (as in a CIDR /nn suffix).

– :
    fd = FD of num

  Description
  File descriptor. On Unix-like systems this is a small nonnegative integer; on Windows it is an arbitrary handle.

4.2 File and socket flags (TCP and UDP)

This defines the types of various flags used in the sockets API: file flags, socket flags, message flags (used in send and recv calls), and socket types (used in socket calls). The socket flags are partitioned into those with boolean, natural-number and time-valued arguments.
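As a rough illustration of the partition just described, the following Python sketch groups the socket flags by the type of their argument. The flag names are the ones defined in this section; the enum classes, the value_kind helper, and the choice of Python are assumptions made purely for illustration and are not part of the specification.

```python
from enum import Enum, auto


class SockBFlag(Enum):
    """Socket flags taking a boolean argument (sockbflag)."""
    SO_BSDCOMPAT = auto()   # Linux only
    SO_REUSEADDR = auto()
    SO_KEEPALIVE = auto()
    SO_OOBINLINE = auto()
    SO_DONTROUTE = auto()


class SockNFlag(Enum):
    """Socket flags taking a natural-number argument (socknflag)."""
    SO_SNDBUF = auto()
    SO_RCVBUF = auto()
    SO_SNDLOWAT = auto()
    SO_RCVLOWAT = auto()


class SockTFlag(Enum):
    """Socket flags taking a time-valued argument (socktflag)."""
    SO_LINGER = auto()
    SO_SNDTIMEO = auto()
    SO_RCVTIMEO = auto()


def value_kind(flag):
    """Hypothetical helper: report which kind of argument a flag expects."""
    if isinstance(flag, SockBFlag):
        return "bool"
    if isinstance(flag, SockNFlag):
        return "num"
    if isinstance(flag, SockTFlag):
        return "time"
    raise TypeError("not a socket flag: %r" % (flag,))
```

Keeping the three groups as distinct types means that a boolean setter cannot be handed a time-valued flag by mistake, which is the point of the partition.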
4.2.1 Summary

filebflag
sockbflag
socknflag
socktflag
msgbflag
socktype

4.2.2 Rules

– :
    filebflag = O_NONBLOCK | O_ASYNC

  Description
  Boolean flags affecting the behaviour of an open file (or socket).
  O_NONBLOCK makes all operations on this file (or socket) nonblocking.
  O_ASYNC specifies whether signal driven I/O is enabled.

– :
    sockbflag = SO_BSDCOMPAT (* Linux only *)
      | SO_REUSEADDR
      | SO_KEEPALIVE
      | SO_OOBINLINE (* ? *)
      | SO_DONTROUTE

Rule version: $Id: TCP1_baseTypesScript.sml,v 1.62 2005/01/25 14:38:48 mf266 Exp $

  Description
  Boolean flags affecting the behaviour of a socket.
  SO_BSDCOMPAT Specifies whether the BSD semantics for delivery of ICMPs to UDP sockets with no peer address set is enabled.
  SO_DONTROUTE Requests that outgoing messages bypass the standard routing facilities. The destination shall be on a directly-connected network, and messages are directed to the appropriate network interface according to the destination address.
  SO_KEEPALIVE Keeps connections active by enabling the periodic transmission of messages, if this is supported by the protocol.
  SO_OOBINLINE Leaves received out-of-band data (data marked urgent) inline.
  SO_REUSEADDR Specifies that the rules used in validating addresses supplied to bind() should allow reuse of local ports, if this is supported by the protocol.

  Variations
  Linux   The flag SO_BSDCOMPAT is Linux-only.

– :
    socknflag = SO_SNDBUF | SO_RCVBUF | SO_SNDLOWAT | SO_RCVLOWAT

  Description
  Natural-number flags affecting the behaviour of a socket.
  SO_SNDBUF Specifies the send buffer size.
  SO_RCVBUF Specifies the receive buffer size.
  SO_SNDLOWAT Specifies the minimum number of bytes to process for socket output operations.
  SO_RCVLOWAT Specifies the minimum number of bytes to process for socket input operations.

– :
    socktflag = SO_LINGER | SO_SNDTIMEO | SO_RCVTIMEO

  Description
  Time-valued flags affecting the behaviour of a socket.
  SO_LINGER specifies a maximum duration that a close(fd) call is permitted to block.
  SO_RCVTIMEO specifies the timeout value for input operations.
  SO_SNDTIMEO specifies the timeout value for an output function blocking because flow control prevents data from being sent.

– :
    msgbflag = MSG_PEEK (* recv only, [in] *)
      | MSG_OOB (* recv and send, [in] *)
      | MSG_WAITALL (* recv only, [in] *)
      | MSG_DONTWAIT (* recv and send, [in] *)

  Description
  Boolean flags affecting the behaviour of a send or recv call.
  MSG_DONTWAIT: Do not block if there is no data available.
  MSG_OOB: Return out-of-band data.
  MSG_PEEK: Read data but do not remove it from the socket’s receive queue.
  MSG_WAITALL: Block until all n bytes of data are available.

– :
    socktype = SOCK_STREAM | SOCK_DGRAM

  Description
  The two different flavours of socket, as passed to the socket call, SOCK_STREAM for TCP and SOCK_DGRAM for UDP.

4.3 Language interaction types

The specification makes almost no assumptions about the programming language used to drive sockets calls. It supposes that calls are made by threads, with thread IDs of type tid, and that calls return values of the err types indicating success or failure. Our OCaml binding maps the latter to exceptions.

Values occurring as arguments or results of sockets calls are typed. There is a HOL type TLang_type of the names of these types and a HOL type TLang which is a disjoint union of all of their values. An inductive definition defines a typing relation between the two.

4.3.1 Summary

tid
err
TLang_type
TLang
tlang_typing

4.3.2 Rules

– :
    tid = TID of num

  Description
  Thread IDs.

– :
    err = OK of ′a | FAIL of error

  Description
  Each library call returns either success (OK v) or failure (FAIL err).
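The err type and the idea of mapping failures to exceptions can be sketched in Python as follows. The OK and FAIL constructors mirror the definition above; the SocketError exception, the unwrap helper, and the representation of error codes as strings are hypothetical choices for this sketch and say nothing about how the OCaml binding is actually written.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OK:
    """Successful result carrying the returned value (OK v)."""
    value: object


@dataclass(frozen=True)
class FAIL:
    """Failed result carrying an error code (FAIL err), e.g. "EBADF"."""
    error: str


class SocketError(Exception):
    """Hypothetical exception used when a FAIL result is unwrapped."""

    def __init__(self, error):
        super().__init__(error)
        self.error = error


def unwrap(result):
    """Return the value of an OK result; raise SocketError for a FAIL."""
    if isinstance(result, OK):
        return result.value
    if isinstance(result, FAIL):
        raise SocketError(result.error)
    raise TypeError("not an err value: %r" % (result,))
```

A caller that prefers exceptions would write unwrap(call(...)), while a caller that prefers explicit results can inspect the OK or FAIL value directly.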
– :
    TLang_type = TLty_int
      | TLty_bool
      | TLty_string
      | TLty_one
      | TLty_pair of (TLang_type # TLang_type)
      | TLty_list of TLang_type
      | TLty_lift of TLang_type
      | TLty_err of TLang_type
      | TLty_fd
      | TLty_ip
      | TLty_port
      | TLty_error
      | TLty_netmask
      | TLty_ifid
      | TLty_filebflag
      | TLty_sockbflag
      | TLty_socknflag
      | TLty_socktflag
      | TLty_socktype
      | TLty_tid
      | TLty_signal

  Description
  Type names for language types that are used in the sockets API.

– :
    TLang = TL_int of int
      | TL_bool of bool
      | TL_string of string
      | TL_one of ()
      | TL_pair of TLang # TLang
      | TL_list of TLang list
      | TL_option of TLang option
      | TL_err of TLang err
      | TL_fd of fd
      | TL_ip of ip
      | TL_port of port
      | TL_error of error
      | TL_netmask of netmask
      | TL_ifid of ifid
      | TL_filebflag of filebflag
      | TL_sockbflag of sockbflag
      | TL_socknflag of socknflag
      | TL_socktflag of socktflag
      | TL_socktype of socktype
      | TL_tid of tid
      | TL_signal of signal

  Description
  Language values.

– :
    (∀i. tlang_typing (TL_int i) TLty_int) ∧
    (∀b. tlang_typing (TL_bool b) TLty_bool) ∧
    (∀s. tlang_typing (TL_string s) TLty_string) ∧
    tlang_typing (TL_one ()) TLty_one ∧
    (∀p1 p2 ty1 ty2.
       tlang_typing p1 ty1 ∧ tlang_typing p2 ty2 ⟹
       tlang_typing (TL_pair (p1, p2)) (TLty_pair (ty1, ty2))) ∧
    (∀tl ty. (∀e. mem e tl ⟹ tlang_typing e ty) ⟹
       tlang_typing (TL_list tl) (TLty_list ty)) ∧
    (∀p ty. tlang_typing p ty ⟹ tlang_typing (TL_option (↑ p)) (TLty_lift ty)) ∧
    (∀ty. tlang_typing (TL_option ∗) (TLty_lift ty)) ∧
    (∀e ty. tlang_typing (TL_err (FAIL e)) (TLty_err ty)) ∧
    (∀p ty. tlang_typing p ty ⟹ tlang_typing (TL_err (OK p)) (TLty_err ty)) ∧
    (∀fd. tlang_typing (TL_fd fd) TLty_fd) ∧
    (∀i. tlang_typing (TL_ip i) TLty_ip) ∧
    (∀p. tlang_typing (TL_port p) TLty_port) ∧
    (∀e. tlang_typing (TL_error e) TLty_error) ∧
    (∀nm. tlang_typing (TL_netmask nm) TLty_netmask) ∧
    (∀ifid. tlang_typing (TL_ifid ifid) TLty_ifid) ∧
    (∀ff. tlang_typing (TL_filebflag ff) TLty_filebflag) ∧
    (∀sf. tlang_typing (TL_sockbflag sf) TLty_sockbflag) ∧
    (∀sf. tlang_typing (TL_socknflag sf) TLty_socknflag) ∧
    (∀sf. tlang_typing (TL_socktflag sf) TLty_socktflag) ∧
    (∀st. tlang_typing (TL_socktype st) TLty_socktype) ∧
    (∀tid. tlang_typing (TL_tid tid) TLty_tid) ∧
    (* (!l ty. tlang_typing (TL_ref (Loc (ty,l))) (TLty_ref ty)) /\ *)
    (* (!ex. tlang_typing (TL_exn ex) TLty_exn ) /\ *)
    (* (!p ty. tlang_typing p ty ==> *)
    (*   tlang_typing (TL_except (EOK p)) (TLty_except ty)) /\ *)
    (* (!ex ty. tlang_typing (TL_exn ex) TLty_exn ==> *)
    (*   tlang_typing (TL_except (EEX ex)) (TLty_except ty)) /\ *)
    (∀s. tlang_typing (TL_signal s) TLty_signal)

4.4 Time types

Time and duration are defined as type synonyms. Time must be non-negative and may be infinite; duration must be positive and finite.
4.4.1 Summary

time
type_abbrev duration
time_lt            written <
time_lte           written ≤
time_gt            written >
time_gte           written ≥
time_min           written min x y
time_max           written max x y
time_plus_dur      written +
time_minus_dur     written −
real_mult_time     written ∗
time_zero
duration
abstime
realopt_of_time
the_time           written the

4.4.2 Rules

– : time = ∞ | time of real

– : type_abbrev duration : real

– written < :
  ((time_lt : time → time → bool) (time x) (time y) = x < y) ∧
  (time_lt ∞ ys = F) ∧
  (time_lt xs ∞ = T)

– written ≤ :
  time_lte (time x) (time y) = x ≤ y ∧
  time_lte t ∞ = T ∧
  time_lte ∞ t = (t = ∞)

– written > : time_gt xs ys = time_lt ys xs

– written ≥ : time_gte xs ys = time_lte ys xs

– written min x y :
  time_min (time x) (time y) = time (min x y) ∧
  time_min (time x) ∞ = time x ∧
  time_min ∞ (time x) = time x ∧
  time_min ∞ ∞ = ∞

– written max x y :
  time_max (time x) (time y) = time (max x y) ∧
  time_max ∞ (time x) = ∞ ∧
  time_max (time x) ∞ = ∞ ∧
  time_max ∞ ∞ = ∞

– written + :
  ((time_plus_dur : time → duration → time) (time x) y = time (x + y)) ∧
  (time_plus_dur ∞ y = ∞)

– written − :
  ((time_minus_dur : time → duration → time) (time x) y = time (x − y)) ∧
  (time_minus_dur ∞ y = ∞)

– written ∗ :
  (real_mult_time : real → time → time) x (time y) = time (x ∗ y) ∧
  real_mult_time x ∞ = ∞

– : (0 : time) = time 0

– : (duration : num → num → duration) sec usec = $&sec + $&usec / 1000000

Description Some durations may be represented as duration sec usec, where sec and usec are both natural numbers.

– : (abstime : num → num → duration) sec usec = $&sec + $&usec / 1000000

Description Some times may be represented as abstime sec usec, where sec and usec are both natural numbers.
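The comparison and arithmetic rules for time can be exercised with a small executable sketch. It is written in Python rather than HOL, so it is only an illustration under stated assumptions: math.inf stands in for the ∞ constructor, finite times are exact fractions, and the function names simply mirror the rule names.

```python
import math
from fractions import Fraction

INF = math.inf  # assumption: stands in for the spec's ∞ constructor


def time_lt(x, y):
    # Same clause order as the rule: ∞ is less than nothing,
    # and any finite time is less than ∞.
    if x == INF:
        return False
    if y == INF:
        return True
    return x < y


def time_lte(x, y):
    if y == INF:
        return True
    if x == INF:
        return False  # time_lte ∞ t = (t = ∞), and t is finite here
    return x <= y


def time_plus_dur(t, d):
    # Adding a finite duration to ∞ leaves ∞ unchanged.
    return INF if t == INF else t + d


def duration(sec, usec):
    # $&sec + $&usec / 1000000, kept exact with rationals
    return Fraction(sec) + Fraction(usec, 1_000_000)
```

Note that time_lt and time_lte differ on the pair (∞, ∞): the strict comparison is false, the non-strict one is true.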
– : (realopt_of_time : time → real option) (time x) = ↑ x ∧
    realopt_of_time ∞ = ∗

– written the : the_time (time x) = x

4.5 Basic network types: sequence numbers (TCP only)

We have several flavours of TCP sequence numbers, all represented by 32-bit values: local sequence numbers, foreign sequence numbers, and timestamps. This helps prevent confusion. We also define tcp_seq_flip_sense, which converts a local to a foreign sequence number and vice versa.

4.5.1 Summary

type_abbrev byte
seq32
seq32_plus          written +
seq32_minus         written −
seq32_plus′         written +
seq32_minus′        written −
seq32_diff          written −
seq32_lt            written <
seq32_leq           written ≤
seq32_gt            written >
seq32_geq           written ≥
seq32_fromto
seq32_coerce
seq32_min           written min x y
seq32_max           written max x y
tcpLocal
tcpForeign
type_abbrev tcp_seq_local
type_abbrev tcp_seq_foreign
tcp_seq_local
tcp_seq_foreign
tcp_seq_local_to_foreign
tcp_seq_foreign_to_local
tstamp
type_abbrev ts_seq
ts_seq

4.5.2 Rules

– : type_abbrev byte : char

– : seq32 = SEQ32 of ′a => word32

Description 32-bit wraparound sequence numbers, as used in TCP, along with their special arithmetic.
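As an informal illustration of the "special arithmetic", the following Python sketch models a sequence number as an integer modulo 2^32 and compares two numbers through the signed interpretation of their difference. The phantom type tag ′a, which keeps local, foreign, and timestamp values apart in the HOL type, is omitted here; the helper w2i is an assumption standing in for the word-to-integer conversion.

```python
MOD = 1 << 32


def w2i(w):
    """Interpret a 32-bit word as a signed two's-complement integer."""
    w %= MOD
    return w - MOD if w >= (1 << 31) else w


def seq32_plus(n, m):
    # Addition wraps around at 2**32.
    return (n + m) % MOD


def seq32_diff(n, m):
    # Signed distance from m to n, taken around the 32-bit circle.
    return w2i(n - m)


def seq32_lt(n, m):
    # n is "before" m when the signed difference is negative.
    return seq32_diff(n, m) < 0
```

With this comparison, a number just below the wraparound point counts as earlier than a small number just after it, which is the behaviour TCP needs.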
– written + : seq32_plus (SEQ32 a n) (m : num) = SEQ32 a (n + n2w m)

– written − : seq32_minus (SEQ32 a n) (m : num) = SEQ32 a (n − n2w m)

– written + : seq32_plus′ (SEQ32 a n) (m : int) = SEQ32 a (n + i2w m)

– written − : seq32_minus′ (SEQ32 a n) (m : int) = SEQ32 a (n − i2w m)

– written − : seq32_diff (SEQ32 (a : ′a) n) (SEQ32 (b : ′a) m) = w2i (n − m)

– written < : seq32_lt (n : ′a seq32) (m : ′a seq32) = ((n − m) : int) < 0

– written ≤ : seq32_leq (n : ′a seq32) (m : ′a seq32) = ((n − m) : int) ≤ 0

– written > : seq32_gt (n : ′a seq32) (m : ′a seq32) = ((n − m) : int) > 0

– written ≥ : seq32_geq (n : ′a seq32) (m : ′a seq32) = ((n − m) : int) ≥ 0

– : seq32_fromto (a : ′a) b (SEQ32 (c : ′a) n) = SEQ32 b n

– : seq32_coerce (SEQ32 a n) = SEQ32 ARB n

– written min x y : seq32_min (n : ′a seq32) (m : ′a seq32) = if n < m then n else m

– written max x y : seq32_max (n : ′a seq32) (m : ′a seq32) = if n < m then m else n

– : tcpLocal = TcpLocal

– : tcpForeign = TcpForeign

– : type_abbrev tcp_seq_local : tcpLocal seq32

– : type_abbrev tcp_seq_foreign : tcpForeign seq32

– : tcp_seq_local (n : word32) = SEQ32 TcpLocal n

– : tcp_seq_foreign (n : word32) = SEQ32 TcpForeign n

– : tcp_seq_local_to_foreign = seq32_coerce : tcp_seq_local → tcp_seq_foreign

– : tcp_seq_foreign_to_local = seq32_coerce : tcp_seq_foreign → tcp_seq_local

– : tstamp = Tstamp

– : type_abbrev ts_seq : tstamp seq32

– : ts_seq (n : word32) = SEQ32 Tstamp n

Part V TCP1_netTypes

Chapter 5 Network datagram types

This file defines the types of the datagrams that appear on the network, with an IP message being either a TCP segment, a UDP datagram, or an ICMP datagram.
These types abstract from most fields of the IP header: version, header length, type of service, identification, DF, MF, and fragment offset, time to live, header checksum, and IP options. They faithfully model the IP header fields: protocol (TCP, UDP, or ICMP), total length, source address, and destination address.

The tcpSegment type abstracts from the TCP checksum, reserved, and padding fields of the TCP header, from the ordering of TCP options, and from ill-formed TCP options. It faithfully models all other fields.

The udpDatagram type abstracts from the UDP checksum but faithfully models all other fields.

Lengths are represented by allowing simple lists of data bytes rather than explicit length fields.

All these types collapse the encapsulation of TCP/UDP/ICMP within IP, flattening them into single records, to reduce syntactic noise throughout the specification.

For ease of comparison we reproduce the RFC 791/793/768 header formats below.

3.1. Internet Header Format

A summary of the contents of the internet header follows:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |Version|  IHL  |Type of Service|          Total Length         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |         Identification        |Flags|      Fragment Offset    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Time to Live |    Protocol   |         Header Checksum       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                       Source Address                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Destination Address                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Options                    |    Padding    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

TCP Header Format

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |          Source Port          |       Destination Port        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        Sequence Number                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Acknowledgment Number                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Data |           |U|A|P|R|S|F|                               |
   | Offset| Reserved  |R|C|S|S|Y|I|            Window             |
   |       |           |G|K|H|T|N|N|                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           Checksum            |         Urgent Pointer        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Options                    |    Padding    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                             data                              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

    0      7 8     15 16    23 24    31
   +--------+--------+--------+--------+
   |     Source      |   Destination   |
   |      Port       |      Port       |
   +--------+--------+--------+--------+
   |                 |                 |
   |     Length      |    Checksum     |
   +--------+--------+--------+--------+
   |
   |          data octets ...
   +---------------- ...

5.1 TCP segments (TCP only)

TCP segments (really datagrams, since we include the IP data) are modelled as follows.
5.1.1 Summary

tcpSegment    TCP datagram type
sane_seg      segment well-formedness test (physical constraints imposed by format)

5.1.2 Rules

– TCP datagram type :
tcpSegment =〈[
  is1 : ip option; (* source IP *)
  is2 : ip option; (* destination IP *)
  ps1 : port option; (* source port *)
  ps2 : port option; (* destination port *)
  seq : tcp_seq_local; (* sequence number *)
  ack : tcp_seq_foreign; (* acknowledgment number *)
  URG : bool;
  ACK : bool;
  PSH : bool;
  RST : bool;
  SYN : bool;
  FIN : bool;
  win : word16; (* window size (unsigned) *)
  ws : byte option; (* TCP option: window scaling; typically 0..14 *)
  urp : word16; (* urgent pointer (unsigned) *)
  mss : word16 option; (* TCP option: maximum segment size (unsigned) *)
  ts : (ts_seq # ts_seq) option; (* TCP option: RFC1323 timestamp value and echo-reply *)
  data : byte list ]〉

Description The use of "local" and "foreign" here is with respect to the sending TCP.

– segment well-formedness test (physical constraints imposed by format) :
sane_seg seg = length seg.data < (65536 − 40)

Rule version: $Id: TCP1_netTypesScript.sml,v 1.45 2004/12/09 15:43:08 kw217 Exp $

5.2 UDP datagrams (UDP only)

UDP datagrams are very simple. They are modelled as follows.

5.2.1 Summary

udpDatagram    UDP datagram type
sane_udpdgm    message well-formedness test (physical constraints imposed by format)

5.2.2 Rules

– UDP datagram type :
udpDatagram =〈[
  is1 : ip option; (* source IP *)
  is2 : ip option; (* destination IP *)
  ps1 : port option; (* source port *)
  ps2 : port option; (* destination port *)
  data : byte list ]〉

– message well-formedness test (physical constraints imposed by format) :
sane_udpdgm dgm = length dgm.data < (65536 − 20 − 8)

5.3 ICMP datagrams (TCP and UDP)

ICMP messages have type and code fields, both 8 bits wide. The specification deals only with some of these types, as characterised in the HOL type icmpType below.
For each type we identify some or all of the codes that have conventional symbolic representations, but to ensure the model can faithfully represent arbitrary codes each code (HOL type) also has an OTHER constructor carrying a byte. The values carried are assumed not to overlap with the symbolically-represented values. In retrospect, there seems to be no reason not to have types and codes simply be particular byte constants.

5.3.1 Summary

protocol                   protocol type for use in ICMP messages
icmp_unreach_code
icmp_source_quench_code
icmp_redirect_code
icmp_time_exceeded_code
icmp_paramprob_code
icmpType
icmpDatagram               ICMP datagram type

5.3.2 Rules

– protocol type for use in ICMP messages :
protocol = PROTO_TCP | PROTO_UDP

– : icmp_unreach_code =
    NET | HOST | PROTOCOL | PORT | SRCFAIL
  | NEEDFRAG of word16 option
  | NET_UNKNOWN | HOST_UNKNOWN | ISOLATED | NET_PROHIB | HOST_PROHIB
  | TOSNET | TOSHOST | FILTER_PROHIB | PREC_VIOLATION | PREC_CUTOFF
  | OTHER of byte # word32 (* really want this not to overlap *)

– : icmp_source_quench_code =
    QUENCH
  | SQ_OTHER of byte # word32 (* written OTHER *)

– : icmp_redirect_code =
    RD_NET (* written NET *)
  | RD_HOST (* written HOST *)
  | RD_TOSNET (* written TOSNET *)
  | RD_TOSHOST (* written TOSHOST *)
  | RD_OTHER of byte # word32 (* written OTHER *)

– : icmp_time_exceeded_code =
    INTRANS | REASS
  | TX_OTHER of byte # word32 (* written OTHER *)

– : icmp_paramprob_code =
    BADHDR | NEEDOPT
  | PP_OTHER of byte # word32 (* written OTHER *)

– : icmpType =
    ICMP_UNREACH of icmp_unreach_code
  | ICMP_SOURCE_QUENCH of icmp_source_quench_code
  | ICMP_REDIRECT of icmp_redirect_code
  | ICMP_TIME_EXCEEDED of icmp_time_exceeded_code
  | ICMP_PARAMPROB of icmp_paramprob_code
  (* FreeBSD 4.6-RELEASE also does: ICMP_ECHO, ICMP_TSTMP, ICMP_MASKREQ *)

– ICMP datagram type :
icmpDatagram =〈[
  is1 : ip option; (* this is the sender of this ICMP *)
  is2 : ip option; (* this is the intended receiver of this ICMP *)
  (* we assume the enclosed IP always has at least 8 bytes of data, i.e., enough for all the fields below *)
  is3 : ip option; (* source of enclosed IP datagram *)
  is4 : ip option; (* destination of enclosed IP datagram *)
  ps3 : port option; (* source port *)
  ps4 : port option; (* destination port *)
  proto : protocol; (* protocol *)
  seq : tcp_seq_local option; (* seq *)
  t : icmpType ]〉

5.4 IP messages (TCP and UDP)

An IP datagram is (for our purposes) either a TCP segment, an ICMP datagram, or a UDP datagram. We use the type msg for IP datagrams. IP datagrams may be checked for sanity, and may have their is1 and is2 fields inspected.

5.4.1 Summary

msg         IP message type
sane_msg    message well-formedness test (physical constraints imposed by format)
msg_is1     source IP of a message, written x.is1
msg_is2     destination IP of a message, written x.is2

5.4.2 Rules

– IP message type :
msg = TCP of tcpSegment | ICMP of icmpDatagram | UDP of udpDatagram

– message well-formedness test (physical constraints imposed by format) :
  sane_msg (TCP seg) = sane_seg seg ∧
  sane_msg (ICMP dgm) = T ∧
  sane_msg (UDP dgm′) = sane_udpdgm dgm′

– source IP of a message, written x.is1 :
  msg_is1 (TCP seg) = seg.is1 ∧
  msg_is1 (ICMP dgm) = dgm.is1 ∧
  msg_is1 (UDP dgm′) = dgm′.is1

– destination IP of a message, written x.is2 :
  msg_is2 (TCP seg) = seg.is2 ∧
  msg_is2 (ICMP dgm) = dgm.is2 ∧
  msg_is2 (UDP dgm′) = dgm′.is2

Part VI TCP1_LIBinterface

Chapter 6 System call types

This file gives the system call API that is modelled by the specification.
6.1 The interface (TCP and UDP)

The Sockets API is modelled by the library interface below. As discussed in volume 1, we refine the C interface slightly:

• We use ML-style datatypes, abstracting from pointers and length parameters.

• Where the C API provides multiple entry points to a single operation (such as send/sendto/sendmsg/write, or pselect/select) we combine them all into a single general function.

• Certain special cases of general functions (such as getsockopt with SO_ERROR, ioctl with SIOCATMARK, and fcntl with F_GETFL) have been pulled out into separate functions (getsockerr, sockatmark (following POSIX), and getfileflags respectively).

• Features not relevant to TCP or UDP (e.g. Unix domain sockets), or historical artifacts (such as the address family / protocol family distinction in socket) are elided.

The HOL type LIB_interface defines the calls. It takes their arguments to be the relevant HOL types (rather than values of TLang) so that HOL typechecking ensures consistency. The return types of the calls cannot be embedded so neatly within the HOL type system, so an additional retType function defines these (and HOL typechecking does not check this data at present).
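The split between typed call arguments and a separately tabulated return type can be sketched as follows. This is a Python illustration only, not the HOL definition: it covers four calls, the class and field names are assumptions, and the return types are written as plain nested tuples and strings rather than TLang_type values.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Accept:
    fd: int


@dataclass(frozen=True)
class Bind:
    fd: int
    ip: Optional[str]    # None models an unspecified address
    port: Optional[int]  # None models an unspecified port


@dataclass(frozen=True)
class Listen:
    fd: int
    backlog: int


@dataclass(frozen=True)
class Socket:
    socktype: str


# Return types live in a separate table, in the spirit of retType:
# nothing ties an entry to the behaviour of the call it describes.
RET_TYPE = {
    Accept: ("pair", "fd", ("pair", "ip", "port")),
    Bind: "one",
    Listen: "one",
    Socket: "fd",
}


def ret_type(call):
    return RET_TYPE[type(call)]
```

The argument side is checked by the dataclass constructors, whereas the return side is only a lookup, which mirrors the remark that the return-type data is not checked by HOL typechecking.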
6.1.1 Summary

LIB_interface
retType

6.1.2 Rules

– : LIB_interface =
    accept of fd
  | bind of (fd # ip option # port option)
  | close of fd
  | connect of (fd # ip # port option)
  | disconnect of fd
  | dup of fd
  | dupfd of (fd # int)
  | getfileflags of fd
  | getifaddrs of ()
  | getpeername of fd
  | getsockbopt of (fd # sockbflag)
  | getsockerr of fd
  | getsocklistening of fd
  | getsockname of fd
  | getsocknopt of (fd # socknflag)
  | getsocktopt of (fd # socktflag)
  | listen of (fd # int)
  | pselect of (fd list # fd list # fd list # (int # int) option # signal list option)
  | recv of (fd # int # msgbflag list)
  | send of (fd # (ip # port) option # string # msgbflag list)
  | setfileflags of (fd # filebflag list)
  | setsockbopt of (fd # sockbflag # bool)
  | setsocknopt of (fd # socknflag # int)
  | setsocktopt of (fd # socktflag # (int # int) option)
  | shutdown of (fd # bool # bool)
  | sockatmark of fd
  | socket of socktype

Description Sockets calls with their argument types.

– :
  retType (accept _) = TLty_pair (TLty_fd, TLty_pair (TLty_ip, TLty_port)) ∧
  retType (bind _) = TLty_one ∧
  retType (close _) = TLty_one ∧
  retType (connect _) = TLty_one ∧
  retType (disconnect _) = TLty_one ∧
  retType (dup _) = TLty_fd ∧
  retType (dupfd _) = TLty_fd ∧
  retType (getfileflags _) = TLty_list TLty_filebflag ∧
  retType (getifaddrs _) = TLty_list (TLty_pair (TLty_ifid, TLty_pair (TLty_ip, TLty_pair ((TLty_list TLty_ip), TLty_netmask)))) ∧
  retType (getpeername _) = TLty_pair (TLty_ip, TLty_port) ∧
  retType (getsockbopt _) = TLty_bool ∧
  retType (getsockerr _) = TLty_one ∧
  retType (getsocklistening _) = TLty_bool ∧
  retType (getsockname _) = TLty_pair (TLty_lift TLty_ip, TLty_lift TLty_port) ∧
  retType (getsocknopt _) = TLty_int ∧
  retType (getsocktopt _) = TLty_lift (TLty_pair (TLty_int, TLty_int)) ∧
  retType (listen _) = TLty_one ∧
  retType (pselect _) = TLty_pair (TLty_list TLty_fd, TLty_pair (TLty_list TLty_fd, TLty_list TLty_fd)) ∧
  retType (recv _) = TLty_pair (TLty_string, TLty_lift (TLty_pair (TLty_pair (TLty_ip, TLty_port), TLty_bool))) ∧
  retType (send _) = TLty_string ∧
  retType (setfileflags _) = TLty_one ∧
  retType (setsockbopt _) = TLty_one ∧
  retType (setsocknopt _) = TLty_one ∧
  retType (setsocktopt _) = TLty_one ∧
  retType (shutdown _) = TLty_one ∧
  retType (sockatmark _) = TLty_bool ∧
  retType (socket _) = TLty_fd

Description Return types of sockets calls.

Rule version: $Id: TCP1_LIBinterfaceScript.sml,v 1.37 2005/02/07 16:31:21 kw217 Exp $

6.2 Useful groups of calls (TCP and UDP)

For some purposes it is useful to group together all the system calls that expect a single fd, and those that expect a socket fd.

6.2.1 Summary

fd_op
fd_sockop

6.2.2 Rules

– : fd_op fd opn =
  ( opn = accept(fd) ∨
    (∃is ps. opn = bind(fd, is, ps)) ∨
    opn = close(fd) ∨
    (∃i p. opn = connect(fd, i, p)) ∨
    opn = disconnect(fd) ∨
    opn = dup(fd) ∨
    (∃fd′. opn = dupfd(fd, fd′)) ∨
    (opn = getfileflags(fd)) ∨
    (∃flags. opn = setfileflags(fd, flags)) ∨
    opn = getsockname(fd) ∨
    opn = getpeername(fd) ∨
    (∃sfb. opn = getsockbopt(fd, sfb)) ∨
    (∃sfn. opn = getsocknopt(fd, sfn)) ∨
    (∃sft. opn = getsocktopt(fd, sft)) ∨
    (∃sfb b. opn = setsockbopt(fd, sfb, b)) ∨
    (∃sfn n. opn = setsocknopt(fd, sfn, n)) ∨
    (∃sft t. opn = setsocktopt(fd, sft, t)) ∨
    (∃n. opn = listen(fd, n)) ∨
    (∃n opt. opn = recv(fd, n, opt)) ∨
    (∃data opt. opn = send(fd, data, opt)) ∨
    (∃r w. opn = shutdown(fd, r, w)) ∨
    opn = sockatmark(fd) ∨
    opn = getsockerr(fd) ∨
    opn = getsocklistening(fd) )

Description Calls that expect a (single) fd.
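The grouping predicate can be read as "opn is one of the single-fd calls, applied to this particular fd". A minimal Python sketch of that reading follows; it covers only four calls, and the class names are assumptions made for the illustration. The companion predicate for socket-only calls is the same test over a smaller set of call forms (without close, dup, dupfd, getfileflags, and setfileflags).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Close:
    fd: int


@dataclass(frozen=True)
class Dup:
    fd: int


@dataclass(frozen=True)
class Listen:
    fd: int
    backlog: int


@dataclass(frozen=True)
class Socket:  # takes no fd, so it is in neither group
    socktype: str


FD_CALLS = (Close, Dup, Listen)   # calls that expect a single fd
SOCK_FD_CALLS = (Listen,)         # calls that expect a socket fd


def fd_op(fd, opn):
    # The existential quantifiers over the other arguments become
    # "any value of the remaining fields" here.
    return isinstance(opn, FD_CALLS) and opn.fd == fd


def fd_sockop(fd, opn):
    return isinstance(opn, SOCK_FD_CALLS) and opn.fd == fd
```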
– : fd_sockop fd opn =
  ( opn = accept(fd) ∨
    (∃is ps. opn = bind(fd, is, ps)) ∨
    (∃i p. opn = connect(fd, i, p)) ∨
    opn = disconnect(fd) ∨
    opn = getsockname(fd) ∨
    opn = getpeername(fd) ∨
    (∃sfb. opn = getsockbopt(fd, sfb)) ∨
    (∃sfn. opn = getsocknopt(fd, sfn)) ∨
    (∃sft. opn = getsocktopt(fd, sft)) ∨
    (∃sfb b. opn = setsockbopt(fd, sfb, b)) ∨
    (∃sfn n. opn = setsocknopt(fd, sfn, n)) ∨
    (∃sft t. opn = setsocktopt(fd, sft, t)) ∨
    (∃n. opn = listen(fd, n)) ∨
    (∃n opt. opn = recv(fd, n, opt)) ∨
    (∃data opt. opn = send(fd, data, opt)) ∨
    (∃r w. opn = shutdown(fd, r, w)) ∨
    opn = sockatmark(fd) ∨
    opn = getsockerr(fd) ∨
    opn = getsocklistening(fd) )

Description Calls that expect a (single) socket fd.

Part VII TCP1_host0

Chapter 7 Host LTS labels and rule categories

This file defines the labels for the host labelled transition system, characterising the possible interactions between a host and its environment. It also defines various categories for the host LTS rules.

7.1 Transition labels (TCP and UDP)

Host transition labels.

7.1.1 Summary

Lhost0    Host transition labels

7.1.2 Rules

– Host transition labels :
Lhost0 =
  (* library interface *)
    Lh_call of tid # LIB_interface (* invocation of LIB call, written e.g. tid·(socket(socktype)) *)
  | Lh_return of tid # TLang (* return result of LIB call, written tid·v *)
  (* message transmission and receipt *)
  | Lh_senddatagram of msg (* output of message to the network, written msg *)
  | Lh_recvdatagram of msg (* input of message from the network, written msg *)
  | Lh_loopdatagram of msg (* loopback output/input, written ←−−→msg *)
  (* connectivity changes *)
  | Lh_interface of ifid # bool (* set interface status to boolean up, written Lh_interface(ifid, up) *)
  (* miscellaneous *)
  | τ (* internal transition, written τ *)
  | Lh_epsilon of duration (* time passage, written dur *)
  | Lh_trace of tracerecord (* TCP trace record, written Lh_trace tr *)

7.2 Rule categories (TCP and UDP)

A rule carries a number of flags: the protocol it relates to, its status (success, failure, or 'bad' failure), its category (fast or slow system call, network, etc.), and its urgency (whether it must fire immediately, or may be delayed).

7.2.1 Summary

rule_proto
rule_status
rule_cat
urgent
nonurgent
is_urgent

7.2.2 Rules

– : rule_proto = rp_tcp | rp_udp | rp_all

Description Rules are classified as to whether they relate to TCP, to UDP, or to both.

– : rule_status = succeed | fail | badfail

Description Socket call rules marked succeed construct an OK v value to be returned to the calling thread, whereas those marked fail or badfail construct a FAIL e error to be returned. The badfail rules are those involving (unusual) lack of resources, e.g. of ephemeral ports, file descriptors, or kernel memory. They are distinguished from the fail rules to make it easy to state properties of the form "if no bad failures occur, then...".

– : rule_cat =
    fast of rule_status
  | block
  | slow of bool => rule_status
  | network of bool
  | misc of bool

Description Socket call rules are either fast, immediately constructing a return value or error, block, entering a state in which the calling thread is blocked, or slow, completing processing for a blocked thread.
fast and slow rules have a rule_status as above. The network rules include message send and receive and the internal actions involved in the protocol. The misc rules cover the remainder: returning values to threads, timer expiry, TCP tracing, interface status changes, and time passage. The bool argument to slow, network, and misc rule categories indicates whether the rule is urgent. If an urgent rule is enabled then no time may pass.

– : urgent = T

– : nonurgent = F

– :
  is_urgent (slow b _) = b ∧
  is_urgent (network b) = b ∧
  is_urgent (misc b) = b ∧
  is_urgent _ = F

Rule version: $Id: TCP1_host0Script.sml,v 1.97 2004/12/09 15:43:08 kw217 Exp $

Part VIII TCP1_ruleids

Chapter 8 Rule names

This file defines the names of transition rules in the specification.

8.1 names (Rule only)

We list here the names of all rules in the host LTS.

8.1.1 Summary

rule_ids

8.1.2 Rules

– : rule_ids =
    return_1
  | socket_1 | socket_2
  | accept_1 | accept_2 | accept_3 | accept_4 | accept_5 | accept_6 | accept_7
  | bind_1 | bind_2 | bind_3 | bind_5 | bind_7 | bind_9
  | close_1 | close_2 | close_3 | close_4 | close_5 | close_6 | close_7 | close_8 | close_10
  | connect_1 | connect_2 | connect_3 | connect_4 | connect_4a | connect_5 | connect_5a | connect_5b | connect_5c | connect_5d | connect_6 | connect_7 | connect_8 | connect_9 | connect_10
  | disconnect_1 | disconnect_2 | disconnect_3 | disconnect_4 | disconnect_5
  | dup_1 | dup_2
  | dupfd_1 | dupfd_3 | dupfd_4
  | listen_1 | listen_1b | listen_1c | listen_2 | listen_3 | listen_4 | listen_5 | listen_7
  | getfileflags_1 | setfileflags_1
  | getifaddrs_1
  | getsockbopt_1 | getsockbopt_2 | setsockbopt_1 | setsockbopt_2
  | getsocknopt_1 | getsocknopt_4 | setsocknopt_1 | setsocknopt_4 | setsocknopt_2
  | getsocktopt_1 | getsocktopt_4 | setsocktopt_1 | setsocktopt_4 | setsocktopt_5
  | getsockerr_1 | getsockerr_2
  | getsocklistening_1 | getsocklistening_2 | getsocklistening_3
  | shutdown_1 | shutdown_2 | shutdown_3 | shutdown_4
  | recv_1 | recv_2 | recv_3 | recv_4 | recv_5 | recv_6 | recv_7 | recv_8 | recv_8a | recv_9 | recv_11 | recv_12 | recv_13 | recv_14 | recv_15 | recv_16 | recv_17 | recv_20 | recv_21 | recv_22 | recv_23 | recv_24
  | send_1 | send_2 | send_3 | send_3a | send_4 | send_5 | send_5a | send_6 | send_7 | send_8 | send_9 | send_10 | send_11 | send_12 | send_13 | send_14 | send_15 | send_16 | send_17 | send_18 | send_19 | send_21 | send_22 | send_23
  | sockatmark_1 | sockatmark_2
  | pselect_1 | pselect_2 | pselect_3 | pselect_4 | pselect_5 | pselect_6
  | getsockname_1 | getsockname_2 | getsockname_3
  | getpeername_1 | getpeername_2
  | badf_1 | notsock_1 | intr_1 | resourcefail_1 | resourcefail_2
  | deliver_in_1 | deliver_in_1b | deliver_in_2 | deliver_in_2a | deliver_in_3 | deliver_in_3a | deliver_in_3b | deliver_in_3c | deliver_in_4 | deliver_in_5 | deliver_in_6 | deliver_in_7 | deliver_in_7a | deliver_in_7b | deliver_in_7c | deliver_in_7d | deliver_in_8 | deliver_in_9
  | deliver_in_icmp_1 | deliver_in_icmp_2 | deliver_in_icmp_3 | deliver_in_icmp_4 | deliver_in_icmp_5 | deliver_in_icmp_6 | deliver_in_icmp_7
  | deliver_in_udp_1 | deliver_in_udp_2 | deliver_in_udp_3
  | deliver_in_99 | deliver_in_99a
  | timer_tt_rexmt_1 | timer_tt_rexmtsyn_1 | timer_tt_persist_1 | timer_tt_2msl_1 | timer_tt_delack_1 | timer_tt_conn_est_1 | timer_tt_keep_1 | timer_tt_fin_wait_2_1
  | deliver_out_1 | deliver_out_99 | deliver_loop_99
  | trace_1 | trace_2
  | interface_1
  | epsilon_1 | epsilon_2

Rule version: $Id: TCP1_ruleidsScript.sml,v 1.19 2005/02/05 17:36:07 pes20 Exp $

Part IX TCP1_timers

Chapter 9 Timers

This file defines the various kinds of timer that are used by the host specification. Timers are host-state components that are updated by the passage of time, in dur transitions.

We define four kinds of timer:

1. the deadline timer (′a timed), which wraps a value in a timer that will count towards a (possibly fuzzy) deadline, and stop the progress of time when it reaches the maximum deadline.

2. the time-window timer (′a timewindow), which wraps a value in a timer just like a deadline timer, except that the value merely vanishes when it expires, rather than impeding the progress of time. These are an optimisation, designed to avoid having an extra rule (and consequent τ transitions) just for processing the expiry of such values.

3. the ticker (ticker), which contains a ts_seq (integral wraparound 32-bit type) that is incremented by one every time a certain interval passes. It also contains the real remainder, and the interval size that corresponds to a step.

4. the stopwatch (stopwatch), which may be reset at any time and counts upwards indefinitely from zero. Note it may be necessary to add some fuzziness to this timer.

For each timer we define a constructor and a time-passage function. The time-passage function takes a duration (positive real) and a timer, and returns either the timer, or ∗ if time is not permitted by the timer to pass that far (i.e., an urgent instant would be passed). Timers that never need to stop time do not return an option type. Timers that behave nondeterministically are defined relationally (taking the "result" as argument and returning a bool).

For all of them, we want the two properties defined by Lynch and Vaandrager in Inf. and Comp., 128(1), 1996 (http://theory.lcs.mit.edu/tds/papers/Lynch/IC96.html) as S1 and S2 to hold.

9.1 Properties (TCP and UDP)

Axioms of time, that all timers must satisfy.

9.1.1 Summary

time_pass_additive
time_pass_trajectory
opttorel

9.1.2 Rules

– : (time_pass_additive : (duration → ′a → ′a → bool) → bool) time_pass =
  ∀dur1 dur2 s0 s1 s2.
    time_pass dur1 s0 s1 ∧ time_pass dur2 s1 s2 =⇒ time_pass (dur1 + dur2) s0 s2

Description Property S1, additivity: if $s' \xrightarrow{d} s''$ and $s'' \xrightarrow{d'} s$ then $s' \xrightarrow{d + d'} s$.

– : (time_pass_trajectory : (duration → ′a → ′a → bool) → bool) time_pass =
  ∀dur s0 s1.
    time_pass dur s0 s1 =⇒
    ∃w. w 0 = s0 ∧ w dur = s1 ∧
        ∀t t′. 0 ≤ t ∧ t ≤ dur ∧ 0 ≤ t′ ∧ t′ ≤ dur ∧ t < t′ =⇒ time_pass (t′ − t) (w t) (w t′)

Description Property S2 is defined as follows: Each time passage step $s' \xrightarrow{d} s$ has a trajectory, where a trajectory is defined as follows. If $I$ is any left-closed interval of $\mathbb{R}^{\geq 0}$ beginning with 0, then an $I$-trajectory is a function $w$ from $I$ to states(A) such that $w(t) \xrightarrow{t' - t} w(t')$ for all $t, t'$ in $I$ with $t < t'$. Now define w.fstate = w(0), w.ltime to be the supremum of $I$, and if $I$ is right-closed, w.lstate = w(w.ltime). Then a trajectory for a step $s' \xrightarrow{d} s$ is a $[0, d]$-trajectory with w.fstate = s′ and w.lstate = s.

In our case, S2 (which we call "trajectory") may be stated as follows: For each time passage step $s' \xrightarrow{d} s$, there exists a function $w$ from $[0, d]$ to states such that $w(0) = s'$, $w(d) = s$, and $w(t) \xrightarrow{t' - t} w(t')$ for all $t, t'$ in $[0, d]$ with $t < t'$.

– : (opttorel : (duration → ′a → ′a option) → (duration → ′a → ′a → bool)) tp dur x y =
  case tp dur x of
    ↑ x′ → y = x′
  ‖ ∗ → F

Description Impedance-matching coercion.

9.2 Basic timer timer (TCP and UDP)

The basic timer, timer, is a triple of the elapsed time, the minimum expiry time, and the maximum expiry time. It may expire at any time after the minimum expiry time, but time may not progress beyond the maximum expiry time.
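The triple can be sketched in a few lines of Python. This is an informal illustration, assuming floats for durations and math.inf for an unbounded deadline; the function names follow the rule names timer_expires and Time_Pass_timer, and None plays the role of ∗.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Timer:
    elapsed: float  # time since the timer was started
    deadmin: float  # earliest moment at which the timer may expire
    deadmax: float  # time may not advance past this point


def timer_expires(t: Timer) -> bool:
    # The timer may fire once the minimum expiry time has been reached.
    return t.elapsed >= t.deadmin


def time_pass_timer(dur: float, t: Timer) -> Optional[Timer]:
    # None means that time is not allowed to pass that far.
    e = t.elapsed + dur
    return Timer(e, t.deadmin, t.deadmax) if e <= t.deadmax else None
```

A timer with both deadlines set to math.inf never expires and never blocks time, which corresponds to a timer that never goes off.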
9.2.1 Summary

timer
fuzzy_timer        timer that goes off in the interval [d − eps, d + fuz], like a BSD ticks-based timer
sharp_timer        timer that goes off at exactly d after now
never_timer        timer that never goes off
upper_timer        timer that goes off between now and d
timer_expires      true if the timer may expire now
Time_Pass_timer    state of timer after time passage

9.2.2 Rules

– : timer = Timer of duration # time # time

– timer that goes off in the interval [d − eps, d + fuz], like a BSD ticks-based timer :
  (* fuz is some fuzziness added to mask the atomic nature of the model. *)
  (fuzzy_timer : time → duration → duration → timer) d eps fuz = Timer (0, d − eps, d + fuz)

– timer that goes off at exactly d after now :
  sharp_timer d = fuzzy_timer d 0

– timer that never goes off :
  never_timer = Timer (0, ∞, ∞)

– timer that goes off between now and d :
  upper_timer d = Timer (0, 0, d)

– true if the timer may expire now :
  (* NB: we assume below that this is monotonic; if it is once true it is always true (at least at any time that can be reached) *)
  (timer_expires : timer → bool) (Timer (e, deadmin, deadmax)) = (time e ≥ deadmin)

– state of timer after time passage :
  (Time_Pass_timer : duration → timer → timer option) dur (Timer (e, deadmin, deadmax)) =
    let e′ = e + dur in
    if time e′ ≤ deadmax then ↑(Timer (e′, deadmin, deadmax)) else ∗

Rule version: $Id: TCP1_timersScript.sml,v 1.59 2005/02/07 16:31:22 kw217 Exp $

9.3 Deadline timer timed (TCP and UDP)

The deadline timer ′a timed is simply a value ′a annotated by a timer. This is a very convenient idiom.
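The idiom can be illustrated with a short Python sketch: a value is paired with a timer, and time passage is simply delegated to the timer while the value is carried along unchanged. The tuple representation of the timer, the string used as the wrapped value, and the concrete numbers are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Any, Optional, Tuple

Timer = Tuple[float, float, float]  # (elapsed, deadmin, deadmax)


def fuzzy_timer(d: float, eps: float, fuz: float) -> Timer:
    # Goes off somewhere in the interval [d - eps, d + fuz].
    return (0.0, d - eps, d + fuz)


def time_pass_timer(dur: float, timer: Timer) -> Optional[Timer]:
    elapsed, deadmin, deadmax = timer
    e = elapsed + dur
    return (e, deadmin, deadmax) if e <= deadmax else None


@dataclass(frozen=True)
class Timed:
    value: Any    # the annotated value
    timer: Timer  # the timer that annotates it


def timed_expires(tv: Timed) -> bool:
    elapsed, deadmin, _ = tv.timer
    return elapsed >= deadmin


def time_pass_timed(dur: float, tv: Timed) -> Optional[Timed]:
    # The value is untouched; only the timer advances.
    t2 = time_pass_timer(dur, tv.timer)
    return None if t2 is None else Timed(tv.value, t2)
```

Because a timed value returns None when its maximum deadline would be overrun, any state containing it forces the expiry to be handled before more time can pass.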
9.3.1 Summary

timed
timed_val_of
timed_timer_of
timed_expires
Time_Pass_timed

9.3.2 Rules

– : timed = Timed of ′a # timer

– : timed_val_of (Timed (x, _)) = x

– : timed_timer_of (Timed (x, d)) = d

– : timed_expires (Timed (_, d)) = timer_expires d

– : (Time_Pass_timed : duration → ′a timed → ′a timed option) dur (Timed (x, d)) =
  case Time_Pass_timer dur d of
    ↑ d′ → ↑(Timed (x, d′))
  ‖ ∗ → ∗

9.4 Time-window timer timewindow (TCP and UDP)

The time-window timer ′a timewindow, rendered as (x)^TimeWindow_d (that is, TimeWindow (x, d)), is like a deadline timer ′a timed, except that when it expires the value merely evaporates, rather than causing time to stop. Thus an ′a timewindow never induces urgency.

9.4.1 Summary

timewindow
timewindow_val_of
timewindow_open
Time_Pass_timewindow

9.4.2 Rules

– : timewindow = TimeWindow of ′a # timer | TimeWindowClosed

– :
  timewindow_val_of (TimeWindow (x, _)) = ↑ x ∧
  timewindow_val_of TimeWindowClosed = ∗

– :
  timewindow_open (TimeWindow (_, _)) = T ∧
  timewindow_open TimeWindowClosed = F

– : (Time_Pass_timewindow : duration → ′a timewindow → ′a timewindow → bool)
  dur (TimeWindow (x, d)) tw′ =
    (case Time_Pass_timer dur d of
       ∗ → tw′ = TimeWindowClosed
     ‖ ↑ d′ → tw′ = TimeWindow (x, d′) ∨ (timer_expires d′ ∧ tw′ = TimeWindowClosed)) ∧
  Time_Pass_timewindow dur TimeWindowClosed tw′ = (tw′ = TimeWindowClosed)

9.5 Ticker ticker (TCP and UDP)

A ticker ticker models a discrete time counter. It contains a counter, a remainder, a minimum duration, and a maximum duration. The counter is incremented at least once every maximum duration, and at most once every minimum duration. The remainder stores the time since the last increment.
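Because the ticker is nondeterministic, its time passage is naturally a relation between a starting state and a candidate result. The Python sketch below checks that relation for a given pair of states. It is an illustration under assumptions: rationals are used for exactness, the number of increments is recovered as the smallest non-negative difference of the two counters modulo 2^32 (so fewer than 2^32 ticks are assumed to occur in one step), and all names are chosen for the example.

```python
from dataclasses import dataclass
from fractions import Fraction

MOD = 1 << 32


@dataclass(frozen=True)
class Ticker:
    ticks: int       # 32-bit wraparound counter
    remdr: Fraction  # time since the last increment
    imin: Fraction   # minimum duration per tick
    imax: Fraction   # maximum duration per tick


def time_pass_ticker(dur: Fraction, t: Ticker, t2: Ticker) -> bool:
    """True if t2 is an allowed state of t after dur has passed."""
    if (t2.imin, t2.imax) != (t.imin, t.imax):
        return False
    d = t.remdr + dur
    delta = (t2.ticks - t.ticks) % MOD  # assumed number of increments
    return (d - delta * t.imax <= t2.remdr <= d - delta * t.imin
            and 0 <= t2.remdr < t.imax)
```

For an interval between 9/10 and 11/10 and one unit of elapsed time, both "no tick yet, remainder 1" and "one tick, small remainder" are accepted, whereas two ticks are rejected.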
9.5.1 Summary

ticker
ticks_of
Time_Pass_ticker
ticker_ok
tick_imin
tick_imax

9.5.2 Rules

– : ticker = Ticker of ts_seq # duration (* may be zero *) # duration # duration

– : ticks_of(Ticker(ticks, _, _, _)) = ticks

– : (Time_Pass_ticker : duration → ticker → ticker → bool) dur (Ticker(ticks, remdr, intvlmin, intvlmax)) t′ =
    let d = remdr + dur in
    ∃delta remdr′.
      d − real_of_num delta ∗ intvlmax ≤ remdr′ ∧
      remdr′ ≤ d − real_of_num delta ∗ intvlmin ∧
      0 ≤ remdr′ ∧ remdr′ < intvlmax ∧
      t′ = Ticker(ticks + delta, remdr′, intvlmin, intvlmax)

– : ticker_ok(Ticker(ticks, remdr, imin, imax)) = (0 ≤ remdr ∧ remdr < imax ∧ imin ≤ imax ∧ 0 < imin)

– : tick_imin(Ticker(t, r, imin, imax)) = imin

– : tick_imax(Ticker(t, r, imin, imax)) = imax

9.6 Stopwatch stopwatch (TCP and UDP)

The stopwatch stopwatch records the time since it was started, with fuzziness introduced by means of a minimum and maximum rate factor applied to the passage of time.

9.6.1 Summary

stopwatch
stopwatch_val_of
Time_Pass_stopwatch

9.6.2 Rules

– : stopwatch = Stopwatch of duration (* may be zero *) # real # real

– : stopwatch_val_of(Stopwatch(d, _, _)) = d

– : (Time_Pass_stopwatch : duration → stopwatch → stopwatch → bool) dur (Stopwatch(d, ratemin, ratemax)) s′ =
    ∃rate. ratemin ≤ rate ∧ rate ≤ ratemax ∧
           s′ = Stopwatch(d + (dur ∗ rate), ratemin, ratemax)

Part X TCP1_hostTypes

Chapter 10 Host types

This file defines types for the internal state of the host and its components: files, TCP control blocks, sockets, interfaces, routing table, thread states, and so on, culminating in the definition of the host type. It also defines TCP trace records, building on the definition of TCP control blocks.
Broadly following the implementations, each protocol endpoint has a socket structure which has some common fields (e.g. the associated IP addresses and ports), and some protocol-specific information. For TCP, which involves a great deal of local state, the protocol-specific information (of type tcp_socket) consists of a TCP state (CLOSED, LISTEN, etc.), send and receive queues, and a TCP control block, of type tcpcb, with many window parameters, timers, etc. Roughly, the socket structure and tcp_socket substructure contain all the information required by most sockets rules, whereas the tcpcb contains fields required only by the protocol information.

10.1 Files (TCP and UDP)

10.1.1 Summary

fid         file ID
sid         socket ID
filetype    type of file, with pointer to details structure
fileflags   flags set on a file
file        open file description
File        helper constructor

10.1.2 Rules

– file ID :
fid = FID of num

– socket ID :
sid = SID of num

Description File IDs fid and socket IDs sid are really unique, unlike file descriptors fd.

– type of file, with pointer to details structure :
filetype = FT_Console | FT_Socket of sid

– flags set on a file :
fileflags = 〈[ b : filebflag → bool ]〉

– open file description :
file = 〈[ ft : filetype; ff : fileflags ]〉

– helper constructor :
File(ft, ff) = 〈[ ft := ft; ff := ff ]〉

Description A file is represented by an "open file description" (in POSIX terminology). This contains file flags and a file type; the specification only covers FT_Console and FT_Socket files. For most file types, it also contains a pointer to another structure containing data specific to that file type; in our case, a sid pointing to a socket structure for files of type FT_Socket. The file flags are defined in TCP1_baseTypes: see filebflag (p14).
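The indirection from a file descriptor to a socket can be sketched as follows. This is an informal Python illustration: the two dictionaries mirror the host fields fds (fd to fid) and files (fid to file) defined in Section 10.5, file descriptors are modelled as plain integers, and the lookup helper `socket_of_fd` is hypothetical, introduced only for this sketch.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Union


@dataclass(frozen=True)
class FID:
    n: int  # unique file ID


@dataclass(frozen=True)
class SID:
    n: int  # unique socket ID


@dataclass(frozen=True)
class FT_Console:
    pass


@dataclass(frozen=True)
class FT_Socket:
    sid: SID  # pointer to the socket structure


FileType = Union[FT_Console, FT_Socket]


@dataclass
class File:
    ft: FileType
    ff: Dict[str, bool]  # file flags, keyed by flag name


def socket_of_fd(fds: Dict[int, FID], files: Dict[FID, File], fd: int) -> Optional[SID]:
    """Follow fd -> fid -> file -> sid; None if fd is unknown or not a socket."""
    fid = fds.get(fd)
    if fid is None:
        return None
    f = files[fid]
    return f.ft.sid if isinstance(f.ft, FT_Socket) else None


flags = {"O_NONBLOCK": False, "O_ASYNC": False}
fds = {0: FID(0), 3: FID(1)}
files = {
    FID(0): File(FT_Console(), dict(flags)),
    FID(1): File(FT_Socket(SID(7)), dict(flags)),
}
```

Two file descriptors may map to the same fid, which is why flags live on the open file description rather than on the descriptor.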
10.2 TCP states (TCP only)

10.2.1 Summary

tcpstate   TCP protocol states

10.2.2 Rules

– TCP protocol states :
tcpstate = CLOSED | LISTEN | SYN_SENT | SYN_RECEIVED | ESTABLISHED | CLOSE_WAIT | FIN_WAIT_1 | CLOSING | LAST_ACK | FIN_WAIT_2 | TIME_WAIT

Description The states laid down by RFC793, with spelling as in the BSD source.

10.3 The TCP control block (TCP only)

10.3.1 Summary

tcpReassSegment   segment reassembly queue elements
rexmtmode         retransmission mode
rttinf            round-trip time calculation parameters
tcpcb             the TCP control block

10.3.2 Rules

Rule version: $Id: TCP1_hostTypesScript.sml,v 1.155 2005/03/16 15:06:36 pes20 Exp $

– segment reassembly queue elements :
tcpReassSegment = 〈[
  seq : tcp_seq_foreign;
  spliced_urp : tcp_seq_foreign option;
  FIN : bool;
  data : byte list
]〉

Description The TCP reassembly queue (the t_segq component of the TCP control block) holds information about TCP segments received out of order, pending their reassembly. It is a list of these tcpReassSegments, recording just the information we need about each. If a byte of urgent data has been spliced from data for out-of-line delivery, its sequence number is recorded in the spliced_urp component here to permit correct reassembly.

– retransmission mode :
rexmtmode = RexmtSyn | Rexmt | Persist

Description TCP has three output modes: idle, retransmitting, and persisting. We introduce one more, retransmitting-syn, since the behaviour is slightly different. These modes all share the same timer, and use this "mode" parameter to distinguish. The idle mode is represented by the timer not running.
– round-trip time calculation parameters :
rttinf = 〈[
  t_rttupdated : num;    (* number of times rtt sampled *)
  tf_srtt_valid : bool;  (* estimate is currently believed to be valid *)
  t_srtt : duration;     (* smoothed round-trip time *)
  t_rttvar : duration;   (* variance in round-trip time *)
  t_rttmin : duration;   (* minimum rtt allowed *)
  t_lastrtt : duration;  (* most recent instantaneous RTT obtained *)
    (* Note this should really be an option type which is set to ∗ if no value has been obtained. The same applies to t_lastshift below. *)
    (* in BSD, this is the local variable rtt in tcp_xmit_timer(); we put it here because we don't want to store rxtcur in the tcpcb *)
  t_lastshift : num;     (* the last retransmission shift used *)
  t_wassyn : bool        (* whether that shift was RexmtSyn or not *)
    (* these two also are to avoid storing rxtcur in the tcpcb; they are somewhat annoying because they are *only* required for the tcp_output test that returns to slow start if the connection has been idle for >=1RTO *)
]〉

Description This collects data used for round-trip time estimation. tf_srtt_valid is not in BSD; instead, BSD uses t_srtt = 0 to indicate t_srtt invalid, and does horrible hacks in retransmission calculations to allow the continued use of the old t_srtt even after marking it invalid. We do it better!

Unlike BSD, we don't store the current retransmission interval explicitly; instead we recalculate it if it is needed.

– the TCP control block :
tcpcb = 〈[
  (* timers *)
  tt_rexmt : (rexmtmode # num) timed option; (* retransmit timer, with mode and shift; ∗ is idle *)
    (* see tcp_output.c:356ff for more info. *)
    (* as in BSD, the shift starts at zero, and is incremented each time the timer fires. So it is zero during the first interval, 1 after the first retransmit, etc.
    *)
  tt_keep : () timed option;        (* keepalive timer *)
  tt_2msl : () timed option;        (* 2∗MSL TIME_WAIT timer *)
  tt_delack : () timed option;      (* delayed ACK timer *)
  tt_conn_est : () timed option;    (* connection-establishment timer, overlays keep in BSD *)
  tt_fin_wait_2 : () timed option;  (* FIN_WAIT_2 timer, overlays 2msl in BSD *)
  t_idletime : stopwatch;           (* time since last segment received *)

  (* flags, some corresponding to BSD TF_ flags *)
  tf_needfin : bool;        (* send FIN (implicit state, used for app close while in SYN_RECEIVED) *)
  tf_shouldacknow : bool;   (* output a segment urgently – similar to TF_ACKNOW, but used less often *)
  bsd_cantconnect : bool;   (* connection establishment attempt has failed having sent a SYN – on BSD this causes further connect() calls to fail *)

  (* send variables *)
  snd_una : tcp_seq_local;    (* lowest unacknowledged sequence number *)
  snd_max : tcp_seq_local;    (* highest sequence number sent; used to recognise retransmits *)
  snd_nxt : tcp_seq_local;    (* next sequence number to send *)
  snd_wl1 : tcp_seq_foreign;  (* seq number of most recent window update segment *)
  snd_wl2 : tcp_seq_local;    (* ack number of most recent window update segment *)
  iss : tcp_seq_local;        (* initial send sequence number *)
  snd_wnd : num;              (* send window size: always between 0 and 65535*2**14 *)
  snd_cwnd : num;             (* congestion window *)
  snd_ssthresh : num;         (* threshold between exponential and linear snd_cwnd expansion (for slow start) *)

  (* receive variables *)
  rcv_wnd : num;                    (* receive window size *)
  tf_rxwin0sent : bool;             (* have advertised a zero window to receiver *)
  rcv_nxt : tcp_seq_foreign;        (* lowest sequence number not yet received *)
  rcv_up : tcp_seq_foreign;         (* received urgent pointer if any, else = rcv_nxt *)
  irs : tcp_seq_foreign;            (* initial receive sequence number *)
  rcv_adv : tcp_seq_foreign;        (* most recently advertised window *)
  last_ack_sent : tcp_seq_foreign;  (* last acknowledged sequence number *)

  (* connection parameters *)
  t_maxseg : num; (* maximum segment
    size on this connection *)
  t_advmss : num option;         (* the mss advertisement sent in our initial SYN *)
  tf_doing_ws : bool;            (* doing window scaling on this connection? (result of negotiation) *)
  request_r_scale : num option;  (* pending window scaling, if any (used during negotiation) *)
  snd_scale : num;               (* window scaling for send window (0..14), applied to received advertisements (RFC1323) *)
  rcv_scale : num;               (* window scaling for receive window (0..14), applied when we send advertisements (RFC1323) *)

  (* timestamping *)
  tf_doing_tstmp : bool;           (* are we doing timestamps on this connection? (result of negotiation) *)
  tf_req_tstmp : bool;             (* have/will request(ed) timestamps (used during negotiation) *)
  ts_recent : ts_seq timewindow;   (* most recent timestamp received; TimeWindowClosed if invalid. Timer models the RFC1323 end-§4.2.3 24-day validity period. *)

  (* round-trip time estimation *)
  t_rttseg : (ts_seq # tcp_seq_local) option;  (* start time and sequence number of segment being timed *)
  t_rttinf : rttinf;                           (* round-trip time estimator values *)

  (* retransmission *)
  t_dupacks : num;               (* number of consecutive duplicate acks received (typically 0..3ish; should this wrap at 64K/4G ack burst?) *)
  t_badrxtwin : () timewindow;   (* deadline for bad-retransmit recovery *)
  snd_cwnd_prev : num;           (* snd_cwnd prior to retransmit (used in bad-retransmit recovery) *)
  snd_ssthresh_prev : num;       (* snd_ssthresh prior to retransmit (used in bad-retransmit recovery) *)
  snd_recover : tcp_seq_local;   (* highest sequence number sent at time of receipt of partial ack (used in RFC2581/RFC2582 fast recovery) *)

  (* other *)
  t_segq : tcpReassSegment list;  (* segment reassembly queue *)
  t_softerror : error option      (* current transient error; reported only if failure becomes permanent *)
    (* could cut this down to the actually-possible errors?
    *)
]〉

10.4 Sockets (TCP and UDP)

10.4.1 Summary

iobc            out-of-band data and status
socket_listen   extra info for a listening socket
tcp_socket      details of a TCP socket
dgram_msg       ordinary datagram on UDP receive queue
dgram_error     error (pseudo-)datagram on UDP receive queue
dgram           receive queue elements for a UDP socket
udp_socket      details of a UDP socket
sockflags       flags set on a socket
protocol_info   protocol-specific socket data
socket          details of a socket
TCP_Sock0       helper constructor
TCP_Sock        helper constructor
UDP_Sock0       helper constructor
UDP_Sock        helper constructor
Sock            helper constructor
tcp_sock_of     helper accessor (beware ARBitrary behaviour on non-TCP socket)
udp_sock_of     helper accessor (beware ARBitrary behaviour on non-UDP socket)
proto_of        helper accessor
proto_eq        compare protocol of two protocol info structures

10.4.2 Rules

– out-of-band data and status :
iobc = NO_OOBDATA | OOBDATA of byte | HAD_OOBDATA

– extra info for a listening socket :
socket_listen = 〈[
  q0 : sid list;  (* incomplete connections queue *)
  q : sid list;   (* completed connections queue *)
  qlimit : int    (* backlog value as passed to listen *)
]〉

– details of a TCP socket :
tcp_socket = 〈[
  st : tcpstate; (* here rather than in tcpcb for convenience as heavily used.
    Called t_state in BSD *)
  cb : tcpcb;
  lis : socket_listen option;  (* invariant: ∗ iff not LISTEN *)
  sndq : byte list;
  sndurp : num option;
  rcvq : byte list;
  rcvurp : num option;         (* was "oobmark" *)
  iobc : iobc
]〉

– ordinary datagram on UDP receive queue :
dgram_msg = 〈[
  data : byte list;
  is : ip option;   (* source ip *)
  ps : port option  (* source port *)
]〉

– error (pseudo-)datagram on UDP receive queue :
dgram_error = 〈[ e : error ]〉

– receive queue elements for a UDP socket :
dgram = Dgram_msg of dgram_msg | Dgram_error of dgram_error

– details of a UDP socket :
udp_socket = 〈[ rcvq : dgram list ]〉

Description UDP sockets are very simple: the protocol-specific content is merely a receive queue. The receive queue of a UDP socket, however, is not just a queue of bytes as it is for a TCP socket. Instead, it is a queue of messages and (in some implementations) errors. Each message contains a block of bytes and some ancillary data.

Variations
WinXP: On WinXP, errors are returned in order w.r.t. messages; this is modelled by placing them in the receive queue.
FreeBSD, Linux: On FreeBSD and Linux, only messages are placed in the receive queue, and errors are treated asynchronously.
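The architectural variation in the UDP receive queue can be illustrated with a small Python sketch. It is an illustration only: the function names are hypothetical, errors and addresses are modelled as strings, and recording the asynchronous error in a pending-error field on FreeBSD and Linux is an assumption of this sketch (the field is placed on the UDP structure here for brevity, whereas the specification keeps the pending error es on the enclosing socket record).

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass(frozen=True)
class DgramMsg:
    data: bytes
    src_ip: Optional[str]    # field "is" in the specification
    src_port: Optional[int]  # field "ps" in the specification


@dataclass(frozen=True)
class DgramError:
    e: str  # error name, e.g. "ECONNREFUSED"


Dgram = Union[DgramMsg, DgramError]


@dataclass
class UdpSock:
    rcvq: List[Dgram] = field(default_factory=list)
    es: Optional[str] = None  # pending error (sketch-only placement)


def deliver_msg(sock: UdpSock, msg: DgramMsg) -> None:
    sock.rcvq.append(msg)


def deliver_error(arch: str, sock: UdpSock, err: str) -> None:
    if arch == "WinXP_Prof_SP1":
        # Errors are ordered with respect to messages: queue them in place.
        sock.rcvq.append(DgramError(err))
    else:
        # Assumption: the error is noted out of band, not queued.
        sock.es = err


def run(arch: str) -> UdpSock:
    sock = UdpSock()
    deliver_msg(sock, DgramMsg(b"one", "192.0.2.1", 5000))
    deliver_error(arch, sock, "ECONNREFUSED")
    deliver_msg(sock, DgramMsg(b"two", "192.0.2.1", 5000))
    return sock


xp = run("WinXP_Prof_SP1")
bsd = run("FreeBSD_4_6_RELEASE")
```

On the WinXP run the error sits between the two messages in the queue, so a reader would see it only after consuming the first message; on the FreeBSD run the queue holds just the two messages.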
– flags set on a socket :
sockflags = 〈[
  b : sockbflag → bool;
  n : socknflag → num;
  t : socktflag → time
]〉

– protocol-specific socket data :
protocol_info = TCP_PROTO of tcp_socket | UDP_PROTO of udp_socket

– details of a socket :
socket = 〈[
  fid : fid option;     (* associated open file description if any *)
  sf : sockflags;       (* socket flags *)
  is1 : ip option;      (* local IP address if any *)
  ps1 : port option;    (* local port if any *)
  is2 : ip option;      (* remote IP address if any *)
  ps2 : port option;    (* remote port if any *)
  es : error option;    (* pending error if any *)
  cantsndmore : bool;   (* output stream ends at end of send queue *)
  cantrcvmore : bool;   (* input stream ends at end of receive queue *)
  pr : protocol_info    (* protocol-specific information *)
]〉

– helper constructor :
TCP_Sock0(st, cb, lis, sndq, sndurp, rcvq, rcvurp, iobc) =
  〈[ st := st; cb := cb; lis := lis; sndq := sndq; sndurp := sndurp; rcvq := rcvq; rcvurp := rcvurp; iobc := iobc ]〉

– helper constructor :
TCP_Sock v = TCP_PROTO(TCP_Sock0 v)

– helper constructor :
UDP_Sock0(rcvq) = 〈[ rcvq := rcvq ]〉

– helper constructor :
UDP_Sock v = UDP_PROTO(UDP_Sock0 v)

– helper constructor :
Sock(fid, sf, is1, ps1, is2, ps2, es, csm, crm, pr) =
  〈[ fid := fid; sf := sf; is1 := is1; ps1 := ps1; is2 := is2; ps2 := ps2; es := es; cantsndmore := csm; cantrcvmore := crm; pr := pr ]〉

– helper accessor (beware ARBitrary behaviour on non-TCP socket) :
tcp_sock_of sock = case sock.pr of TCP_PROTO(tcp_sock) → tcp_sock ‖ _ → ARB

– helper accessor (beware ARBitrary behaviour on non-UDP socket) :
udp_sock_of sock = case sock.pr of UDP_PROTO(udp_sock) → udp_sock ‖ _ → ARB

– helper accessor :
proto_of(TCP_PROTO(_1)) = PROTO_TCP ∧
proto_of(UDP_PROTO(_3)) = PROTO_UDP

– compare protocol of two protocol info structures :
proto_eq pr pr′ = (proto_of pr = proto_of pr′)

Description Various convenience functions.
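A minimal Python sketch of the tagged-union accessors follows; it is an illustration only. The records are cut down to one field each, and since HOL's ARB denotes an arbitrary unspecified value with no direct Python counterpart, the sketch raises an exception instead. That is stricter than the specification, in which tcp_sock_of on a UDP socket is well-defined but unconstrained.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class TcpSocket:
    st: str  # TCP state name; the other fields are omitted in this sketch


@dataclass(frozen=True)
class UdpSocket:
    rcvq: Tuple  # receive queue


@dataclass(frozen=True)
class TCP_PROTO:
    tcp_sock: TcpSocket


@dataclass(frozen=True)
class UDP_PROTO:
    udp_sock: UdpSocket


ProtocolInfo = Union[TCP_PROTO, UDP_PROTO]


@dataclass(frozen=True)
class Socket:
    pr: ProtocolInfo  # the common socket fields are omitted in this sketch


def proto_of(pr: ProtocolInfo) -> str:
    return "PROTO_TCP" if isinstance(pr, TCP_PROTO) else "PROTO_UDP"


def proto_eq(pr: ProtocolInfo, pr2: ProtocolInfo) -> bool:
    return proto_of(pr) == proto_of(pr2)


def tcp_sock_of(sock: Socket) -> TcpSocket:
    if isinstance(sock.pr, TCP_PROTO):
        return sock.pr.tcp_sock
    # Stand-in for ARB: the specification leaves this result unspecified.
    raise ValueError("tcp_sock_of applied to a non-TCP socket")


def udp_sock_of(sock: Socket) -> UdpSocket:
    if isinstance(sock.pr, UDP_PROTO):
        return sock.pr.udp_sock
    raise ValueError("udp_sock_of applied to a non-UDP socket")


tcp = Socket(TCP_PROTO(TcpSocket("LISTEN")))
udp = Socket(UDP_PROTO(UdpSocket(())))
```

Rules that use these accessors therefore need a guard such as `proto_of(sock.pr) == "PROTO_TCP"` first, as tracesock_eq does in Section 10.6.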
10.5 The host (TCP and UDP)

10.5.1 Summary

arch                        the architectures we consider
ifd                         network interface descriptor
routing_table_entry         routing table entry
type_abbrev routing_table
bandlim_reason              segment category, determining which band limiter to use
type_abbrev bandlim_state
hostThreadState             state of host wrt a thread
host                        host details

10.5.2 Rules

– the architectures we consider :
arch = Linux_2_4_20_8 | WinXP_Prof_SP1 | FreeBSD_4_6_RELEASE

Description The behaviour of TCP/IP stacks varies between architectures. Here we list the architectures we consider. In fact our FreeBSD build also has the TCP_DEBUG option turned on, and another edit to improve the accuracy of kernel time (for our automated testing). We believe that these do not impact the TCP semantics in any way.

– network interface descriptor :
ifd = 〈[
  ipset : ip set;      (* set of IP addresses of this interface *)
  primary : ip;        (* and the primary IP address *)
  netmask : netmask;   (* netmask *)
  up : bool            (* status: up (and connected) or not *)
]〉

– routing table entry :
routing_table_entry = 〈[
  destination_ip : ip;
  destination_netmask : netmask;
  ifid : ifid
]〉

Description Note that both routing table entries and interfaces have IP addresses (plural for interfaces, singular for RTEs) and netmasks; furthermore, interfaces have a primary IP. When we do routing, we ignore the IP addresses and mask of the interface; we only use the address and mask from the RTE. The only use of the interface info is to obtain the primary IP for use by connect(). However, there is one place where all the interface data is used: on input, the interface IP addresses are consulted to see if we can receive a packet.

The netmask of the interface is not used in the specification (except by getifaddrs()). Its function in the implementation relates to gateways etc., which (as we abstract from IP routing) we do not model.
Note that the model does not represent the routing cache here (i.e., cached routes with gateways, MSS, RTT, etc.), just the routing table. Cache data is treated nondeterministically.

– : type_abbrev routing_table : routing_table_entry list

– segment category, determining which band limiter to use :
bandlim_reason = BANDLIM_UNLIMITED | BANDLIM_RST_CLOSEDPORT | BANDLIM_RST_OPENPORT

Description internal bandlimiter state; intended to be opaque

– : type_abbrev bandlim_state : (tcpSegment # ts_seq # bandlim_reason) list

– state of host wrt a thread :
hostThreadState =
    Run                                       (* thread is running *)
  | Ret of TLang                              (* about to return given value to thread *)
  | Accept2 of sid                            (* blocked in accept *)
  | Close2 of sid                             (* blocked in close *)
  | Connect2 of sid                           (* blocked in connect *)
  | Recv2 of sid # num # msgbflag set         (* blocked in recv *)
  | Send2 of sid # ((ip # port) option # ip option # port option # ip option # port option) option # byte list # msgbflag set   (* blocked in send *)
  | PSelect2 of fd list # fd list # fd list   (* blocked in pselect *)

Description Host threads are either Running or executing a sockets call. The latter can either be about to return a value to the thread (state Ret) or blocked; the remaining states capture the data required for the unblock processing for each slow call.
– host details :
host = 〈[
  arch : arch;                         (* architecture *)
  privs : bool;                        (* whether process has root/CAP_NET_ADMIN privilege *)
  ifds : ifid ↦ ifd;                   (* interfaces *)
  rttab : routing_table;               (* routing table *)
  ts : tid ↦ hostThreadState timed;    (* host view of each thread state *)
  files : fid ↦ file;                  (* files *)
  socks : sid ↦ socket;                (* sockets *)
  listen : sid list;                   (* list of listening sockets *)
  bound : sid list;                    (* list of sockets bound: head of list was first to be bound *)
  iq : msg list timed;                 (* input queue *)
  oq : msg list timed;                 (* output queue *)
  bndlm : bandlim_state;               (* bandlimiting *)
  ticks : ticker;                      (* ticker *)
  fds : fd ↦ fid                       (* file descriptors (per-process) *)
]〉

Description The input and output queue timers model the interrupt scheduling delay; the first element (if any) must be processed by the timer expiry.

10.6 Trace records (TCP and UDP)

For BSD testing we make use of the BSD TCP_DEBUG option, which enables TCP debug trace records at various points in the code. This permits earlier resolution of nondeterminism in the trace checking process.

Debug records contain IP and TCP headers, a timestamp, and a copy of the implementation TCP control block. Three issues complicate their use: firstly, not all the relevant state appears in the trace record; secondly, the model deviates in its internal structures from the BSD implementation in several ways; and thirdly, BSD generates trace records in the middle of processing messages, whereas the model performs atomic transitions (albeit split for blocking invocations). These mean that in different circumstances we can use only some of the debug record fields.

To save defining a whole new datatype, we reuse tcpcb. However, we define a special equality that only inspects certain fields, and leaves the others unconstrained.
Frustratingly, the is1 ps1 is2 ps2 are not always available, since although the TCP control block is structure-copied into the trace record, the embedded Internet control block is not! However, in cases where these are not available, the iss should be sufficiently unique to identify the socket of interest.

10.6.1 Summary

traceflavour              trace record flavours
type_abbrev tracerecord
tracecb_eq                compare two control blocks for "equality" modulo known issues
tracesock_eq              compare two sockets for "equality" modulo known issues

10.6.2 Rules

– trace record flavours :
traceflavour = TA_INPUT | TA_OUTPUT | TA_USER | TA_RESPOND | TA_DROP

Description Different situations in which a trace may be generated.

– : type_abbrev tracerecord :
  traceflavour
  # sid
  # (ip option      (* is1 *)
     # port option  (* ps1 *)
     # ip option    (* is2 *)
     # port option  (* ps2 *)
    ) option        (* not always available! *)
  # tcpstate        (* st *)
  # tcpcb           (* cb subset *)

– compare two control blocks for "equality" modulo known issues :
tracecb_eq (flav : traceflavour) (st : tcpstate) (es : error option) (cb : tcpcb) (cb′ : tcpcb) =
  ((cb.snd_una = cb′.snd_una) ∧
   (if flav = TA_OUTPUT then T else cb.snd_max = cb′.snd_max) ∧
   (if flav = TA_OUTPUT ∨ (st = SYN_SENT ∧ es ≠ ∗) then T else cb.snd_nxt = cb′.snd_nxt) ∧ (* only bad on error *)
   (cb.snd_wl1 = cb′.snd_wl1) ∧
   (cb.snd_wl2 = cb′.snd_wl2) ∧
   (cb.iss = cb′.iss) ∧
   (cb.snd_wnd = cb′.snd_wnd) ∧
   (if flav = TA_OUTPUT then T else cb.snd_cwnd = cb′.snd_cwnd) ∧ (* only bad on error *)
   (cb.snd_ssthresh = cb′.snd_ssthresh) ∧
   (* Don't check equality of rcv_wnd: we recalculate rcv_wnd lazily in tcp_output instead of after every successful recv() call, so our value is often out of date. *)
   (* (if st = SYN_SENT then T else cb.rcv_wnd = cb′.rcv_wnd) ∧ *)
   (* Removing this clause is an allowance for the fact that BSD chooses its window size rather late.
   *)
   (* Note: we should check how it ensures that a window size it emits on a SYN retransmit is the same as on the initial transmit, and how it ensures it does not accidentally shrink the window on the next output segment (ACK of other end's SYN,ACK). *)
   (cb.rcv_nxt = cb′.rcv_nxt) ∧
   (cb.rcv_up = cb′.rcv_up) ∧
   (cb.irs = cb′.irs) ∧
   (if flav = TA_OUTPUT ∨ flav = TA_INPUT then T else cb.rcv_adv = cb′.rcv_adv) ∧
   (if flav = TA_OUTPUT ∨ st = SYN_SENT ∨ st = TIME_WAIT
      (* we store our initially-sent MSS in t_maxseg, whereas BSD just recalculates it. This test decouples the model from BSD in order to cope with this. *)
    then T else cb.t_maxseg = cb′.t_maxseg) ∧ (* only bad on error *)
   (cb.t_dupacks = cb′.t_dupacks) ∧
   (cb.snd_scale = cb′.snd_scale) ∧
   (cb.rcv_scale = cb′.rcv_scale) ∧
   (* t_rtseq, if t_rtttime <> 0; ignore t_rtttime *) (* only bad on error *)
   (if flav = TA_OUTPUT ∨ flav = TA_INPUT then T else option_map snd cb.t_rttseg = option_map snd cb′.t_rttseg) ∧
   (timewindow_val_of cb.ts_recent = timewindow_val_of cb′.ts_recent) ∧
   (if flav = TA_OUTPUT ∨ flav = TA_INPUT then T else cb.last_ack_sent = cb′.last_ack_sent))
  (* also ignore, always: tt_delack; in case of error: tt_rexmt, t_softerror *)

– compare two sockets for "equality" modulo known issues :
tracesock_eq (flav, sid, quad, st, cb) sid′ sock =
  (proto_of sock.pr = PROTO_TCP ∧
   let tcp_sock = tcp_sock_of sock in
   sid = sid′ ∧
   (* If trace is TA_DROP then the is2, ps2 values in the trace may not match those in the socket record: the segment is dropped because it is somehow invalid (and thus not safe to compare) *)
   (case quad of
      ↑(is1, ps1, is2, ps2) →
        is1 = sock.is1 ∧ ps1 = sock.ps1 ∧
        (if flav = TA_DROP then T else is2 = sock.is2) ∧
        (if flav = TA_DROP then T else ps2 = sock.ps2)
    ‖ ∗ → T) ∧
   st = tcp_sock.st ∧
   tracecb_eq flav st sock.es cb tcp_sock.cb)

Part XI TCP1_params

Chapter 11 Host behavioural parameters

This file defines a large number of constants affecting the behaviour of the host. Many of these are adjustable by sysctls/registry keys on the target architectures.

11.1 Model parameters (TCP and UDP)

Booleans that select a particular model semantics.

11.1.1 Summary

INFINITE_RESOURCES
BSD_RTTVAR_BUG

11.1.2 Rules

– : INFINITE_RESOURCES = T

Description INFINITE_RESOURCES forbids various resource failures, e.g. lack of kernel memory. These failures are nondeterministic in the specification (to be more precise the specification would have to model far more detail about the real system) and rare in practice, so for testing and reasoning one often wants to exclude them altogether.

– : BSD_RTTVAR_BUG = T

Description BSD_RTTVAR_BUG enables a peculiarity of BSD behaviour for retransmit timeouts. After TCP_MAXRXTSHIFT/4 retransmit timeouts, t_srtt and t_rttvar are invalidated, but should still be used to compute future retransmit timeouts until better information becomes available. BSD makes a mistake in doing this, thus causing future retransmit timeouts to be wrong.

The code at tcp_timer.c:420 adds the srtt value to the rttvar, shifted "appropriately", and sets srtt to zero. srtt == 0 is the indication (in BSD) that the srtt is invalid. We instead code this with a separate boolean, and are thus able to keep using both srtt and rttvar. But comparing with tcp_var.h:281, where the values are used, reveals that the correction is in fact wrong.

This is not visible in the RexmtSyn case (where it would be most obvious), because in that case the srtt never was valid, and rttvar was cunningly hacked up to give the right value (in tcp_subr.c:542), and the tcp_timer.c:420 code has no effect at all.

11.2 Scheduling parameters (TCP and UDP)

Parameters controlling the timing of the OS scheduler.
11.2.1 Summary

dschedmax
diqmax
doqmax

11.2.2 Rules

– : dschedmax = time(1000/1000) (* make large for now, tighten when better understood *)

– : diqmax = time(1000/1000) (* make large for now, tighten when better understood *)

– : doqmax = time(1000/1000) (* make large for now, tighten when better understood *)

Description dschedmax is the maximum scheduling delay between a system call yielding a return value and that return value being passed to the process. diqmax and doqmax are the maximum scheduling delays between a message being placed on the queue and being processed (respectively, emitted). For now, pending investigation of tighter realistic upper bounds, they are all made conservatively large.

11.3 Timers (TCP and UDP)

Parameters controlling the rate and fuzziness of the various timers used in the model.

11.3.1 Summary

HZ
tickintvlmin
tickintvlmax
stopwatchfuzz
stopwatch_zero
SLOW_TIMER_INTVL
SLOW_TIMER_MODEL_INTVL
FAST_TIMER_INTVL
FAST_TIMER_MODEL_INTVL
KERN_TIMER_INTVL
KERN_TIMER_MODEL_INTVL

11.3.2 Rules

Rule version: $Id: TCP1_paramsScript.sml,v 1.21 2005/03/17 11:35:34 kw217 Exp $

– : HZ = 100 : real (* Note this is the FreeBSD value. *)

Description The nominal rate at which the timestamp (etc.) clock ticks, in hertz (ticks per second).

– : tickintvlmin = 100/(105 ∗ HZ) : real

– : tickintvlmax = 105/(100 ∗ HZ) : real

Description The actual bounds on the tick interval, in seconds-per-tick; must include 1/HZ, and be within the RFC1323 bounds of 1sec to 1msec.

– : stopwatchfuzz = (5/100) : real (* +/- factor on accuracy of stopwatch timers *)

– : stopwatch_zero = Stopwatch(0, 1/(1 + stopwatchfuzz), 1 + stopwatchfuzz)

Description A stopwatch timer is initialised to stopwatch_zero, which gives it an initial time of 0 and a fuzz of stopwatchfuzz.
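As a worked check of these values, the following Python sketch evaluates the tick-interval bounds and the stopwatch rate factors with exact rationals. It is an illustration only; the helper `stopwatch_range` is hypothetical and simply applies the two extreme rates of the Time_Pass_stopwatch rule to a stopwatch started at stopwatch_zero.

```python
from fractions import Fraction

HZ = Fraction(100)                           # nominal ticks per second
tickintvlmin = Fraction(100) / (105 * HZ)    # 1/105 s, about 9.52 ms
tickintvlmax = Fraction(105) / (100 * HZ)    # 21/2000 s, exactly 10.5 ms

stopwatchfuzz = Fraction(5, 100)
ratemin = 1 / (1 + stopwatchfuzz)            # 20/21
ratemax = 1 + stopwatchfuzz                  # 21/20


def stopwatch_range(dur: Fraction):
    """Bounds on the reading of a stopwatch started at stopwatch_zero
    after dur seconds of real time have passed."""
    return (dur * ratemin, dur * ratemax)


# The nominal interval 1/HZ lies within the permitted bounds ...
nominal_ok = tickintvlmin <= 1 / HZ <= tickintvlmax
# ... and both bounds lie within the RFC1323 range of 1 ms to 1 s.
rfc_ok = Fraction(1, 1000) <= tickintvlmin and tickintvlmax <= 1

low, high = stopwatch_range(Fraction(10))    # after 10 s of real time
```

The tick interval may thus deviate from the nominal 10 ms by roughly 5% in either direction, and a stopwatch that has run for 10 s of real time may read anything from 200/21 s (about 9.52 s) to 10.5 s.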
– : SLOW_TIMER_INTVL = (1/2) : duration (* slow timer is 500msec on BSD *)

– : SLOW_TIMER_MODEL_INTVL = (1/1000) : duration (* 1msec fuzziness to mask atomicity of model; Note that it might be possible to reduce this fuzziness *)

– : FAST_TIMER_INTVL = (1/5) : duration (* fast timer is 200msec on BSD *)

– : FAST_TIMER_MODEL_INTVL = (1/1000) : duration (* 1msec fuzziness to mask atomicity of model; Note that it might be possible to reduce this fuzziness *)

– : KERN_TIMER_INTVL = tickintvlmax : duration (* precision of select timer *)

– : KERN_TIMER_MODEL_INTVL = (the_time dschedmax) : duration (* Note that some fuzziness may be required here *)
(* Note this was previously 0usec fuzziness; it should really have some fuzziness, though dschedmax has a current value of 1s which is too high. Once epsilon_2 is used properly by the checker, we should be able to reduce this fuzziness as it will enable the time transitions to be split. e.g. in pselect rules, we really want to change from PSelect2() to Ret() states pretty much exactly when the timer goes off, then allow a further epsilon transition before returning. *)

Description The slow, fast, and kernel timers are the timers used to control TCP time-related behaviour. The parameters here set their rates and fuzziness.

The slow timer is used for retransmit, persist, keepalive, connection establishment, FIN_WAIT_2, 2MSL, and linger timers. The fast timer is used for delayed acks. The kernel timer is used for timestamp expiry, select, and bad-retransmit detection.

11.4 Ports, sockets, and files (TCP and UDP)

Parameters defining the classes of ports, and limits on numbers of file descriptors and sockets.
11.4.1 Summary

privileged_ports
ephemeral_ports
OPEN_MAX
OPEN_MAX_FD
FD_SETSIZE
SOMAXCONN

11.4.2 Rules

– : privileged_ports = {Port n | n < 1024}

– : ephemeral_ports = {Port n | n ≥ 1024 ∧ n ≤ 5000}

Description Ports below 1024 are reserved, and can be bound by privileged users only. Ports in the range 1024 through 5000 inclusive are used for autobinding, when no specific port is specified; these ports are called "ephemeral".

– : OPEN_MAX = 957 : num (* typical value of kern.maxfilesperproc on one of our BSD boxen *)

– : OPEN_MAX_FD = FD OPEN_MAX

Description A process may hold a maximum of OPEN_MAX file descriptors at any one time. These are numbered consecutively from zero on non-Windows architectures, and so the first forbidden file descriptor is OPEN_MAX_FD.

– : (FD_SETSIZE : arch → num) Linux_2_4_20_8 = 1024n ∧
    FD_SETSIZE WinXP_Prof_SP1 = 64n ∧
    FD_SETSIZE FreeBSD_4_6_RELEASE = 1024n

Description The sets of file descriptors used in calls to pselect can contain only file descriptors numbered less than FD_SETSIZE.

Variations
WinXP: FD_SETSIZE refers to the maximum number of file descriptors in a file descriptor set.

– : SOMAXCONN = 128 : num

Description The maximum listen-queue length.

11.5 UDP parameters (UDP only)

UDP-specific parameters.

11.5.1 Summary

UDPpayloadMax

11.5.2 Rules

– : (UDPpayloadMax : arch → num) Linux_2_4_20_8 = 65507n ∧
    UDPpayloadMax WinXP_Prof_SP1 = 65507n ∧
    UDPpayloadMax FreeBSD_4_6_RELEASE = 9216n

Description The architecture-dependent maximum payload for a UDP datagram.

11.6 Buffers (TCP and UDP)

Parameters to the buffer size computation.
11.6.1 Summary
MCLBYTES    size of an mbuf cluster
MSIZE
SB MAX
oob extra sndbuf

11.6.2 Rules
– size of an mbuf cluster : MCLBYTES = 2048 : num (* BSD default on i386; really, just needs to be >=1500 to fit an etherseg *)
– : MSIZE = 256 : num (* BSD default on i386; really, size of an mbuf *)
– : SB MAX = 256 ∗ 1024 : num (* BSD *)
– : oob extra sndbuf = 1024 : num

11.7 File and socket flag defaults (TCP and UDP)
Default values of file and socket flags, applied on creation. Some of these are architecture-dependent. Note that SO BSDCOMPAT should really be set to T by default on FreeBSD.

11.7.1 Summary
ff default b    file flags default
ff default
sf default b    bool socket flags default
sf default n    num socket flags defaults
sf default t    time socket flags defaults
sf default    socket flags defaults
sf min n    minimum values of num socket flags
sf max n    maximum values of num socket flags
sndrcv timeo t max    maximum value of send/recv timeouts
pselect timeo t max    maximum value of pselect timeouts

11.7.2 Rules
– file flags default : (ff default b : filebflag → bool) O NONBLOCK = F ∧
ff default b O ASYNC = F
– : ff default = 〈[ b := ff default b ]〉
– bool socket flags default : (sf default b : sockbflag → bool) SO BSDCOMPAT = F ∧
sf default b SO REUSEADDR = F ∧
sf default b SO KEEPALIVE = F ∧
sf default b SO OOBINLINE = F ∧
sf default b SO DONTROUTE = F
– num socket flags defaults : (sf default n : arch → socktype → socknflag → num) Linux 2 4 20 8 SOCK STREAM SO SNDBUF = 16384 ∧ (* from tests *)
sf default n WinXP Prof SP1 SOCK STREAM SO SNDBUF = 8192 ∧ (* from tests *)
sf default n FreeBSD 4 6 RELEASE SOCK STREAM SO SNDBUF = 32 ∗ 1024 ∧ (* from code *)
sf default n Linux 2 4 20 8 SOCK STREAM SO RCVBUF = 43689 ∧ (* from tests - strange number?
*)
sf default n WinXP Prof SP1 SOCK STREAM SO RCVBUF = 8192 ∧ (* from tests *)
sf default n FreeBSD 4 6 RELEASE SOCK STREAM SO RCVBUF = 57344 ∧ (* from code *)
sf default n Linux 2 4 20 8 SOCK STREAM SO SNDLOWAT = 1 ∧ (* from tests *)
sf default n WinXP Prof SP1 SOCK STREAM SO SNDLOWAT = 1 ∧ (* Note this value has not been checked in testing. *)
sf default n FreeBSD 4 6 RELEASE SOCK STREAM SO SNDLOWAT = 2048 ∧ (* from code *)
sf default n Linux 2 4 20 8 SOCK STREAM SO RCVLOWAT = 1 ∧ (* from tests *)
sf default n WinXP Prof SP1 SOCK STREAM SO RCVLOWAT = 1 ∧
sf default n FreeBSD 4 6 RELEASE SOCK STREAM SO RCVLOWAT = 1 ∧ (* from code *)
sf default n Linux 2 4 20 8 SOCK DGRAM SO SNDBUF = 65535 ∧ (* from tests *)
sf default n WinXP Prof SP1 SOCK DGRAM SO SNDBUF = 8192 ∧ (* from tests *)
sf default n FreeBSD 4 6 RELEASE SOCK DGRAM SO SNDBUF = 9216 ∧ (* from code *)
sf default n Linux 2 4 20 8 SOCK DGRAM SO RCVBUF = 65535 ∧ (* correct from tests *)
sf default n WinXP Prof SP1 SOCK DGRAM SO RCVBUF = 8192 ∧ (* correct from tests *)
sf default n FreeBSD 4 6 RELEASE SOCK DGRAM SO RCVBUF = 42080 ∧ (* from tests but: 41600 from code; i386 only as dependent on sizeof(struct sockaddr_in) *)
sf default n Linux 2 4 20 8 SOCK DGRAM SO SNDLOWAT = 1 ∧ (* from tests *)
sf default n WinXP Prof SP1 SOCK DGRAM SO SNDLOWAT = 1 ∧ (* from tests *)
sf default n FreeBSD 4 6 RELEASE SOCK DGRAM SO SNDLOWAT = 2048 ∧ (* from code *)
sf default n Linux 2 4 20 8 SOCK DGRAM SO RCVLOWAT = 1 ∧ (* from tests *)
sf default n WinXP Prof SP1 SOCK DGRAM SO RCVLOWAT = 1 ∧ (* from tests *)
sf default n FreeBSD 4 6 RELEASE SOCK DGRAM SO RCVLOWAT = 1 (* from code *)
– time socket flags defaults : (sf default t : socktflag → time) SO LINGER = ∞ ∧
sf default t SO SNDTIMEO = ∞ ∧
sf default t SO RCVTIMEO = ∞
– socket flags defaults : sf default arch socktype = 〈[ b := sf default b; n := sf default n arch socktype; t := sf default t
]〉
– minimum values of num socket flags : (sf min n : arch → socknflag → num) Linux 2 4 20 8 SO SNDBUF = 2048 ∧ (* from tests *)
sf min n WinXP Prof SP1 SO SNDBUF = 0 ∧ (* from tests *)
sf min n FreeBSD 4 6 RELEASE SO SNDBUF = 1 ∧ (* from code *)
sf min n Linux 2 4 20 8 SO RCVBUF = 256 ∧ (* from tests *)
sf min n WinXP Prof SP1 SO RCVBUF = 0 ∧ (* from tests *)
sf min n FreeBSD 4 6 RELEASE SO RCVBUF = 1 ∧ (* from code *)
sf min n Linux 2 4 20 8 SO SNDLOWAT = 1 ∧ (* from tests *)
sf min n WinXP Prof SP1 SO SNDLOWAT = 1 ∧ (* Note this value has not been checked in testing. *)
sf min n FreeBSD 4 6 RELEASE SO SNDLOWAT = 1 ∧ (* from code *)
sf min n Linux 2 4 20 8 SO RCVLOWAT = 1 ∧ (* from tests *)
sf min n WinXP Prof SP1 SO RCVLOWAT = 1 ∧ (* Note this value has not been checked in testing. *)
sf min n FreeBSD 4 6 RELEASE SO RCVLOWAT = 1 (* from code *)
– maximum values of num socket flags : (sf max n : arch → socknflag → num) Linux 2 4 20 8 SO SNDBUF = 131070 ∧ (* from tests *)
sf max n WinXP Prof SP1 SO SNDBUF = 131070 ∧ (* from tests *)
sf max n FreeBSD 4 6 RELEASE SO SNDBUF = SB MAX ∗ MCLBYTES div (MCLBYTES + MSIZE) ∧ (* from code *)
sf max n Linux 2 4 20 8 SO RCVBUF = 131070 ∧ (* from tests *)
sf max n WinXP Prof SP1 SO RCVBUF = 131070 ∧ (* from tests *)
sf max n FreeBSD 4 6 RELEASE SO RCVBUF = SB MAX ∗ MCLBYTES div (MCLBYTES + MSIZE) ∧ (* from code *)
sf max n Linux 2 4 20 8 SO SNDLOWAT = 1 ∧ (* from tests *)
sf max n WinXP Prof SP1 SO SNDLOWAT = 1 ∧ (* Note this value has not been checked in testing. *)
sf max n FreeBSD 4 6 RELEASE SO SNDLOWAT = SB MAX ∗ MCLBYTES div (MCLBYTES + MSIZE) ∧ (* clip to SO SNDBUF *)
sf max n Linux 2 4 20 8 SO RCVLOWAT = w2n INT32 SIGNED MAX ∧ (* from code *)
sf max n WinXP Prof SP1 SO RCVLOWAT = 1 ∧ (* Note this value has not been checked in testing.
*)
sf max n FreeBSD 4 6 RELEASE SO RCVLOWAT = SB MAX ∗ MCLBYTES div (MCLBYTES + MSIZE) (* clip to SO RCVBUF *)
– maximum value of send/recv timeouts : sndrcv timeo t max = time 655350000
– maximum value of pselect timeouts : pselect timeo t max = time(31 ∗ 24 ∗ 3600)

11.8 RFC-specified limits (TCP only)
Protocol value limits specified in the TCP RFCs.

11.8.1 Summary
dtsinval    RFC1323 s4.2.3: timestamp validity period.
TCP MAXWIN    maximum (scaled) window size
TCP MAXWINSCALE    maximum window scaling exponent

11.8.2 Rules
– RFC1323 s4.2.3: timestamp validity period. : dtsinval = time(24 ∗ 24 ∗ 60 ∗ 60)
– maximum (scaled) window size : TCP MAXWIN = 65535 : num
– maximum window scaling exponent : TCP MAXWINSCALE = 14 : num
Description The maximum (scaled) window size value is TCP MAXWIN, and the maximum scaling exponent is TCP MAXWINSCALE. Thus the maximum window size is TCP MAXWIN ≪ TCP MAXWINSCALE, that is, TCP MAXWIN shifted left by TCP MAXWINSCALE bits.

11.9 Protocol parameters (TCP only)
Various TCP protocol parameters, many adjustable by sysctl settings (or equivalent). The values here are typical. It was not considered worthwhile modelling these parameters changing during operation.

11.9.1 Summary
MSSDFLT    initial t maxseg, modulo route and link MTUs
SS FLTSZ LOCAL    initial snd cwnd for local connections
SS FLTSZ    initial snd cwnd for non-local connections
TCP DO NEWRENO    do NewReno fast recovery
TCP Q0MINLIMIT
TCP Q0MAXLIMIT
backlog fudge

11.9.2 Rules
– initial t maxseg, modulo route and link MTUs : MSSDFLT = 512 : num (* BSD default; RFC1122 sec.
4.2.2.6 says this MUST be 536 *)
– initial snd cwnd for local connections : SS FLTSZ LOCAL = 4 : num (* BSD; is a sysctl *)
– initial snd cwnd for non-local connections : SS FLTSZ = 1 : num (* BSD; is a sysctl *)
– do NewReno fast recovery : TCP DO NEWRENO = T : bool (* BSD default *)
– : TCP Q0MINLIMIT = 30 : num (* FreeBSD 4.6-RELEASE: tcp syncache.bucket limit *)
– : TCP Q0MAXLIMIT = 512 ∗ 30 : num (* FreeBSD 4.6-RELEASE: tcp syncache.cache limit *)
Description The incomplete-connection listen queue q0 has a nondeterministic length limit. Connections may be dropped once q0 reaches TCP Q0MINLIMIT, and must be dropped once q0 reaches TCP Q0MAXLIMIT.

– : backlog fudge(n : int) = min SOMAXCONN (clip int to num n)
Description The backlog length fudge-factor function, which translates the requested length of the listen queue into the actual value used. Some architectures apply a linear transformation here.

11.10 Time values (TCP only)
Various time intervals controlling TCP's behaviour.
11.10.1 Summary
TCPTV DELACK
TCPTV RTOBASE
TCPTV RTTVARBASE
TCPTV MIN
TCPTV REXMTMAX
TCPTV MSL
TCPTV PERSMIN
TCPTV PERSMAX
TCPTV KEEP INIT
TCPTV KEEP IDLE
TCPTV KEEPINTVL
TCPTV KEEPCNT
TCPTV MAXIDLE

11.10.2 Rules
– : TCPTV DELACK = time(1/10) (* FreeBSD 4.6-RELEASE, tcp timer.h *)
– : TCPTV RTOBASE = 3 : duration (* initial RTT, in seconds: FreeBSD 4.6-RELEASE, tcp timer.h *)
– : TCPTV RTTVARBASE = 0 : duration (* initial retransmit variance, in seconds *) (* FreeBSD has no way of encoding an initial RTT variance, but we do (thanks to tf srttvalid); it should be zero so TCPTV RTOBASE = initial RTO *)
– : TCPTV MIN = 1 : duration (* minimum RTT in absence of cached value, in seconds: FreeBSD 4.6-RELEASE, tcp timer.h *)
– : TCPTV REXMTMAX = time 64 (* BSD: maximum possible RTT *)
– : TCPTV MSL = time 30 (* maximum segment lifetime: BSD: tcp timer.h:79 *)
– : TCPTV PERSMIN = time 5 (* BSD: minimum possible persist interval: tcp timer.h:85 *)
– : TCPTV PERSMAX = time 60 (* BSD: maximum possible persist interval: tcp timer.h:86 *)
– : TCPTV KEEP INIT = time 75 (* connect timeout: BSD: tcp timer.h:88 *)
– : TCPTV KEEP IDLE = time(120 ∗ 60) (* time before first keepalive probe: BSD: tcp timer.h:89 *)
– : TCPTV KEEPINTVL = time 75 (* time between subsequent keepalive probes: BSD: tcp timer.h:90 *)
– : TCPTV KEEPCNT = 8 : num (* max number of keepalive probes (+/- a few?): BSD: tcp timer.h:91 *)
– : TCPTV MAXIDLE = 8 ∗ TCPTV KEEPINTVL (* BSD calls this tcp maxidle *)

11.11 Timing-related parameters (TCP only)
Parameters relating to TCP's exponential backoff.
11.11.1 Summary
TCP BSD BACKOFFS    TCP exponential retransmit backoff: BSD: from source code, tcp timer.c:155
TCP LINUX BACKOFFS    TCP exponential retransmit backoff: Linux: experimentally determined
TCP WINXP BACKOFFS    TCP exponential retransmit backoff: WinXP: experimentally determined
TCP MAXRXTSHIFT    TCP maximum retransmit shift
TCP SYNACKMAXRXTSHIFT    TCP maximum SYNACK retransmit shift
TCP SYN BSD BACKOFFS    TCP exponential SYN retransmit backoff: BSD: tcp timer.c:152
TCP SYN LINUX BACKOFFS    TCP exponential SYN retransmit backoff: Linux: experimentally determined
TCP SYN WINXP BACKOFFS    TCP exponential SYN retransmit backoff: WinXP: experimentally determined

11.11.2 Rules
– TCP exponential retransmit backoff: BSD: from source code, tcp timer.c:155 : TCP BSD BACKOFFS = [1; 2; 4; 8; 16; 32; 64; 64; 64; 64; 64; 64; 64] : num list
– TCP exponential retransmit backoff: Linux: experimentally determined : TCP LINUX BACKOFFS = [1; 2; 4; 8; 16; 32; 64; 128; 256; 512; 512] : num list (* Note: the tail may be incomplete *)
– TCP exponential retransmit backoff: WinXP: experimentally determined : TCP WINXP BACKOFFS = [1; 2; 4; 8; 16] : num list (* Note: the tail may be incomplete *)
– TCP maximum retransmit shift : TCP MAXRXTSHIFT = 12 : num (* TCPv2p842 *)
– TCP maximum SYNACK retransmit shift : TCP SYNACKMAXRXTSHIFT = 3 : num (* FreeBSD 4.6-RELEASE, tcp syncache.c:SYNCACHE MAXREXMTS *)
– TCP exponential SYN retransmit backoff: BSD: tcp timer.c:152 : TCP SYN BSD BACKOFFS = [1; 1; 1; 1; 1; 2; 4; 8; 16; 32; 64; 64; 64] : num list (* Our experimentation shows that this list stops at 8. This will be due to the connection establishment timer firing. Values here are obtained from the BSD source *)
– TCP exponential SYN retransmit backoff: Linux: experimentally determined : TCP SYN LINUX BACKOFFS = [1; 2; 4; 8; 16] : num list (* This list might be longer.
Experimentation does not show further entries, perhaps due to the connection establishment timer firing *)
– TCP exponential SYN retransmit backoff: WinXP: experimentally determined : TCP SYN WINXP BACKOFFS = [1; 2] : num list (* This list might be longer. Experimentation does not show further entries, perhaps due to the connection establishment timer firing *)

Part XII
TCP1 auxFns

Chapter 12
Auxiliary functions
This file defines a large number of auxiliary functions to the host specification.

12.1 Architecture handling (TCP and UDP)
Many aspects of host behaviour differ from one OS to another, and so a host has an architecture parameter detailing its precise OS and version (e.g., Linux 2 4 20 8). Very often, however, we do not need to be so precise – a certain behaviour might apply to all Linux, or even all Unix, OSes. Below we define predicates for these cases, to allow variant architectures to be easily added later.

12.1.1 Summary
windows arch    test if host architecture is Windows
bsd arch    test if host architecture is BSD
linux arch    test if host architecture is Linux
unix arch    test if host architecture is Unix

12.1.2 Rules
– test if host architecture is Windows : windows arch arch = (arch ∈ {WinXP Prof SP1})
– test if host architecture is BSD : bsd arch arch = (arch ∈ {FreeBSD 4 6 RELEASE})
– test if host architecture is Linux : linux arch arch = (arch ∈ {Linux 2 4 20 8})
– test if host architecture is Unix : unix arch arch = (arch ∈ {Linux 2 4 20 8; FreeBSD 4 6 RELEASE})

12.2 Interfaces and IP addresses (TCP and UDP)
Constructors, predicates, and helper functions that deal with interfaces, IP addresses, and routing.
12.2.1 Summary
mask    apply a netmask to an IP to obtain the network number
mask bits    compute network bitmask from netmask
IP    constructor for dotted-decimal IP addresses
IN MULTICAST    the set of multicast addresses
INADDR BROADCAST    the local broadcast address
LOOPBACK ADDRS    the set of loopback addresses
ip localhost    the canonical loopback address, aka 'localhost'
in loopback    is IP address a loopback address?
in local    is IP address a local address?
local ips    the set of local IP addresses
local primary ips    the set of local primary IP addresses
is localnet    is IP address on a local subnet of this host?
if broadcast    is IP address a broadcast address?
if any    the set of addresses in an interface's subnet
is broadormulticast    is IP address a broadcast/multicast address?
routeable    compute set of routeable addresses for a routing table entry
outroute ifids    determine list of possible sending interfaces
ifid up    is the interface up?
outroute    compute interface to use to send to given IP, if any
auto outroute    compute source address to use to route to given IP
test outroute ip    test if we can route to given IP, returning appropriate error if not
test outroute    if destination IP specified, do test outroute ip
loopback on wire    check if a message bears a loopback address

12.2.2 Rules
– apply a netmask to an IP to obtain the network number : mask(NETMASK m)(ip n) = ip((n div (2 ∗∗ (32 − m))) ∗ 2 ∗∗ (32 − m))
– compute network bitmask from netmask : mask bits(NETMASK m) = ((2 ∗∗ 32 − 1) div (2 ∗∗ (32 − m))) ∗ 2 ∗∗ (32 − m)
Description Netmask operations. Recall netmasks are stored as the number of 1 bits in the mask; thus 255.255.128.0 is modelled by NETMASK 17.
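The two netmask rules above can be checked with a small Python sketch. It is only an illustration of the arithmetic, not part of the specification: addresses are plain integers, and the helper names ip_of_octets and dotted are hypothetical conveniences that do not appear in the rules.

```python
def mask(m: int, n: int) -> int:
    """Network number of address n under NETMASK m (m leading 1 bits)."""
    block = 2 ** (32 - m)          # size of one subnet under this mask
    return (n // block) * block    # clear the low (32 - m) bits


def mask_bits(m: int) -> int:
    """The 32-bit mask value itself, e.g. 17 -> 255.255.128.0."""
    block = 2 ** (32 - m)
    return ((2 ** 32 - 1) // block) * block


def ip_of_octets(a: int, b: int, c: int, d: int) -> int:
    """Hypothetical helper: build a 32-bit address from four octets."""
    return a * 2 ** 24 + b * 2 ** 16 + c * 2 ** 8 + d


def dotted(n: int) -> str:
    """Hypothetical helper: render a 32-bit address in dotted-decimal form."""
    return ".".join(str((n >> shift) & 0xFF) for shift in (24, 16, 8, 0))


if __name__ == "__main__":
    print(dotted(mask_bits(17)))
    print(dotted(mask(17, ip_of_octets(192, 168, 200, 5))))
```

Integer division followed by multiplication is used here, rather than bitwise operators, simply to stay close to the div and ∗ form of the rules.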
– constructor for dotted-decimal IP addresses : IP(a : num)(b : num)(c : num)(d : num) = ip(a ∗ 2 ∗∗ 24 + b ∗ 2 ∗∗ 16 + c ∗ 2 ∗∗ 8 + d)
– the set of multicast addresses : IN MULTICAST = {i | mask(NETMASK 4) i = IP 224 0 0 0}
– the local broadcast address : INADDR BROADCAST = IP 255 255 255 255
– the set of loopback addresses : LOOPBACK ADDRS = {i | mask(NETMASK 8) i = IP 127 0 0 0}
– the canonical loopback address, aka 'localhost' : ip localhost = IP 127 0 0 1
– is IP address a loopback address? : in loopback i = (i ∈ LOOPBACK ADDRS)
– is IP address a local address? : in local(ifds : ifid ↦ ifd) i = (in loopback i ∨ i ∈ (bigunion{ifd .ipset | ifd ∈ (rng(ifds))})) (* Note: the test "in loopback i" is usually redundant as there is almost always a loopback interface in ifds with ipset = LOOPBACK ADDRS *)

Rule version: $Id: TCP1 auxFnsScript.sml,v 1.219 2005/03/17 11:35:34 kw217 Exp $

– the set of local IP addresses : local ips(ifds : ifid ↦ ifd) = bigunion{ifd .ipset | ifd ∈ (rng(ifds))} (* annoying: ifd is a constructor, and { | } has no binder to allow us to shadow it *)
– the set of local primary IP addresses : local primary ips(ifds : ifid ↦ ifd) = {ifd .primary | ifd ∈ (rng(ifds))}
– is IP address on a local subnet of this host? : is localnet(ifds0 : ifid ↦ ifd) i = (∃ifd. ifd ∈ (rng(ifds0)) ∧ mask ifd .netmask i = mask ifd .netmask ifd .primary)
– is IP address a broadcast address?
: if broadcast(ifd0 : ifd) = case (ifd0 .netmask, mask ifd0 .netmask ifd0 .primary) of
(NETMASK m, ip n (* n has been masked by m above *)) → ip(n + 2 ∗∗ (32 − m) − 1)
(* Note: would be much easier if IPs were actually word32 rather than num *) (* corresponds to INADDR BROADCAST for the interface *)
– the set of addresses in an interface's subnet : if any(ifd0 : ifd) = case (ifd0 .netmask, mask ifd0 .netmask ifd0 .primary) of
(NETMASK m, ip n (* n has been masked by m above *)) → ip(n)
(* Note: would be much easier if IPs were actually word32 rather than num *)
Description Various distinguished IP addresses and sets of IP addresses. Some of these are dependent on the host's set of interfaces.

– is IP address a broadcast/multicast address? : is broadormulticast(ifds0 : ifid ↦ ifd) i =
(i ∈ IN MULTICAST ∨ (* is i a multicast address? *)
i = INADDR BROADCAST ∨ (* is i the default broadcast address? [CORRECT NAME?] *)
∃(k, ifd0) :: ifds0. i ∈ {if broadcast ifd0; (* is i the broadcast addr for any interface? *)
if any ifd0}) (* RFC 1122 - should accept an all-0s or all-1s broadcast address. all three OSes do *)
Description Test if IP address i is a broadcast or multicast address, wrt the given set of interfaces ifds0. If no interfaces given (ifds0 = ∗), then treat only INADDR BROADCAST as a broadcast address. These correctly use the interface rather than the routing-table entry to check what is a broadcast address and what is in the local net of this host. Whether there is a route allowing a send to that local net is another question entirely, although the two data structures should be consistent.
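As a rough illustration of the broadcast/multicast test, the following Python sketch mirrors the three disjuncts of is broadormulticast. It is a sketch under stated assumptions, not the specification: an interface is reduced to a hypothetical (primary address, netmask length) pair, the interface map becomes a plain list, and addresses are integers built by the hypothetical helper ip_of_octets.

```python
from typing import Iterable, Tuple


def ip_of_octets(a: int, b: int, c: int, d: int) -> int:
    """Hypothetical helper: build a 32-bit address from four octets."""
    return a * 2 ** 24 + b * 2 ** 16 + c * 2 ** 8 + d


def mask(m: int, n: int) -> int:
    """Network number of address n under a netmask of m leading 1 bits."""
    block = 2 ** (32 - m)
    return (n // block) * block


INADDR_BROADCAST = ip_of_octets(255, 255, 255, 255)


def in_multicast(i: int) -> bool:
    """Multicast addresses are those whose top four bits match 224.0.0.0."""
    return mask(4, i) == ip_of_octets(224, 0, 0, 0)


def if_broadcast(primary: int, m: int) -> int:
    """All-1s host part within the interface's subnet."""
    return mask(m, primary) + 2 ** (32 - m) - 1


def if_any(primary: int, m: int) -> int:
    """All-0s host part within the interface's subnet."""
    return mask(m, primary)


def is_broadormulticast(ifds: Iterable[Tuple[int, int]], i: int) -> bool:
    """ifds is an assumed simplification: (primary, netmask length) pairs."""
    return (
        in_multicast(i)
        or i == INADDR_BROADCAST
        or any(i in (if_broadcast(p, m), if_any(p, m)) for p, m in ifds)
    )
```

For example, with a single interface 192.168.1.10/24 the addresses 192.168.1.255 and 192.168.1.0 are both accepted, while 192.168.1.10 itself is not; with no interfaces, only the multicast range and 255.255.255.255 are accepted.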
– compute set of routeable addresses for a routing table entry : routeable(rte : routing table entry) = {i | mask rte.destination netmask i = mask rte.destination netmask rte.destination ip}
– determine list of possible sending interfaces : outroute ifids(i2, rttab : routing table) = MAP OPTIONAL(λrte. if i2 ∈ routeable rte then ↑ rte.ifid else ∗) rttab
Description Determine the list of possible interfaces to use in sending to a given IP, based on the routing table.

– is the interface up? : ifid up ifds ifid = (ifds[ifid]).up
– compute interface to use to send to given IP, if any : outroute(i2, rttab : routing table, ifds : ifid ↦ ifd) = case filter(ifid up ifds)(outroute ifids(i2, rttab)) of
[ ] → ∗
‖ (ifid :: _) → ↑ ifid
Description Determine the interface to use to send to a given IP, if possible. Returns the first up interface that can route to the destination.

– compute source address to use to route to given IP : auto outroute(i2′, ↑ i2, rttab, ifds) = {i2} ∧
auto outroute(i2′, ∗, rttab, ifds) = case outroute(i2′, rttab, ifds) of
↑ ifid → {(ifds[ifid]).primary}
‖ ∗ → {}
Description Compute source address to use to route to a given IP, if any possible. If the caller provides an address, use that without checking; otherwise try to find one. Do not return a specific error code. Used for autobinding to a local IP address.

– test if we can route to given IP, returning appropriate error if not : test outroute ip(i2 : ip, rttab, ifds, arch) =
let ifids = outroute ifids(i2, rttab) in
if ifids = [ ] then (if linux arch arch then ↑ ENETUNREACH else ↑ EHOSTUNREACH)
else if filter(ifid up ifds) ifids = [ ] then ↑ ENETDOWN
else ∗
– if destination IP specified, do test outroute ip : test outroute(msg : msg, rttab, ifds, arch) = case msg.is2 of
↑ i2 → ↑(test outroute ip(i2, rttab, ifds, arch))
‖ _ → ∗
Description Check that we can route the message out.
First check that there is an interface that can route to the destination address. If not, fail with EHOSTUNREACH (ENETUNREACH on Linux). Then, check that there is one of these that is up. If not, ENETDOWN. Otherwise, succeed (indicated by empty set of possible errors). The message should have i2 specified.
You might think that we should check that the interface can send from the source address also, but in fact, in the weak end system model, they don't need to be the same interface. We have tested Linux, and find this behaviour. Not sure yet about BSD, but suspect it will be the same. test 20030204T1525 or so.
test outroute modified to be functional rather than relational, as behaviour is purely deterministic. The result is of type error option option, where the first level of "optionality" indicates whether or not the function is even being called on valid input (whether or not message has an is2 "field"), and the next level indicates errors being raised, or not.
Note that if we "knew" that this would only be called on messages with ok is2 fields, then it would be easier still to just use the function the, ignore the fact that the function had an unspecified result on arguments with bad is2 fields, and make the result type error option.

– check if a message bears a loopback address : loopback on wire(msg : msg)(ifds : ifid ↦ ifd) = case (msg.is1, msg.is2) of
(∗, ∗) → F
‖ (∗, ↑ j) → F
‖ (↑ i, ∗) → F
‖ (↑ i, ↑ j) → in loopback i ∧ ¬ in local ifds j
Description RFC1122 says loopback addresses must never appear on the wire. Here we test if this segment is in violation. Ideally, we'd check "(src or dest in loopback net) and interface not loopback", but we can't see which interface it's going out of in this model. The condition above is possibly the best approximation we can make if one considers the possible values of msg.is1 and msg.is2.
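The routing checks in this section can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the specification itself: a routing table entry is reduced to a hypothetical (destination, netmask length, interface id) triple, interface state to a dictionary from interface id to an "up" flag, and error codes to strings. The name check_outroute_ip is a stand-in for test outroute ip.

```python
from typing import Dict, List, Optional, Tuple

Route = Tuple[int, int, str]  # assumed shape: (destination ip, netmask length, ifid)


def ip_of_octets(a: int, b: int, c: int, d: int) -> int:
    """Hypothetical helper: build a 32-bit address from four octets."""
    return a * 2 ** 24 + b * 2 ** 16 + c * 2 ** 8 + d


def mask(m: int, n: int) -> int:
    """Network number of address n under a netmask of m leading 1 bits."""
    block = 2 ** (32 - m)
    return (n // block) * block


def outroute_ifids(i2: int, rttab: List[Route]) -> List[str]:
    """Interfaces whose routing entry covers i2, in routing-table order."""
    return [ifid for dest, m, ifid in rttab if mask(m, i2) == mask(m, dest)]


def outroute(i2: int, rttab: List[Route], up: Dict[str, bool]) -> Optional[str]:
    """First interface that is up and can route to i2, or None."""
    candidates = [ifid for ifid in outroute_ifids(i2, rttab) if up[ifid]]
    return candidates[0] if candidates else None


def check_outroute_ip(
    i2: int, rttab: List[Route], up: Dict[str, bool], linux: bool
) -> Optional[str]:
    """Return an error name, or None when routing is possible."""
    ifids = outroute_ifids(i2, rttab)
    if not ifids:
        # No routing entry covers the destination at all.
        return "ENETUNREACH" if linux else "EHOSTUNREACH"
    if not any(up[ifid] for ifid in ifids):
        # Entries exist, but every candidate interface is down.
        return "ENETDOWN"
    return None
```

Returning None for success corresponds to the ∗ result of the rule; the outer option layer of test outroute (for messages without a destination address) is not modelled here.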
12.3 Files, file descriptors, and sockets (TCP and UDP)
The open files of a host are modelled by a set of open file descriptions, indexed by fid. The open files of a process are identified by file descriptor fd, which is an index into a table of fids. This table is modelled by a finite map. File descriptors are isomorphic to the natural numbers.

12.3.1 Summary
fdlt    < comparison on file descriptors
fdle    ≤ comparison on file descriptors
leastfd    least fd satisfying predicate P
nextfd    next file descriptor to use
fid ref count    count references to given fid
sane socket    socket sanity invariants hold

12.3.2 Rules
– < comparison on file descriptors : fdlt(FD n)(FD m) = n < m
– ≤ comparison on file descriptors : fdle(FD n)(FD m) = n ≤ m
– least fd satisfying predicate P : leastfd P = FD(least n. P(FD n))
– next file descriptor to use : nextfd arch fds fd′ =
if windows arch arch then (* no ordering on Windows fds; they're just handles *)
fd′ ∉ dom(fds)
else (* POSIX architectures allocate in order *)
fd′ = leastfd fd′. fd′ ∉ dom(fds)
Description Basic operations on file descriptors. Normally, when a new file descriptor is required the least unused one is used.
Variations
WinXP    On Windows, file descriptors are opaque handles, and have no useful ordering. In particular, nextfd returns an arbitrary unused file descriptor.

– count references to given fid : fid ref count(fds : fd ↦ fid, fid) = card(dom((rrestrict fds {fid})))
Description A file is closed when its reference count drops to zero. This function determines the reference count of a file (strictly, a fid).

– socket sanity invariants hold : sane socket sock = case sock .pr of
TCP PROTO tcp sock → (*LENGTH tcp sock.rcvq <= sock.sf.n(SO RCVBUF) /\ (* true?? *)*) length tcp sock .rcvq ≤ TCP MAXWIN ≪ TCP MAXWINSCALE (*/\*) (*LENGTH tcp sock.sndq <= sock.sf.n(SO SNDBUF) (* true??
*)*)
‖ UDP PROTO udp sock → T
Description There are some demonstrable invariants on a socket; this definition asserts them. These are largely here to provide explicit bounds to the symbolic evaluator.

12.4 Binding (TCP and UDP)
Both TCP and UDP have a concept of a socket being bound to a local port, which means that that socket may receive datagrams addressed to that port. A specific local IP address may also be specified, and a remote IP address and/or port. This 'quadruple' (really a quintuple, since the protocol is also relevant) is used to determine the socket that best matches an incoming datagram. The functions in this section determine this best-matching socket, using rules appropriate to each protocol.
Support is also provided for determining which ports are available to be bound by a new socket, and for automatically choosing a port to bind to in cases where the user does not specify one.

12.4.1 Summary
bound ports protocol autobind    the set of ports currently bound by a socket for a protocol
bound port allowed    is it permitted to bind the given (IP,port) pair?
autobind    set of ports available for autobinding
bound after    was sid bound more recently than sid′?
match score    score the match against the given pattern of the given quadruple
lookup udp    the set of sockets matching an address quad, for UDP
tcp socket best match    the set of sockets matching a quad, for TCP
lookup icmp    the set of sockets matching a quad, for ICMP

12.4.2 Rules
– the set of ports currently bound by a socket for a protocol : bound ports protocol autobind pr socks = {p | ∃s : socket. s ∈ rng(socks) ∧ s.ps1 = ↑ p ∧ proto of s.pr = pr}
Description Rebinding of ports already bound is often restricted. bound ports protocol autobind is the set of all ports having a socket of the given protocol binding that port.

– is it permitted to bind the given (IP,port) pair?
: bound port allowed pr socks sf arch is p =
p ∉ {port | ∃s : socket. s ∈ rng(socks) ∧ s.ps1 = ↑ port ∧ proto eq s.pr pr ∧
(if bsd arch arch ∧ SO REUSEADDR ∈ sf .b then s.is2 = ∗ ∧ s.is1 = is
else if linux arch arch ∧ SO REUSEADDR ∈ sf .b ∧ SO REUSEADDR ∈ s.sf .b ∧ ((∃tcp sock. TCP PROTO(tcp sock) = s.pr ∧ ¬(tcp sock .st = LISTEN)) ∨ ∃udp sock. UDP PROTO(udp sock) = s.pr) then F (* If socket is not in LISTEN state or is a UDP socket can always rebind here *)
else if windows arch arch ∧ SO REUSEADDR ∈ sf .b then F (* can rebind any UDP address; not sure about TCP - assume the same for now *)
else (is = ∗ ∨ s.is1 = ∗ ∨ (∃i : ip. is = ↑ i ∧ s.is1 = ↑ i)))}
Description This determines whether binding a socket (of protocol pr) to local address is, p is permitted, by considering the other bound sockets on the host and the state of the sockets' SO REUSEADDR flags. Note: SB believes this definition is correct for TCP and UDP on BSD and Linux through exhaustive manual verification. Note: WinXP is still to be checked.

– set of ports available for autobinding : autobind(↑ p, _, _) = {p} ∧
autobind(∗, pr, socks) = ephemeral ports diff (bound ports protocol autobind pr socks)
Description Note that SO REUSEADDR is not considered when choosing a port to autobind to.

– was sid bound more recently than sid′?
: bound after sid sid′ [ ] = ASSERTION FAILURE "bound after" (* should never reach this case *) ∧
bound after sid sid′ (sid0 :: bound) =
if sid = sid0 then T (* newly-bound sockets are added to the head *)
else if sid′ = sid0 then F
else bound after sid sid′ bound
– score the match against the given pattern of the given quadruple : (match score(_, ∗, _, _) _ = 0n) ∧
(match score(∗, ↑ p1, ∗, ∗)(i3, ps3, i4, ps4) = if ps4 = ↑ p1 then 1 else 0) ∧
(match score(↑ i1, ↑ p1, ∗, ∗)(i3, ps3, i4, ps4) = if (i1 = i4) ∧ (↑ p1 = ps4) then 2 else 0) ∧
(match score(↑ i1, ↑ p1, ↑ i2, ∗)(i3, ps3, i4, ps4) = if (i2 = i3) ∧ (i1 = i4) ∧ (↑ p1 = ps4) then 3 else 0) ∧
(match score(↑ i1, ↑ p1, ↑ i2, ↑ p2)(i3, ps3, i4, ps4) = if (↑ p2 = ps3) ∧ (i2 = i3) ∧ (i1 = i4) ∧ (↑ p1 = ps4) then 4 else 0)
Description These two functions are used to match an incoming UDP datagram to a socket. The bound after function returns T if the socket sid (the first argument) was bound after the socket sid′ (the second argument) according to a list of bound sockets (the third argument). The match score function gives a score specifying how closely two address quads, one from a socket and one from a datagram, correspond; a higher score indicates a more specific match.

– the set of sockets matching an address quad, for UDP : lookup udp socks quad bound arch =
{sid | sid ∈ dom(socks) ∧
let s = socks[sid] in
let sn = match score(s.is1, s.ps1, s.is2, s.ps2) quad in
sn > 0 ∧
if windows arch arch then
if sn = 1 then ¬(∃(sid′, s′) :: (socks\\sid). match score(s′.is1, s′.ps1, s′.is2, s′.ps2) quad > sn)
else T
else ¬(∃(sid′, s′) :: (socks\\sid).
(match score(s′.is1, s′.ps1, s′.is2, s′.ps2) quad > sn ∨
(linux arch arch ∧ match score(s′.is1, s′.ps1, s′.is2, s′.ps2) quad = sn ∧ bound after sid′ sid bound)))}
Description This function returns a set of UDP sockets which the datagram with address quad quad may be delivered to. For FreeBSD and Linux there is only one such socket; for WinXP there may be multiple.
For each socket in the finite map of sockets socks, the score, sn, of the matching of the socket's address quad and quad is computed using match score.
Variations
FreeBSD    For FreeBSD, the set contains the sockets for which the score is greater than zero and there is no other socket in socks with a higher score.
Linux    For Linux, the set contains the sockets for which the score is greater than zero, there are no sockets with a higher score, and the socket was bound to its local port after all the other sockets with the same score.
WinXP    For WinXP, the set contains all the sockets with score greater than one and also the sockets for which the score is one, sn = 1, and there are no sockets with greater scores.

– the set of sockets matching a quad, for TCP : tcp socket best match(socks : sid ↦ socket)(sid, sock)(seg : tcpSegment) arch =
(* is the socket sid the best match for segment seg? *)
let s = sock in
let score = match score(s.is1, s.ps1, s.is2, s.ps2)(the seg .is1, seg .ps1, the seg .is2, seg .ps2) in
¬(∃(sid′, s′) :: socks\\sid. match score(s′.is1, s′.ps1, s′.is2, s′.ps2)(the seg .is1, seg .ps1, the seg .is2, seg .ps2) > score)
Description This function determines whether a given socket sid is the best match for a received TCP segment seg. The score (obtained using match score) for the given socket is determined, and compared with the score for each other socket in socks. If none have a greater score, this is the best match and true is returned; otherwise, false is returned.
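The scoring and best-match selection described above can be sketched in Python. This is an illustration only, with several assumptions: the wildcard ∗ is represented by None, addresses and ports are plain integers, a socket is reduced to its (is1, ps1, is2, ps2) pattern, and best_matches follows the FreeBSD variation of lookup udp (no bound after tie-break). Patterns not covered by any clause of match score are given score 0 here, which the rule itself leaves unspecified.

```python
from typing import Dict, Optional, Set, Tuple

Pattern = Tuple[Optional[int], Optional[int], Optional[int], Optional[int]]
Quad = Tuple[int, Optional[int], int, Optional[int]]


def match_score(pattern: Pattern, quad: Quad) -> int:
    """Score a socket's (is1, ps1, is2, ps2) against a datagram's (i3, ps3, i4, ps4)."""
    is1, ps1, is2, ps2 = pattern
    i3, ps3, i4, ps4 = quad
    if ps1 is None:
        return 0  # an unbound socket never matches
    port_ok = ps4 == ps1
    if is1 is None and is2 is None and ps2 is None:
        return 1 if port_ok else 0
    if is1 is not None and is2 is None and ps2 is None:
        return 2 if (is1 == i4 and port_ok) else 0
    if is1 is not None and is2 is not None and ps2 is None:
        return 3 if (is2 == i3 and is1 == i4 and port_ok) else 0
    if is1 is not None and is2 is not None and ps2 is not None:
        return 4 if (ps2 == ps3 and is2 == i3 and is1 == i4 and port_ok) else 0
    return 0  # assumption: shapes outside the five clauses score 0


def best_matches(socks: Dict[str, Pattern], quad: Quad) -> Set[str]:
    """Sockets with a positive score that no other socket beats."""
    scores = {sid: match_score(pat, quad) for sid, pat in socks.items()}
    return {
        sid
        for sid, sn in scores.items()
        if sn > 0 and not any(o > sn for other, o in scores.items() if other != sid)
    }
```

With a wildcard listener on port 80, a socket bound to a specific local address on port 80, and a fully connected socket, a datagram from the connected peer scores 1, 2, and 4 respectively, so only the connected socket is selected; a datagram from any other peer falls back to the socket bound to the specific local address.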
– the set of sockets matching a quad, for ICMP : lookup icmp socks icmp arch bound =
{sid0 | ∃(sid, sock) :: socks.
sock .ps1 = icmp.ps3 ∧ proto of sock .pr = icmp.proto ∧ sid0 = sid ∧
if windows arch arch then T
else
sock .is1 = icmp.is3 ∧ sock .is2 = icmp.is4 ∧
(sock .ps2 = icmp.ps4 ∨
(linux arch arch ∧ proto of sock .pr = PROTO UDP ∧ sock .ps2 = ∗ ∧
¬(∃(sid′, s) :: (socks\\sid).
s.is1 = icmp.is3 ∧ s.is2 = icmp.is4 ∧ s.ps1 = icmp.ps3 ∧ s.ps2 = icmp.ps4 ∧ proto of s.pr = icmp.proto ∧ bound after sid′ sid bound)))}
Description This function returns the set of sockets matching a received ICMP datagram icmp. An ICMP datagram contains the initial portion of the header of the original message to which it is a response. For a socket to match, it must at least be bound to the same port and protocol as the source of the original message. Beyond this, architectures differ. Usually, the socket must be connected, and connected to the same port as the original destination; and the source and destination IP addresses must agree.
Variations
WinXP    For Windows, the socket need not be connected, and the source and destination IP addresses need not agree; an ICMP is delivered to one socket bound to the same port and protocol as the original source.
Linux    For Linux, UDP ICMPs may also be delivered to unconnected sockets, as long as no matching connected socket was bound more recently than that socket.
FreeBSD    For FreeBSD, the behaviour is as described above.

12.5 Timers (TCP and UDP)
Many TCP protocol events are time-dependent, and time is also necessary for a useful specification of the behaviour of system calls, returns, and datagram emission and receipt. These common time-dependent behaviours are described using the timers below.
12.5.1 Summary

slow timer       TCP slow timer, typically 500ms resolution (for keepalive, MSL, linger, badrxtwin)
fast timer       TCP fast timer, typically 200ms resolution (for delack)
kern timer       kernel timer, typically 10ms resolution (for timestamp valid, pselect)
sched timer      scheduling timer (for OS returns)
inqueue timer    in-queue timer (incoming message processing)
outqueue timer   out-queue timer (outgoing message emission)

12.5.2 Rules

– TCP slow timer, typically 500ms resolution (for keepalive, MSL, linger, badrxtwin) :
slow timer d = fuzzy timer d SLOW TIMER INTVL SLOW TIMER MODEL INTVL

– TCP fast timer, typically 200ms resolution (for delack) :
fast timer d = fuzzy timer d FAST TIMER INTVL FAST TIMER MODEL INTVL

– kernel timer, typically 10ms resolution (for timestamp valid, pselect) :
kern timer d = fuzzy timer d KERN TIMER INTVL KERN TIMER MODEL INTVL

– scheduling timer (for OS returns) :
sched timer = upper timer dschedmax

– in-queue timer (incoming message processing) :
inqueue timer = upper timer diqmax

– out-queue timer (outgoing message emission) :
outqueue timer = upper timer doqmax

Description Traditionally TCP has been implemented using two timers, a slow timer ticking once every 500ms, and a fast timer ticking once every 200ms. In addition, the kernel is assumed to maintain a tick count, typically incremented every 10ms. Measuring intervals with such a timer means an uncertainty in duration: the observed interval may be up to one tick less than the specified interval, and is on average half a tick less. We model this with a fuzzy timer (p47), fuzzy to the left by eps and to the right by fuz, i.e., [d − eps, d + fuz]. The eps, one tick, accounts for the fact that we do not know where in the clock’s period we set the timer. The fuz (some global fuzziness) is included to account for the atomicity of the model.
For example, an implementation TCP processing step, performed by tcp_output etc., occupies some time interval, with timers such as tt rexmt being reset at various points within that interval. The model, on the other hand, has atomic transitions. The possible time difference between multiple timer resets in the same step must be accounted for by this fuzziness. For example, a model rule may reset the tt rexmt timer and also leave a segment on the output queue, with time passing before the segment is seen on the wire.

The various flavours of upper timer (p??) – sched timer, inqueue timer, outqueue timer – fire at any time between now and dmax. These events may occur at any time up to a specified maximum delay.

12.6 Time values for socket options (TCP and UDP)

The TLang sockets interface representation of a time is as a pair of integers, the first for seconds and the second for nanoseconds. It also uses (int#int) option representations, e.g. in the arguments to setsockopt and pselect and the result of setsockopt, with the None value meaning infinity. Internally, time is represented as a time value, either a real or infinity. These routines convert between the various types. Note that they allow ill-formed tltimeopts without complaint.

12.6.1 Summary

time of tltime      convert (sec,nsec) pair to real time value
time of tltimeopt   convert optional (sec,nsec) pair to real time value (where ∗ mapped to ∞)
tltimeopt wf        is an optional (sec,nsec) pair well-formed?
tltimeopt of time   convert a time value to an optional (sec,nsec) pair

12.6.2 Rules

– convert (sec,nsec) pair to real time value :
(time of tltime : int#int → time) (sec, nsec) =
 time(real of int sec + real of int nsec/1000000000)

– convert optional (sec,nsec) pair to real time value (where ∗ mapped to ∞) :
time of tltimeopt ∗ = ∞ ∧
time of tltimeopt(↑ sn) = time of tltime sn

– is an optional (sec,nsec) pair well-formed? :
(tltimeopt wf : (int#int) option → bool) ∗ = T ∧
tltimeopt wf(↑(sec, nsec)) = (sec ≥ 0 ∧ nsec ≥ 0 ∧ nsec < 1000000000)

– convert a time value to an optional (sec,nsec) pair :
(tltimeopt of time : time → (int#int) option) t =
 @x. tltimeopt wf x ∧ time of tltimeopt x = t
 (* garbage if t not nonnegative integral number of nsec *)

Description A tltimeopt is well-formed if it is ∗, or if sec and nsec are both non-negative and nsec is less than 10^9.

12.7 Queues (TCP and UDP)

Messages are queued at various points within the implementations, e.g. within the network interface hardware and in the kernel. These queues can become full, though their "size" is not simple to describe; e.g. in BSD there is some accounting of the number of mbufs used.

We model this with simple queues, for example the host message inqueue and outqueue (see iq and oq, host (p61)) which have lists of messages. These model the combination of network interface and kernel queues. We allow them to nondeterministically be full for enqueue operations, to ensure that the specification includes all real-world traces. This behaviour is guarded by INFINITE RESOURCES.

The nondeterminism means that queue operations must be relations, not functions, and hence that many definitions that use them must also be relational.

Many queues are also associated with timers (see e.g. inqueue timer (p??)) bounding the times within which they must next be processed.

One might want additional properties, e.g.
(1) if a queue is empty then at least one message can be enqueued, or more generally a specified finite lower bound on queue size; or (2) if a queue is full then it remains so until a message is dequeued (perhaps only for enqueue attempts of at least the same size). At present we see no need for the additional complication.

12.7.1 Summary

enqueue                 attempt to enqueue a message
enqueue iq              attempt to enqueue onto the in-queue
enqueue oq              attempt to enqueue onto the out-queue
dequeue                 attempt to dequeue a message
dequeue iq              attempt to dequeue from the in-queue
dequeue oq              attempt to dequeue from the out-queue
route and enqueue oq    attempt to route and then enqueue an outgoing message
enqueue list qinfo      attempt to enqueue a list of messages
enqueue list            attempt to enqueue a list of messages, ignoring success flags
enqueue oq list qinfo   attempt to enqueue a list of messages onto the out-queue
enqueue oq list         attempt to enqueue a list of messages onto the out-queue, ignoring success flags
accept incoming q0      should an incoming incomplete connection be accepted?
accept incoming q       should an incoming completed connection be accepted?
drop from q0            drop from incomplete-connection queue?

12.7.2 Rules

– attempt to enqueue a message :
enqueue dq((q)_d, msg, (q′)_d′, queued) =
 ((INFINITE RESOURCES ⟹ queued) ∧
  (q′, d′) = (if queued then (q @ [msg], dq) else (q, d)))

Description This is a relation between an original timed queue (q)_d, a message to enqueue, msg, a resulting timed queue (q′)_d′, and a boolean queued indicating whether the enqueue was successful or not.
For a successful enqueue the timer on the resulting queue is set to dq.

– attempt to enqueue onto the in-queue :
enqueue iq = enqueue inqueue timer

– attempt to enqueue onto the out-queue :
enqueue oq = enqueue outqueue timer

Description Add a message to the respective queue, returning the new queue and a flag saying whether the message was successfully queued.

– attempt to dequeue a message :
dequeue dq((q)_d, (q′)_d′, msg) =
 case q of
   (msg0 :: q0) → q′ = q0 ∧ msg = ↑ msg0 ∧ d′ = (if q0 = [ ] then never timer else dq)
 ‖ [ ] → q′ = q ∧ msg = ∗ ∧ d′ = d

– attempt to dequeue from the in-queue :
dequeue iq = dequeue inqueue timer

– attempt to dequeue from the out-queue :
dequeue oq = dequeue outqueue timer

Description Remove a message from the queue, returning the new queue, and the message if there is one.

– attempt to route and then enqueue an outgoing message :
route and enqueue oq(rttab, ifds, oq, msg, oq′, es, arch) =
 case test outroute(msg, rttab, ifds, arch) of
   ∗ → F
 ‖ ↑(↑ e) → oq′ = oq ∧ es = ↑ e
 ‖ ↑ ∗ → ∃queued. enqueue oq(oq, msg, oq′, queued) ∧ es = if queued then ∗ else ↑ ENOBUFS

Description This is a relation because enqueue oq can non-deterministically decide that the oq is full.

– attempt to enqueue a list of messages :
enqueue list qinfo dq(q, (msg, queued) :: msgqs, q′) =
 (∃q0. enqueue dq(q, msg, q0, queued) ∧ enqueue list qinfo dq(q0, msgqs, q′)) ∧
enqueue list qinfo dq(q, [ ], q′) = (q′ = q)

– attempt to enqueue a list of messages, ignoring success flags :
enqueue list dq(q, msgs, q′, queued) =
 (∃msgqs. enqueue list qinfo dq(q, msgqs, q′) ∧
  msgs = map fst msgqs ∧
  queued = every(λx .
snd x = T)msgqs)

– attempt to enqueue a list of messages onto the out-queue :
enqueue oq list qinfo = enqueue list qinfo outqueue timer

– attempt to enqueue a list of messages onto the out-queue, ignoring success flags :
enqueue oq list = enqueue list outqueue timer

Description We sometimes need to enqueue multiple messages at a time. enqueue list qinfo tries to enqueue a list of messages, pairing each with its success boolean. Often, we don’t care too much about the precise queueing success of each message. enqueue list provides the AND of success of each message (though this is of limited use).

– should an incoming incomplete connection be accepted? :
accept incoming q0(lis : socket listen)(b : bool) =
 (b = length lis.q < backlog fudge lis.qlimit)

– should an incoming completed connection be accepted? :
accept incoming q(lis : socket listen)(b : bool) =
 (b = length lis.q < 3 ∗ backlog fudge lis.qlimit div 2)

– drop from incomplete-connection queue? :
drop from q0(lis : socket listen)(b : bool) =
 ((length lis.q0 ≥ TCP Q0MINLIMIT ∧ b = T) ∨
  (length lis.q0 < TCP Q0MAXLIMIT ∧ b = F))

Description A listening socket has two queues, the incomplete connections queue lis.q0 and the completed connections queue lis.q. An incoming incomplete (respectively, completed) connection may be accepted onto lis.q0 (respectively, lis.q) if the relevant queue is not full. Intriguingly, for FreeBSD 4.6-RELEASE, this specification is correct, but if syncaches were to be turned off, the condition in the q0 case would be length lis.q < 3 ∗ lis.qlimit/2 instead.

Existing incomplete connections may be dropped from lis.q0 to make room if its length is between its minimum and maximum limits.

12.8 TCP Options (TCP only)

TCP option handling.
12.8.1 Summary

do tcp options              Constrain the TCP timestamp option values that appear in an outgoing segment
calculate tcp options len   Calculate the length consumed by the TCP options in a real TCP segment

12.8.2 Rules

– Constrain the TCP timestamp option values that appear in an outgoing segment :
do tcp options cb tf doing tstmp cb ts recent cb ts val =
 if cb tf doing tstmp then
  let ts ecr′ = option case (ts seq 0w) I (timewindow val of cb ts recent) in
  ↑(cb ts val, ts ecr′)
 else
  ∗

– Calculate the length consumed by the TCP options in a real TCP segment :
calculate tcp options len cb tf doing tstmp =
 if cb tf doing tstmp then 12 else 0 : num

Description This calculation omits window-scaling and mss options as these only appear in SYN segments during connection setup. The total length consumed by all options will always be a multiple of 4 bytes due to padding. If more TCP options were added to the model, the space consumed by options would be architecture/options/alignment/padding dependent.

12.9 Buffers, windows, and queues (TCP and UDP)

Various functions that compute buffer sizes, window sizes, and remaining send queue space. Some of these computations are architecture-specific.

12.9.1 Summary

calculate buf sizes     Calculate buffer sizes for rcvbufsize, sndbufsize, t maxseg, and snd cwnd
calculate bsd rcv wnd   Calculation of rcv wnd
send queue space

12.9.2 Rules

– Calculate buffer sizes for rcvbufsize, sndbufsize, t maxseg, and snd cwnd :
calculate buf sizes cb t maxseg seg mss bw delay product for rt is local conn
  rcvbufsize sndbufsize cb tf doing tstmp arch =
 let t maxseg′ =
  (* TCPv2p901 claims min 32 for "sanity"; FreeBSD4.6 has 64 in tcp_mss(). BSD has the route MTU if avail, or min MSSDFLT(link MTU) otherwise, as the first argument of the MIN below. That is the same calculation as we did in connect 1.
We don’t repeat it, but use the cached value in cb.t maxseg. *)
  let maxseg = (min cb t maxseg(max 64(option case MSSDFLT I seg mss))) in
  if linux arch arch then
   maxseg
  else
   (* BSD subtracts the size consumed by options in the TCP header post connection establishment. The WinXP and Linux behaviour has not been fully tested but it appears Linux does not do this and WinXP does. *)
   maxseg − (calculate tcp options len cb tf doing tstmp)
 in
 (* round down to multiple of cluster size if larger (as BSD). From BSD code; assuming true for WinXP for now *)
 let t maxseg′′ =
  if linux arch arch then t maxseg′ (* from tests *)
  else rounddown MCLBYTES t maxseg′ in
 (* buffootle: rcv *)
 let rcvbufsize′ = option case rcvbufsize I bw delay product for rt in
 let (rcvbufsize′′, t maxseg′′′) =
  (if rcvbufsize′ < t maxseg′′ then (rcvbufsize′, rcvbufsize′)
   else (min SB MAX(roundup t maxseg′′ rcvbufsize′), t maxseg′′)) in
 (* buffootle: snd *)
 let sndbufsize′ = option case sndbufsize I bw delay product for rt in
 let sndbufsize′′ =
  (if sndbufsize′ < t maxseg′′′ then sndbufsize′
   else min SB MAX(roundup t maxseg′′ sndbufsize′)) in
 (* compute initial cwnd *)
 let snd cwnd = t maxseg′′′ ∗ (if is local conn then SS FLTSZ LOCAL else SS FLTSZ) in
 (rcvbufsize′′, sndbufsize′′, t maxseg′′′, snd cwnd)

Description Used in deliver in 1 and deliver in 2.

– Calculation of rcv wnd :
calculate bsd rcv wnd sf tcp sock =
 max(num(tcp sock.cb.rcv adv − tcp sock.cb.rcv nxt))
    (sf.n(SO RCVBUF) − length tcp sock.rcvq)

Description Calculation of rcv wnd as done in BSD’s tcp_input.c, line 1052. The model currently calls this from tcp output really in post-ESTABLISHED states, using deliver in 3 to update rcv wnd as soon as a segment comes, rather than waiting for the next deliver in, as BSD does; this is a saner thing to do. In order to comply with BSD however, we need calculate bsd rcv to be called on receipt of the first ’real’ (i.e.
non-syncache) segment, to update rcv wnd from the temporary initial value.

– :
send queue space(sndq max : num) sndq size oob arch maxseg i2 =
 {n | if bsd arch arch then
       n ≤ (sndq max − sndq size) + (if oob then oob extra sndbuf else 0)
      else if linux arch arch then
       (if in loopback i2 then
         n = maxseg + ((sndq max − sndq size) div 16816) ∗ maxseg
        else
         n = (2 ∗ maxseg) + ((sndq max − sndq size − 1890) div 1888) ∗ maxseg)
      else
       n ≥ 0}

Description Calculation of the usable send queue space. FreeBSD calculates send buffer space based on the byte-count size and max, and the number and max of mbufs. As we do not model mbuf usage precisely we are somewhat nondeterministic here. Linux calculates it based on the MSS: the space is some multiple of the MSS; the number of bytes for each MSS-sized segment is the MSS+overhead where overhead is 420+(20 if using IP), which is why the i2 argument is needed. Windows is very strange. Leaving it completely unconstrained is not what actually happens, but more investigation is needed in future to determine the actual behaviour.

12.10 Band limiting (TCP and UDP)

The rate of emission of certain TCP and ICMP responses from a host is often controlled by a bandwidth limiter. This limits resource usage in the event of some error conditions, and also defends against certain denial-of-service attacks.

Responses that may be bandlimited are grouped into categories (bandlim reason), and bandlimiting is applied to each category separately. Bandlimiting is applied across the entire host, not per socket or process.

There is a range of different schemes that may be used, from none at all, through limiting the number of packets in any given second, to a decaying average tuned to limit bursts and sustained throughput differently. We provide specifications for the first two.
12.10.1 Summary

bandlim state init       initial state of bandlimiter
bandlim rst ok always    the trivial ’always OK’ bandlimiter
simple limit             simple-bandlimiter rate settings
bandlim rst ok simple    a simple rate-limiting bandlimiter
bandlim rst ok           the bandlimiter actually used
enqueue oq bndlim rst    enqueue onto out-queue if allowed by bandlimiter

12.10.2 Rules

– initial state of bandlimiter :
bandlim state init = [ ] : bandlim state

– the trivial ’always OK’ bandlimiter :
(bandlim rst ok always : tcpSegment#ts seq#bandlim reason#bandlim state → bool#bandlim state)
 (seg, ticks, reason, bndlm) =
 let bndlm′ = (seg, ticks, reason) :: bndlm in
 (T, bndlm′)

– simple-bandlimiter rate settings :
(simple limit : bandlim reason → num option)
 BANDLIM UNLIMITED = ∗ ∧
simple limit BANDLIM RST CLOSEDPORT = ↑ 200 ∧
simple limit BANDLIM RST OPENPORT = ↑ 200

– a simple rate-limiting bandlimiter :
(bandlim rst ok simple : tcpSegment#ts seq#bandlim reason#bandlim state → bool#bandlim state)
 (seg, ticks, reason, bndlm) =
 let reasoneq = (λr0.λ(s, t, r). r = r0) and
     ticksgt = (λt0.λ(s, t, r). t > t0) in
 let count = length(filter(reasoneq reason)
              (TAKEWHILE(ticksgt(ticks − num floor(1 ∗ HZ))) bndlm)) in
 ((case simple limit reason of
     ∗ → T
   ‖ ↑ n → count < n),
  (seg, ticks, reason) :: bndlm)

Description Simple bandlimiter: limit number of ICMPs in the last second to the listed value. This is based roughly on the BSD behaviour, save that for BSD it is "since the last second" not "in the last second".

– the bandlimiter actually used :
bandlim rst ok = bandlim rst ok simple

Description Which band limiter to use?
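As an illustration of the simple rate-limiting rule above, here is a minimal Python sketch. It is not the specification itself: the underscored names, the string encoding of bandlim reason, and the value HZ = 100 (suggested by the 10ms kernel tick mentioned in Section 12.5) are assumptions, and wraparound of the tick counter is ignored.

```python
from itertools import takewhile

HZ = 100  # assumed ticks per second (10 ms kernel tick); not fixed by this section

# Per-reason limits, mirroring simple limit; None means unlimited.
SIMPLE_LIMIT = {
    "BANDLIM_UNLIMITED": None,
    "BANDLIM_RST_CLOSEDPORT": 200,
    "BANDLIM_RST_OPENPORT": 200,
}


def bandlim_rst_ok_simple(seg, ticks, reason, bndlm):
    """Return (ok, new_state) for a candidate response.

    bndlm is the history list, newest entry first, of (seg, ticks, reason).
    """
    # Entries from the last second: the longest prefix with t > ticks - HZ.
    recent = takewhile(lambda entry: entry[1] > ticks - HZ, bndlm)
    count = sum(1 for (_seg, _t, r) in recent if r == reason)
    limit = SIMPLE_LIMIT[reason]
    ok = True if limit is None else count < limit
    # As in the rule, the attempt is recorded whether or not it is allowed.
    return ok, [(seg, ticks, reason)] + bndlm


state = []  # bandlim state init
allowed = 0
for i in range(201):
    ok, state = bandlim_rst_ok_simple(f"rst{i}", 1000, "BANDLIM_RST_CLOSEDPORT", state)
    allowed += ok
print(allowed)  # number of responses allowed within one second
```

The history grows without bound in this sketch, as it does in the rule; a real implementation would prune entries older than one second.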
– enqueue onto out-queue if allowed by bandlimiter :
enqueue oq bndlim rst(oq, seg, ticks, reason, bndlm, oq′, bndlm′, queued or dropped) =
 let (emit, bndlm0) = bandlim rst ok(seg, ticks, reason, bndlm) in
 bndlm′ = bndlm0 ∧
 if emit then enqueue oq(oq, TCP seg, oq′, queued or dropped)
 else (oq′ = oq ∧ queued or dropped = T)

Description For convenience, combine enqueueing and bandlimiting into a single function.

12.11 UDP support (UDP only)

Performing a UDP send, filling in required details as necessary.

12.11.1 Summary

dosend   do a UDP send, filling in source address and port as necessary

12.11.2 Rules

– do a UDP send, filling in source address and port as necessary :
(dosend(ifds, rttab, (∗, data), (↑ i1, ↑ p1, ↑ i2, ps2), oq, oq′, ok) =
  enqueue oq(oq, UDP(〈[ is1 := ↑ i1; is2 := ↑ i2; ps1 := ↑ p1; ps2 := ps2; data := data ]〉), oq′, ok)) ∧
(dosend(ifds, rttab, (↑(i, p), data), (∗, ↑ p1, ∗, ∗), oq, oq′, ok) =
  (∃i′1. enqueue oq(oq, UDP(〈[ is1 := ↑ i′1; is2 := ↑ i; ps1 := ↑ p1; ps2 := ↑ p; data := data ]〉), oq′, ok) ∧
         i′1 ∈ auto outroute(i, ∗, rttab, ifds))) ∧
(dosend(ifds, rttab, (↑(i, p), data), (↑ i1, ↑ p1, is2, ps2), oq, oq′, ok) =
  enqueue oq(oq, UDP(〈[ is1 := ↑ i1; is2 := ↑ i; ps1 := ↑ p1; ps2 := ↑ p; data := data ]〉), oq′, ok))

Description For use in UDP sendto().

12.12 TCP timing and RTT (TCP only)

TCP performs repeated transmissions in three situations: retransmission of unacknowledged data, retransmission of an unacknowledged SYN, and probing a closed window (‘persisting’). In each case the interval between transmissions is a function of the estimated round-trip time for the connection, and is exponentially backed off if no response is received. The RTT estimate indicates when TCP should expect a reply, and the exponential backoff controls TCP’s resource usage.
12.12.1 Summary

tcp backoffs          select this architecture’s retransmit backoff list
tcp syn backoffs      select this architecture’s SYN-retransmit backoff list
mode of               obtain the mode of a backoff timer
shift of              obtain the shift of a backoff timer
computed rto          compute retransmit timeout to use
computed rxtcur       compute the last-used rxtcur
start tt rexmt gen    construct retransmit timer (generic)
start tt rexmt        construct normal retransmit timer
start tt rexmtsyn     construct SYN-retransmit timer
start tt persist      construct persist timer
update rtt            update RTT estimators from new measurement
expand cwnd           expand congestion window

12.12.2 Rules

– select this architecture’s retransmit backoff list :
tcp backoffs(arch : arch) =
 if bsd arch arch then TCP BSD BACKOFFS
 else if linux arch arch then TCP LINUX BACKOFFS
 else if windows arch arch then TCP WINXP BACKOFFS
 else TCP BSD BACKOFFS (* default to BSD *)

– select this architecture’s SYN-retransmit backoff list :
tcp syn backoffs(arch : arch) =
 if bsd arch arch then TCP SYN BSD BACKOFFS
 else if linux arch arch then TCP SYN LINUX BACKOFFS
 else if windows arch arch then TCP SYN WINXP BACKOFFS
 else TCP SYN BSD BACKOFFS (* default to BSD *)

– obtain the mode of a backoff timer :
(mode of : (rexmtmode#num)timed option → rexmtmode option)
 (↑(((mode, _))_{_})) = ↑ mode ∧
mode of ∗ = ∗

– obtain the shift of a backoff timer :
shift of(↑(((_, shift))_{_})) = shift

Description TCP exponential-backoff timers are represented as (rexmtmode#num)timed option, where mode : rexmtmode is the current TCP output mode (see rexmtmode (p55)), and shift : num is the 0-origin index into the backoff list of the interval currently underway.
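To make the role of the backoff list and the shift concrete, the following Python sketch follows the computed rto rule of this section: the timeout is the backoff-list entry at index shift, multiplied by a base time of SRTT + 4·RTTVAR clipped below at the minimum. The underscored names are illustrative, and the backoff list and numeric constants used in the example are assumptions, not values taken from the specification.

```python
from typing import Sequence


def computed_rto(backoffs: Sequence[int], shift: int,
                 t_rttmin: float, t_srtt: float, t_rttvar: float) -> float:
    """Backoff multiplier times the clipped base time (all times in seconds)."""
    return backoffs[shift] * max(t_rttmin, t_srtt + 4 * t_rttvar)


def clipped_rxtcur(backoffs: Sequence[int], shift: int, t_rttmin: float,
                   t_srtt: float, t_rttvar: float, rexmtmax: float) -> float:
    """Clip the computed timeout between the minimum and a finite maximum."""
    rto = computed_rto(backoffs, shift, t_rttmin, t_srtt, t_rttvar)
    return max(t_rttmin, min(rexmtmax, rto))


# Illustrative values only (assumed, not from the specification).
EXAMPLE_BACKOFFS = [1, 2, 4, 8, 16, 32, 64]
for shift in range(4):
    # base time = max(1.0, 0.5 + 4 * 0.25) = 1.5, doubled at each shift
    print(shift, computed_rto(EXAMPLE_BACKOFFS, shift, 1.0, 0.5, 0.25))
```

Passing the backoff list as a parameter mirrors the rule, which takes the list selected by tcp backoffs or tcp syn backoffs rather than fixing one.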
– compute retransmit timeout to use :
computed rto(backoffs : num list)(shift : num)(ri : rttinf) =
 real of num(EL shift backoffs) ∗ max ri.t rttmin(ri.t srtt + 4 ∗ ri.t rttvar)

– compute the last-used rxtcur :
computed rxtcur(ri : rttinf)(arch : arch) =
 max ri.t rttmin
  (min(the TCPTV REXMTMAX)
   (computed rto(if ri.t wassyn then tcp syn backoffs arch else tcp backoffs arch)
      ri.t lastshift ri))

Description computed rto computes the retransmit timeout to be used, from the backoff list, the shift, and the current RTT estimators. The base time is RTT + 4RTTVAR; this is clipped against a minimum value, and then multiplied by the value from the backoff list.

computed rxtcur is not used in constructing timers, but tcp output uses it to check if TCP has been idle for a while (causing slow start to be entered again). It is an approximation to the value actually used below. Note it might be possible to make this precise rather than an approximation; also, computed rxtcur and start tt rexmt gen could be merged.

Note: TCPTV REXMTMAX had better not be infinite!

– construct retransmit timer (generic) :
start tt rexmt gen(mode : rexmtmode)(backoffs : num list)(shift : num)(wantmin : bool)(ri : rttinf) =
 let rxtcur = max(if wantmin then max ri.t rttmin(ri.t lastrtt + 2/HZ)
                  else ri.t rttmin)
               (min(the TCPTV REXMTMAX (* better not be infinite! *))
                (computed rto backoffs shift ri))
 in ↑(((mode, shift))_{slow timer(time rxtcur)})

– construct normal retransmit timer :
start tt rexmt(arch : arch) = start tt rexmt gen Rexmt(tcp backoffs arch)

– construct SYN-retransmit timer :
start tt rexmtsyn(arch : arch) = start tt rexmt gen RexmtSyn(tcp syn backoffs arch)

– construct persist timer :
start tt persist(shift : num)(ri : rttinf)(arch : arch) =
 let cur = max(the TCPTV PERSMIN (* better not be infinite! *))
            (min(the TCPTV PERSMAX (* better not be infinite!
*))
             (computed rto(tcp backoffs arch) shift ri))
 in ↑(((Persist, shift))_{slow timer(time cur)})

Description Starting the retransmit, SYN-retransmit, and persist timers: these functions return the new timer with the given shift. This models both initialisation on receiving a segment, and update in the retransmit timer handler.

There are two alternative clipping values used for the minimum timer. ri.t rttmin is used always, but in one place ri.t lastrtt + 2/HZ (i.e., 0.02s plus the last measured RTT) is used as well. The BSD sources have a comment here saying "minimum feasible timer"; it is a puzzle why this value is not used elsewhere also. (tcp input.c:2408 vs tcp timer.c:394, tcp input.c:2542).

Starting the persist timer is similar to starting the retransmit timers, but the bounds are different.

Note that we don’t need to look at tf srtt valid, since in any case t srtt and t rttvar will have sensible values. That flag is just for the benefit of update rtt.

– update RTT estimators from new measurement :
update rtt(rtt : duration)(ri : rttinf) =
 let (t srtt′, t rttvar′) =
  (if ri.tf srtt valid then
    let delta = (rtt − 1/HZ) − ri.t srtt in
    let vardelta = abs delta − ri.t rttvar in
    let t srtt′ = max(1/(32 ∗ HZ))(ri.t srtt + (1/8) ∗ delta) and
        t rttvar′ = max(1/(16 ∗ HZ))(ri.t rttvar + (1/4) ∗ vardelta)
    (* BSD behaviour is never to let these go to zero, but clip at the least positive value. Since SRTT is measured in 1/32 tick and RTTVAR in 1/16 tick, these are the minimum values. A more natural implementation would clip these to zero.
*)
    in (t srtt′, t rttvar′)
   else
    let t srtt′ = rtt and
        t rttvar′ = rtt/2 in
    (t srtt′, t rttvar′))
 in
 ri 〈[ t rttupdated := ri.t rttupdated + 1;
       tf srtt valid := T;
       t srtt := t srtt′;
       t rttvar := t rttvar′;
       t lastrtt := rtt;
       t lastshift := 0;
       t wassyn := F (* if t lastshift=0, this doesn’t make a difference *)
       (* t softerror, t rttseg, and t rxtcur must be handled by the caller *)
     ]〉

Description Update the round trip time estimators on obtaining a new instantaneous value. Based on a close reading of tcp xmit timer(), tcp input.c:2347-2419.

– expand congestion window :
expand cwnd ssthresh maxseg maxwin cwnd =
 min maxwin(cwnd + (if cwnd > ssthresh then (maxseg ∗ maxseg) div cwnd else maxseg))

Description Congestion window expansion is linear or exponential depending on the current threshold ssthresh: above ssthresh the window grows by (maxseg ∗ maxseg) div cwnd per call (linear growth), and otherwise by a full maxseg per call (exponential growth). In both cases the result is capped at maxwin.

12.13 Path MTU Discovery (TCP only)

For efficiency and reliability, it is best to send datagrams that do not need to be fragmented in the network. However, TCP has direct access only to the maximum packet size (MTU) for the interfaces at either end of the connection – it has no information about routers and links in between.

To determine the MTU for the entire path, TCP marks all datagrams ‘do not fragment’. It begins by sending a large datagram; if it receives a ‘fragmentation needed’ ICMP in return it reduces the size of the datagram and repeats the process. Most modern routers include the link MTU in the ICMP message; if the message does not contain an MTU, however, TCP uses the next lower MTU in the table below.
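The fallback to the next lower plateau can be sketched in Python as follows. This is an illustration only: the plateau values are those listed in the mtu tab rule of this section, while the underscored names and the use of None for "no smaller plateau" are assumptions (the specification uses a choice operator, whose result is unspecified in that case).

```python
from typing import Iterable, Optional

# Plateau tables as given by the mtu tab rule (non-Linux and Linux variants).
BSD_MTU_TAB = {65535, 32000, 17914, 8166, 4352, 2002, 1492, 1006, 508, 296, 68}
LINUX_MTU_TAB = {32000, 17914, 8166, 4352, 2002, 1492, 576, 296, 216, 128, 68}


def next_smaller(xs: Iterable[int], y: int) -> Optional[int]:
    """Largest element of xs strictly below y, or None if there is none."""
    smaller = [x for x in xs if x < y]
    return max(smaller) if smaller else None


# A 'fragmentation needed' ICMP without an MTU arrives while sending 1500-byte
# datagrams: step down through the plateaus until one gets through.
mtu = 1500
path = []
while mtu is not None and len(path) < 3:
    mtu = next_smaller(BSD_MTU_TAB, mtu)
    path.append(mtu)
print(path)
```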
12.13.1 Summary

next smaller   find next-smaller element of a set
mtu tab        path MTU plateaus to try

12.13.2 Rules

– find next-smaller element of a set :
(next smaller : (num → bool) → num → num) xs y =
 @x :: xs. x < y ∧ ∀x′ :: xs. x′ > x ⟹ x′ ≥ y

– path MTU plateaus to try :
mtu tab arch =
 if linux arch arch then
  {32000; 17914; 8166; 4352; 2002; 1492; 576; 296; 216; 128; 68} : num set
 else
  {65535; 32000; 17914; 8166; 4352; 2002; 1492; 1006; 508; 296; 68}

Description MTUs to guess for path MTU discovery. This table is from RFC1191, and is the one that appears in BSD. On comp.protocols.tcp-ip, Sun, 15 Feb 2004 01:38:26 -0000, <102tj- cifv6vgm02@corp.supernews.com>, kml@bayarea.net (Kevin Lahey) suggests that this is out-of-date, and 2312 (WiFi 802.11), 9180 (common ATM), and 9000 (jumbo Ethernet) should be added. For some polemic discussion, see http://www.psc.edu/~mathis/MTU/.

RFC1191 says explicitly "We do not expect that the values in the table [...] are going to be valid forever. The values given here are an implementation suggestion, NOT a specification or requirement. Implementors should use up-to-date references to pick a set of plateaus [...]". BSD is therefore not compliant here.

Linux adds 576, 216, 128 and drops 1006. 576 is used in X.25 networks, and the source says 216 and 128 are needed for AMPRnet AX.25 paths. 1006 is used for SLIP, and was used on the ARPANET. Linux does not include the modern MTUs listed above.

12.14 Reassembly (TCP only)

TCP segments may arrive out-of-order, leaving holes in the data stream. They may also overlap, due to retransmission, confusion, or deliberate effort by an unusual TCP implementation. The TCP reassembly algorithm is responsible for retrieving the data stream from the segments that arrive (note this is not to be confused with IP fragmentation reassembly, which is beneath the scope of this specification).
There are various ways of resolving overlaps; in this specification we are completely nondeterministic, and allow any legal reassembly.

12.14.1 Summary

tcp reass         perform TCP segment reassembly
tcp reass prune   drop prefix of reassembly queue

12.14.2 Rules

– perform TCP segment reassembly :
tcp reass seq(rsegq : tcpReassSegment list) =
 let myrel = {(i, c) | ∃rseg. rseg ∈ rsegq ∧ i ≥ rseg.seq ∧
    i < rseg.seq + length rseg.data + (if rseg.spliced urp ≠ ∗ then 1 else 0) ∧
    (case rseg.spliced urp of
       ↑(n) → (if i > n then c = ↑(EL(num(i − rseg.seq − 1))(rseg.data))
               else if i = n then c = ∗
               else c = ↑(EL(num(i − rseg.seq))(rseg.data)))
     ‖ ∗ → c = ↑(EL(num(i − rseg.seq))(rseg.data)))}
 in
 {(cs′, len, FIN) | ∃cs. cs′ = CONCAT OPTIONAL cs ∧
    (∀n : num. n < length cs ⟹ (seq + n, EL n cs) ∈ myrel) ∧
    (¬∃c. (seq + length cs, c) ∈ myrel) ∧
    (len = length cs) ∧
    (FIN = ∃rseg. rseg ∈ rsegq ∧
       rseg.seq + length rseg.data + (if rseg.spliced urp ≠ ∗ then 1 else 0) = seq + length cs ∧
       rseg.FIN)}
 (* NB: the FIN may come from a 0-length segment, or from a different segment from that which the last character came but logically is always at the end of cs’s. *)

Description Returns the set of maximal-length strings starting at seq that can be constructed by taking bytes from the segments in rsegq, accounting for any spliced (out-of-line) urgent data.

– drop prefix of reassembly queue :
tcp reass prune seq(rsegq : tcpReassSegment list) =
 filter(λrseg. rseg.seq + length rseg.data
                + (if rseg.spliced urp ≠ ∗ then 1 else 0)
                + (if rseg.FIN then 1 else 0) > seq) rsegq

Description Prune away every segment ending before the specified seq, accounting for any spliced (out-of-line) urgent data.

12.15 The initial TCP control block (TCP only)

The initial state of the TCP control block.
12.15.1 Summary

initial cb

12.15.2 Rules

– :
initial cb =
 〈[ t segq := [ ];
    tt rexmt := ∗;
    tt keep := ∗;
    tt 2msl := ∗;
    tt delack := ∗;
    tt conn est := ∗;
    tt fin wait 2 := ∗;
    tf needfin := F;
    tf shouldacknow := F;
    snd una := tcp seq local 0w;
    snd max := tcp seq local 0w;
    snd nxt := tcp seq local 0w;
    snd wl1 := tcp seq foreign 0w;
    snd wl2 := tcp seq local 0w;
    iss := tcp seq local 0w;
    snd wnd := 0;
    snd cwnd := TCP MAXWIN ≪ TCP MAXWINSCALE;
    snd ssthresh := TCP MAXWIN ≪ TCP MAXWINSCALE;
    rcv wnd := 0;
    tf rxwin0sent := F;
    rcv nxt := tcp seq foreign 0w;
    rcv up := tcp seq foreign 0w;
    irs := tcp seq foreign 0w;
    rcv adv := tcp seq foreign 0w;
    snd recover := tcp seq local 0w;
    t maxseg := MSSDFLT;
    t advmss := ∗;
    t rttseg := ∗;
    t rttinf := 〈[
      t rttupdated := 0;
      tf srtt valid := F;
      t srtt := TCPTV RTOBASE;
      t rttvar := TCPTV RTTVARBASE;
      t rttmin := TCPTV MIN;
      t lastrtt := 0;
      t lastshift := 0;
      t wassyn := F (* if t lastshift=0, this doesn’t make a difference *)
    ]〉;
    t dupacks := 0;
    t idletime := stopwatch zero;
    t softerror := ∗;
    snd scale := 0;
    rcv scale := 0;
    request r scale := ∗; (* this like many other things is overwritten with the chosen value later - cf tcp newtcpcb() *)
    tf doing ws := F;
    ts recent := TimeWindowClosed;
    tf req tstmp := F; (* cf tcp newtcpcb() *)
    tf doing tstmp := F;
    last ack sent := tcp seq foreign 0w;
    bsd cantconnect := F;
    snd cwnd prev := 0;
    snd ssthresh prev := 0;
    t badrxtwin := TimeWindowClosed
    (* Note: everything should be listed here, leaving nothing as ARB. *)
    (* Many are always overwritten, however. *)
 ]〉

Chapter 13 Relational monad

The relational ‘monad’ is used to describe stateful computation in a convenient and compositional way.
13.1 Relational monad (TCP only) The implementation TCP input and output routines are imperative C code, with mutations of state variables and calls to various other routines, some of which send messages or have other observable effects. These are intertwined in a complex control flow. In the specification we have attempted, as much as possible, to adopt purely functional or relational styles. To deal with the observable side effects in the middle of (e.g.) tcp_output, however, we have had to identify some intermediate states. We introduce a relational monadic style to do so, using higher-order functions to hide the plumbing of state variables. The nondeterminism of our model adds another layer of complexity; instead of the usual functional monads, we use relational monads. An operation on the current state is modelled by a relation on the current and resulting states. A number of primitive operations are defined; these operations are then chained together by a binding combinator, which takes two relations and yields their composition. In this way arbitrarily complex operations on state may be defined in a modular manner, and the referential transparency of the logic is maintained. In the present application, the current state is a pair (sock : socket, bndlm : bandlim state) of the current socket and the state of the host’s band limiter. The resulting state is a quadruple ((sock ′ : socket, bndlm ′ : bandlim state, outsegs ′ : ′msg list), continue ′ : bool) of the final socket, band-limiter state, a list of segments to be output, and a flag. This flag models aborting: if it is set, operations should be chained together normally; if it is cleared, subsequent operations should not be performed, and instead the resulting state should be the final state of the entire composite operation of which this is a part. The binding combinator is andThen. Primitive operators include cont, which does nothing and continues, and stop, which does nothing and stops. 
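The chaining and aborting behaviour described above can be illustrated with a small executable sketch. It is not part of the specification: it assumes a relation can be represented as a Python function from a state (sock, bndlm) to the list of all results ((sock′, bndlm′, outsegs′), continue′) related to it, and the names and_then, emit_segs, and the placeholder socket and band-limiter values are hypothetical stand-ins chosen for illustration.

```python
from typing import Any, Callable, List, Tuple

# Assumed encoding: a relation is a function returning every related result.
State = Tuple[Any, Any]                      # (sock, bndlm)
Result = Tuple[Tuple[Any, Any, list], bool]  # ((sock', bndlm', outsegs'), continue')
Op = Callable[[State], List[Result]]


def cont(state: State) -> List[Result]:
    """Do nothing, and continue."""
    sock, bndlm = state
    return [((sock, bndlm, []), True)]


def stop(state: State) -> List[Result]:
    """Do nothing, and stop: later operations in the chain are skipped."""
    sock, bndlm = state
    return [((sock, bndlm, []), False)]


def emit_segs(segs: list) -> Op:
    """Illustrative primitive: output some segments and continue."""
    def op(state: State) -> List[Result]:
        sock, bndlm = state
        return [((sock, bndlm, list(segs)), True)]
    return op


def and_then(op1: Op, op2: Op) -> Op:
    """Sequence two operations, honouring the continue flag of the first."""
    def op(state: State) -> List[Result]:
        results: List[Result] = []
        for (s1, b1, out1), continue1 in op1(state):
            if not continue1:
                # Aborted: the result of op1 is the result of the whole chain.
                results.append(((s1, b1, out1), False))
                continue
            for (s2, b2, out2), continue2 in op2((s1, b1)):
                # Output lists are concatenated in order.
                results.append(((s2, b2, out1 + out2), continue2))
        return results
    return op


if __name__ == "__main__":
    start = ("sock0", "bndlm0")
    aborted = and_then(emit_segs(["a"]), and_then(stop, emit_segs(["b"])))
    completed = and_then(emit_segs(["a"]), and_then(cont, emit_segs(["b"])))
    print(aborted(start))
    print(completed(start))
```

In this encoding a nondeterministic operation simply returns more than one result, and an operation whose side conditions fail returns none, which mirrors the existential reading of the relations in the rules.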
Several other operations are defined to manipulate the state; the monadic glue is intended to abstract away from the implementation of that state as a pair of tuples. It should be a theorem that andThen is associative, that cont is its unit and stop its zero, and so on. Note that outsegs, the list of messages, is actually a list of arbitrary type; this enables us to lift the glue to the type msg#bool in deliver in 3, where we need the flag to deal with queueing failure. As throughout this specification, beware that the nondeterminism of, e.g., chooseM is modelled by an existential, and is thus "angelic" in some sense. This may or may not be what you expect.

13.1.1 Summary
andThen           normal sequencing
cont              do nothing, and continue (unit for andThen)
stop              do nothing, and stop (zero for andThen)
assert            assert truth of condition, and continue
assert failure    assertion violated; fail noisily
chooseM           choose a value from a set, nondeterministically
get sock          get current socket
get tcp sock      assert current socket is TCP, and get its protocol data
get cb            assert current socket is TCP, and get its control block
modify sock       apply function to current socket
modify tcp sock   apply function to current socket
modify cb         assert current socket is TCP, and apply function to its control block
emit segs         append segments to current output list
emit segs pred    append segments specified by a predicate (nondeterministic)
mliftc            lift a monadic operation not involving continue or bndlm
mliftc bndlm      lift a monadic operation not involving continue

13.1.2 Rules
– normal sequencing :
(op1 andThen op2) = λ(sock : socket, bndlm : bandlim state)((sock′ : socket, bndlm′ : bandlim state, outsegs′ : ′msg list), continue′ : bool). ∃sock1 bndlm1 outsegs1 continue1 sock2 bndlm2 outsegs2 continue2.
op1 (sock , bndlm)((sock1, bndlm1, outsegs1), continue1) ∧ if continue1 then op2 (sock1, bndlm1)((sock2, bndlm2, outsegs2), continue2) ∧ (sock ′ = sock2 ∧ bndlm ′ = bndlm2 ∧ outsegs ′ = outsegs1 @ outsegs2 ∧ continue ′ = continue2) else (sock ′ = sock1 ∧ bndlm ′ = bndlm1 ∧ outsegs ′ = outsegs1 ∧ continue ′ = F) – do nothing, and continue (unit for andThen) : cont = λ(sock : socket, bndlm : bandlim state)((sock ′ : socket, bndlm ′ : bandlim state, outsegs ′ : ′msg list), continue ′ : bool). (sock ′ = sock ∧ bndlm ′ = bndlm ∧ outsegs ′ = [ ] ∧ continue ′ = T) – do nothing, and stop (zero for andThen) : stop = λ(sock : socket, bndlm : bandlim state)((sock ′ : socket, bndlm ′ : bandlim state, outsegs ′ : ′msg list), continue ′ : bool). (sock ′ = sock ∧ bndlm ′ = bndlm ∧ outsegs ′ = [ ] ∧ continue ′ = F) – assert truth of condition, and continue : assert p = λ(sock : socket, bndlm : bandlim state)((sock ′ : socket, bndlm ′ : bandlim state, outsegs ′ : ′msg list), continue ′ : bool). (p ∧ sock ′ = sock ∧ bndlm ′ = bndlm ∧ outsegs ′ = [ ] ∧ continue ′ = T) – assertion violated; fail noisily : assert failure s = λ(sock : socket, bndlm : bandlim state)((sock ′ : socket, bndlm ′ : bandlim state, outsegs ′ : ′msg list), continue ′ : bool). ASSERTION FAILURE s – choose a value from a set, nondeterministically : chooseM s f = λ(sock : socket, bndlm : bandlim state)((sock ′ : socket, bndlm ′ : bandlim state, outsegs ′ : ′msg list), continue ′ : bool). choose x :: s.f x (sock , bndlm)((sock ′, bndlm ′, outsegs ′), continue ′) – get current socket : get sock f = λ(sock : socket, bndlm : bandlim state)((sock ′ : socket, bndlm ′ : bandlim state, outsegs ′ : ′msg list), continue ′ : bool). f sock(sock , bndlm)((sock ′, bndlm ′, outsegs ′), continue ′) – assert current socket is TCP, and get its protocol data : get tcp sock f = λ(sock : socket, bndlm : bandlim state)((sock ′ : socket, bndlm ′ : bandlim state, outsegs ′ : ′msg list), continue ′ : bool). ∃tcp sock . 
sock .pr = TCP PROTO(tcp sock) ∧ f tcp sock(sock , bndlm)((sock ′, bndlm ′, outsegs ′), continue ′) Rule version: $Id: TCP1 auxFnsScript.sml,v 1.219 2005/03/17 11:35:34 kw217 Exp $ mliftc 105 – assert current socket is TCP, and get its control block : get cb f = λ(sock : socket, bndlm : bandlim state)((sock ′ : socket, bndlm ′ : bandlim state, outsegs ′ : ′msg list), continue ′ : bool). ∃tcp sock . sock .pr = TCP PROTO(tcp sock) ∧ f tcp sock .cb(sock , bndlm)((sock ′, bndlm ′, outsegs ′), continue ′) – apply function to current socket : modify sock f = λ(sock : socket, bndlm : bandlim state)((sock ′ : socket, bndlm ′ : bandlim state, outsegs ′ : ′msg list), continue ′ : bool). (sock ′ = f sock ∧ bndlm ′ = bndlm ∧ outsegs ′ = [ ] ∧ continue ′ = T) – apply function to current socket : modify tcp sock f = λ(sock : socket, bndlm : bandlim state)((sock ′ : socket, bndlm ′ : bandlim state, outsegs ′ : ′msg list), continue ′ : bool). (∃tcp sock . sock .pr = TCP PROTO(tcp sock) ∧ sock ′ = sock 〈[ pr :=TCP PROTO(f tcp sock)]〉 ∧ bndlm ′ = bndlm ∧ outsegs ′ = [ ] ∧ continue ′ = T) – assert current socket is TCP, and apply function to its control block : modify cb f = λ(sock : socket, bndlm : bandlim state)((sock ′ : socket, bndlm ′ : bandlim state, outsegs ′ : ′msg list), continue ′ : bool). ∃tcp sock . sock .pr = TCP PROTO(tcp sock) ∧ (sock ′ = sock 〈[ pr :=TCP PROTO(tcp sock 〈[ cb :=(f tcp sock .cb)]〉)]〉 ∧ bndlm ′ = bndlm ∧ outsegs ′ = [ ] ∧ continue ′ = T) – append segments to current output list : emit segs segs = λ(sock : socket, bndlm : bandlim state)((sock ′ : socket, bndlm ′ : bandlim state, outsegs ′ : ′msg list), continue ′ : bool). (sock ′ = sock ∧ bndlm ′ = bndlm ∧ outsegs ′ = segs ∧ continue ′ = T) – append segments specified by a predicate (nondeterministic) : emit segs pred f = λ(sock : socket, bndlm : bandlim state)((sock ′ : socket, bndlm ′ : bandlim state, outsegs ′ : ′msg list), continue ′ : bool). 
(sock ′ = sock ∧ f bndlm bndlm ′ outsegs ′ ∧ continue ′ = T) – lift a monadic operation not involving continue or bndlm : mliftc f = λ(sock : socket, bndlm : bandlim state)((sock ′ : socket, bndlm ′ : bandlim state, outsegs ′ : ′msg list), continue ′ : bool). (f sock(sock ′, outsegs ′) ∧ bndlm ′ = bndlm ∧ continue ′ = T) – lift a monadic operation not involving continue : mliftc bndlm f = λ(sock : socket, bndlm : bandlim state)((sock ′ : socket, bndlm ′ : bandlim state, outsegs ′ : ′msg list), continue ′ : bool). (f (sock , bndlm)(sock ′, bndlm ′, outsegs ′) ∧ continue ′ = T) Rule version: $Id: TCP1 auxFnsScript.sml,v 1.219 2005/03/17 11:35:34 kw217 Exp $ Chapter 14 Auxiliary functions for TCP segment creation and drop We gather here all the general TCP segment generation and processing functions that are used in the host LTS. 14.1 SYN and RST Segment Creation (TCP only) Generating various simple segments (none of which contain any user data). 14.1.1 Summary make syn segment Make a SYN segment for emission by connect 1 etc make syn ack segment Make a SYN,ACK segment for emission by deliver in 1 , deliver in 2 , etc. make ack segment Make a plain boring ACK segment in response to a SYN,ACK segment bsd make phantom segment Make phantom (no flags) segment for BSD LISTEN bug make rst segment from cb Make a RST segment asynchronously, from socket informa- tion only make rst segment from seg Make a RST segment synchronously, in response to an in- coming segment 14.1.2 Rules – Make a SYN segment for emission by connect 1 etc : make syn segment cb(i1, i2, p1, p2)ts val seg ′ = (choose urp any ::UNIV . choose ack any ::UNIV . (* Determine window size; fail if out of range *) let win = n2w cb.rcv wnd in w2n win = cb.rcv wnd ∧ (* Choose a window scaling; fail if out of range *) (* Note there may be a better place for this assertion. 
*) let ws = option map CHR cb.request r scale in (is some cb.request r scale =⇒ ord(the ws) = the cb.request r scale) ∧ (case ws of ∗ → T ‖ ↑ n → ord n ≤ TCP MAXWINSCALE) ∧ 106 make syn ack segment 107 (* Determine maximum segment size; fail if out of range *) (* Put the MSS we initially advertise into t advmss *) let mss = (case cb.t advmss of ∗ → ∗ ‖ ↑ v → ↑(n2w v)) in (case cb.t advmss of ∗ → T ‖ ↑ v → v = w2n(the mss)) ∧ (* Do timestamping? *) let ts = do tcp options cb.tf req tstmp cb.ts recent ts val in seg ′ =〈[ is1 := ↑ i1; is2 := ↑ i2; ps1 := ↑ p1; ps2 := ↑ p2; seq := cb.iss; ack := ack any ; URG :=F; ACK :=F; PSH :=F; RST :=F; SYN :=T; FIN :=F; win :=win; ws :=ws; urp := urp any ; mss :=mss; ts := ts; data :=[ ] ]〉 ) – Make a SYN,ACK segment for emission by deliver in 1 , deliver in 2 , etc.: make syn ack segment cb(i1, i2, p1, p2)ts val ′ seg ′ = choose urp any ::UNIV . (* Determine window size; fail if out of range *) (* We don’t scale yet ( rcv scale ′). RFC1323 says: segments with SYN are not scaled, and BSD agrees. Even though we know what scaling the other end wants to use, and we know whether we are doing scaling, we can’t use it until we reach the ESTABLISHED state. *) let win = n2w cb.rcv wnd in (* rcv window − length data ′ *) w2n win = cb.rcv wnd ∧ (* If doing window scaling, set it; fail if out of range *) let ws = if cb.tf doing ws then ↑(CHR cb.rcv scale) else ∗ in (cb.tf doing ws =⇒ ord(the ws) = cb.rcv scale) ∧ (* Determine maximum segment size; fail if out of range *) (* Put the MSS we initially advertise into t advmss *) let mss = (case cb.t advmss of ∗ → ∗ ‖ ↑ v → ↑(n2w v)) in (case cb.t advmss of ∗ → T ‖ ↑ v → v = w2n(the mss)) ∧ Rule version: $Id: TCP1 auxFnsScript.sml,v 1.219 2005/03/17 11:35:34 kw217 Exp $ make ack segment 108 (* Set timestamping option? 
*)
let ts = do tcp options cb.tf doing tstmp cb.ts recent ts val′ in
seg′ = 〈[ is1 := ↑ i1; is2 := ↑ i2; ps1 := ↑ p1; ps2 := ↑ p2;
    seq := cb.iss; ack := cb.rcv nxt;
    URG := F; ACK := T; PSH := F; (* see below *)
    RST := F; SYN := T; FIN := F; (* Note: we are not modelling T/TCP *)
    win := win; ws := ws; urp := urp any; mss := mss; ts := ts;
    data := [ ] (* see below *)
]〉
(* No data can be sent here using the BSD sockets API, although TCP notionally allows it. Accordingly, the PSH flag is never set (under BSD, PSH is only set if we’re sending a non-zero amount of data (and emptying the send buffer); see tcp_output.c:626). *)

– Make a plain boring ACK segment in response to a SYN,ACK segment :
make ack segment cb FIN (i1, i2, p1, p2) ts val′ seg′ =
((* SB thinks these should be unconstrained. *)
choose urp garbage :: UNIV.
(* Determine window size; fail if out of range *)
(* Connection is now established so any scaling should be taken into account *)
(* Note it might be appropriate to clip the value to be in range rather than failing if out of range. *)
let win = n2w (cb.rcv wnd ≫ cb.rcv scale) in
w2n win = cb.rcv wnd ≫ cb.rcv scale ∧
(* Set timestamping option? *)
let ts = do tcp options cb.tf doing tstmp cb.ts recent ts val′ in
seg′ = 〈[ is1 := ↑ i1; is2 := ↑ i2; ps1 := ↑ p1; ps2 := ↑ p2;
    seq := if FIN then cb.snd una else cb.snd nxt;
    ack := cb.rcv nxt;
    URG := F; ACK := T; PSH := F; (* see comment for make syn ack segment *)
    RST := F; SYN := F; FIN := FIN;
    win := win; ws := ∗; urp := urp garbage; mss := ∗; ts := ts;
    data := [ ] (* Note that if there is data in sndq then it should always appear in a separate segment after the connection establishment handshake, but this needs to be verified. *)
]〉)

– Make phantom (no flags) segment for BSD LISTEN bug :
(* If a socket is changed to the LISTEN state, the rexmt timer may still be running.
If it fires, phantom segments are emitted. *) bsd make phantom segment cb(i1, i2, p1, p2)ts val ′ cantsndmore seg ′ = (choose urp garbage ::UNIV . (* Determine window size; fail if out of range *) (* Connection is now established so any scaling should be taken into account *) (* Note it might be appropriate to clip the value to be in range rather than failing if out of range. *) let win = n2w(cb.rcv wnd cb.rcv scale) in w2n win = cb.rcv wnd cb.rcv scale ∧ let FIN = (cantsndmore ∧ cb.snd una < (cb.snd max − 1)) in (* Set timestamping option? *) let ts = do tcp options cb.tf doing tstmp cb.ts recent ts val ′ in seg ′ =〈[ is1 := ↑ i1; is2 := ↑ i2; ps1 := ↑ p1; ps2 := ↑ p2; seq := if FIN then cb.snd una else cb.snd max ; (* no flags, no data, and no persist timer so use snd max *) ack := cb.rcv nxt ; (* yes, really, even though ¬ACK *) URG :=F; ACK :=F; PSH :=F; RST :=F; SYN :=F; FIN :=FIN ; win :=win; ws := ∗; urp := urp garbage; mss := ∗; ts := ts; data :=[ ] (* sndq always empty in this situation *) ]〉) – Make a RST segment asynchronously, from socket information only : make rst segment from cb cb(i1, i2, p1, p2)seg ′ = (* Deliberately unconstrained *) choose urp garbage ::UNIV . choose URG garbage ::UNIV . choose PSH garbage ::UNIV . choose win garbage ::UNIV . choose data garbage ::UNIV . choose FIN garbage ::UNIV . Rule version: $Id: TCP1 auxFnsScript.sml,v 1.219 2005/03/17 11:35:34 kw217 Exp $ make rst segment from seg 110 (* Note that BSD is perfectly capable of putting data in a RST segment; try filling the buffer and then doing a force close: the result is a segment with RST+PSH+data+win advertisement. Presumably URG is also possible. This is *not* the same as the RFC-suggested data carried by a RST; that would be an error message, this is just data from the buffer! 
*)
seg′ = 〈[ is1 := ↑ i1; ps1 := ↑ p1; is2 := ↑ i2; ps2 := ↑ p2;
    seq := cb.snd nxt; (* from RFC793p62 *)
    ack := cb.rcv nxt; (* seems the right thing to do *)
    URG := URG garbage; (* expect: F *)
    ACK := T; (* from TCPv1p248 *)
    PSH := PSH garbage; (* expect: F *)
    RST := T; SYN := F;
    FIN := FIN garbage; (* expect: F *)
    win := win garbage; (* expect: 0w *)
    ws := ∗;
    urp := urp garbage; (* expect: 0w *)
    mss := ∗;
    ts := ∗; (* RFC1323 S4.2 recommends no TS on RST, and BSD follows this *)
    data := data garbage (* expect: [ ] *)
]〉

– Make a RST segment synchronously, in response to an incoming segment :
make rst segment from seg seg seg′ =
(seg.RST = F ∧ (* Sanity check: never RST a RST *)
(∃ack′.
(* Deliberately unconstrained *)
choose urp garbage :: UNIV.
choose URG garbage :: UNIV.
choose PSH garbage :: UNIV.
choose win garbage :: UNIV.
choose data garbage :: UNIV.
choose FIN garbage :: UNIV.
(* RFC793 S3.4: only ack segments that don’t contain an ACK. SB believes this is equivalent to: only send a RST+ACK segment in response to a bad SYN segment *)
let ACK′ = ¬seg.ACK in
(* Sequence number is zero for RST+ACK segments, otherwise it is the next sequence number expected *)
let seq′ = if seg.ACK then tcp seq flip sense seg.ack else tcp seq local 0w in
(if ACK′ then
    (* RFC793 S3.4: for RST+ACK segments the ack value must be valid *)
    ack′ = tcp seq flip sense seg.seq + length seg.data + (if seg.SYN then 1 else 0)
 else
    (* otherwise it can be arbitrary, although it possibly should be zero *)
    ack′ ∈ {n | T}
) ∧
seg′ = 〈[ is1 := seg.is2; ps1 := seg.ps2; is2 := seg.is1; ps2 := seg.ps1;
    seq := seq′;
    ack := ack′;
    URG := URG garbage; (* expect: F *)
    ACK := ACK′;
    PSH := PSH garbage; (* expect: F *)
    RST := T; SYN := F;
    FIN := FIN garbage; (* expect: F *)
    win := win garbage; (* expect: 0w *)
    ws := ∗;
    urp := urp garbage; (* expect: 0w *)
    mss := ∗;
    ts := ∗; (* RFC1323 S4.2
recommends no TS on RST, and BSD follows this *) data := data garbage (* expect: [ ] *) ]〉 )) 14.2 General Segment Creation (TCP only) The TCP output routines. These, together with the input routines in deliver in 3 , form the heart of TCP. 14.2.1 Summary tcp output required determine whether TCP output is required tcp output really do TCP output tcp output perhaps combination of tcp output required and tcp output really 14.2.2 Rules – determine whether TCP output is required : tcp output required arch ifds0 sock = let tcp sock = tcp sock of sock in let cb = tcp sock .cb in (* Note this does not deal with TF_LASTIDLE and PRU_MORETOCOME *) let snd cwnd ′ = if ¬(cb.snd max = cb.snd una ∧ stopwatch val of cb.t idletime ≥ computed rxtcur cb.t rttinf arch) then (* inverted so this clause is tried first *) cb.snd cwnd else (* The connection is idle and has been for >= 1 RTO *) (* Reduce snd cwnd to commence slow start *) cb.t maxseg ∗ (if is localnet ifds0(the sock .is2) then SS FLTSZ LOCAL else SS FLTSZ) in (* Calculate the amount of unused send window *) let win =min cb.snd wnd snd cwnd ′ in let snd wnd unused = int of num win − (cb.snd nxt − cb.snd una) in (* Is it possible that a FIN may need to be sent? *) let fin required = (sock .cantsndmore ∧ tcp sock .st /∈ {FIN WAIT 2;TIME WAIT}) in (* Under BSD, we may need to send a FIN in state SYN SENT or SYN RECEIVED, so we may effectively still have a SYN on the send queue. *) Rule version: $Id: TCP1 auxFnsScript.sml,v 1.219 2005/03/17 11:35:34 kw217 Exp $ tcp output required 112 let syn not acked = (bsd arch arch ∧ tcp sock .st ∈ {SYN SENT;SYN RECEIVED}) in (* Is there data or a FIN to transmit? 
*)
let last sndq data seq = cb.snd una + length tcp sock.sndq in
let last sndq data and fin seq = last sndq data seq + (if fin required then 1 else 0) + (if syn not acked then 1 else 0) in
let have data to send = cb.snd nxt < last sndq data seq in
let have data or fin to send = cb.snd nxt < last sndq data and fin seq in
(* The amount by which the right edge of the advertised window could be moved *)
let window update delta =
    (int min (int of num (TCP MAXWIN ≪ cb.rcv scale))
             (int of num (sock.sf.n(SO RCVBUF)) − int of num (length tcp sock.rcvq)))
    − (cb.rcv adv − cb.rcv nxt) in
(* Send a window update? This occurs when (a) the advertised window can be increased by at least two maximum segment sizes, or (b) the advertised window can be increased by at least half the receive buffer size. See tcp_output.c:322ff. *)
let need to send a window update =
    (window update delta ≥ int of num (2 ∗ cb.t maxseg) ∨
     2 ∗ window update delta ≥ int of num (sock.sf.n(SO RCVBUF))) in
(* Note that silly window avoidance and max sndwnd need to be dealt with here; see tcp_output.c:309 *)
(* Can a segment be transmitted? *)
let do output = (
    (* Data to send and the send window has some space, or a FIN can be sent *)
    (have data or fin to send ∧ (have data to send ⟹ snd wnd unused > 0)) ∨ (* don’t need space if only sending FIN *)
    (* Can send a window update *)
    need to send a window update ∨
    (* There is outstanding urgent data to be transmitted *)
    is some tcp sock.sndurp ∨
    (* An ACK should be sent immediately (e.g.
in reply to a window probe) *)
    cb.tf shouldacknow ) in
let persist fun =
    let cant send = (¬do output ∧ tcp sock.sndq ≠ [ ] ∧ mode of cb.tt rexmt = ∗) in
    let window shrunk = (win = 0 ∧ snd wnd unused < 0 ∧ (* win = 0 if in SYN SENT, but still may send FIN *)
                         (bsd arch arch ⟹ tcp sock.st ≠ SYN SENT)) in
    if cant send then (* takes priority over window shrunk; note this needs to be checked *)
        (* Cannot transmit a segment despite a non-empty send queue and no running persist or retransmit timer. Must be the case that the receiver’s advertised window is now zero, so start the persist timer. Normal: tcp_output.c:378ff *)
        ↑λcb. cb 〈[ tt rexmt := start tt persist 0 cb.t rttinf arch ]〉
    else if window shrunk then
        (* The receiver’s advertised window is zero and the receiver has retracted window space that it had previously advertised. Reset snd nxt to snd una because the data from snd una to snd nxt has likely not been buffered by the receiver and should be retransmitted. Bizarrely (on FreeBSD 4.6-RELEASE), if the persist timer is running, reset its shift value *)
        (* Window shrunk: tcp_output.c:250ff *)
        ↑λcb. cb 〈[ tt rexmt := case cb.tt rexmt of
                        ↑(((Persist, shift))d) → ↑(((Persist, 0))d) ‖
                        _ → start tt persist 0 cb.t rttinf arch;
                    snd nxt := cb.snd una ]〉
    else
        (* Otherwise, leave the persist timer alone *)
        ∗ in
(do output, persist fun)

Description This function determines if it is currently necessary to emit a segment. It is not quite a predicate, because in certain circumstances the operation of testing may start or reset the persist timer, and alter snd nxt. Thus it returns a pair of a flag do output (with the obvious meaning), and an optional mutator function persist fun which, if present, performs the required updates on the TCP control block.
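The shape of this result, a flag paired with an optional mutator, can be sketched in a few lines of Python. This is a deliberately reduced illustration under stated assumptions, not the specification's definition: the control block is cut down to three hypothetical fields, sequence numbers are plain integers with no wraparound, and FINs, window updates, urgent data, tf shouldacknow and the BSD SYN SENT special case are all omitted.

```python
from dataclasses import dataclass, replace
from typing import Callable, Optional, Tuple


@dataclass(frozen=True)
class Cb:
    """Hypothetical, heavily reduced control block."""
    snd_una: int          # oldest unacknowledged sequence number
    snd_nxt: int          # next sequence number to send
    timer: Optional[str]  # None, "rexmt" or "persist" (stands in for tt rexmt)


Mutator = Callable[[Cb], Cb]


def output_required(cb: Cb, sndq_len: int, win: int) -> Tuple[bool, Optional[Mutator]]:
    """Return (do_output, persist_fun) for a simplified sender."""
    snd_wnd_unused = win - (cb.snd_nxt - cb.snd_una)
    have_data_to_send = cb.snd_nxt < cb.snd_una + sndq_len
    do_output = have_data_to_send and snd_wnd_unused > 0

    cant_send = (not do_output) and sndq_len > 0 and cb.timer is None
    window_shrunk = win == 0 and snd_wnd_unused < 0

    if cant_send:
        # Nothing can be sent and no timer is running: start persisting.
        return do_output, lambda c: replace(c, timer="persist")
    if window_shrunk:
        # The peer retracted window space: persist and rewind snd_nxt.
        return do_output, lambda c: replace(c, timer="persist", snd_nxt=c.snd_una)
    return do_output, None


def apply_persist(cb: Cb, persist_fun: Optional[Mutator]) -> Cb:
    """Apply the optional mutator, leaving cb unchanged when it is absent."""
    return cb if persist_fun is None else persist_fun(cb)


if __name__ == "__main__":
    cb = Cb(snd_una=100, snd_nxt=104, timer="rexmt")
    do_output, persist_fun = output_required(cb, sndq_len=10, win=0)
    print(do_output, apply_persist(cb, persist_fun))
```

Returning the mutator instead of applying it keeps the test itself free of side effects: the caller decides whether and when the control block is updated.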
– do TCP output :
tcp output really arch window probe ts val′ ifds0 sock (sock′, outsegs′) =
let tcp sock = tcp sock of sock in
let cb = tcp sock.cb in
(* Assert that the socket is fully bound and connected *)
sock.is1 ≠ ∗ ∧ sock.is2 ≠ ∗ ∧ sock.ps1 ≠ ∗ ∧ sock.ps2 ≠ ∗ ∧
(* Note this does not deal with TF_LASTIDLE and PRU_MORETOCOME *)
let snd cwnd′ =
    if ¬(cb.snd max = cb.snd una ∧ stopwatch val of cb.t idletime ≥ computed rxtcur cb.t rttinf arch)
    then (* inverted so this clause is tried first *)
        cb.snd cwnd
    else
        (* The connection is idle and has been for >= 1 RTO *)
        (* Reduce snd cwnd to commence slow start *)
        cb.t maxseg ∗ (if is localnet ifds0 (the sock.is2) then SS FLTSZ LOCAL else SS FLTSZ) in
(* Calculate the amount of unused send window *)
let win0 = min cb.snd wnd snd cwnd′ in
let win = (if window probe ∧ win0 = 0 then 1 else win0) in
let (snd wnd unused : int) = int of num win − (cb.snd nxt − cb.snd una) in
(* Is it possible that a FIN may need to be transmitted? *)
let fin required = (sock.cantsndmore ∧ tcp sock.st ∉ {FIN WAIT 2; TIME WAIT}) in
(* Calculate the sequence number after the last data byte in the send queue *)
let last sndq data seq = cb.snd una + length tcp sock.sndq in
(* The data to send in this segment (if any) *)
let data′ = DROP (num(cb.snd nxt − cb.snd una)) tcp sock.sndq in
let data to send = TAKE (min (clip int to num snd wnd unused) cb.t maxseg) data′ in
(* Should FIN be set in this segment? *)
let FIN = (fin required ∧ cb.snd nxt + length data to send ≥ last sndq data seq) in
(* Should ACK be set in this segment? Under BSD, it is not set if the socket is in SYN SENT and emitting a FIN segment due to shutdown() having been called.
*)
let ACK = if (bsd arch arch ∧ FIN ∧ tcp sock.st = SYN SENT) then F else T in
(* If this socket has previously sent a FIN which has not yet been acked, and snd nxt is past the FIN’s sequence number, then snd nxt should be set to the sequence number of the FIN flag, i.e. a retransmission. Check that snd una ≠ iss, since if they are equal no data has yet been sent over the socket *)
let snd nxt′ =
    if FIN ∧ (cb.snd nxt + length data to send = last sndq data seq + 1 ∧ cb.snd una ≠ cb.iss ∨ num(cb.snd nxt − cb.iss) = 2)
    then cb.snd nxt − 1
    else cb.snd nxt in
(* The BSD way: set PSH whenever sending the last byte of data in the send queue *)
let PSH = (data to send ≠ [ ] ∧ cb.snd nxt + length data to send = last sndq data seq) in
(* If sending urgent data, set the URG and urp fields appropriately *)
let (URG, urp) = (case tcp sock.sndurp of
    ∗ → (F, 0) ‖ (* No urgent data; don’t set *)
    ↑ sndurpn →
        let urpn = (cb.snd una + sndurpn) − cb.snd nxt + 1 in (* points one byte *past* the urgent byte *)
        if urpn < 1 then (F, 0) (* Urgent data out of range; don’t set *)
        else if urpn < 65536 then (T, num urpn) (* Urgent data in range; set *)
        else (* Urgent data in the very distant future; set *)
             (* Stevens’ suggestion; not sure if followed *)
             (T, 65535)) in
(* Calculate size of the receive window (based upon available buffer space) *)
let rcv wnd′′ = calculate bsd rcv wnd sock.sf tcp sock in
let rcv wnd′ = max (num(cb.rcv adv − cb.rcv nxt))
                   (min (TCP MAXWIN ≪ cb.rcv scale)
                        (if rcv wnd′′ < sock.sf.n(SO RCVBUF) div 4 ∧ rcv wnd′′ < cb.t maxseg
                         then 0 (* Silly window avoidance: shouldn’t advertise a tiny window *)
                         else rcv wnd′′)) in
(* Possibly set the segment’s timestamp option. Under BSD, we may need to send a FIN segment from SYN SENT, if the user called shutdown(), in which case the timestamp option hasn’t yet been negotiated, so we use tf req tstmp rather than tf doing tstmp.
*)
let want tstmp = if (bsd arch arch ∧ tcp sock.st = SYN SENT) then cb.tf req tstmp else cb.tf doing tstmp in
let ts = do tcp options want tstmp cb.ts recent ts val′ in
(* Advertise an appropriately scaled receive window *)
(* Assert the advertised window is within a sensible range *)
let win = n2w (rcv wnd′ ≫ cb.rcv scale) in
w2n win = rcv wnd′ ≫ cb.rcv scale ∧
(* Assert the urgent pointer is within a sensible range *)
let urp_ = n2w urp in
w2n urp_ = urp ∧
let seg = 〈[ is1 := sock.is1; is2 := sock.is2; ps1 := sock.ps1; ps2 := sock.ps2;
    seq := snd nxt′;
    ack := cb.rcv nxt;
    URG := URG; ACK := ACK; PSH := PSH; RST := F; SYN := F; FIN := FIN;
    win := win; ws := ∗; urp := urp_; mss := ∗; ts := ts;
    data := data to send
]〉 in
(* If emitting a FIN for the first time then change TCP state *)
let st′ = if FIN then
    case tcp sock.st of
        SYN SENT → tcp sock.st ‖ (* can’t move yet – wait until connection established (see deliver in 2/deliver in 3) *)
        SYN RECEIVED → tcp sock.st ‖ (* can’t move yet – wait until connection established (see deliver in 2/deliver in 3) *)
        ESTABLISHED → FIN WAIT 1 ‖
        CLOSE WAIT → LAST ACK ‖
        FIN WAIT 1 → tcp sock.st ‖ (* FIN retransmission *)
        FIN WAIT 2 → tcp sock.st ‖ (* can’t happen *)
        CLOSING → tcp sock.st ‖ (* FIN retransmission *)
        LAST ACK → tcp sock.st ‖ (* FIN retransmission *)
        TIME WAIT → tcp sock.st (* can’t happen *)
    else tcp sock.st in
(* Updated values to store in the control block after the segment is output *)
let snd nxt′′ = snd nxt′ + length data to send + (if FIN then 1 else 0) in
let snd max′ = max cb.snd max snd nxt′′ in
(* Following a tcp_output code walkthrough by SB: *)
let tt rexmt′ =
    if (mode of cb.tt rexmt = ∗ ∨ (mode of cb.tt rexmt = ↑(Persist) ∧ ¬window probe)) ∧ snd nxt′′ > cb.snd una then
    (* If the retransmit timer is not running, or the persist timer is running and this segment isn’t a window probe, and this segment
contains data or a FIN that occurs past snd una (i.e. new data), then start the retransmit timer. Note: if the persist timer is running it will be implicitly stopped *) start tt rexmt arch 0 F cb.t rttinf else if (window probe ∨ (is some tcp sock .sndurp)) ∧ win0 6= 0 ∧ mode of cb.tt rexmt = ↑(Persist) then (* If the segment is a window probe or urgent data is being sent, and in either case the send window is not closed, stop any running persist timer. Note: if window probe is T then a persist timer will always be running but this isn’t necessarily true when urgent data is being sent *) ∗ (* stop persisting *) else (* Otherwise, leave the timers alone *) cb.tt rexmt in (* Time this segment if it is sensible to do so, i.e. the following conditions hold : (a) a segment is not already being timed, and (b) data or a FIN are being sent, and (c) the segment being emitted is not a retransmit, and (d) the segment is not a window probe *) let t rttseg ′ = if IS NONE cb.t rttseg ∧ (data to send 6= [ ] ∨ FIN ) ∧ snd nxt ′′ > cb.snd max ∧ ¬window probe then ↑(ts val ′, snd nxt ′) else cb.t rttseg in Rule version: $Id: TCP1 auxFnsScript.sml,v 1.219 2005/03/17 11:35:34 kw217 Exp $ Segment Queueing (TCP only) 116 (* Update the socket *) sock ′ = sock 〈[ pr :=TCP PROTO(tcp sock 〈[ st := st ′; cb := tcp sock .cb 〈[ tt rexmt := tt rexmt ′; snd cwnd := snd cwnd ′; rcv wnd := rcv wnd ′; tf rxwin0sent :=(rcv wnd ′ = 0); tf shouldacknow :=F; t rttseg := t rttseg ′; snd max := snd max ′; snd nxt := snd nxt ′′; tt delack := ∗; last ack sent := cb.rcv nxt ; rcv adv := cb.rcv nxt + rcv wnd ′ ]〉]〉)]〉 ∧ (* Constrain the list of output segments to contain just the segment being emitted *) outsegs ′ = [TCP seg ] Description This function constructs the next segment to be output. It is usually called once tcp output required has returned true, but sometimes is called directly when we wish always to emit a segment. A large number of TCP state variables are modified also. 
Note that while constructing the segment a variety of errors such as ENOBUFS are possible, but this is not modelled here. Also, window shrinking is not dealt with properly here.

– combination of tcp output required and tcp output really :
tcp output perhaps arch ts val ifds0 sock (sock′, outsegs) =
let (do output, persist fun) = tcp output required arch ifds0 sock in
let sock′′ = option case sock
                 (λf. sock 〈[ pr := TCP PROTO(tcp sock of sock 〈[ cb :=ˆ f ]〉) ]〉)
                 persist fun in
if do output then
    tcp output really arch F ts val ifds0 sock′′ (sock′, outsegs)
else
    (sock′ = sock′′ ∧ outsegs = [ ])

14.3 Segment Queueing (TCP only)
Once a segment is generated for output, it must be enqueued for transmission. This enqueuing may fail. These functions model what happens in this case, and encapsulate the enqueuing-and-possibly-rolling-back process.

14.3.1 Summary
rollback tcp output                 Attempt to enqueue segments, reverting appropriate socket fields if the enqueue fails
enqueue or fail                     wrap rollback tcp output together with enqueue
enqueue or fail sock                version of enqueue or fail that works with sockets rather than cbs
enqueue and ignore fail             version of enqueue or fail that ignores errors and doesn’t touch the tcpcb
enqueue each and ignore fail        version of above that ignores errors and doesn’t touch the tcpcb
mlift tcp output perhaps or fail    do mliftc for function returning at most one segment and not dealing with queueing flag

14.3.2 Rules
– Attempt to enqueue segments, reverting appropriate socket fields if the enqueue fails :
rollback tcp output rcvdsyn seg arch rttab ifds is connect cb0 cb in (cb′, es′, outsegs′) =
(* NB: from cb0, only snd nxt, tt delack, last ack sent, rcv adv, tf rxwin0sent, t rttseg, snd max, tt rexmt are used. *)
(choose allocated :: (if INFINITE RESOURCES then {T} else {T; F}).
let route = test outroute(seg , rttab, ifds, arch) in let f0 = λcb.cb 〈[ (* revert to original values; on ip output failure *) snd nxt := cb0.snd nxt ; tt delack := cb0.tt delack ; last ack sent := cb0.last ack sent ; rcv adv := cb0.rcv adv ]〉 in let f1 = λcb.if ¬rcvdsyn then cb else cb 〈[ (* set soft error flag; on ip output routing failure *) t softerror := the route(* assumes route = SOME (SOME e) *) ]〉 in let f2 = λcb.cb 〈[ (* revert to original values; on early ENOBUFS *) tf rxwin0sent := cb0.tf rxwin0sent ; t rttseg := cb0.t rttseg ; snd max := cb0.snd max ; tt rexmt := cb0.tt rexmt ]〉 in let f3 = λcb.if is some cb.tt rexmt ∨ is connect then (* quench; on ENOBUFS *) cb else cb 〈[ (* maybe start rexmt and close down window *) tt rexmt := start tt rexmt arch 0 F cb.t rttinf ; snd cwnd := cb.t maxseg(* no LAN allowance, by design *) ]〉 in if ¬allocated then (* allocation failure *) cb′ = f3 (f2 (f0 cb in)) ∧ outsegs ′ = [ ] ∧ es ′ = ↑ ENOBUFS else if route = ∗ then (* ill-formed segment *) ASSERTION FAILURE“rollback tcp output:1”(* should never happen *) else if ∃e.route = ↑(↑ e) then (* routing failure *) cb′ = f1 (f0 cb in) ∧ outsegs ′ = [ ] ∧ es ′ = the route else if loopback on wire seg ifds then (* loopback not allowed on wire - RFC1122 *) (if windows arch arch then cb′ = cb in ∧ outsegs ′ = [ ] ∧ es ′ = ∗(* Windows silently drops segment! *) else if bsd arch arch then cb′ = f0 cb in ∧ outsegs ′ = [ ] ∧ es ′ = ↑ EADDRNOTAVAIL else if linux arch arch then cb′ = f0 cb in ∧ outsegs ′ = [ ] ∧ es ′ = ↑ EINVAL else ASSERTION FAILURE“rollback tcp output:2”(* never happen *) ) else (∃queued . 
Rule version: $Id: TCP1 auxFnsScript.sml,v 1.219 2005/03/17 11:35:34 kw217 Exp $ mlift tcp output perhaps or fail 118 outsegs ′ = [(seg , queued)] ∧ if ¬queued then (* queueing failure *) cb′ = f3 (f0 cb in) ∧ es ′ = ↑ ENOBUFS else (* success *) cb′ = cb in ∧ es ′ = ∗) ) – wrap rollback tcp output together with enqueue : enqueue or fail rcvdsyn arch rttab ifds outsegs oq cb0 cb in(cb′, oq ′) = (case outsegs of [ ]→ cb′ = cb0 ∧ oq ′ = oq ‖ [seg ]→ (∃outsegs ′ es ′. rollback tcp output rcvdsyn seg arch rttab ifds F cb0 cb in(cb′, es ′, outsegs ′) ∧ enqueue oq list qinfo(oq , outsegs ′, oq ′)) ‖ other84 → ASSERTION FAILURE“enqueue or fail”(* only 0 or 1 segments at a time *) ) – version of enqueue or fail that works with sockets rather than cbs : enqueue or fail sock rcvdsyn arch rttab ifds outsegs oq sock0 sock(sock ′, oq ′) = (* NB: could calculate rcvdsyn, but clearer to pass it in *) let tcp sock = tcp sock of sock in let tcp sock0 = tcp sock of sock0 in (∃cb′. enqueue or fail rcvdsyn arch rttab ifds outsegs oq(tcp sock of sock0 ).cb(tcp sock of sock).cb(cb′, oq ′) ∧ sock ′ = sock 〈[ pr :=TCP PROTO(tcp sock of sock 〈[ cb := cb′ ]〉)]〉) – version of enqueue or fail that ignores errors and doesn’t touch the tcpcb : enqueue and ignore fail arch rttab ifds outsegs oq oq ′ = ∃rcvdsyn cb0 cb in cb′. enqueue or fail rcvdsyn arch rttab ifds outsegs oq cb0 cb in(cb′, oq ′) – version of above that ignores errors and doesn’t touch the tcpcb : (enqueue each and ignore fail arch rttab ifds[ ]oq oq ′ = (oq = oq ′)) ∧ (enqueue each and ignore fail arch rttab ifds(seg :: segs)oq oq ′′ = ∃oq ′. enqueue and ignore fail arch rttab ifds[seg ]oq oq ′ ∧ enqueue each and ignore fail arch rttab ifds segs oq ′ oq ′′) – do mliftc for function returning at most one segment and not dealing with queueing flag : mlift tcp output perhaps or fail ts val arch rttab ifds0 = mliftc(λs(s ′, outsegs ′). ∃s1 segs. 
tcp output perhaps arch ts val ifds0 s(s1, segs) ∧ case segs of Rule version: $Id: TCP1 auxFnsScript.sml,v 1.219 2005/03/17 11:35:34 kw217 Exp $ Drop Segment Functions (TCP only) 119 [ ]→ s ′ = s1 ∧ outsegs ′ = [ ] ‖ [seg ]→ (∃cb′ es ′.(* ignore error return *) rollback tcp output T seg arch rttab ifds0 F (tcp sock of s).cb(tcp sock of s1).cb(cb′, es ′, outsegs ′) ∧ s ′ = s1 〈[ pr :=TCP PROTO(tcp sock of s1 〈[ cb := cb′]〉)]〉) ‖ other58 → ASSERTION FAILURE“mlift tcp output perhaps or fail”(* never happen *) ) 14.4 Incoming Segment Functions (TCP only) Updates performed to the idle, keepalive, and FIN_WAIT_2 timers for every incoming segment. 14.4.1 Summary update idle Do updates appropriate to receiving a new segment on a con- nection 14.4.2 Rules – Do updates appropriate to receiving a new segment on a connection : update idle tcp sock = let t idletime ′ = stopwatch zero in (* update ’time most recent packet received’ field *) let tt keep′ = (if ¬(tcp sock .st = SYN RECEIVED ∧ tcp sock .cb.tf needfin) then (* reset keepalive timer to 2 hours. *) ↑((())slow timer TCPTV KEEP IDLE) else tcp sock .cb.tt keep) in let tt fin wait 2 ′ = (if tcp sock .st = FIN WAIT 2 then ↑((())slow timer TCPTV MAXIDLE) else tcp sock .cb.tt fin wait 2 ) in (t idletime ′, tt keep′, tt fin wait 2 ′) 14.5 Drop Segment Functions (TCP only) When an erroneous or unexpected segment arrives, it is usually dropped (i.e, ignored). However, the peer is usually informed immediately by means of a RST or ACK segment. 14.5.1 Summary dropwithreset emit a RST segment corresponding to the passed segment, unless that would be stupid. 
mlift dropafterack or fail send immediate ACK to segment, but otherwise process it no further dropwithreset ignore fail do emit segs pred, for function returning at most one seg and not dealing with queueing flag Rule version: $Id: TCP1 auxFnsScript.sml,v 1.219 2005/03/17 11:35:34 kw217 Exp $ dropwithreset ignore fail 120 14.5.2 Rules – emit a RST segment corresponding to the passed segment, unless that would be stupid. : dropwithreset seg ifds0 ticks reason bndlm bndlm ′ outsegs = (* Needs list of the host’s interfaces, to verify that the incoming segment wasn’t broadcast. Returns a list of segments. *) if (* never RST a RST *) seg .RST ∨ (* is segment a (link-layer?) broadcast or multicast? *) F ∨ (* is source or destination broadcast or multicast? *) (∃i1.seg .is1 = ↑ i1 ∧ is broadormulticast ∅ i1) ∨ (∃i2.seg .is2 = ↑ i2 ∧ is broadormulticast ifds0 i2) (* BSD only checks incoming interface, but should have same effect as long as interfaces don’t overlap *) then outsegs = [ ] ∧ bndlm ′ = bndlm else (choose seg ′ :: make rst segment from seg seg . let (emit , bndlm ′′) = bandlim rst ok(seg ′, ticks, reason, bndlm) in (* finally: check if band-limited *) bndlm ′ = bndlm ′′ ∧ outsegs = if emit then [TCP seg ′] else [ ]) – send immediate ACK to segment, but otherwise process it no further : mlift dropafterack or fail seg arch rttab ifds ticks(sock , bndlm)((sock ′, bndlm ′, outsegs ′), continue) = (* ifds is just in case we need to send a RST, to make sure we don’t send it to a broadcast address. *) let tcp sock = tcp sock of sock in (continue = T ∧ let cb = tcp sock .cb in if tcp sock .st = SYN RECEIVED ∧ seg .ACK ∧ (let ack = tcp seq flip sense seg .ack in (ack < cb.snd una ∨ cb.snd max < ack)) then (* break loop in ”LAND” DoS attack, and also prevent ACK storm between two listening ports that have been sent forged SYN segments, each with the source address of the other. 
(tcp_input.c:2141) *) sock ′ = sock ∧ dropwithreset seg ifds ticks BANDLIM RST OPENPORT bndlm bndlm ′(map fst outsegs ′) (* ignore queue full error *) else (∃sock1 msg cb′ es ′.(* ignore errors *) let tcp sock1 = tcp sock of sock1 in tcp output really arch F ticks ifds sock(sock1, [msg ])∧ (* did set tf acknow and call tcp output perhaps, which seemed a bit silly *) (* notice we here bake in the assumption that the timestamps use the same counter as the band limiter; perhaps this is unwise *) rollback tcp output T msg arch rttab ifds F tcp sock .cb tcp sock1 .cb(cb′, es ′, outsegs ′) ∧ sock ′ = sock1 〈[ pr :=TCP PROTO(tcp sock1 〈[ cb := cb′]〉)]〉 ∧ bndlm ′ = bndlm)) – do emit segs pred, for function returning at most one seg and not dealing with queueing flag : dropwithreset ignore fail seg in arch ifds rttab ticks reason b b′(outsegs ′ : (msg#bool)list) = Rule version: $Id: TCP1 auxFnsScript.sml,v 1.219 2005/03/17 11:35:34 kw217 Exp $ tcp drop and close 121 (* No rollback necessary here. *) ∃segs. dropwithreset seg in ifds ticks reason b b′ segs ∧ case segs of [ ]→ outsegs ′ = [ ] ‖ [seg ]→ (choose allocated :: if INFINITE RESOURCES then {T} else {T;F}. if ¬allocated then outsegs ′ = [ ] else (case test outroute(seg , rttab, ifds, arch) of ∗ → ASSERTION FAILURE“dropwithreset ignore fail:1”(* never happen *) ‖ ↑(↑ e)→ outsegs ′ = [ ](* ignore error *) ‖ ↑ ∗ → ∃queued .outsegs ′ = [(seg , queued)])) ‖ other57 → ASSERTION FAILURE“dropwithreset ignore fail:2”(* never happen *) 14.6 Close Functions (TCP only) Closing a connection, updating the socket and TCP control block appropriately. 14.6.1 Summary tcp close close the socket and remove the TCPCB tcp drop and close drop TCP connection, reporting the specified error. 
If synchronised, send RST to peer

14.6.2 Rules

– close the socket and remove the TCPCB :
tcp_close arch sock = sock 〈[ cantrcvmore := T; (* MF doesn’t believe this is correct for Linux or WinXP *) cantsndmore := T; is1 := if bsd_arch arch then ∗ else sock.is1; ps1 := if bsd_arch arch then ∗ else sock.ps1; pr := TCP_PROTO(tcp_sock_of sock 〈[ st := CLOSED; cb := initial_cb (* in reality, it’s dropped entirely, but we don’t do that *) 〈[ bsd_cantconnect := if bsd_arch arch then T else F ]〉; sndq := [ ] ]〉) ]〉

Description This is similar to BSD’s tcp_close(), except that we do not actually remove the protocol/control blocks. The quad of the socket is cleared, to enable another socket to bind to the port we were previously using; this isn’t actually done by BSD, but the effect is the same. The bsd_cantconnect flag is set to indicate that the socket is in such a detached state.

– drop TCP connection, reporting the specified error. If synchronised, send RST to peer :
tcp_drop_and_close arch err sock (sock′, outsegs) = let tcp_sock = tcp_sock_of sock in ( (if tcp_sock.st ∉ {CLOSED; LISTEN; SYN_SENT} then (choose seg :: (make_rst_segment_from_cb tcp_sock.cb (the sock.is1, the sock.is2, the sock.ps1, the sock.ps2)). outsegs = [TCP seg]) else outsegs = [ ]) ∧ let es′ = if err = ↑ ETIMEDOUT then (if tcp_sock.cb.t_softerror ≠ ∗ then tcp_sock.cb.t_softerror else ↑ ETIMEDOUT) else if err ≠ ∗ then err else sock.es in sock′ = tcp_close arch (sock 〈[ es := es′ ]〉))

Description BSD calls this tcp_drop.

Part XIII TCP1 hostLTS

Chapter 15 Host LTS: Socket Calls

15.1 accept() (TCP only)

accept : fd → fd ∗ (ip ∗ port)

accept(fd) returns the next connection available on the completed connections queue for the listening TCP socket referenced by file descriptor fd.
The returned file descriptor fd refers to the newly-connected socket; the returned ip and port are its remote address. accept() blocks if the completed connections queue is empty and the socket does not have the O_NONBLOCK flag set.

Any pending errors on the new connection are ignored, except for ECONNABORTED, which causes accept() to fail with ECONNABORTED.

Calling accept() on a UDP socket fails: UDP is not a connection-oriented protocol.

15.1.1 Errors

A call to accept() can fail with the errors below, in which case the corresponding exception is raised:

EAGAIN The socket has the O_NONBLOCK flag set and no connections are available on the completed connections queue.
ECONNABORTED The connection at the head of the completed connections queue has been aborted; the socket has been shutdown for reading; or the socket has been closed.
EINVAL The socket is not accepting connections, i.e., it is not in the LISTEN state, or is a UDP socket.
EMFILE The maximum number of file descriptors allowed per process is already open for this process.
EOPNOTSUPP The socket type of the specified socket does not support accepting connections. This error is raised if accept() is called on a UDP socket.
ENFILE Out of resources.
ENOBUFS Out of resources.
ENOMEM Out of resources.
EINTR The system call was interrupted by a caught signal.
EBADF The file descriptor passed is not a valid file descriptor.
ENOTSOCK The file descriptor passed does not refer to a socket.
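As an illustration of the EAGAIN case on a real system (not part of the model), the following minimal Python sketch puts a listening socket into non-blocking mode and calls accept() while the completed connections queue is empty. The helper names `accept_or_errno` and `demo` are hypothetical, and the loopback address is an assumption made only so the example is self-contained.

```python
import errno
import socket


def accept_or_errno(listener: socket.socket):
    """Try one accept(); return (conn, peer) on success or the errno on would-block."""
    try:
        return listener.accept()
    except BlockingIOError as exc:
        # Python maps EAGAIN / EWOULDBLOCK from accept() to BlockingIOError.
        return exc.errno


def demo():
    """Listen on an OS-chosen loopback port and accept with nothing pending."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        listener.bind(("127.0.0.1", 0))  # port 0 asks the OS to pick a port
        listener.listen(1)
        listener.setblocking(False)      # non-blocking mode (O_NONBLOCK on Unix)
        return accept_or_errno(listener)
    finally:
        listener.close()
```

With no client connecting, `demo()` is expected to report the would-block error rather than wait, which corresponds to the fast-fail EAGAIN behaviour listed above.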
15.1.2 Common cases

accept() is called and immediately returns a connection: accept_1; return_1

accept() is called and blocks; a connection is completed and the call returns: accept_2; deliver_in_99; deliver_in_1; accept_1; return_1

15.1.3 API

Posix: int accept(int socket, struct sockaddr *restrict address, socklen_t *restrict address_len);
FreeBSD: int accept(int s, struct sockaddr *addr, socklen_t *addrlen);
Linux: int accept(int s, struct sockaddr *addr, socklen_t *addrlen);
WinXP: SOCKET accept(SOCKET s, struct sockaddr* addr, int* addrlen);

In the Posix interface:
• socket is the listening socket’s file descriptor, corresponding to the fd argument of the model;
• the returned int is either non-negative, i.e., a file descriptor referring to the newly-connected socket, or -1 to indicate an error, in which case the error code is in errno. On WinXP an error is indicated by a return value of INVALID_SOCKET, not -1, with the actual error code available through a call to WSAGetLastError().
• address is a pointer to a sockaddr structure of length address_len corresponding to the ip ∗ port returned by the model accept(). If address is not a null pointer then it stores the address of the peer for the accepted connection. For the model accept() it will actually be a sockaddr_in structure; the peer IP address will be stored in the sin_addr.s_addr field, and the peer port will be stored in the sin_port field. If address is a null pointer then the peer address is ignored, but the model accept() always returns the peer address. On input the address_len is the length of the address structure, and on output it is the length of the stored address.

15.1.4 Model details

If the accept() call blocks then state Accept2(sid) is entered, where sid is the index of the socket that accept() was called upon.
The following errors are not included in the model:
• EFAULT signifies that the pointers passed as either the address or address_len arguments were inaccessible. This is an artefact of the C interface to accept() that is excluded by the clean interface used in the model.
• EPERM is a Linux-specific error code described by the Linux man page as "Firewall rules forbid connection". This is outside the scope of what is modelled.
• EPROTO is a Linux-specific error code described by the man page as "Protocol error". Only TCP and UDP are modelled here; the only sockets that can exist in the model are bound to a known protocol.
• WSAECONNRESET is a WinXP-specific error code described in the MSDN page as "An incoming connection was indicated, but was subsequently terminated by the remote peer prior to accepting the call." This error has not been encountered in exhaustive testing.
• WSAEINPROGRESS is WinXP-specific and described in the MSDN page as "A blocking Windows Sockets 1.1 call is in progress, or the service provider is still processing a callback function". This is not modelled here.

From the Linux man page: Linux accept() passes already-pending network errors on the new socket as an error code from accept. This behaviour differs from other BSD socket implementations. For reliable operation the application should detect the network errors defined for the protocol after accept and treat them like EAGAIN by retrying. In case of TCP/IP these are ENETDOWN, EPROTO, ENOPROTOOPT, EHOSTDOWN, ENONET, EHOSTUNREACH, EOPNOTSUPP, and ENETUNREACH.

This is currently not modelled, but will be looked at when the Linux semantics are investigated.

Rule version: $Id: TCP1 hostLTSScript.sml,v 1.961 2005/03/18 10:34:36 kw217 Exp $

15.1.5 Summary

accept_1 tcp: rc Return new connection; either immediately or from a blocked state.
accept_2 tcp: block Block waiting for connection
accept_3 tcp: fast fail Fail with EAGAIN: no pending connections and non-blocking semantics set
accept_4 tcp: rc Fail with ECONNABORTED: the listening socket has cantsndmore set or has become CLOSED. Returns either immediately or from a blocked state.
accept_5 tcp: rc Fail with EINVAL: socket not in LISTEN state
accept_6 tcp: rc Fail with EMFILE: out of file descriptors
accept_7 udp: fast fail Fail with EOPNOTSUPP or EINVAL: accept() called on a UDP socket

15.1.6 Rules

accept_1 tcp: rc Return new connection; either immediately or from a blocked state.

h 〈[ts := ts ⊕ (tid ↦ (t)_d); fds := fds; files := files; socks := socks ⊕ [(sid, Sock(↑ fid, sf, is1, ↑ p1, ∗, ∗, es, cantsndmore, cantrcvmore, TCP_Sock(LISTEN, cb, ↑ lis, [ ], ∗, [ ], ∗, NO_OOBDATA))); (sid′, Sock(∗, sf′, ↑ i1′, ↑ p1, ↑ i2, ↑ p2, es′, cantsndmore′, cantrcvmore′, TCP_Sock(ESTABLISHED, cb′, ∗, sndq, sndurp, rcvq, rcvurp, iobc)))]]〉
−− lbl −→
h 〈[ts := ts ⊕ (tid ↦ (Ret(OK(fd′, (i2, p2))))_sched_timer); fds := fds′; files := files ⊕ [(fid′, File(FT_Socket(sid′), ff_default))]; socks := socks ⊕ [(sid, Sock(↑ fid, sf, is1, ↑ p1, ∗, ∗, es, cantsndmore, cantrcvmore, TCP_Sock(LISTEN, cb, ↑ lis′, [ ], ∗, [ ], ∗, NO_OOBDATA))); (sid′, Sock(↑ fid′, sf′, ↑ i1′, ↑ p1, ↑ i2, ↑ p2, es′, cantsndmore′, cantrcvmore′, TCP_Sock(ESTABLISHED, cb′, ∗, sndq, sndurp, rcvq, rcvurp, iobc)))]]〉

t = Run ∧ lbl = tid·(accept fd) ∧ rc = fast succeed ∧ fid = fds[fd] ∧ fd ∈ dom(fds) ∧ files[fid] = File(FT_Socket(sid), ff) ∨
t = Accept2(sid) ∧ lbl = τ ∧ rc = slow urgent succeed ∧
lis.q = q @ [sid′] ∧ lis′.q = q ∧ lis′.q0 = lis.q0 ∧ lis′.qlimit = lis.qlimit ∧ (sid ≠ sid′) ∧ es′ ≠ ↑ ECONNABORTED ∧ fid′ ∉ ((dom(files)) ∪ {fid}) ∧ nextfd h.arch fds fd′ ∧ fds′ = fds ⊕ (fd′, fid′) ∧ (∀i1. ↑ i1 = is1 ⟹ i1 = i1′)

Description This rule covers two cases: (1) the completed
connection queue is non-empty when accept(fd) is called from a thread tid in the Run state, where fd refers to a TCP socket sid, and (2) a previous call to accept(fd) on socket sid blocked, leaving its calling thread tid in state Accept2(sid), and a new connection has become available.

In either case the listening TCP socket sid has a connection sid′ at the head of its completed connections queue sid′ :: q. A socket entry for sid′ already exists in the host’s finite map of sockets, socks ⊕ .... The socket is ESTABLISHED, is not shutdown for reading, and is only missing a file description association that would make it accessible via the sockets interface.

A new file description record is created for connection sid′, indexed by a new fid′, and this is added to the host’s finite map of file descriptions files. It is assigned a default set of file flags, ff_default. The socket entry sid′ is completed with its file association ↑ fid′ and sid′ is removed from the head of the completed connections queue. When the listening socket sid is bound to a local IP address i1, the accepted socket sid′ is also bound to it.

Finally, the new file descriptor fd′ is created in an architecture-specific way using the auxiliary nextfd (p??), and an entry mapping fd′ to fid′ is added to the host’s finite map of file descriptors.

If the calling thread was previously blocked in state Accept2(sid) it proceeds via a τ transition, otherwise by a tid·(accept fd) transition. The thread is left in state Ret(OK(fd′, (i2, p2))) to return the file descriptor and remote address of the accepted connection in response to the original accept() call.

If the new socket sid′ has error ECONNABORTED pending in its error field es′, this is handled by rule accept_5. All other pending errors on sid′ are ignored, but left as the socket’s pending error.
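The bookkeeping performed by this rule can be sketched in Python as follows. This is an illustrative model only, not the HOL definition: the `Host` record, its field names, and `next_fd` are hypothetical simplifications, and `next_fd` assumes the "smallest unused descriptor" policy, whereas the specification leaves the choice to the architecture-specific nextfd. The sketch follows the formal side condition lis.q = q @ [sid′], so the accepted socket is taken from the end of the Python list.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    """Hypothetical, heavily simplified host state."""
    fds: dict = field(default_factory=dict)       # fd  -> fid
    files: dict = field(default_factory=dict)     # fid -> sid (or other file)
    sock_fid: dict = field(default_factory=dict)  # sid -> fid, or None if unattached
    sock_es: dict = field(default_factory=dict)   # sid -> pending error, or None
    peer: dict = field(default_factory=dict)      # sid -> (i2, p2)


def next_fd(fds):
    """Assumed stand-in for nextfd: the smallest unused non-negative descriptor."""
    fd = 0
    while fd in fds:
        fd += 1
    return fd


def accept_1(host, listen_sid, completed_q):
    """Apply the accept_1 update; return (fd', (i2, p2), remaining queue)."""
    if not completed_q:
        raise ValueError("accept_1 requires a non-empty completed queue")
    *rest, new_sid = completed_q                  # lis.q = q @ [sid']
    if new_sid == listen_sid or host.sock_es.get(new_sid) == "ECONNABORTED":
        raise ValueError("accept_1 side conditions not satisfied")
    new_fid = max(host.files, default=-1) + 1     # some fid not in dom(files)
    new_fd = next_fd(host.fds)
    host.files[new_fid] = new_sid                 # new file description for sid'
    host.sock_fid[new_sid] = new_fid              # sid' gains its file association
    host.fds[new_fd] = new_fid                    # fd' now maps to fid'
    return new_fd, host.peer[new_sid], rest
```

Any pending error other than ECONNABORTED is deliberately left untouched in `sock_es`, mirroring the last sentence of the description.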
accept_2 tcp: block Block waiting for connection

h 〈[ts := ts ⊕ (tid ↦ (Run)_d)]〉
−− tid·(accept fd) −→
h 〈[ts := ts ⊕ (tid ↦ (Accept2(sid))_never_timer)]〉

fd ∈ dom(h.fds) ∧ fid = h.fds[fd] ∧ h.files[fid] = File(FT_Socket(sid), ff) ∧ ff.b(O_NONBLOCK) = F ∧ (∃sf is1 p1 cb lis es. h.socks[sid] = Sock(↑ fid, sf, is1, ↑ p1, ∗, ∗, es, F, cantrcvmore, TCP_Sock(LISTEN, cb, ↑ lis, [ ], ∗, [ ], ∗, NO_OOBDATA)) ∧ lis.q = [ ])

Description A blocking accept() call is performed on socket sid when no completed incoming connections are available. The calling thread blocks until a new connection attempt completes successfully, the call is interrupted, or the process runs out of file descriptors.

From thread tid, which is initially in the Run state, accept(fd) is called where fd refers to listening TCP socket sid which is bound to local port p1, is not shutdown for reading and is in blocking mode: ff.b(O_NONBLOCK) = F. The socket’s queue of completed connections is empty, q := [ ], hence the accept() call blocks waiting for a successful new connection attempt, leaving the calling thread state Accept2(sid).

Socket sid might not be bound to a local IP address, i.e. is1 could be ∗. In this case the socket is listening for connection attempts on port p1 for all local IP addresses.

accept_3 tcp: fast fail Fail with EAGAIN: no pending connections and non-blocking semantics set

h 〈[ts := ts ⊕ (tid ↦ (Run)_d)]〉
−− tid·(accept fd) −→
h 〈[ts := ts ⊕ (tid ↦ (Ret(FAIL EAGAIN))_sched_timer)]〉

fd ∈ dom(h.fds) ∧ h.fds[fd] = fid ∧ h.files[fid] = File(FT_Socket(sid), ff) ∧ ff.b(O_NONBLOCK) = T ∧ (∃sf is1 p1 cb lis es.
h.socks[sid] = Sock(↑ fid, sf, is1, ↑ p1, ∗, ∗, es, cantsndmore, cantrcvmore, TCP_Sock(LISTEN, cb, ↑ lis, [ ], ∗, [ ], ∗, NO_OOBDATA)) ∧ lis.q = [ ])

Description A non-blocking accept() call is performed on socket sid when no completed incoming connections are available. Error EAGAIN is returned to the calling thread.

From thread tid, which is initially in the Run state, accept(fd) is called where fd refers to a listening TCP socket sid which is bound to local port p1, not shutdown for writing, and in non-blocking mode: ff.b(O_NONBLOCK) = T. The socket’s queue of completed connections is empty, q := [ ], hence the accept() call returns error EAGAIN, leaving the calling thread state Ret(FAIL EAGAIN) after a tid·accept(fd) transition.

Socket sid might not be bound to a local IP address, i.e. is1 could be ∗. In this case the socket is listening for connection attempts on port p1 for all local IP addresses.

accept_4 tcp: rc Fail with ECONNABORTED: the listening socket has cantsndmore set or has become CLOSED. Returns either immediately or from a blocked state.
h 〈[ts := ts ⊕ (tid ↦ (t)_d); socks := socks ⊕ [(sid, Sock(↑ fid, sf, is1, ↑ p1, ∗, ∗, es, cantsndmore, cantrcvmore, TCP_Sock(st, cb, ↑ lis, [ ], ∗, [ ], ∗, NO_OOBDATA)))]]〉
−− lbl −→
h 〈[ts := ts ⊕ (tid ↦ (Ret(FAIL ECONNABORTED))_sched_timer); socks := socks ⊕ [(sid, Sock(↑ fid, sf, is1, ↑ p1, ∗, ∗, es, cantsndmore, cantrcvmore, TCP_Sock(st, cb, ↑ lis, [ ], ∗, [ ], ∗, NO_OOBDATA)))]]〉

t = Run ∧ st = LISTEN ∧ cantsndmore = T ∧ lbl = tid·accept(fd) ∧ rc = fast fail ∧ fd ∈ dom(h.fds) ∧ fid = h.fds[fd] ∧ h.files[fid] = File(FT_Socket(sid), ff) ∨
t = Accept2(sid) ∧ ((cantrcvmore = T ∧ st = LISTEN) ∨ (st = CLOSED)) ∧ lbl = τ ∧ rc = slow urgent fail

Description This rule covers two cases: (1) an accept(fd) call is made on a listening TCP socket sid, referenced by fd, with cantsndmore set, and (2) a previous call to accept() on socket sid blocked, leaving a thread tid in state Accept2(sid), but the socket has since either entered the CLOSED state, or had cantrcvmore set. In both cases, ECONNABORTED is returned.

This situation will arise only when a thread calls close() on the listening socket while another thread is blocking on an accept() call, or if listen() was originally called on a socket which already had cantrcvmore set. The latter can occur in BSD, which allows listen() to be called in any (non CLOSED or LISTEN) state, though it should never happen under typical use.

If the calling thread was previously blocked in state Accept2(sid), it proceeds via a τ transition, otherwise by a tid·accept(fd) transition. The thread is left in state Ret(FAIL ECONNABORTED) to return the error ECONNABORTED in response to the initial accept() call.

Note that this rule is not correct when dealing with the FreeBSD behaviour which allows any socket to be placed in the LISTEN state.
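The two cases of this rule can be written out as a small guard function. The sketch below is illustrative only: the function name and the string encodings of thread states, socket states, and rule categories are hypothetical. It transcribes the side conditions as they appear in the formal rule, including the fact that the immediate case tests cantsndmore while the blocked case tests cantrcvmore.

```python
def accept_4_category(thread_state, st, cantsndmore, cantrcvmore):
    """Return the rule category if accept_4 is enabled, otherwise None."""
    if thread_state == "Run":
        # Case (1): call made on a listening socket with cantsndmore set.
        if st == "LISTEN" and cantsndmore:
            return "fast fail"
    elif thread_state == "Accept2":
        # Case (2): a blocked call whose socket was closed or had cantrcvmore set.
        if (st == "LISTEN" and cantrcvmore) or st == "CLOSED":
            return "slow urgent fail"
    return None
```

A result of None means only that this particular rule does not fire; other accept rules may still apply to the same host state.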
accept_5 tcp: rc Fail with EINVAL: socket not in LISTEN state

h 〈[ts := ts ⊕ (tid ↦ (t)_d)]〉
−− lbl −→
h 〈[ts := ts ⊕ (tid ↦ (Ret(FAIL EINVAL))_sched_timer)]〉

t = Run ∧ lbl = tid·accept(fd) ∧ rc = fast fail ∧ fd ∈ dom(h.fds) ∧ fid = h.fds[fd] ∧ h.files[fid] = File(FT_Socket(sid), ff) ∨
t = Accept2(sid) ∧ lbl = τ ∧ rc = slow urgent fail ∧
TCP_PROTO(tcp_sock) = (h.socks[sid]).pr ∧ tcp_sock.st ≠ LISTEN

Description It is not valid to call accept() on a socket that is not in the LISTEN state. This rule covers two cases: (1) on the non-listening TCP socket sid, accept() is called from a thread tid, which is in the Run state, and (2) a previous call to accept() on TCP socket sid blocked because no completed connections were available, leaving thread tid in state Accept2(sid), and after the accept() call blocked the socket changed to a state other than LISTEN.

In the first case the accept(fd) call on socket sid, referenced by file descriptor fd, proceeds by a tid·accept(fd) transition and in the latter by a τ transition. In either case, the thread is left in state Ret(FAIL EINVAL) to return error EINVAL to the caller.

The second case is subtle: a previous call to accept() may have blocked waiting for a new completed connection to arrive and an operation, such as a close() call, in another thread caused the socket to change from the LISTEN state.
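The subtle second case can be made concrete with a toy transition function. This is a sketch under simplifying assumptions: thread states are encoded as tuples, sockets are reduced to their TCP state, and the name `accept_5_step` and the label strings are hypothetical.

```python
def accept_5_step(threads, sock_state, tid, call_sid=None):
    """Fire accept_5 for thread tid if it is enabled; return the label or None.

    threads:    tid -> ("Run",) | ("Accept2", sid) | ("Ret", value)
    sock_state: sid -> TCP state name
    call_sid:   socket named by the fd when a Run thread makes the call
    """
    state = threads[tid]
    if state == ("Run",) and call_sid is not None:
        sid, label = call_sid, f"{tid}.accept"   # visible call transition
    elif state[0] == "Accept2":
        sid, label = state[1], "tau"             # internal transition
    else:
        return None
    if sock_state[sid] == "LISTEN":
        return None                              # rule not enabled
    threads[tid] = ("Ret", "FAIL EINVAL")
    return label
```

For example, a thread blocked in `("Accept2", 7)` makes no accept_5 step while socket 7 is in LISTEN, but once another thread has moved the socket out of LISTEN the blocked thread can take a tau step to `("Ret", "FAIL EINVAL")`.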
accept_6 tcp: rc Fail with EMFILE: out of file descriptors

h 〈[ts := ts ⊕ (tid ↦ (t)_d)]〉
−− lbl −→
h 〈[ts := ts ⊕ (tid ↦ (Ret(FAIL EMFILE))_sched_timer)]〉

t = Run ∧ lbl = tid·accept(fd) ∧ rc = fast fail ∧ fd ∈ dom(h.fds) ∧ fid = h.fds[fd] ∧ h.files[fid] = File(FT_Socket(sid), ff) ∧ sock = (h.socks[sid]) ∧ proto_of sock.pr = PROTO_TCP ∨
t = Accept2(sid) ∧ lbl = τ ∧ rc = slow nonurgent fail ∧
card(dom(h.fds)) ≥ OPEN_MAX

Description This rule covers two cases: (1) from thread tid, which is in the Run state, an accept(fd) call is made where fd refers to a TCP socket sid, and (2) a previous call to accept() blocked leaving thread tid in the Accept2(sid) state. In either case the accept() call fails with EMFILE as the process (see Model Details) already has open its maximum number of open file descriptors OPEN_MAX.

In the first case the error is returned immediately (fast fail) by performing a tid·accept(fd) transition, leaving the thread state Ret(FAIL EMFILE). In the second, the thread is unblocked, also leaving the thread state Ret(FAIL EMFILE), by performing a τ transition.

Model details In real systems, error EMFILE indicates that the calling process already has OPEN_MAX file descriptors open and is not permitted to open any more. This specification only models one single-process host with multiple threads, thus EMFILE is generated when the host exceeds the OPEN_MAX limit in this model.
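The descriptor-limit condition can be sketched as a guard over the host’s descriptor map. The value of `OPEN_MAX` below is a hypothetical small number chosen for illustration, and the function name and string encodings are likewise assumptions rather than part of the specification.

```python
OPEN_MAX = 4  # hypothetical limit, for illustration only


def accept_6_category(thread_state, fds, is_tcp=True, open_max=OPEN_MAX):
    """Return the rule category if accept_6 is enabled, otherwise None."""
    if len(fds) < open_max:           # card(dom(h.fds)) >= OPEN_MAX must hold
        return None
    if thread_state == "Run" and is_tcp:
        return "fast fail"            # error returned immediately
    if thread_state == "Accept2":
        return "slow nonurgent fail"  # blocked thread is woken with EMFILE
    return None
```

Because the model has a single process per host, the count is taken over all of the host’s descriptors, as the model details above explain.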
accept_7 udp: fast fail Fail with EOPNOTSUPP or EINVAL: accept() called on a UDP socket

h 〈[ts := ts ⊕ (tid ↦ (Run)_d)]〉
−− tid·accept(fd) −→
h 〈[ts := ts ⊕ (tid ↦ (Ret(FAIL err))_sched_timer)]〉

fd ∈ dom(h.fds) ∧ fid = h.fds[fd] ∧ h.files[fid] = File(FT_Socket(sid), ff) ∧ proto_of (h.socks[sid]).pr = PROTO_UDP ∧ (if bsd_arch h.arch then err = EINVAL else err = EOPNOTSUPP)

Description Calling accept() on a socket for a connectionless protocol (such as UDP) has no defined behaviour and is thus an invalid (EINVAL) or unsupported (EOPNOTSUPP) operation.

From thread tid, which is in the Run state, an accept(fd) call is made where fd refers to a UDP socket identified by sid. The call proceeds by a tid·accept(fd) transition leaving the thread state Ret(FAIL err) to return error err. On FreeBSD err is EINVAL; on all other systems the error is EOPNOTSUPP.

Variations
FreeBSD: FreeBSD returns error EINVAL if accept() is called on a UDP socket.

15.2 bind() (TCP and UDP)

bind : (fd ∗ ip option ∗ port option) → unit

bind(fd, is, ps) assigns a local address to the socket referenced by file descriptor fd. The local address, (is, ps), may consist of an IP address, a port or both an IP address and port.

If bind() is called without specifying a port, bind(_, _, ∗), the socket’s local port assignment is autobound, i.e. an unused port for the socket’s protocol in the host’s ephemeral port range is selected and assigned to the socket. Otherwise the port p specified in the bind call, bind(_, _, ↑ p), forms part of the socket’s local address.

On some architectures a range of port values are designated to be privileged, e.g. 0-1023 inclusive. If a call to bind() requests a port in this range and the caller does not have sufficient privileges the call will fail.

A bind() call may or may not specify the IP address.
If an IP address is not specified, bind(_, ∗, _), the socket’s local IP address is set to ∗ and it will receive segments or datagrams addressed to any of the host’s local IP addresses and port p. Otherwise, the caller specifies a local IP address, bind(_, ↑ i, _), the socket’s local IP address is set to ↑ i, and it only receives segments or datagrams addressed to IP address i and port p.

A call to bind() may be unsuccessful if the requested IP address or port is unavailable to bind to, although in certain situations this can be overridden by setting the socket option SO_REUSEADDR appropriately: see bound_port_allowed (p85).

A socket can only be bound once: it is not possible to rebind it to a different port later. A bind() call is not necessary for every socket: sockets may be autobound to an ephemeral port when a call requiring a port binding is made, e.g. connect().

15.2.1 Errors

A call to bind() can fail with the errors below, in which case the corresponding exception is raised:

EACCES The specified port is in the privileged port range of the host architecture and the current thread does not have the required privileges to bind to it.
EADDRINUSE The specified address is in use by or conflicts with the address of another socket using the same protocol. The error may occur in the following situations only:
• bind(_, _, ↑ p) will fail with EADDRINUSE if another socket is bound to port p. This error may be preventable by setting the SO_REUSEADDR socket option.
• bind(_, ↑ i, ↑ p) will fail with EADDRINUSE if another socket is bound to port p and IP address i, or is bound to port p and wildcard IP. This error will not occur if the SO_REUSEADDR option is correctly used to allow multiple sockets to be bound to the same local port.
This error is never returned from a call bind(_, _, ∗) that requests an autobound port.
EADDRNOTAVAIL The specified IP address cannot be bound as it is not local to the host.
EINVAL The socket is already bound to an address and the socket’s protocol does not support rebinding to a new address. Multiple calls to bind() are not permitted.
EISCONN The socket is connected and rebinding to a new local address is not permitted (TCP only).
ENOBUFS A port was not specified in the bind() call and autobinding failed because no ephemeral ports for the socket’s protocol are currently available. In addition, on WinXP the error can signal that the host has insufficient available buffers to complete the operation.
EBADF The file descriptor passed is not a valid file descriptor.
ENOTSOCK The file descriptor passed does not refer to a socket.

15.2.2 Common cases

A server application creates a TCP socket and binds it to its local address. It is then put in the LISTEN state to accept incoming connections to this address: socket_1; return_1; bind_1; return_1; listen_1

A UDP socket is created and bound to its local address. recv() is called and the socket blocks, waiting to receive datagrams sent to the local address: socket_1; return_1; bind_1; return_1; recv_12

15.2.3 API

Posix: int bind(int socket, const struct sockaddr *address, socklen_t address_len);
FreeBSD: int bind(int s, struct sockaddr *addr, socklen_t addrlen);
Linux: int bind(int sockfd, struct sockaddr *addr, socklen_t addrlen);
WinXP: int bind(SOCKET s, const struct sockaddr* name, int namelen);

In the Posix interface:
• socket is the socket’s file descriptor, corresponding to the fd argument of the model.
• address is a pointer to a sockaddr structure of size address_len containing the local IP address and port to be assigned to the socket, corresponding to the is and ps arguments of the model. For the AF_INET sockets used in the model, a sockaddr_in structure stores the address.
The sin_addr.s_addr field holds the IP address; if it is set to 0 then the IP address is wildcarded: is = ∗. The sin_port field stores the port to bind to; if it is set to 0 then the port is wildcarded: ps = ∗. On WinXP a wildcard IP is specified by the constant INADDR_ANY, not 0.
• the returned int is either 0 to indicate success or -1 to indicate an error, in which case the error code is in errno. On WinXP an error is indicated by a return value of SOCKET_ERROR, not -1, with the actual error code available through a call to WSAGetLastError().

The FreeBSD, Linux and WinXP interfaces are similar modulo some argument renaming, except where noted above.

On Windows Socket 2 the name parameter is not necessarily interpreted as a pointer to a sockaddr structure but is cast this way for compatibility with Windows Socket 1.1 and the BSD sockets interface. The service provider implementing the functionality can choose to interpret the pointer as a pointer to any block of memory provided that the first two bytes of the block start with the address family used to create the socket. The default WinXP internet family provider expects a sockaddr structure here. This change is purely an interface design choice that ultimately achieves the same functionality of providing a name for the socket and is not modelled.

15.2.4 Model details

The specification only models the AF_INET and PF_INET address families; thus the address family field of the struct sockaddr argument to bind() and those errors specific to other address families, e.g. UNIX domain sockets, are not modelled here.

In the Posix specification, ENOBUFS may have the additional meaning of "Insufficient resources were available to complete the call". This is more general than the use of ENOBUFS in the model.

The following errors are not modelled:
• EAGAIN is BSD-specific and described in the man page as: "Kernel resources to complete the request are temporarily unavailable". This is not modelled here.
• WSAEINPROGRESS is WinXP-specific and described in the MSDN page as "A blocking Windows Sockets 1.1 call is in progress, or the service provider is still processing a callback function". This is not modelled here.
• EFAULT signifies that the pointers passed as either the address or address_len arguments were inaccessible. This is an artefact of the C interface to bind() that is excluded by the clean interface used in the model. On WinXP, the equivalent error WSAEFAULT in addition signifies that the address format used in name may be incorrect or that the address family in name does not match that of the socket.
• ENOTDIR, ENAMETOOLONG, ENOENT, ELOOP, EIO (BSD-only), EROFS, EISDIR (BSD-only), ENOMEM, EAFNOSUPPORT (Posix-only) and EOPNOTSUPP (Posix-only) are errors specific to other address families and are not modelled here. None apply to WinXP as other address families are not available by default.

15.2.5 Summary

bind_1  all: fast succeed  Successfully assign a local address to a socket (possibly by autobinding the port)
bind_2  all: fast fail     Fail with EADDRINUSE: the specified address is already in use
bind_3  all: fast fail     Fail with EADDRNOTAVAIL: the specified IP address is not available on the host
bind_5  all: fast fail     Fail with EINVAL: the socket is already bound to an address and does not support rebinding; or, on FreeBSD, the socket has been shut down for writing
bind_7  all: fast fail     Fail with EACCES: the specified port is privileged and the current process does not have permission to bind to it
bind_9  all: fast badfail  Fail with ENOBUFS: no ephemeral ports free for autobinding or, on WinXP only, insufficient buffers available.
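The summary can be read as a set of side conditions rather than an ordered decision procedure: more than one failure rule may be enabled for the same call. The following Python sketch illustrates this under stated assumptions. The record fields, the `port_allowed` and `free_ephemeral` flags (stand-ins for bound_port_allowed and a non-empty autobind result), and the privileged range 0 to 1023 are hypothetical simplifications, and the FreeBSD and WinXP variations are deliberately left out.

```python
from dataclasses import dataclass
from typing import Optional, Set

# Assumption: the conventional Unix privileged range; the model's
# privileged_ports set is defined elsewhere in the specification.
PRIVILEGED_PORTS = range(0, 1024)

@dataclass
class BindRequest:
    ip: Optional[str]          # None models is1 = *
    port: Optional[int]        # None models ps1 = *
    sock_port: Optional[int]   # socket's current local port, None if unbound
    local_ips: Set[str]        # stand-in for local_ips(h.ifds)
    privs: bool                # stand-in for h.privs
    port_allowed: bool         # stand-in for bound_port_allowed(...)
    free_ephemeral: bool       # True if autobind would return a non-empty set

def applicable_bind_rules(r: BindRequest) -> set:
    """Return the names of every bind rule whose (simplified) conditions hold."""
    rules = set()
    ip_ok = r.ip is None or r.ip in r.local_ips
    privileged = r.port is not None and r.port in PRIVILEGED_PORTS
    if not ip_ok:
        rules.add("bind_3")    # EADDRNOTAVAIL
    if r.sock_port is not None:
        rules.add("bind_5")    # EINVAL
    if privileged and not r.privs:
        rules.add("bind_7")    # EACCES
    if r.port is not None and ip_ok and not r.port_allowed:
        rules.add("bind_2")    # EADDRINUSE
    if r.port is None and not r.free_ephemeral:
        rules.add("bind_9")    # ENOBUFS
    if (r.sock_port is None and ip_ok and r.port_allowed
            and (r.privs or not privileged)
            and (r.port is not None or r.free_ephemeral)):
        rules.add("bind_1")    # success
    return rules
```

Returning a set rather than a single rule mirrors the non-determinism of the labelled transition system: an already-bound socket asked to bind to a privileged port without privileges enables both bind_5 and bind_7 in this sketch.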
15.2.6 Rules

bind_1 all: fast succeed
Successfully assign a local address to a socket (possibly by autobinding the port)

  h0 -- tid·bind(fd, is1, ps1) --> h

  h0 = h′ 〈[ts := ts ⊕ (tid ↦ (Run)_d);
            socks := socks ⊕ [(sid, Sock(↑fid, sf, ∗, ∗, ∗, ∗, es, cantsndmore, cantrcvmore, pr))]]〉 ∧
  h = h′ 〈[ts := ts ⊕ (tid ↦ (Ret(OK()))_{sched_timer});
           socks := socks ⊕ [(sid, Sock(↑fid, sf, is1, ↑p1, ∗, ∗, es, cantsndmore, cantrcvmore, pr))];
           bound := bound]〉 ∧
  fd ∈ dom(h0.fds) ∧
  fid = h0.fds[fd] ∧
  h0.files[fid] = File(FT_Socket(sid), ff) ∧
  sid ∉ dom(socks) ∧
  (∀i1. is1 = ↑i1 ⟹ i1 ∈ local_ips(h0.ifds)) ∧
  p1 ∈ autobind(ps1, (proto_of pr), socks) ∧
  bound = sid :: h0.bound ∧
  (h0.privs ∨ p1 ∉ privileged_ports) ∧
  bound_port_allowed pr (h0.socks \\ sid) sf h0.arch is1 p1 ∧
  (case pr of
     TCP_PROTO(tcp_sock) → tcp_sock = TCP_Sock0(CLOSED, cb, ∗, [ ], ∗, [ ], ∗, NO_OOBDATA) ∧
                           (bsd_arch h0.arch ⟹ cantsndmore = F ∧ cb.bsd_cantconnect = F)
   ‖ UDP_PROTO(udp_sock) → udp_sock = UDP_Sock0([ ]))

Description

The call bind(fd, is1, ps1) is performed on the TCP or UDP socket sid referenced by file descriptor fd from a thread tid in the Run state. The socket sid is currently uninitialised, i.e. it has no local or remote address defined (∗, ∗, ∗, ∗), and it contains an uninitialised TCP or UDP protocol block, tcp_sock or udp_sock as appropriate for the socket’s protocol.

If an IP address is specified in the bind() call, i.e. is1 = ↑i1, the call can only succeed if the IP address i1 is one of those belonging to an interface of the host, i1 ∈ local_ips(h0.ifds).

The port p1 that the socket will be bound to is determined by the auxiliary function autobind, which takes as argument the port option ps1 from the bind() call. If ps1 = ↑p, autobind simply returns the singleton set {p}, constraining the local port binding p1 by p1 = p. Otherwise, autobind returns a set of available ephemeral ports and p1 is constrained to be a port within the set.
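The two cases of autobind described above can be illustrated with a small Python sketch. It is a simplification, not the specification’s definition: the real function takes the port option, the protocol and the host’s socket map, whereas here a hypothetical `used_ports` set stands in for the ports already taken by sockets of the same protocol, and the default ephemeral range is an assumed placeholder.

```python
from typing import Optional, Iterable, Set

def autobind(ps: Optional[int],
             used_ports: Set[int],
             ephemeral: Iterable[int] = range(1024, 5000)) -> Set[int]:
    """Return the set of ports the bind may choose from.

    ps is None for a wildcard port (ps1 = *), otherwise the requested port.
    The ephemeral range is an illustrative assumption, not the model's value.
    """
    if ps is not None:
        # A specified port constrains the choice to exactly that port.
        return {ps}
    # Otherwise any ephemeral port not already in use may be chosen.
    return {p for p in ephemeral if p not in used_ports}
```

An empty result in the wildcard case corresponds to the situation in which bind_9 fails with ENOBUFS.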
If a port is specified in the bind() call, i.e. ps1 = ↑p1, either the port is not a privileged port, p1 ∉ privileged_ports, or the host (actually, process) must have sufficient privileges, h0.privs = T.

Not all requested bindings are permissible because other sockets in the system may be bound to the chosen address or to a conflicting address. To check that the binding (is1, ↑p1) is permitted, the auxiliary function bound_port_allowed is used. bound_port_allowed is architecture dependent and checks not only the other sockets bound locally to port p1 on the host, but also the status of the socket flag SO_REUSEADDR for socket sid and the conflicting sockets. The use of the socket flag SO_REUSEADDR can permit sockets to share bindings under some circumstances, resolving the binding conflict. See bound_port_allowed (p85) for further information.

The call proceeds by performing a tid·bind(fd, is1, ps1) transition returning OK() to the calling thread. Socket sid is bound to local address (is1, ↑p1) and the host has an updated list of bound sockets, bound, with socket sid at its head.

Model details

The list of bound sockets, bound, is used by the model to record the order in which sockets are bound. This is required to model ICMP message and UDP datagram delivery on Linux.

Variations

FreeBSD: If sid is a TCP socket then it cannot be shut down for writing: cantsndmore = F, and its bsd_cantconnect flag cannot be set.
bind_2 all: fast fail
Fail with EADDRINUSE: the specified address is already in use

  h 〈[ts := ts ⊕ (tid ↦ (Run)_d)]〉
  -- tid·bind(fd, is1, ↑p1) -->
  h 〈[ts := ts ⊕ (tid ↦ (Ret(FAIL EADDRINUSE))_{sched_timer})]〉

  fd ∈ dom(h.fds) ∧
  fid = h.fds[fd] ∧
  h.files[fid] = File(FT_Socket(sid), ff) ∧
  sock = (h.socks[sid]) ∧
  ¬(bound_port_allowed sock.pr (h.socks \\ sid) sock.sf h.arch is1 p1) ∧
  (option_case T (λi1. i1 ∈ local_ips(h.ifds)) is1 ∨ windows_arch h.arch)

Description

From thread tid, which is in the Run state, a bind(fd, is1, ↑p1) call is performed on the socket sock, which is identified by sid and referenced by fd. If an IP address is specified in the call, is1 = ↑i1, then i1 must be an IP address for one of the host’s interfaces (this check is not required on WinXP, as the final disjunct of the rule shows). The requested local address binding, (is1, ↑p1), is not available as it is already in use: see bound_port_allowed (p85) for details.

The call proceeds by a tid·bind(fd, is1, ↑p1) transition leaving the thread in state Ret(FAIL EADDRINUSE) to return error EADDRINUSE to the caller.

bind_3 all: fast fail
Fail with EADDRNOTAVAIL: the specified IP address is not available on the host

  h 〈[ts := ts ⊕ (tid ↦ (Run)_d)]〉
  -- tid·bind(fd, ↑i1, ps1) -->
  h 〈[ts := ts ⊕ (tid ↦ (Ret(FAIL EADDRNOTAVAIL))_{sched_timer})]〉

  fd ∈ dom(h.fds) ∧
  fid = h.fds[fd] ∧
  h.files[fid] = File(FT_Socket(sid), ff) ∧
  i1 ∉ local_ips(h.ifds)

Description

From thread tid, which is in the Run state, a bind(fd, ↑i1, ps1) call is made where fd refers to a socket sid. The IP address, i1, to be assigned as part of the socket’s local address does not belong to any of the interfaces on the host, i1 ∉ local_ips(h.ifds), and therefore cannot be assigned to the socket.

The call proceeds by a tid·bind(fd, ↑i1, ps1) transition leaving the thread in state Ret(FAIL EADDRNOTAVAIL) to return error EADDRNOTAVAIL to the caller.
bind_5 all: fast fail
Fail with EINVAL: the socket is already bound to an address and does not support rebinding; or, on FreeBSD, the socket has been shut down for writing

  h 〈[ts := ts ⊕ (tid ↦ (Run)_d)]〉
  -- tid·bind(fd, is1, ps1) -->
  h 〈[ts := ts ⊕ (tid ↦ (Ret(FAIL EINVAL))_{sched_timer})]〉

  fd ∈ dom(h.fds) ∧
  fid = h.fds[fd] ∧
  h.files[fid] = File(FT_Socket(sid), ff) ∧
  h.socks[sid] = sock ∧
  (sock.ps1 ≠ ∗ ∨
   (bsd_arch h.arch ∧ sock.pr = TCP_PROTO(tcp_sock) ∧
    (sock.cantsndmore ∨ tcp_sock.cb.bsd_cantconnect)))

Description

From thread tid, which is in the Run state, a bind(fd, is1, ps1) call is made where fd refers to a socket sock. The socket already has a local port binding, sock.ps1 ≠ ∗, and rebinding is not supported.

A tid·bind(fd, is1, ps1) transition is made, leaving the thread in state Ret(FAIL EINVAL).

Variations

FreeBSD: This rule also applies if fd refers to a TCP socket which is either shut down for writing or has its bsd_cantconnect flag set.

bind_7 all: fast fail
Fail with EACCES: the specified port is privileged and the current process does not have permission to bind to it

  h 〈[ts := ts ⊕ (tid ↦ (Run)_d)]〉
  -- tid·bind(fd, is1, ↑p1) -->
  h 〈[ts := ts ⊕ (tid ↦ (Ret(FAIL EACCES))_{sched_timer})]〉

  fd ∈ dom(h.fds) ∧
  fid = h.fds[fd] ∧
  h.files[fid] = File(FT_Socket(sid), ff) ∧
  (¬h.privs ∧ p1 ∈ privileged_ports)

Description

From thread tid, which is in the Run state, a bind(fd, is1, ↑p1) call is made where fd refers to a socket sid. The port specified in the bind call, p1, lies in the host’s range of privileged ports, p1 ∈ privileged_ports, and the current host (actually, process) does not have sufficient permissions to bind to it: ¬h.privs.

The call proceeds by a tid·bind(fd, is1, ↑p1) transition leaving the thread in state Ret(FAIL EACCES) to return the access violation error EACCES to the caller.
bind_9 all: fast badfail
Fail with ENOBUFS: no ephemeral ports free for autobinding or, on WinXP only, insufficient buffers available.

  h 〈[ts := ts ⊕ (tid ↦ (Run)_d)]〉
  -- tid·bind(fd, is1, ps1) -->
  h 〈[ts := ts ⊕ (tid ↦ (Ret(FAIL ENOBUFS))_{sched_timer})]〉

  fd ∈ dom(h.fds) ∧
  fid = h.fds[fd] ∧
  h.files[fid] = File(FT_Socket(sid), ff) ∧
  ps1 = ∗ ∧
  ((autobind(ps1, (proto_of (h.socks[sid]).pr), h.socks) = ∅) ∨ windows_arch h.arch)

Description

From thread tid, which is in the Run state, a bind(fd, is1, ps1) call is made where fd refers to a socket sid. A port is not specified in the bind call, i.e. ps1 = ∗, and calling autobind returns the empty set ∅ rather than a set of free ephemeral ports that the socket could choose from. This occurs only when there are no remaining ephemeral ports available for autobinding.

The call proceeds by a tid·bind(fd, is1, ps1) transition leaving the thread in state Ret(FAIL ENOBUFS) to return the out-of-resources error ENOBUFS to the caller.

Model details

Posix reports ENOBUFS to signify that "Insufficient resources were available to complete the call". This is not modelled here.

Variations

WinXP: On WinXP this error can occur non-deterministically when insufficient buffers are available.

15.3 close() (TCP and UDP)

close : fd → unit

A call close(fd) closes file descriptor fd so that it no longer refers to a file description and associated socket. The closed file descriptor is made available for reuse by the process. If the file descriptor is the last file descriptor referencing a file description, the file description itself is deleted and the underlying socket is closed. If the socket is a UDP socket it is removed.

It is important to note the distinction drawn above: only closing the last file descriptor of a socket has an effect on the state of the file description and socket.
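The distinction between closing a file descriptor and closing the last file descriptor can be sketched in Python with two finite maps. This is an illustrative model only: `fds` maps descriptors to file description identifiers and `files` maps those identifiers to an arbitrary payload, and `close_fd` is a hypothetical helper name. `fid_ref_count` mirrors the counting function used in the close rules.

```python
def fid_ref_count(fds: dict, fid) -> int:
    """Number of file descriptors in fds that refer to file description fid."""
    return sum(1 for f in fds.values() if f == fid)

def close_fd(fds: dict, files: dict, fd):
    """Close fd, returning (new_fds, new_files, was_last).

    The file description is deleted only when fd was the last
    descriptor referring to it; the inputs are left unmodified.
    """
    fid = fds[fd]
    was_last = fid_ref_count(fds, fid) == 1
    new_fds = {k: v for k, v in fds.items() if k != fd}
    new_files = {k: v for k, v in files.items()
                 if not (was_last and k == fid)}
    return new_fds, new_files, was_last
```

With two descriptors sharing one file description (as after a dup() or fork()), the first close only removes a descriptor; the second removes the file description as well, which is the point at which the socket-level behaviour described in this section applies.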
The following behaviour may occur when closing the last file descriptor of a TCP socket:
• A TCP socket may have the SO_LINGER option set, which specifies a maximum duration in seconds that a close(fd) call is permitted to block.
  – In the normal case the SO_LINGER option is not set: the close call returns immediately, and TCP asynchronously sends any remaining data and gracefully closes the connection.
  – If SO_LINGER is set to a non-zero duration, the close(fd) call will block while the TCP implementation attempts to successfully send any remaining data in the socket’s send buffer and gracefully close the connection. If the sending of remaining data and the graceful close are successful within the set duration, close(fd) returns successfully; otherwise the linger timer expires, close(fd) returns an error EAGAIN, and the close operation continues asynchronously, attempting to send the remaining data.
  – The SO_LINGER option may be set to zero to indicate that close(fd) should be abortive. A call to close(fd) tears down the connection by emitting a reset segment to the remote end (abandoning any data remaining in the socket’s send queue) and returns successfully without blocking.
• If close(fd) is called on a TCP socket in a pre-established state, the file description and socket are simply closed and removed, regardless of how SO_LINGER is set, except on Linux platforms where SYN_RECEIVED is dealt with as an established state for the purposes of close(fd).
• Calling close(fd) on a listening TCP socket closes and removes the socket and aborts each of the connections on the socket’s pending and completed connection queues.

15.3.1 Errors

A call to close() can fail with the errors below, in which case the corresponding exception is raised:

EAGAIN The linger timer expired for a lingering close() call and the socket has not yet been successfully closed.
EBADF The file descriptor passed is not a valid file descriptor.
ENOTSOCK The file descriptor passed does not refer to a socket.
EINTR The system call was interrupted by a caught signal.

15.3.2 Common cases

A TCP socket is created and connected to a peer; other socket calls are made, most likely send() and recv(), but the SO_LINGER option is not set. close() is then called and the connection is gracefully closed: socket_1; . . . ; close_2

A UDP socket is created and socket calls are made on it, mostly send() and recv() calls; the socket is then closed: socket_1; . . . ; close_10

15.3.3 API

Posix: int close(int fildes);
FreeBSD: int close(int d);
Linux: int close(int fd);
WinXP: int closesocket(SOCKET s);

In the Posix interface:
• fildes is the file descriptor to close, corresponding to the fd argument of the model close().
• the returned int is either 0 to indicate success or -1 to indicate an error, in which case the error code is in errno. On WinXP an error is indicated by a return value of SOCKET_ERROR, not -1, with the actual error code available through a call to WSAGetLastError().

The FreeBSD, Linux and WinXP interfaces are similar modulo argument renaming, except where noted above.

15.3.4 Model details

The following errors are not modelled:
• In Posix and on FreeBSD and Linux, EIO means an I/O error occurred while reading from or writing to the file system. Since we model only sockets, not file systems, we do not model this error.
• On FreeBSD, ENOSPC means the underlying object did not fit and cached data was lost.
• WSAEINPROGRESS is WinXP-specific and described in the MSDN page as "A blocking Windows Sockets 1.1 call is in progress, or the service provider is still processing a callback function". This is not modelled here.
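For orientation, the three SO_LINGER configurations that the close() rules distinguish (not set, zero, and a positive duration) might be selected from an application roughly as follows. This Python sketch uses the standard socket and struct modules; it assumes the two-int layout of struct linger (l_onoff, l_linger) found on Linux and the BSDs, which does not hold on Windows, and the helper names and mode strings are hypothetical.

```python
import socket
import struct

def linger_value(mode: str, seconds: int = 0) -> bytes:
    """Pack a struct linger for SO_LINGER (assumed layout: two C ints)."""
    if mode == "default":      # option off: the model's SO_LINGER = infinity
        return struct.pack("ii", 0, 0)
    if mode == "abortive":     # on with zero duration: close() sends a reset
        return struct.pack("ii", 1, 0)
    if mode == "linger":       # on with a positive duration: close() may block
        if seconds <= 0:
            raise ValueError("a lingering close needs a positive duration")
        return struct.pack("ii", 1, seconds)
    raise ValueError("unknown mode: " + mode)

def set_linger(sock: socket.socket, mode: str, seconds: int = 0) -> None:
    """Apply the chosen linger configuration to an existing socket."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                    linger_value(mode, seconds))
```

Keeping the packing separate from the setsockopt() call makes the three cases easy to inspect without opening a socket.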
15.3.5 Summary

close_1   all: fast succeed            Successfully close a file descriptor that is not the last file descriptor for a socket
close_2   tcp: fast succeed            Successfully perform a graceful close on the last file descriptor of a synchronised socket
close_3   tcp: fast succeed            Successful abortive close of a synchronised socket
close_4   tcp: block                   Block on a lingering close on the last file descriptor of a synchronised socket
close_5   tcp: slow urgent succeed     Successful completion of a lingering close on a synchronised socket
close_6   tcp: slow nonurgent fail     Fail with EAGAIN: unsuccessful completion of a lingering close on a synchronised socket
close_7   tcp: fast succeed            Successfully close the last file descriptor for a socket in the CLOSED, SYN_SENT or SYN_RECEIVED states.
close_8   tcp: fast succeed            Successfully close the last file descriptor for a listening TCP socket
close_10  udp: fast succeed            Successfully close the last file descriptor of a UDP socket

15.3.6 Rules

close_1 all: fast succeed
Successfully close a file descriptor that is not the last file descriptor for a socket

  h 〈[ts := ts ⊕ (tid ↦ (Run)_d); fds := fds]〉
  -- tid·close(fd) -->
  h 〈[ts := ts ⊕ (tid ↦ (Ret(OK()))_{sched_timer}); fds := fds′]〉

  fd ∈ dom(fds) ∧
  fid = fds[fd] ∧
  fid_ref_count(fds, fid) > 1 ∧
  fds′ = fds \\ fd

Description

A close(fd) call is performed where fd refers to either a TCP or UDP socket. At least two file descriptors refer to file description fid, fid_ref_count(fds, fid) > 1, of which one is fd, fid = fds[fd].

The close(fd) call proceeds by a tid·close(fd) transition leaving the host in the successful return state Ret(OK()). In the final host state, the mapping of file descriptor fd to file description index fid is removed from the file descriptors finite map, fds′ = fds \\ fd, effectively reducing the reference count of the file description by one.
The close() call does not alter the socket’s state as other file descriptors still refer to the socket through file description fid.

close_2 tcp: fast succeed
Successfully perform a graceful close on the last file descriptor of a synchronised socket

  h 〈[ts := ts ⊕ (tid ↦ (Run)_d);
      fds := fds;
      files := files ⊕ [(fid, File(FT_Socket(sid), ff))];
      socks := socks ⊕ [(sid, Sock(↑fid, sf, ↑i1, ↑p1, ↑i2, ↑p2, es, cantsndmore, cantrcvmore,
                               TCP_Sock(st, cb, ∗, sndq, sndurp, rcvq, rcvurp, iobc)))]]〉
  -- tid·close(fd) -->
  h 〈[ts := ts ⊕ (tid ↦ (Ret(OK()))_{sched_timer});
      fds := fds′;
      files := files \\ fid;
      socks := socks ⊕ [(sid, Sock(∗, sf, ↑i1, ↑p1, ↑i2, ↑p2, es, T, T,
                               TCP_Sock(st, cb, ∗, sndq, sndurp, [ ], rcvurp, iobc)))]]〉

  (st ∈ {ESTABLISHED; FIN_WAIT_1; CLOSING; FIN_WAIT_2; TIME_WAIT; CLOSE_WAIT; LAST_ACK} ∨
   st = SYN_RECEIVED ∧ linux_arch h.arch) ∧
  (sf.t(SO_LINGER) = ∞ ∨
   ff.b(O_NONBLOCK) = T ∧ sf.t(SO_LINGER) ≠ 0 ∧ ¬linux_arch h.arch) ∧
  fd ∈ dom(fds) ∧
  fid = fds[fd] ∧
  fid_ref_count(fds, fid) = 1 ∧
  fds′ = fds \\ fd ∧
  fid ∉ dom(files)

Description

A close(fd) call is performed on the TCP socket sid referenced by file descriptor fd, which is the only file descriptor referencing the socket’s file description: fid_ref_count(fds, fid) = 1. The TCP socket sid is in a synchronised state, i.e. a state ≥ ESTABLISHED, or on Linux it may be in the SYN_RECEIVED state.

In the common case the socket’s linger option is not set, sf.t(SO_LINGER) = ∞, and regardless of whether the socket is in non-blocking mode or not, i.e. ff.b(O_NONBLOCK) is unconstrained, the call to close() proceeds successfully without blocking.

On all platforms except for Linux, if the socket is in non-blocking mode, ff.b(O_NONBLOCK) = T, the linger option may be set with a positive duration: sf.t(SO_LINGER) ≠ 0. In this case the option is ignored, giving precedence to the socket’s non-blocking semantics.
The close() call succeeds without blocking.

The close(fd) call proceeds by a tid·close(fd) transition leaving the host in the successful return state Ret(OK()). The final socket is marked as unable to send and receive further data, cantsndmore = T ∧ cantrcvmore = T, eventually causing TCP to transmit all remaining data in the socket’s send queue and perform a graceful close.

In the final host state, the mapping of file descriptor fd to file description index fid is removed from the file descriptors finite map, fds′ = fds \\ fd, and the file description entry fid is removed from the finite map of file descriptions, files \\ fid. The socket entry itself, (sid, Sock(↑fid, . . .)), is not destroyed at this point; it remains until the TCP connection has been successfully closed.

Variations

Linux: The socket can be in the SYN_RECEIVED state or in one of the synchronised states ≥ ESTABLISHED. On Linux, non-blocking semantics do not take precedence over the SO_LINGER option, i.e. if the socket is non-blocking, ff.b(O_NONBLOCK) = T, and a linger option is set to a non-zero value, sf.t(SO_LINGER) ≠ 0, the socket may block on a call to close(). See also close_4 (p140).
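The state change made by a graceful close can be mirrored in a few lines of Python. This is a rough sketch with hypothetical, much-reduced data structures: each socket is a dictionary holding only the fields the rule changes or visibly preserves, `files` maps a file description directly to a socket identifier, and `graceful_close_update` is an invented helper name.

```python
def graceful_close_update(fds: dict, files: dict, socks: dict, fd):
    """Apply a close_2-style update, returning new (fds, files, socks).

    The descriptor and the file description disappear, but the socket
    entry survives with its send queue intact so that remaining data
    can still be transmitted before the connection closes.
    """
    fid = fds[fd]
    sid = files[fid]
    new_fds = {k: v for k, v in fds.items() if k != fd}
    new_files = {k: v for k, v in files.items() if k != fid}
    closed = dict(socks[sid],
                  fid=None,            # file description field cleared
                  cantsndmore=True,    # no further sends by the application
                  cantrcvmore=True,    # no further receives
                  rcvq=[])             # unread data is discarded
    new_socks = dict(socks)
    new_socks[sid] = closed
    return new_fds, new_files, new_socks
```

The point of interest is the asymmetry: the receive queue is emptied immediately, whereas the send queue is left untouched for later rules to drain.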
close_3 tcp: fast succeed
Successful abortive close of a synchronised socket

  h 〈[ts := ts ⊕ (tid ↦ (Run)_d);
      fds := fds;
      files := files ⊕ [(fid, File(FT_Socket(sid), ff))];
      socks := socks ⊕ [(sid, sock)];
      oq := oq]〉
  -- tid·close(fd) -->
  h 〈[ts := ts ⊕ (tid ↦ (Ret(OK()))_{sched_timer});
      fds := fds′;
      files := files;
      socks := socks ⊕ [(sid, sock′)];
      oq := oq′]〉

  (st ∈ {ESTABLISHED; FIN_WAIT_1; CLOSING; FIN_WAIT_2; TIME_WAIT; CLOSE_WAIT; LAST_ACK} ∨
   st = SYN_RECEIVED ∧ linux_arch h.arch) ∧
  sock = Sock(↑fid, sf, ↑i1, ↑p1, ↑i2, ↑p2, es, cantsndmore, cantrcvmore,
              TCP_Sock(st, cb, ∗, sndq, sndurp, rcvq, rcvurp, iobc)) ∧
  sf.t(SO_LINGER) = 0 ∧
  fd ∈ dom(fds) ∧
  fid = fds[fd] ∧
  fid_ref_count(fds, fid) = 1 ∧
  fds′ = fds \\ fd ∧
  fid ∉ dom(files) ∧
  sid ∉ dom(socks) ∧
  sock′ = (tcp_close h.arch sock)〈[fid := ∗]〉 ∧
  seg ∈ make_rst_segment_from_cb cb (i1, i2, p1, p2) ∧
  enqueue_and_ignore_fail h.arch h.rttab h.ifds [TCP seg] oq oq′

Description

A close(fd) call is performed on the TCP socket sid referenced by file descriptor fd, which is the only file descriptor referencing the socket’s file description: fid_ref_count(fds, fid) = 1. The TCP socket sid is in a synchronised state, i.e. a state ≥ ESTABLISHED, or on Linux it may also be in the SYN_RECEIVED state.

The socket’s linger option is set to a duration of zero, sf.t(SO_LINGER) = 0, to signify that an abortive closure of socket sid is required.

The close(fd) call proceeds by a tid·close(fd) transition leaving the host in the successful return state Ret(OK()). A reset segment seg is constructed from the socket’s control block cb and address quad (i1, i2, p1, p2) and is appended to the host’s output queue, oq, by the function enqueue_and_ignore_fail (p118), to create new output queue oq′.
The enqueue_and_ignore_fail function always succeeds; if it is not possible to add the reset segment seg to the output queue, the corresponding error code is ignored and the reset segment is not queued for transmission.

The mapping of file descriptor fd to index fid is removed from the file descriptors finite map, fds′ = fds \\ fd, and the file description entry indexed by fid is removed from the finite map of file descriptions. The socket is put in the CLOSED state, shut down for reading and writing, has its control block reset, and has its send and receive queues emptied; this is done by the auxiliary function tcp_close (p121). Additionally, its file description field is cleared.

Variations

Linux: The socket can be in the SYN_RECEIVED state or in one of the synchronised states ≥ ESTABLISHED.

close_4 tcp: block
Block on a lingering close on the last file descriptor of a synchronised socket

  h 〈[ts := ts ⊕ (tid ↦ (Run)_d);
      fds := fds;
      files := files ⊕ [(fid, File(FT_Socket(sid), ff))];
      socks := socks ⊕ [(sid, Sock(↑fid, sf, ↑i1, ↑p1, ↑i2, ↑p2, es, cantsndmore, cantrcvmore,
                               TCP_Sock(st, cb, ∗, sndq, sndurp, rcvq, rcvurp, iobc)))]]〉
  -- tid·close(fd) -->
  h 〈[ts := ts ⊕ (tid ↦ (Close2(sid))_{slow_timer(sf.t(SO_LINGER))});
      fds := fds′;
      files := files;
      socks := socks ⊕ [(sid, Sock(∗, sf, ↑i1, ↑p1, ↑i2, ↑p2, es, T, T,
                               TCP_Sock(st, cb, ∗, sndq, sndurp, [ ], rcvurp, iobc)))]]〉

  (st ∈ {ESTABLISHED; FIN_WAIT_1; CLOSING; FIN_WAIT_2; TIME_WAIT; CLOSE_WAIT; LAST_ACK} ∨
   st = SYN_RECEIVED ∧ linux_arch h.arch) ∧
  sf.t(SO_LINGER) ∉ {0; ∞} ∧
  (ff.b(O_NONBLOCK) = F ∨ (ff.b(O_NONBLOCK) = T ∧ linux_arch h.arch)) ∧
  fd ∈ dom(fds) ∧
  fid = fds[fd] ∧
  fid_ref_count(fds, fid) = 1 ∧
  fds′ = fds \\ fd ∧
  fid ∉ dom(files)

Description

A close(fd) call is performed on the TCP socket sid referenced by file descriptor fd, which is the only file descriptor referencing the socket’s file description: fid_ref_count(fds, fid) = 1. The TCP socket sid has a blocking mode of operation, ff.b(O_NONBLOCK) = F, and is in a synchronised state, i.e. a state ≥ ESTABLISHED. On Linux, the socket is also permitted to be in the SYN_RECEIVED state and it may have non-blocking semantics, ff.b(O_NONBLOCK) = T, because the linger option takes precedence over non-blocking semantics.

The socket’s linger option is set to a positive duration and is neither zero (which signifies an immediate abortive close of the socket) nor infinity (which signifies that the linger option has not been set), sf.t(SO_LINGER) ∉ {0; ∞}. The close call blocks for a maximum duration that is the linger option duration in seconds, during which time TCP attempts to send all remaining data in the socket’s send buffer and gracefully close the connection.

The close(fd) call proceeds by a tid·close(fd) transition leaving the host in the blocked state Close2(sid). The socket is marked as unable to send and receive further data, cantsndmore = T ∧ cantrcvmore = T; this eventually causes TCP to send all remaining data in the socket’s send queue and perform a graceful close.

In the final host state, the mapping of file descriptor fd to file description index fid is removed from the file descriptors finite map, fds′ = fds \\ fd, and file description entry fid is removed from the finite map of file descriptions. The socket entry itself, (sid, Sock(↑fid, . . .)), is not destroyed at this point; it remains until the TCP socket has been successfully closed by future asynchronous events.

Variations

Linux: The socket can be in the SYN_RECEIVED state or in one of the synchronised states ≥ ESTABLISHED. On Linux, non-blocking semantics do not take precedence over the SO_LINGER option, i.e. if the socket is non-blocking, ff.b(O_NONBLOCK) = T, and a linger option is set to a non-zero value, sf.t(SO_LINGER) ≠ 0, the socket may block on a call to close().
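The side conditions of close_2, close_3 and close_4 partition the synchronised case by linger value, blocking mode and architecture. The following Python sketch restates that partition; the architecture strings, the use of math.inf for an unset linger option, and the function names are illustrative assumptions, and only the last-descriptor case is considered.

```python
import math

SYNCHRONISED = {"ESTABLISHED", "FIN_WAIT_1", "CLOSING", "FIN_WAIT_2",
                "TIME_WAIT", "CLOSE_WAIT", "LAST_ACK"}

def is_synchronised(st: str, arch: str) -> bool:
    """Linux additionally treats SYN_RECEIVED as synchronised for close()."""
    return st in SYNCHRONISED or (st == "SYN_RECEIVED" and arch == "linux")

def close_rule(st: str, linger: float, nonblock: bool, arch: str):
    """Return which of close_2, close_3 or close_4 applies, or None.

    linger is math.inf when SO_LINGER is not set, 0 for an abortive
    close, and a positive number of seconds otherwise.
    """
    if not is_synchronised(st, arch):
        return None                 # handled by other rules, e.g. close_7
    if linger == 0:
        return "close_3"            # abortive: emit a reset segment
    if linger == math.inf:
        return "close_2"            # graceful, returns immediately
    if nonblock and arch != "linux":
        return "close_2"            # non-blocking mode overrides the linger
    return "close_4"                # block for up to the linger duration
```

Written this way it is easy to see that the three rules are mutually exclusive and that the only architecture-dependent choices are the treatment of SYN_RECEIVED and of a non-blocking socket with a positive linger duration.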
close_5 tcp: slow urgent succeed
Successful completion of a lingering close on a synchronised socket

  h 〈[ts := ts ⊕ (tid ↦ (Close2(sid))_d);
      socks := socks ⊕ [(sid, Sock(∗, sf, ↑i1, ↑p1, ↑i2, ↑p2, es, T, T,
                               TCP_Sock(st, cb, ∗, [ ], sndurp, [ ], rcvurp, iobc)))]]〉
  -- τ -->
  h 〈[ts := ts ⊕ (tid ↦ (Ret(OK()))_{sched_timer});
      socks := socks ⊕ [(sid, Sock(∗, sf, ↑i1, ↑p1, ↑i2, ↑p2, es, T, T,
                               TCP_Sock(st, cb, ∗, [ ], sndurp, [ ], rcvurp, iobc)))]]〉

  st ∈ {TIME_WAIT; CLOSED; FIN_WAIT_2}

Description

A previous call to close() with the linger option set on the socket blocked, leaving thread tid in the Close2(sid) state. The socket sid has successfully transmitted all the data in its send queue, sndq = [ ], and has completed a graceful close of the connection: st ∈ {TIME_WAIT; CLOSED; FIN_WAIT_2}.

The rule proceeds via a τ transition leaving thread tid in the Ret(OK()) state to return successfully from the blocked close() call. The socket remains in a closed state.

Note that the asynchronous sending of any remaining data in the send queue and graceful closing of the connection is handled by other rules. This rule applies once these events have reached a successful conclusion.

close_6 tcp: slow nonurgent fail
Fail with EAGAIN: unsuccessful completion of a lingering close on a synchronised socket

  h 〈[ts := ts ⊕ (tid ↦ (Close2(sid))_d);
      socks := socks ⊕ [(sid, sock)]]〉
  -- τ -->
  h 〈[ts := ts ⊕ (tid ↦ (Ret(FAIL EAGAIN))_{sched_timer});
      socks := socks ⊕ [(sid, sock)]]〉

  sock = Sock(∗, sf, ↑i1, ↑p1, ↑i2, ↑p2, es, T, T,
              TCP_Sock(st, cb, ∗, sndq, sndurp, [ ], rcvurp, iobc)) ∧
  timer_expires d ∧
  st ∉ {TIME_WAIT; CLOSED}

Description

A previous call to close() with the linger option set on the socket blocked, leaving thread tid in the Close2(sid) state. The linger timer has expired, timer_expires d, before the socket has been successfully closed: st ∉ {TIME_WAIT; CLOSED}.
The rule proceeds via a τ transition leaving thread tid in the Ret(FAIL EAGAIN) state to return error EAGAIN from the blocked close() call. The socket remains in a synchronised state and is not destroyed until the socket has been successfully closed by future asynchronous events.

The asynchronous transmission of any remaining data in the send queue and the graceful closing of the connection is handled by other rules. This rule is only predicated on these operations not having succeeded, i.e. st ∉ {TIME_WAIT; CLOSED}. When the linger timer expires the socket could be (a) still attempting to transmit the data in the send queue, or (b) some way through the graceful close operation. The exact state of the socket is not important here, which explains the relatively unconstrained socket state in the rule.

close_7 tcp: fast succeed
Successfully close the last file descriptor for a socket in the CLOSED, SYN_SENT or SYN_RECEIVED states.

  h 〈[ts := ts ⊕ (tid ↦ (Run)_d);
      fds := fds;
      files := files ⊕ [(fid, File(FT_Socket(sid), ff))];
      socks := socks ⊕ [(sid, sock)]]〉
  -- tid·close(fd) -->
  h 〈[ts := ts ⊕ (tid ↦ (Ret(OK()))_{sched_timer});
      fds := fds′;
      files := files;
      socks := socks]〉

  (tcp_sock.st ∈ {CLOSED; SYN_SENT} ∨
   tcp_sock.st = SYN_RECEIVED ∧ ¬linux_arch h.arch) ∧
  TCP_PROTO(tcp_sock) = sock.pr ∧
  fid ∉ dom(files) ∧
  sid ∉ dom(socks) ∧
  fd ∈ dom(fds) ∧
  fid = fds[fd] ∧
  fid_ref_count(fds, fid) = 1 ∧
  fds′ = fds \\ fd

Description

A close(fd) call is performed on the TCP socket sock, identified by sid and referenced by file descriptor fd, which is the only file descriptor referencing the socket’s file description: fid_ref_count(fds, fid) = 1. The TCP socket sock is not in a synchronised state: st ∈ {CLOSED; SYN_SENT} or, except on Linux, st = SYN_RECEIVED.

The close(fd) call proceeds by a tid·close(fd) transition leaving the host in the successful return state Ret(OK()).
The mapping of file descriptor fd to file description index fid is removed from the host’s finite map of file descriptors; the file description entry for fid is removed from the host’s finite map of file descriptions; and the socket entry (sid, sock) is removed from the host’s finite map of sockets.

Variations

Linux: The rule does not apply if the socket is in state SYN_RECEIVED: for the purposes of close() this is treated as a synchronised state on Linux.

Note that the socket sock is not in a synchronised state and thus has no data in its send queue ready for transmission. Closing an unsynchronised socket simply involves deleting the socket entry and removing all references to it. These operations are performed immediately by the rule, hence the socket’s SO_LINGER option is not constrained because it has no effect regardless of how it may be set.

close_8 tcp: fast succeed
Successfully close the last file descriptor for a listening TCP socket

  h 〈[ts := ts ⊕ (tid ↦ (Run)_d);
      fds := fds;
      files := files ⊕ [(fid, File(FT_Socket(sid), ff))];
      socks := socks ⊕ [(sid, sock)];
      listen := listen;
      oq := oq]〉
  -- tid·close(fd) -->
  h 〈[ts := ts ⊕ (tid ↦ (Ret(OK()))_{sched_timer});
      fds := fds′;
      files := files;
      socks := socks′;
      listen := listen′;
      oq := oq′]〉

  sock = Sock(↑fid, sf, is1, ↑p1, ∗, ∗, es, cantsndmore, cantrcvmore,
              TCP_Sock(LISTEN, cb, ↑lis, sndq, sndurp, rcvq, rcvurp, iobc)) ∧
  fd ∈ dom(fds) ∧
  fid = fds[fd] ∧
  fid_ref_count(fds, fid) = 1 ∧
  fid ∉ dom(files) ∧
  sid ∉ dom(socks) ∧
  (* cantrcvmore/cantsndmore unconstrained under BSD, as may have previously called shutdown *)
  (* MS: this is more of an assertion than a condition, so we could get away without it *)
  (bsd_arch h.arch ∨ (cantsndmore = F ∧ cantrcvmore = F)) ∧
  (* BSD and Linux do not send RSTs to sockets on lis.q0. *)
  socks_to_rst = {(sock′, tcp_sock′) | ∃sid′. sid′ ∈ lis.q ∧
                                        sock′ = socks[sid′] ∧
                                        TCP_PROTO(tcp_sock′) = sock′.pr ∧
                                        tcp_sock′.st ∉ {CLOSED; LISTEN; SYN_SENT}} ∧
  socks_to_rst_list ∈ ORDERINGS socks_to_rst ∧
  card socks_to_rst = length segs ∧
  (let make_rst_seg = λ(sock′, tcp_sock′).
         make_rst_segment_from_cb tcp_sock′.cb (the sock′.is1, the sock′.is2, the sock′.ps1, the sock′.ps2) in
   every I (map2 (λs′ seg′. seg′ ∈ make_rst_seg s′) socks_to_rst_list segs)) ∧
  (* Note this is a clear example of where fuzzy timing is needed: should these really all have exactly the same time always? *)
  enqueue_each_and_ignore_fail h.arch h.rttab h.ifds (map TCP segs) oq oq′ ∧
  fds′ = fds \\ fd ∧
  listen′ = filter (λsid′. sid′ ≠ sid) listen ∧
  socks′ = socks|{sid′ | sid′ ∉ lis.q0 @ lis.q}

Description

A close(fd) call is performed on the TCP socket sock referenced by file descriptor fd, which is the only file descriptor referencing the socket’s file description fid, fid_ref_count(fds, fid) = 1. Socket sock is locally bound to port p1 and one or more local IP addresses is1, and is in the LISTEN state.

The listening socket sock may have ESTABLISHED incoming connections on its connection queue lis.q and incomplete incoming connection attempts on queue lis.q0. Each connection, regardless of whether it is complete or not, is represented by a socket entry in h.socks and its corresponding index sid is on the respective queue. These connections have not been accepted by any thread through a call to accept() and are dropped on the closure of socket sock.

A set of reset segments rsts_to_go is created using the auxiliary function make_rst_segment_from_cb (p109) for each of the sockets referenced by both queues.
This is performed by looking up each socket sock′ for every sid′ in the concatenation of both queues, lis.q0 @ lis.q, and extracting their address quads (sock′.is1, sock′.is2, sock′.ps1, sock′.ps2) and control blocks cb for use by make_rst_segment_from_cb. The set of reset segments rsts_to_go is constrained to a list, segs, and queued by the auxiliary function enqueue_each_and_ignore_fail on the host's output queue h.oq. The enqueue_each_and_ignore_fail function always succeeds; if any of the reset segments in segs cannot be added to the output queue h.oq, the corresponding error codes are ignored and those segments are simply not queued for transmission. This is sensible behaviour as the sockets for these connections are about to be deleted: if a reset segment does not successfully abort the remote end of the connection, perhaps because it could not be transmitted in the first place, any future incoming segments should not match any other socket in the system and will be dropped.

The close(fd) call proceeds by a tid·close(fd) transition leaving the host in the successful return state Ret(OK()). In the final host state, the mapping of file descriptor fd to file descriptor index fid is removed from the file descriptors finite map, fds′ = fds\\fd, and file description entry fid is removed from the finite map of file descriptions h.files. The socket entry sock is removed from the host's finite map of sockets h.socks and the socket's sid value is removed from the host's list of listening sockets h.listen by listen′ = filter(λsid′. sid′ ≠ sid) listen. Finally, all the sockets in h.socks that were referenced on one of the queues lis.q0 and lis.q are removed by socks′ = socks|{sid′ | sid′ ∉ lis.q0 @ lis.q}, as they were not accepted by any thread before socket sock was closed.

Model details
The local IP address option is1 of the socket sock is not constrained in this rule.
Instead it is constrained by other rules for bind() and listen() prior to the socket entering the LISTEN state.

close_10 udp: fast succeed Successfully close the last file descriptor of a UDP socket

h 〈[ts := ts ⊕ (tid ↦ (Run)_d);
    fds := fds;
    files := files ⊕ [(fid, File(FT_Socket(sid), ff))];
    socks := socks ⊕ [(sid, Sock(↑ fid, sf, is1, ps1, is2, ps2, es, cantsndmore, cantrcvmore, UDP_PROTO(udp)))]]〉
── tid·close(fd) ──→
h 〈[ts := ts ⊕ (tid ↦ (Ret(OK()))_sched_timer);
    fds := fds′;
    files := files;
    socks := socks]〉

fd ∈ dom(fds) ∧
fid = fds[fd] ∧
fid_ref_count(fds, fid) = 1 ∧
fds′ = fds\\fd ∧
fid ∉ dom(files) ∧
sid ∉ dom(socks)

Description
Consider a UDP socket sid, referenced by fd, with a file description record indexed by fid. fd is the only open file descriptor referring to the file description record indexed by fid, fid_ref_count(fds, fid) = 1. From thread tid, which is in the Run state, a close(fd) call is made and succeeds. A tid·close(fd) transition is made, leaving the thread state Ret(OK()). The socket sid is removed from the host's finite map of sockets socks ⊕ . . . , the file description record indexed by fid is removed from the host's finite map of file descriptions files ⊕ . . . , and fd is removed from the host's finite map of file descriptors fds′ = fds\\fd.

15.4 connect() (TCP and UDP)

connect : fd ∗ ip ∗ port option → unit

A call to connect(fd, ip, port) attempts to connect a TCP socket to a peer, or to set the peer address of a UDP socket. Here fd is a file descriptor referring to a socket, ip is the peer IP address to connect to, and port is the peer port.

If fd refers to a TCP socket then TCP's connection establishment protocol, often called the three-way handshake, will be used to connect the socket to the peer specified by (ip, port). A peer port must be specified: port cannot be set to ∗.
There must be a listening TCP socket at the peer address, otherwise the connection attempt will fail with an ECONNRESET or ECONNREFUSED error. The local socket must be in the CLOSED state: attempts to connect() to a peer when already synchronised with another peer will fail. To start the connection establishment attempt, a SYN segment will be constructed, specifying the initial sequence number and window size for the connection, and possibly the maximum segment size, window scaling, and timestamping. The segment is then enqueued on the host's out-queue; if this fails then the connect() call fails, otherwise connection establishment proceeds.

If the socket is a blocking one (the O_NONBLOCK flag for fd is not set), then the call will block until the connection is established, or until a timeout expires, in which case the error ETIMEDOUT is returned. If the socket is non-blocking (the O_NONBLOCK flag is set for fd), then the connect() call will fail with an EINPROGRESS error (or EAGAIN on WinXP), and connection establishment will proceed asynchronously. Calling connect() again will indicate the current status of the connection establishment in the returned error: it will fail with EALREADY if the connection has not been established, with EISCONN once the connection has been established, or, if the connection establishment failed, with an error describing why. Alternatively, pselect([ ], [fd], [ ], ∗, ) can be used; it will return when fd is ready for writing, which will be when connection establishment is complete, either successfully or not.
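The non-blocking pattern just described can be sketched with Python's standard socket module, which wraps the same connect() call. This is a minimal sketch for Unix-like hosts and is not part of the specification: the helper name nonblocking_connect and the default timeout are illustrative assumptions, select.select stands in for pselect, and reading SO_ERROR is the conventional way to learn the outcome once the descriptor becomes writable.

```python
import errno
import select
import socket


def nonblocking_connect(addr, timeout=5.0):
    """Start a non-blocking TCP connect to addr and wait for it to finish.

    Returns (0, sock) on success or (errno_value, None) on failure.
    Hypothetical helper, for illustration only.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setblocking(False)  # sets O_NONBLOCK on the descriptor
    rc = sock.connect_ex(addr)  # returns an errno value instead of raising
    if rc in (errno.EINPROGRESS, errno.EWOULDBLOCK):
        # Establishment proceeds asynchronously: wait until fd is writable.
        _, writable, _ = select.select([], [sock], [], timeout)
        if not writable:
            sock.close()
            return errno.ETIMEDOUT, None
        # Writable means establishment has completed, successfully or not.
        rc = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
    if rc != 0:
        sock.close()
        return rc, None
    return 0, sock
```

A caller would pass an (ip, port) pair and check the first component of the result: 0 means the socket is connected, and any other value is the error that ended the attempt.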
On Linux, unsetting the O_NONBLOCK flag for fd and then calling connect() will block until the connection is established or fails; for WinXP the call will fail with EALREADY and the connection establishment will still be performed asynchronously; for FreeBSD the call will fail with EISCONN even if the connection has not been established. Upon completion of connection establishment the socket will be in state ESTABLISHED, ready to send and receive data, or CLOSE_WAIT if it received a FIN segment during connection establishment. On FreeBSD, if connection establishment fails having sent a SYN then further connection establishment attempts are not allowed; on Linux and WinXP further attempts are possible.

If fd refers to a UDP socket then the peer address of the socket is set, but no connection is made. The peer address is then the default destination address for subsequent send() calls (and the only possible destination address on FreeBSD), and only datagrams with this source address will be delivered to the socket. On FreeBSD the peer port must be specified: a call to connect(fd, ip, ∗) will fail with an EADDRNOTAVAIL error; on Linux and WinXP such a call succeeds: datagrams from any port on the host with IP address ip will be delivered to the socket. Calling connect() on a UDP socket that already has a peer address set is allowed: the peer address will be replaced with the one specified in the call. On FreeBSD if the socket has a pending error, that may be returned when the call is made, and the peer address will also be set.

In order for a socket to connect to a peer or have its peer address set, it must be bound to a local IP and port. If it is not bound to a local port when the connect() call is made, then it will be autobound: an unused port for the socket's protocol in the host's ephemeral port range is selected and assigned to the socket.
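Autobinding can be pictured as choosing any port in the ephemeral range that no socket of the same protocol is already using. The following is a minimal sketch of that idea and not the specification's autobind function: the (protocol, port) pair representation of sockets, the function name, and the default range 49152 to 65535 are all assumptions made for illustration.

```python
def autobind_candidates(ps1, proto, socks, ephemeral=range(49152, 65536)):
    """Return the set of local ports a connect() call may end up bound to.

    ps1: the socket's current local port, or None if it is unbound.
    socks: iterable of (proto, local_port) pairs for the host's sockets.
    ephemeral: assumed ephemeral port range (hypothetical default).
    """
    if ps1 is not None:
        return {ps1}  # already bound: the existing port is kept
    # Ports already taken by sockets of the same protocol are excluded.
    used = {port for (p, port) in socks if p == proto and port is not None}
    return set(ephemeral) - used
```

In this sketch an empty result corresponds to the case where no ephemeral ports are left, which the errors section reports as EADDRNOTAVAIL.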
If the socket does not have its local IP address set then it will be bound to the primary IP address of an interface which has a route to the peer. If the socket does have a local IP address set then the interface that has this IP address will be the one used to connect to the peer; if this interface does not have a route to the peer then for a TCP socket the connect() call will fail when the SYN is enqueued on the host's outqueue; for a UDP socket the call will fail on FreeBSD, whereas on Linux and WinXP the connect() call will succeed but later send() calls to the peer will fail.

For a TCP socket, its binding quad must be unique: there can be no other socket in the host's finite map of sockets with the same binding quad. If the connect() call would result in two sockets having the same binding quad then it will fail with an EADDRINUSE error. For UDP sockets the same is true on FreeBSD, but on Linux and WinXP multiple sockets may have the same address quad. The socket that matching datagrams are delivered to is architecture-dependent: see lookup (p??).

15.4.1 Errors

A call to connect() can fail with the errors below, in which case the corresponding exception is raised:

EADDRNOTAVAIL  There is no route to the peer; a port must be specified (port ≠ ∗); or there are no ephemeral ports left.
EADDRINUSE  The address quad that would result if the connection was successful is in use by another socket of the same protocol.
EAGAIN  On WinXP, the socket is non-blocking and the connection cannot be established immediately: it will be established asynchronously. [TCP ONLY]
EALREADY  A connection attempt is already in progress on the socket but not yet complete: it is in state SYN_SENT or SYN_RECEIVED. [TCP ONLY]
ECONNREFUSED  Connection rejected by peer. [TCP ONLY]
ECONNRESET  Connection rejected by peer. [TCP ONLY]
EHOSTUNREACH  No route to the peer.
EINPROGRESS  The socket is non-blocking and the connection cannot be established immediately: it will be established asynchronously. [TCP ONLY]
EINVAL  On WinXP, socket is listening. [TCP ONLY]
EISCONN  Socket already connected. [TCP ONLY]
ENETDOWN  The interface used to reach the peer is down.
ENETUNREACH  No route to the peer.
EOPNOTSUPP  On FreeBSD, socket is listening. [TCP ONLY]
ETIMEDOUT  The connection attempt timed out before a connection was established for a socket. [TCP ONLY]
EBADF  The file descriptor passed is not a valid file descriptor.
ENOTSOCK  The file descriptor passed does not refer to a socket.
EINTR  The system was interrupted by a caught signal.
ENOBUFS  Out of resources.

15.4.2 Common cases

TCP: socket_1; connect_1; . . .
UDP: socket_1; bind_1; connect_8; . . .

15.4.3 API

Posix:   int connect(int socket, const struct sockaddr *address, socklen_t address_len);
FreeBSD: int connect(int s, const struct sockaddr *name, socklen_t namelen);
Linux:   int connect(int sockfd, const struct sockaddr *serv_addr, socklen_t addrlen);
WinXP:   int connect(SOCKET s, const struct sockaddr* name, int namelen);

In the Posix interface:
• socket is a file descriptor referring to the socket to make a connection on, corresponding to the fd argument of the model connect().
• address is a pointer to a sockaddr structure of length address_len specifying the peer to connect to. sockaddr is a generic socket address structure: what is used for the model connect() is an internet socket address structure sockaddr_in. The sin_family member is set to AF_INET; the sin_port member is the port to connect to, corresponding to the port argument of the model connect(): sin_port = 0 corresponds to port = ∗ and sin_port = p corresponds to port = ↑ p; the sin_addr.s_addr member of the structure corresponds to the ip argument of the model connect().
• the returned int is either 0 to indicate success or -1 to indicate an error, in which case the error code is in errno.
On WinXP an error is indicated by a return value of SOCKET_ERROR, not -1, with the actual error code available through a call to WSAGetLastError().

The FreeBSD, Linux and WinXP interfaces are similar modulo argument renaming, except where noted above.

Note: For UDP sockets, the Winsock Reference says "The default destination can be changed by simply calling connect again, even if the socket is already connected. Any datagrams queued for receipt are discarded if name is different from the previous connect." This is not the case.

15.4.4 Model details

If the call blocks then the thread enters state Connect2(sid) where sid is the identifier of the socket attempting to establish a connection.

The following errors are not modelled:
• EAFNOSUPPORT means that the specified address is not a valid address for the address family of the specified socket. The model connect() only models the AF_INET family of addresses so this error cannot occur.
• EFAULT signifies that the pointers passed as either the address or address_len arguments were inaccessible. This is an artefact of the C interface to connect() that is excluded by the clean interface used in the model.
• WSAEINPROGRESS is WinXP-specific and described in the MSDN page as "A blocking Windows Sockets 1.1 call is in progress, or the service provider is still processing a callback function". This is not modelled here.
• EINVAL is a Posix-specific error signifying that the address_len argument is not a valid length for the socket's address family, or that the sockaddr structure specifies an invalid address family. The length of the address to connect to is implicit in the model connect(), and only the AF_INET family of addresses is modelled, so this error cannot occur.
• EPROTOTYPE is a Posix-specific error meaning that the specified address has a different type than the socket bound to the specified peer address.
This error does not occur in any of the implementations as TCP and UDP sockets are dealt with separately.
• EACCES, ELOOP, and ENAMETOOLONG are errors dealing with Unix domain sockets which are not modelled here.

15.4.5 Summary

connect_1    tcp: rc                    Begin connection establishment by creating a SYN and trying to enqueue it on host's outqueue
connect_2    tcp: slow urgent succeed   Successfully return from blocking state after connection is successfully established
connect_3    tcp: slow urgent fail      Fail with the pending error on a socket in the CLOSED state
connect_4    tcp: slow urgent fail      Fail: socket has pending error
connect_4a   tcp: fast fail             Fail with pending error
connect_5    tcp: fast fail             Fail with EALREADY, EINVAL, EISCONN, EOPNOTSUPP: socket already in use
connect_5a   all: fast fail             Fail: no route to host
connect_5b   all: fast fail             Fail with EADDRINUSE: address already in use
connect_5c   all: fast fail             Fail with EADDRNOTAVAIL: no ephemeral ports left
connect_5d   tcp: block                 Block, entering state Connect2: connection attempt already in progress and connect called with blocking semantics
connect_6    tcp: fast fail             Fail with EINVAL: socket has been shutdown for writing
connect_7    udp: fast succeed          Set peer address on socket with binding quad ∗, ps1, ∗, ∗
connect_8    udp: fast succeed          Set peer address on socket with local address set
connect_9    udp: fast fail             Fail with EADDRNOTAVAIL: port must be specified in connect() call on FreeBSD
connect_10   udp: fast fail             Fail with pending error on FreeBSD, but still set peer address

15.4.6 Rules

connect_1 tcp: rc Begin connection establishment by creating a SYN and trying to enqueue it on host's outqueue

h ── tid·connect(fd, i2, ↑ p2) ──→ h′

(* Thread tid is in state Run and TCP socket sid has binding quad (is1, ps1, is2, ps2).
*) h = h0 〈[ ts := ts ⊕ (tid 7→ (Run)d); socks := socks ⊕ [(sid ,Sock(↑ fid , sf , is1, ps1, is2, ps2, es, cantsndmore, cantrcvmore, TCP Sock(st , cb, ∗, [ ], ∗, [ ], ∗,NO OOBDATA)))]; oq := oq ]〉 ∧ (* Thread tid ends in state t ′ with updated host sockets and output queue *) h ′ = h0 〈[ ts := ts ⊕ (tid 7→ t ′); socks := socks ⊕ [(sid ,Sock(↑ fid , sf , ↑ i ′1, ↑ p′1, is ′2, ps ′2, es ′′,F,F, TCP Sock(st ′, cb′′′, ∗, [ ], ∗, [ ], ∗,NO OOBDATA)))]; bound := bound ; oq := oq ′]〉 ∧ (* File descriptor fd refers to TCP socket sid *) fd ∈ dom(h0.fds) ∧ fid = h0.fds[fd ] ∧ h0.files[fid ] = File(FT Socket(sid),ff ) ∧ (* Either sid is bound to a local IP address or one of the host’s interface has a route to i2 and i ′ 1 is one of its IP addresses. If it is not routable, then we will fail below, when we try to enqueue the segment. *) i ′1 ∈ auto outroute(i2, is1, h.rttab, h.ifds) ∧ (* Notice that auto outroute never fails if is1 6= ∗ (i.e., is specified in the socket). *) (* The socket is either bound to a local port p′1 or can be autobound to an ephemeral port p ′ 1 *) p′1 ∈ autobind(ps1,PROTO TCP, h.socks) ∧ (* If autobinding occurs then sid is added to the head of the host’s list of bound sockets. *) (if ps1 = ∗ then bound = sid :: h.bound else bound = h.bound) ∧ (* The socket can be in one of two states: (1) it is in state CLOSED in which case its peer address is not set; it has no pending error; it is not shutdown for writing; and it is not shutdown for reading on non-FreeBSD architectures. Otherwise, (2) on FreeBSD the socket is in state TIME WAIT, and either is2 and ps2 are both set or both are not set. The fact that BSD allows a TIME WAIT socket to be reconnected means that some fields may contain old data, so we leave them unconstrained here. This is particularly important in the cb. 
*) ((st = CLOSED ∧ is2 = ∗ ∧ ps2 = ∗ ∧ es = ∗ ∧ cantsndmore = F ∧ (cantrcvmore = F ∨ bsd arch h.arch)) ∨ (bsd arch h.arch ∧ st = TIME WAIT ∧ (is2 6= ∗ =⇒ ps2 6= ∗) ∧ (ps2 6= ∗ =⇒ is2 6= ∗))) ∧ (* No other TCP sockets on the host have the address quad (↑ i ′1, ↑ p′1, ↑ i2, ↑ p2). *) ¬(∃(sid ′, s) :: (h.socks\\sid). s.is1 = ↑ i ′1 ∧ s.ps1 = ↑ p′1 ∧ s.is2 = ↑ i2 ∧ s.ps2 = ↑ p2 ∧ proto of s.pr = PROTO TCP) ∧ (* Pick an initial sequence number non-deterministically. This allows accidental spoofing of our own connections, but it is unclear how a tighter specification should be expressed. *) iss ∈ {n | T} ∧ Rule version: $Id: TCP1 hostLTSScript.sml,v 1.961 2005/03/18 10:34:36 kw217 Exp $ connect 1 150 (* If windows-scaling is to be requested for the connection then request r scale = ↑ n where n is a valid window scale; otherwise, request r scale = ∗. rcv wnd0 is a valid receive window size. If window scaling is to be requested then the socket’s receive window is set to rcv wnd0 scaled by the window scale factor n; otherwise it is set to rcv wnd0 . The socket’s receive window is not greater than the size of the socket’s receive buffer. We must allow implementations to either (a) not implement window scaling, or (b) choose on a per-connection basis whether to do window scaling or not. This permits both. *) (request r scale : num option) ∈ {∗} ∪ {↑ n | n ≥ 0 ∧ n ≤ TCP MAXWINSCALE} ∧ (rcv wnd0 : num) ∈ {n | n > 0 ∧ n ≤ TCP MAXWIN} ∧ (rcv wnd : num) = rcv wnd0 (option case 0 I request r scale) ∧ rcv wnd ≤ sf .n(SO RCVBUF) ∧ (* Either advertise a maximum segment size, advmss, that is between 1 and 65535 − 40, or advertise no maximum segment size. If one is advertised, advmss ′ = ↑ advmss; otherwise, advmss ′ = ∗. *) advmss ∈ {n | n ≥ 1 ∧ n ≤ (65535− 40)} ∧ advmss ′ ∈ {∗; ↑ advmss} ∧ (* If time-stamping is to be requested for the connection, then tf req tstmp′ = T; otherwise tf req tstmp′ = F. *) tf req tstmp′ ∈ {F;T}∧ (* do timestamp? 
*) (* If there is no segment currently being timed for this socket (the expected case) then the SYN segment will be timed, with t rttseg ′ set to the current time and the initial sequence number for the connection, iss. *) (let t rttseg ′ = if IS NONE cb.t rttseg then ↑(ticks of h.ticks, iss) else cb.t rttseg in (* Update the socket’s control block to cb′, which is cb except we: (1) start the retransmit and connection establishment timers; (2) set the snd una, snd nxt , snd max , iss fields based on the initial sequence number chosen; (3) set the rcv wnd , rcv adv , and tf rxwin0sent fields based on the receive window chosen; (4) record whether or not to do windows scaling, time-stamping, and what the advertised maximum segment size is; and (5) store the segment to time. *) cb′ = cb 〈[ tt rexmt := start tt rexmtsyn h.arch 0 F cb.t rttinf ; tt conn est := ↑((())slow timer TCPTV KEEP INIT); snd una := iss; snd nxt := iss + 1; snd max := iss + 1; iss := iss; rcv wnd := rcv wnd ; rcv adv := cb.rcv nxt + rcv wnd ; (* since rcv nxt is 0 at this point (since we do not yet know), this is a bit odd. But it models BSD behaviour. *) tf rxwin0sent :=(rcv wnd = 0); request r scale := request r scale; (* store whether we requested WS and if so what *) t maxseg := cb.t maxseg ; (* do not change this *) tadvmss := advmss ′; (* store what mss we advertised; ∗ or ↑ v *) tf req tstmp := tf req tstmp′; last ack sent := tcp seq foreign 0w; t rttseg := t rttseg ′ Rule version: $Id: TCP1 hostLTSScript.sml,v 1.961 2005/03/18 10:34:36 kw217 Exp $ connect 1 151 ]〉) ∧ (* now build the segment (using an auxiliary, since we might have to retransmit it) *) (* Make a SYN segment based on the updated control block and the socket’s address quad; see make syn segment (p106) for details. *) choose seg :: make syn segment cb′(i ′1, i2, p ′ 1, p2)(ticks of h.ticks). (* and send it out... *) (* If possible, enqueue the segment seg on the host’s outqueue. 
The auxiliary function rollback tcp output (p117) is used for this; if the segment is a well-formed segment, there is a route to the peer from i ′1, and there are no buffer allocation failures, outsegs ′ 6= [ ], then the segment is enqueued on the host’s outqueue, oq , resulting in a new outqueue, oq ′. The socket’s control block is left as cb′ which is described above. Otherwise an error may have occurred; possible errors are: (1) ENOBUFS indicating a buffer allocation failure; (2) a routing error; or (3) EADDRNOTAVAIL on FreeBSD or EINVAL on Linux indicating that the segment would cause a loopback packet to appear on the wire (on WINXP the segment is silently dropped with no error in this case). If an error does occur then the socket’s control block reverts to cb, the control block when the call was made. *) ∃outsegs ′. rollback tcp output F(TCP seg)h.arch h.rttab h.ifds T (cb 〈[ snd nxt := iss; snd max := iss; tt delack := ∗; last ack sent := tcp seq foreign 0w; rcv adv := tcp seq foreign 0w ]〉)cb′(cb′′, es ′, outsegs ′) ∧ cb′′′ = (if (outsegs ′ 6= [ ] ∨ windows arch h.arch) then cb′′ else cb) ∧ enqueue oq list qinfo(oq , outsegs ′, oq ′) ∧ (* If the socket is a blocking one, its O NONBLOCK flag is not set, then the call will block, entering state Connect2(sid) and leaving the socket in state SYN SENT with peer address (↑ i2, ↑ p2) and, if the segment could not be enqueued, its pending error set to the error resulting from the attempt to enqueue the segment. If the socket is non-blocking, its O NONBLOCK flag is set, and the segment was enqueued on the host’s outqueue, then the call will fail with an EINPROGRESS error (or EAGAIN on WinXP). The socket will be left in state SYN SENT with peer address (↑ i2, ↑p2). Otherwise, if the segment was not enqueued, then the call will fail with the error resulting from attempting to enqueue it, ↑ err ; the socket will be left in state CLOSED with no peer address set. 
*)
(* In the case of BSD, if we connect via the loopback interface, then the segment exchange occurs so fast that the socket has connected before the connect-calling thread regains control. When it does, it sees that the socket has been connected, and therefore returns with success rather than EINPROGRESS. Since this behaviour is due to timing, however, it may be possible for the connect call to return before all the segments have been sent, for example if there was an artificially imposed delay on the loopback interface. This behaviour is therefore made nondeterministic, for a BSD non-blocking socket connecting via loopback, in that it may either fail immediately, or be blocked for a short time. Linux does not exhibit this behaviour. *)
(
  (* blocking socket, or BSD and using loopback interface *)
  ((¬ff.b(O_NONBLOCK) ∨ (bsd_arch h.arch ∧ i2 ∈ local_ips h.ifds)) ∧
   t′ = (Connect2(sid))_never_timer ∧ rc = block ∧ es′′ = es′ ∧
   st′ = SYN_SENT ∧ is2′ = ↑ i2 ∧ ps2′ = ↑ p2) ∨
  (* non-blocking socket *)
  (ff.b(O_NONBLOCK) ∧ es = ∗ ∧
   (err = (if windows_arch h.arch then EAGAIN else EINPROGRESS) ∨ ↑ err = es′) ∧
   t′ = (Ret(FAIL err))_sched_timer ∧ rc = fast fail ∧ es′′ = ∗ ∧
   if oq = oq′ then
     st′ = CLOSED ∧ is2′ = ∗ ∧ ps2′ = ∗
   else
     st′ = SYN_SENT ∧ is2′ = ↑ i2 ∧ ps2′ = ↑ p2)
)

Description
From thread tid, a connect(fd, i2, ↑ p2) call is made where fd refers to a TCP socket. The socket is in state CLOSED with no peer address set, no pending error, and not shutdown for reading or writing. A SYN segment is created to begin connection establishment, and is enqueued on the host's out-queue. If the socket is a blocking one (its O_NONBLOCK flag is not set) then the call will block: a tid·connect(fd, i2, ↑ p2) transition is made, leaving the thread state Connect2(sid).
If the socket is non-blocking (its O_NONBLOCK flag is set) and the segment enqueuing was successful then the call will fail: a tid·connect(fd, i2, ↑ p2) transition is made, leaving the thread state Ret(FAIL EINPROGRESS) (or Ret(FAIL EAGAIN) on WinXP); connection establishment will proceed asynchronously. Otherwise, if the enqueuing did not succeed, the call will fail with an error err: a tid·connect(fd, i2, ↑ p2) transition is made, leaving the thread in state Ret(FAIL err). For further details see the in-line comments above.

Variations
FreeBSD  The socket may also be in state TIME_WAIT when the connect() call is made, with either both its peer IP and port set, or neither set. The socket may be shutdown for reading when the connect() call is made.
WinXP  If there is an early buffer allocation failure when enqueuing the segment, then it will not be placed on the host's out-queue and es′ = ENOBUFS; the socket's control block will be cb′ with its snd_nxt and snd_max fields set to the initial sequence number, its last_ack_sent and rcv_adv fields set to 0, its tt_delack option set to ∗, its tt_rexmt timer stopped, and its tf_rxwin0sent and t_rttseg fields reset. If there is no route from an interface specified by the local IP address i1 to the foreign IP address i2 then the socket's control block will be cb′ with its snd_nxt field set to the initial sequence number, its last_ack_sent and rcv_adv fields set to 0, and its tt_delack option set to ∗. If the segment would cause a loopback packet to be sent on the wire then the socket's control block will be cb′.
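The blocking and non-blocking outcomes of connect_1 can be summarised in a small model. This is a minimal sketch that follows the prose description rather than the exact formal disjunction, and it is not part of the specification: the function name, the string encodings of architectures, thread results and socket states, and the enqueue_error parameter are all illustrative assumptions. It returns a set because the rule is nondeterministic for a FreeBSD non-blocking socket connecting over loopback.

```python
def connect_1_outcomes(nonblocking, arch, peer_is_local, enqueue_error=None):
    """Return the set of (thread_result, socket_state) pairs the sketch allows.

    nonblocking: True if O_NONBLOCK is set on the socket.
    arch: "bsd", "linux" or "winxp" (assumed encoding).
    peer_is_local: True if the peer address is one of the host's own IPs.
    enqueue_error: None if the SYN was enqueued, else an error name.
    """
    outcomes = set()
    # Blocking socket, or BSD connecting via the loopback interface:
    # the thread blocks in Connect2 and the socket is left in SYN_SENT.
    if not nonblocking or (arch == "bsd" and peer_is_local):
        outcomes.add(("BLOCK Connect2", "SYN_SENT"))
    if nonblocking:
        if enqueue_error is None:
            # SYN enqueued: fail fast, establishment continues asynchronously.
            err = "EAGAIN" if arch == "winxp" else "EINPROGRESS"
            outcomes.add(("FAIL " + err, "SYN_SENT"))
        else:
            # SYN not enqueued: fail with that error, socket stays CLOSED.
            outcomes.add(("FAIL " + enqueue_error, "CLOSED"))
    return outcomes
```

For example, a non-blocking FreeBSD socket connecting to a local address yields two permitted outcomes, matching the comment in the rule that the call may either fail immediately or be blocked for a short time.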
connect_2 tcp: slow urgent succeed Successfully return from blocking state after connection is successfully established

h 〈[ts := ts ⊕ (tid ↦ (Connect2 sid)_d)]〉
── τ ──→
h 〈[ts := ts ⊕ (tid ↦ (Ret(OK()))_sched_timer)]〉

TCP_PROTO(tcp_sock) = (h.socks[sid]).pr ∧
tcp_sock.st ∈ {ESTABLISHED; CLOSE_WAIT} ∧
(¬∃tid′ d′. (tid′ ∈ dom(ts)) ∧ (tid′ ≠ tid) ∧ ts[tid′] = (Connect2 sid)_d′)

Description
Thread tid is blocked in state Connect2(sid) where sid identifies a TCP socket which is in state ESTABLISHED (connection establishment has been successfully completed) or CLOSE_WAIT (connection establishment successfully completed but a FIN was received during establishment). tid is the only thread which is blocked waiting for the socket sid to establish a connection. As connection establishment has now completed, the thread can successfully return from the blocked state. A τ transition is made, leaving the thread state Ret(OK()).

connect_3 tcp: slow urgent fail Fail with the pending error on a socket in the CLOSED state

h 〈[ts := ts ⊕ (tid ↦ (Connect2 sid)_d);
    socks := socks ⊕ [(sid, sock 〈[es := ↑ e]〉)]]〉
── τ ──→
h 〈[ts := ts ⊕ (tid ↦ (Ret(FAIL e))_sched_timer);
    socks := socks ⊕ [(sid, sock 〈[es := ∗]〉)]]〉

TCP_PROTO(tcp_sock) = sock.pr ∧
tcp_sock.st = CLOSED ∧
(bsd_arch h.arch ⟹ tcp_sock.cb.bsd_cantconnect = T)

Description
Thread tid is blocked in the Connect2(sid) state where sid identifies a TCP socket sock that is in the CLOSED state: connection establishment has failed, leaving the socket in a pending error state ↑ e. Usually this occurs when there is no listening TCP socket at the peer address, giving an error of ECONNREFUSED or ECONNRESET; or when the connection establishment timer expired, giving an error of ETIMEDOUT. The call now returns, failing with the error e, and clearing the pending error field of the socket.
A τ transition is made, leaving the thread state Ret(FAIL e).

Variations
FreeBSD  When connection establishment failed, the bsd_cantconnect flag in the control block would have been set, the socket's cantsndmore and cantrcvmore flags would have been set and its local address binding would have been removed. This renders the socket useless: calls to bind(), connect(), and listen() will all fail.

connect_4 tcp: slow urgent fail Fail: socket has pending error

h 〈[ts := ts ⊕ (tid ↦ (Connect2 sid)_d);
    socks := socks ⊕ [(sid, sock)]]〉
── τ ──→
h 〈[ts := ts ⊕ (tid ↦ (Ret(FAIL err))_sched_timer);
    socks := socks ⊕ [(sid, sock′)]]〉

sock = Sock(↑ fid, sf, ↑ i1, ps1, ↑ i2, ↑ p2, ↑ err, F, F,
            TCP_Sock(SYN_SENT, cb, ∗, [ ], ∗, [ ], ∗, NO_OOBDATA)) ∧
(* On WinXP if the error is from routing to an unavailable address, the error is not returned and the socket is left alone. The rexmtsyn timer will retry the SYN transmission and eventually fail. *)
¬(windows_arch h.arch ∧ err = EINVAL) ∧
(if bsd_arch h.arch then
   (if (err = EADDRNOTAVAIL) then
      sock′ = Sock(↑ fid, sf, ↑ i1, ps1, ↑ i2, ↑ p2, ∗, F, F,
                   TCP_Sock(SYN_SENT, cb, ∗, [ ], ∗, [ ], ∗, NO_OOBDATA))
    else
      sock′ = Sock(↑ fid, sf, ↑ i1, ps1, ∗, ∗, ∗, F, F,
                   TCP_Sock(CLOSED, initial_cb, ∗, [ ], ∗, [ ], ∗, NO_OOBDATA)))
 else
   (* close the socket, but do not shutdown for reading/writing *)
   sock′ = Sock(↑ fid, sf, ↑ i1, ps1, ∗, ∗, ∗, F, F,
                TCP_Sock(CLOSED, cb′, ∗, [ ], ∗, [ ], ∗, NO_OOBDATA)) ∧
   cb′ = initial_cb
)

Description
Thread tid is blocked in the Connect2(sid) state waiting for a connection to be established. sid identifies a TCP socket sock that has not been shutdown for reading or writing, and has binding quad (↑ i1, ps1, ↑ i2, ↑ p2) and pending error err. The socket is in state SYN_SENT, is not listening, has empty send and receive queues, and no urgent marks set. The call fails, returning the pending error.
A τ transition is made, leaving the thread state Ret(FAIL err). The socket is left in state CLOSED with its peer address not set, its pending error cleared, and its control block reset to the initial control block, initial_cb.

Variations
FreeBSD  If the pending error is EADDRNOTAVAIL then the error is cleared and returned but the rest of the socket stays the same: it is in state SYN_SENT so the SYN will be retransmitted until it times out. If the pending error is not EADDRNOTAVAIL then the socket is reset as above except that the socket's local IP and port are cleared.
WinXP  If the error is EINVAL then this rule does not apply.

connect_4a tcp: fast fail Fail with pending error

h 〈[ts := ts ⊕ (tid ↦ (Run)_d);
    socks := socks ⊕ [(sid, sock 〈[es := ↑ err]〉)]]〉
── tid·connect(fd, i2, ↑ p2) ──→
h 〈[ts := ts ⊕ (tid ↦ (Ret(FAIL err))_sched_timer);
    socks := socks ⊕ [(sid, sock 〈[es := ∗]〉)]]〉

fd ∈ dom(h.fds) ∧
fid = h.fds[fd] ∧
h.files[fid] = File(FT_Socket(sid), ff) ∧
TCP_PROTO(tcp_sock) = sock.pr ∧
tcp_sock.st ∈ {CLOSED}

Description
From thread tid, which is in the Run state, a connect(fd, i2, ↑ p2) call is made. fd refers to a TCP socket sock, identified by sid, with pending error err and in state CLOSED. The call fails with the pending error. A tid·connect(fd, i2, ↑ p2) transition is made, leaving the thread state Ret(FAIL err) and the socket's pending error clear.

The most likely cause of this behaviour is for a non-blocking connect(fd, _, _) call to have previously been made. That connection attempt fails, setting the pending error on the socket, and when connect() is called again to check the status of connection establishment the error is returned. In such a case err is most likely to be ECONNREFUSED, ECONNRESET, or ETIMEDOUT.
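The architecture-dependent handling of a pending error by connect_4, given before connect_4a above, can be condensed into a short decision function. This is a minimal sketch under stated assumptions and not part of the specification: the function name, the string encodings of architectures and states, and the three-field result are illustrative, and only the fields common to the rule and its description (returned error, resulting TCP state, whether the peer address is kept) are modelled.

```python
def connect_4_outcome(arch, pending_error):
    """Return (returned_error, socket_state, peer_address_kept), or None.

    arch: "bsd", "linux" or "winxp" (assumed encoding).
    pending_error: name of the socket's pending error.
    None means the rule does not apply and the thread stays blocked.
    """
    # WinXP: an EINVAL routing error is not returned; the SYN is retried.
    if arch == "winxp" and pending_error == "EINVAL":
        return None
    # FreeBSD: EADDRNOTAVAIL is returned and cleared, but the socket stays
    # in SYN_SENT, so the SYN keeps being retransmitted until it times out.
    if arch == "bsd" and pending_error == "EADDRNOTAVAIL":
        return (pending_error, "SYN_SENT", True)
    # Otherwise the error is returned and the socket is reset to CLOSED
    # with no peer address and a fresh control block.
    return (pending_error, "CLOSED", False)
```

In every applicable case the pending error is both returned to the caller and cleared on the socket, which is why it does not appear as a separate field of the result.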
connect 5 tcp: fast fail Fail with EALREADY, EINVAL, EISCONN, EOPNOTSUPP: socket already in use h 〈[ts := ts ⊕ (tid 7→ (Run)d)]〉 tid ·connect(fd , i2, ↑ p2)−−−−−−−−−−−−−−−−−−→ h 〈[ts := ts ⊕ (tid 7→ (Ret(FAIL err))sched timer)]〉 fd ∈ dom(h.fds) ∧ fid = h.fds[fd ] ∧ h.files[fid ] = File(FT Socket(sid),ff ) ∧ TCP PROTO(tcp sock) = (h.socks[sid ]).pr ∧ case tcp sock .st of SYN SENT→ if ff .b(O NONBLOCK) = T then err = EALREADY (* connection already in progress *) Rule version: $Id: TCP1 hostLTSScript.sml,v 1.961 2005/03/18 10:34:36 kw217 Exp $ connect 5a 155 else if windows arch h.arch then err = EALREADY (* connection already in progress *) else if bsd arch h.arch then err = EISCONN (* connection being established *) else ASSERTION FAILURE“connect 5:1” ‖ (* never happen *) SYN RECEIVED→ if ff .b(O NONBLOCK) = T then err = EALREADY (* connection already in progress *) else if windows arch h.arch then err = EALREADY else if bsd arch h.arch then err = EISCONN (* connection being established *) else ASSERTION FAILURE“connect 5:2” ‖ (* never happen *) LISTEN→ if windows arch h.arch then err = EINVAL (* socket is listening *) else if bsd arch h.arch then err = EOPNOTSUPP else if linux arch h.arch then err = EISCONN else ASSERTION FAILURE“connect 5:3” ‖ (* never happen *) ESTABLISHED→ err = EISCONN ‖ (* socket already connected *) FIN WAIT 1→ err = EISCONN ‖ (* socket already connected *) FIN WAIT 2→ err = EISCONN ‖ (* socket already connected *) CLOSING→ err = EISCONN ‖ (* socket already connected *) CLOSE WAIT→ err = EISCONN ‖ (* socket already connected *) LAST ACK→ err = EISCONN ‖ (* socket already connected; seems that fd is valid in this state *) TIME WAIT→ (windows arch h.arch ∨ linux arch h.arch) ∧ err = EISCONN ‖ (* BSD allows a TIME WAIT socket to be reconnected *) CLOSED→ err = EINVAL ∧ bsd arch h.arch ∧ tcp sock .cb.bsd cantconnect = T Description From thread tid , which is in the Run state, a connect(fd , i2, ↑ p2) call is made where fd refers to a TCP socket 
identified by sid . The call fails with an error err : if the socket is in state SYN SENT or SYN RECEIVED and the socket is non-blocking or the host is a WinXP architecture then err = EALREADY (EISCONN on FreeBSD); if it is in state LISTEN then on WinXP err = EINVAL, on FreeBSD err = EOPNOTSUPP, and on Linux err = EISCONN; if it is in state ESTABLISHED, FIN WAIT 1, FIN WAIT 2, CLOSING, CLOSE WAIT, or TIME WAIT on Linux and WinXP, err = EISCONN; if it is in state CLOSED on FreeBSD and has its bsd cantconnect flag set then err = EINVAL. A tid ·connect(fd , i2, ↑ p2) transition is made, leaving the thread state Ret(FAIL err). Variations FreeBSD If the socket is in state TIME WAIT then the call does not fail: the socket may be reconnected by connect 1 (p148). connect 5a all: fast fail Fail: no route to host h 〈[ts := ts ⊕ (tid 7→ (Run)d); socks := socks ⊕ [(sid , sock 〈[is1 := ∗; ps1 := ps1]〉)]]〉 tid ·connect(fd , i2, ↑ p2)−−−−−−−−−−−−−−−−−−→ h 〈[ts := ts ⊕ (tid 7→ (Ret(FAIL err))sched timer); socks := socks ⊕ [(sid , sock 〈[is1 := is ′1; ps1 := ps ′1]〉)]; bound := bound ]〉 fd ∈ dom(h.fds) ∧ fid = h.fds[fd ] ∧ h.files[fid ] = File(FT Socket(sid),ff ) ∧ (if bsd arch h.arch ∧ proto of sock .pr = PROTO TCP then is ′1 = ↑ i ′1 ∧ i ′1 ∈ local primary ips h.ifds ∧ ps ′1 = ↑ p′1 ∧ p′1 ∈ autobind(ps1,PROTO TCP, h.socks) ∧ Rule version: $Id: TCP1 hostLTSScript.sml,v 1.961 2005/03/18 10:34:36 kw217 Exp $ connect 5b 156 (if ps1 = ∗ then bound = sid :: h.bound else bound = h.bound) else is ′1 = ∗ ∧ ps ′1 = ps1 ∧ bound = h.bound) ∧ case test outroute ip(i2, h.rttab, h.ifds, h.arch) of ↑ e → err = e ‖ other29 → F ∧ (proto of sock .pr = PROTO UDP =⇒ ¬ bsd arch h.arch) Description From thread tid , which is in the Run state, a connect(fd , i2, ↑ p2) call is made. fd refers to a socket identified by sid which does not have a local IP address set. The test outroute ip (p82) function is used to check if there is a route from the host to i2. 
There is no route so the call will fail with a routing error err . If there is no interface with a route to the host then on Linux the call fails with ENETUNREACH and on FreeBSD and WinXP it fails with EHOSTUNREACH. If there are interfaces with a route to the host but none of these are up then the call fails with ENETDOWN. A tid ·connect(fd , i2, ↑ p2) transition is made, leaving the thread state Ret(FAIL err), where err is one of the above errors. Variations FreeBSD This rule does not apply to UDP sockets on FreeBSD. Additionally, if the socket is not bound to a local port then it will be autobound to one and sid will be appended to the head of the host’s list of bound sockets, bound . The socket’s local IP address may be set to ↑ i1 even though there is no route from i1 to i2. connect 5b all: fast fail Fail with EADDRINUSE: address already in use h 〈[ts := ts ⊕ (tid 7→ (Run)d); socks := socks ⊕ [(sid , sock)]; bound := bound ]〉 tid ·connect(fd , i2, ↑ p2)−−−−−−−−−−−−−−−−−−→ h 〈[ts := ts ⊕ (tid 7→ (Ret(FAIL EADDRINUSE))sched timer); socks := socks ⊕ [(sid , sock 〈[is1 := is ′1; ps1 := ↑ p′1; is2 := is ′2; ps2 := ps ′2]〉)]; bound := bound ′]〉 fd ∈ dom(h.fds) ∧ fid = h.fds[fd ] ∧ h.files[fid ] = File(FT Socket(sid),ff ) ∧ i ′1 ∈ auto outroute(i2, sock .is1, h.rttab, h.ifds) ∧ p′1 ∈ autobind(sock .ps1, (proto of sock .pr), h.socks) ∧ (if sock .ps1 = ∗ then bound ′ = sid :: bound else bound ′ = bound) ∧ (proto of sock .pr = PROTO UDP =⇒ ¬(linux arch h.arch ∨ windows arch h.arch)) ∧ (∃(sid ′, s) :: socks\\sid . 
s.is1 = ↑ i ′1 ∧ s.ps1 = ↑ p′1 ∧ s.is2 = ↑ i2 ∧ s.ps2 = ↑ p2 ∧ proto eq sock .pr s.pr) ∧ (if proto of sock .pr = PROTO UDP then if sock .is2 = ∗ then is ′1 = sock .is1 ∧ is ′2 = ∗ ∧ ps ′2 = ∗ else is ′1 = ∗ ∧ is ′2 = ∗ ∧ ps ′2 = ∗ else is ′1 = sock .is1 ∧ is ′2 = sock .is2 ∧ ps ′2 = sock .ps2) Description Rule version: $Id: TCP1 hostLTSScript.sml,v 1.961 2005/03/18 10:34:36 kw217 Exp $ connect 5d 157 From thread tid , which is in the Run state, a connect(fd , i2, ↑ p2) call is made where fd refers to a socket sock identified by sid . The socket is either bound to local port ↑ p′1, or can be autobound to port ↑ p′1. The socket either has its local IP address set to ↑ i ′1 or else its local IP address is unset but there exists an IP address i ′1 for one of the host’s interfaces which has a route to i2. There exists another socket s in the host’s finite map of sockets, identified by sid ′, that has as its binding quad (↑ i ′1, ↑ p′1, ↑ i2, ↑ p2). A tid ·connect(fd , i2, ↑ p2) transition is made, leaving the thread state Ret(FAIL EADDRINUSE): there is already another socket with the same local address connected to the peer address (↑ i2, ↑ p2). The socket’s local port is set to ↑ p′1; if this was accomplished by autobinding then sid is appended to the head of bound , the host’s list of bound sockets, to create a new list bound ′. If sock is a TCP socket then its is1, is2, and ps2 fields are unchanged. If sock is a UDP socket on FreeBSD then if its peer IP address was set, its local IP address will be unset: is ′1 = ∗, otherwise its local IP address will stay as it was: is ′1 = sock .is1; its peer IP address and port will both be unset: is ′2 = ∗ ∧ ps ′2 = ∗. Variations Linux This rule does not apply to UDP sockets: Linux allows two UDP sockets to have the same binding quad. WinXP This rule does not apply to UDP sockets: WinXP allows two UDP sockets to have the same binding quad. 
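The binding-quad conflict test at the heart of connect_5b can be sketched as follows. The Sock tuple, the architecture strings, and the function name are assumptions made for illustration; the candidate local address (i1, p1) is taken as already chosen, where the specification obtains it from auto_outroute and autobind.

```python
from typing import Dict, NamedTuple, Optional


class Sock(NamedTuple):
    """Hypothetical socket record: protocol plus binding quad (None is the wildcard)."""
    proto: str
    is1: Optional[str]
    ps1: Optional[int]
    is2: Optional[str]
    ps2: Optional[int]


def connect_5b_applies(socks: Dict[int, Sock], sid: int,
                       i1: str, p1: int, i2: str, p2: int, arch: str) -> bool:
    """True if connect() on socket sid should fail with EADDRINUSE."""
    sock = socks[sid]
    # Linux and WinXP allow two UDP sockets to share a binding quad.
    if sock.proto == "UDP" and arch in ("Linux", "WinXP"):
        return False
    quad = (i1, p1, i2, p2)
    # Look for another socket of the same protocol already holding the quad.
    return any(
        other != sid
        and s.proto == sock.proto
        and (s.is1, s.ps1, s.is2, s.ps2) == quad
        for other, s in socks.items()
    )
```

The socket being connected is excluded from the search, mirroring the socks\\sid restriction in the rule.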
connect_5c all: fast fail
Fail with EADDRNOTAVAIL: no ephemeral ports left

h 〈[ts := ts ⊕ (tid ↦ (Run)_d)]〉
  −− tid·connect(fd, i2, ↑p2) −→
h 〈[ts := ts ⊕ (tid ↦ (Ret(FAIL EADDRNOTAVAIL))_sched_timer)]〉

fd ∈ dom(h.fds) ∧
fid = h.fds[fd] ∧
h.files[fid] = File(FT_Socket(sid), ff) ∧
(h.socks[sid]).ps1 = ∗ ∧
autobind(∗, (proto_of (h.socks[sid]).pr), h.socks) = ∅

Description
From thread tid, which is in the Run state, a connect(fd, i2, ↑p2) call is made. fd refers to a socket identified by sid which is not bound to a local port. There are no ephemeral ports available to autobind to, so the call fails with an EADDRNOTAVAIL error.
A tid·connect(fd, i2, ↑p2) transition is made, leaving the thread state Ret(FAIL EADDRNOTAVAIL).

connect_5d tcp: block
Block, entering state Connect2: connection attempt already in progress and connect called with blocking semantics

h 〈[ts := ts ⊕ (tid ↦ (Run)_d)]〉
  −− tid·connect(fd, i2, ↑p2) −→
h 〈[ts := ts ⊕ (tid ↦ (Connect2(sid))_never_timer)]〉

fd ∈ dom(h.fds) ∧
fid = h.fds[fd] ∧
h.files[fid] = File(FT_Socket(sid), ff) ∧
TCP_PROTO(tcp_sock) = (h.socks[sid]).pr ∧
ff.b(O_NONBLOCK) = F ∧
linux_arch h.arch ∧
tcp_sock.st ∈ {SYN_SENT; SYN_RECEIVED}

Description
From thread tid, which is in the Run state, a connect(fd, i2, ↑p2) call is made. fd refers to a TCP socket identified by sid which is in state SYN_SENT or SYN_RECEIVED: in other words, a connection attempt is already in progress for the socket (this could be an asynchronous connection attempt or one in another thread). The open file description referred to by fd does not have its O_NONBLOCK flag set, so the call blocks, awaiting completion of the original connection attempt.
A tid·connect(fd, i2, ↑p2) transition is made, leaving the thread state Connect2(sid).

Variations
FreeBSD: This rule does not apply.
WinXP: This rule does not apply.
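Rules connect_5 and connect_5d together determine what happens when connect() is called on a TCP socket that is already in use. The following sketch tabulates the outcome per state and architecture. The function name, the architecture strings, and the "BLOCK" marker are illustrative assumptions; a result of None means that neither rule applies and some other connect rule must handle the call.

```python
from typing import Optional

# States in which connect_5 always reports EISCONN.
CONNECTED_STATES = {"ESTABLISHED", "FIN_WAIT_1", "FIN_WAIT_2",
                    "CLOSING", "CLOSE_WAIT", "LAST_ACK"}


def connect_on_busy_socket(st: str, arch: str, nonblock: bool,
                           bsd_cantconnect: bool = False) -> Optional[str]:
    """Outcome of connect() on a TCP socket in state st: an error name,
    "BLOCK" (connect_5d), or None if neither connect_5 nor connect_5d applies."""
    if st in ("SYN_SENT", "SYN_RECEIVED"):
        if nonblock or arch == "WinXP":
            return "EALREADY"            # connection already in progress
        if arch == "FreeBSD":
            return "EISCONN"             # connection being established
        return "BLOCK"                   # Linux with blocking semantics
    if st == "LISTEN":
        return {"WinXP": "EINVAL", "FreeBSD": "EOPNOTSUPP", "Linux": "EISCONN"}[arch]
    if st in CONNECTED_STATES:
        return "EISCONN"
    if st == "TIME_WAIT":
        # FreeBSD allows a TIME_WAIT socket to be reconnected.
        return None if arch == "FreeBSD" else "EISCONN"
    if st == "CLOSED":
        return "EINVAL" if arch == "FreeBSD" and bsd_cantconnect else None
    raise ValueError(f"unknown TCP state: {st}")
```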
connect 6 tcp: fast fail Fail with EINVAL: socket has been shutdown for writing h 〈[ts := ts ⊕ (tid 7→ (Run)d); socks := socks ⊕ [(sid , sock 〈[cantsndmore :=T; pr :=TCP PROTO(tcp 〈[st :=CLOSED]〉)]〉)]]〉 tid ·connect(fd , i2, ↑ p2)−−−−−−−−−−−−−−−−−−→ h 〈[ts := ts ⊕ (tid 7→ (Ret(FAIL EINVAL))sched timer); socks := socks ⊕ [(sid , sock 〈[cantsndmore :=T; pr :=TCP PROTO(tcp 〈[st :=CLOSED]〉)]〉)]]〉 bsd arch h.arch ∧ fd ∈ dom(h.fds) ∧ fid = h.fds[fd ] ∧ h.files[fid ] = File(FT Socket(sid),ff ) Description On FreeBSD, from thread tid , which is in the Run state, a connect(fd , i2, ↑ p2) call is made. fd refers to a TCP socket sock identified by sid which is in state CLOSED and has been shutdown for writing. A tid ·connect(fd , i2, ↑ p2) transition is made, leaving the thread state Ret(FAIL EINVAL). Variations Posix This rule does not apply. Linux This rule does not apply. WinXP This rule does not apply. connect 7 udp: fast succeed Set peer address on socket with binding quad ∗, ps1, ∗, ∗ h0 tid ·connect(fd , i2, ps2)−−−−−−−−−−−−−−−−−→ Rule version: $Id: TCP1 hostLTSScript.sml,v 1.961 2005/03/18 10:34:36 kw217 Exp $ connect 8 159 h0 〈[ts := ts ⊕ (tid 7→ (Ret(OK()))sched timer); socks := socks ⊕ [(sid ,Sock(↑ fid , sf , ↑ i ′1, ↑ p′1, ↑ i2, ps2, es, cantsndmore ′, cantrcvmore,UDP PROTO(udp)))]; bound := bound ]〉 h0 = h 〈[ ts := ts ⊕ (tid 7→ (Run)d); socks := socks ⊕ [(sid ,Sock(↑ fid , sf , ∗, ps1, ∗, ∗, es, cantsndmore, cantrcvmore,UDP PROTO(udp)))] ]〉 ∧ fd ∈ dom(h.fds) ∧ fid = h.fds[fd ] ∧ h0.files[fid ] = File(FT Socket(sid),ff ) ∧ p′1 ∈ autobind(ps1,PROTO UDP, h0.socks) ∧ (if ps1 = ∗ then bound = sid :: h0.bound else bound = h0.bound) ∧ i ′1 ∈ auto outroute(i2, ∗, h0.rttab, h0.ifds) ∧ ¬(∃(sid ′, s) :: (h0.socks\\sid). 
s.is1 = ↑i1′ ∧ s.ps1 = ↑p1′ ∧
s.is2 = ↑i2 ∧ s.ps2 = ps2 ∧
proto_of s.pr = PROTO_UDP ∧
bsd_arch h.arch) ∧
(bsd_arch h.arch =⇒ ps2 ≠ ∗ ∧ es = ∗) ∧
(if windows_arch h.arch then cantsndmore′ = F
 else cantsndmore′ = cantsndmore)

Description
Consider a UDP socket sid, referenced by fd, with no local IP or peer address set. From thread tid, which is in the Run state, a connect(fd, i2, ps2) call is made. The socket's local port is either set to p1′, or it is unset and can be autobound to a local ephemeral port p1′. The local IP address can be set to i1′, which is the primary IP address for an interface with a route to i2.
A tid·connect(fd, i2, ps2) transition is made, leaving the thread state Ret(OK()). The socket's local address is set to (↑i1′, ↑p1′), and its peer address is set to (↑i2, ps2). If the socket's local port was autobound then sid is placed at the head of the host's list of bound sockets: bound = sid :: h0.bound.

Variations
FreeBSD: As above, with the additional conditions that a foreign port is specified in the connect() call, ps2 ≠ ∗, and there are no pending errors on the socket. Furthermore, there may be no other sockets in the host's finite map of sockets with the binding quad (↑i1′, ↑p1′, ↑i2, ps2).
WinXP: As above, except that the socket will not be shut down for writing after the connect() call has been made.
connect_8 udp: fast succeed
Set peer address on socket with local address set

h0
  −− tid·connect(fd, i, ps) −→
h 〈[ts := ts ⊕ (tid ↦ (Ret(OK()))_sched_timer);
    socks := socks ⊕ [(sid, Sock(↑fid, sf, ↑i1, ↑p1, ↑i, ps, es, cantsndmore′, cantrcvmore, UDP_PROTO(udp)))]]〉

h0 = h 〈[ts := ts ⊕ (tid ↦ (Run)_d);
         socks := socks ⊕ [(sid, Sock(↑fid, sf, ↑i1, ↑p1, is2, ps2, es, cantsndmore, cantrcvmore, UDP_PROTO(udp)))]]〉 ∧
fd ∈ dom(h.fds) ∧
fid = h.fds[fd] ∧
h.files[fid] = File(FT_Socket(sid), ff) ∧
(bsd_arch h.arch =⇒ ps ≠ ∗ ∧ es = ∗) ∧
(if windows_arch h.arch then cantsndmore′ = F
 else cantsndmore′ = cantsndmore) ∧
¬(∃(sid′, s) :: (h0.socks\\sid).
    s.is1 = ↑i1 ∧ s.ps1 = ↑p1 ∧
    s.is2 = ↑i ∧ s.ps2 = ps ∧
    proto_of s.pr = PROTO_UDP ∧
    bsd_arch h.arch)

Description
Consider a UDP socket sid, referenced by fd, with local address set to (↑i1, ↑p1). Its peer address may or may not be set. From thread tid, which is in the Run state, a connect(fd, i, ps) call is made. The call succeeds: a tid·connect(fd, i, ps) transition is made, leaving the thread in state Ret(OK()). The socket has its peer address set to (↑i, ps).

Variations
FreeBSD: As above, with the additional conditions that a foreign port is specified in the connect() call, ps ≠ ∗, and there are no pending errors on the socket. Furthermore, there may be no other sockets in the host's finite map of sockets with the binding quad (↑i1, ↑p1, ↑i, ps).
WinXP: As above, with the additional effect that if the socket was shut down for writing when the connect() call was made, it will no longer be shut down for writing.
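The architecture-dependent side conditions of connect_8 can be made concrete with a small model. This is a sketch under simplifying assumptions: the UdpSock record and the architecture strings are hypothetical, every socket in the map is taken to be a UDP socket, and None stands for ∗.

```python
from dataclasses import dataclass, replace
from typing import Dict, Optional


@dataclass(frozen=True)
class UdpSock:
    """Hypothetical UDP socket record (None is the wildcard)."""
    is1: Optional[str]
    ps1: Optional[int]
    is2: Optional[str]
    ps2: Optional[int]
    es: Optional[str]          # pending error
    cantsndmore: bool          # shut down for writing


def connect_8(socks: Dict[int, UdpSock], sid: int, i: str,
              ps: Optional[int], arch: str) -> Optional[UdpSock]:
    """Return the updated socket if connect_8 applies, else None."""
    sock = socks[sid]
    if sock.is1 is None or sock.ps1 is None:
        return None                       # rule needs a local address
    if arch == "FreeBSD":
        if ps is None or sock.es is not None:
            return None                   # port required, no pending error
        quad = (sock.is1, sock.ps1, i, ps)
        if any(o != sid and (s.is1, s.ps1, s.is2, s.ps2) == quad
               for o, s in socks.items()):
            return None                   # quad already taken by another socket
    # WinXP re-enables sending; other architectures keep the flag.
    cant = False if arch == "WinXP" else sock.cantsndmore
    return replace(sock, is2=i, ps2=ps, cantsndmore=cant)
```

On FreeBSD the cases rejected here are covered by other rules of the specification, such as connect_9 and connect_10.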
connect_9 udp: fast fail
Fail with EADDRNOTAVAIL: port must be specified in connect() call on FreeBSD

h 〈[ts := ts ⊕ (tid ↦ (Run)_d);
    socks := socks ⊕ [(sid, sock 〈[pr := UDP_PROTO(udp)]〉)]]〉
  −− tid·connect(fd, i, ∗) −→
h 〈[ts := ts ⊕ (tid ↦ (Ret(FAIL EADDRNOTAVAIL))_sched_timer);
    socks := socks ⊕ [(sid, sock 〈[is1 := is1; is2 := ∗; ps2 := ∗; pr := UDP_PROTO(udp)]〉)]]〉

bsd_arch h.arch ∧
fd ∈ dom(h.fds) ∧
fid = h.fds[fd] ∧
h.files[fid] = File(FT_Socket(sid), ff) ∧
(if sock.is2 ≠ ∗ then is1 = ∗ else is1 = sock.is1)

Description
On FreeBSD, consider a UDP socket sid referenced by fd. From thread tid, which is in the Run state, a connect(fd, i, ∗) call is made. Because no port is specified, the call fails with an EADDRNOTAVAIL error.
A tid·connect(fd, i, ∗) transition is made, leaving the thread state Ret(FAIL EADDRNOTAVAIL). The socket's peer address is cleared: is2 := ∗ and ps2 := ∗. Additionally, if the socket had its peer IP address set, sock.is2 ≠ ∗, then its local IP address will be cleared: is1 = ∗; otherwise it remains the same: is1 = sock.is1.

Variations
Posix: This rule does not apply.
Linux: This rule does not apply.
WinXP: This rule does not apply.

connect_10 udp: fast fail
Fail with pending error on FreeBSD, but still set peer address

h0
  −− tid·connect(fd, i, ps) −→
h0 〈[ts := ts ⊕ (tid ↦ (Ret(FAIL err))_sched_timer);
     socks := socks ⊕ [(sid, sock 〈[is2 := ↑i; ps2 := ps; es := ∗; pr := UDP_PROTO(udp)]〉)]]〉

bsd_arch h.arch ∧
h0 = h 〈[ts := ts ⊕ (tid ↦ (Run)_d);
         socks := socks ⊕ [(sid, sock 〈[es := ↑err; pr := UDP_PROTO(udp)]〉)]]〉 ∧
fd ∈ dom(h.fds) ∧
fid = h.fds[fd] ∧
h.files[fid] = File(FT_Socket(sid), ff) ∧
ps ≠ ∗ ∧
¬(∃(sid′, s) :: (h0.socks\\sid).
s.is1 = sock.is1 ∧ s.ps1 = sock.ps1 ∧
s.is2 = ↑i ∧ s.ps2 = ps ∧
proto_of s.pr = PROTO_UDP)

Description
On FreeBSD, consider a UDP socket sid, referenced by fd, with pending error err. From thread tid, which is in the Run state, a connect(fd, i, ps) call is made with ps ≠ ∗. There is no other UDP socket on the host which has the same local address (sock.is1, sock.ps1) as sid and its peer address set to (↑i, ps).
The call fails, returning the pending error err. A tid·connect(fd, i, ps) transition is made, leaving the thread state Ret(FAIL err). The socket's peer address is set to (↑i, ps), and the error is cleared from the socket.

Variations
Linux: This rule does not apply.
WinXP: This rule does not apply.

15.5 disconnect() (TCP and UDP)

disconnect : fd → unit

A call to disconnect(fd), where fd is a file descriptor referring to a socket, removes the peer address for a UDP socket. If a UDP socket has its peer address set to (↑i2, ↑p2) then it can only receive datagrams with source address (i2, p2). Calling disconnect() on the socket resets its peer address to (∗, ∗), so that it can again receive datagrams with any source address.
It does not make sense to disconnect a TCP socket in this way. Most supported architectures simply disallow disconnect on such a socket; however, Linux implements it as an abortive close (see close_3 (p139)).

15.5.1 Errors
A call to disconnect() can fail with the errors below, in which case the corresponding exception is raised:
EADDRNOTAVAIL  There are no ephemeral ports left for autobinding to.
EAFNOSUPPORT  The address family AF_UNSPEC is not supported. This can be the result of a successful disconnect() on a UDP socket.
EAGAIN  There are no ephemeral ports left for autobinding to.
EALREADY  A connection is already in progress.
EBADF  The file descriptor fd is an invalid file descriptor.
EISCONN  The socket is already connected.
ENOBUFS  No buffer space is available.
EOPNOTSUPP  The socket is listening and cannot be connected.
EBADF  The file descriptor passed is not a valid file descriptor.
ENOTSOCK  The file descriptor passed does not refer to a socket.

15.5.2 Common cases
disconnect_1; return_1

15.5.3 API
disconnect() is a Posix connect() call with the address family set to AF_UNSPEC.

Posix:   int connect(int socket, const struct sockaddr *address, socklen_t address_len);
FreeBSD: int connect(int s, const struct sockaddr *name, socklen_t namelen);
Linux:   int connect(int sockfd, const struct sockaddr *serv_addr, socklen_t addrlen);
WinXP:   int connect(SOCKET s, const struct sockaddr* name, int namelen);

In the Posix interface:
• socket is a file descriptor referring to a socket. This corresponds to the fd argument of the model disconnect().
• address is a pointer to a location of size address_len containing a sockaddr structure which specifies the address to connect to. For a disconnect() call, the sin_family field of the sockaddr structure must be set to AF_UNSPEC; other fields can be set to anything.
• the returned int is either 0 to indicate success or -1 to indicate an error, in which case the error code is in errno. On WinXP an error is indicated by a return value of SOCKET_ERROR, not -1, with the actual error code available through a call to WSAGetLastError().

The Linux man-page states: "Unconnecting a socket by calling connect with a AF_UNSPEC address is not yet implemented." As a result, a disconnect() call always returns successfully on Linux.
The WinXP documentation states: "The default destination can be changed by simply calling connect again, even if the socket is already connected. Any datagrams queued for receipt are discarded if name is different from the previous connect." This implies that calling disconnect() will result in all datagrams on the socket's receive queue being discarded; however, this is not the case: no datagrams are discarded.
15.5.4 Summary
disconnect_4  tcp: fast fail     Fail with EAFNOSUPPORT: address family not supported; EOPNOTSUPP: operation not supported; EALREADY: connection already in progress; or EISCONN: socket already connected
disconnect_5  tcp: fast fail     Succeed on Linux, possibly dropping the connection
disconnect_1  udp: fast succeed  Unset socket's peer address
disconnect_2  udp: fast succeed  Unset socket's peer address and autobind local port
disconnect_3  udp: fast fail     Fail with EAGAIN, EADDRNOTAVAIL, or ENOBUFS: there are no ephemeral ports left

15.5.5 Rules

disconnect_4 tcp: fast fail
Fail with EAFNOSUPPORT: address family not supported; EOPNOTSUPP: operation not supported; EALREADY: connection already in progress; or EISCONN: socket already connected

h 〈[ts := ts ⊕ (tid ↦ (Run)_d)]〉
  −− tid·disconnect(fd) −→
h 〈[ts := ts ⊕ (tid ↦ (Ret(FAIL err))_sched_timer)]〉

fd ∈ dom(h.fds) ∧
fid = h.fds[fd] ∧
h.files[fid] = File(FT_Socket(sid), ff) ∧
TCP_PROTO(tcp_sock) = (h.socks[sid]).pr ∧
¬(linux_arch h.arch) ∧
case tcp_sock.st of
  CLOSED → if bsd_arch h.arch then
             if tcp_sock.cb.bsd_cantconnect = T then err = EINVAL
             else err = EAFNOSUPPORT
           else err = EAFNOSUPPORT ‖
  LISTEN → if windows_arch h.arch then err = EAFNOSUPPORT (* socket is listening *)
           else if bsd_arch h.arch then err = EOPNOTSUPP
           else ASSERTION_FAILURE "disconnect_4:1" ‖ (* never happen *)
  SYN_SENT → err = EALREADY ‖ (* connection already in progress *)
  SYN_RECEIVED → err = EALREADY ‖ (* connection already in progress *)
  ESTABLISHED → err = EISCONN ‖ (* socket already connected *)
  TIME_WAIT → if windows_arch h.arch then err = EISCONN
              else if bsd_arch h.arch then err = EAFNOSUPPORT
              else ASSERTION_FAILURE "disconnect_4:2" ‖ (* never happen *)
  _ → err = EISCONN (* all other states *)

Description
Consider a TCP socket sid referenced by fd on a non-Linux architecture.
From thread tid, which is in the Run state, a disconnect(fd) call is made. The call fails with an error err which depends on the state of the socket: if the socket is in the CLOSED state then it fails with EAFNOSUPPORT, except if on FreeBSD its bsd_cantconnect flag is set, in which case it fails with EINVAL; if it is in the LISTEN state the error is EAFNOSUPPORT on WinXP and EOPNOTSUPP on FreeBSD; if it is in the SYN_SENT or SYN_RECEIVED state the error is EALREADY; if it is in the ESTABLISHED state the error is EISCONN; if it is in the TIME_WAIT state the error is EISCONN on WinXP and EAFNOSUPPORT on FreeBSD; in all other states the error is EISCONN.
A tid·disconnect(fd) transition is made, leaving the thread state Ret(FAIL err) where err is one of the above errors.

Variations
Linux: This rule does not apply.

disconnect_5 tcp: fast fail
Succeed on Linux, possibly dropping the connection

h 〈[ts := ts ⊕ (tid ↦ (Run)_d); socks := socks ⊕ [(sid, sock)]; oq := oq]〉
  −− tid·disconnect(fd) −→
h 〈[ts := ts ⊕ (tid ↦ (Ret(OK()))_sched_timer); socks := socks ⊕ [(sid, sock′)]; oq := oq′]〉

linux_arch h.arch ∧
fd ∈ dom(h.fds) ∧
fid = h.fds[fd] ∧
h.files[fid] = File(FT_Socket(sid), ff) ∧
TCP_PROTO(tcp_sock) = sock.pr ∧
(if tcp_sock.st ∈ {SYN_RECEIVED; ESTABLISHED; FIN_WAIT_1; FIN_WAIT_2; CLOSE_WAIT} then
   tcp_drop_and_close h.arch ∗ sock (sock′, outsegs) ∧
   enqueue_and_ignore_fail h.arch h.rttab h.ifds outsegs oq oq′
 else
   sock = sock′ ∧ oq = oq′)

Description
On Linux, consider a TCP socket sid, referenced by fd. From thread tid, which is in the Run state, a disconnect(fd) call is made and succeeds.
A tid·disconnect(fd) transition is made, leaving the thread state Ret(OK()).
If the socket is in the SYN_RECEIVED, ESTABLISHED, FIN_WAIT_1, FIN_WAIT_2, or CLOSE_WAIT state then the connection is dropped and a RST segment, outsegs, is constructed, which may be placed on the host's outqueue, oq, resulting in new outqueue oq′. If the socket is in any other state then it remains unchanged, as does the host's outqueue.

Model details
Note that disconnect() has not been properly implemented on Linux yet, so it will always succeed.

Variations
Posix: This rule does not apply.
FreeBSD: This rule does not apply.
WinXP: This rule does not apply.

disconnect_1 udp: fast succeed
Unset socket's peer address

h 〈[ts := ts ⊕ (tid ↦ (Run)_d);
    socks := socks ⊕ [(sid, Sock(↑fid, sf, is1, ↑p1, is2, ps2, es, cantsndmore, cantrcvmore, UDP_PROTO(udp)))]]〉
  −− tid·disconnect(fd) −→
h 〈[ts := ts ⊕ (tid ↦ (Ret(ret))_sched_timer);
    socks := socks ⊕ [(sid, Sock(↑fid, sf, ∗, ↑p1, ∗, ∗, es, cantsndmore, cantrcvmore, UDP_PROTO(udp)))]]〉

fd ∈ dom(h.fds) ∧
fid = h.fds[fd] ∧
h.files[fid] = File(FT_Socket(sid), ff) ∧
(if linux_arch h.arch then ret = OK()
 else if windows_arch h.arch ∧ ∃i2′. is2 = ↑i2′ then ret = OK()
 else ret = FAIL EAFNOSUPPORT)

Description
Consider a UDP socket sid referenced by fd with (is1, ↑p1, is2, ps2) as its binding quad. From thread tid, which is in the Run state, a disconnect(fd) call is made. On Linux the call succeeds; on WinXP if the socket had its peer IP address set then the call succeeds, otherwise it fails with an EAFNOSUPPORT error; on FreeBSD the call fails with an EAFNOSUPPORT error.
A tid·disconnect(fd) transition is made, leaving the thread state Ret(OK()) or Ret(FAIL EAFNOSUPPORT). The socket has its peer address set to (∗, ∗), and its local IP address set to ∗. The local port, p1, is left in place.

Variations
FreeBSD: As above: the call fails with an EAFNOSUPPORT error.
Linux: As above: the call succeeds.
WinXP: As above: the call succeeds if the socket had a peer IP address set, or fails with an EAFNOSUPPORT error otherwise.

disconnect_2 udp: fast succeed
Unset socket's peer address and autobind local port

h0
  −− tid·disconnect(fd) −→
h0 〈[ts := ts ⊕ (tid ↦ (Ret(ret))_sched_timer);
     socks := socks ⊕ [(sid, Sock(↑fid, sf, ∗, ↑p1, ∗, ∗, es, cantsndmore, cantrcvmore, UDP_PROTO(udp)))];
     bound := sid :: h0.bound]〉

h0 = h 〈[ts := ts ⊕ (tid ↦ (Run)_d);
         socks := socks ⊕ [(sid, Sock(↑fid, sf, ∗, ∗, ∗, ∗, es, cantsndmore, cantrcvmore, UDP_PROTO(udp)))]]〉 ∧
fd ∈ dom(h.fds) ∧
fid = h.fds[fd] ∧
h.files[fid] = File(FT_Socket(sid), ff) ∧
p1 ∈ autobind(∗, PROTO_UDP, h0.socks) ∧
(if linux_arch h.arch then ret = OK() else ret = (FAIL EAFNOSUPPORT))

Description
Consider a UDP socket sid referenced by fd and with binding quad (∗, ∗, ∗, ∗). From thread tid, which is in the Run state, a disconnect(fd) call is made. The call succeeds on Linux and fails with an EAFNOSUPPORT error on FreeBSD and WinXP.
A tid·disconnect(fd) transition is made, leaving the thread either in state Ret(OK()) or in state Ret(FAIL EAFNOSUPPORT). The socket is autobound to a local ephemeral port p1, and sid is placed at the head of the host's list of bound sockets.

Variations
FreeBSD: As above: the call fails with an EAFNOSUPPORT error.
Linux: As above: the call succeeds.
WinXP: As above: the call fails with an EAFNOSUPPORT error.
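The per-architecture results of disconnect_1 and disconnect_2 can be summarised in a short model. This is a sketch only: the UdpSock record, the architecture strings, and the explicit set of free ephemeral ports (standing in for autobind) are assumptions, and the model picks the smallest free port where the specification allows any.

```python
from dataclasses import dataclass, replace
from typing import Optional, Set, Tuple


@dataclass(frozen=True)
class UdpSock:
    """Hypothetical UDP socket record holding only the binding quad."""
    is1: Optional[str]
    ps1: Optional[int]
    is2: Optional[str]
    ps2: Optional[int]


def disconnect_udp(sock: UdpSock, arch: str,
                   free_ports: Set[int]) -> Optional[Tuple[str, UdpSock]]:
    """Return (result, sock') per disconnect_1 / disconnect_2, or None if
    neither rule applies (for example when no ephemeral port is free)."""
    if sock.ps1 is not None:
        # disconnect_1: local port already bound.
        if arch == "Linux" or (arch == "WinXP" and sock.is2 is not None):
            ret = "OK"
        else:
            ret = "EAFNOSUPPORT"
        # The peer address and local IP are unset even when an error is returned.
        return ret, replace(sock, is1=None, is2=None, ps2=None)
    if (sock.is1, sock.is2, sock.ps2) != (None, None, None) or not free_ports:
        return None
    # disconnect_2: fully wildcard socket is autobound to an ephemeral port.
    ret = "OK" if arch == "Linux" else "EAFNOSUPPORT"
    return ret, replace(sock, ps1=min(free_ports))
```

Note that in both rules the socket is modified regardless of whether the call reports success, so a caller cannot infer from EAFNOSUPPORT that nothing happened.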
disconnect_3 udp: fast fail
Fail with EAGAIN, EADDRNOTAVAIL, or ENOBUFS: there are no ephemeral ports left

h0
  −− tid·disconnect(fd) −→
h0 〈[ts := ts ⊕ (tid ↦ (Ret(FAIL e))_sched_timer)]〉

h0 = h 〈[ts := ts ⊕ (tid ↦ (Run)_d);
         socks := socks ⊕ [(sid, Sock(↑fid, sf, ∗, ∗, ∗, ∗, es, cantsndmore, cantrcvmore, UDP_PROTO(udp)))]]〉 ∧
fd ∈ dom(h.fds) ∧
fid = h.fds[fd] ∧
h.files[fid] = File(FT_Socket(sid), ff) ∧
autobind(∗, PROTO_UDP, h0.socks) = ∅ ∧
e ∈ {EAGAIN; EADDRNOTAVAIL; ENOBUFS}

Description
Consider a UDP socket sid referenced by fd and with binding quad (∗, ∗, ∗, ∗). From thread tid, which is in the Run state, a disconnect(fd) call is made. There are no ephemeral ports left, so the socket cannot be autobound to a local port. The call fails with an error: EAGAIN, EADDRNOTAVAIL, or ENOBUFS.
A tid·disconnect(fd) transition is made, leaving the thread state Ret(FAIL e) where e is one of the above errors.

15.6 dup() (TCP and UDP)

dup : fd → fd

A call to dup(fd) creates and returns a new file descriptor referring to the open file description referred to by the file descriptor fd. A successful dup() call will return the least numbered free file descriptor. The call will only fail if there are no more free file descriptors, or fd is not a valid file descriptor.

15.6.1 Errors
A call to dup() can fail with the errors below, in which case the corresponding exception is raised:
EMFILE  There are no more file descriptors available.
EBADF  The file descriptor passed is not a valid file descriptor.

15.6.2 Common cases
dup_1; return_1

15.6.3 API
Posix:   int dup(int fildes);
FreeBSD: int dup(int oldd);
Linux:   int dup(int oldfd);

In the Posix interface:
• fildes is a file descriptor referring to the open file description for which another file descriptor is to be created. This corresponds to the fd argument of the model dup().
• The returned int is either non-negative to indicate success or -1 to indicate an error, in which case the error code is in errno. If the call is successful then the returned int is the new file descriptor corresponding to the fd return type of the model dup().

The FreeBSD and Linux interfaces are similar. This call does not exist on WinXP.

15.6.4 Summary
dup_1  all: fast succeed  Successfully duplicate file descriptor
dup_2  all: fast fail     Fail with EMFILE: no more file descriptors available

15.6.5 Rules

dup_1 all: fast succeed
Successfully duplicate file descriptor

h 〈[ts := ts ⊕ (tid ↦ (Run)_d); fds := fds]〉
  −− tid·dup(fd) −→
h 〈[ts := ts ⊕ (tid ↦ (Ret(OK fd′))_sched_timer); fds := fds′]〉

unix_arch h.arch ∧
fd ∈ dom(h.fds) ∧
fid = h.fds[fd] ∧
nextfd h.arch fds fd′ ∧
fd′ < OPEN_MAX_FD ∧
fds′ = fds ⊕ (fd′, fid)

Description
From thread tid, which is in the Run state, a dup(fd) call is made where fd is a file descriptor referring to an open file description identified by fid. A new file descriptor, fd′, can be created in an architecture-specific way according to the nextfd (p??) function. fd′ is less than the maximum open file descriptor, OPEN_MAX_FD. The call succeeds, returning fd′.
A tid·dup(fd) transition is made, leaving the thread state Ret(OK fd′). The host's finite map of file descriptors, fds, is extended to map the new file descriptor fd′ to the file identifier fid, which results in a new finite map of file descriptors fds′ for the host.

Variations
WinXP: This rule does not apply: there is no dup() call on WinXP.
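The allocation step in dup_1 can be sketched with a dictionary standing in for the host's finite map fds. This is a minimal illustration that assumes nextfd picks the least unused descriptor, as the introduction to dup() states; the function name and the open_max_fd parameter are hypothetical.

```python
from itertools import count
from typing import Dict, Optional, Tuple


def dup_1(fds: Dict[int, str], fd: int,
          open_max_fd: int) -> Optional[Tuple[int, Dict[int, str]]]:
    """Return (fd', fds') if dup_1 applies, else None.

    fds maps file descriptors to file identifiers; open_max_fd plays the
    role of OPEN_MAX_FD.
    """
    if fd not in fds:
        return None
    # Assumption: nextfd chooses the least descriptor not yet in use.
    new_fd = next(n for n in count() if n not in fds)
    if new_fd >= open_max_fd:
        return None
    # Both descriptors now refer to the same open file description.
    return new_fd, {**fds, new_fd: fds[fd]}


# Descriptor 2 is free, so duplicating descriptor 3 yields 2.
new_fd, new_fds = dup_1({0: "fid_a", 1: "fid_b", 3: "fid_c"}, 3, open_max_fd=10)
```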
dup_2 all: fast fail
Fail with EMFILE: no more file descriptors available

h 〈[ts := ts ⊕ (tid ↦ (Run)_d)]〉
  −− tid·dup(fd) −→
h 〈[ts := ts ⊕ (tid ↦ (Ret(FAIL EMFILE))_sched_timer)]〉

unix_arch h.arch ∧
fd ∈ dom(h.fds) ∧
(card(dom(h.fds)) + 1) ≥ OPEN_MAX

Description
From thread tid, which is in the Run state, a dup(fd) call is made where fd is a valid file descriptor: it has an entry in the host's finite map of file descriptors, h.fds. Creating another file descriptor would cause the number of open file descriptors to be greater than or equal to the maximum number of open file descriptors, OPEN_MAX. The call fails with an EMFILE error.
A tid·dup(fd) transition is made, leaving the thread state Ret(FAIL EMFILE).

Variations
WinXP: This rule does not apply: there is no dup() call on WinXP.

15.7 dupfd() (TCP and UDP)

dupfd : fd ∗ int → fd

A call to dupfd(fd, n) creates and returns a new file descriptor referring to the open file description referred to by the file descriptor fd. A successful dupfd() call will return the least free file descriptor greater than or equal to n. The call will fail if n is negative or greater than the maximum allowed file descriptor, OPEN_MAX; if the file descriptor fd is not a valid file descriptor; or if there are no more file descriptors available.

15.7.1 Errors
A call to dupfd() can fail with the errors below, in which case the corresponding exception is raised:
EINVAL  The requested file descriptor is invalid: it is negative or greater than the maximum allowed.
EMFILE  There are no more file descriptors available.
EBADF  The file descriptor passed is not a valid file descriptor.
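The search for "the least free file descriptor greater than or equal to n", together with the failure cases listed above, can be sketched as follows. This is an illustrative model only: the function name and the single open_max limit (used for both OPEN_MAX and OPEN_MAX_FD) are simplifying assumptions, and errors are returned as strings rather than raised.

```python
from typing import Dict, Union


def dupfd(fds: Dict[int, str], fd: int, n: int, open_max: int,
          arch: str = "Linux") -> Union[int, str]:
    """Return the new descriptor, or the name of the error the call fails with."""
    if n < 0 or n >= open_max:
        # FreeBSD reports EBADF here; other Unix architectures report EINVAL.
        return "EBADF" if arch == "FreeBSD" else "EINVAL"
    if fd not in fds:
        return "EBADF"
    # Least free descriptor in the range [n, open_max).
    for candidate in range(n, open_max):
        if candidate not in fds:
            return candidate
    return "EMFILE"
```

For example, with descriptors 0, 1, 2 and 5 in use and a limit of 8, requesting n = 1 gives 3 and requesting n = 5 gives 6. On a real Unix host the corresponding call is fcntl() with the F_DUPFD command.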
15.7.2 Common cases dupfd 1 ; return 1 15.7.3 API dupfd() is Posix fcntl() using the F_DUPFD command: Posix: int fcntl(int fildes, int cmd, int arg); FreeBSD: int fcntl(int fd, int cmd, int arg); Linux: int fcntl(int fd, int cmd, long arg); Rule version: $Id: TCP1 hostLTSScript.sml,v 1.961 2005/03/18 10:34:36 kw217 Exp $ dupfd 1 169 In the Posix interface: • fildes is a file descriptor referring to the open file description for which another file descriptor is to be created for. This corresponds to the fd argument of the model dupfd(). • cmd is the command to run on the specified file descriptor. For the model dupfd() this command is set to F_DUPFD. • The returned int is either non-negative to indicate success or -1 to indicate an error, in which case the error code is in errno. If the call was successful then the returned int is the new file descriptor. The FreeBSD and Linux interfaces are similar. This call does not exist on WinXP. 15.7.4 Model details Note that dupfd() is fcntl() with F_DUPFD rather than the similar but different dup2(). 15.7.5 Summary dupfd 1 all: fast succeed Successfully create a duplicate file descriptor greater than or equal to n dupfd 3 all: fast fail Fail with EINVAL: n is negative or greater than the maxi- mum allowed file descriptor dupfd 4 all: fast fail Fail with EMFILE: no more file descriptors available 15.7.6 Rules dupfd 1 all: fast succeed Successfully create a duplicate file descriptor greater than or equal to n h 〈[ts := ts ⊕ (tid 7→ (Run)d); fds := fds]〉 tid ·dupfd(fd ,n)−−−−−−−−−−−−→ h 〈[ts := ts ⊕ (tid 7→ (Ret(OK fd ′)) sched timer ); fds := fds ′]〉 unix arch h.arch ∧ fd ∈ dom(fds) ∧ fid = fds[fd ] ∧ n ≥ 0 ∧ FD(num n) < OPEN MAX FD∧ fd ′ = FD(least n ′.num n ≤ n ′ ∧ FD n ′ < OPEN MAX FD∧FD n ′ /∈ dom(fds)) ∧ fds ′ = fds ⊕ (fd ′,fid) Description From thread tid , which is in the Run state, a dupfd(fd ,n) call is made. 
The host’s finite map of file descriptors is fds, and fd is a valid file descriptor in fds, referring to an open file description identified by fid. n is non-negative. A file descriptor fd′ can be created: it is the least free file descriptor greater than or equal to n and less than the maximum allowed file descriptor, OPEN_MAX_FD. The call succeeds, returning this new file descriptor fd′. A tid·dupfd(fd, n) transition is made, leaving the thread state Ret(OK fd′). An entry mapping fd′ to the open file description fid is added to fds, resulting in a new finite map of file descriptors for the host, fds′.

Variations
WinXP  This rule does not apply: there is no dupfd() call on WinXP.

dupfd_3 all: fast fail Fail with EINVAL: n is negative or greater than the maximum allowed file descriptor

h 〈[ts := ts ⊕ (tid ↦ (Run)_d)]〉
−− tid·dupfd(fd, n) −−→
h 〈[ts := ts ⊕ (tid ↦ (Ret(FAIL err))_sched_timer)]〉

unix_arch h.arch ∧
n < 0 ∨ num n ≥ OPEN_MAX ∧
err = (if bsd_arch h.arch then EBADF else EINVAL)

Description
From thread tid, which is in the Run state, a dupfd(fd, n) call is made. n is either negative or greater than the maximum number of open file descriptors, OPEN_MAX. The call fails with an EINVAL error. A tid·dupfd(fd, n) transition is made, leaving the thread state Ret(FAIL EINVAL).

Variations
WinXP  This rule does not apply: there is no dupfd() call on WinXP.
FreeBSD  On BSD the error EBADF is returned.

dupfd_4 all: fast fail Fail with EMFILE: no more file descriptors available

h 〈[ts := ts ⊕ (tid ↦ (Run)_d)]〉
−− tid·dupfd(fd, n) −−→
h 〈[ts := ts ⊕ (tid ↦ (Ret(FAIL EMFILE))_sched_timer)]〉

unix_arch h.arch ∧
fd ∈ dom(h.fds) ∧
fid = h.fds[fd] ∧
n ≥ 0 ∧
fd′ = FD(least n′. num n ≤ n′ ∧ OPEN_MAX_FD ≤ FD n′ ∧ FD n′ ∉ dom(h.fds))

Description
From thread tid, which is in the Run state, a dupfd(fd, n) call is made.
fd is a file descriptor referring to open file description fid and n is non-negative. The least file descriptor fd ′ that is greater than or equal to n is greater than or equal to the maximum open file descriptor, OPEN MAX FD. The call fails with an EMFILE error. A tid ·dupfd(fd ,n) transition is made, leaving the thread state Ret(FAIL EMFILE). Variations WinXP This rule does not apply: there is no dupfd() call on WinXP. 15.8 getfileflags() (TCP and UDP) getfileflags : fd→ filebflag list A call to getfileflags(fd) returns a list of the file flags currently set for the file which fd refers to. The possible file flags are: • O ASYNC Reports whether signal driven I/O is enabled. Rule version: $Id: TCP1 hostLTSScript.sml,v 1.961 2005/03/18 10:34:36 kw217 Exp $ getfileflags 1 171 • O NONBLOCK Reports whether a socket is non-blocking. 15.8.1 Errors A call to getfileflags() can fail with the error below, in which case the corresponding exception is raised: EBADF The file descriptor passed is not a valid file descriptor. 15.8.2 Common cases A call to getfileflags() is made, returning the flags set: getfileflags 1 ; return 1 15.8.3 API getfileflags() is Posix fcntl(fd,F_GETFL). On WinXP it is ioctlsocket() with the FIONBIO command. Posix: int fcntl(int fildes, int cmd, ...); FreeBSD: int fcntl(int fd, int cmd, ...); Linux: int fcntl(int fd, int cmd); WinXP: int ioctlsocket(SOCKET s, long cmd, u_long* argp) In the Posix interface: • fildes is a file descriptor for the file to retrieve flags from. It corresponds to the fd argument of the model getfileflags(). On WinXP the s is a socket descriptor corresponding to the fd argument of the model getfileflags(). • cmd is a command to perform an operation on the file. This is set to F_GETFL for the model getfileflags(). On WinXP, cmd is set to FIONBIO to get the O NONBLOCK flag; there is no O ASYNC flag on WinXP. • The call takes a variable number of arguments. 
For the model getfileflags() only the two arguments described above are needed. • If the call succeeds the returned int represents the file flags that are set corresponding to the filebflag list return type of the model getfileflags(). If the returned int is -1 then an error has occurred in which case the error code is in errno. On WinXP an error is indicated by a return value of SOCKET_ERROR with the actual error code available through a call to WSAGetLastError(). 15.8.4 Model details The following errors are not modelled: • WSAEINPROGRESS is WinXP-specific and described in the MSDN page as ”A blocking Windows Sockets 1.1 call is in progress, or the service provider is still processing a callback function”. This is not modelled here. • WSAENOTSOCK is a possible error on WinXP as the ioctlsocket() call is specific to a socket. In the model the getfileflags() call is performed on a file. 15.8.5 Summary getfileflags 1 all: fast succeed Return list of file flags currently set for an open file descrip- tion 15.8.6 Rules Rule version: $Id: TCP1 hostLTSScript.sml,v 1.961 2005/03/18 10:34:36 kw217 Exp $ getifaddrs() (TCP and UDP) 172 getfileflags 1 all: fast succeed Return list of file flags currently set for an open file description h 〈[ts := ts ⊕ (tid 7→ (Run)d)]〉 tid ·getfileflags(fd)−−−−−−−−−−−−−−→ h 〈[ts := ts ⊕ (tid 7→ (Ret(OK flags))sched timer)]〉 fd ∈ dom(h.fds) ∧ fid = h.fds[fd ] ∧ h.files[fid ] = File(ft ,ff ) ∧ flags ∈ ORDERINGS ff .b Description From thread tid , which is in the Run state, a getfileflags(fd) call is made. fd refers to a file description File(ft ,ff ) where ff is the file flags that are set. The call succeeds, returning flags which is a list representing some ordering of the boolean file flags ff .b in ff . A tid ·getfileflags(fd) transition is made, leaving the thread state Ret(OK(flags)). 15.9 getifaddrs() (TCP and UDP) getifaddrs : unit→ (ifid ∗ ip ∗ ip list ∗ netmask)list A call to getifaddrs() returns the interface information for a host. 
For each interface a tuple is constructed consisting of: the interface name, the primary IP address for the interface, the auxiliary IP addresses for the interface, and the subnet mask for the interface. A list is constructed with one tuple for each interface, and this is the return value of the call to getifaddrs().

15.9.1 Errors

EINTR  The system was interrupted by a caught signal.
EBADF  The file descriptor passed is not a valid file descriptor.

15.9.2 Common cases

getifaddrs_1; return_1

15.9.3 API

getifaddrs() is two calls to Posix ioctl(): one with the SIOCGIFCONF request and one with the SIOCGIFNETMASK request. On FreeBSD there is a specific getifaddrs() call. On WinXP the getifaddrs() call does not exist.

Posix: int ioctl(int fildes, int request, ... /* arg */);
FreeBSD: int getifaddrs(struct ifaddrs **ifap);
Linux: int ioctl(int d, int request, ...);

In the Posix interface:
• fildes is a file descriptor. There is no corresponding argument in the model getifaddrs().
• request is the operation to perform on the file. When request is SIOCGIFCONF the list of all interfaces is returned; when it is SIOCGIFNETMASK the subnet mask is returned for an interface.
• The function takes a variable number of arguments. When request is SIOCGIFCONF the third argument is a pointer to a location in which to store a linked list of the interfaces; when request is SIOCGIFNETMASK the third argument is a pointer to a structure that names the interface and is filled in with the subnet mask for that interface.
• The returned int is either 0 to indicate success or -1 to indicate an error, in which case the error code is in errno.
Rule version: $Id: TCP1 hostLTSScript.sml,v 1.961 2005/03/18 10:34:36 kw217 Exp $ getpeername() (TCP and UDP) 173 To construct the return value of type (ifid ∗ ip∗ ip list∗netmask)list, the interface name and the IP addresses associated with it are obtained from the call to ioctl() using SIOCGIFCONF, and then the subnet mask for each interface is obtained from a call to ioctl() using SIOCGIFNETMASK. On FreeBSD the ifap argument to getifaddrs() is a pointer to a location to store a linked list of the interface information in, corresponding to the return type of the model getifaddrs(). 15.9.4 Model details Any of the errors possible when making an ioctl() call are possible: EIO, ENOTTY, ENXIO, and ENODEV. None of these are modelled. Note that the Posix interface admits the possibility that the interfaces will change between the two calls, whereas in the model interface the getifaddrs() call is atomic. 15.9.5 Summary getifaddrs 1 all: fast succeed Successfully return host interface information 15.9.6 Rules getifaddrs 1 all: fast succeed Successfully return host interface information h ts := ts ⊕ (tid 7→ (Run)d) tid ·getifaddrs()−−−−−−−−−−−−→ h ts := ts ⊕ (tid 7→ (Ret(OK iflist))sched timer) ifidlist ∈ ORDERINGS ifidset ∧ length ifidlist = length iflist ∧ ifidset = {(ifid , hifd) | ifid ∈ dom(h.ifds) ∧ hifd = h.ifds[ifid ]} ∧ every I(map2(λ(ifid , hifd)(ifid ′, primary , ipslist ,netmask).(ifid ′ = ifid ∧ primary = hifd .primary ∧ ipslist ∈ ORDERINGS hifd .ipset ∧ netmask = hifd .netmask)) ifidlist iflist) Description On a Unix architecture, from thread tid , which is in the Run state, a getifaddrs() call is made. The call succeeds, returning iflist which is a list of tuples: one for each interface on the host. Each tuple consists of: the interface name; the primary IP address for the interface; a list of the other IP addresses for the interface; and the netmask for the interface. A tid ·getifaddrs() transition is made, leaving the thread state Ret(OKiflist). 
Variations WinXP This call does not exist on WinXP. 15.10 getpeername() (TCP and UDP) getpeername : fd→ (ip ∗ port) Rule version: $Id: TCP1 hostLTSScript.sml,v 1.961 2005/03/18 10:34:36 kw217 Exp $ getpeername() (TCP and UDP) 174 A call to getpeername(fd) returns the peer address of the socket referred to by file descriptor fd. If the file descriptor refers to a socket sock then a successful call will return (i2, p2) where sock .is2 = ↑ i2, and sock .ps2 = ↑ p2. 15.10.1 Errors A call to getpeername() can fail with the errors below, in which case the corresponding exception is raised: ENOTCONN Socket not connected to a peer. EBADF The file descriptor passed is not a valid file descriptor. ENOTSOCK The file descriptor passed does not refer to a socket. 15.10.2 Common cases getpeername 1 ; return 1 15.10.3 API Posix: int getpeername(int socket, struct sockaddr *restrict address, socklen_t *restrict address_len); FreeBSD: int getpeername(int s, struct sockaddr *name, socklen_t *namelen); Linux: int getpeername(int s, struct sockaddr *name, socklen_t *namelen); WinXP: int getpeername(SOCKET s,struct sockaddr* name, int* namelen); In the Posix interface: • socket is a file descriptor referring to the socket to get the peer address of, corresponding to the fd argument in the model getpeername(). • address is a pointer to a sockaddr structure of length address_len, which contains the peer address of the socket upon return. These two correspond to the (ip ∗ port) return type of the model getpeername(). The sin_addr.s_addr field of the address structure holds the peer IP address, corresponding to the ip in the return tuple; the sin_port field of the address structure holds the peer port, corresponding to the port in the return tuple. • the returned int is either 0 to indicate success or -1 to indicate an error, in which case the error code is in errno. 
On WinXP an error is indicated by a return value of SOCKET_ERROR, not -1, with the actual error code available through a call to WSAGetLastError(). 15.10.4 Model details The following errors are not modelled: • According to the FreeBSD man page for getpeername(), ECONNRESET can be returned if the con- nection has been reset by the peer. This behaviour has not been observed in any tests. • On FreeBSD, Linux, and WinXP, EFAULT can be returned if the name parameter points to memory not in a valid part of the process address space. This is an artefact of the C interface to getpeername() that is excluded by the clean interface used in the model getpeername(). • In Posix, EINVAL can be returned if the socket has been shutdown; none of the implementations in the model return this error from a getpeername() call. • In Posix, EOPNOTSUPP is returned if the getpeername() operation is not supported by the protocol. Both TCP and UDP support this operation. Rule version: $Id: TCP1 hostLTSScript.sml,v 1.961 2005/03/18 10:34:36 kw217 Exp $ getpeername 1 175 • WSAEINPROGRESS is WinXP-specific and described in the MSDN page as ”A blocking Windows Sockets 1.1 call is in progress, or the service provider is still processing a callback function”. This is not modelled here. 
15.10.5 Summary getpeername 1 all: fast succeed Successfully return socket’s peer address getpeername 2 all: fast fail Fail with ENOTCONN: socket not connected to a peer 15.10.6 Rules getpeername 1 all: fast succeed Successfully return socket’s peer address h 〈[ts := ts ⊕ (tid 7→ (Run)d)]〉 tid ·getpeername(fd)−−−−−−−−−−−−−−−−→ h 〈[ts := ts ⊕ (tid 7→ (Ret(OK(i2, p2)))sched timer)]〉 fd ∈ dom(h.fds) ∧ fid = h.fds[fd ] ∧ h.files[fid ] = File(FT Socket(sid),ff ) ∧ sock = h.socks[sid ] ∧ sock .is2 = ↑ i2 ∧ (sock .ps2 = ↑ p2 ∨ (windows arch h.arch ∧ sock .ps2 = ∗ ∧ (p2 = Port 0) ∧ proto of sock .pr = PROTO UDP)) ∧ ((∀tcp sock .sock .pr = TCP PROTO(tcp sock) =⇒ tcp sock .st ∈ {ESTABLISHED;CLOSE WAIT;LAST ACK; FIN WAIT 1;CLOSING} ∨ (¬sock .cantrcvmore ∧ tcp sock .st = FIN WAIT 2) ∨ (linux arch h.arch ∧ tcp sock .st = SYN RECEIVED) ∨ (* BSD listen bug *) (bsd arch h.arch ∧ tcp sock .st = LISTEN)) ∨ windows arch h.arch) Description From thread tid , which is in the Run state, a getpeername(fd) call is made. fd refers to a socket sock , identified by sid , which has its peer IP address set to ↑i2 and its peer port address set to ↑ p2. If sock is a TCP socket then either it is in state ESTABLISHED, CLOSE WAIT, LAST ACK, FIN WAIT 1, or CLOSING; or it is in state FIN WAIT 2 and is not shutdown for reading. The call succeeds, returning (i2, p2), the socket’s peer address. A tid ·getpeername(fd) transition is made, leaving the thread state Ret(OK(i2, p2)). Variations FreeBSD If sock is a TCP socket then it may be in state LISTEN; this is due to the FreeBSD bug that allows listen() to be called on a synchronised socket. Linux If sock is a TCP socket then it may also be in state SYN RECEIVED. WinXP If sock is a UDP socket and has no peer port set, sock .ps2 = ∗ then the call may still succeed with p2 = Port 0. Additionally, if sock is a TCP socket then it may be in any state. 
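The side condition of getpeername_1, including its architecture-specific variations, can be paraphrased as a pure function. The sketch below is an informal reading only: the function name, the string encodings of architectures, protocols and TCP states, and the use of None for the unset value ∗ are assumptions. It is also a deterministic simplification, since every case not accepted by getpeername_1 is reported as ENOTCONN.

```python
CONNECTED_STATES = {"ESTABLISHED", "CLOSE_WAIT", "LAST_ACK",
                    "FIN_WAIT_1", "CLOSING"}


def model_getpeername(arch, proto, is2, ps2, st=None, cantrcvmore=False):
    """Return ("OK", (i2, p2)) or ("FAIL", "ENOTCONN").

    arch is "linux", "bsd" or "windows"; proto is "TCP" or "UDP".
    is2 and ps2 are the peer IP and port; None stands for the unset value.
    """
    if is2 is None:
        return ("FAIL", "ENOTCONN")
    if ps2 is None:
        # Only a WinXP UDP socket may succeed without a peer port (Port 0).
        if not (arch == "windows" and proto == "UDP"):
            return ("FAIL", "ENOTCONN")
        p2 = 0
    else:
        p2 = ps2
    # On WinXP the TCP state is not examined.
    if proto == "TCP" and arch != "windows":
        ok = (st in CONNECTED_STATES
              or (st == "FIN_WAIT_2" and not cantrcvmore)
              or (arch == "linux" and st == "SYN_RECEIVED")
              or (arch == "bsd" and st == "LISTEN"))
        if not ok:
            return ("FAIL", "ENOTCONN")
    return ("OK", (is2, p2))
```

For example, under these encodings a Linux TCP socket in SYN_RECEIVED with both peer fields set yields a successful result, whereas the same socket on FreeBSD does not.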
getpeername_2 all: fast fail Fail with ENOTCONN: socket not connected to a peer

h 〈[ts := ts ⊕ (tid ↦ (Run)_d)]〉
−− tid·getpeername(fd) −−→
h 〈[ts := ts ⊕ (tid ↦ (Ret(FAIL ENOTCONN))_sched_timer)]〉

fd ∈ dom(h.fds) ∧
fid = h.fds[fd] ∧
h.files[fid] = File(FT_Socket(sid), ff) ∧
sock = h.socks[sid] ∧
¬(sock.is2 ≠ ∗ ∧
  (sock.ps2 ≠ ∗ ∨ (windows_arch h.arch ∧ proto_of sock.pr = PROTO_UDP)) ∧
  (∀tcp_sock. sock.pr = TCP_PROTO(tcp_sock) ⟹
    tcp_sock.st ∈ {ESTABLISHED; CLOSE_WAIT; LAST_ACK; FIN_WAIT_1; CLOSING} ∨
    (¬sock.cantrcvmore ∧ tcp_sock.st = FIN_WAIT_2) ∨
    (linux_arch h.arch ∧ tcp_sock.st = SYN_RECEIVED) ∨
    windows_arch h.arch))

Description
From thread tid, which is in the Run state, a getpeername(fd) call is made where fd refers to a socket sock identified by sid. Either the socket does not have both its peer IP address and port set, or it is a TCP socket that is neither in state ESTABLISHED, CLOSE_WAIT, LAST_ACK, FIN_WAIT_1, or CLOSING, nor in state FIN_WAIT_2 and not shut down for reading. The call fails with an ENOTCONN error. A tid·getpeername(fd) transition is made, leaving the thread state Ret(FAIL ENOTCONN).

Variations
Linux  As above, with the additional condition that if sock is a TCP socket then it is not in state SYN_RECEIVED.
WinXP  As above, except that if sock is a TCP socket then it does not matter what state it is in, and if it is a UDP socket then it does not matter whether its peer port is set or unset.

15.11 getsockbopt() (TCP and UDP)

getsockbopt : (fd ∗ sockbflag) → bool

A call to getsockbopt(fd,flag) returns the value of one of the socket’s boolean-valued flags. The fd argument is a file descriptor referring to the socket to retrieve a flag’s value from, and the flag argument is the boolean-valued socket flag to get.
Possible flags are: • SO BSDCOMPAT Reports whether the BSD semantics for delivery of ICMPs to UDP sockets with no peer address set is enabled. • SO DONTROUTE Reports whether outgoing messages bypass the standard routing facilities. • SO KEEPALIVE Reports whether connections are kept active with periodic transmission of messages, if this is supported by the protocol. • SO OOBINLINE Reports whether the socket leaves received out-of-band data (data marked urgent) inline. Rule version: $Id: TCP1 hostLTSScript.sml,v 1.961 2005/03/18 10:34:36 kw217 Exp $ getsockbopt() (TCP and UDP) 177 • SO REUSEADDR Reports whether the rules used in validating addresses supplied to bind() should allow reuse of local ports, if this is supported by the protocol. The return value of the getsockbopt() call is the boolean-value of the specified socket flag. 15.11.1 Errors A call to getsockbopt() can fail with the errors below, in which case the corresponding exception is raised: ENOPROTOOPT The specified flag is not supported by the protocol. EBADF The file descriptor passed is not a valid file descriptor. ENOTSOCK The file descriptor passed does not refer to a socket. 15.11.2 Common cases getsockbopt 1 ; return 1 15.11.3 API getsockbopt() is Posix getsockopt() for boolean-valued socket flags. Posix: int getsockopt(int socket, int level, int option_name, void *restrict option_value, socklen_t *restrict option_len); FreeBSD: int getsockopt(int s, int level, int optname, void *optval, socklen_t *optlen); Linux: int getsockopt(int s, int level, int optname, void *optval, socklen_t *optlen); WinXP: int getsockopt(SOCKET s,int level,int optname, char* optval, int* optlen); In the Posix interface: • socket is the file descriptor of the socket on which to get the flag, corresponding to the fd argument of the model getsockbopt(). • level is the protocol level at which the flag resides: SOL_SOCKET for the socket level options, and option_name is the flag to be retrieved. 
These two correspond to the flag argument to the model getsockbopt() where the possible values of option_name are limited to: SO BSDCOMPAT, SO DONTROUTE, SO KEEPALIVE, SO OOBINLINE, and SO REUSEADDR. • option_value is a pointer to a location of size option_len to store the value retrieved by getsockopt(). These two correspond to the bool return type of the model getsockbopt(). • the returned int is either 0 to indicate success or -1 to indicate an error, in which case the error code is in errno. On WinXP an error is indicated by a return value of SOCKET_ERROR, not -1, with the actual error code available through a call to WSAGetLastError(). 15.11.4 Model details The following errors are not modelled: • EFAULT signifies the pointer passed as option_value was inaccessible. On WinXP, the error WSAEFAULT may also signify that the optlen parameter was too small. • EINVAL signifies the option_name was invalid at the specified socket level. In the model, typing prevents an invalid flag from being specified in a call to getsockbopt(). • WSAEINPROGRESS is WinXP-specific and described in the MSDN page as ”A blocking Windows Sockets 1.1 call is in progress, or the service provider is still processing a callback function”. This is not modelled here. 
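The correspondence described in 15.11.3 between the integer stored through option_value and the bool result of the model call can be illustrated with Python's standard socket module, whose getsockopt() method wraps the C call. The helper below is a hypothetical wrapper named after the model call; it is a sketch under the assumption that any nonzero option value means the flag is set.

```python
import socket


def getsockbopt(sock, flag):
    """Return the boolean value of a SOL_SOCKET-level flag on sock."""
    # Without a buffer length, getsockopt() returns the option as an integer;
    # any nonzero value is treated here as "flag is set".
    return bool(sock.getsockopt(socket.SOL_SOCKET, flag))
```

A caller would pass a constant such as socket.SO_REUSEADDR or socket.SO_KEEPALIVE as flag. Errors reported by the underlying call surface as OSError exceptions rather than as a -1 return value.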
Rule version: $Id: TCP1 hostLTSScript.sml,v 1.961 2005/03/18 10:34:36 kw217 Exp $ getsockbopt 2 178 15.11.5 Summary getsockbopt 1 all: fast succeed Successfully retrieve value of boolean socket flag getsockbopt 2 udp: fast succeed Fail with ENOPROTOOPT: option not valid on WinXP UDP socket 15.11.6 Rules getsockbopt 1 all: fast succeed Successfully retrieve value of boolean socket flag h 〈[ts := ts ⊕ (tid 7→ (Run)d)]〉 tid ·getsockbopt(fd , f )−−−−−−−−−−−−−−−−−→ h 〈[ts := ts ⊕ (tid 7→ (Ret(OK(sf .b(f ))))sched timer)]〉 fd ∈ dom(h.fds) ∧ fid = h.fds[fd ] ∧ h.files[fid ] = File(FT Socket(sid),ff ) ∧ sf = (h.socks[sid ]).sf ∧ (windows arch h.arch ∧ proto of(h.socks[sid ]).pr = PROTO UDP =⇒ f /∈ {SO KEEPALIVE;SO OOBINLINE}) Description From thread tid , which is in the Run state, a getsockbopt(fd , f ) call is made. fd refers to a socket sid with boolean socket flags sf .b, and f is a boolean socket flag. The call succeeds, returning the value of f : T if f is set, and F if f is not set in sf .b. A tid ·getsockbopt(fd , f ) transition is made, leaving the thread state Ret(OK(sf .b(f ))) where sf .b(f ) is the boolean value of the socket’s flag f . Variations WinXP As above, except that if sid is a UDP socket, then f cannot be SO KEEPALIVE or SO OOBINLINE. getsockbopt 2 udp: fast succeed Fail with ENOPROTOOPT: option not valid on WinXP UDP socket h 〈[ts := ts ⊕ (tid 7→ (Run)d); socks := socks ⊕ [(sid , sock 〈[pr :=UDP PROTO(udp)]〉)]]〉 tid ·getsockbopt(fd , f )−−−−−−−−−−−−−−−−−→ h 〈[ts := ts ⊕ (tid 7→ (Ret(FAIL ENOPROTOOPT))sched timer); socks := socks ⊕ [(sid , sock 〈[pr :=UDP PROTO(udp)]〉)]]〉 windows arch h.arch ∧ fd ∈ dom(h.fds) ∧ fid = h.fds[fd ] ∧ h.files[fid ] = File(FT Socket(sid),ff ) ∧ f ∈ {SO KEEPALIVE;SO OOBINLINE} Description Rule version: $Id: TCP1 hostLTSScript.sml,v 1.961 2005/03/18 10:34:36 kw217 Exp $ getsockerr() (TCP and UDP) 179 On WinXP, consider a UDP socket sid referenced by fd . 
From thread tid , which is in the Run state, a getsockbopt(fd , f ) call is made, where f is either SO KEEPALIVE or SO OOBINLINE. The call fails with an ENOPROTOOPT error. A tid ·getsockbopt(fd , f ) transition is made, leaving the thread state Ret(FAIL ENOPROTOOPT). Variations FreeBSD This rule does not apply. Linux This rule does not apply. 15.12 getsockerr() (TCP and UDP) getsockerr : fd→ unit A call getsockerr(fd) returns the pending error of a socket, clearing it, if there is one. fd is a file descriptor referring to a socket. If the socket has a pending error then the getsockerr() call will fail with that error, otherwise it will return successfully. 15.12.1 Errors In addition to failing with the pending error, a call to getsockerr() can fail with the errors below, in which case the corresponding exception is raised: EBADF The file descriptor passed is not a valid file descriptor. ENOTSOCK The file descriptor passed does not refer to a socket. 15.12.2 Common cases getsockerr 1 ; return 1 getsockerr 2 ; return 1 15.12.3 API getsockerr() is Posix getsockopt() for the SO_ERROR socket option. Posix: int getsockopt(int socket, int level, int option_name, void *restrict option_value, socklen_t *restrict option_len); FreeBSD: int getsockopt(int s, int level, int optname, void *optval, socklen_t *optlen); Linux: int getsockopt(int s, int level, int optname, void *optval, socklen_t *optlen); WinXP: int getsockopt(SOCKET s,int level,int optname, char* optval, int* optlen); In the Posix interface: • socket is the file descriptor of the socket to get the option on, corresponding to the fd argument of the model getsockerr(). • level is the protocol level at which the option resides: SOL_SOCKET for the socket level options, and option_name is the option to be retrieved. For getsockerr() option_name is set to SO_ERROR. • option_value is a pointer to a location of size option_len to store the value retrieved by getsockopt(). 
When option_name is SO_ERROR these fields are not used. Rule version: $Id: TCP1 hostLTSScript.sml,v 1.961 2005/03/18 10:34:36 kw217 Exp $ getsockerr 2 180 • the returned int is either 0 to indicate the socket has no pending error or -1 to indicate a pending error, in which case the error code is in errno. On WinXP an error is indicated by a return value of SOCKET_ERROR, not -1, with the actual error code available through a call to WSAGetLastError(). 15.12.4 Model details The following errors are not modelled: • EFAULT signifies the pointer passed as option_value was inaccessible. On WinXP, the error WSAEFAULT may also signify that the optlen parameter was too small. • EINVAL signifies the option_name was invalid at the specified socket level. In the model, the flag for getsockerr() is always SO_ERROR so this error cannot occur. • WSAEINPROGRESS is WinXP-specific and described in the MSDN page as ”A blocking Windows Sockets 1.1 call is in progress, or the service provider is still processing a callback function”. This is not modelled here. 15.12.5 Summary getsockerr 1 all: fast succeed Return successfully: no pending error getsockerr 2 all: fast fail Fail with pending error and clear the error 15.12.6 Rules getsockerr 1 all: fast succeed Return successfully: no pending error h 〈[ts := ts ⊕ (tid 7→ (Run)d)]〉 tid ·getsockerr(fd)−−−−−−−−−−−−−−→ h 〈[ts := ts ⊕ (tid 7→ (Ret(OK()))sched timer)]〉 fd ∈ dom(h.fds) ∧ fid = h.fds[fd ] ∧ h.files[fid ] = File(FT Socket(sid),ff ) ∧ (h.socks[sid ]).es = ∗ Description From thread tid , which is in the Run state, a getsockerr(fd) call is made. fd refers to a socket sid which has no pending errors. The call succeeds. A tid ·getsockerr(fd) transition is made, leaving the thread state Ret(OK()). 
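The two getsockerr rules listed in the summary can be read together as a small state transformer on the socket's pending-error field es. The sketch below is an informal model: the function name, the dictionary representation of a socket, and the use of None for the unset value ∗ are assumptions made for illustration.

```python
def model_getsockerr(sock):
    """Return (result, new_sock) for a socket record with an "es" field.

    sock["es"] is None when there is no pending error, otherwise an error name.
    """
    pending = sock["es"]
    if pending is None:
        # No pending error: succeed and leave the socket unchanged.
        return ("OK", None), sock
    # Pending error: fail with it and clear it from the socket.
    cleared = dict(sock, es=None)
    return ("FAIL", pending), cleared
```

Calling the function twice on a socket with a pending error therefore fails once and then succeeds, because the first call clears the error.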
getsockerr_2 all: fast fail Fail with pending error and clear the error

h 〈[ts := ts ⊕ (tid ↦ (Run)_d); socks := socks ⊕ [(sid, sock)]]〉
−− tid·getsockerr(fd) −−→
h 〈[ts := ts ⊕ (tid ↦ (Ret(FAIL e))_sched_timer); socks := socks ⊕ [(sid, sock′)]]〉

fd ∈ dom(h.fds) ∧
fid = h.fds[fd] ∧
h.files[fid] = File(FT_Socket(sid), ff) ∧
↑e = sock.es ∧
sock′ = sock 〈[es := ∗]〉

Description
From thread tid, which is in the Run state, a getsockerr(fd) call is made. fd refers to a socket sid which has pending error e. The call fails, returning e. A tid·getsockerr(fd) transition is made, leaving the thread state Ret(FAIL e) and clearing the error e from the socket.

15.13 getsocklistening() (TCP and UDP)

getsocklistening : fd → bool

A call to getsocklistening(fd) returns T if the socket referenced by fd is listening, or F otherwise. For TCP a socket is listening if it is in the LISTEN state. For UDP, which is not a connection-oriented protocol, a socket can never be listening.

15.13.1 Errors

A call to getsocklistening() can fail with the errors below, in which case the corresponding exception is raised:

ENOPROTOOPT  FreeBSD does not support this socket option, and on Linux and WinXP this option is not supported for UDP sockets.
EBADF  The file descriptor passed is not a valid file descriptor.
ENOTSOCK  The file descriptor passed does not refer to a socket.

15.13.2 Common cases

getsocklistening_1; return_1

15.13.3 API

getsocklistening() is Posix getsockopt() for the SO_ACCEPTCONN socket option.
Posix: int getsockopt(int socket, int level, int option_name, void *restrict option_value, socklen_t *restrict option_len); FreeBSD: int getsockopt(int s, int level, int optname, void *optval, socklen_t *optlen); Linux: int getsockopt(int s, int level, int optname, void *optval, socklen_t *optlen); WinXP: int getsockopt(SOCKET s,int level,int optname, char* optval, int* optlen); In the Posix interface: • socket is the file descriptor of the socket to get the option on, corresponding to the fd argument of the model getsocklistening(). • level is the protocol level at which the option resides: SOL_SOCKET for the socket level options, and option_name is the option to be retrieved. For getsocklistening() option_name is set to SO_ACCEPTCONN. • option_value is a pointer to a location of size option_len to store the value retrieved by getsockopt(). The value stored in the location corresponds to the bool return value of the model getsocklistening(). • the returned int is either 0 to indicate success or -1 to indicate an error, in which case the error code is in errno. On WinXP an error is indicated by a return value of SOCKET_ERROR, not -1, with the actual error code available through a call to WSAGetLastError(). The Linux and WinXP interfaces are similar except where noted. FreeBSD does not support the SO_ACCEPTCONN socket option. Rule version: $Id: TCP1 hostLTSScript.sml,v 1.961 2005/03/18 10:34:36 kw217 Exp $ getsocklistening 3 182 15.13.4 Model details The following errors are not modelled: • EFAULT signifies the pointer passed as option_value was inaccessible. On WinXP, the error WSAEFAULT may also signify that the optlen parameter was too small. • EINVAL signifies the option_name was invalid at the specified socket level. In the model, the flag for getsocklistening() is always SO_ACCEPTCONN so this error cannot occur. 
• WSAEINPROGRESS is WinXP-specific and described in the MSDN page as "A blocking Windows Sockets 1.1 call is in progress, or the service provider is still processing a callback function". This is not modelled here.

15.13.5 Summary

getsocklistening_1 tcp: fast succeed Return successfully: T if socket is listening, F otherwise
getsocklistening_3 tcp: fast fail Fail with ENOPROTOOPT: on FreeBSD operation not supported
getsocklistening_2 udp: rc Return F or fail with ENOPROTOOPT: a UDP socket cannot be listening

15.13.6 Rules

getsocklistening_1 tcp: fast succeed Return successfully: T if socket is listening, F otherwise

h 〈[ts := ts ⊕ (tid ↦ (Run)_d)]〉
−− tid·getsocklistening(fd) −−→
h 〈[ts := ts ⊕ (tid ↦ (Ret(OK b))_sched_timer)]〉

fd ∈ dom(h.fds) ∧
fid = h.fds[fd] ∧
h.files[fid] = File(FT_Socket(sid), ff) ∧
TCP_PROTO(tcp_sock) = (h.socks[sid]).pr ∧
b = (tcp_sock.st = LISTEN) ∧
¬(bsd_arch h.arch)

Description
From thread tid, which is in the Run state, a getsocklistening(fd) call is made where fd refers to a TCP socket sid. A tid·getsocklistening(fd) transition is made, leaving the thread state Ret(OK b) where b = T if the socket is in the LISTEN state, and b = F otherwise.

Variations
FreeBSD  This rule does not apply: see getsocklistening_3.

getsocklistening_3 tcp: fast fail Fail with ENOPROTOOPT: on FreeBSD operation not supported

h 〈[ts := ts ⊕ (tid ↦ (Run)_d)]〉
−− tid·getsocklistening(fd) −−→
h 〈[ts := ts ⊕ (tid ↦ (Ret(FAIL ENOPROTOOPT))_sched_timer)]〉

bsd_arch h.arch ∧
fd ∈ dom(h.fds) ∧
fid = h.fds[fd] ∧
h.files[fid] = File(FT_Socket(sid), ff) ∧
TCP_PROTO(tcp_sock) = (h.socks[sid]).pr

Description
On FreeBSD, a getsocklistening(fd) call is made from thread tid, which is in the Run state, where fd refers to a TCP socket sid. The call fails with an ENOPROTOOPT error.
A tid·getsocklistening(fd) transition is made, leaving the thread state Ret(FAIL ENOPROTOOPT).

Variations
Linux This rule does not apply: see getsocklistening 1.
WinXP This rule does not apply: see getsocklistening 1.

getsocklistening 2 udp: rc Return F or fail with ENOPROTOOPT: a UDP socket cannot be listening

h 〈[ts := ts ⊕ (tid ↦ (Run)d)]〉
  −−− tid·getsocklistening(fd) −−→
h 〈[ts := ts ⊕ (tid ↦ (Ret(ret))sched timer)]〉

proto of(h.socks[sid]).pr = PROTO UDP ∧
fd ∈ dom(h.fds) ∧
fid = h.fds[fd] ∧
h.files[fid] = File(FT Socket(sid), ff) ∧
if linux arch h.arch then rc = fast succeed ∧ ret = OK F
else rc = fast fail ∧ ret = FAIL ENOPROTOOPT

Description
Consider a UDP socket sid, referenced by fd. From thread tid, which is in the Run state, a getsocklistening(fd) call is made. On Linux the call succeeds, returning F; on FreeBSD and WinXP the call fails with an ENOPROTOOPT error.
A tid·getsocklistening(fd) transition is made, leaving the thread state Ret(OK(F)) on Linux, and Ret(FAIL ENOPROTOOPT) on FreeBSD and WinXP.

Variations
Posix As above: the call fails with an ENOPROTOOPT error.
FreeBSD As above: the call fails with an ENOPROTOOPT error.
Linux As above: the call succeeds, returning F.
WinXP As above: the call fails with an ENOPROTOOPT error.

15.14 getsockname() (TCP and UDP)

getsockname : fd → (ip option ∗ port option)

A call to getsockname(fd) returns the local address pair of a socket. If the file descriptor fd refers to the socket sock then the return value of a successful call will be (sock.is1, sock.ps1).

15.14.1 Errors

A call to getsockname() can fail with the errors below, in which case the corresponding exception is raised:

ECONNRESET On FreeBSD, TCP socket has its cb.bsd cantconnect flag set due to previous connection establishment attempt.
EINVAL Socket not bound to local address on WinXP.
EBADF The file descriptor passed is not a valid file descriptor.
ENOTSOCK The file descriptor passed does not refer to a socket.
ENOBUFS Out of resources.

15.14.2 Common cases

getsockname 1 ; return 1

15.14.3 API

Posix: int getsockname(int socket, struct sockaddr *restrict address, socklen_t *restrict address_len);
FreeBSD: int getsockname(int s, struct sockaddr *name, socklen_t *namelen);
Linux: int getsockname(int s, struct sockaddr *name, socklen_t *namelen);
WinXP: int getsockname(SOCKET s, struct sockaddr* name, int* namelen);

In the Posix interface:

• socket is a file descriptor referring to the socket to get the local address of, corresponding to the fd argument in the model getsockname().
• address is a pointer to a sockaddr structure of length address_len, which contains the local address of the socket upon return. These two correspond to the (ip option, port option) return type of the model getsockname(). If the sin_addr.s_addr field of the address structure is set to 0 on return, then the socket’s local IP address is not set: the ip option member of the return tuple is set to ∗; otherwise, if it is set to i then the socket has local IP address i and so the ip option member of the return tuple is ↑ i. If the sin_port field of the address structure is set to 0 on return then the socket does not have a local port set, corresponding to the port option in the return tuple being ∗; otherwise the sin_port field is set to p, corresponding to the socket having its local port set: the port option in the return tuple is ↑ p.
• the returned int is either 0 to indicate success or -1 to indicate an error, in which case the error code is in errno. On WinXP an error is indicated by a return value of SOCKET_ERROR, not -1, with the actual error code available through a call to WSAGetLastError().
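The correspondence between the returned sockaddr fields and the model's (ip option ∗ port option) pair can be sketched in Python. This is an illustrative sketch only: the function name `model_sockname` is hypothetical and not part of the specification, and Python's `None` stands in for the model's ∗.

```python
from typing import Optional, Tuple


def model_sockname(s_addr: int, sin_port: int) -> Tuple[Optional[int], Optional[int]]:
    """Map sockaddr_in fields to the model's (ip option, port option) pair.

    Hypothetical helper for illustration: None plays the role of '*'
    (not set) and a plain int plays the role of an up-arrow value.
    """
    ip = None if s_addr == 0 else s_addr        # s_addr == 0: no local IP address set
    port = None if sin_port == 0 else sin_port  # sin_port == 0: no local port set
    return (ip, port)
```

For example, a socket bound only to a port would yield `(None, p)`, matching a model return value of (∗, ↑ p).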
15.14.4 Model details

The following errors are not modelled:

• On FreeBSD, Linux, and WinXP, EFAULT can be returned if the name parameter points to memory not in a valid part of the process address space. This is an artefact of the C interface to getsockname() that is excluded by the clean interface used in the model getsockname().
• In Posix, EINVAL can be returned if the socket has been shut down. None of the implementations return EINVAL in this case.
• In Posix, EOPNOTSUPP is returned if the getsockname() operation is not supported by the protocol. Both UDP and TCP support this operation.
• WSAEINPROGRESS is WinXP-specific and described in the MSDN page as "A blocking Windows Sockets 1.1 call is in progress, or the service provider is still processing a callback function". This is not modelled here.

15.14.5 Summary

getsockname 1 all: fast succeed Successfully return socket’s local address
getsockname 2 tcp: fast fail Fail with ECONNRESET: previous connection attempt has failed on FreeBSD
getsockname 3 all: fast fail Fail with EINVAL: socket not bound on WinXP

15.14.6 Rules

getsockname 1 all: fast succeed Successfully return socket’s local address

h 〈[ts := ts ⊕ (tid ↦ (Run)d)]〉
  −−− tid·getsockname(fd) −−→
h 〈[ts := ts ⊕ (tid ↦ (Ret(OK(sock.is1, sock.ps1)))sched timer)]〉

fd ∈ dom(h.fds) ∧
fid = h.fds[fd] ∧
h.files[fid] = File(FT Socket(sid), ff) ∧
sock = h.socks[sid] ∧
(case sock.pr of
   TCP PROTO(tcp sock) → bsd arch h.arch =⇒ ¬(tcp sock.cb.bsd cantconnect = T ∧ sock.ps1 = ∗)
 ‖ UDP PROTO(_) → T) ∧
(windows arch h.arch =⇒ sock.is1 ≠ ∗ ∨ sock.ps1 ≠ ∗)

Description
From thread tid, which is in the Run state, a getsockname(fd) call is made where fd refers to socket sock, identified by sid. The socket’s local address is returned: (sock.is1, sock.ps1).
A tid ·getsockname(fd) transition is made, leaving the thread state Ret(OK(sock .is1, sock .ps1)). Variations FreeBSD This rule does not apply if the socket’s bsd cantconnect flag is set in its control block and its local port is not set. WinXP As above with the additional condition that either the socket’s local IP address or local port must be set. getsockname 2 tcp: fast fail Fail with ECONNRESET: previous connection attempt has failed on FreeBSD h 〈[ts := ts ⊕ (tid 7→ (Run)d); socks := socks ⊕ [(sid , sock)]]〉 Rule version: $Id: TCP1 hostLTSScript.sml,v 1.961 2005/03/18 10:34:36 kw217 Exp $ getsockname 3 186 tid ·getsockname(fd)−−−−−−−−−−−−−−−−→ h 〈[ts := ts ⊕ (tid 7→ (Ret(FAIL ECONNRESET))sched timer); socks := socks ⊕ [(sid , sock)]]〉 bsd arch h.arch ∧ sock .pr = TCP PROTO(tcp sock) ∧ (tcp sock .cb.bsd cantconnect = T ∧ sock .ps1 = ∗) ∧ fd ∈ dom(h.fds) ∧ fid = h.fds[fd ] ∧ h.files[fid ] = File(FT Socket(sid),ff ) Description On FreeBSD, from thread tid , which is in the Run state, a getsockname(fd) call is made where fd refers to a TCP socket sock , identified by sid , which has its bsd cantconnect flag set and is not bound to a local port. A tid ·getsockname(fd) transition is made, leaving the thread state Ret(FAIL ECONNRESET). Variations Linux This rule does not apply. WinXP This rule does not apply. getsockname 3 all: fast fail Fail with EINVAL: socket not bound on WinXP h 〈[ts := ts ⊕ (tid 7→ (Run)d); socks := socks ⊕ [(sid , sock 〈[is1 := ∗; ps1 := ∗]〉)]]〉 tid ·getsockname(fd)−−−−−−−−−−−−−−−−→ h 〈[ts := ts ⊕ (tid 7→ (Ret(FAIL EINVAL))sched timer); socks := socks ⊕ [(sid , sock 〈[is1 := ∗; ps1 := ∗]〉)]]〉 windows arch h.arch ∧ fd ∈ dom(h.fds) ∧ fid = h.fds[fd ] ∧ h.files[fid ] = File(FT Socket(sid),ff ) Description On WinXP, a getsockname(fd) call is made from thread tid which is in the Run state. fd refers to a socket sid which has neither its local IP address nor its local port set. The call fails with an EINVAL error. 
A tid·getsockname(fd) transition is made, leaving the thread state Ret(FAIL EINVAL).

Variations
Posix This rule does not apply.
FreeBSD This rule does not apply.
Linux This rule does not apply.

15.15 getsocknopt() (TCP and UDP)

getsocknopt : (fd ∗ socknflag) → int

A call to getsocknopt(fd, flag) returns the value of one of the socket’s numeric flags.
The fd argument is a file descriptor referring to the socket to retrieve a flag’s value from. The flag argument is a numeric socket flag. Possible flags are:

• SO_RCVBUF Reports receive buffer size information.
• SO_RCVLOWAT Reports the minimum number of bytes to process for socket input operations.
• SO_SNDBUF Reports send buffer size information.
• SO_SNDLOWAT Reports the minimum number of bytes to process for socket output operations.

The return value of the getsocknopt() call is the numeric value of the specified flag.

15.15.1 Errors

A call to getsocknopt() can fail with the errors below, in which case the corresponding exception is raised:

ENOPROTOOPT The specified flag is not supported by the protocol.
EBADF The file descriptor passed is not a valid file descriptor.
ENOTSOCK The file descriptor passed does not refer to a socket.

15.15.2 Common cases

getsocknopt 1 ; return 1

15.15.3 API

getsocknopt() is Posix getsockopt() for numeric socket flags.

Posix: int getsockopt(int socket, int level, int option_name, void *restrict option_value, socklen_t *restrict option_len);
FreeBSD: int getsockopt(int s, int level, int optname, void *optval, socklen_t *optlen);
Linux: int getsockopt(int s, int level, int optname, void *optval, socklen_t *optlen);
WinXP: int getsockopt(SOCKET s,int level,int optname, char* optval, int* optlen);

In the Posix interface:

• socket is the file descriptor of the socket to get the option from, corresponding to the fd argument of the model getsocknopt().
• level is the protocol level at which the option resides: SOL_SOCKET for the socket level options, and option_name is the option to be retrieved. These two correspond to the flag argument to the model getsocknopt() where the possible values of option_name are limited to SO RCVBUF, SO RCVLOWAT, SO SNDBUF and SO SNDLOWAT. • option_value is a pointer to a location of size option_len to store the value retrieved by getsockopt(). They correspond to the int return type of the model getsocknopt(). • the returned int is either 0 to indicate success or -1 to indicate an error, in which case the error code is in errno. On WinXP an error is indicated by a return value of SOCKET_ERROR, not -1, with the actual error code available through a call to WSAGetLastError(). Rule version: $Id: TCP1 hostLTSScript.sml,v 1.961 2005/03/18 10:34:36 kw217 Exp $ getsocknopt 4 188 15.15.4 Model details The following errors are not modelled: • EFAULT signifies the pointer passed as option_value was inaccessible. On WinXP, the error WSAEFAULT may also signify that the optlen parameter was too small. • EINVAL signifies the option_name was invalid at the specified socket level. In the model, typing prevents an invalid flag from being specified in a call to getsocknopt(). • WSAEINPROGRESS is WinXP-specific and described in the MSDN page as ”A blocking Windows Sockets 1.1 call is in progress, or the service provider is still processing a callback function”. This is not modelled here. 
15.15.5 Summary

getsocknopt 1 all: fast succeed Successfully retrieve value of a numeric socket flag
getsocknopt 4 all: fast fail Fail with ENOPROTOOPT: value of SO_RCVLOWAT and SO_SNDLOWAT not retrievable

15.15.6 Rules

getsocknopt 1 all: fast succeed Successfully retrieve value of a numeric socket flag

h 〈[ts := ts ⊕ (tid ↦ (Run)d)]〉
  −−− tid·getsocknopt(fd, f) −−→
h 〈[ts := ts ⊕ (tid ↦ (Ret(OK(int of num(sf.n(f)))))sched timer)]〉

fd ∈ dom(h.fds) ∧
fid = h.fds[fd] ∧
h.files[fid] = File(FT Socket(sid), ff) ∧
sf = (h.socks[sid]).sf ∧
(windows arch h.arch =⇒ f ∉ {SO_RCVLOWAT; SO_SNDLOWAT})

Description
Consider the socket sid, referenced by fd, with socket flags sf. From thread tid, which is in the Run state, a getsocknopt(fd, f) call is made. f is a numeric socket flag whose value is to be returned. The call succeeds, returning sf.n(f), the numeric value of flag f for socket sid.
A tid·getsocknopt(fd, f) transition is made, leaving the thread state Ret(OK(int of num(sf.n(f)))).

Variations
WinXP The flag f is not SO_RCVLOWAT or SO_SNDLOWAT.

getsocknopt 4 all: fast fail Fail with ENOPROTOOPT: value of SO_RCVLOWAT and SO_SNDLOWAT not retrievable

h 〈[ts := ts ⊕ (tid ↦ (Run)d)]〉
  −−− tid·getsocknopt(fd, f) −−→
h 〈[ts := ts ⊕ (tid ↦ (Ret(FAIL ENOPROTOOPT))sched timer)]〉

windows arch h.arch ∧
f ∈ {SO_RCVLOWAT; SO_SNDLOWAT}

Description
On WinXP, from thread tid, which is in the Run state, a getsocknopt(fd, f) call is made where fd is a file descriptor. f is a numeric socket flag: either SO_RCVLOWAT or SO_SNDLOWAT, both of which are flags whose value is non-retrievable. The call fails with an ENOPROTOOPT error.
A tid·getsocknopt(fd, f) transition is made, leaving the thread state Ret(FAIL ENOPROTOOPT).

Variations
FreeBSD This rule does not apply.
Linux This rule does not apply.
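The case split between the two getsocknopt rules can be sketched as a small Python function. This is a simplified illustration, not the specification itself: the name `getsocknopt_model`, the string encoding of architectures and flags, and the dictionary standing in for sf.n are all assumptions, and the sketch presumes fd already refers to a valid socket.

```python
from typing import Dict, Tuple, Union

# Flags whose value the rules above treat as non-retrievable on WinXP.
NON_RETRIEVABLE_ON_WINXP = {"SO_RCVLOWAT", "SO_SNDLOWAT"}


def getsocknopt_model(arch: str, numeric_flags: Dict[str, int],
                      flag: str) -> Tuple[str, Union[int, str]]:
    """Return ("OK", value) or ("FAIL", error) for a numeric socket flag.

    arch is one of "FreeBSD", "Linux", "WinXP" (hypothetical encoding);
    numeric_flags is a stand-in for the socket's sf.n mapping.
    """
    if arch == "WinXP" and flag in NON_RETRIEVABLE_ON_WINXP:
        return ("FAIL", "ENOPROTOOPT")      # corresponds to getsocknopt 4
    return ("OK", numeric_flags[flag])      # corresponds to getsocknopt 1
```

The two branches are mutually exclusive, mirroring the side conditions of the rules: on WinXP the success rule excludes exactly the flags that the failure rule requires.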
15.16 getsocktopt() (TCP and UDP)

getsocktopt : (fd ∗ socktflag) → (int ∗ int) option

A call to getsocktopt(fd, flag) returns the value of one of the socket’s time-option flags.
The fd argument is a file descriptor referring to the socket to retrieve a flag’s value from. The flag argument is a time-option socket flag. Possible flags are:

• SO_RCVTIMEO Reports the timeout value for input operations.
• SO_SNDTIMEO Reports the timeout value specifying the amount of time that an output function blocks because flow control prevents data from being sent.

The return value of the getsocktopt() call is the time value of the specified flag. A return value of ∗ means the timeout is disabled. A return value of ↑(s,ns) means the timeout value is s seconds and ns nanoseconds.

15.16.1 Errors

A call to getsocktopt() can fail with the errors below, in which case the corresponding exception is raised:

ENOPROTOOPT The specified flag is not supported by the protocol.
EBADF The file descriptor passed is not a valid file descriptor.
ENOTSOCK The file descriptor passed does not refer to a socket.

15.16.2 Common cases

getsocktopt 1 ; return 1

15.16.3 API

getsocktopt() is Posix getsockopt() for time-valued socket options.

Posix: int getsockopt(int socket, int level, int option_name, void *restrict option_value, socklen_t *restrict option_len);
FreeBSD: int getsockopt(int s, int level, int optname, void *optval, socklen_t *optlen);
Linux: int getsockopt(int s, int level, int optname, void *optval, socklen_t *optlen);
WinXP: int getsockopt(SOCKET s,int level,int optname, char* optval, int* optlen);

In the Posix interface:

• socket is the file descriptor of the socket to get the option from, corresponding to the fd argument of the model getsocktopt().
• level is the protocol level at which the option resides: SOL_SOCKET for the socket level options, and option_name is the option to be retrieved.
These two correspond to the flag argument to the model getsocktopt() where the possible values of option_name are limited to SO RCVTIMEO and SO SNDTIMEO. • option_value is a pointer to a location of size option_len to store the value retrieved by getsockopt(). They correspond to the (int ∗ int) option return type of the model getsocktopt(). • the returned int is either 0 to indicate success or -1 to indicate an error, in which case the error code is in errno. On WinXP an error is indicated by a return value of SOCKET_ERROR, not -1, with the actual error code available through a call to WSAGetLastError(). 15.16.4 Model details The following errors are not modelled: • EFAULT signifies the pointer passed as option_value was inaccessible. On WinXP, the error WSAEFAULT may also signify that the optlen parameter was too small. • EINVAL signifies the option_name was invalid at the specified socket level. In the model, typing prevents an invalid flag from being specified in a call to getsocktopt(). • WSAEINPROGRESS is WinXP-specific and described in the MSDN page as ”A blocking Windows Sockets 1.1 call is in progress, or the service provider is still processing a callback function”. This is not modelled here. 
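The return convention of getsocktopt() (∗ for a disabled timeout, ↑(s,ns) otherwise) can be illustrated with a short Python sketch that converts a C-level timeval into that shape. This is an assumption-laden illustration, not the model's own conversion function: the name `timeval_to_timeopt` is hypothetical, and it assumes the Posix convention that an all-zero timeval for SO_RCVTIMEO or SO_SNDTIMEO means the operation does not time out.

```python
from typing import Optional, Tuple


def timeval_to_timeopt(tv_sec: int, tv_usec: int) -> Optional[Tuple[int, int]]:
    """Convert a timeval (seconds, microseconds) to an (s, ns) option.

    Hypothetical helper: None plays the role of '*' (timeout disabled).
    Assumes a zero timeval means "no timeout", as in Posix.
    """
    if tv_sec == 0 and tv_usec == 0:
        return None                      # timeout disabled
    return (tv_sec, tv_usec * 1000)      # microseconds to nanoseconds
```

A timeval of 2 seconds and 500000 microseconds would thus be reported as (2, 500000000).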
15.16.5 Summary getsocktopt 1 all: fast succeed Successfully retrieve value of time-option socket flag getsocktopt 4 all: fast fail Fail with ENOPROTOOPT: on WinXP SO LINGER not retrievable for UDP sockets 15.16.6 Rules getsocktopt 1 all: fast succeed Successfully retrieve value of time-option socket flag h 〈[ts := ts ⊕ (tid 7→ (Run)d)]〉 tid ·getsocktopt(fd , f )−−−−−−−−−−−−−−−−→ h 〈[ts := ts ⊕ (tid 7→ (Ret(OK t))sched timer)]〉 fd ∈ dom(h.fds) ∧ fid = h.fds[fd ] ∧ h.files[fid ] = File(FT Socket(sid),ff ) ∧ sf = (h.socks[sid ]).sf ∧ t = tltimeopt of time(sf .t(f )) ∧ Rule version: $Id: TCP1 hostLTSScript.sml,v 1.961 2005/03/18 10:34:36 kw217 Exp $ listen() (TCP only) 191 ¬(windows arch h.arch ∧ proto of(h.socks[sid ]).pr = PROTO UDP ∧ f = SO LINGER) Description From thread tid , which is in the Run state, a getsocktopt(fd , f ) call is made. fd is a file descriptor referring to the socket sid which has socket flags sf , and f is a time-option flag. The call succeeds, returning OK(t) where t is the value of the socket’s flag f . A tid ·getsocktopt(fd , f ) transition is made, leaving the thread state Ret(OKt). Model details The return type is (int∗ int) option, but the type of a time-option socket flag is time. The auxiliary function tltimeopt of time is used to do the conversion. Variations WinXP As above but in addition if fd refers to a UDP socket then the flag is not SO LINGER. getsocktopt 4 all: fast fail Fail with ENOPROTOOPT: on WinXP SO LINGER not retrievable for UDP sockets h 〈[ts := ts ⊕ (tid 7→ (Run)d)]〉 tid ·getsocktopt(fd , f )−−−−−−−−−−−−−−−−→ h 〈[ts := ts ⊕ (tid 7→ (Ret(FAIL ENOPROTOOPT))sched timer)]〉 windows arch h.arch ∧ fd ∈ dom(h.fds) ∧ fid = h.fds[fd ] ∧ h.files[fid ] = File(FT Socket(sid),ff ) ∧ proto of(h.socks[sid ]).pr = PROTO UDP ∧ f = SO LINGER Description On WinXP, from thread tid which is in the Run state, a getsocktopt(fd , f ) call is made. fd is a file descriptor referring to a UDP socket sid and f is the socket flag SO LINGER. 
The flag f is not retrievable so the call fails with an ENOPROTOOPT error.
A tid·getsocktopt(fd, f) transition is made, leaving the thread state Ret(FAIL ENOPROTOOPT).

Variations
FreeBSD This rule does not apply.
Linux This rule does not apply.

15.17 listen() (TCP only)

listen : fd ∗ int → unit

A call to listen(fd, n) puts a TCP socket that is in the CLOSED state into the LISTEN state, making it a passive socket, so that incoming connections for the socket will be accepted by the host and placed on its listen queue.
Here fd is a file descriptor referring to the socket to put into the LISTEN state and n is the backlog used to calculate the maximum lengths of the two components of the socket’s listen queue: its pending connections queue, lis.q0, and its complete connection queue, lis.q. The details of this calculation vary between architectures. The maximum useful value of n is SOMAXCONN: if n is greater than this then it will be truncated without generating an error. The minimum value of n is 0: if it is a negative integer then it will be set to 0.
Once a socket is in the LISTEN state, listen() can be called again to change the backlog value.

15.17.1 Errors

A call to listen() can fail with the errors below, in which case the corresponding exception is raised:

EADDRINUSE Another socket is listening on this local port.
EINVAL On FreeBSD the socket has been shut down for writing; on Linux the socket is not in the CLOSED or LISTEN state; or on WinXP the socket is not bound.
EISCONN On WinXP the socket is already connected: it is not in the CLOSED or LISTEN state.
EOPNOTSUPP The listen() operation is not supported for UDP.
EBADF The file descriptor passed is not a valid file descriptor.
ENOTSOCK The file descriptor passed does not refer to a socket.
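The treatment of out-of-range backlog values described in the listen() overview (values above SOMAXCONN are truncated, negative values become 0) can be sketched in Python as follows. The function name `effective_backlog` and the value 128 for SOMAXCONN are assumptions for illustration only; the real constant is platform-specific, and the architecture-dependent calculation of the queue lengths from the backlog is not shown.

```python
SOMAXCONN = 128  # assumed value for illustration; the real value is platform-specific


def effective_backlog(n: int, somaxconn: int = SOMAXCONN) -> int:
    """Clamp a requested backlog into the range [0, somaxconn].

    Hypothetical helper: negative values are raised to 0 and values
    above somaxconn are truncated, in both cases without an error.
    """
    return max(0, min(n, somaxconn))
```

With the assumed limit, `effective_backlog(1000)` gives 128 and `effective_backlog(-5)` gives 0; neither case raises an error.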
15.17.2 Common cases

A TCP socket is created, has its local address and port set by bind(), and then is put into the LISTEN state, in which it can accept new incoming connections:

socket 1 ; return 1 ; bind 1 ; return 1 ; listen 1 ; return 1 ; . . .

15.17.3 API

Posix: int listen(int socket, int backlog);
FreeBSD: int listen(int s, int backlog);
Linux: int listen(int s, int backlog);
WinXP: int listen(SOCKET s, int backlog);

In the Posix interface:

• socket is a file descriptor referring to the socket to put into the LISTEN state, corresponding to the fd argument of the model listen().
• backlog is an int on which the maximum permitted length of the socket’s listen queue depends. It corresponds to the n argument of the model listen().
• the returned int is either 0 to indicate success or -1 to indicate an error, in which case the error code is in errno. On WinXP an error is indicated by a return value of SOCKET_ERROR, not -1, with the actual error code available through a call to WSAGetLastError().

15.17.4 Model details

The following errors are not modelled:

• In Posix, EACCES may be returned if the calling process does not have the appropriate privileges. This is not modelled here.
• In Posix, EDESTADDRREQ shall be returned if the socket is not bound to a local address and the protocol does not support listening on an unbound socket. WinXP returns an EINVAL error in this case; FreeBSD and Linux autobind the socket if listen() is called on an unbound socket.
• WSAEINPROGRESS is WinXP-specific and described in the MSDN page as "A blocking Windows Sockets 1.1 call is in progress, or the service provider is still processing a callback function". This is not modelled here.
15.17.5 Summary listen 1 tcp: fast succeed Successfully put socket in LISTEN state listen 1b tcp: fast succeed Successfully update backlog value listen 1c tcp: fast succeed Successfully put socket in the LISTEN state from any non- {CLOSED;LISTEN} state on FreeBSD listen 2 tcp: fast fail Fail with EINVAL on WinXP: socket not bound to local port listen 3 tcp: fast fail Fail with EINVAL on Linux or EISCONN on WinXP: socket not in CLOSED or LISTEN state listen 4 tcp: fast fail Fail with EADDRINUSE on Linux: another socket already listening on local port listen 5 tcp: fast fail Fail with EINVAL on BSD: socket shutdown for writing or bsd cantconnect flag set listen 7 udp: fast fail Fail with EOPNOTSUPP: listen() called on UDP socket 15.17.6 Rules listen 1 tcp: fast succeed Successfully put socket in LISTEN state h 〈[ts := ts ⊕ (tid 7→ (Run)d); socks := socks ⊕ [(sid ,Sock(↑ fid , sf , is1, ps1, is2, ps2, es,F, cantrcvmore, TCP Sock(CLOSED, cb, ∗, [ ], ∗, [ ], ∗,NO OOBDATA)))]; listen := listen0]〉 tid ·listen(fd ,n)−−−−−−−−−−−−→ h 〈[ts := ts ⊕ (tid 7→ (Ret(OK()))sched timer); socks := socks ⊕ [(sid ,Sock(↑ fid , sf , is1, ↑ p1, is2, ps2, es,F, cantrcvmore, TCP Sock(LISTEN, cb, ↑ lis, [ ], ∗, [ ], ∗,NO OOBDATA)))]; listen := sid :: listen0; bound := bound ]〉 fd ∈ dom(h.fds) ∧ fid = h.fds[fd ] ∧ h.files[fid ] = File(FT Socket(sid),ff ) ∧ (bsd arch h.arch ∨ cantrcvmore = F) ∧ ¬(windows arch h.arch ∧ IS NONE ps1) ∧ (bsd arch h.arch =⇒ cb.bsd cantconnect = F) ∧ p1 ∈ autobind(ps1,PROTO TCP, socks\\sid) ∧ (if ps1 = ∗ then bound = sid :: h.bound else bound = h.bound) ∧ lis =〈[ q0 :=[ ]; q :=[ ]; qlimit :=n]〉 Description From thread tid , which is currently in the Run state, a listen(fd ,n) call is made. fd is a file descriptor referring to a TCP socket identified by sid which is not shutdown for writing, is in the CLOSED state, has an empty send and receive queue, and does not have its send or receive urgent pointers set. The host’s list of listening sockets is listen0. 
Either the socket is bound to a local port p1, or it can be autobound to a local port p1. Rule version: $Id: TCP1 hostLTSScript.sml,v 1.961 2005/03/18 10:34:36 kw217 Exp $ listen 1c 194 The call succeeds: a tid ·listen(fd ,n) transition is made, leaving the thread in state Ret(OK()). The socket is put in the LISTEN state, with an empty listen queue, lis, with n as its backlog. sid is added to the host’s list of listening sockets, listen := sid :: listen0, and if autobinding occurred, it is also added to the host’s list of bound sockets, h.bound , to create a new list bound . Variations FreeBSD The bsd cantconnect flag in the control block must not be set to T (from an earlier connection establishment attempt). WinXP As above, except that the socket must be bound to a local port p1. If it is not bound then autobinding will not occur: the call will fail with an EINVAL error. See also listen 2 (p195). listen 1b tcp: fast succeed Successfully update backlog value h 〈[ts := ts ⊕ (tid 7→ (Run)d); socks := socks ⊕ [(sid ,Sock(↑ fid , sf , is1, ps1, is2, ps2, es,F, cantrcvmore, TCP Sock(LISTEN, cb, ↑ lis, [ ], ∗, [ ], ∗,NO OOBDATA)))]; listen := listen0]〉 tid ·listen(fd ,n)−−−−−−−−−−−−→ h 〈[ts := ts ⊕ (tid 7→ (Ret(OK()))sched timer); socks := socks ⊕ [(sid ,Sock(↑ fid , sf , is1, ps1, is2, ps2, es,F, cantrcvmore, TCP Sock(LISTEN, cb, ↑ lis ′, [ ], ∗, [ ], ∗,NO OOBDATA)))]; listen := sid :: listen0]〉 fd ∈ dom(h.fds) ∧ fid = h.fds[fd ] ∧ h.files[fid ] = File(FT Socket(sid),ff ) ∧ (bsd arch h.arch ∨ cantrcvmore = F) ∧ lis ′ = lis 〈[ qlimit :=n]〉 Description From thread tid , which is in the Run state, a listen(fd ,n) call is made. fd refers to a TCP socket identified by sid which is currently in the LISTEN state. The host has a list of listening sockets, listen0. The call succeeds. A tid ·listen(fd ,n) transition is made, leaving the thread state Ret(OK()). 
The backlog value of the socket’s listen queue, lis.qlimit is updated to be n, resulting in a new listen queue lis ′ for the socket. sid is added to the head of the host’s listen queue, listen := sid :: listen0. listen 1c tcp: fast succeed Successfully put socket in the LISTEN state from any non- {CLOSED;LISTEN} state on FreeBSD h 〈[ts := ts ⊕ (tid 7→ (Run)d); socks := socks ⊕ [(sid , sock)]; listen := listen0]〉 tid ·listen(fd ,n)−−−−−−−−−−−−→ h 〈[ts := ts ⊕ (tid 7→ (Ret(OK()))sched timer); socks := socks ⊕ [(sid , sock ′)]; listen := sid :: listen0]〉 bsd arch h.arch ∧ fd ∈ dom(h.fds) ∧ fid = h.fds[fd ] ∧ Rule version: $Id: TCP1 hostLTSScript.sml,v 1.961 2005/03/18 10:34:36 kw217 Exp $ listen 3 195 h.files[fid ] = File(FT Socket(sid),ff ) ∧ sock = Sock(↑ fid , sf , ↑ i1, ↑ p1, ↑ i2, ↑ p2, es, cantsndmore, cantrcvmore,TCP PROTO(tcp sock)) ∧ tcp sock .st /∈ {CLOSED;LISTEN} ∧ sock ′ = sock 〈[ pr :=TCP PROTO(tcp sock 〈[ st :=LISTEN; lis := ↑ lis]〉)]〉 ∧ lis =〈[ q0 :=[ ]; q :=[ ]; qlimit :=n]〉 Description On BSD, calling listen() always succeeds on a socket regardless of its state: the state of the socket is just changed to LISTEN. From thread tid , which is in the Run state, a listen(fd ,n) call is made. fd refers to a TCP socket identified by sid which is currently in any non-{CLOSED;LISTEN} state. The call succeeds. A tid ·listen(fd ,n) transition is made, leaving the thread state Ret(OK()). The socket state is updated to LISTEN, with empty listen queues. listen 2 tcp: fast fail Fail with EINVAL on WinXP: socket not bound to local port h 〈[ts := ts ⊕ (tid 7→ (Run)d)]〉 tid ·listen(fd ,n)−−−−−−−−−−−−→ h 〈[ts := ts ⊕ (tid 7→ (Ret(FAIL EINVAL))sched timer)]〉 windows arch h.arch ∧ fd ∈ dom(h.fds) ∧ fid = h.fds[fd ] ∧ h.files[fid ] = File(FT Socket(sid),ff ) ∧ h.socks[sid ] = sock ∧ proto of sock .pr = PROTO TCP ∧ sock .ps1 = ∗ Description On WinXP, from thread tid , which is in the Run state, a listen(fd ,n) call is made. 
fd refers to a TCP socket sock , identified by sid , which is not bound to a local port: sock .ps1 = ∗. The call fails with an EINVAL error. A tid ·listen(fd ,n) transition is made, leaving the thread state Ret(FAIL EINVAL). Variations FreeBSD This rule does not apply. Linux This rule does not apply. listen 3 tcp: fast fail Fail with EINVAL on Linux or EISCONN on WinXP: socket not in CLOSED or LISTEN state h 〈[ts := ts ⊕ (tid 7→ (Run)d)]〉 tid ·listen(fd ,n)−−−−−−−−−−−−→ h 〈[ts := ts ⊕ (tid 7→ (Ret(FAIL err))sched timer)]〉 fd ∈ dom(h.fds) ∧ fid = h.fds[fd ] ∧ h.files[fid ] = File(FT Socket(sid),ff ) ∧ h.socks[sid ] = sock ∧ sock .pr = TCP PROTO(tcp sock) ∧ tcp sock .st /∈ {CLOSED;LISTEN} ∧ Rule version: $Id: TCP1 hostLTSScript.sml,v 1.961 2005/03/18 10:34:36 kw217 Exp $ listen 4 196 ¬(bsd arch h.arch) ∧ (if windows arch h.arch then err = EISCONN else if linux arch h.arch then err = EINVAL else F) Description From thread tid , which is in the Run state, a listen(fd ,n) call is made. fd refers to a TCP socket sock , identified by sid , which is not in the CLOSED or LISTEN state. On Linux the call fails with an EINVAL error; on WinXP it fails with an EISCONN error. A tid ·listen(fd ,n) transition is made, leaving the thread state Ret(FAIL err) where err is one of the above errors. Variations FreeBSD This rule does not apply: listen() can be called from any state. Linux As above: the call fails with an EINVAL error. WinXP As above: the call fails with an EISCONN error. 
listen 4 tcp: fast fail Fail with EADDRINUSE on Linux: another socket already listening on local port h 〈[ts := ts ⊕ (tid 7→ (Run)d)]〉 tid ·listen(fd ,n)−−−−−−−−−−−−→ h 〈[ts := ts ⊕ (tid 7→ (Ret(FAIL EADDRINUSE))sched timer)]〉 linux arch h.arch ∧ fd ∈ dom(h.fds) ∧ fid = h.fds[fd ] ∧ h.files[fid ] = File(FT Socket(sid),ff ) ∧ h.socks[sid ] = sock ∧ sock .pr = TCP PROTO(tcp sock) ∧ tcp sock .st = CLOSED ∧ sock .ps1 = ↑ p1 ∧ (∃sid ′ sock ′ tcp sock ′.h.socks[sid ′] = sock ′ ∧ sock ′.pr = TCP PROTO(tcp sock ′) ∧ tcp sock ′.st = LISTEN ∧ sock ′.ps1 = sock .ps1 ∧ ¬(∃i1 i ′1.i1 6= i ′1 ∧ sock .is1 = ↑ i1 ∧ sock ′.is1 = ↑ i ′1)) Description On Linux, from thread tid , which is in the Run state, a listen(fd ,n) call is made. fd refers to a TCP socket sock , identified by sid , in state CLOSED and bound to local port p1. There is another TCP socket, sock ′, in the host’s finite map of sockets, h.socks that is also bound to local port p1, and is in the LISTEN state. The two sockets, sock and sock ′, are not bound to different IP addresses: either they are both bound to the same IP address, one is bound to an IP address and the other is not bound to an IP address, or neither is bound to an IP address. The call fails with an EADDRINUSE error. A tid ·listen(fd ,n) transition is made, leaving the thread state Ret(FAIL EADDRINUSE). Variations Rule version: $Id: TCP1 hostLTSScript.sml,v 1.961 2005/03/18 10:34:36 kw217 Exp $ listen 7 197 FreeBSD This rule does not apply. WinXP This rule does not apply. 
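The side condition of listen 4 can be sketched as a Python predicate. This is an illustrative sketch under stated assumptions: the `TcpSock` record and the function `listen_4_applies` are hypothetical, only the fields used by the rule are modelled, and `None` stands in for ∗.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class TcpSock:
    st: str                 # TCP state, e.g. "CLOSED" or "LISTEN"
    is1: Optional[str]      # local IP address, None if unset
    ps1: Optional[int]      # local port, None if unset


def listen_4_applies(arch: str, sock: TcpSock, all_socks: Iterable[TcpSock]) -> bool:
    """True if listen() on sock would fail with EADDRINUSE under listen 4."""
    if arch != "Linux" or sock.st != "CLOSED" or sock.ps1 is None:
        return False
    return any(
        other.st == "LISTEN"
        and other.ps1 == sock.ps1
        # The two sockets must NOT both be bound to different IP addresses.
        and not (sock.is1 is not None
                 and other.is1 is not None
                 and sock.is1 != other.is1)
        for other in all_socks
    )
```

Note that an unbound IP address on either socket counts as a conflict, which is why the inner condition only rules out the case where both addresses are set and differ.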
listen 5 tcp: fast fail Fail with EINVAL on BSD: socket shutdown for writing or bsd cantconnect flag set h 〈[ts := ts ⊕ (tid 7→ (Run)d); socks := socks ⊕ [(sid , sock 〈[cantsndmore := cantsndmore; pr :=TCP PROTO(tcp sock 〈[st := st ]〉)]〉)]]〉 tid ·listen(fd ,n)−−−−−−−−−−−−→ h 〈[ts := ts ⊕ (tid 7→ (Ret(FAIL EINVAL))sched timer); socks := socks ⊕ [(sid , sock 〈[cantsndmore := cantsndmore; pr :=TCP PROTO(tcp sock 〈[st := st ]〉)]〉)]]〉 bsd arch h.arch ∧ fd ∈ dom(h.fds) ∧ fid = h.fds[fd ] ∧ st ∈ {CLOSED;LISTEN} ∧ h.files[fid ] = File(FT Socket(sid),ff ) ∧ (cantsndmore = T ∨ tcp sock .cb.bsd cantconnect = T) Description On FreeBSD, from thread tid , which is in the Run state, a listen(fd ,n) call is made. fd refers to a TCP socket sock , identified by sid , which is in the CLOSED or LISTEN state. The socket is either shutdown for writing or has its bsd cantconnect flag set due to an earlier connection-establishment attempt. The call fails with an EINVAL error. A tid ·listen(fd ,n) transition is made, leaving the thread state Ret(FAIL EINVAL). Variations Linux This rule does not apply. WinXP This rule does not apply. listen 7 udp: fast fail Fail with EOPNOTSUPP: listen() called on UDP socket h 〈[ts := ts ⊕ (tid 7→ (Run)d)]〉 tid ·listen(fd ,n)−−−−−−−−−−−−→ h 〈[ts := ts ⊕ (tid 7→ (Ret(FAIL EOPNOTSUPP))sched timer)]〉 fd ∈ dom(h.fds) ∧ fid = h.fds[fd ] ∧ h.files[fid ] = File(FT Socket(sid),ff ) ∧ proto of(h.socks[sid ]).pr = PROTO UDP Description Consider a UDP socket sid , referenced by fd . From thread tid , which is in the Run state, a listen(fd ,n) call is made. The call fails with an EOPNOTSUPP error. Rule version: $Id: TCP1 hostLTSScript.sml,v 1.961 2005/03/18 10:34:36 kw217 Exp $ pselect() (TCP and UDP) 198 A tid ·listen(fd ,n) transition is made, leaving the thread state Ret(FAIL EOPNOTSUPP). Calling listen() on a socket for a connectionless protocol (such as UDP) is meaningless and is thus an unsupported (EOPNOTSUPP) operation. 
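The way the failure rules listen 2, listen 3, listen 5 and listen 7 partition the cases by architecture and socket state can be summarised in a Python sketch. It is a simplification rather than a restatement of the rules: the function name `listen_failure` and the string encodings are hypothetical, listen 4 is omitted because it depends on the other sockets on the host, and where more than one rule could apply the sketch simply picks one.

```python
from typing import Optional


def listen_failure(arch: str, proto: str, st: str, bound: bool,
                   cantsndmore: bool = False,
                   bsd_cantconnect: bool = False) -> Optional[str]:
    """Return the error for a listen() call, or None if none of the
    failure rules sketched here applies.

    arch is "FreeBSD", "Linux" or "WinXP"; proto is "TCP" or "UDP";
    bound says whether the local port is set (hypothetical encoding).
    """
    if proto == "UDP":
        return "EOPNOTSUPP"              # listen 7: any architecture
    if st not in ("CLOSED", "LISTEN"):
        if arch == "Linux":
            return "EINVAL"              # listen 3
        if arch == "WinXP":
            return "EISCONN"             # listen 3
        return None                      # FreeBSD: listen 1c may succeed
    if arch == "FreeBSD" and (cantsndmore or bsd_cantconnect):
        return "EINVAL"                  # listen 5
    if arch == "WinXP" and not bound:
        return "EINVAL"                  # listen 2
    return None
```

A result of `None` only means that these four rules do not fire; whether one of the success rules applies depends on further conditions not modelled in the sketch.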
15.18 pselect() (TCP and UDP)

pselect : (fd list ∗ fd list ∗ fd list ∗ (int ∗ int) option ∗ signal list option) → (fd list ∗ (fd list ∗ fd list))

A call to pselect(readfds, writefds, exceptfds, timeout, sigmask) waits for one of the file descriptors in readfds to be ready for reading, one of those in writefds to be ready for writing, one of those in exceptfds to have an exceptional condition pending, or for timeout to expire.

The readfds argument is a set of file descriptors to be checked for being ready to read. Broadly, a file descriptor fd is ready for reading if a recv(fd, _, _) call on the socket would not block, i.e. if there is data present or a pending error.

The writefds argument is a set of file descriptors to be checked for being ready to write. Broadly, a file descriptor fd is ready for writing if a send(fd, _, _, _) call would not block.

The exceptfds argument is a set of file descriptors to be checked for pending exception conditions. A file descriptor fd has an exception condition pending if there exists out-of-band data for the socket it refers to or the socket is still at the out-of-band mark.

The timeout argument specifies how long the pselect() call should block waiting for a file descriptor to be ready. If timeout = ∗ then the call should block until one of the file descriptors in readfds, writefds, or exceptfds becomes ready. If timeout = ↑(s, ns) then the call should block for at most s seconds and ns nanoseconds. However, system activity can lengthen the timeout interval by an indeterminate amount.

The sigmask argument is used to set the signal mask, the set of signals to be blocked. In the implementations, if sigmask = ↑(siglist) then pselect() first replaces the current signal mask by siglist before proceeding with the call, and then restores the original signal mask upon return. This specification does not model the dynamic behaviour of signals, however, and so we specify the behaviour of pselect() only for an empty signal mask.
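The readiness notions above can be illustrated with a small experiment. The following is a sketch in Python, not part of the specification: it uses select.select(), which wraps the select() system call and takes no signal-mask argument, and a zero timeout so that the call reports readiness without blocking. The helper name poll_demo is hypothetical, and socketpair() is used only to keep the example self-contained (on most Unix hosts it yields local stream sockets rather than TCP sockets).

```python
import select
import socket


def poll_demo():
    """Poll one end of a connected socket pair before and after data arrives.

    Returns (readable_before, writable_before, readable_after).
    """
    a, b = socket.socketpair()
    try:
        # Zero timeout: report readiness immediately instead of blocking.
        r0, w0, _ = select.select([a], [a], [a], 0)
        b.sendall(b"x")  # one byte is now queued for reading on a
        r1, _, _ = select.select([a], [], [], 0)
        return (a in r0, a in w0, a in r1)
    finally:
        a.close()
        b.close()


if __name__ == "__main__":
    print(poll_demo())
```

Before any data is sent, the descriptor is writable but not readable; once a byte has been queued, a recv() on it would not block and so it is reported as readable.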
A return value of (readfds′, (writefds′, exceptfds′)) from a pselect() call signifies that: the file descriptors in readfds′ are ready for reading; the file descriptors in writefds′ are ready for writing; and the file descriptors in exceptfds′ have exceptional conditions pending.

If a pselect([ ], [ ], [ ], Some(s, ns), sigmask) call is made then the call will block for s seconds and ns nano-seconds or until a signal occurs.

To perform a poll, a pselect(readfds, writefds, exceptfds, Some(0, 0), sigmask) call should be made.

15.18.1 Errors

A call to pselect() can fail with the errors below, in which case the corresponding exception is raised:

EBADF One or more of the file descriptors in a set is not a valid file descriptor.
EINVAL Time-out not well-formed, file descriptor out of range, or on WinXP all file descriptor sets are empty.
ENOTSOCK One or more of the file descriptors in a set is not a valid socket.
EINTR The system was interrupted by a caught signal.

15.18.2 Common cases

pselect() is called and returns immediately: pselect 1 ; return 1

pselect() blocks and then times out before any of the file descriptors become ready: pselect 2 ; pselect 3 ; return 1

pselect() blocks, TCP data is received from the network and processed, making a file descriptor ready for reading, and then pselect() returns: pselect 1 ; deliver in 99 ; deliver in 3 ; pselect 2 ; return 1

pselect() blocks, UDP data is received from the network and processed, making a file descriptor ready for reading, and then pselect() returns: pselect 1 ; deliver in 99 ; deliver in udp 1 ; pselect 2 ; return 1

pselect() blocks, TCP data is sent to the network, an acknowledgement is received and processed, making a file descriptor ready for writing, and then pselect() returns: pselect 1 ; deliver out 1 ; deliver out 99 ; deliver in 99 ; deliver in 3 ; pselect 2 ; return 1

15.18.3 API
Posix: int pselect(int nfds, fd_set *restrict readfds, fd_set *restrict writefds, fd_set *restrict errorfds, const struct timespec *restrict timeout, const sigset_t *restrict sigmask);
FreeBSD: int select(int nfds, fd_set *readfds, fd_set *writefds, fd_set *exceptfds, struct timeval *timeout);
Linux: int pselect(int n, fd_set *readfds, fd_set *writefds, fd_set *exceptfds, const struct timespec *timeout, const sigset_t *sigmask);
WinXP: int select(int nfds, fd_set* readfds, fd_set* writefds, fd_set* exceptfds, const struct timeval* timeout);

In the Posix interface:

• nfds specifies the range of file descriptors to be tested. The first nfds file descriptors shall be checked in each set. This is not necessary in the model pselect() as the file descriptor sets are represented as lists rather than as the integer arrays used in Posix pselect().

• readfds on input specifies the file descriptors to be checked for being ready to read, corresponding to the readfds argument of the model pselect(). On output readfds indicates which of the file descriptors specified on input are ready to read, corresponding to the first fd list in the return type of the model pselect(). An fd_set is an integer array, where each bit of each integer corresponds to a file descriptor. If that bit is set then that file descriptor should be checked. FD_CLR(), FD_ISSET(), FD_SET(), and FD_ZERO() are provided to clear, test, set, and zero the bits of an fd_set.

• writefds on input specifies the file descriptors to be checked for being ready to write, corresponding to the writefds argument of the model pselect(). On output writefds indicates which of the file descriptors specified on input are ready to write, corresponding to the second fd list in the return type of the model pselect().

• errorfds on input specifies the file descriptors to be checked for pending error conditions, corresponding to the exceptfds argument of the model pselect().
On output errorfds indicates which of the file descriptors specified on input have pending error conditions, corresponding to the third fd list in the return type of the model pselect().

• timeout specifies how long the pselect() call shall block before timing out, corresponding to the timeout argument of the model pselect(). If the timeout parameter is a null pointer this corresponds to timeout = ∗; if the timeout parameter is not a null pointer, then its two fields, timeout.tv_sec (the number of seconds) and timeout.tv_nsec (the number of nano-seconds), correspond to timeout = ↑(s, ns) where s is the number of seconds, and ns is the number of nano-seconds.

• sigmask is the signal-mask to be used when examining the file descriptors, corresponding to the sigmask argument of the model pselect(). If sigmask is a null pointer then sigmask = ∗ in the model; if sigmask is not a null pointer then sigmask = ↑ sigs in the model where sigs is the signal-mask to use.

• if the call is successful then the returned int is the number of bits set in the three fd_set arguments: the total number of file descriptors ready for reading, writing, or having exceptional conditions pending. Otherwise, the returned int is -1 to indicate an error, in which case the error code is in errno. On WinXP an error is indicated by a return value of SOCKET_ERROR, not -1, with the actual error code available through a call to WSAGetLastError().

The Linux interface is similar. On FreeBSD and WinXP there is no pselect() call, only a select() call which is the same as the interface described above, except without the sigmask argument. The select() call corresponds to calling the model pselect() with sigmask = ∗.
Additionally, the timeout argument is a pointer to a timeval structure which has two members tv_sec and tv_usec, specifying the seconds and micro-seconds to block for, rather than seconds and nano-seconds.

The FreeBSD man page for select() warns of the following bug: "Version 2 of the Single UNIX Specification ("SUSv2") allows systems to modify the original timeout in place. Thus, it is unwise to assume that the timeout value will be unmodified by the select() call."

15.18.4 Model details

If the pselect() call blocks then the thread enters state PSelect2(readfds, writefds, exceptfds) where:

• readfds : fd list is the list of file descriptors to be checked for being ready to read.
• writefds : fd list is the list of file descriptors to be checked for being ready to write.
• exceptfds : fd list is the list of file descriptors to be checked for pending exceptional conditions.

The following errors are not modelled:

• WSAEINPROGRESS is WinXP-specific and described in the MSDN page as "A blocking Windows Sockets 1.1 call is in progress, or the service provider is still processing a callback function". This is not modelled here.
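The difference between the model's (seconds, nano-seconds) timeout and the (seconds, micro-seconds) timeval taken by select() can be sketched as a small conversion. This is an illustration only and not part of the specification: the function name model_timeout_to_timeval is hypothetical, Python's None stands for the model's ∗, and rounding the nano-second part up (so that the converted timeout is never shorter than the requested one) is an assumption of this sketch.

```python
from typing import Optional, Tuple

NS_PER_US = 1_000
US_PER_S = 1_000_000


def model_timeout_to_timeval(
    timeout: Optional[Tuple[int, int]]
) -> Optional[Tuple[int, int]]:
    """Map a model timeout (None or (s, ns)) to a (tv_sec, tv_usec) pair.

    None models timeout = *, i.e. a null timeout pointer (block indefinitely).
    """
    if timeout is None:
        return None
    s, ns = timeout
    us = -(-ns // NS_PER_US)  # ceiling division (assumption: round up)
    if us >= US_PER_S:        # carry a full second out of the micro-seconds
        s, us = s + 1, us - US_PER_S
    return (s, us)


if __name__ == "__main__":
    print(model_timeout_to_timeval((1, 500)))
```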
15.18.5 Summary

pselect 1      all: fast succeed             One or more file descriptors immediately ready, or no timeout set
soreadable                                   check whether a socket is readable
sowriteable                                  check whether a socket is writable
soexceptional                                check whether a socket is exceptional
pselect 2      all: block                    Normal case
pselect 3      all: slow nonurgent succeed   Something becomes ready or pselect times out
pselect 4      all: fast fail                Fail with EINVAL: Timeout not well-formed
pselect 5      all: fast fail                Fail with EINVAL: File descriptor out of range
pselect 6      all: fast fail                Fail with EBADF or ENOTSOCK: Bad file descriptor

15.18.6 Rules

pselect 1 all: fast succeed One or more file descriptors immediately ready, or no timeout set

h 〈[ts := ts ⊕ (tid ↦ (Run)d)]〉
tid·pselect(readfds, writefds, exceptfds, timeout, sigmask) −−→
h 〈[ts := ts ⊕ (tid ↦ (Ret(OK(readfds′′, writefds′′, exceptfds′′)))sched timer)]〉

(tltimeopt wf timeout ∨ windows arch h.arch) ∧
sigmask = ∗ ∧
¬(∃fd n. (fd ∈ readfds ∨ fd ∈ writefds ∨ fd ∈ exceptfds) ∧
  if windows arch h.arch then
    n = (max (length readfds) (max (length writefds) (length exceptfds))) ∧ n ≥ (FD SETSIZE h.arch)
  else fd = FD n ∧ n ≥ FD SETSIZE h.arch) ∧
badreadfds = filter (λfd. fd ∉ dom(h.fds)) readfds ∧
badwritefds = filter (λfd. fd ∉ dom(h.fds)) writefds ∧
badexceptfds = filter (λfd. fd ∉ dom(h.fds)) exceptfds ∧
(bsd arch h.arch ∨ (badreadfds = [ ] ∧ badwritefds = [ ] ∧ badexceptfds = [ ])) ∧
¬(∃fd. (fd ∈ readfds ∨ fd ∈ writefds ∨ fd ∈ exceptfds) ∧ fd ∉ dom(h.fds)) ∧
readfds′ = filter (λfd. ∃fid ff sid sock.
  fd ∈ dom(h.fds) ∧ fid = h.fds[fd] ∧ h.files[fid] = File(FT Socket(sid), ff) ∧ sock = h.socks[sid] ∧ soreadable h.arch sock) readfds ∧
writefds′ = filter (λfd. ∃fid ff sid sock.
  fd ∈ dom(h.fds) ∧ fid = h.fds[fd] ∧ h.files[fid] = File(FT Socket(sid), ff) ∧ sock = h.socks[sid] ∧ sowriteable h.arch sock) writefds ∧
exceptfds′ = filter (λfd. ∃fid ff sid sock.
  fd ∈ dom(h.fds) ∧ fid = h.fds[fd] ∧ h.files[fid] = File(FT Socket(sid), ff) ∧ sock = h.socks[sid] ∧ soexceptional h.arch sock) exceptfds ∧
(windows arch h.arch =⇒ readfds ≠ [ ] ∧ writefds ≠ [ ] ∧ exceptfds ≠ [ ]) ∧
(readfds′ ≠ [ ] ∨ writefds′ ≠ [ ] ∨ exceptfds′ ≠ [ ] ∨ timeout = ↑(0, 0)) ∧
if windows arch h.arch then
  readfds′′ = readfds′ ∧ writefds′′ = writefds′ ∧ exceptfds′′ = exceptfds′
else
  readfds′′ = INSERT ORDERED readfds′ readfds badreadfds ∧
  writefds′′ = INSERT ORDERED writefds′ writefds badwritefds ∧
  exceptfds′′ = INSERT ORDERED exceptfds′ exceptfds badexceptfds

Description

From thread tid, which is in the Run state, a pselect(readfds, writefds, exceptfds, timeout, sigmask) call is made. The time-out is well-formed and no signal mask was set: sigmask = ∗. All of the file descriptors in the sets readfds, writefds, and exceptfds are less than FD SETSIZE, the architecture-dependent limit on file descriptors in a set, and all of them are valid file descriptors: they are in the host's finite map of file descriptors, h.fds.

The call returns, without blocking, three sets: readfds′′, writefds′′, and exceptfds′′. readfds′′ is the set of valid file descriptors in readfds that are ready for reading: a blocking recv(fd, _, _) call would not block; see soreadable (p202) for details. writefds′′ is the set of valid file descriptors in writefds that are ready for writing: a blocking send(fd, _, _) call would not block; see sowriteable (p202) for details. exceptfds′′ is the set of valid file descriptors in exceptfds that have pending exceptional conditions; see soexceptional (p203) for details. One of these three sets must be non-empty or else a zero timeout was specified, timeout = ↑(0, 0).
A tid·pselect(readfds, writefds, exceptfds, timeout, sigmask) transition is made, leaving the thread state Ret(OK(readfds′′, writefds′′, exceptfds′′)).

Variations

FreeBSD Invalid file descriptors (ones not in the host's finite map of file descriptors, h.fds) may be present in the sets readfds, writefds, and exceptfds, and all such file descriptors will then be included in the return sets readfds′′, writefds′′, and exceptfds′′.

WinXP On WinXP FD SETSIZE is the maximum number of file descriptors in a set, so none of the sets readfds, writefds, and exceptfds has more than FD SETSIZE members. Additionally, all three sets may not be empty. The time-out need not be well-formed because one or more file descriptors is immediately ready.

– check whether a socket is readable :
soreadable arch sock = case sock.pr of
  TCP PROTO(tcp) → (length tcp.rcvq ≥ sock.sf.n(SO RCVLOWAT) ∨
    sock.cantrcvmore ∨
    (linux arch arch ∧ tcp.st = CLOSED) ∨
    (tcp.st = LISTEN ∧ ∃lis. tcp.lis = ↑ lis ∧ lis.q ≠ [ ]) ∨
    sock.es ≠ ∗)
  ‖ UDP PROTO(udp) → (udp.rcvq ≠ [ ] ∨ sock.es ≠ ∗ ∨ (sock.cantrcvmore ∧ ¬windows arch arch))

Description

A TCP socket sock is readable if: (1) the length of its receive queue is greater than or equal to the minimum number of bytes for socket input operations, sf.n(SO RCVLOWAT); (2) it has been shut down for reading; (3) on Linux, it is in the CLOSED state; (4) it is in the LISTEN state and has at least one connection on its completed connection queue; or (5) it has a pending error.

A UDP socket sock is readable if its receive queue is not empty, it has a pending error, or (except on WinXP) it has been shut down for reading.

Variations

Linux On all OSes, attempting to read from a closed socket yields an immediate error. Only on Linux, however, does soreadable return T in this case.

WinXP A UDP socket will not be readable merely because it has been shut down for reading.
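The soreadable predicate can be transcribed into executable form to experiment with the architecture-specific cases. The following Python sketch is an illustration only, under simplifying assumptions: the Sock record and its field names are hypothetical stand-ins for the model's socket structure, queues are represented by their lengths, None stands for the model's ∗, and architectures are plain strings.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Sock:
    """Hypothetical, flattened stand-in for the model's socket record."""
    proto: str                    # "TCP" or "UDP"
    rcvq_len: int = 0             # bytes queued (TCP) or datagrams queued (UDP)
    rcvlowat: int = 1             # sf.n(SO_RCVLOWAT)
    cantrcvmore: bool = False     # shut down for reading
    es: Optional[str] = None      # pending error; None models *
    st: str = "ESTABLISHED"       # TCP state
    accept_q_len: int = 0         # completed connections on a listening socket


def soreadable(arch: str, sock: Sock) -> bool:
    """Transcription of the soreadable definition; arch is 'linux', 'bsd' or 'windows'."""
    if sock.proto == "TCP":
        return (sock.rcvq_len >= sock.rcvlowat
                or sock.cantrcvmore
                or (arch == "linux" and sock.st == "CLOSED")
                or (sock.st == "LISTEN" and sock.accept_q_len > 0)
                or sock.es is not None)
    # UDP: shutdown-for-reading counts as readable everywhere except WinXP.
    return (sock.rcvq_len > 0
            or sock.es is not None
            or (sock.cantrcvmore and arch != "windows"))


if __name__ == "__main__":
    print(soreadable("linux", Sock("TCP", st="CLOSED")))
```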
– check whether a socket is writable :
sowriteable arch sock = case sock.pr of
  TCP PROTO(tcp) → ((tcp.st ∈ {ESTABLISHED; CLOSE WAIT} ∧
      sock.sf.n(SO SNDBUF) − length tcp.sndq ≥ sock.sf.n(SO SNDLOWAT)) ∨ (* change to send buffer space *)
    (if linux arch arch then ¬sock.cantsndmore else sock.cantsndmore) ∨
    (linux arch arch ∧ tcp.st = CLOSED) ∨
    sock.es ≠ ∗)
  ‖ UDP PROTO(udp) → T

Variations

Linux On all OSes, attempting to write to a closed socket yields an immediate error. Only on Linux, however, does sowriteable return T in this case. On Linux, if the outgoing half of the connection has been closed by the application, the socket becomes non-writeable, whereas on other OSes it becomes writeable (because an immediate error would result from writing).

– check whether a socket is exceptional :
soexceptional arch sock = case sock.pr of
  TCP PROTO(tcp) → (tcp.st = ESTABLISHED ∧ (tcp.rcvurp = ↑ 0 ∨ (∃c. tcp.iobc = OOBDATA c)))
  ‖ UDP PROTO(udp) → F

Description

A TCP socket has a pending exceptional condition if it is in state ESTABLISHED and either is at the urgent mark (rcvurp = ↑ 0) or has a pending byte of out-of-band data. A UDP socket never has a pending exceptional condition.
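The sowriteable and soexceptional predicates can be transcribed in the same way. This Python sketch is illustrative only: the Sock record, its field names, and its default buffer sizes are hypothetical, queues are represented by their lengths, and None stands for the model's ∗. The Linux branch is a literal transcription of the definition above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Sock:
    """Hypothetical, flattened stand-in for the model's socket record."""
    proto: str                       # "TCP" or "UDP"
    st: str = "ESTABLISHED"          # TCP state
    sndq_len: int = 0                # length tcp.sndq
    sndbuf: int = 32768              # sf.n(SO_SNDBUF); illustrative default
    sndlowat: int = 2048             # sf.n(SO_SNDLOWAT); illustrative default
    cantsndmore: bool = False        # shut down for writing
    es: Optional[str] = None         # pending error; None models *
    rcvurp: Optional[int] = None     # receive urgent pointer; None models *
    has_oobdata: bool = False        # True models tcp.iobc = OOBDATA c


def sowriteable(arch: str, sock: Sock) -> bool:
    if sock.proto == "UDP":
        return True
    space_ok = (sock.st in {"ESTABLISHED", "CLOSE_WAIT"}
                and sock.sndbuf - sock.sndq_len >= sock.sndlowat)
    # Linux and the other architectures test cantsndmore with opposite sense.
    shut_case = (not sock.cantsndmore) if arch == "linux" else sock.cantsndmore
    return (space_ok
            or shut_case
            or (arch == "linux" and sock.st == "CLOSED")
            or sock.es is not None)


def soexceptional(arch: str, sock: Sock) -> bool:
    if sock.proto == "UDP":
        return False
    return sock.st == "ESTABLISHED" and (sock.rcvurp == 0 or sock.has_oobdata)


if __name__ == "__main__":
    full = Sock("TCP", sndq_len=32768, cantsndmore=True)
    print(sowriteable("bsd", full), sowriteable("linux", full))
```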
pselect 2 all: block Normal case

h 〈[ts := ts ⊕ (tid ↦ (Run)d)]〉
tid·pselect(readfds, writefds, exceptfds, timeout, sigmask) −−→
h 〈[ts := ts ⊕ (tid ↦ (PSelect2(readfds, writefds, exceptfds))kern timer d′)]〉

tltimeopt wf timeout ∧
d′ = min (time of tltimeopt timeout) pselect timeo t max ∧
sigmask = ∗ ∧
¬(∃fd n. (fd ∈ readfds ∨ fd ∈ writefds ∨ fd ∈ exceptfds) ∧
  if windows arch h.arch then
    n = max (length readfds) (max (length writefds) (length exceptfds)) ∧ n ≥ FD SETSIZE h.arch
  else fd = FD n ∧ n ≥ FD SETSIZE h.arch) ∧
¬(∃fd. (fd ∈ readfds ∨ fd ∈ writefds ∨ fd ∈ exceptfds) ∧ fd ∉ dom(h.fds)) ∧
(windows arch h.arch =⇒ readfds ≠ [ ] ∧ writefds ≠ [ ] ∧ exceptfds ≠ [ ])

Description

From thread tid, which is in the Run state, a pselect(readfds, writefds, exceptfds, timeout, sigmask) call is made. The time-out is well-formed and no signal mask was set: sigmask = ∗. All of the file descriptors in the sets readfds, writefds, and exceptfds are less than FD SETSIZE, the architecture-dependent limit on file descriptors in a set, and all of them are valid file descriptors: they are in the host's finite map of file descriptors, h.fds.

The call blocks: a tid·pselect(readfds, writefds, exceptfds, timeout, sigmask) transition is made, leaving the thread state PSelect2(readfds, writefds, exceptfds).

Variations

WinXP On WinXP FD SETSIZE is the maximum number of file descriptors in a set, so none of the sets readfds, writefds, and exceptfds has more than FD SETSIZE members. Additionally, all three sets may not be empty.

pselect 3 all: slow nonurgent succeed Something becomes ready or pselect times out

h 〈[ts := ts ⊕ (tid ↦ (PSelect2(readfds, writefds, exceptfds))d)]〉
τ −−→
h 〈[ts := ts ⊕ (tid ↦ (Ret(OK(readfds′′, writefds′′, exceptfds′′)))sched timer)]〉

readfds′ = filter (λfd. ∃fid ff sid sock.
fd ∈ dom(h.fds) ∧ fid = h.fds[fd ] ∧ h.files[fid ] = File(FT Socket(sid),ff ) ∧ sock = h.socks[sid ] ∧ soreadable h.arch sock)readfds ∧ writefds ′ = filter(λfd .∃fid ff sid sock . fd ∈ dom(h.fds) ∧ fid = h.fds[fd ] ∧ h.files[fid ] = File(FT Socket(sid),ff ) ∧ sock = h.socks[sid ] ∧ sowriteable h.arch sock)writefds ∧ exceptfds ′ = filter(λfd .∃fid ff sid sock . fd ∈ dom(h.fds) ∧ fid = h.fds[fd ] ∧ h.files[fid ] = File(FT Socket(sid),ff ) ∧ sock = h.socks[sid ] ∧ soexceptional h.arch sock)exceptfds ∧ (readfds ′ 6= [ ] ∨ writefds ′ 6= [ ] ∨ exceptfds ′ 6= [ ] ∨ timer expires d) ∧ badreadfds = filter(λfd .fd /∈ dom(h.fds))readfds ∧ badwritefds = filter(λfd .fd /∈ dom(h.fds))writefds ∧ badexceptfds = filter(λfd .fd /∈ dom(h.fds))exceptfds ∧ if windows arch h.arch then readfds ′′ = readfds ′ ∧ writefds ′′ = writefds ′ ∧ exceptfds ′′ = exceptfds ′ else readfds ′′ = INSERT ORDERED readfds ′ readfds badreadfds ∧ writefds ′′ = INSERT ORDERED writefds ′ writefds badwritefds ∧ exceptfds ′′ = INSERT ORDERED exceptfds ′ exceptfds badexceptfds Description Thread tid is blocked in state PSelect2(readfds,writefds, exceptfds). The call now returns three sets: readfds ′′, writefds ′′, and exceptfds ′′. readfds ′′ is the set of valid file descriptors in readfds that are ready for reading: a blocking recv(fd , , ) call would not block; see soreadable (p202) for details. writefds ′′ is the set of valid file descriptors in writefds that are ready for writing: a blocking send(fd , , ) call would not block; see sowriteable (p202) for details. exceptfds ′′ is the set of valid file descriptors in exceptfds that have pending exceptional conditions; see soexceptional (p203) for details. Either one of these three sets is not empty or the timer d , which was set to the timeout value specified when the pselect() call was made, has expired. A τ transition is made, leaving the thread state Ret(OK(readfds ′′,writefds ′′, exceptfds ′′)). 
Variations FreeBSD Invalid file descriptors (ones not in the host’s finite map of file descriptors, h.fds) may be present in the sets readfds, writefds, and exceptfds, and all such file descrip- tors will then be included in the return sets readfds ′′, writefds ′′, and exceptfds ′′. Rule version: $Id: TCP1 hostLTSScript.sml,v 1.961 2005/03/18 10:34:36 kw217 Exp $ pselect 6 205 pselect 4 all: fast fail Fail with EINVAL: Timeout not well-formed h 〈[ts := ts ⊕ (tid 7→ (Run)d)]〉 tid ·pselect(readfds,writefds, exceptfds, timeout , sigmask)−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−→ h 〈[ts := ts ⊕ (tid 7→ (Ret(FAIL EINVAL))sched timer)]〉 ¬(tltimeopt wf timeout) Description From thread tid , which is in the Run state, a pselect(readfds,writefds, exceptfds, timeout , sigmask) call is made. The timeout value is not well-formed: timeout = ↑(s,ns) where either s is negative; ns is negative; or ns > 1000000000. The call fails with an EINVAL error. A tid ·pselect(readfds,writefds, exceptfds, timeout , sigmask) transition is made, leaving the thread state Ret(FAIL EINVAL). Model details Such negative values are not admitted by the POSIX interface type but are by the model interface type (with (int ∗ int) option timeouts), so we check and generate EINVAL in the wrapper. pselect 5 all: fast fail Fail with EINVAL: File descriptor out of range h 〈[ts := ts ⊕ (tid 7→ (Run)d)]〉 tid ·pselect(readfds,writefds, exceptfds, timeout , sigmask)−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−→ h 〈[ts := ts ⊕ (tid 7→ (Ret(FAIL EINVAL))sched timer)]〉 (∃fd n.(fd ∈ readfds ∨ fd ∈ writefds ∨ fd ∈ exceptfds) ∧ if windows arch h.arch then n =max(length readfds)(max(length writefds)(length exceptfds)) ∧ n ≥ FD SETSIZE h.arch else fd = FD n ∧ n ≥ FD SETSIZE h.arch) ∨ (windows arch h.arch ∧ readfds = [ ] ∧ writefds = [ ] ∧ exceptfds = [ ]) Description From thread tid , which is in the Run state, a pselect(readfds,writefds, exceptfds, timeout , sigmask) call is made. 
One or more of the file descriptors in readfds, writefds, or exceptfds is greater than or equal to the architecture-dependent FD SETSIZE, the limit on file descriptors that can be specified in a pselect() call. The call fails with an EINVAL error.

A tid·pselect(readfds, writefds, exceptfds, timeout, sigmask) transition is made, leaving the thread state Ret(FAIL EINVAL).

Variations

WinXP On WinXP FD SETSIZE is the maximum number of file descriptors in a set, so one of the sets readfds, writefds, or exceptfds has more than FD SETSIZE members. Also, the call will fail with EINVAL if the sets readfds, writefds, and exceptfds are all empty.

pselect 6 all: fast fail Fail with EBADF or ENOTSOCK: Bad file descriptor

h 〈[ts := ts ⊕ (tid ↦ (Run)d)]〉
tid·pselect(readfds, writefds, exceptfds, timeout, sigmask) −−→
h 〈[ts := ts ⊕ (tid ↦ (Ret(FAIL err))sched timer)]〉

¬bsd arch h.arch ∧
(∃fd. (fd ∈ readfds ∨ fd ∈ writefds ∨ fd ∈ exceptfds) ∧ fd ∉ dom(h.fds)) ∧
(if windows arch h.arch then err = ENOTSOCK else err = EBADF)

Description

From thread tid, which is in the Run state, a pselect(readfds, writefds, exceptfds, timeout, sigmask) call is made. There exists a file descriptor fd in readfds, writefds, or exceptfds that is not a valid file descriptor. The call fails with an EBADF error on Linux and an ENOTSOCK error on WinXP.

A tid·pselect(readfds, writefds, exceptfds, timeout, sigmask) transition is made, leaving the thread state Ret(FAIL err) where err is one of the above errors.

Variations

FreeBSD This rule does not apply.
Linux As above: the call fails with an EBADF error.
WinXP As above: the call fails with an ENOTSOCK error.

15.19 recv() (TCP only)

recv : fd ∗ int ∗ msgbflag list → (string ∗ ((ip ∗ port) ∗ bool) option)

A call to recv(fd, n, opts) reads data from a socket's receive queue.
This section describes the behaviour for TCP sockets. Here fd is a file descriptor referring to a TCP socket to read data from, n is the number of bytes of data to read, and opts is a list of message flags. Possible flags are:

• MSG DONTWAIT: Do not block if there is no data available.
• MSG OOB: Return out-of-band data.
• MSG PEEK: Read data but do not remove it from the socket's receive queue.
• MSG WAITALL: Block until all n bytes of data are available.

The returned string is the data read from the socket's receive queue. The ((ip ∗ port) ∗ bool) option is always returned as ∗ for a TCP socket.

In order to receive data, a TCP socket must be connected to a peer; otherwise, the recv() call will fail with an ENOTCONN error. If the socket has a pending error then the recv() call will fail with this error even if there is data available.

If there is no data available and non-blocking behaviour is not enabled (the socket's O NONBLOCK flag is not set and the MSG DONTWAIT flag was not used) then the recv() call will block until data arrives or an error occurs. If non-blocking behaviour is enabled and there is no data or error then the call will fail with an EAGAIN error.

The MSG OOB flag can be set in order to receive out-of-band data; for this, the socket's SO OOBINLINE flag must not be set (i.e. out-of-band data must not be returned inline).

15.19.1 Errors

A call to recv() can fail with the errors below, in which case the corresponding exception is raised:

EAGAIN Non-blocking recv() call made and no data available; or out-of-band data requested and none is available.
EINVAL Out-of-band data requested and SO OOBINLINE flag set or the out-of-band data has already been read.
ENOTCONN Socket not connected.
ENOTSOCK The file descriptor passed does not refer to a socket.
EBADF The file descriptor passed is not a valid file descriptor.
EINTR The system was interrupted by a caught signal.
ENOBUFS Out of resources.
ENOMEM Out of resources.

15.19.2 Common cases

A TCP socket is created and then connected to a peer; a recv() call is made to receive data from that peer: socket 1 ; return 1 ; connect 1 ; return 1 ; recv 1 ; . . .

15.19.3 API

Posix: ssize_t recv(int socket, void *buffer, size_t length, int flags);
FreeBSD: ssize_t recv(int s, void *buf, size_t len, int flags);
Linux: int recv(int s, void *buf, size_t len, int flags);
WinXP: int recv(SOCKET s, char* buf, int len, int flags);

In the Posix interface:

• socket is the file descriptor of the socket to receive from, corresponding to the fd argument of the model recv().

• buffer is a pointer to a buffer to place the received data in, which upon return contains the data received on the socket. This corresponds to the string return value of the model recv().

• length is the amount of data to be read from the socket, corresponding to the int argument of the model recv(); it should be at most the length of buffer.

• flags is a disjunction of the message flags that are set for the call, corresponding to the msgbflag list argument of the model recv().

• the returned ssize_t is either non-negative, in which case it is the amount of data that was received by the socket, or it is -1 to indicate an error, in which case the error code is in errno. On WinXP an error is indicated by a return value of SOCKET_ERROR, not -1, with the actual error code available through a call to WSAGetLastError().

The FreeBSD, Linux and WinXP interfaces are similar modulo argument renaming, except where noted above.

There are other functions used to receive data on a socket. recvfrom() is similar to recv() except it returns the source address of the data; this is used for UDP but is not necessary for TCP as the source address will always be the peer the socket has connected to.
recvmsg(), another input function, is a more general form of recv().

15.19.4 Model details

If the call blocks then the thread enters state Recv2(sid, n, opts) where:

• sid : sid is the identifier of the socket that the recv() call was made on,
• n : num is the number of bytes to be read, and
• opts : msgbflag list is the list of message flags.

The following errors are not modelled:

• On FreeBSD, Linux, and WinXP, EFAULT can be returned if the buffer parameter points to memory not in a valid part of the process address space. This is an artefact of the C interface to recv() that is excluded by the clean interface used in the model recv().

• In Posix, EIO may be returned to indicate that an I/O error occurred while reading from or writing to the file system; this is not modelled here.

• WSAEINPROGRESS is WinXP-specific and described in the MSDN page as "A blocking Windows Sockets 1.1 call is in progress, or the service provider is still processing a callback function". This is not modelled here.

The following Linux message flags are not modelled: MSG_NOSIGNAL, MSG_TRUNC, and MSG_ERRQUEUE.
15.19.5 Summary recv 1 tcp: fast succeed Successfully return data from the socket without blocking recv 2 tcp: block Block, entering state Recv2 as not enough data is available recv 3 tcp: slow nonurgent succeed Blocked call returns from Recv2 state recv 4 tcp: fast fail Fail with EAGAIN: non-blocking call would block waiting for data recv 5 tcp: fast succeed Successfully read non-inline out-of-band data recv 6 tcp: fast fail Fail with EAGAIN or EINVAL: recv() called with MSG OOB set and out-of-band data is not available recv 7 tcp: fast fail Fail with ENOTCONN: socket not connected recv 8 tcp: fast fail Fail with pending error recv 8a tcp: slow urgent fail Fail with pending error from blocked state recv 9 tcp: fast fail Fail with ESHUTDOWN: socket shut down for reading on WinXP 15.19.6 Rules recv 1 tcp: fast succeed Successfully return data from the socket without blocking h 〈[ts := ts ⊕ (tid 7→ (Run)d); socks := socks ⊕ [(sid ,Sock(↑ fid , sf , is1, ps1, is2, ps2, es, cantsndmore, cantrcvmore, TCP Sock(st , cb, ∗, sndq , sndurp, rcvq , rcvurp, iobc)))]]〉 tid ·recv(fd ,n0, opts0)−−−−−−−−−−−−−−−−→ h 〈[ts := ts ⊕ (tid 7→ (Ret(OK(implode str , ∗)))sched timer); socks := socks ⊕ [(sid ,Sock(↑ fid , sf , is1, ps1, is2, ps2, es, cantsndmore, cantrcvmore, TCP Sock(st , cb, ∗, sndq , sndurp, rcvq ′′, rcvurp′, iobc)))]]〉 ((st ∈ {ESTABLISHED;FIN WAIT 1;FIN WAIT 2;CLOSING; TIME WAIT;CLOSE WAIT;LAST ACK} ∧ Rule version: $Id: TCP1 hostLTSScript.sml,v 1.961 2005/03/18 10:34:36 kw217 Exp $ recv 1 210 is1 = ↑ i1 ∧ ps1 = ↑ p1 ∧ is2 = ↑ i2 ∧ ps2 = ↑ p2) ∨ (st = CLOSED)) ∧ n = clip int to num n0 ∧ opts = list to set opts0 ∧ fd ∈ dom(h.fds) ∧ fid = h.fds[fd ] ∧ h.files[fid ] = File(FT Socket(sid),ff ) ∧ MSG OOB /∈ opts ∧ (* We return now if we can fill the buffer, or we can reach the low-water mark (usually ignored if MSG WAITALL is set), or we can reach EOF or the next urgent-message boundary. Pending errors are not checked. 
*)
let have all data = (length rcvq ≥ n) in
let have enough data = (length rcvq ≥ sf.n(SO RCVLOWAT)) in
let partial data ok = (MSG WAITALL ∉ opts ∨ n > sf.n(SO RCVBUF) ∨ (¬(bsd arch h.arch) ∧ MSG PEEK ∈ opts)) in
let urgent data ahead = (∃om. rcvurp = ↑ om ∧ 0 < om ∧ om ≤ length rcvq) in
(have all data ∨ (have enough data ∧ partial data ok) ∨ urgent data ahead ∨ cantrcvmore) ∧
((str, rcvq′) = SPLIT (min n (case rcvurp of
    ∗ → length rcvq ‖
    ↑ om → if om = 0 then (length rcvq) else min om (length rcvq))) rcvq) ∧
rcvq′′ = (if MSG PEEK ∈ opts then rcvq else rcvq′) ∧
rcvurp′ = (case rcvurp of
    ∗ → ∗ ‖
    ↑ om → if om = 0 then ∗ else if om ≤ length str then ↑ 0 else ↑(om − length str))

Description

From thread tid, which is in the Run state, a recv(fd, n0, opts0) call is made where out-of-band data is not requested. fd refers to a synchronised TCP socket sid with binding quad (↑ i1, ↑ p1, ↑ i2, ↑ p2) and no pending error. Alternatively the socket is uninitialised and in state CLOSED.

The call can return immediately because either: (1) there are at least n bytes of data in the socket's receive queue (the have all data case above); (2) the length of the socket's receive queue is greater than or equal to the minimum number of bytes for socket recv() operations, sf.n(SO RCVLOWAT), and the call does not have to return all n bytes of data, either because (i) the MSG WAITALL flag is not set in opts0, (ii) the number of bytes requested is greater than the size of the socket's receive buffer, sf.n(SO RCVBUF), or (iii) on non-FreeBSD architectures the MSG PEEK flag is set in opts0 (the have enough data ∧ partial data ok case above); (3) there is urgent data available in the socket's receive queue (the urgent data ahead case above); or (4) the socket has been shut down for reading.
The call succeeds, returning a string, implode str, which is either: (5) the smaller of the first n bytes of the socket’s receive queue or its entire receive queue, if the urgent pointer is not set or the socket is at the urgent mark; or (6) the smaller of the first n bytes of the socket’s receive queue, the data in its receive queue up to the urgent mark, and its entire receive queue, if the urgent mark is set and the socket is not at the urgent mark.
A tid·recv(fd, n0, opts0) transition is made leaving the thread state Ret(OK(implode str, ∗)). If the MSG_PEEK flag was set in opts0 then the socket’s receive queue remains unchanged; otherwise, the data str is removed from the head of the socket’s receive queue, rcvq, to leave the socket with new receive queue rcvq′. If the receive urgent pointer was not set or was set to ↑0 then it will be set to ∗; if it was set to ↑om and om is less than or equal to the length of the returned string then it will be set to ↑0 (because the returned string was the data in the receive queue up to the urgent mark); otherwise it will be set to ↑(om − length str).

Model details
The amount of data requested, n0, is clipped to a natural number from an integer, using clip_int_to_num. POSIX specifies an unsigned type for n0 and this is one possible model thereof.
The opts0 argument to recv() is of type msgbflag list, but it is converted to a set, opts, using list_to_set.
The data itself is represented as a byte list in the datagram but is returned as a string: the implode function is used to do the conversion.
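The data selection and urgent-pointer update of recv_1 can be sketched in Python as follows. This is only an illustration under stated assumptions, not part of the specification: the function name recv1_take is hypothetical, the receive queue is a plain list of byte values, and None stands for the unset pointer ∗.

```python
from typing import List, Optional, Tuple


def recv1_take(rcvq: List[int], rcvurp: Optional[int], n: int,
               peek: bool) -> Tuple[List[int], List[int], Optional[int]]:
    """Return (str, rcvq'', rcvurp') for a successful non-OOB TCP recv().

    rcvurp is None for the unset pointer (written * in the rule),
    otherwise the offset om of the urgent mark in the receive queue.
    """
    # Never read past the urgent mark unless we are already at it (om = 0).
    if rcvurp is None or rcvurp == 0:
        limit = len(rcvq)
    else:
        limit = min(rcvurp, len(rcvq))
    k = min(n, limit)

    # SPLIT k rcvq: the first k bytes, and whatever remains.
    data, rest = rcvq[:k], rcvq[k:]

    # MSG_PEEK leaves the queue untouched.
    new_rcvq = rcvq if peek else rest

    # Urgent pointer update, following the case split in the rule.
    if rcvurp is None or rcvurp == 0:
        new_urp = None
    elif rcvurp <= len(data):
        new_urp = 0
    else:
        new_urp = rcvurp - len(data)
    return data, new_rcvq, new_urp
```

For instance, with an urgent mark two bytes into a five-byte queue, a request for ten bytes returns only the first two bytes and leaves the socket at the urgent mark.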

recv_2 tcp: block Block, entering state Recv2 as not enough data is available

h 〈[ts := ts ⊕ (tid ↦ (Run)d)]〉
−− tid·recv(fd, n0, opts0) −→
h 〈[ts := ts ⊕ (tid ↦ (Recv2(sid, n, opts))never_timer)]〉

n = clip_int_to_num n0 ∧
opts = list_to_set opts0 ∧
fd ∈ dom(h.fds) ∧
fid = h.fds[fd] ∧
h.files[fid] = File(FT_Socket(sid), ff) ∧
h.socks[sid] = Sock(↑fid, sf, ↑i1, ↑p1, ↑i2, ↑p2, es, cantsndmore, cantrcvmore, TCP_Sock(st, cb, ∗, sndq, sndurp, rcvq, rcvurp, iobc)) ∧
st ∈ {ESTABLISHED; SYN_SENT; SYN_RECEIVED; FIN_WAIT_1; FIN_WAIT_2} ∧
MSG_OOB ∉ opts ∧
(* We block if not enough (see recv_1 (p209)) data is available and there is no pending error. *)
let blocking = ¬(MSG_DONTWAIT ∈ opts ∨ ff.b(O_NONBLOCK)) in
let have_all_data = (length rcvq ≥ n) in
let have_enough_data = (length rcvq ≥ sf.n(SO_RCVLOWAT)) in
let partial_data_ok = (MSG_WAITALL ∉ opts ∨ n > sf.n(SO_RCVBUF) ∨ (¬(bsd_arch h.arch) ∧ MSG_PEEK ∈ opts)) in
let urgent_data_ahead = (∃om. rcvurp = ↑om ∧ 0 < om ∧ om ≤ length rcvq) in
blocking ∧
¬(have_all_data ∨ (have_enough_data ∧ partial_data_ok) ∨ urgent_data_ahead ∨ cantrcvmore) ∧
es = ∗

Description
From thread tid, which is in the Run state, a recv(fd, n0, opts0) call is made where out-of-band data is not requested. fd refers to a TCP socket sid in state ESTABLISHED, SYN_SENT, SYN_RECEIVED, FIN_WAIT_1, or FIN_WAIT_2, with binding quad (↑i1, ↑p1, ↑i2, ↑p2) and no pending error. The call is blocking: the MSG_DONTWAIT flag is not set in opts0 and the socket’s O_NONBLOCK flag is not set.
The call cannot return immediately because: (1) there are fewer than n bytes of data in the socket’s receive queue; (2) there are fewer than sf.n(SO_RCVLOWAT) bytes of data (the minimum number of bytes for socket recv() operations) in the socket’s receive queue, or the call must return all n bytes of data: (i) the MSG_WAITALL flag is set in opts0, (ii) the number of bytes requested is no greater than the size of the socket’s receive buffer, sf.n(SO_RCVBUF), and (iii) the MSG_PEEK flag is not set in opts0; (3) there is no urgent data ahead in the socket’s receive queue; and (4) the socket is not shut down for reading.
The call blocks in state Recv2 waiting for data; a tid·recv(fd, n0, opts0) transition is made, leaving the thread state Recv2(sid, n, opts).

Model details
The amount of data requested, n0, is clipped to a natural number from an integer, using clip_int_to_num. POSIX specifies an unsigned type for n0, whereas the model uses int.
The opts0 argument to recv() is of type msgbflag list, but it is converted to a set, opts, using list_to_set.

Variations
FreeBSD: In case (iii) above, the MSG_PEEK flag may be set in opts0.

recv_3 tcp: slow nonurgent succeed Blocked call returns from Recv2 state

h 〈[ts := ts ⊕ (tid ↦ (Recv2(sid, n, opts))d); socks := socks ⊕ [(sid, Sock(↑fid, sf, is1, ps1, is2, ps2, es, cantsndmore, cantrcvmore, TCP_Sock(st, cb, ∗, sndq, sndurp, rcvq, rcvurp, iobc)))]]〉
−− τ −→
h 〈[ts := ts ⊕ (tid ↦ (Ret(OK(implode str, ∗)))sched_timer); socks := socks ⊕ [(sid, Sock(↑fid, sf, is1, ps1, is2, ps2, es, cantsndmore, cantrcvmore, TCP_Sock(st, cb, ∗, sndq, sndurp, rcvq′′, rcvurp′, iobc)))]]〉

((st ∈ {ESTABLISHED; FIN_WAIT_1; FIN_WAIT_2; CLOSING; TIME_WAIT; CLOSE_WAIT; LAST_ACK} ∧ is1 = ↑i1 ∧ ps1 = ↑p1 ∧ is2 = ↑i2 ∧ ps2 = ↑p2) ∨ st = CLOSED) ∧
(* We return at last if we now have enough (see recv_1 (p209)) data available. Pending errors are not checked.
*)
let have_all_data = (length rcvq ≥ n) in
let have_enough_data = (length rcvq ≥ sf.n(SO_RCVLOWAT)) in
let partial_data_ok = (MSG_WAITALL ∉ opts ∨ n > sf.n(SO_RCVBUF) ∨ (¬(bsd_arch h.arch) ∧ MSG_PEEK ∈ opts)) in
let urgent_data_ahead = (∃om. rcvurp = ↑om ∧ 0 < om ∧ om ≤ length rcvq) in
(have_all_data ∨ (have_enough_data ∧ partial_data_ok) ∨ urgent_data_ahead ∨ cantrcvmore) ∧
(str, rcvq′) = SPLIT (min n (case rcvurp of ∗ → length rcvq ‖ ↑om → if om = 0 then (length rcvq) else min om (length rcvq))) rcvq ∧
rcvq′′ = (if MSG_PEEK ∈ opts then rcvq else rcvq′) ∧
rcvurp′ = (case rcvurp of ∗ → ∗ ‖ ↑om → if om = 0 then ∗ else if om ≤ length str then ↑0 else ↑(om − length str))

Description
Thread tid is in the Recv2(sid, n, opts) state after a previous recv() call blocked. sid refers either to a synchronised TCP socket with binding quad (↑i1, ↑p1, ↑i2, ↑p2), or to a TCP socket in state CLOSED.
Sufficient data is now available on the socket for the call to return: either (1) there are at least n bytes of data in the socket’s receive queue (the have_all_data case above); (2) the length of the socket’s receive queue is greater than or equal to the minimum number of bytes for socket recv() operations, sf.n(SO_RCVLOWAT), and the call does not have to return all n bytes of data, because (i) the MSG_WAITALL flag is not set in opts, (ii) the number of bytes requested is greater than the size of the socket’s receive buffer, sf.n(SO_RCVBUF), or (iii) on non-FreeBSD architectures the MSG_PEEK flag is set in opts (the have_enough_data ∧ partial_data_ok case above); (3) there is urgent data available in the socket’s receive queue (the urgent_data_ahead case above); or (4) the socket has been shut down for reading.
The data returned, str, is either: (1) the smaller of the first n bytes of the socket’s receive queue or its entire receive queue, if the urgent pointer is not set or the socket is at the urgent mark; or (2) the smaller of the first n bytes of the socket’s receive queue, the data in its receive queue up to the urgent mark, and its entire receive queue, if the urgent mark is set and the socket is not at the urgent mark.
A τ transition is made leaving the thread state Ret(OK(implode str, ∗)). If the MSG_PEEK flag was set in opts then the socket’s receive queue remains unchanged; otherwise, the data str is removed from the head of the socket’s receive queue, rcvq, to leave the socket with new receive queue rcvq′. If the receive urgent pointer was not set or was set to ↑0 then it will be set to ∗; if it was set to ↑om and om is less than or equal to the length of the returned string then it will be set to ↑0 (because the returned string was the data in the receive queue up to the urgent mark); otherwise it will be set to ↑(om − length str).

Model details
The data itself is represented as a byte list in the datagram but is returned as a string: the implode function is used to do the conversion.

recv_4 tcp: fast fail Fail with EAGAIN: non-blocking call would block waiting for data

h 〈[ts := ts ⊕ (tid ↦ (Run)d)]〉
−− tid·recv(fd, n0, opts0) −→
h 〈[ts := ts ⊕ (tid ↦ (Ret(FAIL EAGAIN))sched_timer)]〉

n = clip_int_to_num n0 ∧
opts = list_to_set opts0 ∧
fd ∈ dom(h.fds) ∧
fid = h.fds[fd] ∧
h.files[fid] = File(FT_Socket(sid), ff) ∧
h.socks[sid] = Sock(↑fid, sf, ↑i1, ↑p1, ↑i2, ↑p2, es, cantsndmore, cantrcvmore, TCP_Sock(st, cb, ∗, sndq, sndurp, rcvq, rcvurp, iobc)) ∧
st ∈ {ESTABLISHED; SYN_SENT; SYN_RECEIVED; FIN_WAIT_1; FIN_WAIT_2} ∧
MSG_OOB ∉ opts ∧
(* We fail if we would otherwise block (see recv_2 (p211); these conditions are identical).
*)
let blocking = ¬(MSG_DONTWAIT ∈ opts ∨ ff.b(O_NONBLOCK)) in
let have_all_data = (length rcvq ≥ n) in
let have_enough_data = (length rcvq ≥ sf.n(SO_RCVLOWAT)) in
let partial_data_ok = (MSG_WAITALL ∉ opts ∨ n > sf.n(SO_RCVBUF) ∨ (¬(bsd_arch h.arch) ∧ MSG_PEEK ∈ opts)) in
let urgent_data_ahead = (∃om. rcvurp = ↑om ∧ 0 < om ∧ om ≤ length rcvq) in
¬blocking ∧
¬(have_all_data ∨ (have_enough_data ∧ partial_data_ok) ∨ urgent_data_ahead ∨ cantrcvmore) ∧
(rcvq = [ ] ⟹ es = ∗)

Description
From thread tid, which is in the Run state, a recv(fd, n0, opts0) call is made where out-of-band data is not requested. fd refers to a TCP socket sid with binding quad (↑i1, ↑p1, ↑i2, ↑p2) and no pending error, which is in state ESTABLISHED, SYN_SENT, SYN_RECEIVED, FIN_WAIT_1, or FIN_WAIT_2. The recv() call is non-blocking: either the MSG_DONTWAIT flag was set in opts0 or the socket’s O_NONBLOCK flag is set.
The call would block because: (1) there are fewer than n bytes of data in the socket’s receive queue; (2) there are fewer than sf.n(SO_RCVLOWAT) bytes of data (the minimum number of bytes for socket recv() operations) in the socket’s receive queue, or the call must return all n bytes of data: (i) the MSG_WAITALL flag is set in opts0, (ii) the number of bytes requested is no greater than the size of the socket’s receive buffer, sf.n(SO_RCVBUF), and (iii) the MSG_PEEK flag is not set in opts0; (3) there is no urgent data ahead in the socket’s receive queue; (4) the socket is not shut down for reading; and (5) if the socket’s receive queue is empty then it has no pending error.
The call fails with an EAGAIN error. A tid·recv(fd, n0, opts0) transition is made, leaving the thread state Ret(FAIL EAGAIN).

Model details
The amount of data requested, n0, is clipped to a natural number from an integer, using clip_int_to_num. POSIX specifies an unsigned type for n0 and this is one possible model thereof.
The opts0 argument to recv() is of type msgbflag list, but it is converted to a set, opts, using list_to_set.

Variations
FreeBSD: In case (iii) above, the MSG_PEEK flag may be set in opts0.

recv_5 tcp: fast succeed Successfully read non-inline out-of-band data

h 〈[ts := ts ⊕ (tid ↦ (Run)d); socks := socks ⊕ [(sid, Sock(↑fid, sf, ↑i1, ↑p1, ↑i2, ↑p2, ∗, cantsndmore, cantrcvmore, TCP_Sock(st, cb, ∗, sndq, sndurp, rcvq, rcvurp, iobc)))]]〉
−− tid·recv(fd, n0, opts0) −→
h 〈[ts := ts ⊕ (tid ↦ (Ret(OK(implode str, ∗)))sched_timer); socks := socks ⊕ [(sid, Sock(↑fid, sf, ↑i1, ↑p1, ↑i2, ↑p2, ∗, cantsndmore, cantrcvmore, TCP_Sock(st, cb, ∗, sndq, sndurp, rcvq, rcvurp, iobc′)))]]〉

n = clip_int_to_num n0 ∧
opts = list_to_set opts0 ∧
fd ∈ dom(h.fds) ∧
fid = h.fds[fd] ∧
h.files[fid] = File(FT_Socket(sid), ff) ∧
MSG_OOB ∈ opts ∧
¬sf.b(SO_OOBINLINE) ∧
iobc = OOBDATA c ∧
str = (if n = 0 then [ ] else [c]) ∧
iobc′ = (if MSG_PEEK ∈ opts then iobc else HAD_OOBDATA)

Description
From thread tid, which is in the Run state, a recv(fd, n0, opts0) call is made. fd refers to a TCP socket sid with binding quad (↑i1, ↑p1, ↑i2, ↑p2) and no pending error. Out-of-band data is requested: the MSG_OOB flag is set in opts0, and out-of-band data is not being returned inline: ¬sf.b(SO_OOBINLINE). There is a byte c of out-of-band data on the socket; if zero bytes of data were requested, n0 = 0, then the empty string is returned, otherwise c is returned.
A tid·recv(fd, n0, opts0) transition is made, leaving the thread state Ret(OK(implode str, ∗)) where implode str is the returned out-of-band data. If the MSG_PEEK flag was set in opts0 then the byte of out-of-band data is left in place, iobc′ = iobc; otherwise it is removed and marked as read: iobc′ = HAD_OOBDATA.
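The out-of-band read described above, together with the complementary failure cases given by the side condition of rule recv_6 that follows, can be sketched as one Python function. This is an illustrative reading of the two rules, not the specification itself: the name recv_oob, the tuple encoding of iobc, and the use of None for ∗ are all assumptions made for the example, and both rules additionally require a socket with no pending error.

```python
from typing import List, Optional, Tuple

# Hypothetical encoding of the three iobc states used by the rules.
NO_OOBDATA = ("NO_OOBDATA",)
HAD_OOBDATA = ("HAD_OOBDATA",)


def oobdata(c: int) -> Tuple[str, int]:
    """An out-of-band byte c waiting on the socket (OOBDATA c)."""
    return ("OOBDATA", c)


def recv_oob(iobc: tuple, rcvurp: Optional[int], oobinline: bool,
             n: int, peek: bool) -> tuple:
    """Outcome of recv() with MSG_OOB set.

    Returns ("OK", data, iobc') when recv_5 applies, or
    ("FAIL", error_name) when recv_6 applies.
    """
    if oobinline:
        # Out-of-band data is delivered inline, so MSG_OOB is invalid.
        return ("FAIL", "EINVAL")
    if iobc[0] == "OOBDATA":
        data: List[int] = [] if n == 0 else [iobc[1]]
        return ("OK", data, iobc if peek else HAD_OOBDATA)
    if iobc == HAD_OOBDATA:
        # The out-of-band byte has already been read.
        return ("FAIL", "EINVAL")
    # NO_OOBDATA: EAGAIN only if urgent data has been advertised.
    return ("FAIL", "EINVAL" if rcvurp is None else "EAGAIN")
```

Note that a request for zero bytes still consumes the out-of-band byte unless MSG_PEEK is set, which follows directly from the definitions of str and iobc′ in recv_5.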

Model details
The amount of data requested, n0, is clipped to a natural number from an integer, using clip_int_to_num. POSIX specifies an unsigned type for n0, whereas the model uses int.
The opts0 argument to recv() is of type msgbflag list, but it is converted to a set, opts, using list_to_set.
The data itself is represented as a byte list in the datagram but is returned as a string: the implode function is used to do the conversion.

recv_6 tcp: fast fail Fail with EAGAIN or EINVAL: recv() called with MSG_OOB set and out-of-band data is not available

h 〈[ts := ts ⊕ (tid ↦ (Run)d)]〉
−− tid·recv(fd, n0, opts0) −→
h 〈[ts := ts ⊕ (tid ↦ (Ret(FAIL e))sched_timer)]〉

n = clip_int_to_num n0 ∧
opts = list_to_set opts0 ∧
fd ∈ dom(h.fds) ∧
fid = h.fds[fd] ∧
h.files[fid] = File(FT_Socket(sid), ff) ∧
h.socks[sid] = Sock(↑fid, sf, ↑i1, ↑p1, ↑i2, ↑p2, ∗, cantsndmore, cantrcvmore, TCP_Sock(st, cb, ∗, sndq, sndurp, rcvq, rcvurp, iobc)) ∧
MSG_OOB ∈ opts ∧
(if sf.b(SO_OOBINLINE) then (e = EINVAL)
 else case iobc of
   NO_OOBDATA → (e = if rcvurp = ∗ then EINVAL else EAGAIN)
 ‖ OOBDATA c → F
 ‖ HAD_OOBDATA → (e = EINVAL))

Description
From thread tid, which is in the Run state, a recv(fd, n0, opts0) call is made. fd refers to a TCP socket identified by sid with binding quad (↑i1, ↑p1, ↑i2, ↑p2) and no pending error. The MSG_OOB flag is set in opts0, indicating that out-of-band data should be returned, but no out-of-band data is available because either: (1) out-of-band data is being returned in-line (the sf.b(SO_OOBINLINE) flag is set); (2) the out-of-band data on the socket has already been read; (3) there is no out-of-band data and the receive urgent pointer is not set; or (4) there is no out-of-band data but the urgent pointer is set, corresponding to the case where the peer has advertised urgent data but that data has yet to arrive.
The call fails with an EINVAL error in cases (1), (2), and (3); and an EAGAIN error in case (4), indicating that the recv() call should be made again to see if the data has now arrived. A tid·recv(fd, n0, opts0) transition is made, leaving the thread state Ret(FAIL e) where e is one of the above errors.

recv_7 tcp: fast fail Fail with ENOTCONN: socket not connected

h 〈[ts := ts ⊕ (tid ↦ (Run)d)]〉
−− tid·recv(fd, n0, opts0) −→
h 〈[ts := ts ⊕ (tid ↦ (Ret(FAIL ENOTCONN))sched_timer)]〉

fd ∈ dom(h.fds) ∧
fid = h.fds[fd] ∧
h.files[fid] = File(FT_Socket(sid), ff) ∧
sock = h.socks[sid] ∧
TCP_PROTO(tcp_sock) = sock.pr ∧
(tcp_sock.st = LISTEN ∨ (tcp_sock.st = CLOSED ∧ sock.cantrcvmore = F))

Description
From thread tid, which is in the Run state, a recv(fd, n0, opts0) call is made. fd refers to a TCP socket sock identified by sid which is either in the LISTEN state or is not shut down for reading in the CLOSED state.
The call fails with an ENOTCONN error. A tid·recv(fd, n0, opts0) transition is made, leaving the thread state Ret(FAIL ENOTCONN).

recv_8 tcp: fast fail Fail with pending error

h 〈[ts := ts ⊕ (tid ↦ (Run)d); socks := socks ⊕ [(sid, Sock(↑fid, sf, is1, ps1, is2, ps2, ↑e, cantsndmore, cantrcvmore, TCP_PROTO(tcp_sock)))]]〉
−− tid·recv(fd, n0, opts0) −→
h 〈[ts := ts ⊕ (tid ↦ (Ret(FAIL e))sched_timer); socks := socks ⊕ [(sid, Sock(↑fid, sf, is1, ps1, is2, ps2, es, cantsndmore, cantrcvmore, TCP_PROTO(tcp_sock)))]]〉

opts = list_to_set opts0 ∧
n = clip_int_to_num n0 ∧
fd ∈ dom(h.fds) ∧
fid = h.fds[fd] ∧
h.files[fid] = File(FT_Socket(sid), ff) ∧
((tcp_sock.st ∉ {CLOSED; LISTEN} ∧ is2 = ↑i2 ∧ ps2 = ↑p2) ∨ tcp_sock.st = CLOSED) ∧
(* We fail immediately if there is a pending error and we could not otherwise return data (see recv_1 (p209)).
*)
let rcvq = tcp_sock.rcvq in
let rcvurp = tcp_sock.rcvurp in
let blocking = ¬(MSG_DONTWAIT ∈ opts ∨ ff.b(O_NONBLOCK)) in
let have_all_data = (length rcvq ≥ n) in
let have_enough_data = (length rcvq ≥ sf.n(SO_RCVLOWAT)) in
let partial_data_ok = (MSG_WAITALL ∉ opts ∨ n > sf.n(SO_RCVBUF) ∨ (¬(bsd_arch h.arch) ∧ MSG_PEEK ∈ opts)) in
let urgent_data_ahead = (∃om. rcvurp = ↑om ∧ 0 < om ∧ om ≤ length rcvq) in
¬(have_all_data ∨ (have_enough_data ∧ partial_data_ok) ∨ urgent_data_ahead) ∧
(blocking ∨ rcvq = [ ]) ∧
es = if MSG_PEEK ∈ opts then ↑e else ∗

Description
From thread tid, which is in the Run state, a recv(fd, n0, opts0) call is made. fd refers to a TCP socket that either is in state CLOSED or is in a state other than CLOSED or LISTEN with peer address set to (↑i2, ↑p2). The socket has a pending error e.
The call cannot immediately return data because: (1) there are fewer than n bytes of data in the socket’s receive queue; (2) there are fewer than sf.n(SO_RCVLOWAT) bytes of data (the minimum number of bytes for socket recv() operations) in the socket’s receive queue, or the call must return all n bytes of data: (i) the MSG_WAITALL flag is set in opts0, (ii) the number of bytes requested is no greater than the size of the socket’s receive buffer, sf.n(SO_RCVBUF), and (iii) the MSG_PEEK flag is not set in opts0; (3) there is no urgent data ahead in the socket’s receive queue; and (4) either the call is a blocking one (the MSG_DONTWAIT flag is not set in opts0 and the socket’s O_NONBLOCK flag is not set), or the socket’s receive queue is empty.
The call fails, returning the pending error. A tid·recv(fd, n0, opts0) transition is made, leaving the thread state Ret(FAIL e). If the MSG_PEEK flag was set in opts0 then the socket’s pending error remains, otherwise it is cleared.

Model details
The opts0 argument to recv() is of type msgbflag list, but it is converted to a set, opts, using list_to_set.

Variations
FreeBSD: In case (iii) above, the MSG_PEEK flag may be set in opts0.
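The side condition of recv_8 can be read as a small decision procedure. The sketch below is a hedged Python transcription of the let-bound predicates. The function names, the string encoding of flags and errors, and the use of None for ∗ are assumptions made for the example; it covers only the data-availability and blocking conditions, not the socket-state constraints of the rule.

```python
from typing import Optional, Set, Tuple


def can_return_data(rcvq_len: int, rcvurp: Optional[int], n: int,
                    opts: Set[str], rcvlowat: int, rcvbuf: int,
                    bsd: bool) -> bool:
    """The disjunction negated in recv_8 (cantrcvmore is not part of it)."""
    have_all_data = rcvq_len >= n
    have_enough_data = rcvq_len >= rcvlowat
    partial_data_ok = ("MSG_WAITALL" not in opts
                       or n > rcvbuf
                       or (not bsd and "MSG_PEEK" in opts))
    urgent_data_ahead = rcvurp is not None and 0 < rcvurp <= rcvq_len
    return (have_all_data
            or (have_enough_data and partial_data_ok)
            or urgent_data_ahead)


def pending_error_outcome(es: Optional[str], rcvq_len: int,
                          rcvurp: Optional[int], n: int, opts: Set[str],
                          o_nonblock: bool, rcvlowat: int, rcvbuf: int,
                          bsd: bool) -> Optional[Tuple[str, Optional[str]]]:
    """Return (error, es') if recv_8 fires, or None if it does not."""
    if es is None:
        return None  # no pending error to report
    blocking = not ("MSG_DONTWAIT" in opts or o_nonblock)
    if can_return_data(rcvq_len, rcvurp, n, opts, rcvlowat, rcvbuf, bsd):
        return None  # data wins over the pending error
    if not (blocking or rcvq_len == 0):
        return None
    # MSG_PEEK leaves the pending error in place; otherwise it is cleared.
    return es, (es if "MSG_PEEK" in opts else None)
```

The final case in the tests below shows why the (blocking ∨ rcvq = [ ]) conjunct matters: a non-blocking call on a socket with some, but insufficient, queued data is not handled by this rule.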

recv_8a tcp: slow urgent fail Fail with pending error from blocked state

h 〈[ts := ts ⊕ (tid ↦ (Recv2(sid, n, opts))d); socks := socks ⊕ [(sid, sock 〈[es := ↑e; pr := TCP_PROTO(tcp_sock)]〉)]]〉
−− τ −→
h 〈[ts := ts ⊕ (tid ↦ (Ret(FAIL e))sched_timer); socks := socks ⊕ [(sid, sock 〈[es := es; pr := TCP_PROTO(tcp_sock)]〉)]]〉

(* We fail now if there is a pending error and we could not otherwise return data (see recv_1 (p209)). *)
let have_all_data = (length tcp_sock.rcvq ≥ n) in
let have_enough_data = (length tcp_sock.rcvq ≥ sock.sf.n(SO_RCVLOWAT)) in
let partial_data_ok = (MSG_WAITALL ∉ opts ∨ n > sock.sf.n(SO_RCVBUF) ∨ (¬(bsd_arch h.arch) ∧ MSG_PEEK ∈ opts)) in
let urgent_data_ahead = (∃om. tcp_sock.rcvurp = ↑om ∧ 0 < om ∧ om ≤ length tcp_sock.rcvq) in
¬(have_all_data ∨ (have_enough_data ∧ partial_data_ok) ∨ urgent_data_ahead) ∧
(es = if MSG_PEEK ∈ opts then ↑e else ∗)

Description
Thread tid is blocked in state Recv2(sid, n, opts) where sid identifies a socket with pending error ↑e. The call fails, returning the pending error.
Data cannot be returned because: (1) there are fewer than n bytes of data in the socket’s receive queue; (2) there are fewer than sf.n(SO_RCVLOWAT) bytes of data (the minimum number of bytes for socket recv() operations) in the socket’s receive queue, or the call must return all n bytes of data: (i) the MSG_WAITALL flag is set in opts, (ii) the number of bytes requested is no greater than the size of the socket’s receive buffer, sf.n(SO_RCVBUF), and (iii) the MSG_PEEK flag is not set in opts; and (3) there is no urgent data ahead in the socket’s receive queue.
The thread returns from the blocked state, returning the pending error. A τ transition is made, leaving the thread state Ret(FAIL e). If the MSG_PEEK flag was set in opts then the socket’s pending error remains, otherwise it is cleared.

Variations
FreeBSD: In case (iii) above, the MSG_PEEK flag may be set in opts.

recv_9 tcp: fast fail Fail with ESHUTDOWN: socket shut down for reading on WinXP

h 〈[ts := ts ⊕ (tid ↦ (Run)d); socks := socks ⊕ [(sid, sock 〈[cantrcvmore := T; pr := TCP_PROTO(tcp_sock)]〉)]]〉
−− tid·recv(fd, n, opts) −→
h 〈[ts := ts ⊕ (tid ↦ (Ret(FAIL ESHUTDOWN))sched_timer); socks := socks ⊕ [(sid, sock 〈[cantrcvmore := T; pr := TCP_PROTO(tcp_sock)]〉)]]〉

windows_arch h.arch ∧
fd ∈ dom(h.fds) ∧
fid = h.fds[fd] ∧
h.files[fid] = File(FT_Socket(sid), ff)

Description
On WinXP, from thread tid, which is in the Run state, a recv(fd, n, opts) call is made where fd refers to a TCP socket sid which is shut down for reading. The call fails with an ESHUTDOWN error.
A tid·recv(fd, n, opts) transition is made, leaving the thread state Ret(FAIL ESHUTDOWN).

Variations
FreeBSD: This rule does not apply.
Linux: This rule does not apply.

15.20 recv() (UDP only)

recv : (fd ∗ int ∗ msgbflag list) → (string ∗ ((ip ∗ port) ∗ bool) option)

A call to recv(fd, n, opts) returns data from the datagram on the head of a socket’s receive queue. This section describes the behaviour for UDP sockets. Here the fd argument is a file descriptor referring to the socket to receive data from, n specifies the number of bytes of data to read from that socket, and the opts argument is a list of flags for the recv() call. The possible flags are:
• MSG_DONTWAIT: non-blocking behaviour is requested for this call. This flag only has effect on Linux. FreeBSD and WinXP ignore it. See rules recv_12 and recv_13.
• MSG_PEEK: return data from the datagram on the head of the receive queue, without removing that datagram from the receive queue.
• MSG_WAITALL: do not return until all n bytes of data have been read. Linux and FreeBSD ignore this flag.
  WinXP fails with EOPNOTSUPP as this is not meaningful for UDP sockets: the returned data is from only one datagram.
• MSG_OOB: return out-of-band data. This flag is ignored on Linux. On WinXP and FreeBSD the call fails with EOPNOTSUPP as out-of-band data is not meaningful for UDP sockets.

The returned value of the recv() call, (string ∗ ((ip ∗ port) ∗ bool) option), consists of the data read from the socket (the string), the source address of the data (the ip ∗ port), and a flag specifying whether or not all of the datagram’s data was read (the bool). The latter two components are wrapped in an option type (for type compatibility with the TCP recv()) but are always returned for UDP. The flag only has meaning on WinXP and should be ignored on FreeBSD and Linux.

For a socket to receive data, it must be bound to a local port. On Linux and FreeBSD, if the socket is not bound to a local port, then it is autobound to an ephemeral port when the recv() call is made. On WinXP, calling recv() on a socket that is not bound to a local port is an EINVAL error.

If a non-blocking recv() call is made (the socket’s O_NONBLOCK flag is set) and there are no datagrams on the socket’s receive queue, then the call will fail with EAGAIN. If the call is a blocking one and the socket’s receive queue is empty then the call will block, returning when a datagram arrives or an error occurs.

If the socket has a pending error then on FreeBSD and Linux, the call will fail with that error. On WinXP, errors from ICMP messages are placed on the socket’s receive queue, and so the error will only be returned when that message is at the head of the receive queue.

15.20.1 Errors

A call to recv() can fail with the errors below, in which case the corresponding exception is raised.

EAGAIN       The call would block and non-blocking behaviour is requested. This is done either via the MSG_DONTWAIT flag being set in the recv() flags or the socket’s O_NONBLOCK flag being set.
EMSGSIZE     The amount of data requested in the recv() call on WinXP is less than the amount of data in the datagram on the head of the receive queue.
EOPNOTSUPP   Operation not supported: out-of-band data is requested on FreeBSD and WinXP, or the MSG_WAITALL flag is set on a recv() call on WinXP.
ESHUTDOWN    On WinXP, a recv() call is made on a socket that has been shut down for reading.
EBADF        The file descriptor passed is not a valid file descriptor.
ENOTSOCK     The file descriptor passed does not refer to a socket.
EINTR        The system was interrupted by a caught signal.
ENOBUFS      Out of resources.
ENOMEM       Out of resources.

15.20.2 Common cases

A UDP socket is created and bound to a local address. Other calls are made and datagrams are delivered to the socket; recv() is called to read from a datagram:
socket_1; return_1; bind_1; ... recv_11; return_1;

A UDP socket is created and bound to a local address. recv() is called and blocks; a datagram arrives addressed to the socket’s local address and is placed on its receive queue; the call returns:
socket_1; return_1; bind_1; ... recv_12; deliver_in_99; deliver_in_udp_1; recv_15; return_1;

15.20.3 API

Posix:
  ssize_t recvfrom(int socket, void *restrict buffer, size_t length, int flags,
                   struct sockaddr *restrict address, socklen_t *restrict address_len);
FreeBSD:
  ssize_t recvfrom(int s, void *buf, size_t len, int flags,
                   struct sockaddr *from, socklen_t *fromlen);
Linux:
  int recvfrom(int s, void *buf, size_t len, int flags,
               struct sockaddr *from, socklen_t *fromlen);
WinXP:
  int recvfrom(SOCKET s, char* buf, int len, int flags,
               struct sockaddr* from, int* fromlen);

In the Posix interface:
• socket is the file descriptor of the socket to receive from, corresponding to the fd argument of the model recv().
• buffer is a pointer to a buffer to place the received data in, which upon return contains the data received on the socket. This corresponds to the string return value of the model recv().
• length is the amount of data to be read from the socket, corresponding to the int argument of the model recv(); it should be at most the length of buffer.
• flags is a disjunction of the message flags that are set for the call, corresponding to the msgbflag list argument of the model recv().
• address is a pointer to a sockaddr structure of length address_len, which upon return contains the source address of the data received by the socket, corresponding to the (ip ∗ port) in the return value of the model recv(). For the AF_INET sockets used in the model, it is actually a sockaddr_in that is used: the sin_addr.s_addr field corresponds to the ip and the sin_port field corresponds to the port.
• the returned ssize_t is either non-negative, in which case it is the amount of data that was received by the socket, or it is -1 to indicate an error, in which case the error code is in errno. On WinXP an error is indicated by a return value of SOCKET_ERROR, not -1, with the actual error code available through a call to WSAGetLastError().

On WinXP, if the data from a datagram is not all read then the call fails with EMSGSIZE, but still fills the buffer with data. This is modelled by the bool flag in the model recv(): if it is set to T then the call succeeded and read all of the datagram’s data; if it is set to F then the call failed with EMSGSIZE but still returned data.

There are other functions used to receive data on a socket. recv() is similar to recvfrom() except it does not have the address and address_len arguments. It is used when the source address of the data does not need to be returned from the call.
recvmsg(), another input function, is a more general form of recvfrom().

15.20.4 Model details

If the call blocks then the thread enters state Recv2(sid, n, opts) where:
• sid : sid is the identifier of the socket that the recv() call was made on,
• n : num is the number of bytes to be read, and
• opts : msgbflag list is the set of message flags.

The following errors are not modelled:
• On FreeBSD, Linux, and WinXP, EFAULT can be returned if the buffer parameter points to memory not in a valid part of the process address space. This is an artefact of the C interface to recvfrom() that is excluded by the clean interface used in the model recv().
• In Posix, EIO may be returned to indicate that an I/O error occurred while reading from or writing to the file system; this is not modelled here.
• EINVAL may be returned if the MSG_OOB flag is set and no out-of-band data is available; out-of-band data does not exist for UDP so this does not apply.
• ENOTCONN may be returned if the socket is not connected; this does not apply for UDP as the socket need not have a peer specified to receive datagrams.
• ETIMEDOUT can be returned due to a transmission timeout on a connection; UDP is not connection-oriented so this does not apply.
• WSAEINPROGRESS is WinXP-specific and described in the MSDN page as "A blocking Windows Sockets 1.1 call is in progress, or the service provider is still processing a callback function". This is not modelled here.

The following Linux message flags are not modelled: MSG_NOSIGNAL, MSG_TRUNC, and MSG_ERRQUEUE.

15.20.5 Summary

recv_11  udp: fast succeed         Receive data successfully without blocking
recv_12  udp: block                Block, entering Recv2 state as no datagrams available on socket
recv_13  udp: fast fail            Fail with EAGAIN: call would block and socket is non-blocking or, on Linux, non-blocking behaviour has been requested with the MSG_DONTWAIT flag
recv_14  udp: fast fail            Fail with EAGAIN, EADDRNOTAVAIL, or ENOBUFS: there are no ephemeral ports left
recv_15  udp: slow urgent succeed  Blocked call returns from Recv2 state with data
recv_16  udp: fast fail            Fail with EOPNOTSUPP: MSG_WAITALL flag not supported on WinXP, or MSG_OOB flag not supported on FreeBSD and WinXP
recv_17  udp: rc                   Socket shutdown for reading: fail with ESHUTDOWN on WinXP or succeed on Linux and FreeBSD
recv_20  udp: rc                   Successful partial read of datagram on head of socket’s receive queue on WinXP
recv_21  udp: fast succeed         Read zero bytes of data from an empty receive queue on FreeBSD
recv_22  udp: fast fail            Fail with EINVAL on WinXP: socket is unbound
recv_23  udp: rc                   Read ICMP error from receive queue and fail with that error on WinXP
recv_24  udp: fast fail            Fail with pending error

15.20.6 Rules

recv_11 udp: fast succeed Receive data successfully without blocking

h 〈[ts := ts ⊕ (tid ↦ (Run)d); socks := socks ⊕ [(sid, sock 〈[pr := UDP_Sock(rcvq)]〉)]]〉
−− tid·recv(fd, n0, opts0) −→
h 〈[ts := ts ⊕ (tid ↦ (Ret(OK(implode data′, ↑((i3, ps3), b))))sched_timer); socks := socks ⊕ [(sid, sock)]]〉

fd ∈ dom(h.fds) ∧
fid = h.fds[fd] ∧
h.files[fid] = File(FT_Socket(sid), ff) ∧
sock = Sock(↑fid, sf, is1, ↑p1, is2, ps2, ∗, cantsndmore, cantrcvmore, UDP_Sock(rcvq′)) ∧
(¬(linux_arch h.arch) ⟹ cantrcvmore = F) ∧
rcvq = (Dgram_msg(〈[is := i3; ps := ps3; data := data]〉)) :: rcvq′′ ∧
n = clip_int_to_num n0 ∧
((length data ≤ n ∧ data = data′) ∨
 (length data > n ∧ data′ = TAKE n data ∧ length data′ = n ∧
  ¬(windows_arch h.arch))) ∧
(windows_arch h.arch ⟹ b = T) ∧
opts = list_to_set opts0 ∧
rcvq′ = (if MSG_PEEK ∈ opts then rcvq else rcvq′′)

Description
Consider a UDP socket sid, referenced by fd. It is not shut down for reading, has no pending errors, and is bound to local port p1. Thread tid is in the Run state.
The socket’s receive queue has a datagram at its head with data data and source address i3, ps3. A call recv(fd, n0, opts0), from thread tid, succeeds.
A tid·recv(fd, n0, opts0) transition is made. The thread is left in state Ret(OK(implode data′, ↑((i3, ps3), b))), where data′ is either:
• all of the data in the datagram, data, if the amount of data requested n0 is greater than or equal to the amount of data in the datagram, or
• the first n0 bytes of data if n0 is less than the amount of data in the datagram, unless the architecture is WinXP (see below).
If the MSG_PEEK option is set in opts0 then the entire datagram stays on the receive queue; the next call to recv() will be able to access this datagram. Otherwise, the entire datagram is discarded from the receive queue, even if all of its data has not been read.

Model details
The amount of data requested, n0, is clipped to a natural number from an integer, using clip_int_to_num. POSIX specifies an unsigned type for n0 and this is one possible model thereof.
The opts0 argument to recv() is of type msgbflag list, but it is converted to a set, opts, using list_to_set.
The data itself is represented as a byte list in the datagram but is returned as a string: the implode function is used to do the conversion.

Variations
WinXP: The amount of data in bytes requested, n0, must be greater than or equal to the number of bytes of data in the datagram on the head of the receive queue. The boolean b equals T, indicating that all of the datagram’s data has been read. Otherwise refer to rule recv_20.
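The datagram handling in recv_11 can be illustrated with a short Python sketch. It is an informal reading of the rule under stated assumptions: the function name udp_recv and the representation of a datagram as a (source ip, source port, payload) tuple are hypothetical, and the boolean b of the model's return value is omitted because the rule constrains it only on WinXP.

```python
from typing import List, Optional, Tuple

Dgram = Tuple[str, int, bytes]  # (source ip, source port, payload)


def udp_recv(rcvq: List[Dgram], n: int, peek: bool,
             windows: bool) -> Optional[Tuple[bytes, Tuple[str, int], List[Dgram]]]:
    """Return (data', source address, rcvq') if recv_11 applies, else None."""
    if not rcvq:
        return None  # empty queue: the blocking or EAGAIN rules apply instead
    ip, port, data = rcvq[0]
    if len(data) > n:
        if windows:
            # Partial reads on WinXP are covered by recv_20, not recv_11.
            return None
        out = data[:n]  # TAKE n data; the rest of the datagram is lost
    else:
        out = data
    # The whole datagram leaves the queue unless MSG_PEEK is set,
    # even if only part of its data was returned.
    new_rcvq = rcvq if peek else rcvq[1:]
    return out, (ip, port), new_rcvq
```

A truncated read on a non-WinXP host therefore discards the unread tail of the datagram: a following recv() sees the next datagram, not the remaining bytes.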
recv_12 udp: block Block, entering Recv2 state as no datagrams available on socket

  h0
  ── tid·recv(fd, n0, opts0) ──→
  h0 〈[ts := ts ⊕ (tid ↦ (Recv2(sid, n, opts))_never_timer);
       socks := h0.socks ⊕ [(sid, sock 〈[ps1 := ↑p1′]〉)];
       bound := bound]〉

  h0 = h 〈[ts := ts ⊕ (tid ↦ (Run)_d);
           socks := socks ⊕ [(sid, sock)]]〉 ∧
  fd ∈ dom(h0.fds) ∧
  fid = h0.fds[fd] ∧
  h0.files[fid] = File(FT_Socket(sid), ff) ∧
  sock = Sock(↑fid, sf, is1, ps1, is2, ps2, ∗, cantsndmore, F, UDP_Sock([ ])) ∧
  p1′ ∈ autobind(sock.ps1, PROTO_UDP, h0.socks) ∧
  (if sock.ps1 = ∗ then bound = sid :: h0.bound else bound = h0.bound) ∧
  ¬((MSG_DONTWAIT ∈ opts ∧ linux_arch h.arch) ∨ ff.b(O_NONBLOCK)) ∧
  (bsd_arch h.arch ⟹ ¬(n = 0)) ∧
  n = clip_int_to_num n0 ∧
  opts = list_to_set opts0

Description
Consider a UDP socket sid, referenced by fd, that has no pending errors, is not shut down for reading, has an empty receive queue, and does not have its O_NONBLOCK flag set. The socket is either bound to a local port ↑p1′ or can be autobound to a local port ↑p1′.

From thread tid, which is in the Run state, a recv(fd, n0, opts0) call is made. Because there are no datagrams on the socket’s receive queue, the call will block.

A tid·recv(fd, n0, opts0) transition will be made, leaving the thread in state Recv2(sid, n, opts). If autobinding occurred then sid will be placed on the head of the host’s list of bound sockets: bound = sid :: h0.bound.

Model details
The amount of data requested, n0, is clipped to a natural number n from an integer, using clip_int_to_num. POSIX specifies an unsigned type for n0 and this is one possible model thereof.

The opts0 argument to recv() is of type msgbflag list, but it is converted to a set, opts, using list_to_set.

Variations
FreeBSD   As above, with the added condition that the number of bytes requested to be read is not zero.
Linux     As above, with the added condition that the MSG_DONTWAIT flag is not set in opts0.
recv_13 udp: fast fail Fail with EAGAIN: call would block and socket is non-blocking or, on Linux, non-blocking behaviour has been requested with the MSG_DONTWAIT flag

  h0
  ── tid·recv(fd, n, opts0) ──→
  h 〈[ts := ts ⊕ (tid ↦ (Ret(FAIL EAGAIN))_sched_timer);
      socks := socks ⊕ [(sid, s 〈[es := ∗; pr := UDP_Sock([ ])]〉)]]〉

  h0 = h 〈[ts := ts ⊕ (tid ↦ (Run)_d);
           socks := socks ⊕ [(sid, s 〈[es := ∗; pr := UDP_Sock([ ])]〉)]]〉 ∧
  fd ∈ dom(h0.fds) ∧
  fid = h0.fds[fd] ∧
  h0.files[fid] = File(FT_Socket(sid), ff) ∧
  opts = list_to_set opts0 ∧
  ((MSG_DONTWAIT ∈ opts ∧ linux_arch h.arch) ∨ ff.b(O_NONBLOCK))

Description
Consider a UDP socket sid referenced by fd. It has no pending errors and an empty receive queue. The socket is non-blocking: its O_NONBLOCK flag has been set.

From thread tid, in the Run state, a recv(fd, n, opts0) call is made. The call would block because the socket has an empty receive queue, so the call fails with an EAGAIN error.

A tid·recv(fd, n, opts0) transition is made, leaving the thread in state Ret(FAIL EAGAIN).

Model details
The opts0 argument is of type msgbflag list. In the model it is converted to a set, opts, using list_to_set.

Variations
Linux     As above, but the rule also applies if the socket’s O_NONBLOCK flag is not set but the MSG_DONTWAIT flag is set in opts0. Also, note that EWOULDBLOCK and EAGAIN are aliased on Linux.
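The blocking conditions of recv_12 and recv_13, together with the FreeBSD zero-length read recv_21 listed in the summary, decide what happens to a call on an empty receive queue. The Python sketch below collects the rules whose side conditions hold; because the specification is relational, more than one rule may apply. The function name, the architecture strings "linux", "bsd" and "winxp", and the string flag are assumptions of this example, and it further assumes a bound socket with no pending error that is not shut down for reading.

```python
from typing import Set


def empty_queue_rules(arch: str, o_nonblock: bool, opts: Set[str], n: int) -> Set[str]:
    """Rules applicable to recv() on an empty UDP receive queue (sketch)."""
    # MSG_DONTWAIT only requests non-blocking behaviour on Linux.
    nonblocking = o_nonblock or (arch == "linux" and "MSG_DONTWAIT" in opts)
    rules = set()
    if not nonblocking and not (arch == "bsd" and n == 0):
        rules.add("recv_12")    # block, entering Recv2
    if nonblocking:
        rules.add("recv_13")    # fail with EAGAIN
    if arch == "bsd" and n == 0:
        rules.add("recv_21")    # FreeBSD: zero-byte read succeeds
    return rules
```

Note that on FreeBSD a zero-byte request on a non-blocking socket satisfies the conditions of both recv_13 and recv_21 in this reading of the rules.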
recv_14 udp: fast fail Fail with EAGAIN, EADDRNOTAVAIL, or ENOBUFS: there are no ephemeral ports left

  h0
  ── tid·recv(fd, n, opts) ──→
  h0 〈[ts := ts ⊕ (tid ↦ (Ret(FAIL e))_sched_timer)]〉

  h0 = h 〈[ts := ts ⊕ (tid ↦ (Run)_d);
           socks := socks ⊕
             [(sid, Sock(↑fid, sf, ∗, ∗, ∗, ∗, ∗, cantsndmore, cantrcvmore, UDP_Sock([ ])))]]〉 ∧
  autobind(∗, PROTO_UDP, h0.socks) = ∅ ∧
  e ∈ {EAGAIN; EADDRNOTAVAIL; ENOBUFS} ∧
  fd ∈ dom(h0.fds) ∧
  fid = h0.fds[fd] ∧
  h0.files[fid] = File(FT_Socket(sid), ff)

Description
Consider a UDP socket sid, referenced by fd. The socket has no pending errors, an empty receive queue, and binding quad ∗, ∗, ∗, ∗.

From thread tid, which is in the Run state, a recv(fd, n, opts) call is made. There is no ephemeral port to autobind the socket to, so the call fails with either EAGAIN, EADDRNOTAVAIL or ENOBUFS.

A tid·recv(fd, n, opts) transition is made, leaving the thread in state Ret(FAIL e) where e is one of the above errors.

recv_15 udp: slow urgent succeed Blocked call returns from Recv2 state with data

  h 〈[ts := ts ⊕ (tid ↦ (Recv2(sid, n, opts))_d);
      socks := socks ⊕ [(sid, sock 〈[ps1 := ↑p1; es := ∗; pr := UDP_Sock(rcvq)]〉)]]〉
  ── τ ──→
  h 〈[ts := ts ⊕ (tid ↦ (Ret(OK(implode data′, ↑((i3, ps3), b))))_sched_timer);
      socks := socks ⊕ [(sid, sock 〈[ps1 := ↑p1; es := ∗; pr := UDP_Sock(rcvq′)]〉)]]〉

  rcvq = (Dgram_msg(〈[is := i3; ps := ps3; data := data]〉)) :: rcvq′′ ∧
  (rcvq′ = if MSG_PEEK ∈ opts then rcvq else rcvq′′) ∧
  ((length data ≤ n ∧ data = data′) ∨
   (length data > n ∧ ¬(windows_arch h.arch) ∧ data′ = TAKE n data ∧ length data′ = n)) ∧
  (windows_arch h.arch ⟹ b = T)

Description
Consider a UDP socket sid with no pending errors and bound to local port p1. At the head of the socket’s receive queue, rcvq, is a UDP datagram with source address (i3, ps3) and data data. Thread tid is blocked in state Recv2(sid, n, opts).
The blocked call successfully returns (implode data′, ↑((i3, ps3), b)). If the number of bytes requested, n, is greater than or equal to the number of bytes of data in the datagram, data, then all of data is returned. If n is less than the number of bytes in the datagram, then the first n bytes of data are returned.

A τ transition is made, leaving the thread in state Ret(OK(implode data′, ↑((i3, ps3), b))). If the MSG_PEEK flag was set in opts then the datagram stays on the head of the socket’s receive queue; otherwise, it is discarded from the receive queue.

Variations
WinXP     As above, except the number of bytes of data requested, n, must be greater than or equal to the length in bytes of data. The boolean b equals T, indicating that all of the datagram’s data was read.

recv_16 udp: fast fail Fail with EOPNOTSUPP: MSG_WAITALL flag not supported on WinXP, or MSG_OOB flag not supported on FreeBSD and WinXP

  h 〈[ts := ts ⊕ (tid ↦ (Run)_d);
      socks := socks ⊕ [(sid, sock 〈[pr := UDP_PROTO(udp)]〉)]]〉
  ── tid·recv(fd, n0, opts0) ──→
  h 〈[ts := ts ⊕ (tid ↦ (Ret(FAIL EOPNOTSUPP))_sched_timer);
      socks := socks ⊕ [(sid, sock 〈[pr := UDP_PROTO(udp)]〉)]]〉

  fd ∈ dom(h.fds) ∧
  fid = h.fds[fd] ∧
  h.files[fid] = File(FT_Socket(sid), ff) ∧
  opts = list_to_set opts0 ∧
  ((MSG_OOB ∈ opts ∧ ¬(linux_arch h.arch)) ∨
   (MSG_WAITALL ∈ opts ∧ windows_arch h.arch))

Description
Consider a UDP socket sid referenced by fd. From thread tid, in the Run state, a recv(fd, n0, opts0) call is made. The MSG_OOB or MSG_WAITALL flags are set in opts0. The call fails with an EOPNOTSUPP error.

A tid·recv(fd, n0, opts0) transition is made, leaving the thread in state Ret(FAIL EOPNOTSUPP).

Model details
The opts0 argument is of type msgbflag list. In the model it is converted to a set, opts, using list_to_set.

Variations
Posix     As above, except the rule only applies when MSG_OOB is set in opts0.
FreeBSD   As above, except the rule only applies when MSG_OOB is set in opts0.
Linux     This rule does not apply.

recv_17 udp: rc Socket shut down for reading: fail with ESHUTDOWN on WinXP or succeed on Linux and FreeBSD

  h 〈[ts := ts ⊕ (tid ↦ (Run)_d);
      socks := socks ⊕ [(sid, sock 〈[cantrcvmore := T; pr := UDP_Sock(rcvq)]〉)]]〉
  ── tid·recv(fd, n0, opts0) ──→
  h 〈[ts := ts ⊕ (tid ↦ (Ret(ret))_sched_timer);
      socks := socks ⊕ [(sid, sock 〈[cantrcvmore := T; pr := UDP_Sock(rcvq)]〉)]]〉

  fd ∈ dom(h.fds) ∧
  fid = h.fds[fd] ∧
  h.files[fid] = File(FT_Socket(sid), ff) ∧
  if windows_arch h.arch then
    ret = FAIL (ESHUTDOWN) ∧ rc = fast fail
  else if bsd_arch h.arch then
    ret = OK(“”, ↑((∗, ∗), b)) ∧ rc = fast succeed ∧ sock.es = ∗
  else if linux_arch h.arch then
    rcvq = [ ] ∧ ret = OK(“”, ↑((∗, ∗), b)) ∧ rc = fast succeed ∧ sock.es = ∗
  else ASSERTION_FAILURE “recv_17”

Description
Consider a UDP socket sid, referenced by fd, that has been shut down for reading. From thread tid, which is in the Run state, a recv(fd, n0, opts0) call is made. On FreeBSD and Linux, if the socket has no pending error the call succeeds, returning (“”, ↑((∗, ∗), b)); on WinXP the call fails with an ESHUTDOWN error.

A tid·recv(fd, n0, opts0) transition is made, leaving the thread in state Ret(OK(“”, ↑((∗, ∗), b))) on FreeBSD and Linux, or Ret(FAIL ESHUTDOWN) on WinXP.

Variations
FreeBSD   As above: the call succeeds.
Linux     As above: the call succeeds with the additional condition that the socket has an empty receive queue.
WinXP     As above: the call fails with an ESHUTDOWN error.
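The per-architecture case split in recv_17 can be read as a small decision function. The following Python sketch is illustrative only: the function name `recv_17_outcome`, the architecture strings, and the string encoding of results are assumptions, and `None` is used here to mean "recv_17 does not apply".

```python
from typing import List, Optional


def recv_17_outcome(arch: str, rcvq: List[bytes],
                    pending_error: Optional[str]) -> Optional[str]:
    """Outcome of recv() on a UDP socket shut down for reading (sketch)."""
    if arch == "winxp":
        return "FAIL ESHUTDOWN"          # fast fail, regardless of queue contents
    if pending_error is not None:
        return None                      # sock.es = * is required off WinXP
    if arch == "bsd":
        return 'OK ""'                   # fast succeed with the empty string
    if arch == "linux":
        # Linux additionally requires an empty receive queue.
        return 'OK ""' if not rcvq else None
    raise ValueError("unknown architecture: " + arch)
```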
recv_20 udp: rc Successful partial read of datagram on head of socket’s receive queue on WinXP

  h 〈[ts := ts ⊕ (tid ↦ (t)_d);
      socks := socks ⊕ [(sid, sock 〈[pr := UDP_Sock(rcvq)]〉)]]〉
  ── lbl ──→
  h 〈[ts := ts ⊕ (tid ↦ (Ret(OK(implode data′, ↑((i3, ps3), F))))_sched_timer);
      socks := socks ⊕ [(sid, sock)]]〉

  windows_arch h.arch ∧
  rcvq = (Dgram_msg(〈[is := i3; ps := ps3; data := data]〉)) :: rcvq′′ ∧
  sock = Sock(↑fid, sf, is1, ↑p1, is2, ps2, ∗, cantsndmore, cantrcvmore, UDP_Sock(rcvq′)) ∧
  ((∃fd ff n n0 opts0.
      fd ∈ dom(h.fds) ∧
      fid = h.fds[fd] ∧
      h.files[fid] = File(FT_Socket(sid), ff) ∧
      (rcvq′ = if MSG_PEEK ∈ (list_to_set opts0) then rcvq else rcvq′′) ∧
      n = clip_int_to_num n0 ∧
      n < length data ∧
      data′ = TAKE n data ∧
      t = Run ∧
      rc = fast succeed ∧
      lbl = tid·recv(fd, n0, opts0)) ∨
   (∃n opts.
      lbl = τ ∧
      t = Recv2(sid, n, opts) ∧
      rc = slow urgent succeed ∧
      data′ = TAKE n data ∧
      n < length data ∧
      rcvq′ = if MSG_PEEK ∈ opts then rcvq else rcvq′′))

Description
On WinXP, consider a UDP socket sid bound to a local port p1 and with no pending errors. At the head of the socket’s receive queue is a datagram with source address (i3, ps3) and data data. This rule covers two cases.

In the first, from thread tid, which is in the Run state, a recv(fd, n0, opts0) call is made where fd refers to the socket sid. The amount of data to be read, n0 bytes, is less than the number of bytes of data in the datagram, data. The call successfully returns the first n0 bytes of data from the datagram, data′. A tid·recv(fd, n0, opts0) transition is made, leaving the thread in state Ret(OK(implode data′, ↑((i3, ps3), F))) where the F indicates that not all of the datagram’s data was read. The datagram is discarded from the socket’s receive queue unless the MSG_PEEK flag was set in opts0, in which case the whole datagram remains on the socket’s receive queue.
In the second case, thread tid is blocked in state Recv2(sid, n, opts) where the number of bytes to be read, n, is less than the number of bytes of data in the datagram. There is now data to be read so a τ transition is made, leaving the thread in state Ret(OK(implode data′, ↑((i3, ps3), F))) where the F indicates that not all of the datagram’s data was read. The datagram is discarded from the socket’s receive queue unless the MSG_PEEK flag was set in opts, in which case the whole datagram remains on the socket’s receive queue.

Model details
The amount of data requested, n0, is clipped to a natural number from an integer, using clip_int_to_num. POSIX specifies an unsigned type for n0 and this is one possible model thereof.

The data itself is represented as a byte list in the datagram but is returned as a string, so the implode function is used to do the conversion.

In the model the return value is OK(implode data′, ↑((i3, ps3), F)) where the F indicates that not all of the data in the datagram at the head of the socket’s receive queue was read. What actually happens is that an EMSGSIZE error is returned, and the data is put into the read buffer specified when the recv() call was made.

Variations
Posix     This rule does not apply.
FreeBSD   This rule does not apply.
Linux     This rule does not apply.

recv_21 udp: fast succeed Read zero bytes of data from an empty receive queue on FreeBSD

  h 〈[ts := ts ⊕ (tid ↦ (Run)_d);
      socks := socks ⊕ [(sid, sock 〈[pr := UDP_Sock([ ])]〉)]]〉
  ── tid·recv(fd, n0, opts0) ──→
  h 〈[ts := ts ⊕ (tid ↦ (Ret(OK(“”, ↑((∗, ∗), b))))_sched_timer);
      socks := socks ⊕ [(sid, sock 〈[pr := UDP_Sock([ ])]〉)]]〉

  bsd_arch h.arch ∧
  fd ∈ dom(h.fds) ∧
  fid = h.fds[fd] ∧
  h.files[fid] = File(FT_Socket(sid), ff) ∧
  0 = clip_int_to_num n0

Description
On FreeBSD, consider a UDP socket sid, referenced by fd, with an empty receive queue.
From thread tid, which is in the Run state, a recv(fd, n0, opts0) call is made where n0 = 0. The call succeeds, returning the empty string and not specifying an address: OK(“”, ↑((∗, ∗), b)).

A tid·recv(fd, n0, opts0) transition is made, leaving the thread in state Ret(OK(“”, ↑((∗, ∗), b))).

Variations
Posix     This rule does not apply: see rules recv_12 and recv_13.
Linux     This rule does not apply: see rules recv_12 and recv_13.
WinXP     This rule does not apply: see rules recv_12 and recv_13.

recv_22 udp: fast fail Fail with EINVAL on WinXP: socket is unbound

  h 〈[ts := ts ⊕ (tid ↦ (Run)_d);
      socks := socks ⊕ [(sid, sock 〈[ps1 := ∗; pr := UDP_PROTO(udp)]〉)]]〉
  ── tid·recv(fd, n0, opts0) ──→
  h 〈[ts := ts ⊕ (tid ↦ (Ret(FAIL EINVAL))_sched_timer);
      socks := socks ⊕ [(sid, sock 〈[ps1 := ∗; pr := UDP_PROTO(udp)]〉)]]〉

  windows_arch h.arch ∧
  fd ∈ dom(h.fds) ∧
  fid = h.fds[fd] ∧
  h.files[fid] = File(FT_Socket(sid), ff)

Description
On WinXP, consider a UDP socket sid referenced by fd that is not bound to a local port. A recv(fd, n0, opts0) call is made from thread tid, which is in the Run state. The call fails with an EINVAL error.

A tid·recv(fd, n0, opts0) transition is made, leaving the thread in state Ret(FAIL EINVAL).

Variations
Posix     This rule does not apply.
FreeBSD   This rule does not apply.
Linux     This rule does not apply.
recv_23 udp: rc Read ICMP error from receive queue and fail with that error on WinXP

  h 〈[ts := ts ⊕ (tid ↦ (t)_d);
      socks := socks ⊕ [(sid, sock 〈[pr := UDP_Sock(rcvq)]〉)]]〉
  ── lbl ──→
  h 〈[ts := ts ⊕ (tid ↦ (Ret(FAIL err))_sched_timer);
      socks := socks ⊕ [(sid, sock 〈[pr := UDP_Sock(rcvq′)]〉)]]〉

  windows_arch h.arch ∧
  rcvq = (Dgram_error(〈[e := err]〉)) :: rcvq′ ∧
  ((∃fd n0 opts0 fid ff.
      t = Run ∧
      lbl = tid·recv(fd, n0, opts0) ∧
      rc = fast fail ∧
      fd ∈ dom(h.fds) ∧
      fid = h.fds[fd] ∧
      h.files[fid] = File(FT_Socket(sid), ff)) ∨
   (∃n opts.
      t = Recv2(sid, n, opts) ∧
      lbl = τ ∧
      rc = slow urgent fail))

Description
On WinXP, consider a UDP socket sid referenced by fd. At the head of the socket’s receive queue, rcvq, is an ICMP message with error err. This rule covers two cases.

In the first, thread tid is in the Run state and a recv(fd, n0, opts0) call is made. The call fails with error err, making a tid·recv(fd, n0, opts0) transition. This leaves the thread in state Ret(FAIL err), and the socket with the ICMP message removed from its receive queue.

In the second case, thread tid is blocked in state Recv2(sid, n, opts). A τ transition is made, leaving the thread in state Ret(FAIL err), and the socket with the ICMP message removed from its receive queue.

Variations
Posix     This rule does not apply.
FreeBSD   This rule does not apply.
Linux     This rule does not apply.
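On WinXP the entry at the head of the receive queue determines which of recv_11, recv_15, recv_20 and recv_23 can fire. The Python sketch below shows that dispatch under simplifying assumptions: the socket is bound, has no pending error and is not shut down for reading; queue entries are encoded as hypothetical ("msg", bytes) or ("error", name) pairs; and the function name is invented for this example.

```python
from typing import Tuple, Union

Head = Tuple[str, Union[bytes, str]]   # hypothetical encoding of a queue entry


def winxp_head_rule(head: Head, n: int, blocked: bool = False) -> str:
    """Name the WinXP rule matching the head of a non-empty receive queue."""
    kind, payload = head
    if kind == "error":
        return "recv_23"               # ICMP error is dequeued and returned
    if len(payload) <= n:
        # Whole datagram fits: recv_15 from Recv2, recv_11 from Run.
        return "recv_15" if blocked else "recv_11"
    return "recv_20"                   # partial read, flagged with F
```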
recv_24 udp: fast fail Fail with pending error

  h 〈[ts := ts ⊕ (tid ↦ (Run)_d);
      socks := socks ⊕
        [(sid, Sock(↑fid, sf, ↑i1, ↑p1, is2, ps2, ↑e, cantsndmore, cantrcvmore, UDP_PROTO(udp)))]]〉
  ── tid·recv(fd, n0, opts0) ──→
  h 〈[ts := ts ⊕ (tid ↦ (Ret(FAIL e))_sched_timer);
      socks := socks ⊕
        [(sid, Sock(↑fid, sf, ↑i1, ↑p1, is2, ps2, es, cantsndmore, cantrcvmore, UDP_PROTO(udp)))]]〉

  fd ∈ dom(h.fds) ∧
  fid = h.fds[fd] ∧
  h.files[fid] = File(FT_Socket(sid), ff) ∧
  opts = list_to_set opts0 ∧
  (¬ linux_arch h.arch ⟹ ∃p2. ps2 = ↑p2) ∧
  es = if MSG_PEEK ∈ opts then ↑e else ∗

Description
From thread tid, which is in the Run state, a recv(fd, n0, opts0) call is made. fd refers to a UDP socket that has local address (↑i1, ↑p1), has its peer port set: ps2 = ↑p2, and has pending error ↑e.

The call fails, returning the pending error: a tid·recv(fd, n0, opts0) transition is made, leaving the thread in state Ret(FAIL e). If the MSG_PEEK flag was set in opts0 then the socket’s pending error remains, otherwise it is cleared.

Model details
The opts0 argument to recv() is of type msgbflag list, but it is converted to a set, opts, using list_to_set.

Variations
Linux     The socket need not have its peer port set.

15.21 send() (TCP only)

send : fd ∗ (ip ∗ port) option ∗ string ∗ msgbflag list → string

This section describes the behaviour of send() for TCP sockets. A call to send(fd, ∗, data, flags) enqueues data on the TCP socket’s send queue.

Here fd is a file descriptor referring to the TCP socket to enqueue data on. The second argument, of type (ip ∗ port) option, is the destination address of the data for UDP, but for a TCP socket it should be set to ∗ (the socket must be connected to a peer before send() can be called). The data is the data to be sent.
Finally, flags is a list of flags for the send() call; possible flags are: MSG_OOB, specifying that the data to be sent is out-of-band data, and MSG_DONTWAIT, specifying that non-blocking behaviour is to be used for this call. The MSG_WAITALL and MSG_PEEK flags may also be set, but as they are meaningless for send() calls, FreeBSD ignores them, and Linux and WinXP fail with EOPNOTSUPP. The returned string is any data that was not sent.

For a successful send() call, the socket must be in a synchronised state, must not be shut down for writing, and must not have a pending error. If there is not enough room on a socket’s send queue then a send() call may block until space becomes available. For a successful blocking send() call on FreeBSD the entire string will be enqueued on the socket’s send queue.

15.21.1 Errors

In addition to errors returned via ICMP (see deliver_in_icmp_3 (p337)), a call to send() can fail with the errors below, in which case the corresponding exception is raised:

EAGAIN        Non-blocking send() call would block.
ENOTCONN      Socket not connected on FreeBSD and WinXP.
EOPNOTSUPP    Message flags MSG_PEEK and MSG_WAITALL not supported on Linux and WinXP.
EPIPE         Socket not connected on Linux; or socket shut down for writing on FreeBSD and Linux.
ESHUTDOWN     Socket shut down for writing on WinXP.
EBADF         The file descriptor passed is not a valid file descriptor.
EINTR         The call was interrupted by a caught signal.
ENOTSOCK      The file descriptor passed does not refer to a socket.

15.21.2 Common cases

A TCP socket is created and successfully connects with a peer; data is then sent to the peer:
socket_1; return_1; connect_1; return_1; . . . connect_2; return_1; send_1; . . .
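The common case above (create a TCP socket, connect to a peer, send data) can be tried against a real stack with a short Python sketch over the loopback interface. This uses Python's standard socket module rather than the model's interface, so the function name `send_over_loopback` and the choice of a local listener as the peer are assumptions of the example. Python's `send` returns the number of bytes accepted by the OS, whereas the model's send() returns the data that was not sent, so the sketch converts one into the other.

```python
import socket


def send_over_loopback(data: bytes) -> tuple:
    """Connect to a local listener, send data, return (unsent, received)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))     # port 0: the OS picks an ephemeral port
        listener.listen(1)
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client:
            client.connect(listener.getsockname())
            sent = client.send(data)        # OS call: number of bytes enqueued
            unsent = data[sent:]            # model-style result: data not sent
            conn, _ = listener.accept()
            with conn:
                received = b""
                while len(received) < sent:
                    chunk = conn.recv(4096)
                    if not chunk:
                        break
                    received += chunk
    return unsent, received
```

For a small payload the send queue has room for everything, so the unsent remainder is expected to be empty, matching the "fast succeed" case.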
15.21.3 API

Posix:    ssize_t send(int socket, const void *buffer, size_t length, int flags);
FreeBSD:  ssize_t send(int s, const void *msg, size_t len, int flags);
Linux:    int send(int s, const void *msg, size_t len, int flags);
WinXP:    int send(SOCKET s, const char *buf, int len, int flags);

In the Posix interface:
• socket is the file descriptor of the socket to send from, corresponding to the fd argument of the model send().
• buffer is a pointer to the data to be sent, of length length. The two together correspond to the string argument of the model send().
• flags is a disjunction of the message flags for the send() call, corresponding to the msgbflag list in the model send().
• the returned ssize_t is either non-negative or -1. If it is non-negative then it is the amount of data from buffer that was sent. If it is -1 then it indicates an error, in which case the error is stored in errno. This corresponds to the model send()’s return value of type string, which is the data that was not sent. On WinXP an error is indicated by a return value of SOCKET_ERROR, not -1, with the actual error code available through a call to WSAGetLastError().

The FreeBSD, Linux and WinXP interfaces are similar modulo argument renaming, except where noted above.

15.21.4 Model details

If the call blocks then the thread enters state Send2(sid, ∗, str, opts) (the optional parameter is used for UDP only), where
• sid : sid is the identifier of the socket that made the send() call,
• str : string is the data to be sent, and
• opts : msgbflag list is the set of options for the send() call.

The following errors are not modelled:
• In Posix and on all three architectures, EDESTADDRREQ indicates that the socket is not connection-mode and no peer address is set. This doesn’t apply to TCP, which is a connection-mode protocol.
• In Posix, EACCES signifies that write access to the socket is denied.
This is not modelled here.
• On FreeBSD and Linux, EFAULT signifies that a pointer passed as an argument was inaccessible. This is an artefact of the C interface to send() that is excluded by the clean interface used in the model.
• In Posix and on Linux, EINVAL signifies that an invalid argument was passed. The typing of the model interface prevents this from happening.
• In Posix, EIO signifies that an I/O error occurred while reading from or writing to the file system. This is not modelled.
• On Linux, EMSGSIZE indicates that the message is too large to be sent all at once, as the socket requires; this is not a requirement for TCP sockets.
• In Posix, ENETDOWN signifies that the local network interface used to reach the destination is down. This is not modelled.

The following flags are not modelled:
• On Linux, MSG_CONFIRM is used to tell the link layer not to probe the neighbour.
• On Linux, MSG_NOSIGNAL requests not to send SIGPIPE errors on stream-oriented sockets when the other end breaks the connection.
• On FreeBSD and WinXP, MSG_DONTROUTE is used by routing programs.
• On FreeBSD, MSG_EOR is used to indicate the end of a record for protocols that support this. It is not modelled because TCP does not support records.
• On FreeBSD, MSG_EOF is used to implement Transaction TCP, which is not modelled here.
15.21.5 Summary

send_1    tcp: fast succeed             Successfully send data without blocking
send_2    tcp: block                    Block waiting for space in socket’s send queue
send_3    tcp: slow nonurgent succeed   Successfully return from blocked state having sent data
send_3a   tcp: block                    From blocked state, transfer some data to the send queue and remain blocked
send_4    tcp: fast fail                Fail with EAGAIN: non-blocking semantics requested and call would block
send_5    tcp: fast fail                Fail with pending error
send_5a   tcp: slow urgent fail         Fail from blocked state with pending error
send_6    tcp: fast fail                Fail with ENOTCONN or EPIPE: socket not connected
send_7    tcp: rc                       Fail with EPIPE or ESHUTDOWN: socket shut down for writing
send_8    tcp: fast fail                Fail with EOPNOTSUPP: message flag not valid

15.21.6 Rules

send_1 tcp: fast succeed Successfully send data without blocking

  h 〈[ts := ts ⊕ (tid ↦ (Run)_d);
      socks := socks ⊕
        [(sid, Sock(↑fid, sf, ↑i1, ↑p1, ↑i2, ↑p2, ∗, F, cantrcvmore,
                    TCP_Sock(st, cb, ∗, sndq, sndurp, rcvq, rcvurp, iobc)))]]〉
  ── tid·send(fd, ∗, implode str, opts0) ──→
  h 〈[ts := ts ⊕ (tid ↦ (Ret(OK(implode str′′)))_sched_timer);
      socks := socks ⊕
        [(sid, Sock(↑fid, sf, ↑i1, ↑p1, ↑i2, ↑p2, ∗, F, cantrcvmore,
                    TCP_Sock(st, cb, ∗, sndq @ str′, sndurp′, rcvq, rcvurp, iobc)))]]〉

  st ∈ {ESTABLISHED; CLOSE_WAIT} ∧
  opts = list_to_set opts0 ∧
  fd ∈ dom(h.fds) ∧
  fid = h.fds[fd] ∧
  h.files[fid] = File(FT_Socket(sid), ff) ∧
  space ∈ send_queue_space (sf.n(SO_SNDBUF)) (length sndq) (MSG_OOB ∈ opts) h.arch cb.t_maxseg i2 ∧
  ({MSG_PEEK; MSG_WAITALL} ∩ opts = ∅ ∨ bsd_arch h.arch) ∧
  (if space ≥ length str then
     str′ = str ∧ str′′ = [ ]
   else
     (ff.b(O_NONBLOCK) ∨ (MSG_DONTWAIT ∈ opts ∧ ¬bsd_arch h.arch)) ∧
     (if bsd_arch h.arch then space ≥ sf.n(SO_SNDLOWAT) else space > 0) ∧
     (str′, str′′) = SPLIT space str) ∧
  sndurp′ = (if (MSG_OOB ∈ opts) ∧ (n = length str) then ↑(length(sndq @ str′) − 1)
             else sndurp)

Description
From thread tid, which is in the Run state, a send(fd, ∗, implode str, opts0) call is made. fd refers to a TCP socket sid that has binding quad (↑i1, ↑p1, ↑i2, ↑p2), has no pending error, is not shut down for writing, and is in state ESTABLISHED or CLOSE_WAIT. The MSG_PEEK and MSG_WAITALL flags are not set in opts0. space is the space in the socket’s send queue, calculated using send_queue_space (p93).

This rule covers two cases: (1) there is space in the socket’s send queue for all the data; and (2) there is not space for all the data but the call is non-blocking (the MSG_DONTWAIT flag is set in opts or the socket’s O_NONBLOCK flag is set), and the space is greater than zero, or, on FreeBSD, greater than or equal to the minimum number of bytes for send() operations on the socket, sf.n(SO_SNDLOWAT). In (1) all of the data str is appended to the socket’s send queue and the returned string, str′′, is the empty string. In (2), the first space bytes of data, str′, are appended to the socket’s send queue and the remaining data, str′′, is returned.

In both cases a tid·send(fd, ∗, implode str, opts0) transition is made, leaving the thread in state Ret(OK(implode str′′)). If the data was marked as out-of-band, MSG_OOB ∈ opts, then the socket’s send urgent pointer will point to the end of the send queue.

Model details
The data to be sent is of type string in the send() call but is a byte list when the datagram is constructed. Here the data, str, is of type byte list and in the transition implode str is used to convert it into a string.

The opts0 argument is of type msgbflag list. In the model it is converted to a set, opts, using list_to_set. The presence of MSG_PEEK is checked for in opts rather than in opts0.
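The two cases of send_1 amount to a split of the data at the available space. The Python sketch below illustrates this; it is not the specification's definition. The function name `send_1_split` and its parameters are assumptions, `space` is taken as an input because the model picks it nondeterministically from send_queue_space, and the caller is assumed to have already worked out whether the call is non-blocking (recalling that FreeBSD ignores MSG_DONTWAIT).

```python
from typing import Optional, Tuple


def send_1_split(data: bytes, space: int, nonblocking: bool,
                 bsd: bool = False, sndlowat: int = 1) -> Optional[Tuple[bytes, bytes]]:
    """Return (enqueued, returned) if send_1 applies, else None (sketch)."""
    if space >= len(data):
        return data, b""                        # case (1): everything fits
    # Case (2): partial send, only allowed for non-blocking calls.
    enough = space >= sndlowat if bsd else space > 0
    if nonblocking and enough:
        return data[:space], data[space:]       # SPLIT space str
    return None                                 # send_2 or send_4 applies instead
```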
Variations
FreeBSD   The MSG_PEEK and MSG_WAITALL flags may be set in opts0 but for the call to be non-blocking the socket’s O_NONBLOCK flag must be set: the MSG_DONTWAIT flag has no effect.

send_2 tcp: block Block waiting for space in socket’s send queue

  h 〈[ts := ts ⊕ (tid ↦ (Run)_d);
      socks := socks ⊕
        [(sid, Sock(↑fid, sf, ↑i1, ↑p1, ↑i2, ↑p2, ∗, F, cantrcvmore,
                    TCP_Sock(st, cb, ∗, sndq, sndurp, rcvq, rcvurp, iobc)))]]〉
  ── tid·send(fd, ∗, implode str, opts0) ──→
  h 〈[ts := ts ⊕ (tid ↦ (Send2(sid, ∗, str, opts))_never_timer);
      socks := socks ⊕
        [(sid, Sock(↑fid, sf, ↑i1, ↑p1, ↑i2, ↑p2, ∗, F, cantrcvmore,
                    TCP_Sock(st, cb, ∗, sndq, sndurp, rcvq, rcvurp, iobc)))]]〉

  opts = list_to_set opts0 ∧
  fd ∈ dom(h.fds) ∧
  fid = h.fds[fd] ∧
  h.files[fid] = File(FT_Socket(sid), ff) ∧
  ¬((¬bsd_arch h.arch ∧ MSG_DONTWAIT ∈ opts) ∨ ff.b(O_NONBLOCK)) ∧
  space ∈ send_queue_space (sf.n(SO_SNDBUF)) (length sndq) (MSG_OOB ∈ opts) h.arch cb.t_maxseg i2 ∧
  ({MSG_PEEK; MSG_WAITALL} ∩ opts = ∅ ∨ bsd_arch h.arch) ∧
  ((st ∈ {ESTABLISHED; CLOSE_WAIT} ∧ space < length str) ∨
   (linux_arch h.arch ∧ st ∈ {SYN_SENT; SYN_RECEIVED}))

Description
From thread tid, which is in the Run state, a send(fd, ∗, implode str, opts0) call is made. fd refers to a TCP socket sid that has binding quad (↑i1, ↑p1, ↑i2, ↑p2), has no pending error, is not shut down for writing, and is in state ESTABLISHED or CLOSE_WAIT. The call is a blocking one: the socket’s O_NONBLOCK flag is not set and the MSG_DONTWAIT flag is not set in opts0. The MSG_PEEK and MSG_WAITALL flags are not set in opts0.

The space in the socket’s send queue, space (calculated using send_queue_space (p93)), is less than the length in bytes of the data to be sent, str.
The call blocks, leaving the thread in state Send2(sid, ∗, str, opts) via a tid·send(fd, ∗, implode str, opts0) transition.

Model details
The data to be sent is of type string in the send() call but is a byte list when the datagram is constructed. Here the data, str, is of type byte list and in the transition implode str is used to convert it into a string.

Variations
FreeBSD   The MSG_PEEK, MSG_WAITALL, and MSG_DONTWAIT flags may all be set in opts0: all three are ignored by FreeBSD.
Linux     In addition to the above, the rule also applies if connection establishment is still taking place for the socket: it is in state SYN_SENT or SYN_RECEIVED.

send_3 tcp: slow nonurgent succeed Successfully return from blocked state having sent data

  h 〈[ts := ts ⊕ (tid ↦ (Send2(sid, ∗, str, opts))_d);
      socks := socks ⊕
        [(sid, Sock(↑fid, sf, ↑i1, ↑p1, ↑i2, ↑p2, ∗, F, cantrcvmore,
                    TCP_Sock(st, cb, ∗, sndq, sndurp, rcvq, rcvurp, iobc)))]]〉
  ── τ ──→
  h 〈[ts := ts ⊕ (tid ↦ (Ret(OK(implode str′′)))_sched_timer);
      socks := socks ⊕
        [(sid, Sock(↑fid, sf, ↑i1, ↑p1, ↑i2, ↑p2, ∗, F, cantrcvmore,
                    TCP_Sock(st, cb, ∗, sndq @ str′, sndurp′, rcvq, rcvurp, iobc)))]]〉

  st ∈ {ESTABLISHED; CLOSE_WAIT} ∧
  space ∈ send_queue_space (sf.n(SO_SNDBUF)) (length sndq) (MSG_OOB ∈ opts) h.arch cb.t_maxseg i2 ∧
  space ≥ length str ∧
  str′ = str ∧
  str′′ = [ ] ∧
  sndurp′ = if MSG_OOB ∈ opts then ↑(length(sndq @ str′) − 1) else sndurp

Description
Thread tid is blocked in state Send2(sid, ∗, str, opts) where the TCP socket sid has binding quad (↑i1, ↑p1, ↑i2, ↑p2), has no pending error, is not shut down for writing, and is in state ESTABLISHED or CLOSE_WAIT.

The space in the socket’s send queue, space (calculated using send_queue_space (p93)), is greater than or equal to the length of the data to be sent, str. The data is appended to the socket’s send queue and the call successfully returns the empty string.
A τ transition is made, leaving the thread in state Ret(OK(“”)). If the data was marked as out-of-band, MSG_OOB ∈ opts, then the socket’s urgent pointer will be updated to point to the end of the socket’s send queue.

Model details
The data to be sent is of type string in the send() call but is a byte list when the datagram is constructed. Here the data, str, is of type byte list and in the transition implode str is used to convert it into a string.

send_3a tcp: block From blocked state, transfer some data to the send queue and remain blocked

  h 〈[ts := ts ⊕ (tid ↦ (Send2(sid, ∗, str, opts))_d);
      socks := socks ⊕
        [(sid, Sock(↑fid, sf, ↑i1, ↑p1, ↑i2, ↑p2, ∗, F, cantrcvmore,
                    TCP_Sock(st, cb, ∗, sndq, sndurp, rcvq, rcvurp, iobc)))]]〉
  ── τ ──→
  h 〈[ts := ts ⊕ (tid ↦ (Send2(sid, ∗, str′′, opts))_never_timer);
      socks := socks ⊕
        [(sid, Sock(↑fid, sf, ↑i1, ↑p1, ↑i2, ↑p2, ∗, F, cantrcvmore,
                    TCP_Sock(st, cb, ∗, sndq @ str′, sndurp′, rcvq, rcvurp, iobc)))]]〉

  st ∈ {ESTABLISHED; CLOSE_WAIT} ∧
  space ∈ send_queue_space (sf.n(SO_SNDBUF)) (length sndq) (MSG_OOB ∈ opts) h.arch cb.t_maxseg i2 ∧
  space < length str ∧
  space > 0 ∧
  (str′, str′′) = SPLIT space str ∧
  sndurp′ = if MSG_OOB ∈ opts then ↑(length(sndq @ str′) − 1) else sndurp

Description
Thread tid is blocked in state Send2(sid, ∗, str, opts) where TCP socket sid has binding quad (↑i1, ↑p1, ↑i2, ↑p2), has no pending error, is not shut down for writing, and is in state ESTABLISHED or CLOSE_WAIT.

The amount of space in the socket’s send queue, space (calculated using send_queue_space (p93)), is less than the length of the remaining data to be sent, str, and greater than 0. The socket’s send queue is filled by appending the first space bytes of str, str′, to it.

A τ transition is made, leaving the thread in state Send2(sid, ∗, str′′, opts) where str′′ is the remaining data to be sent.
If the data in str is out-of-band, MSG OOB is set in opts, then the socket’s urgent pointer is updated to point to the end of the socket’s send queue. Note it is unclear whether or not MSG OOB should be removed from opts in the state. send 4 tcp: fast fail Fail with EAGAIN: non-blocking semantics requested and call would block h 〈[ts := ts ⊕ (tid 7→ (Run)d)]〉 tid ·send(fd , ∗, implode str , opts0)−−−−−−−−−−−−−−−−−−−−−−−−−−−→ h 〈[ts := ts ⊕ (tid 7→ (Ret(FAIL EAGAIN))sched timer)]〉 fd ∈ dom(h.fds) ∧ fid = h.fds[fd ] ∧ h.files[fid ] = File(FT Socket(sid),ff ) ∧ h.socks[sid ] = Sock(↑ fid , sf , ↑ i1, ↑ p1, ↑ i2, ↑ p2, ∗,F, cantrcvmore, TCP Sock(st , cb, ∗, sndq , sndurp, rcvq , rcvurp, iobc)) ∧ opts = list to set opts0 ∧ ({MSG PEEK;MSG WAITALL} ∩ opts = ∅ ∨ bsd arch h.arch) ∧ ((¬bsd arch h.arch ∧MSG DONTWAIT ∈ opts) ∨ ff .b(O NONBLOCK)) ∧ ((st ∈ {ESTABLISHED;CLOSE WAIT} ∧ space ∈ send queue space (sf .n(SO SNDBUF))(length sndq)(MSG OOB ∈ opts)h.arch cb.t maxseg i2 ∧ ¬(space ≥ length str ∨ (if bsd arch h.arch then space ≥ sf .n(SO SNDLOWAT) else space > 0))) ∨ (st ∈ {SYN SENT;SYN RECEIVED} ∧ linux arch h.arch)) Description From thread tid , which is in the Run state, a send(fd , ∗, implode str , opts0) call is made. fd refers to a TCP socket that has binding quad (↑ i1, ↑ p1, ↑ i2, ↑p2), has no pending error, is not shutdown for writing, and is in state ESTABLISHED or CLOSE WAIT. The call is a non-blocking one: either the socket’s O NONBLOCK flag is set or the MSG DONTWAIT flag is set in opts0. The MSG PEEK and MSG WAITALL flags are not set in opts0. The space in the socket’s send queue, space (calculated using send queue space (p93)), is less than both the length of the data to send str ; and on FreeBSD is less than the minimum number of bytes for socket send operations, sf .n(SO SNDLOWAT), or on Linux and WinXP is equal to zero. The call would have to block, but because it is non-blocking, it fails with an EAGAIN error. 
A tid ·send(fd , ∗, implode str , opts0) transition is made, leaving the thread in state Ret(FAIL EAGAIN). Model details The data to be sent is of type string in the send() call but is a byte list when the datagram is constructed. Here the data, str is of type byte list and in the transition implode str is used to convert it into a string. The opts0 argument is of type list. In the model it is converted to a set opts using list to set. The presence of MSG PEEK is checked for in opts rather than in opts0. Variations Rule version: $Id: TCP1 hostLTSScript.sml,v 1.961 2005/03/18 10:34:36 kw217 Exp $ send 6 237 FreeBSD For the call to be non-blocking, the socket’s O NONBLOCK flag must be set; the MSG DONTWAIT flag is ignored. Additionally, the MSG PEEK and MSG WAITALL flags may be set in opts0 as they are also ignored. Linux This rule also applies if the socket is in state SYN SENT or SYN RECEIVED, in which case the send queue size does not matter. send 5 tcp: fast fail Fail with pending error h 〈[ts := ts ⊕ (tid 7→ (Run)d); socks := socks ⊕ [(sid , sock 〈[es := ↑ e]〉)]]〉 tid ·send(fd , addr , implode str , opts0)−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−→ h 〈[ts := ts ⊕ (tid 7→ (Ret(FAIL e))sched timer); socks := socks ⊕ [(sid , sock 〈[es := ∗]〉)]]〉 fd ∈ dom(h.fds) ∧ fid = h.fds[fd ] ∧ h.files[fid ] = File(FT Socket(sid),ff ) ∧ proto of sock .pr = PROTO TCP Description From thread tid , which is in the Run state, a send(fd , addr , implode str , opts0) call is made. fd refers to a socket sock identified by sid with pending error ↑e. The call fails, returning the pending error. A tid ·send(fd , addr , implode str , opts) transition is made, leaving the thread in state Ret(FAIL e). Model details The data to be sent is of type string in the send() call but is a byte list when the datagram is constructed. Here the data, str is of type byte list and in the transition implode str is used to convert it into a string. 
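The guard of rule send 4 above combines two tests: whether the call is non-blocking, and whether the send queue is too full to accept any data. The following is a minimal Python sketch of those two tests only. The function names, the use of strings for flags, and the boolean `bsd` parameter are illustrative assumptions, not part of the specification, and the Linux-only SYN_SENT/SYN_RECEIVED disjunct is deliberately left out.

```python
def is_nonblocking_call(bsd: bool, opts: frozenset, o_nonblock: bool) -> bool:
    """Guard '(not bsd_arch and MSG_DONTWAIT in opts) or ff.b(O_NONBLOCK)'.

    On FreeBSD the MSG_DONTWAIT flag is ignored, so only O_NONBLOCK counts.
    """
    return (not bsd and "MSG_DONTWAIT" in opts) or o_nonblock


def queue_too_full(space: int, data_len: int, bsd: bool, sndlowat: int) -> bool:
    """Guard 'not (space >= length str or (if bsd then space >= SO_SNDLOWAT else space > 0))'."""
    room_for_everything = space >= data_len
    room_for_something = (space >= sndlowat) if bsd else (space > 0)
    return not (room_for_everything or room_for_something)


def send4_fails_with_eagain(bsd: bool, opts: frozenset, o_nonblock: bool,
                            space: int, data_len: int, sndlowat: int) -> bool:
    """True when both parts of the (simplified) send 4 guard hold."""
    return (is_nonblocking_call(bsd, opts, o_nonblock)
            and queue_too_full(space, data_len, bsd, sndlowat))
```

For example, with three free bytes on a non-BSD host the second test fails, so send 4 does not apply even though the ten-byte payload does not fit.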
send 5a tcp: slow urgent fail Fail from blocked state with pending error h 〈[ts := ts ⊕ (tid 7→ (Send2(sid , ∗, str , opts))d); socks := socks ⊕ [(sid , sock 〈[es := ↑ e]〉)]]〉 τ−→ h 〈[ts := ts ⊕ (tid 7→ (Ret(FAIL e))sched timer); socks := socks ⊕ [(sid , sock 〈[es := ∗]〉)]]〉 proto of sock .pr = PROTO TCP Description Thread tid is blocked in state Send2(sid , ∗, str , opts) from an earlier send() call. The TCP socket sid has pending error ↑ e so the call can now return, failing with the error. A τ transition is made, leaving the thread state Ret(FAIL e). send 6 tcp: fast fail Fail with ENOTCONN or EPIPE: socket not connected h 〈[ts := ts ⊕ (tid 7→ (Run)d)]〉 tid ·send(fd , ∗, implode str , opts0)−−−−−−−−−−−−−−−−−−−−−−−−−−−→ h 〈[ts := ts ⊕ (tid 7→ (Ret(FAIL err))sched timer)]〉 fd ∈ dom(h.fds) ∧ Rule version: $Id: TCP1 hostLTSScript.sml,v 1.961 2005/03/18 10:34:36 kw217 Exp $ send 7 238 fid = h.fds[fd ] ∧ h.files[fid ] = File(FT Socket(sid),ff ) ∧ sock = (h.socks[sid ]) ∧ TCP PROTO(tcp sock) = sock .pr ∧ sock .es = ∗ ∧ (tcp sock .st ∈ {CLOSED;LISTEN} ∨ (tcp sock .st ∈ {SYN SENT;SYN RECEIVED} ∧ ¬(linux arch h.arch)) ∨ F (* Placeholder for: if tcp_disconnect or tcp_usrclose has been invoked *) ) ∧ err = (if linux arch h.arch then EPIPE else ENOTCONN) Description From thread tid , which is in the Run state, a send(fd , ∗, implode str , opts0) call is made. fd refers to a TCP socket sock identified by sid that does not have a pending error. The socket is not synchronised: it is in state CLOSED, LISTEN, SYN SENT, or SYN RECEIVED. The call fails with an ENOTCONN error, or EPIPE on Linux. A tid ·send(fd , ∗, implode str , opts0) transition is made, leaving the thread in state Ret(FAIL err) where err is one of the above errors. Model details The data to be sent is of type string in the send() call but is a byte list when the datagram is constructed. Here the data, str is of type byte list and in the transition implode str is used to convert it into a string. 
Variations

Linux The rule does not apply if the socket is in state SYN_RECEIVED or SYN_SENT.

send 7 tcp: rc Fail with EPIPE or ESHUTDOWN: socket shut down for writing

h 〈[ts := ts ⊕ (tid ↦ (t)_d); socks := socks ⊕ [(sid, Sock(↑ fid, sf, is1, ps1, is2, ps2, es, T, cantrcvmore, TCP_PROTO(tcp)))]]〉
−− lbl −→
h 〈[ts := ts ⊕ (tid ↦ (Ret(FAIL err))_{sched_timer}); socks := socks ⊕ [(sid, Sock(↑ fid, sf, is1, ps1, is2, ps2, es, T, cantrcvmore, TCP_PROTO(tcp)))]]〉

∃fd ff str opts0 i2 p2.
fd ∈ dom(h.fds) ∧
fid = h.fds[fd] ∧
h.files[fid] = File(FT_Socket(sid), ff) ∧
t = Run ∧
lbl = tid·send(fd, ∗, implode str, opts0) ∧
rc = fast fail ∧
is2 = ↑ i2 ∧ ps2 = ↑ p2 ∧
(if tcp.st ≠ CLOSED then ∃i1 p1. is1 = ↑ i1 ∧ ps1 = ↑ p1 else T)
∨
∃opts str.
t = Send2(sid, ∗, str, opts) ∧
lbl = τ ∧
rc = slow urgent fail
∧
(if windows_arch h.arch then err = ESHUTDOWN else err = EPIPE)

Description

This rule covers two cases: (1) from thread tid, which is in the Run state, a send(fd, ∗, implode str, opts0) call is made; and (2) thread tid is blocked in state Send2(sid, ∗, str, opts). In (1), fd refers to a TCP socket sid that has binding quad (is1, ps1, ↑ i2, ↑ p2). In both cases the socket is shut down for writing.

The call fails with an EPIPE error. The thread is left in state Ret(FAIL EPIPE), via a tid·send(fd, ∗, implode str, opts0) transition in (1) or a τ transition in (2).

Model details

The data to be sent is of type string in the send() call but is a byte list when the datagram is constructed. Here the data, str, is of type byte list and in the transition implode str is used to convert it into a string.

Variations

WinXP The call fails with an ESHUTDOWN error instead of EPIPE.
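The architecture-dependent choices in rules send 6 and send 7 can be summarised in a few lines of Python. This is only a sketch: the architecture and state names are represented as plain strings, which is an assumption made here for illustration, and the remaining guards of the rules (valid file descriptor, no pending error, socket shut down for writing) are taken as given.

```python
# TCP states in which send 6 applies regardless of architecture.
UNCONNECTED_STATES = {"CLOSED", "LISTEN"}
# States in which send 6 applies on every architecture except Linux.
CONNECTING_STATES = {"SYN_SENT", "SYN_RECEIVED"}


def send6_state_matches(state: str, arch: str) -> bool:
    """State part of the send 6 guard: the socket is not yet connected."""
    if state in UNCONNECTED_STATES:
        return True
    return state in CONNECTING_STATES and arch != "linux"


def send6_error(arch: str) -> str:
    """send 6: 'err = if linux_arch then EPIPE else ENOTCONN'."""
    return "EPIPE" if arch == "linux" else "ENOTCONN"


def send7_error(arch: str) -> str:
    """send 7: 'if windows_arch then err = ESHUTDOWN else err = EPIPE'."""
    return "ESHUTDOWN" if arch == "winxp" else "EPIPE"
```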
send 8 tcp: fast fail Fail with EOPNOTSUPP: message flag not valid h 〈[ts := ts ⊕ (tid 7→ (Run)d)]〉 tid ·send(fd , ∗, implode str , opts0)−−−−−−−−−−−−−−−−−−−−−−−−−−−→ h 〈[ts := ts ⊕ (tid 7→ (Ret(FAIL EOPNOTSUPP))sched timer)]〉 fd ∈ dom(h.fds) ∧ fid = h.fds[fd ] ∧ h.files[fid ] = File(FT Socket(sid),ff ) ∧ proto of(h.socks[sid ]).pr = PROTO TCP ∧ opts = list to set opts0 ∧ (MSG PEEK ∈ opts ∨MSG WAITALL ∈ opts) ∧ ¬bsd arch h.arch Description From thread tid , which is in the Run state, a send(fd , ∗, implode str , opts0) call is made. fd refers to a TCP socket identified by sid . Either the MSG PEEK or MSG WAITALL flag is set in opts0. These flags are not supported so the call fails with an EOPNOTSUPP error. A tid ·send(fd , ∗, implode str , opts0) transition is made, leaving the thread in state Ret(FAIL EOPNOTSUPP). Model details The data to be sent is of type string in the send() call but is a byte list when the datagram is constructed. Here the data, str is of type byte list and in the transition implode str is used to convert it into a string. The opts0 argument is of type list. In the model it is converted to a set opts using list to set. The presence of MSG PEEK is checked for in opts rather than in opts0. Variations FreeBSD This rule does not apply. 15.22 send() (UDP only) send : (fd ∗ (ip ∗ port) option ∗ string ∗msgbflag list)→ string This section describes the behaviour of send() for UDP sockets. A call to send(fd, addr , data,flags) enqueues a UDP datagram to send to a peer. Here the fd argument is a file descriptor referring to a UDP socket from Rule version: $Id: TCP1 hostLTSScript.sml,v 1.961 2005/03/18 10:34:36 kw217 Exp $ send() (UDP only) 240 which to send data. The destination address of the data can be specified either by the addr argument, which can be ↑(i3, p3) or ∗, or by the socket’s peer address (its is2 and ps2 fields) if set. For a successful send(), at least one of these two must be specified. 
If the socket has a peer address set and addr is set to ↑(i3, p3), then the address used is architecture-dependent: on FreeBSD the send() call will fail with an EISCONN error; on Linux and WinXP i3, p3 will be used. The string, data, is the data to be sent. The length in bytes of data must be less than the architecture- dependent maximum payload for a UDP datagram. Sending a string of length zero bytes is acceptable. The msgbflag list is the list of message flags for the send() call. The possible flags are MSG DONTWAIT and MSG OOB. MSG DONTWAIT specifies that non-blocking behaviour should be used for this call: see rules send 10 and send 11 . MSG OOB specifies that the data to be sent is out-of-band data, which is not meaningful for UDP sockets. FreeBSD ignores this flag, but on Linux and WinXP the send() call will fail: see rule send 20 . The return value of the send() call is a string of the data which was not sent. A partial send may occur when the call is interrupted by a signal after having sent some data. For a datagram to be sent, the socket must be bound to a local port. When a send() call is made, the socket is autobound to an ephemeral port if it does not have its local port bound. A successful send() call only guarantees that the datagram has been placed on the host’s out queue. It does not imply that the datagram has left the host, let alone been successfully delivered to its destination. A call to send() may block if there is no room on the socket’s send buffer and non-blocking behaviour has not been requested. 15.22.1 Errors In addition to errors returned via ICMP (see deliver in icmp 3 (p337)), a call to send() can fail with the errors below, in which case the corresponding exception is raised: EADDRINUSE The socket’s peer address is not set and the destination address specified would give the socket a binding quad i1, p1, i2, p2 which is already in use by another socket. EADDRNOTAVAIL There are no ephemeral ports left for autobinding to. 
EAGAIN The send() call would block and non-blocking behaviour is requested. This may have been done either via the MSG_DONTWAIT flag being set in the send() flags or the socket’s O_NONBLOCK flag being set.
EDESTADDRREQ The socket does not have its peer address set, and no destination address was specified.
EINTR A signal interrupted send() before any data was transmitted.
EISCONN On FreeBSD, a destination address was specified and the socket has a peer address set.
EMSGSIZE The message is too large to be sent in one datagram.
ENOTCONN The socket does not have its peer address set, and no destination address was specified. This can occur either when the call is first made, or if the call blocks and the peer address is unset by a call to disconnect() whilst it is blocked.
EOPNOTSUPP The MSG_OOB flag is set on Linux or WinXP.
EPIPE Socket shut down for writing.
EBADF The file descriptor passed is not a valid file descriptor.
ENOTSOCK The file descriptor passed does not refer to a socket.
ENOBUFS Out of resources.
ENOMEM Out of resources.

15.22.2 Common cases

send 9; return 1;

15.22.3 API

Posix: ssize_t sendto(int socket, const void *message, size_t length, int flags, const struct sockaddr *dest_addr, socklen_t dest_len);
FreeBSD: ssize_t sendto(int s, const void *msg, size_t len, int flags, const struct sockaddr *to, socklen_t tolen);
Linux: int sendto(int s, const void *msg, size_t len, int flags, const struct sockaddr *to, socklen_t tolen);
WinXP: int sendto(SOCKET s, const char* buf, int len, int flags, const struct sockaddr* to, int tolen);

In the Posix interface:
• socket is the file descriptor of the socket to send from, corresponding to the fd argument of the model send().
• message is a pointer to the data to be sent, of length length. The two together correspond to the string argument of the model send().
• flags is a bitwise OR of the message flags for the send() call, corresponding to the msgbflag list in the model send().
• dest_addr and dest_len correspond to the addr argument of the model send(). dest_addr is either null or a pointer to a sockaddr structure containing the destination address for the data. If it is null it corresponds to addr = ∗. If it contains an address, then it corresponds to addr = ↑(i3, p3) where i3 and p3 are the IP address and port specified in the sockaddr structure.
• the returned ssize_t is either non-negative or -1. If it is non-negative then it is the amount of data from message that was sent. If it is -1 then it indicates an error, in which case the error is stored in errno. This is different to the model send()’s return value of type string, which is the data that was not sent.

On WinXP an error is indicated by a return value of SOCKET_ERROR, not -1, with the actual error code available through a call to WSAGetLastError().

There are other functions used to send data on a socket. send() is similar to sendto() except it does not have the dest_addr and dest_len arguments. It is used when the destination address of the data does not need to be specified. sendmsg(), another output function, is a more general form of sendto().
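The difference between the two return conventions described above can be made concrete with a small Python sketch. It converts a Posix-style result (a byte count, or -1 on error) into the model's convention of returning the data that was not sent. The names `model_return` and `SendFailed` are hypothetical helpers introduced only for this illustration, and the error is passed in as a name rather than read from errno.

```python
class SendFailed(Exception):
    """Stands in for the exception raised by the model send() on failure."""

    def __init__(self, error_name: str):
        super().__init__(error_name)
        self.error_name = error_name


def model_return(data: bytes, posix_result: int, error_name: str = "") -> bytes:
    """Translate a Posix sendto() result into the model's 'unsent data' result."""
    if posix_result == -1:
        # Posix signals failure with -1; the model raises the error instead.
        raise SendFailed(error_name)
    if not 0 <= posix_result <= len(data):
        raise ValueError("byte count outside the range of the message")
    # Posix reports how much was sent; the model returns what was left over.
    return data[posix_result:]
```

A fully successful send of `b"hello"` therefore maps to the empty string, matching the Ret(OK(“”)) states that appear in the rules.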
15.22.4 Model details

If the call blocks then the thread enters state Send2(sid, ↑(addr, is1, ps1, is2, ps2), str, opts) where
• sid : sid is the identifier of the socket that made the send() call,
• addr : (ip ∗ port) option is the destination address specified in the send() call,
• is1 : ip option is the socket’s local IP address, possibly ∗,
• ps1 : port option is the socket’s local port, possibly ∗,
• is2 : ip option is the IP address of the socket’s peer, possibly ∗,
• ps2 : port option is the port of the socket’s peer, possibly ∗,
• str : string is the data to be sent, and
• opts : msgbflag list is the set of options for the send() call.

The following errors are not modelled:
• On FreeBSD, EACCES signifies that the destination address is a broadcast address and the SO_BROADCAST flag has not been set on the socket. Broadcast is not modelled here.
• In Posix, EACCES signifies that write access to the socket is denied. This is not modelled here.
• On FreeBSD and Linux, EFAULT signifies that one of the pointer arguments was inaccessible. This is an artefact of the C interface to sendto() that is excluded by the clean interface used in the model.
• In Posix and on Linux, EINVAL signifies that an invalid argument was passed. The typing of the model interface prevents this from happening.
• In Posix, EIO signifies that an I/O error occurred while reading from or writing to the file system. This is not modelled.
• In Posix, ENETDOWN signifies that the local network interface used to reach the destination is down. This is not modelled.

The following flags are not modelled:
• On Linux, MSG_CONFIRM is used to tell the link layer not to probe the neighbour.
• On Linux, MSG_NOSIGNAL requests not to send SIGPIPE errors on stream-oriented sockets when the other end breaks the connection. UDP is not stream-oriented.
• On FreeBSD and WinXP, MSG_DONTROUTE is used by routing programs.
• On FreeBSD, MSG_EOR is used to indicate the end of a record for protocols that support this. It is not modelled because UDP does not support records.
• On FreeBSD, MSG_EOF is used to implement Transaction TCP.

15.22.5 Summary

send 9    udp: fast succeed          Enqueue datagram and return successfully
send 10   udp: block                 Block waiting to enqueue datagram
send 11   udp: fast fail             Fail with EAGAIN: call would block and non-blocking behaviour has been requested
send 12   udp: fast fail             Fail with ENOTCONN: no peer address set in socket and no destination address provided
send 13   udp: fast fail             Fail with EMSGSIZE: string to be sent is bigger than UDPpayloadMax
send 14   udp: fast fail             Fail with EAGAIN, EADDRNOTAVAIL or ENOBUFS: there are no ephemeral ports left
send 15   udp: slow urgent succeed   Return from blocked state after datagram enqueued
send 16   udp: slow urgent fail      Fail: blocked socket has entered an error state
send 17   udp: slow urgent fail      Fail with EMSGSIZE or ENOTCONN: blocked socket has had peer address unset or string to be sent is too big
send 18   udp: fast fail             Fail with EOPNOTSUPP: MSG_PEEK flag not supported for send() calls on WinXP; or MSG_OOB flag not supported on WinXP and Linux
send 19   udp: fast fail             Fail with EADDRINUSE: on FreeBSD, local and destination address quad in use by another socket
send 21   udp: fast fail             Fail with EISCONN: socket has peer address set and destination address is specified in call on FreeBSD
send 22   udp: fast fail             Fail with EPIPE or ESHUTDOWN: socket shut down for writing
send 23   udp: fast fail             Fail with pending error

15.22.6 Rules

send 9 udp: fast succeed Enqueue datagram and return successfully

h0 −− tid·send(fd, addr, implode str, opts0) −→ h 〈[ts := ts ⊕ (tid ↦ (Ret(OK(“”)))_{sched_timer}); socks := socks ⊕ [(sid, sock 〈[es := es; ps1 := ↑
p′1; pr := UDP_PROTO(udp)]〉)]; bound := bound; oq := oq′]〉

h0 = h 〈[ts := ts ⊕ (tid ↦ (Run)_d); socks := socks ⊕ [(sid, sock 〈[es := es; pr := UDP_PROTO(udp)]〉)]]〉 ∧
fd ∈ dom(h0.fds) ∧
fid = h0.fds[fd] ∧
h0.files[fid] = File(FT_Socket(sid), ff) ∧
sock.cantsndmore = F ∧
STRLEN(implode str) ≤ UDPpayloadMax h0.arch ∧
((addr ≠ ∗) ∨ (sock.is2 ≠ ∗)) ∧
p′1 ∈ autobind(sock.ps1, PROTO_UDP, h0.socks) ∧
(if sock.ps1 = ∗ then bound = sid :: h0.bound else bound = h0.bound) ∧
dosend(h.ifds, h.rttab, (addr, str), (sock.is1, ↑ p′1, sock.is2, sock.ps2), h0.oq, oq′, T) ∧
(if bsd_arch h.arch then (h0.socks[sid]).sf.n(SO_SNDBUF) ≥ STRLEN(implode str) else MSG_OOB ∉ (list_to_set opts0)) ∧
(¬(windows_arch h.arch) ⟹ es = ∗)

Description

Consider a UDP socket sid referenced by fd that is not shut down for writing and has no pending errors. From thread tid, which is in the Run state, a call send(fd, addr, implode str, opts0) succeeds if:
• The length of str is less than or equal to UDPpayloadMax (p70), the architecture-dependent maximum payload for a UDP datagram.
• The socket has a peer IP address set in its is2 field or the addr argument is ↑(i3, p3), specifying a destination address.
• The socket is bound to a local port p′1, or it can be autobound to p′1 and sid added to the list of bound sockets.
• A UDP datagram is constructed from the socket’s binding quad (sock.is1, ↑ p′1, sock.is2, sock.ps2), the destination address argument addr, and the data str. This datagram is successfully enqueued on the outqueue of the host, oq, to form outqueue oq′ using auxiliary function dosend (p96).

A tid·send(fd, addr, implode str, opts0) transition is made, leaving the thread in state Ret(OK(“”)) and the host with new outqueue oq′. If the socket was autobound to a port then sid is added to the head of the host’s list of bound sockets.

Model details

The data to be sent is of type string in the send() call but is a byte list when the datagram is constructed.
Here the data, str is of type byte list and in the transition implode str is used to convert it into a string. Variations Posix The MSG OOB flag is not set in opts0. Rule version: $Id: TCP1 hostLTSScript.sml,v 1.961 2005/03/18 10:34:36 kw217 Exp $ send 10 244 FreeBSD On FreeBSD there is an additional condition for a successful send(): the amount of data to be sent must be less than or equal to the size of the socket’s send buffer. Linux The MSG OOB flag is not set in opts0. WinXP The MSG OOB flag is not set in opts0 and any pending errors are ignored. send 10 udp: block Block waiting to enqueue datagram h0 tid ·send(fd , addr , implode str , opts0)−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−→ h 〈[ts := ts ⊕ (tid 7→ Timed(Send2(sid , ↑(addr , sock .is1, ↑ p′1, sock .is2, sock .ps2), str , opts), never timer)); socks := socks ⊕ [(sid , sock 〈[es := es; ps1 := ↑ p′1; pr :=UDP PROTO(udp)]〉)]; bound := bound ; oq := oq ′]〉 h0 = h 〈[ ts := ts ⊕ (tid 7→ (Run)d); socks := socks ⊕ [(sid , sock 〈[ es := es; pr :=UDP PROTO(udp)]〉)]]〉 ∧ fd ∈ dom(h0.fds) ∧ fid = h0.fds[fd ] ∧ h0.files[fid ] = File(FT Socket(sid),ff ) ∧ sock .cantsndmore = F ∧ (¬(windows arch h.arch) =⇒ es = ∗) ∧ opts = list to set opts0 ∧ ¬((¬bsd arch h.arch ∧MSG DONTWAIT ∈ opts) ∨ ff .b(O NONBLOCK)) ∧ ((linux arch h.arch ∨ windows arch h.arch) =⇒ MSG OOB /∈ opts) ∧ p′1 ∈ autobind(sock .ps1,PROTO UDP, h0.socks) ∧ (if sock .ps1 = ∗ then bound = sid :: h0.bound else bound = h0.bound) ∧ dosend(h0.ifds, h0.rttab, (addr , str), (sock .is1, ↑ p′1, sock .is2, sock .ps2), h0.oq , oq ′,F) ∧ ((addr 6= ∗) ∨ (sock .is2 6= ∗)) Description Consider a UDP socket sid referenced by fd that is not shutdown for writing and has no pending errors. A send(fd , addr , implode str , opts0) call is made from thread tid which is in the Run state. Either the socket is a blocking one: its O NONBLOCK flag is not set, or the call is a blocking one: the MSG DONTWAIT flag is not set in opts0. 
The socket is either bound to local port p′1 or can be autobound to a port p ′ 1. Either the socket has its peer IP address set, or the destination address of the send() call is set: addr 6= ∗. A UDP datagram, constructed from the socket’s binding quad sock .is1, ↑p′1, sock .is2, sock .ps2, the destina- tion address argument addr , and the data str , cannot be placed on the outqueue of the host oq . The call blocks, waiting for the datagram to be enqueued on the host’s outqueue. The thread is left in state Send2(sid , ↑(addr , sock .is1, ↑ p′1, sock .is2, sock .ps2), str , opts). If the socket was autobound to a port then sid is appended to the head of the host’s list of bound sockets. Model details The data to be sent is of type string in the send() call but is a byte list when the datagram is constructed. Here the data, str is of type byte list and in the transition implode str is used to convert it into a string. Rule version: $Id: TCP1 hostLTSScript.sml,v 1.961 2005/03/18 10:34:36 kw217 Exp $ send 11 245 The opts0 argument is of type list. In the model it is converted to a set opts using list to set. The presence of MSG PEEK is checked for in opts rather than in opts0. Variations FreeBSD The MSG DONTWAIT flag may be set in opts0: it is ignored by FreeBSD. Linux The MSG OOB flag must not be set in opts0. WinXP TheMSG OOB flag must not be set in opts0, and any pending error on the socket is ignored. 
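Rules send 9, send 10 and send 11 share most of their guards and differ mainly in whether the datagram could be enqueued and whether the call is non-blocking. The Python sketch below shows that three-way choice. It assumes all the shared guards already hold (socket not shut down for writing, destination known, payload small enough, MSG_OOB handled), and it reads the final argument of dosend as an "enqueue succeeded" flag, which is an interpretation of the rules rather than something stated in this section. The function name and string encodings are illustrative.

```python
def classify_udp_send(enqueued: bool, bsd: bool, opts: frozenset,
                      o_nonblock: bool) -> tuple:
    """Return (rule name, resulting thread state) for a UDP send() call."""
    if enqueued:
        # dosend(..., T): the datagram is on the host's outqueue.
        return ("send 9", "Ret(OK(''))")
    # dosend(..., F): the datagram could not be enqueued.
    # FreeBSD ignores MSG_DONTWAIT, so only O_NONBLOCK counts there.
    nonblocking = (not bsd and "MSG_DONTWAIT" in opts) or o_nonblock
    if nonblocking:
        return ("send 11", "Ret(FAIL EAGAIN)")
    return ("send 10", "Send2")
```

Note that on FreeBSD a call with only MSG_DONTWAIT set is classified as blocking, in line with the FreeBSD variation of send 10.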
send 11 udp: fast fail Fail with EAGAIN: call would block and non-blocking behaviour has been requested

h0 −− tid·send(fd, addr, implode str, opts0) −→ h 〈[ts := ts ⊕ (tid ↦ (Ret(FAIL EAGAIN))_{sched_timer}); socks := socks ⊕ [(sid, sock 〈[es := es; ps1 := ↑ p′1; pr := UDP_PROTO(udp)]〉)]; bound := bound; oq := oq′]〉

h0 = h 〈[ts := ts ⊕ (tid ↦ (Run)_d); socks := socks ⊕ [(sid, sock 〈[es := es; pr := UDP_PROTO(udp)]〉)]]〉 ∧
fd ∈ dom(h0.fds) ∧
fid = h0.fds[fd] ∧
h0.files[fid] = File(FT_Socket(sid), ff) ∧
sock.cantsndmore = F ∧
(¬(windows_arch h.arch) ⟹ es = ∗) ∧
p′1 ∈ autobind(sock.ps1, PROTO_UDP, h0.socks) ∧
(if sock.ps1 = ∗ then bound = sid :: h0.bound else bound = h0.bound) ∧
((addr ≠ ∗) ∨ (sock.is2 ≠ ∗)) ∧
opts = list_to_set opts0 ∧
((¬bsd_arch h.arch ∧ MSG_DONTWAIT ∈ opts) ∨ ff.b(O_NONBLOCK)) ∧
dosend(h0.ifds, h0.rttab, (addr, str), (sock.is1, sock.ps1, sock.is2, sock.ps2), h0.oq, oq′, F)

Description

Consider a UDP socket sid referenced by fd that is not shut down for writing and has no pending errors. The thread tid is in the Run state and a call send(fd, addr, implode str, opts0) is made. The socket is either locally bound to a port p′1 or can be autobound to a port p′1. Either the socket has a peer IP address set, or a destination address was provided in the send() call: addr ≠ ∗. Either the socket is non-blocking (its O_NONBLOCK flag is set) or the call is non-blocking (the MSG_DONTWAIT flag was set in the opts0 argument of send()).

A UDP datagram (constructed from the socket’s binding quad (sock.is1, sock.ps1, sock.is2, sock.ps2), the destination address argument addr, and the data str) cannot be placed on the outqueue of the host, oq.

The send() call fails with an EAGAIN error. A tid·send(fd, addr, implode str, opts0) transition is made, leaving the thread in state Ret(FAIL EAGAIN), and the host with outqueue oq′. If the socket was autobound to a port, sid is added to the head of the host’s list of bound sockets.
Model details The data to be sent is of type string in the send() call but is a byte list when the datagram is constructed. Here the data, str is of type byte list and in the transition implode str is used to convert it into a string. The opts0 argument is of type list. In the model it is converted to a set opts using list to set. The presence of MSG PEEK is checked for in opts rather than in opts0. Rule version: $Id: TCP1 hostLTSScript.sml,v 1.961 2005/03/18 10:34:36 kw217 Exp $ send 12 246 Note that on Linux EWOULDBLOCK and EAGAIN are aliased. Variations FreeBSD The socket’s O NONBLOCK flag must be set for the rule to apply; the MSG DONTWAIT flag is ignored by FreeBSD. WinXP Pending errors on the socket are ignored. send 12 udp: fast fail Fail with ENOTCONN: no peer address set in socket and no destination address provided h0 tid ·send(fd , ∗, implode str , opts0)−−−−−−−−−−−−−−−−−−−−−−−−−−−→ h 〈[ts := ts ⊕ (tid 7→ (Ret(FAIL err))sched timer); socks := socks ⊕ [(sid ,Sock(↑ fid , sf , is1, ps ′1, ∗, ∗, es, cantsndmore, cantrcvmore,UDP PROTO(udp)))]; bound := bound ]〉 h0 = h 〈[ ts := ts ⊕ (tid 7→ (Run)d); socks := socks ⊕ [(sid ,Sock(↑ fid , sf , is1, ps1, ∗, ∗, es, cantsndmore, cantrcvmore,UDP PROTO(udp)))]]〉 ∧ fd ∈ dom(h.fds) ∧ fid = h.fds[fd ] ∧ h.files[fid ] = File(FT Socket(sid),ff ) ∧ (if bsd arch h.arch then err = EDESTADDRREQ else err = ENOTCONN) ∧ (¬(windows arch h.arch) =⇒ es = ∗) ∧ (if linux arch h.arch then ∃p′1.p′1 ∈ autobind(ps1,PROTO UDP, h0.socks) ∧ ps ′1 = ↑ p′1 ∧ (if ps1 = ∗ then bound = sid :: h0.bound else bound = h0.bound) else bound = h0.bound ∧ ps ′1 = ps1) Description Consider a UDP socket sid referenced by fd that has no pending errors. A call send(fd , addr , implode str , opts0 is made from thread tid which is in the Run state. The socket is either locally bound to a port p′1 or it can be autobound to a port p ′ 1. The socket does not have a peer address set, and no destination address is specified in the send() call: addr = ∗. 
The call will fail with an ENOTCONN error. A tid·send(fd, ∗, implode str, opts0) transition will be made, leaving the thread in state Ret(FAIL ENOTCONN). If the socket was autobound then sid is added to the head of the host’s list of bound sockets, h0.bound, resulting in the new list bound.

Model details

The data to be sent is of type string in the send() call but is a byte list when the datagram is constructed. Here the data, str, is of type byte list and in the transition implode str is used to convert it into a string.

Variations

FreeBSD On FreeBSD the error returned is EDESTADDRREQ, the socket must not be shut down for writing, and if it is not bound to a local port it will not be autobound.

WinXP Any pending error on the socket is ignored, and if the socket’s local port is not bound, ps1 = ∗, then it will not be autobound.

send 13 udp: fast fail Fail with EMSGSIZE: string to be sent is bigger than UDPpayloadMax

h0 −− tid·send(fd, addr, implode str, opts0) −→ h 〈[ts := ts ⊕ (tid ↦ (Ret(FAIL EMSGSIZE))_{sched_timer}); socks := socks ⊕ [(sid, sock 〈[ps1 := ps′1; pr := UDP_PROTO(udp)]〉)]; bound := bound]〉

h0 = h 〈[ts := ts ⊕ (tid ↦ (Run)_d); socks := socks ⊕ [(sid, sock 〈[pr := UDP_PROTO(udp)]〉)]]〉 ∧
fd ∈ dom(h0.fds) ∧
fid = h0.fds[fd] ∧
h0.files[fid] = File(FT_Socket(sid), ff) ∧
(STRLEN(implode str) > UDPpayloadMax h0.arch ∨ (bsd_arch h.arch ∧ STRLEN(implode str) > (h0.socks[sid]).sf.n(SO_SNDBUF))) ∧
ps′1 ∈ {sock.ps1} ∪ (image(↑)(autobind(sock.ps1, PROTO_UDP, h0.socks))) ∧
(if sock.ps1 = ∗ ∧ ps′1 ≠ ∗ then bound = sid :: h0.bound else bound = h0.bound)

Description

Consider a UDP socket sid referenced by fd. A call send(fd, addr, implode str, opts0) is made from thread tid, which is in the Run state.
The length in bytes of str is greater than UDPpayloadMax, the architecture-dependent maximum payload size for a UDP datagram. The send() call fails with an EMSGSIZE error. A tid ·send(fd , addr , implode str , opts0) transition is made leaving the thread in state Ret(FAIL EMSGSIZE). Additionally, the socket’s local port ps1 may be autobound if it was not bound to a local port when the send() call was made. If the autobinding occurs, then the socket’s sid is added to the list of bound sockets h0.bound , leaving the host’s list of bound sockets as bound . Model details The data to be sent is of type string in the send() call but is a byte list when the datagram is constructed. Here the data, str is of type byte list and in the transition implode str is used to convert it into a string. Variations FreeBSD On FreeBSD, the send() call may also fail with EMSGSIZE if the size of str is greater than the value of the socket’s SO SNDBUF option. send 14 udp: fast fail Fail with EAGAIN, EADDRNOTAVAIL or ENOBUFS: there are no ephemeral ports left h 〈[ts := ts ⊕ (tid 7→ (Run)d); socks := socks ⊕ [(sid ,Sock(↑ fid , sf , ∗, ∗, ∗, ∗, es, cantsndmore, cantrcvmore,UDP PROTO(udp)))]]〉 tid ·send(fd , addr , implode str , opts0)−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−→ h 〈[ts := ts ⊕ (tid 7→ (Ret(FAIL e))sched timer); socks := socks ⊕ [(sid ,Sock(↑ fid , sf , ∗, ∗, ∗, ∗, es, cantsndmore, cantrcvmore,UDP PROTO(udp)))]]〉 fd ∈ dom(h.fds) ∧ fid = h.fds[fd ] ∧ h.files[fid ] = File(FT Socket(sid),ff ) ∧ cantsndmore = F ∧ Rule version: $Id: TCP1 hostLTSScript.sml,v 1.961 2005/03/18 10:34:36 kw217 Exp $ send 15 248 (¬(windows arch h.arch) =⇒ es = ∗) ∧ autobind(∗,PROTO UDP, h.socks) = ∅ ∧ e ∈ {EAGAIN;EADDRNOTAVAIL;ENOBUFS} Description Consider a UDP socket sid referenced by fd that is not shutdown for writing and has no pending errors. The socket has no peer address set, and is not bound to a local IP address or port. From the Run state, thread tid makes a send(fd , addr , implode str , opts0) call. 
The socket cannot be auto-bound to an ephemeral port so the call fails. The error returned will be EAGAIN, EADDRNOTAVAIL, or ENOBUFS. A tid·send(fd, addr, implode str, opts0) transition will be made. The thread will be left in state Ret(FAIL e) where e is one of the above errors.

Model details

The data to be sent is of type string in the send() call but is a byte list when the datagram is constructed. Here the data, str, is of type byte list and in the transition implode str is used to convert it into a string.

Variations

WinXP Any pending error on the socket is ignored.

send 15 udp: slow urgent succeed Return from blocked state after datagram enqueued

h 〈[ts := ts ⊕ (tid ↦ (Send2(sid, ↑(addr, is1, ps1, is2, ps2), str, opts))_d); socks := socks ⊕ [(sid, sock 〈[es := es; pr := UDP_PROTO(udp)]〉)]]〉
−− τ −→
h 〈[ts := ts ⊕ (tid ↦ (Ret(OK(“”)))_{sched_timer}); socks := socks ⊕ [(sid, sock 〈[es := es; pr := UDP_PROTO(udp)]〉)]; oq := oq′]〉

sock.cantsndmore = F ∧
(¬(windows_arch h.arch) ⟹ es = ∗) ∧
STRLEN(implode str) ≤ UDPpayloadMax h.arch ∧
(dosend(h.ifds, h.rttab, (addr, str), (is1, ps1, is2, ps2), h.oq, oq′, T) ∨
 dosend(h.ifds, h.rttab, (addr, str), (sock.is1, sock.ps1, sock.is2, sock.ps2), h.oq, oq′, T)) ∧
(addr ≠ ∗ ∨ sock.is2 ≠ ∗ ∨ is2 ≠ ∗)

Description

Consider a UDP socket sid that is not shut down for writing and has no pending errors. The thread tid is blocked in state Send2(sid, ↑(addr, is1, ps1, is2, ps2), str, opts).

A datagram can be constructed using str as its data. The length in bytes of str is less than or equal to UDPpayloadMax, the architecture-dependent maximum payload size for a UDP datagram. There are three possible destination addresses:
• addr, the destination address specified in the send() call.
• is2, ps2, the socket’s peer address when the send() call was made.
• sock.is2, sock.ps2, the socket’s current peer address.

At least one of addr, is2, and sock.is2 must specify an IP address: they are not all set to ∗.
One of the three addresses will be used as the destination address of the datagram. The datagram can be successfully enqueued on the host’s outqueue, h.oq, resulting in a new outqueue oq′.
A τ transition is made, leaving the thread in state Ret(OK(“”)), and the host with new outqueue oq′.

send 16 udp: slow urgent fail
Fail: blocked socket has entered an error state

h 〈[ts := ts ⊕ (tid ↦ (Send2(sid, ↑(addr, is1, ps1, is2, ps2), str))_d);
    socks := socks ⊕ [(sid, sock 〈[es := ↑e; pr := UDP_PROTO(udp)]〉)]]〉
  τ −−→
h 〈[ts := ts ⊕ (tid ↦ (Ret(FAIL e))_sched_timer);
    socks := socks ⊕ [(sid, sock 〈[es := ∗; pr := UDP_PROTO(udp)]〉)]]〉

¬(windows_arch h.arch)

Description
Consider a UDP socket sid that has pending error ↑e. The thread tid is blocked in state Send2(sid, ↑(addr, is1, ps1, is2, ps2), str). The error, e, will be returned to the caller. A τ transition is made, leaving the thread in state Ret(FAIL e), and the socket’s pending error is cleared.
Note that the error has occurred after the thread entered the Send2 state: rule send 11 specifies that the call cannot block if there is a pending error.

Variations
WinXP This rule does not apply: all pending errors on a socket are ignored for a send() call.
send 17 udp: slow urgent fail
Fail with EMSGSIZE or ENOTCONN: blocked socket has had peer address unset or string to be sent is too big

h 〈[ts := ts ⊕ (tid ↦ (Send2(sid, ↑(addr, is1, ps1, is2, ps2), str, opts))_d);
    socks := socks ⊕ [(sid, sock 〈[sf := sf; es := es; pr := UDP_PROTO(udp)]〉)]]〉
  τ −−→
h 〈[ts := ts ⊕ (tid ↦ (Ret(FAIL e))_sched_timer);
    socks := socks ⊕ [(sid, sock 〈[sf := sf; es := es; pr := UDP_PROTO(udp)]〉)]]〉

(¬(windows_arch h.arch) ⟹ es = ∗) ∧
(∃oq′. dosend(h.ifds, h.rttab, (addr, str), (is1, ps1, is2, ps2), h.oq, oq′, T)) ∧
((STRLEN (implode str) > UDPpayloadMax h.arch ∧ (e = EMSGSIZE)) ∨
 (bsd_arch h.arch ∧ STRLEN (implode str) > sf.n(SO_SNDBUF) ∧ (e = EMSGSIZE)) ∨
 ((sock.is2 = ∗) ∧ (addr = ∗) ∧ (e = ENOTCONN)))

Description
Consider a UDP socket sid with no pending errors. The thread tid is blocked in state Send2(sid, ↑(addr, is1, ps1, is2, ps2), str). A datagram is constructed with str as its payload. Its destination address is taken from addr, the destination address specified when the send() call was made, or (is2, ps2), the socket’s peer address when the send() call was made. It is possible to enqueue the datagram on the host’s outqueue, h.oq.
This rule covers two cases. In the first, the length in bytes of str is greater than UDPpayloadMax, the architecture-dependent maximum payload size for a UDP datagram. The error EMSGSIZE is returned.
In the second case, the original send() call did not have a destination address specified: addr = ∗, and the socket has had the IP address of its peer address unset: sock.is2 = ∗. The peer address of the socket when the send() call was made, (is2, ps2), is ignored, and an ENOTCONN error is returned.
In either case, a τ transition is made, leaving the thread in state Ret(FAIL e), where e is either EMSGSIZE or ENOTCONN.
Variations
FreeBSD An EMSGSIZE error can also be returned if the size of str is greater than the value of the socket’s SO_SNDBUF option.
WinXP Any pending error on the socket is ignored.

send 18 udp: fast fail
Fail with EOPNOTSUPP: MSG_PEEK flag not supported for send() calls on WinXP; or MSG_OOB flag not supported on WinXP and Linux

h0
  tid·send(fd, addr, implode str, opts0) −−→
h 〈[ts := ts ⊕ (tid ↦ (Ret(FAIL EOPNOTSUPP))_sched_timer);
    socks := socks ⊕ [(sid, sock 〈[ps1 := ps1′; pr := UDP_PROTO(udp)]〉)];
    bound := bound]〉

h0 = h 〈[ts := ts ⊕ (tid ↦ (Run)_d);
         socks := socks ⊕ [(sid, sock 〈[ps1 := ps1; pr := UDP_PROTO(udp)]〉)]]〉 ∧
fd ∈ dom(h.fds) ∧
fid = h.fds[fd] ∧
h.files[fid] = File(FT_Socket(sid), ff) ∧
opts = list_to_set opts0 ∧
((MSG_PEEK ∈ opts ∧ windows_arch h.arch) ∨
 (MSG_OOB ∈ opts ∧ sock.cantsndmore = F ∧ (linux_arch h.arch ∨ windows_arch h.arch))) ∧
(if linux_arch h.arch then
   ∃p1′. p1′ ∈ autobind(ps1, PROTO_UDP, h0.socks) ∧ ps1′ = ↑p1′ ∧
         (if ps1 = ∗ then bound = sid :: h0.bound else bound = h0.bound)
 else
   ps1 = ps1′ ∧ bound = h0.bound)

Description
Consider a UDP socket sid referenced by fd. From thread tid, which is in the Run state, a send(fd, addr, implode str, opts0) call is made.
This rule covers two cases. In the first, on WinXP, the MSG_PEEK flag is set in opts0. In the second case, on Linux and WinXP, the socket has not been shut down for writing, and the MSG_OOB flag is set in opts0. In either case, the send() call fails with an EOPNOTSUPP error.
A tid·send(fd, addr, implode str, opts0) transition is made, leaving the thread in state Ret(FAIL EOPNOTSUPP). On Linux, the socket’s local port may additionally be autobound, in which case sid is added to the host’s list of bound sockets.

Model details
The opts0 argument is of type list. In the model it is converted to a set opts using list_to_set. The presence of MSG_PEEK is checked for in opts rather than in opts0.

Variations
FreeBSD FreeBSD ignores the MSG_PEEK and MSG_OOB flags for send().
Linux Linux ignores the MSG_PEEK flag for send().
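The flag condition of send 18 can be sketched as a small predicate. This is only an illustrative Python model, not part of the specification: the function name send_flags_unsupported and the string encodings of architectures and flags are assumptions made for this example, and the Linux autobinding side effect of the rule is not modelled.

```python
def send_flags_unsupported(arch, opts0, cantsndmore):
    """Return True if the flag condition of rule send 18 holds.

    arch        -- "FreeBSD", "Linux" or "WinXP" (hypothetical encoding)
    opts0       -- list of message flag names, possibly with duplicates
    cantsndmore -- True if the socket is shut down for writing
    """
    # Mirrors opts = list_to_set opts0: duplicates are irrelevant.
    opts = set(opts0)
    # First case: MSG_PEEK on WinXP.
    peek_case = "MSG_PEEK" in opts and arch == "WinXP"
    # Second case: MSG_OOB on Linux or WinXP, socket still writable.
    oob_case = (
        "MSG_OOB" in opts
        and not cantsndmore
        and arch in ("Linux", "WinXP")
    )
    return peek_case or oob_case
```

When the predicate is true the modelled call returns FAIL EOPNOTSUPP; on FreeBSD it is never true, matching the statement that FreeBSD ignores both flags.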
send 19 udp: fast fail
Fail with EADDRINUSE: on FreeBSD, local and destination address quad in use by another socket

h0
  tid·send(fd, ↑(i2, p2), implode str, opts0) −−→
h 〈[ts := ts ⊕ (tid ↦ (Ret(FAIL EADDRINUSE))_sched_timer);
    socks := socks ⊕ [(sid, sock)];
    bound := bound]〉

bsd_arch h.arch ∧
h0 = h 〈[ts := ts ⊕ (tid ↦ (Run)_d);
         socks := socks ⊕ [(sid, sock)]]〉 ∧
sock.cantsndmore = F ∧
(¬(windows_arch h.arch) ⟹ sock.es = ∗) ∧
p1′ ∈ autobind(sock.ps1, PROTO_UDP, h0.socks) ∧
(if sock.ps1 = ∗ then bound = sid :: h0.bound else bound = h0.bound) ∧
i1′ ∈ auto_outroute(i2, sock.is1, h0.rttab, h0.ifds) ∧
fd ∈ dom(h0.fds) ∧
fid = h0.fds[fd] ∧
h0.files[fid] = File(FT_Socket(sid), ff) ∧
sock = (h0.socks[sid]) ∧
proto_of sock.pr = PROTO_UDP ∧
(∃sid′. sid′ ∈ dom(h0.socks) ∧
   let s = h0.socks[sid′] in
     s.is1 = ↑i1′ ∧ s.ps1 = ↑p1′ ∧ s.is2 = ↑i2 ∧ s.ps2 = ↑p2 ∧
     proto_of s.pr = PROTO_UDP)

Description
On FreeBSD, consider a UDP socket sid referenced by fd that is not shut down for writing. From thread tid, which is in the Run state, a send(fd, ↑(i2, p2), implode str, opts0) call is made. The socket is bound to local port p1′ or it can be autobound to port p1′. The socket can be bound to a local IP address i1′ which has a route to i2. Another socket, sid′, is locally bound to (i1′, p1′) and has its peer address set to (i2, p2).
The send() call will fail with an EADDRINUSE error. A tid·send(fd, ↑(i2, p2), implode str, opts0) transition will be made, leaving the thread in state Ret(FAIL EADDRINUSE).

Variations
Linux This rule does not apply.
WinXP This rule does not apply.
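The existential condition at the end of send 19 amounts to a search over the host’s sockets for one whose full address quad matches. A minimal Python sketch follows; the UdpSock record and quad_in_use function are hypothetical names introduced here, None stands for ∗, and the dictionary is assumed to contain only UDP sockets (so the proto_of check is implicit). The choice of i1′ and p1′ by auto_outroute and autobind is taken as given rather than modelled.

```python
from typing import Dict, NamedTuple, Optional


class UdpSock(NamedTuple):
    """Hypothetical record of a UDP socket's address quad; None models '*'."""
    is1: Optional[str]
    ps1: Optional[int]
    is2: Optional[str]
    ps2: Optional[int]


def quad_in_use(socks: Dict[int, UdpSock], i1: str, p1: int, i2: str, p2: int) -> bool:
    """True if some socket is bound to (i1, p1) with peer address (i2, p2)."""
    wanted = UdpSock(i1, p1, i2, p2)
    return any(s == wanted for s in socks.values())
```

As in the rule as written, the search ranges over every socket in the host, so no socket identifier is excluded.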
send 21 udp: fast fail
Fail with EISCONN: socket has peer address set and destination address is specified in call on FreeBSD

h 〈[ts := ts ⊕ (tid ↦ (Run)_d);
    socks := socks ⊕ [(sid, sock 〈[es := ∗; is2 := ↑i2; ps2 := ↑p2; pr := UDP_PROTO(udp)]〉)]]〉
  tid·send(fd, ↑(i3, p3), implode str, opts0) −−→
h 〈[ts := ts ⊕ (tid ↦ (Ret(FAIL EISCONN))_sched_timer);
    socks := socks ⊕ [(sid, sock 〈[es := ∗; is2 := ↑i2; ps2 := ↑p2; pr := UDP_PROTO(udp)]〉)]]〉

fd ∈ dom(h.fds) ∧
fid = h.fds[fd] ∧
h.files[fid] = File(FT_Socket(sid), ff) ∧
bsd_arch h.arch

Description
Consider a UDP socket sid referenced by fd that has its peer address set: is2 = ↑i2, and ps2 = ↑p2. From thread tid, which is in the Run state, a send(fd, ↑(i3, p3), implode str, opts0) call is made. On FreeBSD, the call will fail with the EISCONN error, as the call specified a destination address even though the socket has a peer address set.
A tid·send(fd, ↑(i3, p3), implode str, opts0) transition will be made, leaving the thread in state Ret(FAIL EISCONN).

Variations
Posix If the socket is connectionless-mode, the message shall be sent to the address specified by ↑(i3, p3). See the above send() rules.
Linux This rule does not apply. Linux allows the send() call to occur. See the above send() rules.
WinXP This rule does not apply. WinXP allows the send() call to occur. See the above send() rules.
send 22 udp: fast fail
Fail with EPIPE or ESHUTDOWN: socket shut down for writing

h 〈[ts := ts ⊕ (tid ↦ (Run)_d);
    socks := socks ⊕ [(sid, Sock(↑fid, sf, is1, ps1, is2, ps2, es, T, cantrcvmore, UDP_PROTO(udp)))]]〉
  tid·send(fd, addr, implode str, opts0) −−→
h 〈[ts := ts ⊕ (tid ↦ (Ret(FAIL err))_sched_timer);
    socks := socks ⊕ [(sid, Sock(↑fid, sf, is1, ps1, is2, ps2, es, T, cantrcvmore, UDP_PROTO(udp)))]]〉

fd ∈ dom(h.fds) ∧
fid = h.fds[fd] ∧
h.files[fid] = File(FT_Socket(sid), ff) ∧
if windows_arch h.arch then err = ESHUTDOWN else err = EPIPE

Description
From thread tid, which is in the Run state, a send(fd, addr, implode str, opts0) call is made where fd refers to a UDP socket sid that is shut down for writing. The call fails with an EPIPE error (ESHUTDOWN on WinXP).
A tid·send(fd, addr, implode str, opts0) transition is made, leaving the thread in state Ret(FAIL err), where err is EPIPE or, on WinXP, ESHUTDOWN.

Variations
WinXP The call fails with an ESHUTDOWN error rather than EPIPE.

send 23 udp: fast fail
Fail with pending error

h 〈[ts := ts ⊕ (tid ↦ (Run)_d);
    socks := socks ⊕ [(sid, sock 〈[es := ↑e]〉)]]〉
  tid·send(fd, addr, implode str, opts0) −−→
h 〈[ts := ts ⊕ (tid ↦ (Ret(FAIL e))_sched_timer);
    socks := socks ⊕ [(sid, sock 〈[es := ∗]〉)]]〉

fd ∈ dom(h.fds) ∧
fid = h.fds[fd] ∧
h.files[fid] = File(FT_Socket(sid), ff) ∧
proto_of sock.pr = PROTO_UDP ∧
¬(windows_arch h.arch)

Description
From thread tid, which is in the Run state, a send(fd, addr, implode str, opts0) call is made where fd refers to a UDP socket sid that has pending error ↑e. The call fails, returning the pending error, which is then cleared from the socket.
A tid·send(fd, addr, implode str, opts0) transition is made, leaving the thread in state Ret(FAIL e).

Variations
WinXP This rule does not apply: all pending errors are ignored for send() calls on WinXP.
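Because the specification is a nondeterministic labelled transition system, more than one failure rule may be enabled for the same call. The following Python sketch computes the set of errors permitted by send 22 and send 23 taken on their own; the function name and the string encodings are assumptions made for illustration, and no other send() rule is considered.

```python
def send_fast_fail_errors(arch, cantsndmore, pending_error):
    """Set of errors allowed by rules send 22 and send 23 (illustrative only).

    arch          -- "FreeBSD", "Linux" or "WinXP" (hypothetical encoding)
    cantsndmore   -- True if the socket is shut down for writing
    pending_error -- error name, or None when es = *
    """
    errors = set()
    # send 22: socket shut down for writing; the error depends on architecture.
    if cantsndmore:
        errors.add("ESHUTDOWN" if arch == "WinXP" else "EPIPE")
    # send 23: a pending error is returned, except on WinXP where it is ignored.
    if pending_error is not None and arch != "WinXP":
        errors.add(pending_error)
    return errors
```

A result with two elements reflects that, as the two rules are written, neither excludes the other when a socket is both shut down for writing and has a pending error.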
15.23 setfileflags() (TCP and UDP)

setfileflags : (fd ∗ filebflag list) → unit

A call to setfileflags(fd, flags) sets the flags on a file referred to by fd. flags is the list of file flags to set. The possible flags are:
• O_ASYNC Specifies whether signal driven I/O is enabled.
• O_NONBLOCK Specifies whether a socket is non-blocking.
The call returns successfully if the flags were set, or fails with an error otherwise.

15.23.1 Errors
A call to setfileflags() can fail with the errors below, in which case the corresponding exception is raised:
EBADF The file descriptor passed is not a valid file descriptor.

15.23.2 Common cases
setfileflags 1 ; return 1

15.23.3 API
setfileflags() is Posix fcntl(fd,F_SETFL,flags). On WinXP it is ioctlsocket() with the FIONBIO command.
Posix: int fcntl(int fildes, int cmd, ...);
FreeBSD: int fcntl(int fd, int cmd, ...);
Linux: int fcntl(int fd, int cmd);
WinXP: int ioctlsocket(SOCKET s, long cmd, u_long* argp)
In the Posix interface:
• fildes is a file descriptor for the file to set flags on. It corresponds to the fd argument of the model setfileflags(). On WinXP, s is a socket descriptor corresponding to the fd argument of the model setfileflags().
• cmd is a command to perform an operation on the file. This is set to F_SETFL for the model setfileflags(). On WinXP, cmd is set to FIONBIO to set the O_NONBLOCK flag; there is no O_ASYNC flag on WinXP.
• The call takes a variable number of arguments. For the model setfileflags() it takes three arguments: the two described above and a third of type long which represents the list of flags to set, corresponding to the flags argument of the model setfileflags(). On WinXP this is the argp argument.
• The returned int is either 0 to indicate success or -1 to indicate an error, in which case the error code is in errno.
On WinXP an error is indicated by a return value of SOCKET_ERROR, not -1, with the actual error code available through a call to WSAGetLastError().

15.23.4 Model details
The following errors are not modelled:
• WSAEINPROGRESS is WinXP-specific and described in the MSDN page as “A blocking Windows Sockets 1.1 call is in progress, or the service provider is still processing a callback function”. This is not modelled here.
• WSAENOTSOCK is a possible error on WinXP as the ioctlsocket() call is specific to a socket. In the model the setfileflags() call is performed on a file.

15.23.5 Summary
setfileflags 1 all: fast succeed Update all the file flags for an open file description

15.23.6 Rules

setfileflags 1 all: fast succeed
Update all the file flags for an open file description

h 〈[ts := ts ⊕ (tid ↦ (Run)_d);
    files := files ⊕ [(fid, File(ft, ff 〈[b := ffb]〉))]]〉
  tid·setfileflags(fd, flags) −−→
h 〈[ts := ts ⊕ (tid ↦ (Ret(OK()))_sched_timer);
    files := files ⊕ [(fid, File(ft, ff 〈[b := ffb′]〉))]]〉

fd ∈ dom(h.fds) ∧
fid = h.fds[fd] ∧
ffb′ = λx. x ∈ flags

Description
From thread tid, which is in the Run state, a setfileflags(fd, flags) call is made. fd refers to the open file description (fid, File(ft, ff 〈[b := ffb]〉)) where ffb is the set of boolean file flags currently set. flags is a list of boolean file flags, possibly containing duplicates.
All of the boolean file flags for the file description will be updated. The flags in flags will all be set to T, and all other flags will be set to F, resulting in a new set of boolean file flags, ffb′. A tid·setfileflags(fd, flags) transition is made, leaving the thread in state Ret(OK()).
Note this is not exactly the same as getfileflags 1: getfileflags never returns duplicates, but duplicates may be passed to setfileflags.
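The update ffb′ = λx. x ∈ flags replaces the whole flag set rather than merging into it. A minimal Python sketch of this behaviour is given below; the name setfileflags_model and the tabulation over a fixed tuple of flag names are assumptions for illustration only.

```python
# The two boolean file flags named in this section.
FILE_BFLAGS = ("O_ASYNC", "O_NONBLOCK")


def setfileflags_model(flags):
    """Tabulate ffb' = (lambda x. x in flags) over the known file flags.

    flags may contain duplicates; every flag listed becomes True and
    every flag not listed becomes False, regardless of its old value.
    """
    return {flag: (flag in flags) for flag in FILE_BFLAGS}
```

Note that the previous flag values play no part in the result, so passing an empty list clears both flags.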
15.24 setsockbopt() (TCP and UDP)

setsockbopt : (fd ∗ sockbflag ∗ bool) → unit

A call setsockbopt(fd, f, b) sets the value of one of a socket’s boolean flags.
Here the fd argument is a file descriptor referring to a socket on which to set a flag, f is the boolean socket flag to set, and b is the value to set it to. Possible boolean flags are:
• SO_BSDCOMPAT Specifies whether the BSD semantics for delivery of ICMPs to UDP sockets with no peer address set is enabled.
• SO_DONTROUTE Requests that outgoing messages bypass the standard routing facilities. The destination shall be on a directly-connected network, and messages are directed to the appropriate network interface according to the destination address.
• SO_KEEPALIVE Keeps connections active by enabling the periodic transmission of messages, if this is supported by the protocol.
• SO_OOBINLINE Leaves received out-of-band data (data marked urgent) inline.
• SO_REUSEADDR Specifies that the rules used in validating addresses supplied to bind() should allow reuse of local ports, if this is supported by the protocol.

15.24.1 Errors
A call to setsockbopt() can fail with the errors below, in which case the corresponding exception is raised:
ENOPROTOOPT The option is not supported by the protocol.
EBADF The file descriptor passed is not a valid file descriptor.
ENOTSOCK The file descriptor passed does not refer to a socket.

15.24.2 Common cases
setsockbopt 1 ; return 1

15.24.3 API
setsockbopt() is Posix setsockopt() for boolean-valued socket flags.
Posix: int setsockopt(int socket, int level, int option_name, const void *option_value, socklen_t option_len);
FreeBSD: int setsockopt(int s, int level, int optname, const void *optval, socklen_t optlen);
Linux: int setsockopt(int s, int level, int optname, const void *optval, socklen_t optlen);
WinXP: int setsockopt(SOCKET s, int level, int optname, const char* optval, int optlen);
In the Posix interface:
• socket is the file descriptor of the socket to set the option on, corresponding to the fd argument of the model setsockbopt().
• level is the protocol level at which the flag resides: SOL_SOCKET for the socket level options, and option_name is the flag to be set. These two correspond to the flag argument of the model setsockbopt() where the possible values of option_name are limited to: SO_BSDCOMPAT, SO_DONTROUTE, SO_KEEPALIVE, SO_OOBINLINE, and SO_REUSEADDR.
• option_value is a pointer to a location of size option_len containing the value to set the flag to. These two correspond to the b argument of type bool in the model setsockbopt().
• the returned int is either 0 to indicate success or -1 to indicate an error, in which case the error code is in errno. On WinXP an error is indicated by a return value of SOCKET_ERROR, not -1, with the actual error code available through a call to WSAGetLastError().

15.24.4 Model details
The following errors are not modelled:
• EFAULT signifies the pointer passed as option_value was inaccessible. On WinXP, the error WSAEFAULT may also signify that the optlen parameter was too small. Note this error is not specified by Posix.
• EINVAL signifies the option_name was invalid at the specified socket level. In the model, typing prevents an invalid flag from being specified in a call to setsockbopt().
• WSAEINPROGRESS is WinXP-specific and described in the MSDN page as “A blocking Windows Sockets 1.1 call is in progress, or the service provider is still processing a callback function”. This is not modelled here.

15.24.5 Summary
setsockbopt 1 all: fast succeed Successfully set a boolean socket flag
setsockbopt 2 udp: fast fail Fail with ENOPROTOOPT: SO_KEEPALIVE and SO_OOBINLINE options not supported for a UDP socket on WinXP

15.24.6 Rules

setsockbopt 1 all: fast succeed
Successfully set a boolean socket flag

h 〈[ts := ts ⊕ (tid ↦ (Run)_d);
    socks := socks ⊕ [(sid, sock)]]〉
  tid·setsockbopt(fd, f, b) −−→
h 〈[ts := ts ⊕ (tid ↦ (Ret(OK()))_sched_timer);
    socks := socks ⊕ [(sid, sock′)]]〉

fd ∈ dom(h.fds) ∧
fid = h.fds[fd] ∧
h.files[fid] = File(FT_Socket(sid), ff) ∧
sock′ = sock 〈[sf := sock.sf 〈[b := sock.sf.b ⊕ (f ↦ b)]〉]〉 ∧
(windows_arch h.arch ∧ proto_of sock.pr = PROTO_UDP ⟹ f ∉ {SO_KEEPALIVE; SO_OOBINLINE})

Description
Consider a socket sid, referenced by fd, and with socket flags sock.sf. From thread tid, which is in the Run state, a setsockbopt(fd, f, b) call is made. f is the boolean socket flag to be set, and b is the boolean value to set it to. The call succeeds.
A tid·setsockbopt(fd, f, b) transition is made, leaving the thread in state Ret(OK()). The socket’s boolean flags, sock.sf.b, are updated such that f has the value b.

Variations
WinXP As above, except that if sid is a UDP socket, then f cannot be SO_KEEPALIVE or SO_OOBINLINE.
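The two setsockbopt() outcomes listed in the summary can be read together as a single decision. The Python sketch below is an informal model under stated assumptions: the function name, the tuple-shaped result, and the string encodings of architecture, protocol and flags are all hypothetical, and the EBADF and ENOTSOCK cases are not covered.

```python
# Flags that the summary says WinXP rejects on UDP sockets.
WINXP_UDP_UNSUPPORTED = {"SO_KEEPALIVE", "SO_OOBINLINE"}


def setsockbopt_model(arch, proto, bflags, f, b):
    """Return (result, new_flags) for a modelled setsockbopt(fd, f, b) call.

    arch   -- "FreeBSD", "Linux" or "WinXP" (hypothetical encoding)
    proto  -- "TCP" or "UDP"
    bflags -- dict from flag name to bool, standing for sock.sf.b
    """
    if arch == "WinXP" and proto == "UDP" and f in WINXP_UDP_UNSUPPORTED:
        # setsockbopt 2: the flags are left unchanged.
        return ("FAIL", "ENOPROTOOPT"), bflags
    # setsockbopt 1: sock.sf.b is updated so that f maps to b.
    new_flags = dict(bflags)
    new_flags[f] = b
    return ("OK", None), new_flags
```

The input dictionary is copied rather than mutated, mirroring the way the rule relates an old socket record sock to a new one sock′.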
setsockbopt 2 udp: fast fail
Fail with ENOPROTOOPT: SO_KEEPALIVE and SO_OOBINLINE options not supported for a UDP socket on WinXP

h 〈[ts := ts ⊕ (tid ↦ (Run)_d);
    socks := socks ⊕ [(sid, sock 〈[pr := UDP_PROTO(udp)]〉)]]〉
  tid·setsockbopt(fd, f, b) −−→
h 〈[ts := ts ⊕ (tid ↦ (Ret(FAIL ENOPROTOOPT))_sched_timer);
    socks := socks ⊕ [(sid, sock 〈[pr := UDP_PROTO(udp)]〉)]]〉

windows_arch h.arch ∧
fd ∈ dom(h.fds) ∧
fid = h.fds[fd] ∧
h.files[fid] = File(FT_Socket(sid), ff) ∧
f ∈ {SO_KEEPALIVE; SO_OOBINLINE}

Description
On WinXP, consider a UDP socket sid referenced by fd. From thread tid, which is in the Run state, a setsockbopt(fd, f, b) call is made, where f is either SO_KEEPALIVE or SO_OOBINLINE. The call fails with an ENOPROTOOPT error.
A tid·setsockbopt(fd, f, b) transition is made, leaving the thread in state Ret(FAIL ENOPROTOOPT).

Variations
FreeBSD This rule does not apply.
Linux This rule does not apply.

15.25 setsocknopt() (TCP and UDP)

setsocknopt : (fd ∗ socknflag ∗ int) → unit

A call setsocknopt(fd, f, n) sets the value of one of a socket’s numeric flags. The fd argument is a file descriptor referring to a socket to set a flag on, f is the numeric socket flag to set, and n is the value to set it to. Possible numeric flags are:
• SO_RCVBUF Specifies the receive buffer size.
• SO_RCVLOWAT Specifies the minimum number of bytes to process for socket input operations.
• SO_SNDBUF Specifies the send buffer size.
• SO_SNDLOWAT Specifies the minimum number of bytes to process for socket output operations.

15.25.1 Errors
A call to setsocknopt() can fail with the errors below, in which case the corresponding exception is raised:
EINVAL On FreeBSD, attempting to set a numeric flag to zero.
ENOPROTOOPT The option is not supported by the protocol.
EBADF The file descriptor passed is not a valid file descriptor.
ENOTSOCK The file descriptor passed does not refer to a socket.

15.25.2 Common cases
setsocknopt 1 ; return 1

15.25.3 API
setsocknopt() is Posix setsockopt() for numeric-valued socket flags.
Posix: int setsockopt(int socket, int level, int option_name, const void *option_value, socklen_t option_len);
FreeBSD: int setsockopt(int s, int level, int optname, const void *optval, socklen_t optlen);
Linux: int setsockopt(int s, int level, int optname, const void *optval, socklen_t optlen);
WinXP: int setsockopt(SOCKET s, int level, int optname, const char* optval, int optlen);
In the Posix interface:
• socket is the file descriptor of the socket to set the option on, corresponding to the fd argument of the model setsocknopt().
• level is the protocol level at which the flag resides: SOL_SOCKET for the socket level options, and option_name is the flag to be set. These two correspond to the flag argument of the model setsocknopt() where the possible values of option_name are limited to: SO_RCVBUF, SO_RCVLOWAT, SO_SNDBUF, and SO_SNDLOWAT.
• option_value is a pointer to a location of size option_len containing the value to set the flag to. These two correspond to the n argument of type int in the model setsocknopt().
• the returned int is either 0 to indicate success or -1 to indicate an error, in which case the error code is in errno. On WinXP an error is indicated by a return value of SOCKET_ERROR, not -1, with the actual error code available through a call to WSAGetLastError().

15.25.4 Model details
The following errors are not modelled:
• EFAULT signifies the pointer passed as option_value was inaccessible. On WinXP, the error WSAEFAULT may also signify that the optlen parameter was too small. Note this error is not specified by Posix.
• EINVAL signifies the option_name was invalid at the specified socket level.
In the model, typing prevents an invalid flag from being specified in a call to setsocknopt().
• WSAEINPROGRESS is WinXP-specific and described in the MSDN page as “A blocking Windows Sockets 1.1 call is in progress, or the service provider is still processing a callback function”. This is not modelled here.

15.25.5 Summary
setsocknopt 1 all: fast succeed Successfully set a numeric socket flag
setsocknopt 2 all: fast fail Fail with EINVAL: on FreeBSD numeric socket flags cannot be set to zero
setsocknopt 4 all: fast fail Fail with ENOPROTOOPT: SO_SNDLOWAT not settable on Linux

15.25.6 Rules

setsocknopt 1 all: fast succeed
Successfully set a numeric socket flag

h 〈[ts := ts ⊕ (tid ↦ (Run)_d);
    socks := socks ⊕ [(sid, sock)]]〉
  tid·setsocknopt(fd, f, n) −−→
h 〈[ts := ts ⊕ (tid ↦ (Ret(OK()))_sched_timer);
    socks := socks ⊕ [(sid, sock′)]]〉

fd ∈ dom(h.fds) ∧
fid = h.fds[fd] ∧
h.files[fid] = File(FT_Socket(sid), ff) ∧
n′ = max (sf_min_n h.arch f) (min (sf_max_n h.arch f) (clip_int_to_num n)) ∧
ns = (if bsd_arch h.arch ∧ f = SO_SNDBUF ∧ n′ < sock.sf.n(SO_SNDLOWAT)
      then (sock.sf.n ⊕ (f ↦ n′)) ⊕ (SO_SNDLOWAT ↦ n′)
      else sock.sf.n ⊕ (f ↦ n′)) ∧
sock′ = sock 〈[sf := sock.sf 〈[n := ns]〉]〉

Description
Consider the socket sid, referenced by fd, with numeric socket flags sock.sf.n. From the thread tid, which is in the Run state, a setsocknopt(fd, f, n) call is made where f is a numeric socket flag to be updated, and n is the integer value to set it to. The call succeeds.
A tid·setsocknopt(fd, f, n) transition is made, leaving the thread in state Ret(OK()). The socket’s numeric flag f is updated to be the value n′ which is: the architecture-specific minimum value for f, i.e. sf_min_n h.arch f, if n is less than this value; the architecture-specific maximum value for f, i.e. sf_max_n h.arch f, if n is greater than this value; or n otherwise.
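The computation of n′ and ns in setsocknopt 1 is a clamp followed by an optional adjustment of SO_SNDLOWAT on FreeBSD. The Python sketch below illustrates it under several assumptions: lo and hi stand in for sf_min_n h.arch f and sf_max_n h.arch f, whose real values are defined elsewhere in the specification; clip_int_to_num is assumed here to map negative integers to 0; and the function names and string encodings are hypothetical.

```python
def clip_int_to_num(n):
    """Assumed behaviour: negative integers clip to 0, others are unchanged."""
    return max(n, 0)


def setsocknopt_model(arch, nflags, f, n, lo, hi):
    """Return the updated numeric flags (ns) for a successful setsocknopt.

    arch   -- "FreeBSD", "Linux" or "WinXP" (hypothetical encoding)
    nflags -- dict from flag name to int, standing for sock.sf.n
    lo, hi -- stand-ins for sf_min_n h.arch f and sf_max_n h.arch f
    """
    # n' = max lo (min hi (clip_int_to_num n))
    n1 = max(lo, min(hi, clip_int_to_num(n)))
    ns = dict(nflags)
    ns[f] = n1
    # FreeBSD only: lowering SO_SNDBUF below SO_SNDLOWAT drags the latter down.
    if arch == "FreeBSD" and f == "SO_SNDBUF" and n1 < nflags["SO_SNDLOWAT"]:
        ns["SO_SNDLOWAT"] = n1
    return ns
```

The sketch covers only the success rule; the EINVAL and ENOPROTOOPT failure rules listed in the summary are not modelled here.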
Variations
FreeBSD If the flag to be set is SO_SNDBUF and the new (clipped) value n′ is less than the value of the socket’s SO_SNDLOWAT flag then the SO_SNDLOWAT flag is also set to n′.

setsocknopt 2 all: fast fail
Fail with EINVAL: on FreeBSD numeric socket flags cannot be set to zero

h 〈[ts := ts ⊕ (tid ↦ (Run)_d)]〉
  tid·setsocknopt(fd, f, n) −−→
h 〈[ts := ts ⊕ (tid ↦ (Ret(FAIL EINVAL))_sched_timer)]〉

clip_int_to_num n = 0 ∧
bsd_arch h.arch

Description
On FreeBSD, from thread tid, which is in the Run state, a setsocknopt(fd, f, n) call is made where fd is a file descriptor, f is a numeric socket flag, and n is an integer value to set f to. Because the numeric value of n equals 0, the call fails with an EINVAL error.
A tid·setsocknopt(fd, f, n) transition is made, leaving the thread in state Ret(FAIL EINVAL).

Variations
Posix This rule does not apply.
Linux This rule does not apply.
WinXP This rule does not apply.

setsocknopt 4 all: fast fail
Fail with ENOPROTOOPT: SO_SNDLOWAT not settable on Linux

h 〈[ts := ts ⊕ (tid ↦ (Run)_d)]〉
  tid·setsocknopt(fd, f, n) −−→
h 〈[ts := ts ⊕ (tid ↦ (Ret(FAIL ENOPROTOOPT))_sched_timer)]〉

linux_arch h.arch ∧
f = SO_SNDLOWAT

Description
On Linux, from thread tid, which is in the Run state, a setsocknopt(fd, f, n) call is made. f = SO_SNDLOWAT, which is not settable, so the call fails with an ENOPROTOOPT error.
A tid·setsocknopt(fd, f, n) transition is made, leaving the thread in state Ret(FAIL ENOPROTOOPT).

Variations
FreeBSD This rule does not apply.
WinXP This rule does not apply. Note the warning from the Win32 docs (at MSDN setsockopt): “If the setsockopt function is called before the bind function, TCP/IP options will not be checked with TCP/IP until the bind occurs.
In this case, the setsockopt function call will always succeed, but the bind function call may fail because of an early setsockopt failing.” This is currently unimplemented.

15.26 setsocktopt() (TCP and UDP)

setsocktopt : (fd ∗ socktflag ∗ (int ∗ int) option) → unit

A call setsocktopt(fd, f, t) sets the value of one of a socket’s time-option flags.
The fd argument is a file descriptor referring to a socket to set a flag on, f is the time-option socket flag to set, and t is the value to set it to. Possible time-option flags are:
• SO_RCVTIMEO Specifies the timeout value for input operations.
• SO_SNDTIMEO Specifies how long an output function may block because flow control prevents data from being sent.
If t = ∗ then the timeout is disabled. If t = ↑(s, ns) then the timeout is set to s seconds and ns nanoseconds.

15.26.1 Errors
A call to setsocktopt() can fail with the errors below, in which case the corresponding exception is raised:
EBADF The file descriptor fd does not refer to a valid file descriptor.
EDOM The timeout value is too big to fit in the socket structure.
ENOPROTOOPT The option is not supported by the protocol.
ENOTSOCK The file descriptor fd does not refer to a socket.

15.26.2 Common cases
setsocktopt 1 ; return 1

15.26.3 API
setsocktopt() is Posix setsockopt() for time-option socket flags.
Posix: int setsockopt(int socket, int level, int option_name, const void *option_value, socklen_t option_len);
FreeBSD: int setsockopt(int s, int level, int optname, const void *optval, socklen_t optlen);
Linux: int setsockopt(int s, int level, int optname, const void *optval, socklen_t optlen);
WinXP: int setsockopt(SOCKET s, int level, int optname, const char* optval, int optlen);
In the Posix interface:
• socket is the file descriptor of the socket to set the option on, corresponding to the fd argument of the model setsocktopt().
• level is the protocol level at which the flag resides: SOL_SOCKET for the socket level options, and option_name is the flag to be set. These two correspond to the flag argument of the model setsocktopt() where the possible values of option_name are limited to: SO_RCVTIMEO and SO_SNDTIMEO.
• option_value is a pointer to a location of size option_len containing the value to set the flag to. These two correspond to the t argument of type (int ∗ int) option in the model setsocktopt().
• the returned int is either 0 to indicate success or -1 to indicate an error, in which case the error code is in errno. On WinXP an error is indicated by a return value of SOCKET_ERROR, not -1, with the actual error code available through a call to WSAGetLastError().

15.26.4 Model details
The following errors are not modelled:
• EFAULT signifies the pointer passed as option_value was inaccessible. On WinXP, the error WSAEFAULT may also signify that the optlen parameter was too small. Note this error is not specified by Posix.
• EINVAL signifies the option_name was invalid at the specified socket level. In the model, typing prevents an invalid flag from being specified in a call to setsocktopt().
• WSAEINPROGRESS is WinXP-specific and described in the MSDN page as “A blocking Windows Sockets 1.1 call is in progress, or the service provider is still processing a callback function”. This is not modelled here.

15.26.5 Summary
setsocktopt 1 all: fast succeed Successfully set a time-option socket flag
setsocktopt 4 all: fast fail Fail with ENOPROTOOPT: on WinXP SO_LINGER not settable for a UDP socket
setsocktopt 5 all: fast fail Fail with EDOM: timeout value too long to fit in socket structure

15.26.6 Rules

setsocktopt 1 all: fast succeed
Successfully set a time-option socket flag

h 〈[ts := ts ⊕ (tid ↦ (Run)_d);
    socks := socks ⊕ [(sid, sock)]]〉
  tid·setsocktopt(fd, f, t) −−→
h 〈[ts := ts ⊕ (tid ↦ (Ret(OK()))_sched_timer);
    socks := socks ⊕ [(sid, sock′)]]〉

fd ∈ dom(h.fds) ∧
fid = h.fds[fd] ∧
h.files[fid] = File(FT_Socket(sid), ff) ∧
tltimeopt_wf t ∧
t′ = time_of_tltimeopt t ∧
t′ ≥ 0 ∧
(if f ∈ {SO_RCVTIMEO; SO_SNDTIMEO} ∧ t′ = 0 then t′′ = ∞ else t′′ = t′) ∧
(if f = SO_LINGER ∧ t = ↑(s, ns) then ns = 0 else T) ∧
(f ∈ {SO_RCVTIMEO; SO_SNDTIMEO} ⟹ t′′ = ∞ ∨ t′′ ≤ sndrcv_timeo_t_max) ∧
sock′ = sock 〈[sf := sock.sf 〈[t := sock.sf.t ⊕ (f ↦ t′′)]〉]〉

Description
From thread tid, which is in the Run state, a setsocktopt(fd, f, t) call is made. fd refers to a socket sid which has time-option socket flags sock.sf.t; f is a time-option socket flag: either SO_RCVTIMEO or SO_SNDTIMEO; and t is the well-formed time-option value to set f to. The call succeeds.
A tid·setsocktopt(fd, f, t) transition is made, leaving the thread in state Ret(OK()). If t = ∗ or t = ↑(0, 0) then the socket’s time-option flags are updated such that sock.sf.t(f) = ∗, representing ∞; otherwise the socket’s time-option flags are updated such that f has the time value represented by t, which must be at most sndrcv_timeo_t_max.

Model details
The type of t is (int ∗ int) option, but the type of a time-option socket flag is time.
The auxiliary function time_of_tltimeopt is used to do the conversion.

setsocktopt_4  all: fast fail  Fail with ENOPROTOOPT: on WinXP SO_LINGER not settable for a UDP socket

h 〈[ts := ts ⊕ (tid ↦ (Run)_d)]〉
  -- tid·setsocktopt(fd, f, t) -->
h 〈[ts := ts ⊕ (tid ↦ (Ret(FAIL ENOPROTOOPT))_sched_timer)]〉

windows_arch h.arch ∧
fd ∈ dom(h.fds) ∧
fid = h.fds[fd] ∧
h.files[fid] = File(FT_Socket(sid), ff) ∧
proto_of (h.socks[sid]).pr = PROTO_UDP ∧
f = SO_LINGER

Description
On WinXP, from thread tid, which is in the Run state, a setsocktopt(fd, f, t) call is made. fd is a file descriptor referring to a UDP socket sid, and f is the time-option socket flag SO_LINGER. The flag f is not settable, so the call fails with an ENOPROTOOPT error.
A tid·setsocktopt(fd, f, t) transition is made, leaving the thread state Ret(FAIL ENOPROTOOPT).

Variations
FreeBSD  This rule does not apply.
Linux  This rule does not apply.

setsocktopt_5  all: fast fail  Fail with EDOM: timeout value too long to fit in socket structure

h 〈[ts := ts ⊕ (tid ↦ (Run)_d)]〉
  -- tid·setsocktopt(fd, f, t) -->
h 〈[ts := ts ⊕ (tid ↦ (Ret(FAIL EDOM))_sched_timer)]〉

f ∈ {SO_RCVTIMEO; SO_SNDTIMEO} ∧
tltimeopt_wf t ∧
t′ = time_of_tltimeopt t ∧
(if t′ = 0 then t″ = ∞ else t″ = t′) ∧
¬(t″ = ∞ ∨ t″ ≤ sndrcv_timeo_t_max)

Description
From thread tid, which is currently in the Run state, a setsocktopt(fd, f, t) call is made. f is a time-option socket flag that is either SO_RCVTIMEO or SO_SNDTIMEO, and t is the time value to set f to. The call fails with an EDOM error because the value t is too large to fit in the socket structure: it is not zero and it is greater than sndrcv_timeo_t_max.
A tid·setsocktopt(fd, f, t) transition is made, leaving the thread state Ret(FAIL EDOM).

Model details
The type of t is (int ∗ int) option, but the type of a time-option socket flag is time.
The auxiliary function time_of_tltimeopt is used to do the conversion.

15.27 shutdown() (TCP and UDP)

shutdown : (fd ∗ bool ∗ bool) → unit

A call of shutdown(fd, r, w) shuts down either the read-half of a connection, the write-half of a connection, or both. The fd is a file descriptor referring to the socket to shut down; r and w indicate whether the socket should be shut down for reading and writing respectively.

For a TCP socket, shutting down the read-half empties the socket's receive queue, but data will still be delivered to it and subsequent recv() calls will return data. Shutting down the write-half of a TCP connection causes the remaining data in the socket's send queue to be sent and then TCP's connection termination to occur. For Linux and WinXP, a TCP socket may only be shut down if it is in the ESTABLISHED state; on FreeBSD a socket may be shut down in any state.

For a UDP socket, if the socket is shut down for reading, data may still be read from the socket's receive queue on Linux, but on FreeBSD and WinXP this is not the case. Shutting down the socket for writing causes subsequent send() calls to fail.

15.27.1 Errors
A call to shutdown() can fail with the errors below, in which case the corresponding exception is raised:
ENOTCONN  The socket is not connected and so cannot be shut down.
EBADF  The file descriptor passed is not a valid file descriptor.
ENOTSOCK  The file descriptor passed does not refer to a socket.
ENOBUFS  Out of resources.

15.27.2 Common cases
A TCP socket is created and connects to a peer; data is transferred between the two; the socket has no more data to send so calls shutdown() to inform the peer of this: socket_1; ...; connect_1; ...
; shutdown_1; return_1

15.27.3 API
Posix: int shutdown(int socket, int how);
FreeBSD: int shutdown(int s, int how);
Linux: int shutdown(int s, int how);
WinXP: int shutdown(SOCKET s, int how);

In the Posix interface:
• socket is a file descriptor referring to the socket to shut down. This corresponds to the fd argument of the model shutdown().
• how is an integer specifying the type of shutdown, corresponding to the (r, w) arguments in the model shutdown(). If how is set to SHUT_RD then the read half of the connection is to be shut down, corresponding to a shutdown(fd, T, F) call in the model; if it is set to SHUT_WR then the write half of the connection is to be shut down, corresponding to a shutdown(fd, F, T) call in the model; if it is set to SHUT_RDWR then both the read and write halves of the connection are to be shut down, corresponding to a shutdown(fd, T, T) call in the model.
• the returned int is either 0 to indicate success or -1 to indicate an error, in which case the error code is in errno. On WinXP an error is indicated by a return value of SOCKET_ERROR, not -1, with the actual error code available through a call to WSAGetLastError().

The FreeBSD, Linux, and WinXP interfaces are similar, except where noted.

15.27.4 Model details
The following errors are not modelled:
• EINVAL signifies that the how argument is invalid. In the model the how argument is represented by the two boolean flags r and w, which guarantees that the only values allowed are (T, T), (T, F), (F, T), and (F, F). The first three correspond to the allowed values of how: SHUT_RDWR, SHUT_RD, and SHUT_WR respectively. The last possible value, (F, F), is not allowed by Posix, but the model allows a shutdown(fd, F, F) call, which has no effect on the socket.
• WSAEINPROGRESS is WinXP-specific and described in the MSDN page as "A blocking Windows Sockets 1.1 call is in progress, or the service provider is still processing a callback function". This is not modelled here.

15.27.5 Summary
shutdown_1  tcp: fast succeed  Shut down read or write half of TCP connection
shutdown_2  udp: fast succeed  Shut down UDP socket for reading, writing, or both
shutdown_3  tcp: fast fail  Fail with ENOTCONN: cannot shut down a socket that is not connected on Linux and WinXP
shutdown_4  udp: fast fail  Fail with ENOTCONN: socket's peer address not set on Linux

15.27.6 Rules

shutdown_1  tcp: fast succeed  Shut down read or write half of TCP connection

h 〈[ts := ts ⊕ (tid ↦ (Run)_d);
    socks := socks ⊕ [(sid, sock)]]〉
  -- tid·shutdown(fd, r, w) -->
h 〈[ts := ts ⊕ (tid ↦ (Ret(OK()))_sched_timer);
    socks := socks ⊕ [(sid, sock′)]]〉

sock = Sock(↑fid, sf, is1, ps1, is2, ps2, es, cantsndmore, cantrcvmore, pr) ∧
fd ∈ dom(h.fds) ∧
fid = h.fds[fd] ∧
h.files[fid] = File(FT_Socket(sid), ff) ∧
pr = TCP_PROTO tcp_sock ∧
if bsd_arch h.arch ∧ tcp_sock.st ∈ {CLOSED; LISTEN} ∧ w then
  let sock″ = (tcp_close h.arch sock) in
  sock′ = sock″ 〈[cantsndmore := (w ∨ cantsndmore);
                  cantrcvmore := (r ∨ cantrcvmore);
                  pr := TCP_PROTO(tcp_sock_of sock″ 〈[cb :=^ (λcb. cb 〈[bsd_cantconnect := T]〉); lis := ∗]〉)]〉
else
  (¬bsd_arch h.arch ⟹ ∃i1 p1 i2 p2. tcp_sock.st = ESTABLISHED ∧
                        is1 = ↑i1 ∧ ps1 = ↑p1 ∧ is2 = ↑i2 ∧ ps2 = ↑p2 ∧
                        tcp_sock.lis = ∗) ∧
  pr′ = TCP_PROTO(tcp_sock 〈[rcvq :=^ [ ] onlywhen r;
                              cb :=^ (λcb. cb 〈[tf_shouldacknow :=^ T onlywhen w]〉)]〉) ∧
  sock′ = Sock(↑fid, sf, is1, ps1, is2, ps2, es, w ∨ cantsndmore, r ∨ cantrcvmore, pr′)

Description
From thread tid, which is in the Run state, a shutdown(fd, r, w) call is made. fd refers to a TCP socket sid which is in the ESTABLISHED state and has binding quad (↑i1, ↑p1, ↑i2, ↑p2). The call succeeds: a tid·shutdown(fd, r, w) transition is made, leaving the thread in state Ret(OK()).
If r = T then the read-half of the connection is shut down, setting cantrcvmore = T and emptying the socket's receive queue; if w = T then the write-half of the connection is shut down, setting cantsndmore = T; otherwise, the socket is unchanged.

Variations
FreeBSD  The TCP socket can be in any state, not just ESTABLISHED. If the socket is in the CLOSED or LISTEN state and is to be shut down for writing, w = T, then the socket is closed; see tcp_close (p121). Note that testing has shown the socket's listen queue is not always set to ∗ after a shutdown() call. The precise condition for this being done needs to be investigated.

shutdown_2  udp: fast succeed  Shut down UDP socket for reading, writing, or both

h 〈[ts := ts ⊕ (tid ↦ (Run)_d);
    socks := socks ⊕ [(sid, sock 〈[cantrcvmore := cantrcvmore;
                                   cantsndmore := cantsndmore;
                                   pr := UDP_PROTO(udp_pr)]〉)]]〉
  -- tid·shutdown(fd, r, w) -->
h 〈[ts := ts ⊕ (tid ↦ (Ret(OK()))_sched_timer);
    socks := socks ⊕ [(sid, sock 〈[cantrcvmore := (r ∨ cantrcvmore);
                                   cantsndmore := (w ∨ cantsndmore);
                                   pr := UDP_PROTO(udp_pr)]〉)]]〉

fd ∈ dom(h.fds) ∧
fid = h.fds[fd] ∧
h.files[fid] = File(FT_Socket(sid), ff) ∧
(linux_arch h.arch ⟹ sock.is2 ≠ ∗)

Description
Consider a UDP socket sid, referenced by fd. From thread tid, which is in the Run state, a shutdown(fd, r, w) call is made and succeeds.
A tid·shutdown(fd, r, w) transition is made, leaving the thread state Ret(OK()). If the socket was shut down for reading when the call was made or r = T then the socket is shut down for reading. If the socket was shut down for writing when the call was made or w = T then the socket is shut down for writing.

Variations
Linux  As above, with the added condition that the socket's peer IP address must be set: sock.is2 ≠ ∗.
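The flag updates performed by the else-branch of shutdown_1 and by shutdown_2 can be illustrated with a small Python sketch. It is not part of the specification: the Sock record below is a hypothetical, heavily reduced stand-in for the model's socket type, and the function names how_to_flags and shutdown_update are invented for illustration. The mapping from the Posix how argument to the model's (r, w) pair follows the API section above.

```python
import socket
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class Sock:
    """Hypothetical reduced socket record: only the fields shutdown touches."""
    cantsndmore: bool = False
    cantrcvmore: bool = False
    rcvq: Tuple[bytes, ...] = ()


def how_to_flags(how: int) -> Tuple[bool, bool]:
    """Map the Posix `how` argument to the model's (r, w) booleans."""
    return {
        socket.SHUT_RD: (True, False),
        socket.SHUT_WR: (False, True),
        socket.SHUT_RDWR: (True, True),
    }[how]


def shutdown_update(sock: Sock, r: bool, w: bool, is_tcp: bool) -> Sock:
    """Sketch of the socket update in shutdown_1 (else-branch) and shutdown_2.

    Assumes the rule's side conditions (valid fd, connected state where
    required) have already been checked by the caller.
    """
    return replace(
        sock,
        cantsndmore=w or sock.cantsndmore,
        cantrcvmore=r or sock.cantrcvmore,
        # Only the TCP rule empties the receive queue, and only when r holds.
        rcvq=() if (is_tcp and r) else sock.rcvq,
    )
```

Because both flags are combined with their previous values by disjunction, the update is idempotent and a shutdown(fd, F, F) call leaves the record unchanged, matching the remark in the model details.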
shutdown_3  tcp: fast fail  Fail with ENOTCONN: cannot shut down a socket that is not connected on Linux and WinXP

h 〈[ts := ts ⊕ (tid ↦ (Run)_d)]〉
  -- tid·shutdown(fd, r, w) -->
h 〈[ts := ts ⊕ (tid ↦ (Ret(FAIL ENOTCONN))_sched_timer)]〉

fd ∈ dom(h.fds) ∧
fid = h.fds[fd] ∧
h.files[fid] = File(FT_Socket(sid), ff) ∧
TCP_PROTO(tcp_sock) = (h.socks[sid]).pr ∧
tcp_sock.st ≠ ESTABLISHED ∧
¬(bsd_arch h.arch)

Description
From thread tid, which is in the Run state, a shutdown(fd, r, w) call is made where fd refers to a TCP socket sid which is not in the ESTABLISHED state. The call fails with an ENOTCONN error.
A tid·shutdown(fd, r, w) transition is made, leaving the thread state Ret(FAIL ENOTCONN).

Variations
FreeBSD  This rule does not apply.

shutdown_4  udp: fast fail  Fail with ENOTCONN: socket's peer address not set on Linux

h 〈[ts := ts ⊕ (tid ↦ (Run)_d);
    socks := socks ⊕ [(sid, sock 〈[is2 := ∗; pr := UDP_PROTO(udp)]〉)]]〉
  -- tid·shutdown(fd, r, w) -->
h 〈[ts := ts ⊕ (tid ↦ (Ret(FAIL ENOTCONN))_sched_timer);
    socks := socks ⊕ [(sid, sock 〈[is2 := ∗;
                                   cantsndmore := (w ∨ sock.cantsndmore);
                                   cantrcvmore := (r ∨ sock.cantrcvmore);
                                   pr := UDP_PROTO(udp)]〉)]]〉

linux_arch h.arch ∧
fd ∈ dom(h.fds) ∧
fid = h.fds[fd] ∧
h.files[fid] = File(FT_Socket(sid), ff)

Description
On Linux, consider a UDP socket sid referenced by fd with no peer IP address set: is2 = ∗. From thread tid, which is in the Run state, a shutdown(fd, r, w) call is made, and fails with an ENOTCONN error.
A tid·shutdown(fd, r, w) transition is made, leaving the thread state Ret(FAIL ENOTCONN). Although the call fails, the socket's flags are still updated: if the socket was shut down for reading when the call was made or r = T then the socket is shut down for reading. If the socket was shut down for writing when the call was made or w = T then the socket is shut down for writing.
Variations
FreeBSD  This rule does not apply: see rule shutdown_2.
WinXP  This rule does not apply: see rule shutdown_2.

15.28 sockatmark() (TCP only)

sockatmark : fd → bool

A call to sockatmark(fd) returns a bool specifying whether or not a socket is at the urgent mark. Here fd is a file descriptor referring to a socket.

If fd refers to a TCP socket then the call will succeed, returning T if that socket is at the urgent mark, and F if it is not.

If fd refers to a UDP socket then on FreeBSD the call will return F and on all other architectures it will fail with an EINVAL error: there is no concept of urgent data for UDP so calling sockatmark() does not make sense.

15.28.1 Errors
A call to sockatmark() can fail with the errors below, in which case the corresponding exception is raised:
EINVAL  Calling sockatmark() on a UDP socket does not make sense.
EBADF  The file descriptor passed is not a valid file descriptor.
ENOTSOCK  The file descriptor passed does not refer to a socket.

15.28.2 Common cases
sockatmark_1; return_1

15.28.3 API
Posix: int sockatmark(int s);
FreeBSD: int ioctl(int d, unsigned long request, int* argp);
Linux: int ioctl(int d, int request, int* argp);
WinXP: int ioctlsocket(SOCKET s, long cmd, u_long* argp);

In the Posix interface:
• s is a file descriptor referring to a socket. This corresponds to the fd argument of the model sockatmark().
• the returned int is either 0 or 1 to indicate success or -1 to indicate an error, in which case the error code is in errno. If the return value is 1 then the socket is at the urgent mark, corresponding to a return value of T in the model sockatmark(); if the return value is 0 then the socket is not at the urgent mark, corresponding to a return value of F in the model.
The FreeBSD, Linux, and WinXP interfaces are significantly different: to check whether or not a socket is at the urgent mark, the ioctl() function must be used.

In the FreeBSD interface:
• d is a file descriptor referring to a socket, corresponding to the fd argument of the model sockatmark().
• request selects which control function is to be performed. For sockatmark(), the request is SIOCATMARK.
• argp is a pointer to a location in which to store the result of the call. If the socket is at the urgent mark then 1 will be in the location pointed to by argp upon return, corresponding to a return value of T in the model sockatmark(); if the socket is not at the urgent mark, then the location will contain the value 0, corresponding to a return value of F in the model.
• the returned int is either 0 to indicate success or -1 to indicate an error, in which case the error code is in errno. On WinXP an error is indicated by a return value of SOCKET_ERROR, not -1, with the actual error code available through a call to WSAGetLastError().

The Linux and WinXP interfaces are similar.

15.28.4 Model details
The following errors are not modelled:
• On FreeBSD, Linux, and WinXP, EFAULT can be returned if the argp parameter points to memory not in a valid part of the process address space. This is an artefact of the C interface to ioctl() that is excluded by the clean interface used in the model sockatmark().
• On FreeBSD and Linux, EINVAL can be returned if request is not a valid request. The model sockatmark() is implemented using the SIOCATMARK request, which is valid.
• ENOTTY is possible when making an ioctl() call but is not modelled.
• WSAEINPROGRESS is WinXP-specific and described in the MSDN page as "A blocking Windows Sockets 1.1 call is in progress, or the service provider is still processing a callback function". This is not modelled here.
15.28.5 Summary
sockatmark_1  tcp: fast succeed  Successfully return whether or not a TCP socket is at the urgent mark
sockatmark_2  udp: rc  Fail with EINVAL: calling sockatmark() on a UDP socket does not make sense

15.28.6 Rules

sockatmark_1  tcp: fast succeed  Successfully return whether or not a TCP socket is at the urgent mark

h 〈[ts := ts ⊕ (tid ↦ (Run)_d)]〉
  -- tid·sockatmark(fd) -->
h 〈[ts := ts ⊕ (tid ↦ (Ret(OK b))_sched_timer)]〉

fd ∈ dom(h.fds) ∧
fid = h.fds[fd] ∧
h.files[fid] = File(FT_Socket(sid), ff) ∧
h.socks[sid] = Sock(↑fid, sf, ↑i1, ↑p1, ↑i2, ↑p2, es, cantsndmore, cantrcvmore,
                    TCP_Sock(ESTABLISHED, cb, ∗, sndq, sndurp, rcvq, rcvurp, iobc)) ∧
b = (rcvurp = ↑0)

Description
From thread tid, which is in the Run state, a sockatmark(fd) call is made. fd refers to a TCP socket identified by sid which is in the ESTABLISHED state and has binding quad (↑i1, ↑p1, ↑i2, ↑p2). The call succeeds, returning T if the socket is at the urgent mark (rcvurp = ↑0), or F otherwise.
A tid·sockatmark(fd) transition is made, leaving the thread state Ret(OK b) where b is a boolean: T or F as above.

sockatmark_2  udp: rc  Fail with EINVAL: calling sockatmark() on a UDP socket does not make sense

h 〈[ts := ts ⊕ (tid ↦ (Run)_d)]〉
  -- tid·sockatmark(fd) -->
h 〈[ts := ts ⊕ (tid ↦ (Ret(ret))_sched_timer)]〉

proto_of (h.socks[sid]).pr = PROTO_UDP ∧
fd ∈ dom(h.fds) ∧
fid = h.fds[fd] ∧
h.files[fid] = File(FT_Socket(sid), ff) ∧
if bsd_arch h.arch then rc = fast succeed ∧ ret = OK(F)
else rc = fast fail ∧ ret = FAIL EINVAL

Description
Consider a UDP socket sid referenced by fd. From thread tid, which is in the Run state, a sockatmark(fd) call is made. On FreeBSD the call succeeds, returning F; on Linux and WinXP the call fails with an EINVAL error.
A tid·sockatmark(fd) transition is made, leaving the thread state Ret(OK(F)) on FreeBSD, and in state Ret(FAIL EINVAL) on Linux and WinXP.
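As an informal illustration of the two sockatmark rules, the following Python sketch computes the model-level result from the protocol, the architecture, and the receive urgent pointer. The function name, the string tags, and the use of None for ∗ are assumptions made for this example only; the side conditions of sockatmark_1 (ESTABLISHED state, fully bound quad) are assumed to hold and are not checked.

```python
from typing import Optional, Tuple, Union

Result = Tuple[str, Union[bool, str]]


def sockatmark_result(proto: str, arch: str, rcvurp: Optional[int] = None) -> Result:
    """Return ("OK", b) or ("FAIL", error name) in the spirit of sockatmark_1/2.

    proto is "TCP" or "UDP"; arch is "FreeBSD", "Linux" or "WinXP";
    rcvurp is the receive urgent pointer, with None standing for *.
    """
    if proto == "TCP":
        # sockatmark_1: b = (rcvurp = ↑0)
        return ("OK", rcvurp == 0)
    if proto == "UDP":
        # sockatmark_2: succeeds with F on FreeBSD, fails with EINVAL elsewhere
        if arch == "FreeBSD":
            return ("OK", False)
        return ("FAIL", "EINVAL")
    raise ValueError("only TCP and UDP sockets are modelled")
```

The architecture only matters for UDP sockets; for TCP the result depends solely on whether the urgent pointer is exactly zero.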
Variations
Posix  As above: the call succeeds, returning F.
FreeBSD  As above: the call succeeds, returning F.
Linux  As above: the call fails with an EINVAL error.
WinXP  As above: the call fails with an EINVAL error.

15.29 socket() (TCP and UDP)

socket : sock_type → fd

A call to socket(type) creates a new socket. Here type is the type of socket to create: SOCK_STREAM for TCP and SOCK_DGRAM for UDP. The returned fd is the file descriptor of the new socket.

15.29.1 Errors
A call to socket() can fail with the errors below, in which case the corresponding exception is raised:
EMFILE  No more file descriptors for this process.
ENOBUFS  Out of resources.
ENOMEM  Out of resources.
ENFILE  Out of resources.

15.29.2 Common cases
TCP: socket_1; return_1; connect_1; ...
UDP: socket_1; return_1; bind_1; return_1; send_9; ...

15.29.3 API
Posix: int socket(int domain, int type, int protocol);
FreeBSD: int socket(int domain, int type, int protocol);
Linux: int socket(int domain, int type, int protocol);
WinXP: SOCKET socket(int af, int type, int protocol);

In the Posix interface:
• domain specifies the communication domain in which the socket is to be created, specifying the protocol family to be used. Only IPv4 sockets are modelled here, so domain is set to AF_INET or PF_INET.
• type specifies the communication semantics: SOCK_STREAM provides sequenced, reliable, two-way, connection-based byte streams; SOCK_DGRAM supports datagrams (connectionless, unreliable messages of a fixed maximum length). This corresponds to the sock_type argument of the model socket().
• protocol specifies the particular protocol to be used for the socket. A protocol of 0 requests the default for the appropriate socket type: TCP for SOCK_STREAM and UDP for SOCK_DGRAM.
Alternatively a specific protocol number can be used: 6 for TCP and 17 for UDP. In the model, SOCK_STREAM refers to a TCP socket and SOCK_DGRAM to a UDP socket, so the protocol argument is not necessary.

A call to socket(SOCK_STREAM) in the model interface would be a socket(AF_INET, SOCK_STREAM, 0) call in Posix; a call to socket(SOCK_DGRAM) in the model interface would be a socket(AF_INET, SOCK_DGRAM, 0) call in Posix.

The FreeBSD, Linux and WinXP interfaces are similar modulo argument renaming, except where noted above.

15.29.4 Model details
The following errors are not modelled:
• In Posix and on Linux, EACCES specifies that the process does not have appropriate privileges. We do not model a privilege state in which socket creation would be disallowed.
• In Posix and on Linux, EAFNOSUPPORT specifies that the implementation does not support the address domain. FreeBSD, Linux, and WinXP all support AF_INET sockets.
• On Linux, EINVAL means unknown protocol, or protocol domain not available. Both TCP and UDP are known protocols for Linux, and AF_INET is a known domain on Linux.
• In Posix and on Linux, EPROTONOSUPPORT specifies that the protocol is not supported by the address family, or the protocol is not supported by the implementation. FreeBSD, Linux, and WinXP all support the TCP and UDP protocols.
• In Posix, EPROTOTYPE signifies that the socket type is not supported by the protocol. Both SOCK_STREAM and SOCK_DGRAM are supported by TCP and UDP respectively.
• On WinXP, WSAESOCKTNOSUPPORT means the specified socket type is not supported in this address family. The AF_INET family supports both SOCK_STREAM and SOCK_DGRAM sockets.

The AF_INET6, AF_LOCAL, AF_ROUTE, and AF_KEY address families; the SOCK_RAW socket type; and all protocols other than TCP and UDP are not modelled.
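The correspondence between the model socket() call and the Posix argument triple described in the API section can be written down as a short Python sketch. The helper name posix_socket_args and the EXPLICIT_PROTOCOL table are hypothetical and exist only for this illustration; the constants themselves come from Python's standard socket module.

```python
import socket
from typing import Tuple


def posix_socket_args(socktype: int) -> Tuple[int, int, int]:
    """Posix (domain, type, protocol) triple for a model socket(socktype) call.

    Only IPv4 stream and datagram sockets are modelled, and protocol 0
    selects the default protocol for the socket type.
    """
    if socktype not in (socket.SOCK_STREAM, socket.SOCK_DGRAM):
        raise ValueError("only SOCK_STREAM and SOCK_DGRAM are modelled")
    return (socket.AF_INET, socktype, 0)


# Explicit protocol numbers that may be passed instead of the default 0.
EXPLICIT_PROTOCOL = {
    socket.SOCK_STREAM: socket.IPPROTO_TCP,  # 6
    socket.SOCK_DGRAM: socket.IPPROTO_UDP,   # 17
}
```

A real call would then be, for example, socket.socket(*posix_socket_args(socket.SOCK_STREAM)); the sketch itself opens no sockets.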
15.29.5 Summary
socket_1  all: fast succeed  Successfully return a new file descriptor for a fresh socket
socket_2  all: fast fail  Fail with EMFILE: out of file descriptors for this process

15.29.6 Rules

socket_1  all: fast succeed  Successfully return a new file descriptor for a fresh socket

h 〈[ts := ts ⊕ (tid ↦ (Run)_d);
    fds := fds;
    files := files;
    socks := socks]〉
  -- tid·(socket(socktype)) -->
h 〈[ts := ts ⊕ (tid ↦ (Ret(OK fd))_sched_timer);
    fds := fds′;
    files := files ⊕ [(fid, File(FT_Socket(sid), ff_default))];
    socks := socks ⊕ [(sid, sock)]]〉

card(dom(fds)) < OPEN_MAX ∧
fid ∉ (dom(files)) ∧
sid ∉ (dom(socks)) ∧
nextfd h.arch fds fd ∧
fds′ = fds ⊕ (fd, fid) ∧
(case socktype of
   SOCK_DGRAM → (sock = Sock(↑fid, sf_default h.arch socktype, ∗, ∗, ∗, ∗, ∗, F, F, UDP_Sock([ ])))
 ‖ SOCK_STREAM → (sock = Sock(↑fid, sf_default h.arch socktype, ∗, ∗, ∗, ∗, ∗, F, F,
                              TCP_Sock(CLOSED, initial_cb, ∗, [ ], ∗, [ ], ∗, NO_OOBDATA))))

Description
From thread tid, which is in the Run state, a socket(socktype) call is made. The number of open file descriptors is less than the maximum permitted, OPEN_MAX. If socktype = SOCK_STREAM then a new TCP socket sock is created, in the CLOSED state, with initial_cb (p101) as its control block, and all other fields uninitialised; if socktype = SOCK_DGRAM then a new, uninitialised UDP socket sock is created. A new open file description is created pointing to the socket, and a new file descriptor, fd, is allocated in an architecture-specific way (see nextfd (p??)) to point to the open file description. The host's finite map of sockets is updated to include an entry mapping the socket identifier sid to the socket; its finite map of file descriptions is updated to add an entry mapping the file identifier fid to the file description of the socket; and its finite map of file descriptors is updated, adding a mapping from fd to fid.
A tid·socket(socktype) transition is made, leaving the thread state Ret(OK fd) to return the new file descriptor.

socket_2  all: fast fail  Fail with EMFILE: out of file descriptors for this process

h 〈[ts := ts ⊕ (tid ↦ (Run)_d)]〉
  -- tid·(socket(s)) -->
h 〈[ts := ts ⊕ (tid ↦ (Ret(FAIL EMFILE))_sched_timer)]〉

card(dom(h.fds)) ≥ OPEN_MAX

Description
From thread tid, which is in the Run state, a socket(s) call is made. The number of open file descriptors is greater than or equal to the maximum allowed number, OPEN_MAX, and so the call fails with an EMFILE error.
A tid·socket(s) transition is made, leaving the thread state Ret(FAIL EMFILE).

15.30 Miscellaneous (TCP and UDP)

This section collects the remaining Sockets API rules:
• The rule return_1 characterises how the results of system calls are returned to the caller, with transitions from the thread state (Ret v)_d.
• Rules badf_1 and notsock_1 deal with all the Sockets API calls that take a file descriptor argument, dealing uniformly with the error cases in which that file descriptor is not valid or does not refer to a socket.
• Rule intr_1 applies to all the thread states for blocked calls, Accept2(sid) etc., characterising the behaviour in the case where the call is interrupted by a signal.
• Rules resourcefail_1 and resourcefail_2 deal with the cases where calls fail due to a lack of system resources.

15.30.1 Errors
Common errors.
EBADF  The file descriptor passed is not a valid file descriptor.
ENOTSOCK  The file descriptor passed does not refer to a socket.
EINTR  The system call was interrupted by a caught signal.
ENOMEM  Out of resources.
ENOBUFS  Out of resources.
ENFILE  Out of resources.
15.30.2 Summary
return_1  all: misc nonurgent  Return result of system call to caller
badf_1  all: fast fail  Fail with EBADF: not a valid file descriptor
notsock_1  all: fast fail  Fail with ENOTSOCK: file descriptor not a valid socket
intr_1  all: slow nonurgent fail  Fail with EINTR: blocked system call interrupted by signal
resourcefail_1  all: fast badfail  Fail with ENFILE, ENOBUFS or ENOMEM: out of resources
resourcefail_2  all: slow nonurgent badfail  Fail with ENFILE, ENOBUFS or ENOMEM: from a blocked state when out of resources

15.30.3 Rules

return_1  all: misc nonurgent  Return result of system call to caller

h 〈[ts := ts ⊕ (tid ↦ (Ret v)_d)]〉
  -- tid·v -->
h 〈[ts := ts ⊕ (tid ↦ (Run)_never_timer)]〉

T

Description
A system call from thread tid has completed, leaving the thread state (Ret v)_d. The value v (which may be of the form OK v′ or FAIL v′, for success or failure respectively) is returned to the caller before the timer d expires. The thread continues its execution, indicated by the resulting thread state (Run)_never_timer.

badf_1  all: fast fail  Fail with EBADF: not a valid file descriptor

h 〈[ts := ts ⊕ (tid ↦ (Run)_d)]〉
  -- tid·opn -->
h 〈[ts := ts ⊕ (tid ↦ (Ret(FAIL e))_sched_timer)]〉

fd_op fd opn ∧
fd ∉ dom(h.fds) ∧
(if windows_arch h.arch then e = ENOTSOCK
 else e = EBADF)

Description
From thread tid, which is in the Run state, a system call opn is made. The call requires a single valid file descriptor, but the descriptor passed, fd, is not valid: it does not refer to an open file description. The call fails with an EBADF error, or an ENOTSOCK error on WinXP.
A tid·opn transition is made, leaving the thread state Ret(FAIL e) where e is one of the above errors.
The system calls this rule applies to are: accept(), bind(), close(), connect(), disconnect(), dup(), dupfd(), getfileflags(), setfileflags(), getsockname(), getpeername(), getsockbopt(), getsockerr(), getsocklistening(), getsocknopt(), getsocktopt(), listen(), recv(), send(), setsockbopt(), setsocknopt(), setsocktopt(), shutdown(), and sockatmark(). See the definition of fd_op (p35).

Variations
FreeBSD  As above: the call fails with an EBADF error.
Linux  As above: the call fails with an EBADF error.
WinXP  As above: the call fails with an ENOTSOCK error.

notsock_1  all: fast fail  Fail with ENOTSOCK: file descriptor not a valid socket

h 〈[ts := ts ⊕ (tid ↦ (Run)_d)]〉
  -- tid·opn -->
h 〈[ts := ts ⊕ (tid ↦ (Ret(FAIL ENOTSOCK))_sched_timer)]〉

fd_sockop fd opn ∧
fd ∈ dom(h.fds) ∧
fid = h.fds[fd] ∧
h.files[fid] = File(ft, ff) ∧
¬(∃sid. ft = FT_Socket(sid))

Description
From thread tid, which is in the Run state, a system call opn is made. The call requires a single file descriptor referring to a socket. The file descriptor fd that the user passes refers to an open file description File(ft, ff) that does not refer to a socket. The call fails with an ENOTSOCK error.
A tid·opn transition is made, leaving the thread state Ret(FAIL ENOTSOCK).
The system calls this rule applies to are: accept(), bind(), connect(), disconnect(), getpeername(), getsockbopt(), getsockerr(), getsocklistening(), getsockname(), getsocknopt(), getsocktopt(), listen(), recv(), send(), setsockbopt(), setsocknopt(), setsocktopt(), shutdown(), and sockatmark(). See the definition of fd_sockop (p35).
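The way badf_1 and notsock_1 partition the failing cases can be sketched in Python as follows. This is an illustration under stated assumptions, not the specification's definition: the function name check_fd, the dictionary encoding of h.fds and h.files, and the "FT_Other" tag for a non-socket file type are all hypothetical, and the call is assumed to satisfy both fd_op and fd_sockop.

```python
from typing import Dict, Optional, Tuple

# files maps a file identifier to (file type tag, optional socket identifier).
FileTable = Dict[int, Tuple[str, Optional[int]]]


def check_fd(fd: int, fds: Dict[int, int], files: FileTable, arch: str) -> Optional[str]:
    """Return the error chosen by badf_1 or notsock_1, or None if neither applies."""
    if fd not in fds:
        # badf_1: fd ∉ dom(h.fds); WinXP reports ENOTSOCK instead of EBADF.
        return "ENOTSOCK" if arch == "WinXP" else "EBADF"
    file_type, _sid = files[fds[fd]]
    if file_type != "FT_Socket":
        # notsock_1: the open file description does not refer to a socket.
        return "ENOTSOCK"
    return None
```

The two rules have disjoint premises (fd ∉ dom(h.fds) versus fd ∈ dom(h.fds)), so at most one of them can fire for a given call.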
intr_1  all: slow nonurgent fail  Fail with EINTR: blocked system call interrupted by signal

h 〈[ts := ts ⊕ (tid ↦ (st)_d)]〉
  -- τ -->
h 〈[ts := ts ⊕ (tid ↦ (Ret(FAIL EINTR))_sched_timer)]〉

sock = (h.socks[sid]) ∧
(st = Close2(sid) ∨
 st = Connect2(sid) ∨
 st = Recv2(sid, n, opts) ∨
 st = Send2(sid, addr, str, opts) ∨
 st = PSelect2(readfds, writefds, exceptfds) ∨
 st = Accept2(sid))

Description
If a user call on socket sid has blocked, leaving a thread in one of the states Close2(sid), Connect2(sid), Recv2(sid), Send2(sid), PSelect2(sid) or Accept2(sid), and a signal is caught, the call fails, returning the error EINTR.

Model details
This rule is non-deterministic, allowing blocked calls to be interrupted at any point, as the specification does not model the dynamics of signals.

Variations
POSIX  POSIX says that a system call "shall fail" if "interrupted by a signal".

resourcefail_1  all: fast badfail  Fail with ENFILE, ENOBUFS or ENOMEM: out of resources

h 〈[ts := ts ⊕ (tid ↦ (Run)_d)]〉
  -- tid·call -->
h 〈[ts := ts ⊕ (tid ↦ (Ret(FAIL e))_sched_timer)]〉

¬INFINITE_RESOURCES ∧
fd ∈ dom(h.fds) ∧
fid = h.fds[fd] ∧
h.files[fid] = File(FT_Socket(sid), ff) ∧
sock = (h.socks[sid]) ∧
((call = socket(socktype) ∧ e ∈ {ENFILE; ENOBUFS; ENOMEM}) ∨
 (call = bind(fd, is1, ps1) ∧ e = ENOBUFS) ∨
 (call = connect(fd, i2, ↑p2) ∧ e = ENOBUFS) ∨
 (call = listen(fd, n) ∧ e = ENOBUFS) ∨
 (call = recv(fd, n, opts) ∧ e ∈ {ENOMEM; ENOBUFS}) ∨
 (call = getsockname(fd) ∧ e = ENOBUFS) ∨
 (call = getpeername(fd) ∧ e = ENOBUFS) ∨
 (call = shutdown(fd, r, w) ∧ e = ENOBUFS) ∨
 (call = accept(fd) ∧ e ∈ {ENFILE; ENOBUFS; ENOMEM} ∧ proto_of sock.pr = PROTO_TCP))

Description
Thread tid performs a socket(), bind(), connect(), listen(), recv(), getsockname(), getpeername(), shutdown() or accept() system call on socket sid, referred to by fd, when insufficient system-wide resources are available to complete the request.
A failure of ENFILE, ENOBUFS or ENOMEM is returned immediately to the calling thread.
This rule applies only when it is assumed that the host being modelled does not have INFINITE_RESOURCES, i.e. the host does not have unlimited memory, mbufs, file descriptors, etc.

Model details
The modelling of failure is deliberately non-deterministic because the cause of errors such as ENFILE is determined by more than is modelled in this specification. In order to be more precise, the model would need to describe the whole system to determine when such error conditions could and should arise.

resourcefail_2  all: slow nonurgent badfail  Fail with ENFILE, ENOBUFS or ENOMEM: from a blocked state when out of resources

h 〈[ts := ts ⊕ (tid ↦ (t)_d)]〉
  -- τ -->
h 〈[ts := ts ⊕ (tid ↦ (Ret(FAIL e))_sched_timer)]〉

¬INFINITE_RESOURCES ∧
sock = (h.socks[sid]) ∧
((t = Accept2(sid) ∧ e ∈ {ENFILE; ENOBUFS; ENOMEM}) ∨
 (t = Connect2(sid) ∧ e = ENOBUFS) ∨
 (t = Recv2(sid, n, opts) ∧ e ∈ {ENOBUFS; ENOMEM}))

Description
If thread tid of host h is in state Accept2(sid), Connect2(sid) or Recv2(sid) following an accept(), connect() or recv() system call that blocked, and the host has subsequently exhausted its system-wide resources, the call fails with ENFILE, ENOBUFS or ENOMEM. The error is immediately returned to the thread that made the system call.
Calls to connect() only return ENOBUFS when resources are exhausted, and calls to recv() only return ENOBUFS or ENOMEM.
This rule applies only when it is assumed that the host being modelled does not have INFINITE_RESOURCES, i.e. the host does not have unlimited memory, mbufs, file descriptors, etc.

Model details
The modelling of failure is deliberately non-deterministic because the cause of errors such as ENFILE is determined by more than is modelled in this specification.
In order to be more precise, the model would need to describe the whole system to determine when such error conditions could and should arise.

Chapter 16
Host LTS: TCP Input Processing

16.1 Input Processing (TCP only)

These rules deal with the processing of TCP segments from the host’s input queue. The most important are deliver_in_1, deliver_in_2, and deliver_in_3.

deliver_in_1 deals with a passive open: a socket in LISTEN state that receives a SYN and sends a SYN,ACK.

deliver_in_2 deals with the completion of an active open: a socket in SYN_SENT state (that has previously sent a SYN with the connect_1 rule) that receives a SYN,ACK and sends an ACK. It also deals with simultaneous opens.

deliver_in_3 deals with the common cases of TCP data exchange and connection close: sockets in connected states that receive data, ACKs, and FINs. This rule is structured using the relational monad, combining auxiliaries di3_topstuff, di3_ackstuff, di3_datastuff etc., to factor out many of the imperative effects of the code.

The other rules deal with RSTs and a variety of pathological situations.

16.1.1 Summary

deliver_in_1   tcp: network nonurgent   Passive open: receive SYN, send SYN,ACK
deliver_in_1b  tcp: network nonurgent   For a listening socket, receive and drop a bad datagram and either generate a RST segment or ignore it. Drop the incoming segment if the socket’s queue of incomplete connections is full.
deliver_in_2   tcp: network nonurgent   Completion of active open (in SYN_SENT receive SYN,ACK and send ACK) or simultaneous open (in SYN_SENT receive SYN and send SYN,ACK)
deliver_in_2a  tcp: network nonurgent   Receive bad or boring datagram and RST or ignore for SYN_SENT socket
deliver_in_3   tcp: network nonurgent   Receive data, FINs, and ACKs in a connected state
di3_topstuff           deliver_in_3 initial checks
di3_newackstuff        deliver_in_3 new ack processing, used in di3_ackstuff
di3_ackstuff           deliver_in_3 ACK processing
di3_datastuff_really   deliver_in_3 data processing
di3_datastuff          deliver_in_3 data processing
di3_ststuff            deliver_in_3 TCP state change processing
di3_socks_update       deliver_in_3 socket update processing
deliver_in_3a  tcp: network nonurgent   Receive data with invalid checksum or offset
deliver_in_3b  tcp: network nonurgent   Receive data after process has gone away
deliver_in_3c  tcp: network nonurgent   Receive stupid ACK or LAND DoS in SYN_RECEIVED state
deliver_in_4   tcp: network nonurgent   Receive and drop (silently) a non-sane or martian segment
deliver_in_5   tcp: network nonurgent   Receive and drop (maybe with RST) a sane segment that does not match any socket
deliver_in_6   tcp: network nonurgent   Receive and drop (silently) a sane segment that matches a CLOSED socket
deliver_in_7   tcp: network nonurgent   Receive RST and zap non-{CLOSED; LISTEN; SYN_SENT; SYN_RECEIVED; TIME_WAIT} socket
deliver_in_7a  tcp: network nonurgent   Receive RST and zap SYN_RECEIVED socket
deliver_in_7b  tcp: network nonurgent   Receive RST and ignore for LISTEN socket
deliver_in_7c  tcp: network nonurgent   Receive RST and ignore for SYN_SENT (unacceptable ack) or TIME_WAIT socket
deliver_in_7d  tcp: network nonurgent   Receive RST and zap SYN_SENT (acceptable ack) socket
deliver_in_8   tcp: network nonurgent   Receive SYN in non-{CLOSED; LISTEN; SYN_SENT; TIME_WAIT} state
deliver_in_9   tcp: network nonurgent   Receive SYN in TIME_WAIT state if there is no matching LISTEN socket or sequence number
has not increased

16.1.2 Rules

deliver_in_1 tcp: network nonurgent Passive open: receive SYN, send SYN,ACK

h 〈[socks := socks ⊕ [(sid, sock)];
    iq := iq;
    oq := oq]〉
--τ-->
h 〈[socks := socks′ ⊕
      (* Listening socket *)
      [(sid, Sock(↑fid, sf, is1, ↑p1, is2, ps2, es, cantsndmore, cantrcvmore,
                  TCP_Sock(LISTEN, cb, ↑lis′, [ ], ∗, [ ], ∗, NO_OOBDATA)));
      (* New socket formed by the incoming SYN *)
       (sid′, Sock(∗, sf′, ↑i1, ↑p1, ↑i2, ↑p2, ∗, cantsndmore, cantrcvmore,
                   TCP_Sock(SYN_RECEIVED, cb′′, ∗, [ ], ∗, [ ], ∗, NO_OOBDATA)))];
    iq := iq′;
    oq := oq′]〉

(* Summary: A host h with listening socket sock referenced by index sid receives a valid and well-formed SYN segment seg addressed to socket sock. A new socket in the SYN_RECEIVED state, referenced by sid′ (≠ sid), is constructed and added to the queue of incomplete incoming connection attempts q, and a SYN,ACK segment is generated in reply with some field values being chosen or negotiated. The reply segment is finally queued on the host’s output queue for transmission, ignoring any errors upon queueing failure. *)
sid ∉ dom(socks) ∧ sid′ ∉ dom(socks) ∧ sid ≠ sid′ ∧
(* Take TCP segment seg from the head of the host’s input queue *)
dequeue_iq(iq, iq′, ↑(TCP seg)) ∧
(* The segment must be of an acceptable form *)
(* Note: some segment fields are ignored during TCP connection establishment and as such may contain arbitrary values. These are equal to the identifiers postfixed with _discard below, which are otherwise unconstrained. *)
(∃win ws mss PSH_discard URG_discard FIN_discard urp_discard data_discard ack_discard.
seg = 〈[ is1 := ↑ i2; is2 := ↑ i1; ps1 := ↑ p2; ps2 := ↑ p1; seq := tcp seq flip sense(seq : tcp seq foreign); ack := tcp seq flip sense(ack discard : tcp seq local); Rule version: $Id: TCP1 hostLTSScript.sml,v 1.961 2005/03/18 10:34:36 kw217 Exp $ deliver in 1 280 URG :=URG discard ; ACK :=F; (* ACK must be F in a SYN segment *) PSH :=PSH discard ; RST :=F; (* Valid SYN segments never have RST set *) SYN :=T; (* Is a SYN segment! *) FIN :=FIN discard ; win :=win ; ws :=ws ; urp := urp discard ; mss :=mss ; ts := ts; data := data discard ]〉 ∧ (* Equality of some type casts *) w2n win = win ∧ option map ord ws = ws ∧ option map w2n mss = mss ) ∧ (* The segment is addressed to an IP address belonging to one of the interfaces of host h and is not addressed from or to a link-layer multicast or an IP-layer broadcast address *) i1 ∈ local ips h.ifds ∧ ¬(is broadormulticast h.ifds i1) ∧ ¬(is broadormulticast h.ifds i2) ∧ (* Find the socket sock that has the best match for the address quad in segment seg , see tcp socket best match (p86). Socket sock must have a form matching the patten Sock(. . . ). *) tcp socket best match socks(sid , sock)seg h.arch ∧ sock = Sock(↑ fid , sf , is1, ↑ p1, is2, ps2, es, cantsndmore, cantrcvmore, TCP Sock(LISTEN, cb, ↑ lis, [ ], ∗, [ ], ∗,NO OOBDATA)) ∧ (* A BSD socket in the LISTEN state may have its peer’s IP address is2 and port ps2 set because listen() can be called from any TCP state. On other architectures they are both constrained to ∗. *) ((is2 = ∗ ∧ ps2 = ∗) ∨ (bsd arch h.arch ∧ is2 = ↑ i2 ∧ ps2 = ↑ p2)) ∧ (* If socket sid has a local IP address specified it should be the same as the destination IP address of the segment seg , otherwise the seg is not addressed to this socket. If the socket does not have a local IP address the segment is acceptable because the socket is listening on all local IP addresses. The segment must not have been sent by socket sock . Note: a socket is permitted to connect to itself by a simultaneous open. 
This is handled by deliver in 2 (p285) and not here. *) (case is1 of ↑ i1 ′ → i1 ′ = i1 ‖ ∗ → T) ∧ ¬(i1 = i2 ∧ p1 = p2) ∧ (* If another socket in the TIME WAIT state matches the address quad of the SYN segment then only proceed with the new incoming connection attempt if the sequence number of the segment seq is strictly greater than the next expected sequence number on the TIME WAIT socket, rcv nxt . This prevents old or duplicate SYN segments from previous incarnations of the connection from inadvertently creating new connections. *) ¬(∃(sid , sock) :: socks. ∃tcp sock . sock .pr = TCP PROTO(tcp sock) ∧ tcp sock .st = TIME WAIT ∧ sock .is1 = ↑ i1 ∧ sock .ps1 = ↑ p1 ∧ sock .is2 = ↑ i2 ∧ sock .ps2 = ↑ p2 ∧ seq ≤ tcp sock .cb.rcv nxt) ∧ (* Otherwise, the TIME WAIT sock is completely defunct because there is a new connection attempt from the same remote end-point. Close it completely. *) Rule version: $Id: TCP1 hostLTSScript.sml,v 1.961 2005/03/18 10:34:36 kw217 Exp $ deliver in 1 281 (* Note: this models the behaviour in RFC1122 Section 4.2.2.13 which states that a new SYN with a sequence number larger than the maximum seen in the last incarnation may reopen the connection, i.e., reuse the socket for the new connection changing out of the TIME WAIT state. This is modelled by closing the existing TIME WAIT socket and creating the new socket from scratch. *) socks ′ = $o f (λsock . 
if ∃tcp_sock. sock.pr = TCP_PROTO(tcp_sock) ∧
   tcp_sock.st = TIME_WAIT ∧
   sock.is1 = ↑i1 ∧ sock.ps1 = ↑p1 ∧
   sock.is2 = ↑i2 ∧ sock.ps2 = ↑p2
then tcp_close h.arch sock
else sock) socks ∧
(* Accept the new connection attempt to the incomplete connection queue if the queue of completed (established) connections is not already full *)
accept_incoming_q0 lis T ∧
(* Possibly drop an arbitrary connection from the queue of incomplete connection attempts. This covers the behaviour of FreeBSD when the oldest connection in the SYN bucket or in the whole SYN cache is dropped, depending upon which became full. *)
(choose drop :: drop_from_q0 lis.
   if drop then ∃q0L sid′′ q0R. lis.q0 = q0L @ (sid′′ :: q0R) ∧ q0′ = q0L @ q0R
   else q0′ = lis.q0
) ∧
(* Put the new incomplete connection on the (possibly pruned) incomplete connections queue. *)
lis′ = lis 〈[ q0 := sid′ :: q0′ ]〉 ∧
(* Create a SYN,ACK segment in reply: *)
(* The maximum segment size of the outgoing SYN,ACK reply segment must be in range, i.e., at most the maximum IP datagram size minus the space consumed by IP and TCP headers. This is deliberately non-deterministic: an implementation would query the interface’s MTU and subtract the header space required. *)
advmss ∈ {n | n ≥ 1 ∧ n ≤ (65535 − 40)} ∧
(* Be non-deterministic in deciding whether to transmit a maximum segment size option. A host either supports the maximum segment size option or not. Here the specification permits either sending the option or not, but if the option is sent it must contain the advertised mss chosen previously by the host. This captures all acceptable behaviour. *)
advmss′ ∈ {∗; ↑advmss} ∧
(* If a timestamp option was present in the received segment and a non-deterministic choice is made to do timestamping on this connection (i.e., the host supports timestamping), then timestamping is being used for this connection. Otherwise, timestamping is not used because one or both hosts do not support it.
A real host would either do timestamping or not depending on its configuration. Here all acceptable behaviour must be permitted. *)
tf_rcvd_tstmp′ = is_some ts ∧
(choose want_tstmp :: {F; T}.
   tf_doing_tstmp′ = (tf_rcvd_tstmp′ ∧ want_tstmp)
) ∧
(* Look up the bandwidth delay product from the route metric cache and calculate the size of the receive and send buffers, the maximum segment size and the initial congestion window. *)
bw_delay_product_for_rt = ∗ ∧
(rcvbufsize′, sndbufsize′, t_maxseg′, snd_cwnd′) =
   calculate_buf_sizes advmss mss bw_delay_product_for_rt (is_localnet h.ifds i2)
      (sf.n(SO_RCVBUF)) (sf.n(SO_SNDBUF)) tf_doing_tstmp′ h.arch ∧
(* Store the new receive and send buffer sizes *)
sf′ = sf 〈[ n := funupd_list sf.n [(SO_RCVBUF, rcvbufsize′); (SO_SNDBUF, sndbufsize′)] ]〉 ∧
(* Non-deterministically choose to do window scaling (i.e., choose whether this host supports window scaling or not). Do window scaling on the new connection if the received SYN segment contained a window scaling option and this host supports it. A real host would either be configured to do window scaling or not (provided it supported window scaling). Here all acceptable behaviour must be permitted. *)
req_ws ∈ {F; T} ∧
tf_doing_ws′ = (req_ws ∧ is_some ws) ∧
(if tf_doing_ws′ then
   (* Doing window scaling *)
   (* Constrain the receive scale to be within the correct range and the send scale to be that received from the remote host *)
   rcv_scale′ ∈ {n | n ≥ 0 ∧ n ≤ TCP_MAXWINSCALE} ∧
   snd_scale′ = option_case 0 I ws
 else
   (* Otherwise, turn off scaling *)
   rcv_scale′ = 0 ∧ snd_scale′ = 0) ∧
(* Constrain the receive window for the new connection; this is advertised in the SYN,ACK reply. No scaling is performed here, as scaling is not applied to segments containing a valid SYN since the support for window scaling has not been fully negotiated yet!
*) rcv window ∈ {n | n ≥ 0 ∧ n ≤ TCP MAXWIN∧ n ≤ sf .n(SO RCVBUF)} ∧ (* Time the SYN,ACK reply segment. This is a new connection thus no previous timers can be running. *) (let t rttseg ′ = ↑(ticks of h.ticks, cb.snd nxt) in (* Initial sequence number of SYN ,ACK reply segment is unconstrained. *) iss ∈ {n | T} ∧ (* The ack value in the reply segment must acknowledge the remote host’s initial SYN . *) let ack ′ = seq + 1 in (* Update the new connection’s control block in light of above. *) cb′ = cb 〈[ tt keep := ↑((())slow timer TCPTV KEEP IDLE); tt rexmt := start tt rexmt h.arch 0 F cb.t rttinf ; iss := iss; irs := seq ; rcv wnd := rcv window ; tf rxwin0sent :=(rcv window = 0); rcv adv := ack ′ + rcv window ; rcv nxt := ack ′; snd una := iss; snd max := iss + 1; (* SYN consumes one-byte of sequence space *) snd nxt := iss + 1; (* SYN consumes one-byte of sequence space *) snd cwnd := snd cwnd ′; rcv up := seq + 1; (* Pull along with left edge of unused window *) t maxseg := t maxseg ′; (* The negotiated mss, with options removed *) tadvmss := advmss ′; (* Remember the mss advertised (if any) by this socket in case the SYN segment is retransmitted *) rcv scale := rcv scale ′; snd scale := snd scale ′; tf doing ws := tf doing ws ′; ts recent := case ts of Rule version: $Id: TCP1 hostLTSScript.sml,v 1.961 2005/03/18 10:34:36 kw217 Exp $ deliver in 1b 283 ∗ → cb.ts recent ‖ ↑(ts val , ts ecr)→ (ts val)TimeWindowkern timer dtsinval ; last ack sent := ack ′; t rttseg := t rttseg ′; tf req tstmp := tf doing tstmp′; tf doing tstmp := tf doing tstmp′ ]〉) ∧ (* Construct the SYN,ACK segment using the values stored in the updated control block for the new connection. See make syn ack segment (p107). *) choose seg ′ :: make syn ack segment cb′(i1, i2, p1, p2)(ticks of h.ticks). (* Add the SYN,ACK reply segment to the host’s output queue, ignoring failure. 
Constrain the new connection’s initial control block cb to have just the right values in case queueing of the segment fails (perhaps due to a routing failure) and some control block state has to be rolled back. See rollback_tcp_output (p117) and enqueue_or_fail (p118) for more detail. *)
enqueue_or_fail T h.arch h.rttab h.ifds [TCP seg′] oq
   (cb 〈[ snd_nxt := iss; (* If queueing fails, need to retransmit the SYN *)
          snd_max := iss; (* If queueing fails, need to retransmit the SYN *)
          t_maxseg := t_maxseg′;
          last_ack_sent := tcp_seq_foreign 0w;
          rcv_adv := tcp_seq_foreign 0w ]〉)
   cb′ (cb′′, oq′)

Model details
During TCP connection establishment, BSD uses syn-caches and syn-buckets to protect against some types of denial-of-service attack. These techniques delay the memory allocation for a socket’s data structures until connection establishment is complete. They are not modelled directly in this specification, which instead favours the use of the full socket structure for clarity. The behaviour is observationally equivalent provided correct bounds are applied to the lengths of the incoming connection queues.

When a socket completes connection establishment, i.e., enters the ESTABLISHED state, BSD updates the socket’s control block t_maxseg field to the minimum of the maximum segment size it advertised in the emitted SYN,ACK segment and that received in the SYN segment from the remote end. This update happens later than it perhaps needs to. This model updates t_maxseg as soon as both maximum segment sizes are known. As a consequence, the initial maximum segment size advertised by the host must be stored in case the SYN,ACK segment needs to be retransmitted.

Variations
FreeBSD   On FreeBSD, listen() can be called on a TCP socket in any state, so a listening TCP socket may have a peer address, i.e., an is2 and ps2 pair, specified.
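The constraint that deliver_in_1 places on a listening socket’s peer address, which this variation relies on, can be sketched as follows. This is a hedged Python illustration rather than the HOL definition: peer_address_ok and its parameter names are hypothetical, None stands for the wildcard ∗, and the architecture test bsd_arch h.arch is reduced to a boolean argument.

```python
# Illustrative sketch only: mirrors the side condition
#   (is2 = * and ps2 = *) or (bsd_arch h.arch and is2 = i2 and ps2 = p2)
# from deliver_in_1. All names here are assumptions for this example.

def peer_address_ok(is2, ps2, seg_src_ip, seg_src_port, bsd_arch):
    """Decide whether a LISTEN socket's peer address admits the segment.

    is2, ps2     -- the socket's peer IP and port, or None for the wildcard
    seg_src_ip   -- source IP of the incoming SYN (i2 in the rule)
    seg_src_port -- source port of the incoming SYN (p2 in the rule)
    bsd_arch     -- True when the modelled host is a BSD architecture
    """
    if is2 is None and ps2 is None:
        # The usual case on every architecture: no peer address is set.
        return True
    # Only BSD permits a listening socket with a peer address, and the
    # segment must then come from exactly that peer.
    return bsd_arch and is2 == seg_src_ip and ps2 == seg_src_port


if __name__ == "__main__":
    print(peer_address_ok(None, None, "10.0.0.2", 4000, bsd_arch=False))
    print(peer_address_ok("10.0.0.2", 4000, "10.0.0.2", 4000, bsd_arch=True))
```

Only the second disjunct admits a listening socket that has a peer address set, and only on a BSD architecture.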
A peer address set in this way affects the behaviour of connection establishment: an incoming SYN segment matches this type of listening socket only if its address quad matches the socket’s entire address quad, which heavily restricts the usefulness of such a socket. Such a restrictive peer address binding is permitted by the model for FreeBSD only.

deliver_in_1b tcp: network nonurgent For a listening socket, receive and drop a bad datagram and either generate a RST segment or ignore it. Drop the incoming segment if the socket’s queue of incomplete connections is full.

h 〈[socks := socks ⊕ [(sid, sock)];
    iq := iq;
    oq := oq;
    bndlm := bndlm]〉
--τ-->
h 〈[socks := socks ⊕ [(sid, sock)];
    iq := iq′;
    oq := oq′;
    bndlm := bndlm′]〉

(* Summary: A host h with listening socket sock referenced by index sid receives a segment seg addressed to socket sock. The segment either contains an invalid combination of the SYN and ACK flags, is a forged segment trying to force the listening socket sock to connect to itself, or the new incomplete connection cannot be added to the queue of incomplete connections because the completed connections queue is full. The segment is dropped. If the segment had the ACK flag set and not SYN, a RST segment is generated and added to the host’s output queue oq for transmission. *)
(* Take TCP segment seg from the head of the host’s input queue *)
dequeue_iq(iq, iq′, ↑(TCP seg)) ∧
(* The segment must be of an acceptable form *)
(* Note: some segment fields are ignored during TCP connection establishment and as such may contain arbitrary values. These are equal to the identifiers postfixed with _discard below, which are otherwise unconstrained. *)
(∃seq_discard ack_discard URG_discard PSH_discard FIN_discard win_discard ws_discard urp_discard mss_discard ts_discard data_discard.
seg =〈[ is1 := ↑ i2; is2 := ↑ i1; ps1 := ↑ p2; ps2 := ↑ p1; seq := tcp seq flip sense(seq discard : tcp seq foreign); ack := tcp seq flip sense(ack discard : tcp seq local); URG :=URG discard ; ACK :=ACK ; (* might be set in a bad SYN segment *) PSH :=PSH discard ; RST :=F; (* SYN segments never have RST set *) SYN :=SYN ; (* might not be set in a bad segment to a listening socket *) FIN :=FIN discard ; win :=win discard ; ws :=ws discard ; urp := urp discard ; mss :=mss discard ; ts := ts discard ; data := data discard ]〉 ) ∧ (* Segment is addressed to an IP address belonging to one of the interfaces of host h and is not a link-layer multicast or IP-layer broadcast address *) i1 ∈ local ips h.ifds ∧ ¬(is broadormulticast h.ifds i1)∧ (* very unlikely, since i1 ∈ local ips h.ifds *) ¬(is broadormulticast h.ifds i2) ∧ (* Find the socket sock that has the best match for the address quad in segment seg , see tcp socket best match (p86). Socket sock must have a form matching the patten Sock(. . . ). *) tcp socket best match(socks\\sid)(sid , sock)seg h.arch ∧ sock = Sock(↑ fid , sf , is1, ↑ p1, is2, ps2, es, cantsndmore, cantrcvmore, TCP Sock(LISTEN, cb, ↑ lis, sndq , sndurp, rcvq , rcvurp, iobc)) ∧ (* If socket sock has a local IP address specified it should be the same as the destination IP address of segment seg . *) (case is1 of ↑ i1 ′ → i1 ′ = i1 ‖ ∗ → T) ∧ Rule version: $Id: TCP1 hostLTSScript.sml,v 1.961 2005/03/18 10:34:36 kw217 Exp $ deliver in 2 285 (* A BSD socket in the LISTEN state may have its peer’s IP address is2 and port ps2 set because listen() can be called from any TCP state. On other architectures they are both constrained to ∗. 
*) ((is2 = ∗ ∧ ps2 = ∗) ∨ (bsd arch h.arch ∧ is2 = ↑ i2 ∧ ps2 = ↑ p2)) ∧ (* Check that either: (a) the SYN , ACK flag combination is bad, or (b) the socket is illegally connecting to itself (Note: it is not possible to perform a self-connect once a socket is in the LISTEN state by using the sockets interface alone – it can only be achieved by a forged incoming segment. It is possible for a TCP socket to connect to itself but this is achieved through a sequence of socket calls that avoids entering the LISTEN state), or (c) the new incomplete connection can not be added to the incomplete connections queue because the queue of complete connections is full. *) (ACK ∨ (¬SYN ∧ ¬ACK ) ∨ (SYN ∧ ¬ACK ∧ i1 = i2 ∧ p1 = p2) ∨ accept incoming q0 lis F ) ∧ (* If an ACK with no SYN has been received send a RST segment, else just silently drop everything else. See dropwithreset (p120). *) (if ¬SYN ∧ACK then dropwithreset seg h.ifds(ticks of h.ticks)BANDLIM RST OPENPORT bndlm bndlm ′ outsegs else outsegs = [ ] ∧ bndlm ′ = bndlm) ∧ (* Add the RST segment (if any) to the host’s output queue, ignoring failure. See enqueue and ignore fail (p118). *) enqueue and ignore fail h.arch h.rttab h.ifds outsegs oq oq ′ deliver in 2 tcp: network nonurgent Completion of active open (in SYN SENT receive SYN,ACK and send ACK) or simultaneous open (in SYN SENT receive SYN and send SYN,ACK) h 〈[socks := socks ⊕ [(sid ,Sock(↑ fid , sf , ↑ i1, ↑ p1, ↑ i2, ↑ p2, es, cantsndmore, cantrcvmore,TCP PROTO tcp sock))]; iq := iq ; oq := oq ]〉 τ−→ h 〈[socks := socks ⊕ [(sid ,Sock(↑ fid , sf ′, ↑ i1, ↑ p1, ↑ i2, ↑ p2, es, cantsndmore, cantrcvmore ′, TCP Sock(st ′, cb′′, ∗, [ ], ∗, rcvq ′, rcvurp′, iobc′)))]; iq := iq ′; oq := oq ′]〉 tcp sock = TCP Sock0(SYN SENT, cb, ∗, [ ], ∗, [ ], ∗,NO OOBDATA) ∧ (* Take TCP segment seg from the head of the host’s input queue *) dequeue iq(iq , iq ′, ↑(TCP seg)) ∧ (∃win ws urp mss PSH discard . 
win = w2n win ∧ ws = option map ord ws ∧ urp = w2n urp ∧ mss = option map w2n mss ∧ seg =〈[ is1 := ↑ i2; is2 := ↑ i1; ps1 := ↑ p2; ps2 := ↑ p1; Rule version: $Id: TCP1 hostLTSScript.sml,v 1.961 2005/03/18 10:34:36 kw217 Exp $ deliver in 2 286 seq := tcp seq flip sense(seq : tcp seq foreign); ack := tcp seq flip sense(ack : tcp seq local); URG :=URG ; ACK :=ACK ; PSH :=PSH discard ; RST :=F; SYN :=T; FIN :=FIN ; win :=win ; ws :=ws ; urp := urp ; mss :=mss ; ts := ts; data := data ]〉) ∧ (* Note that there does not exist a better socket match to which the segment should be sent, as the whole quad is matched exactly *) (* The ACK must be acceptable, else send RST. Typically (no data on active open), this is the same as ack = iss+1 *) (ACK =⇒ (cb.iss < ack ∧ ack ≤ cb.snd max )) ∧ (* resolve negotiated window scaling *) (case (cb.request r scale,ws) of (↑ rs, ↑ ss)→ rcv scale ′ = rs ∧ snd scale ′ = ss ∧ tf doing ws ′ = T ‖ 15432 → rcv scale ′ = 0 ∧ snd scale ′ = 0 ∧ tf doing ws ′ = F) ∧ (* resolve negotiated timestamping *) tf rcvd tstmp′ = is some ts ∧ tf doing tstmp′ = (tf rcvd tstmp′ ∧ cb.tf req tstmp) ∧ (* Note that for test generation at present we clear the route metric cache so this will always be NONE. BSD reads from the routing cache if there is an entry, otherwise passes NONE here. *) bw delay product for rt = ∗ ∧ let ourmss = (case cb.t advmss of ∗ → cb.t maxseg (* we did not advertise an MSS, so use the default value *) ‖ ↑ v → v) in ((rcvbufsize ′, sndbufsize ′, t maxseg ′′, snd cwnd ′) = if mss 6= ∗ ∨ ¬bsd arch h.arch then calculate buf sizes ourmss mss bw delay product for rt (is localnet h.ifds i2)(sf .n(SO RCVBUF)) (sf .n(SO SNDBUF))tf doing tstmp′ h.arch else (* Note that since tcp_mss() is not called snd_cwnd remains at its initial (stupidly high) value. 
*) (sf .n(SO RCVBUF), sf .n(SO SNDBUF), cb.t maxseg , cb.snd cwnd) ) ∧ sf ′ = sf 〈[ n := funupd list sf .n[(SO RCVBUF, rcvbufsize ′); (SO SNDBUF, sndbufsize ′)]]〉 ∧ Rule version: $Id: TCP1 hostLTSScript.sml,v 1.961 2005/03/18 10:34:36 kw217 Exp $ deliver in 2 287 rcv window = calculate bsd rcv wnd sf ′ tcp sock ∧ let (t softerror ′, t rttseg ′, t rttinf ′, tt rexmt ′) = (if ACK then (* completion of active open. Conditions originally copied verbatim from deliver in 3 . *) (* update RTT estimators from timestamp or roundtrip time *) let emission time = case ts of ↑(ts val , ts ecr)→ ↑(ts ecr − 1) ‖ ∗ → (case cb.t rttseg of ↑(ts0, seq0)→ if ack > seq0 then ↑ ts0 else ∗ ‖ ∗ → ∗) in (* clear soft error, cancel timer, and update estimators if we successfully timed a segment round-trip *) let (t softerror ′, t rttseg ′, t rttinf ′) = if is some emission time then (∗, ∗, update rtt(real of int(ticks of h.ticks − the emission time)/HZ) cb.t rttinf ) else (cb.t softerror , cb.t rttseg , cb.t rttinf ) in (* mess with retransmit timer if appropriate *) let tt rexmt ′ = (if ack = cb.snd max then (* if acked everything, stop *) ∗ (* needoutput = 1 – see below *) else if mode of cb.tt rexmt = ↑ RexmtSyn then (* if partial ack, restart from current backoff value, which is always zero because of the above updates to the RTT estimators and shift value. *) start tt rexmtsyn h.arch 0 T t rttinf ′ else if mode of cb.tt rexmt ∈ {∗; ↑ Rexmt} then (* ditto *) start tt rexmt h.arch 0 T t rttinf ′ else if emission time 6= ∗ then case cb.tt rexmt of (* bizarre but true. tcp_input.c:1766 says c.f. 
Phil Karn’s retransmit algorithm *) ∗ → ∗ ‖ ↑(((mode, shift))d)→ ↑(((mode, 0))d) else (* do nothing *) cb.tt rexmt ) in (t softerror ′, t rttseg ′, t rttinf ′, tt rexmt ′) else (* simultaneous open *) (cb.t softerror , Rule version: $Id: TCP1 hostLTSScript.sml,v 1.961 2005/03/18 10:34:36 kw217 Exp $ deliver in 2 288 cb.t rttseg , cb.t rttinf , start tt rexmt h.arch 0 T cb.t rttinf ) (* reset rexmt timer *) ) in (* urgent pointer processing. See deliver in 3 for discussion (these conditions are originally copied verbatim from there). *) (∃iobc rcvurp. iobc = NO OOBDATA∧ (* we know the initial state has no OOB data *) rcvurp = ∗ ∧ (if URG ∧ urp > 0 ∧ urp + 0 ≤ SB MAX then (if seq + urp > cb.rcv up then rcv up′ = seq + 1 + urp ∧ rcvurp′ = ↑(0 + num(seq + urp − cb.rcv nxt)) else rcv up′ = cb.rcv nxt∧ (* pull along with window *) rcvurp′ = rcvurp) ∧ (if urp ≤ length data ∧ sf .b(SO OOBINLINE) = F then iobc′ = OOBDATA(EL(urp − 1)data) ∧ data deoobed = (TAKE(urp − 1)data) @ (DROP urp data) else iobc′ = (if seq + urp > cb.rcv up then NO OOBDATA else iobc) ∧ data deoobed = data) else rcv up′ = seq + 1 ∧ rcvurp′ = rcvurp ∧ iobc′ = iobc ∧ data deoobed = data) ) ∧ (* data processing is much simpler here than in deliver in 3 because we know we will only ever receive the one SYN ,ACK datagram (duplicates will be rejected, and there’s only one datagram and so cannot be reordered). 
*) data ′ = TAKE rcv window data deoobed ∧ FIN ′ = (if data ′ = data deoobed then FIN else F) ∧ rcvq ′ = data ′∧ (* because rcvq is empty initially *) rcv nxt ′ = seq + 1 + length data ′ + (if FIN ′ then 1 else 0) ∧ rcv wnd ′ = rcv window − length data ′ ∧ cb′ = cb 〈[ tt rexmt := tt rexmt ′; (* not persist, because we do not have any data to send *) t idletime := stopwatch zero; (* just received a segment *) tt keep := ↑((())slow timer TCPTV KEEP IDLE); tt conn est := ∗; tt delack := ∗; snd una :=ˆ ack onlywhen ACK ; (* = cb.iss + 1, or +2 if full ack of SYN,FIN *) snd nxt :=ˆ ack onlywhen(ACK ∧ cantsndmore); (* prepare for possible outbound FIN *) snd max :=ˆ ack onlywhen(ACK ∧ cantsndmore ∧ ack > cb.snd max ); (* we doubt snd max can ever increase here, but put this in for safety *) Rule version: $Id: TCP1 hostLTSScript.sml,v 1.961 2005/03/18 10:34:36 kw217 Exp $ deliver in 2 289 snd wl1 := if ACK then seq + 1 else seq ; (* must update window. c.f. TCPv2p951, TCPv2p981f, and tcp_input.c:1824 *) snd wl2 :=ˆ ack onlywhen ACK ; snd wnd :=win snd scale ′; snd cwnd := if ACK ∧ ack > cb.iss + 1 then (* BSD clamps snd_cwnd to the maximum window size (65535), but only if we received an ack for data other than the initial SYN. 
See tcp_input.c::1791 *) min(snd cwnd ′)(TCP MAXWIN snd scale ′) else snd cwnd ′; rcv scale := rcv scale ′; snd scale := snd scale ′; tf doing ws := tf doing ws ′; irs := seq ; rcv nxt := rcv nxt ′; rcv wnd := rcv wnd ′; tf rxwin0sent :=(rcv wnd ′ = 0); rcv adv := rcv nxt ′ + (rcv wnd ′ rcv scale ′) rcv scale ′; rcv up := rcv up′; t maxseg := t maxseg ′′; ts recent := case ts of (* record irrespective of whether we negotiated to do this or not, like BSD *) ∗ → cb.ts recent ‖ ↑(ts val , ts ecr)→ (ts val)TimeWindowkern timer dtsinval ; (* timestamp will become invalid in 24 days *) last ack sent := rcv nxt ′; t softerror := t softerror ′; t rttseg := t rttseg ′; t rttinf := t rttinf ′; tf req tstmp := tf doing tstmp′; tf doing tstmp := tf doing tstmp′ ]〉 ∧ (* now generate seg ′, unless we’re delaying the ACK *) (choose seg ′ :: (if ACK then (* completion of active open *) make ack segment cb′(cantsndmore ∧ ack < cb.iss + 2)(i1, i2, p1, p2)(ticks of h.ticks) else (* simultaneous open *) let cb′′′ = (if ((linux arch h.arch) ∧ cb.tf req tstmp) then cb′ 〈[ tf req tstmp :=T; tf doing tstmp :=T]〉 else cb′) in (if bsd arch h.arch then make ack segment cb′′′ F(i1, i2, p1, p2)(ticks of h.ticks) else make syn ack segment cb′′′(i1, i2, p1, p2)(ticks of h.ticks))). (* Add the segment to the host’s output queue. See enqueue or fail (p118). *) Rule version: $Id: TCP1 hostLTSScript.sml,v 1.961 2005/03/18 10:34:36 kw217 Exp $ deliver in 2a 290 enqueue or fail T h.arch h.rttab h.ifds[TCP seg ′]oq (cb 〈[ t rttinf := cb′.t rttinf ; t maxseg := t maxseg ′′; snd nxt := cb.snd nxt ; tt delack := cb.tt delack ; last ack sent := cb.last ack sent ; rcv adv := cb.rcv adv ]〉)cb′(cb′′, oq ′) ) ∧ (* Note that we change state even if enqueuing or routing returned an error, trusting to retransmit to solve our problem. 
*) (if ACK then (* completion of active open *) (if ¬FIN ′ then (cantrcvmore ′ = cantrcvmore ∧ st ′ = (if cantsndmore = F then ESTABLISHED else if cb.snd max > cb.iss + 1 ∧ ack ≥ cb.snd max then (* our FIN is ACK ed *) FIN WAIT 2 else FIN WAIT 1)) (* we were trying to send a FIN from SYN SENT, so move straight to FIN WAIT 2. Definitely the case with BSD; should also be true for other archs. *) else (cantrcvmore ′ = T ∧ st ′ = (if cantsndmore = F then CLOSE WAIT else LAST ACK))) (* we were trying to send a FIN from SYN SENT and also receive a FIN, so we move straight into LAST ACK. *) else (* simultaneous open *) (if ¬FIN ′ then (st ′ = SYN RECEIVED ∧ cantrcvmore ′ = cantrcvmore) else (st ′ = CLOSE WAIT∧ (* yes, really! (in BSD) even though we’ve not yet had our initial SYN acknowl- edged! See tcp_input.c:2065 +/-2000 *) cantrcvmore ′ = T)) ) deliver in 2a tcp: network nonurgent Receive bad or boring datagram and RST or ignore for SYN SENT socket h 〈[socks := socks ⊕ [(sid , sock)]; iq := iq ; oq := oq ; bndlm := bndlm]〉 τ−→ h 〈[socks := socks ⊕ [(sid , sock ′)]; iq := iq ′; oq := oq ′; bndlm := bndlm ′]〉 (* Summary: For a SYN SENT socket unacceptable acks get RSTed; boring but otherwise OK segments are ig- nored. *) Rule version: $Id: TCP1 hostLTSScript.sml,v 1.961 2005/03/18 10:34:36 kw217 Exp $ deliver in 3 291 sock = Sock(↑ fid , sf , ↑ i1, ↑ p1, ↑ i2, ↑ p2, es, cantsndmore, cantrcvmore, TCP Sock(SYN SENT, cb, ∗, sndq , sndurp, rcvq , rcvurp, iobc)) ∧ (* Take TCP segment seg from the head of the host’s input queue *) dequeue iq(iq , iq ′, ↑(TCP seg)) ∧ (∃seq discard URG discard PSH discard FIN discard win discard ws discard urp discard mss discard ts discard data discard . 
seg =〈[ is1 := ↑ i2; is2 := ↑ i1; ps1 := ↑ p2; ps2 := ↑ p1; seq := tcp seq flip sense(seq discard : tcp seq foreign); ack := tcp seq flip sense(ack : tcp seq local); URG :=URG discard ; ACK :=ACK ; PSH :=PSH discard ; RST :=F; SYN :=SYN ; FIN :=FIN discard ; win :=win discard ; ws :=ws discard ; urp := urp discard ; mss :=mss discard ; ts := ts discard ; data := data discard ]〉 ) ∧ (* Note that there does not exist a better socket match to which the segment should be sent, as the whole quad is matched exactly. *) ((ACK ∧ ¬(cb.iss < ack ∧ ack ≤ cb.snd max )) ∨ (¬SYN ∧ (¬ACK ∨ (ACK ∧ cb.iss < ack ∧ ack ≤ cb.snd max )))) ∧ (if ACK ∧ ¬(cb.iss < ack ∧ ack ≤ cb.snd max ) then dropwithreset seg h.ifds(ticks of h.ticks)BANDLIM UNLIMITED bndlm bndlm ′ outsegs else if ¬SYN ∧ (¬ACK ∨ (ACK ∧ cb.iss < ack ∧ ack ≤ cb.snd max )) then outsegs = [ ] ∧ bndlm ′ = bndlm else F) ∧ let tcp sock = tcp sock of sock in (* BSD rcv_wnd bug: the receive window updated code in tcp_input gets executed before the segment is processed, so even for bad segments, it gets updated. *) let rcv window = calculate bsd rcv wnd sf tcp sock in sock ′ = sock 〈[ pr :=TCP PROTO(tcp sock 〈[ cb := tcp sock .cb 〈[ rcv wnd := if bsd arch h.arch then rcv window else tcp sock .cb.rcv wnd ; rcv adv := if bsd arch h.arch then tcp sock .cb.rcv nxt + rcv window else tcp sock .cb.rcv adv ; t idletime := stopwatch zero; tt keep := ↑((())slow timer TCPTV KEEP IDLE) ]〉]〉)]〉 ∧ enqueue and ignore fail h.arch h.rttab h.ifds outsegs oq oq ′ Rule version: $Id: TCP1 hostLTSScript.sml,v 1.961 2005/03/18 10:34:36 kw217 Exp $ deliver in 3 292 deliver in 3 tcp: network nonurgent Receive data, FINs, and ACKs in a connected state h 〈[socks := socks ⊕ [(sid , sock)]; iq := iq ; oq := oq ; bndlm := bndlm]〉 τ−→ h 〈[socks := socks ′; iq := iq ′; oq := oq ′; bndlm := bndlm ′]〉 sid /∈ (dom(socks)) ∧ sock .pr = TCP PROTO(tcp sock) ∧ (* Assert that the socket meets some sanity properties. 
This is logically superfluous but aids semi-automatic model checking. See sane socket (p84) for further details. *) sane socket sock ∧ (* Take TCP segment seg from the head of the host’s input queue *) dequeue iq(iq , iq ′, ↑(TCP seg)) ∧ (* The segment must be of an acceptable form *) (* Note: some segment fields (namely TCP options ws and mss), are only used during connection establishment and any values assigned to them in segments during a connection are simply ignored. They are equal to the identifiers ws discard and mss discard respectively, which are otherwise unconstrained. *) (∃win urp ws discard mss discard . seg =〈[ is1 := ↑ i2; is2 := ↑ i1; ps1 := ↑ p2; ps2 := ↑ p1; seq := tcp seq flip sense(seq : tcp seq foreign); ack := tcp seq flip sense(ack : tcp seq local); URG :=URG ; (* Urgent/OOB data is processed by this rule *) ACK :=ACK ; (* Acknowledgements are processed *) PSH :=PSH ; (* Push flag maybe set on an incoming data segment *) RST :=F; (* RST segments are not handled by this rule *) SYN :=SYN ; (* SYN flag set may be set in the final segment of a simultaneous open *) FIN :=FIN ; (* Processing of FIN flag handled *) win :=win ; ws :=ws discard ; urp := urp ; mss :=mss discard ; ts := ts; data := data (* Segment may have data *) ]〉 ∧ (* Equality of some type casts, and application of the socket’s send window scaling to the received window advertis- ment *) win = w2n win tcp sock .cb.snd scale ∧ urp = w2n urp ) ∧ (* The socket is fully connected so its complete address quad must match the address quad of the segment seg . By definition, sock is the socket with the best address match thus the auxiliary function tcp socket best match is not required here. *) sock .is1 = ↑ i1 ∧ sock .ps1 = ↑ p1 ∧ sock .is2 = ↑ i2 ∧ sock .ps2 = ↑ p2 ∧ (* The socket must be in a connected state, or is in the SYN RECEIVED state and seg is the final segment completing a passive or simultaneous open. 
*)
tcp sock .st ∉ {CLOSED;LISTEN;SYN SENT} ∧
tcp sock .st ∈ {SYN RECEIVED;ESTABLISHED;CLOSE WAIT;FIN WAIT 1;FIN WAIT 2; CLOSING;LAST ACK;TIME WAIT} ∧
(* For a socket in the SYN RECEIVED state check that the ACK is valid (the acknowledgement value ack is not outside the range of sequence numbers that have been transmitted to the remote socket) and that the segment is not a LAND DoS attack (the segment’s sequence number is not smaller than the remote socket’s (the receiver from this socket’s perspective) initial sequence number) *)
¬(tcp sock .st = SYN RECEIVED ∧ ((ACK ∧ (ack ≤ tcp sock .cb.snd una ∨ ack > tcp sock .cb.snd max )) ∨ seq < tcp sock .cb.irs)) ∧
(* If socket sock has previously emitted a FIN segment, check that a thread is still associated with the socket, i.e., check that the socket still has a valid file identifier fid ≠ ∗. If not, and the segment contains new data, the segment should not be processed by this rule as there is no thread to read the data from the socket after processing. Query: how does this st condition relate to wesentafin below? *)
¬(tcp sock .st ∈ {FIN WAIT 1;CLOSING;LAST ACK;FIN WAIT 2;TIME WAIT} ∧ sock .fid = ∗ ∧ seq + length data > tcp sock .cb.rcv nxt) ∧
(* A SYN should be received only in the SYN RECEIVED state. *)
(SYN ⟹ tcp sock .st = SYN RECEIVED) ∧
(* Socket sock has previously sent a FIN segment iff snd max is strictly greater than the sequence number of the byte after the last byte in the send queue sndq . *)
let wesentafin = tcp sock .cb.snd max > tcp sock .cb.snd una + length tcp sock .sndq in
(* If the socket sock has previously sent a FIN segment, it has been acknowledged by segment seg if the segment has the ACK flag set and an acknowledgment number ack ≥ cb.snd max .
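An illustrative aside, not part of the specification: the definition of wesentafin above and the definition of ourfinisacked that this comment describes can be sketched in Python. This is a minimal sketch that assumes sequence numbers are plain integers (32-bit wraparound is ignored); the function and parameter names are hypothetical and merely mirror the specification's identifiers.

```python
def we_sent_a_fin(snd_max: int, snd_una: int, sndq_len: int) -> bool:
    # A FIN occupies one unit of sequence space, so once it has been sent
    # snd_max lies strictly beyond the byte after the last queued byte.
    return snd_max > snd_una + sndq_len


def our_fin_is_acked(snd_max: int, snd_una: int, sndq_len: int,
                     ack_flag: bool, ack: int) -> bool:
    # The FIN is acknowledged when one was sent, the segment carries the
    # ACK flag, and its acknowledgement number reaches snd_max.
    return we_sent_a_fin(snd_max, snd_una, sndq_len) and ack_flag and ack >= snd_max
```

For example, with snd una = 1000 and ten bytes queued, a FIN has been sent only once snd max reaches 1011.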
*) let ourfinisacked = (wesentafin ∧ACK ∧ ack ≥ tcp sock .cb.snd max ) in (* Process the segment and return an updated socket state *) (* The segment processing is performed by the four relations below, i.e., di3 topstuff, di3 ackstuff, di3 datastuff and di3 ststuff. Each of these relates a socket and bandwidth limiter state before the segment is processed to a tuple containing an updated socket, new bandwidth limiter state, a list of zero or more segments to output and a continue flag. The aim is to model the progression of the segment through tcp_input(). When the continue flag is T segment processing should continue. The infix function andThen applies the function on its left hand side and only continues with the function on its right hand side if the left hand function’s continue flag is T. For a further explanation of this relational monad behaviour see aux relmonad (p??). *) let topstuff = (* Initial processing of the segment: PAWS (protection against wrap sequence numbers); ensure segment is not entirely off the right hand edge of the window; timer updates, etc. For further information see di3 topstuff (p294).*) di3 topstuff seg h.arch h.rttab h.ifds(ticks of h.ticks) and ackstuff = (* Process the segment’s acknowledgement number and do congestion control. See di3 ackstuff (p298).*) di3 ackstuff tcp sock seg ourfinisacked h.arch h.rttab h.ifds(ticks of h.ticks) and datastuff theststuff = (* Extract and reassemble data (including urgent data). See di3 datastuff (p304). *) di3 datastuff theststuff tcp sock seg ourfinisacked h.arch and ststuff FIN reass = (* Possibly change the socket’s state (especially on receipt of a valid FIN ). See di3 ststuff (p305). 
*) di3 ststuff FIN reass ourfinisacked ack in (topstuff andThen ackstuff andThen datastuff ststuff ) (sock , bndlm) (* state before *) ((sock ′, bndlm ′, outsegs), continue ′)∧ (* state after *) sock ′.pr = TCP PROTO(tcp sock ′) ∧ Rule version: $Id: TCP1 hostLTSScript.sml,v 1.961 2005/03/18 10:34:36 kw217 Exp $ di3 topstuff 294 (* If socket sock was initially in the SYN RECEIVED state and after processing seg is in the ESTABLISHED state (or if the segment contained a FIN and the socket is in one of the FIN WAIT 1, FIN WAIT 2 or CLOSE WAIT states), the socket is probably on some other socket’s incomplete connections queue and seg is the final segment in a passive open. If it is on some other socket’s incomplete connections queue the other socket is updated to move the newly connected socket’s reference from the incomplete to the complete connections queue (unless the complete connection queue is full, in which case the new connection is dropped and all references to it are removed). If not, seg is the final segment in a simultaneous open in which case no other sockets are updated. The auxiliary function di3 socks update (p308) does all the hard work, updating the relevant sockets in the finite map socks to yield socks ′. *) (if tcp sock .st = SYN RECEIVED ∧ tcp sock ′.st ∈ {ESTABLISHED;FIN WAIT 1;FIN WAIT 2;CLOSE WAIT} then di3 socks update sid(socks ⊕ (sid , sock ′))socks ′ else (* If the socket was not initially in the SYN RECEIVED state, i.e.seg was processed by an already connected socket, ensure the updated socket is in the final finite maps of sockets. *) socks ′ = socks ⊕ (sid , sock ′)) ∧ (* Queue any segments for output on the host’s output queue. In the common case there are no segments to be output as output is handled by deliver out 1 etc. The exception is that di3 ackstuff (and its auxiliaries) require an immediate ACK segment to be emitted under certain congestion control conditions. 
See di3 ackstuff (p298) and di3 newackstuff (p295) for further details. *) enqueue oq list qinfo(oq , outsegs, oq ′) – deliver in 3 initial checks : di3 topstuff seg arch rttab ifds ticks = (* monadic state accessor: sock is the socket processing the segment, as determined by deliver in 3 *) (get sockλsock . (* Pull out the TCP protocol and control blocks *) let tcp sock = tcp sock of sock in let cb = tcp sock .cb in (* If the segment has the SYN flag set, increment the sequence number so that it is the sequence number of the first byte of data in the segment *) let seq = tcp seq flip sense seg .seq + (if seg .SYN then 1 else 0) in (* The sequence number of the byte logically after the last byte of data in the segment *) let rseq = seq + length seg .data in let ts = seg .ts in (* PAWS (Protection Against Wrapped Sequence numbers) check: If the segment contains a timestamp value that is strictly less than ts recent then the segment is invalid and the PAWS check fails. The value ts recent is the timestamp value of the most recent of the previous segments that was successfully processed, i.e., the last segment that deliver in 3 processed without dropping. *) let paws failed = (∃ts val ts ecr ts recent . ts = ↑(ts val , ts ecr)∧ (* segment’s timestamp field is a pair *) timewindow val of cb.ts recent = ↑ ts recent∧ (* most recent timestamp recorded *) ts val < ts recent) in (* check the segment’s timestamp is not old *) (* If the segment lies entirely off the right-hand edge of sock ’s receive window then it should be dropped, provided it is not a window probe. 
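An illustrative aside, not part of the specification: the PAWS check above and the right-hand-edge check defined next can be sketched in Python. This is a minimal sketch that assumes plain integer sequence numbers and timestamps (no wraparound), represents absent optional values as None, and uses hypothetical function names.

```python
from typing import Optional, Tuple


def paws_failed(seg_ts: Optional[Tuple[int, int]], ts_recent: Optional[int]) -> bool:
    # Fails only when both the segment timestamp and a recorded ts_recent
    # exist and the segment's ts_val is strictly older.
    if seg_ts is None or ts_recent is None:
        return False
    ts_val, _ts_ecr = seg_ts
    return ts_val < ts_recent


def segment_off_right_hand_edge(seq: int, data_len: int,
                                rcv_nxt: int, rcv_wnd: int) -> bool:
    edge = rcv_nxt + rcv_wnd   # right-hand edge of the receive window
    rseq = seq + data_len      # byte logically after the segment's data
    # Starts on or after the edge, ends after it, and is not a window probe.
    return seq >= edge and rseq > edge and rcv_wnd != 0


def drop_it(seg_ts, ts_recent, seq, data_len, rcv_nxt, rcv_wnd) -> bool:
    return (paws_failed(seg_ts, ts_recent)
            or segment_off_right_hand_edge(seq, data_len, rcv_nxt, rcv_wnd))
```

Note that a one-byte segment arriving at a zero window is not classified as off the right-hand edge, which is what lets window probes through.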
*)
let segment off right hand edge =
(let rcv wnd ′ = calculate bsd rcv wnd sock .sf tcp sock in (* size of receive window *)
(seq ≥ cb.rcv nxt + rcv wnd ′)∧ (* segment starts on or after the right hand edge *)
(rseq > cb.rcv nxt + rcv wnd ′)∧ (* segment ends after the right hand edge *)
(rcv wnd ′ ≠ 0)) in (* The segment is not a window probe, i.e., rcv wnd ′ is not zero *)
(* Drop the segment being processed if either the PAWS check or the “off right hand edge of window” check fails *)
let drop it = (paws failed ∨ segment off right hand edge) in
(* The value ts recent will be updated to hold the value of the segment’s timestamp field if the segment is not dropped. Timestamps are invalidated after 24 days; this is ensured by the attached kernel timer kern timer dtsinval. *)
let ts recent ′ = (fst(the ts))TimeWindowkern timer dtsinval in
(* Reset the socket’s idle timer and keepalive timer to start counting from zero as activity is taking place on the socket: a segment is being processed. If the FIN WAIT 2 timer is enabled this may be reset upon processing this segment. See update idle (p119) for further details. *)
let (t idletime ′, tt keep′, tt fin wait 2 ′) = update idle tcp sock in
(* Using the monadic state accessor modify cb (p??), update the socket’s control block with the new timer values and the most recent timestamp seen. The ts recent field is only updated if the segment currently being processed is not scheduled to be dropped, has a timestamp value set, and has a first byte of data whose sequence number is less than or equal to the last acknowledgement number sent in a segment to the remote end.
The last condition (when coupled with the PAWS check above) ensures that ts recent only increases monotonically and is only updated by either a duplicate segment with a newer timestamp, or the next in-order segment expected by the receiving socket with a newer timestamp. It would be incorrect to record the newer timestamps of out-of-order segments because they would fail the PAWS check and get dropped. Note: if a reasonably continuous stream of segments is being received with increasing timestamp values and few data segments are sent in return such that acknowledgments are delayed (i.e., every other segment is acknowledged), then only the timestamp from every other segment is recorded by these conditions. This is still sufficient to protect against wrapped sequence numbers. *)
modify cb(λcb′.cb′ 〈[ tt keep := tt keep′; tt fin wait 2 := tt fin wait 2 ′; t idletime := t idletime ′; ts recent :=ˆ ts recent ′ onlywhen (¬drop it ∧ is some ts ∧ seq ≤ cb.last ack sent) ]〉) andThen
if drop it then
(* Decided to drop the segment. mlift dropafterack or fail (p120) may decide to RST the connection depending upon the socket state. If so, the RST segment is retained on the monadic output segment list returned to deliver in 3 for queueing. *)
mlift dropafterack or fail seg arch rttab ifds ticks andThen
(* After dropping, stop processing the segment. There is no need to process it any further. *)
stop
else
(* Otherwise the segment is valid, so allow processing to continue. *)
cont )

– deliver in 3 new ack processing, used in di3 ackstuff :
di3 newackstuff tcp sock 0 seg ourfinisacked arch rttab ifds ticks =
(* Pull some fields out of the segment *)
let ack = tcp seq flip sense seg .ack in
let ts = seg .ts in
(* Get the socket’s control block using the monadic state accessor get cb. *)
(get cbλcb′.
(if ¬TCP DO NEWRENO∨cb′.t dupacks < 3 then (* If not doing NewReno-style Fast Retransmit or there have been fewer than 3 duplicate ACKS then clear the duplicate ACK counter. If there were more than 3 duplicate ACKS previously then the congestion window was inflated as per RFC2581 so retract it to snd ssthresh *) modify cb(λcb′.cb′ 〈[ t dupacks := 0; snd cwnd :=ˆ(min cb′.snd cwnd cb′.snd ssthresh) (* retract the window safely *) Rule version: $Id: TCP1 hostLTSScript.sml,v 1.961 2005/03/18 10:34:36 kw217 Exp $ di3 newackstuff 296 onlywhen(cb′.t dupacks ≥ 3)]〉) else if TCP DO NEWRENO∧cb′.t dupacks ≥ 3 ∧ ack < cb′.snd recover then (* The host supports NewReno-style Fast Recovery, the socket has received at least three duplicate ACK s previ- ously and the new ACK does not complete the recovery process, i.e., there are further losses or network delays. The new ACK is a partial ACK per RFC2582. Perform a retransmit of the next unacknowledged segment and deflate the congestion window as per the RFC. *) modify cb(λcb′.cb′ 〈[ (* Clear the retransmit timer and round-trip time measurement timer. These will be started by tcp output really when the retransmit is actioned. 
*) tt rexmt := ∗; t rttseg := ∗; (* Segment to retransmit starts here *) snd nxt := ack ; (* Allow one segment to be emitted *) snd cwnd := cb′.t maxseg ]〉) andThen (* Attempt to create a segment for output using the modified control block (this is a relational monad idiom) *) mlift tcp output perhaps or fail ticks arch rttab ifds andThen (* Finally update the control block: *) modify cb(λcb′.cb′ 〈[ (* RFC2582 partial window deflation: deflate the congestion window by the amount of data freshly acknowledged and add back one maximum segment size *) snd cwnd :=num(int of num cb′.snd cwnd − (ack − cb′.snd una) + int of num cb′.t maxseg); snd nxt := cb′.snd nxt ]〉) (* restore previous value *) else if TCP DO NEWRENO∧cb′.t dupacks ≥ 3 ∧ ack ≥ cb′.snd recover then (* The host supports NewReno-style Fast Recovery, the socket has received at least three duplicate ACK segments and the new ACK acknowledges at least everything upto snd recover , completing the recovery process. *) modify cb(λcb′.cb′ 〈[ t dupacks := 0; (* clear the duplicate ACK counter *) (* Open up the congestion window, being careful to avoid an RFC2582 Ch3.5 Pg6 ”burst of data”. *) snd cwnd :=( if cb′.snd max − ack < int of num cb′.snd ssthresh then (* If snd ssthresh is greater than the number of bytes of data still unacknowledged and presumed to be in-flight, set snd cwnd to be one segment larger than the total size of all the segments in flight. This is burst avoidance: tcp output is only able to send upto one further segment until some of the in flight data is acknowledged. 
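An illustrative aside, not part of the specification: the two congestion-window updates in this definition (partial-ACK deflation and the burst-avoiding exit from recovery) can be sketched in Python. This is a minimal sketch that assumes plain integer sequence numbers without wraparound and hypothetical function names; it does not model the num/int conversions of the specification.

```python
def cwnd_after_partial_ack(snd_cwnd: int, ack: int, snd_una: int, t_maxseg: int) -> int:
    # RFC 2582 partial window deflation: remove the freshly acknowledged
    # bytes and add back one maximum segment size.
    return snd_cwnd - (ack - snd_una) + t_maxseg


def cwnd_after_recovery(snd_max: int, ack: int, snd_ssthresh: int, t_maxseg: int) -> int:
    in_flight = snd_max - ack  # bytes still unacknowledged
    if in_flight < snd_ssthresh:
        # Burst avoidance: allow at most one further segment beyond
        # what is already in flight.
        return in_flight + t_maxseg
    # Otherwise fall back to snd_ssthresh.
    return snd_ssthresh
```

For instance, with 1000 bytes in flight, snd ssthresh = 4380 and t maxseg = 1460, the window on leaving recovery is 2460 rather than 4380.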
*)
num(cb′.snd max − ack + int of num cb′.t maxseg)
else
(* Otherwise, set snd cwnd to be snd ssthresh, forbidding any further segment output until some in flight data is acknowledged. *)
cb′.snd ssthresh) ]〉)
else assert failure“di3 newackstuff” (* impossible *)
) andThen
(* Check ack value is sensible, i.e., not greater than the highest sequence number transmitted so far *)
if ack > cb′.snd max then
(* Drop the segment and possibly emit a RST segment *)
mlift dropafterack or fail seg arch rttab ifds ticks andThen stop
else (* continue processing *)
(* If the retransmit timer is set and the socket has done only one retransmit and it is still within the bad retransmit timer window, then because this is an ACK of new data the retransmission was done in error. Flag this so that the control block can be recovered from retransmission mode. This is known as a “bad retransmit”. *)
let revert rexmt = (mode of cb′.tt rexmt ∈ {↑ Rexmt; ↑ RexmtSyn} ∧ shift of cb′.tt rexmt = 1 ∧ timewindow open cb′.t badrxtwin) in
(* Attempt to calculate a new round-trip time estimate *)
let emission time = case (ts, cb′.t rttseg) of
(↑(ts val , ts ecr), _)→ (* By using the segment’s timestamp if it has one *) ↑(ts ecr − 1)
‖ (∗, ↑(ts0, seq0))→ (* Or if not, by the control block’s round-trip timer, if it covers the segment(s) being acknowledged *) if ack > seq0 then ↑ ts0 else ∗
‖ (∗, ∗)→ (* Otherwise, it is not possible to calculate a round-trip update *) ∗ in
(* If a new round-trip time estimate was calculated above, update the round-trip information held by the socket’s control block *)
let t rttinf ′ = case emission time of
↑ t rttinf → update rtt(real of int(ticks − the emission time)/HZ) cb′.t rttinf
‖ ∗ → cb′.t rttinf in
(* Update the retransmit timer *)
let tt rexmt ′ = (if ack = cb′.snd max then ∗ (* If all sent data has been acknowledged, disable the timer *)
else case mode of cb′.tt rexmt of
∗ → (* If not set, set it as there is still unacknowledged data *) start tt rexmt arch 0 T t rttinf ′
‖ ↑ Rexmt→ (* If set, reset it as a new acknowledgement segment has arrived *) start tt rexmt arch 0 T t rttinf ′
‖ _ → (* Otherwise, leave it alone. The timer will never be in RexmtSyn here and the only other case is Persist, in which case it should be left alone until such time as a window update is received *) cb′.tt rexmt ) in
(* Update the send queue and window *)
let (snd wnd ′, sndq ′) = (if ourfinisacked then
(* If this socket has previously emitted a FIN segment and the FIN has now been ACK ed, decrease snd wnd by the length of the send queue and clear the send queue. *)
(cb′.snd wnd − length tcp sock 0 .sndq , [ ])
else
(* Otherwise, reduce the send window by the amount of data acknowledged as it is now consuming space on the receiver’s receive queue. Remove the acknowledged bytes from the send queue as they will never need to be retransmitted. *)
(cb′.snd wnd − num(ack − tcp sock 0 .cb.snd una), DROP(num(ack − tcp sock 0 .cb.snd una))tcp sock 0 .sndq) ) in
(* Update the control block *)
modify cb(λcb.cb 〈[
(* If revert rexmt (above) flags that a bad retransmission occurred, undo the congestion avoidance changes *)
snd cwnd :=ˆ cb.snd cwnd prev onlywhen revert rexmt ;
snd ssthresh :=ˆ cb.snd ssthresh prev onlywhen revert rexmt ;
snd nxt :=ˆ cb′.snd max onlywhen revert rexmt ;
t badrxtwin :=ˆ TimeWindowClosed onlywhen revert rexmt ]〉) andThen
modify cb(λcb.cb 〈[
(* Update the round-trip time estimates and retransmit timer *)
t rttinf := t rttinf ′; tt rexmt := tt rexmt ′;
(* If the ACK segment allowed us to successfully time a segment (and update the round-trip time estimates) then clear the soft error flag and clear the segment round-trip timer in order that it can be used on a future segment.
*)
t softerror :=ˆ ∗ onlywhen is some emission time;
t rttseg :=ˆ ∗ onlywhen is some emission time;
(* Update the congestion window by the algorithm in expand cwnd (p99) only when not performing NewReno retransmission or the duplicate ACK counter is zero, i.e., expand the congestion window when this ACK is not a NewReno-style partial ACK and hence the connection has recovered *)
snd cwnd :=ˆ expand cwnd cb.snd ssthresh tcp sock 0 .cb.t maxseg (TCP MAXWIN tcp sock 0 .cb.snd scale)cb.snd cwnd onlywhen(¬TCP DO NEWRENO∨cb′.t dupacks = 0);
snd wnd := snd wnd ′; (* The updated send window *)
snd una := ack ; (* Have had up to ack acknowledged *)
snd nxt :=max ack cb.snd nxt ; (* Ensure invariant snd nxt ≥ snd una *)
(* Reset the 2MSL timer if in the TIME WAIT state, as a valid ACK segment has been received for the waiting socket *)
tt 2msl :=ˆ ↑((())slow timer(2∗TCPTV MSL)) onlywhen(tcp sock 0 .st = TIME WAIT) ]〉) andThen
modify tcp sock(λs.s 〈[ sndq := sndq ′]〉) andThen (* The send queue update *)
(if tcp sock 0 .st = LAST ACK ∧ ourfinisacked then
(* If the socket’s FIN has been acknowledged and the socket is in the LAST ACK state, close the socket and stop processing this segment *)
modify sock(tcp close arch) andThen stop
else if tcp sock 0 .st = TIME WAIT ∧ ack > tcp sock 0 .cb.snd una (* data acked past FIN *) then
(* If the socket is in TIME WAIT and this segment contains a new acknowledgement (one that acknowledges past the FIN segment), drop it: it is invalid. Stop processing.
*)
mlift dropafterack or fail seg arch rttab ifds ticks andThen stop
else
(* Otherwise, flag that deliver in 3 can continue processing the segment if need be *)
cont) )(* cb’ *)

– deliver in 3 ACK processing :
di3 ackstuff tcp sock 0 seg ourfinisacked arch rttab ifds ticks =
(* Pull some fields out of the segment *)
let ack = tcp seq flip sense seg .ack in
let seq = tcp seq flip sense seg .seq in
let data = seg .data in
(* Pull out the sender’s advertised window from the segment, applying the sender’s scaling *)
let win = w2n seg .win tcp sock 0 .cb.snd scale in
(* Get the socket’s control block using the monadic state accessor get cb. Process the acknowledgement data in the segment, do some congestion control calculations and finally update the control block *)
(get cbλcb.
(* The segment is possibly a duplicate ack if it contains no data, does not contain a window update and the socket has unacknowledged data (the retransmit timer is still active). The no-data condition is important: if this socket is sending little or no data at present and is waiting for some previous data to be acknowledged, but is receiving data-filled segments from the other end, these may all contain the same acknowledgement number and trigger the retransmit logic erroneously. *)
let has data = (data ≠ [ ] ∧ (bsd arch arch ⟹ (cb.rcv nxt < seq + length data ∧ seq < cb.rcv nxt + cb.rcv wnd))) in
let maybe dup ack = (¬has data ∧ win = cb.snd wnd ∧ mode of cb.tt rexmt = ↑ Rexmt) in
if ack ≤ cb.snd una ∧ maybe dup ack then
(* Received a duplicate acknowledgement: it is an old acknowledgement (no greater than snd una) and it meets the duplicate acknowledgement conditions above. Do Fast Retransmit/Fast Recovery Congestion Control (RFC 2581 Ch3.2 Pg6) and NewReno-style Fast Recovery (RFC 2582, Ch3 Pg3), updating the control block variables and creating segments for transmission as appropriate.
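An illustrative aside, not part of the specification: the duplicate-ACK test above and the three-way case split on the incremented counter that follows can be sketched in Python. This is a minimal sketch assuming plain integer sequence numbers; the function names and the string labels returned are hypothetical and only name the branches.

```python
def maybe_dup_ack(has_data: bool, win: int, snd_wnd: int, rexmt_running: bool) -> bool:
    # No data, no window update, and the retransmit timer is in Rexmt mode.
    return (not has_data) and win == snd_wnd and rexmt_running


def dup_ack_action(t_dupacks: int, newreno: bool, ack: int, snd_recover: int) -> str:
    n = t_dupacks + 1  # the counter after counting this duplicate ACK
    if n < 3:
        return "count"            # just record it and continue processing
    if n > 3 or (n == 3 and newreno and ack < snd_recover):
        return "inflate"          # recovery already in progress: grow cwnd by one segment
    return "fast_retransmit"      # third duplicate ACK: start Fast Retransmit
```

The final branch corresponds to t dupacks ′ = 3 with no NewReno recovery already in progress.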
*)
let t dupacks ′ = cb.t dupacks + 1 in
if t dupacks ′ < 3 then
(* Fewer than three duplicate acks received so far. Just increment the duplicate ack counter. We must continue processing, in case FIN is set. *)
modify cb(λcb′.cb′ 〈[ t dupacks := t dupacks ′]〉) andThen cont
else if t dupacks ′ > 3 ∨ (t dupacks ′ = 3 ∧ TCP DO NEWRENO∧ack < cb.snd recover) then
(* If this is the 4th or higher duplicate ACK then Fast Retransmit/Fast Recovery congestion control is already in progress. Increase the congestion window by another maximum segment size (as the duplicate ACK indicates another out-of-order segment has been received by the other end and is no longer consuming network resource), increment the duplicate ACK counter, and attempt to output another segment. *)
(* If this is the 3rd duplicate ACK , the host supports NewReno extensions and ack is strictly less than the fast recovery “recovered” sequence number snd recover , then the host is already doing NewReno-style fast recovery and has possibly falsely retransmitted a segment, or the retransmitted segment has been lost or delayed. Reset the duplicate ACK counter, increase the congestion window by a maximum segment size (for the same reason as before) and attempt to output another segment. NB: this will not cause a cycle to develop! The retransmission timer will eventually fire if recovery does not happen “fast”.
*) modify cb(λcb′.cb′ 〈[ t dupacks := if t dupacks ′ = 3 then 0 (* false retransmit, or further loss or delay *) else t dupacks ′; snd cwnd := cb.snd cwnd + cb.t maxseg ]〉) andThen mlift tcp output perhaps or fail ticks arch rttab ifds andThen stop (* no need to process the segment any further *) else if t dupacks ′ = 3 ∧ ¬(TCP DO NEWRENO∧ack < cb.snd recover) then (* If this is the 3rd duplicate segment and if the host supports NewReno extensions, a NewReno-style Fast Retransmit is not already in progress, then do a Fast Retransmit *) (* Update the control block before the retransmit to reflect which data requires retransmission *) modify cb(λcb′.cb′ 〈[ t dupacks := t dupacks ′; (* increment the counter *) (* Set to half the current flight size as per RFC2581/2582 *) snd ssthresh :=max 2((min cb.snd wnd cb.snd cwnd)div 2 div cb.t maxseg) ∗ cb.t maxseg ; (* If doing NewReno-style Fast Retransmit set to the highest sequence number trans- mitted so far snd max . *) snd recover :=ˆ cb.snd max onlywhen TCP DO NEWRENO; Rule version: $Id: TCP1 hostLTSScript.sml,v 1.961 2005/03/18 10:34:36 kw217 Exp $ di3 datastuff really 300 (* Clear the retransmit timer and round-trip time measurement timer. These will be started by tcp output really when the retransmit is actioned. 
*)
tt rexmt := ∗; t rttseg := ∗;
(* Sequence number to retransmit; this is equal to the ack value in the duplicate ACK segment *)
snd nxt := ack ;
(* Ensure the congestion window is large enough to allow one segment to be emitted *)
snd cwnd := cb.t maxseg ]〉) andThen
(* Attempt to create a segment for output using the modified control block (this is all a relational monad idiom) *)
mlift tcp output perhaps or fail ticks arch rttab ifds andThen
(* Finally, update the congestion window to snd ssthresh plus 3 maximum segment sizes (this is the artificial inflation of RFC2581/2582, applied because it is known that the 3 segments that generated the 3 duplicate acknowledgments have been received and are no longer consuming network resource). Also put snd nxt back to its previous value. *)
modify cb(λcb′.cb′ 〈[ snd cwnd := cb′.snd ssthresh + cb.t maxseg ∗ t dupacks ′; snd nxt :=max cb.snd nxt cb′.snd nxt ]〉) andThen
stop (* no need to process the segment any further *)
else assert failure“di3 ackstuff” (* Believed to be impossible; here for completeness and safety *)
else if ack ≤ cb.snd una ∧ ¬maybe dup ack then
(* Have received an old (would use the word “duplicate” if it did not have a special meaning) ACK and it is neither a duplicate ACK nor the ACK of a new sequence number, thus just clear the duplicate ACK counter.
*) modify cb(λcb′.cb′ 〈[ t dupacks := 0]〉) else (* Must be: ack > cb.snd una *) (* This is the ACK of a new sequence number—this case is handled by the auxiliary function di3 newackstuff (p295) *) di3 newackstuff tcp sock 0 seg ourfinisacked arch rttab ifds ticks ) – deliver in 3 data processing : di3 datastuff really the ststuff tcp sock 0 seg bsd fast path arch = (* Pull some fields out of the segment *) let ACK = seg .ACK in let FIN = seg .FIN in let PSH = seg .PSH in let URG = seg .URG in let ack = tcp seq flip sense seg .ack in let urp = w2n seg .urp in let data = seg .data in let seq = tcp seq flip sense seg .seq + (if seg .SYN then 1 else 0) in (* Pull out the senders advertised window and apply the sender’s scale factor *) let win = w2n seg .win (tcp sock 0 ).cb.snd scale in (* Get the socket’s control block using the monadic state accessor get cb. Process the segments data and possibly update the send window *) (get sockλsock . let tcp sock = tcp sock of sock in let cb = tcp sock .cb in Rule version: $Id: TCP1 hostLTSScript.sml,v 1.961 2005/03/18 10:34:36 kw217 Exp $ di3 datastuff really 301 (* Trim segment to be within the receive window *) (* Trim duplicate data from the left edge of data, i.e., data before cb.rcv nxt . Adjust seq , URG and urp in respect of left edge trimming. If the urgent data has been trimmed from the segment’s data, URG is cleared also. Note: the urgent pointer always points to the byte immediately following the urgent byte and is relative to the start of the segment’s data. An urgent pointer of zero signifies that there is no urgent data in the segment. 
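An illustrative aside, not part of the specification: the left-edge trimming described in this comment, together with the right-edge trimming defined shortly after it, can be sketched in Python. This is a minimal sketch that assumes plain integer sequence numbers without wraparound, models segment data as a bytes value, and uses hypothetical function names.

```python
from typing import Tuple


def trim_left(seq: int, data: bytes, urg: bool, urp: int,
              rcv_nxt: int) -> Tuple[int, bytes, bool, int]:
    # Drop duplicate bytes that precede rcv_nxt.
    trim = min(rcv_nxt - seq, len(data)) if rcv_nxt > seq else 0
    data_left = data[trim:]
    seq_trimmed = seq + trim
    # The urgent pointer is relative to the start of the data, so shift it;
    # it becomes 0 (no urgent data) if the urgent byte was trimmed away.
    urp_trimmed = urp - trim if urp > trim else 0
    urg_trimmed = urg if urp_trimmed != 0 else False
    return seq_trimmed, data_left, urg_trimmed, urp_trimmed


def trim_right(data_left: bytes, fin: bool, rcv_wnd: int) -> Tuple[bytes, bool]:
    # Keep at most rcv_wnd bytes; the FIN only counts if nothing was cut.
    data_left_right = data_left[:rcv_wnd]
    fin_trimmed = fin if data_left_right == data_left else False
    return data_left_right, fin_trimmed
```

For example, a 20-byte segment starting 10 bytes before rcv nxt loses its first 10 bytes, and an urgent pointer of 5 into that segment is cleared along with the URG flag.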
*)
let trim amt left = if cb.rcv nxt > seq then min(num(cb.rcv nxt − seq))(length data) else 0 in
let data trimmed left = DROP trim amt left data in
let seq trimmed = seq + trim amt left in (* Trimmed data starts at seq trimmed *)
let urp trimmed = if urp > trim amt left then urp − trim amt left else 0 in
let URG trimmed = if urp trimmed ≠ 0 then URG else F in
(* Trim any data outside the receive window from the right hand edge. If all the data is within the window and the FIN flag is set then the FIN flag is valid and should be processed. Note: this trimming may remove urgent data from the segment. The urgent pointer and flag are not cleared here because there is still urgent data to be received, but now in a future segment. *)
let data trimmed left right = TAKE cb.rcv wnd data trimmed left in
let FIN trimmed = if data trimmed left right = data trimmed left then FIN else F in
(* Processing of urgent (OOB) data: *)
(* We have a valid urgent pointer iff the trimmed segment has its urgent flag set with a non-zero urgent pointer, and the urgent pointer plus the length of the receive queue is less than or equal to SB MAX. The last condition is imposed by FreeBSD, supposedly to prevent soreceive from crashing (although we cannot identify why it might crash). *)
let urp valid = (URG trimmed ∧ urp trimmed > 0 ∧ urp trimmed + length tcp sock .rcvq ≤ SB MAX) in
(* This is a new urgent pointer, i.e., it is greater than any previous one stored in cb.rcv up. Note: the urgent pointer is relative to the sequence number of a segment *)
let urp advanced = (urp valid ∧ (seq trimmed + urp trimmed > cb.rcv up)) in
(* The urgent pointer lies within segment seg and the socket is not set to do inline delivery, therefore it is possible to pull out the urgent byte from the stream *)
let can pull = (urp valid ∧ urp trimmed ≤ length data trimmed left right ∧ sock .sf .b(SO OOBINLINE) = F) in
(* Build trimmed segment to place on reassembly queue.
If urgent data is in this segment and the socket is not doing inline delivery (and hence the urgent byte is stored in iobc), remove the urgent byte from the segment’s data so that it does not get placed in the receive queue, and set spliced urp to the sequence number of the urgent byte. *) let rseg =〈[ seq := seq trimmed ; spliced urp := if can pull then ↑(cb.rcv nxt + urp trimmed − 1) else ∗; FIN :=FIN trimmed ; data := if can pull then (TAKE(urp − 1)data trimmed left right) @ (DROP urp data trimmed left right) else data trimmed left right ]〉 in (* Perform a monadic socket state update *) modify tcp sock(λs.s 〈[ cb := s.cb 〈[ (* If the segment’s urgent pointer is valid and advances the urgent pointer, update rcv up with the new absolute pointer, otherwise just pull it along with the left hand edge of the receive window. Note: an earlier segment may have set rcv up to point somewhere into a future segment. The use of max ensures that the pointer is not accidentally overwritten until the future segment arrives. *) (* FreeBSD does not pull rcv up along in the fast path; this is a bug *) rcv up :=ˆ(if urp advanced then seq trimmed + urp trimmed else max cb.rcv up cb.rcv nxt) onlywhen¬(bsd arch arch ∧ bsd fast path)]〉; Rule version: $Id: TCP1 hostLTSScript.sml,v 1.961 2005/03/18 10:34:36 kw217 Exp $ di3 datastuff really 302 (* If the urgent pointer is valid and advances the urgent pointer, update rcvurp—the socket’s receive queue urgent data index—to be the index into the receive queue where the new urgent data will be stored. Note: the subtraction of 1 is correct because rcvurp points to the location where the urgent byte is stored not the byte immediately following the urgent byte (as is the convention for the urp field in the TCP header). 
*) rcvurp :=ˆ(↑(length tcp sock .rcvq + num(seq trimmed + urp trimmed − cb.rcv nxt − 1))) onlywhen urp advanced ; (* If the segment’s urgent pointer is valid, the urgent data is within this segment and the socket is not doing inline delivery of urgent data, pull out the urgent byte into iobc. If the urgent data is within a future segment set iobc to NO OOBDATA to signify that the urgent data is not available yet, otherwise leave iobc alone if the urgent pointer is not valid. *) iobc :=ˆ(if can pull then OOBDATA(EL(urp − 1) data trimmed left right) else NO OOBDATA) onlywhen urp valid ]〉) andThen (* Processing of non-urgent data. There are 6 cases to consider: *) (chooseM{F;T}λFIN reass. (* Case (1) The segment contains new in-order, in-window data possibly with a FIN and the receive window is not closed. Note: it is possible that the segment contains just one byte of OOB data that may have already been pulled out into iobc if OOB delivery is out-of-line. In which case, the below must still be performed even though no data is contributed to the reassembly buffer in order that rcv nxt is updated correctly (because a byte of urgent data consumes a byte of sequence number space). This is why data trimmed left right is used rather than data deoobed in some of the conditions below. *) (if seq trimmed = cb.rcv nxt ∧ seq trimmed + length data trimmed left right + (if FIN trimmed then 1 else 0) > cb.rcv nxt ∧ cb.rcv wnd > 0 then (* Only need to acknowledge the segment if there is new in-window data (including urgent data) or a valid FIN *) let have stuff to ack = (data trimmed left right ≠ [ ] ∨ FIN trimmed) in (* If the socket is connected, has data to ACK but no FIN to ACK, the reassembly queue is empty, the socket is not currently within a bad retransmit window and an ACK is not already being delayed, then delay the ACK.
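The delayed-ACK condition just stated can be read as a plain Boolean predicate. The following Python sketch is an illustration only and is not part of the specification; the `RxView` record and its field names are hypothetical stand-ins for the socket state and control-block fields the condition reads.

```python
from dataclasses import dataclass

# States in which the condition above permits an ACK to be delayed.
DELAYABLE_STATES = frozenset({
    "ESTABLISHED", "CLOSE_WAIT", "FIN_WAIT_1",
    "CLOSING", "LAST_ACK", "FIN_WAIT_2",
})


@dataclass
class RxView:
    """Hypothetical snapshot of the fields the decision depends on."""
    st: str                # TCP state name
    segq_empty: bool       # reassembly queue is empty
    rxwin0sent: bool       # the "bad retransmit window" flag is set
    delack_running: bool   # a delayed-ACK timer is already pending


def should_delay_ack(view: RxView, data: bytes, fin: bool) -> bool:
    """Return True when the ACK for this trimmed segment may be delayed."""
    have_stuff_to_ack = len(data) > 0 or fin
    return (view.st in DELAYABLE_STATES
            and have_stuff_to_ack
            and not fin
            and view.segq_empty
            and not view.rxwin0sent
            and not view.delack_running)
```

When the predicate is false but there is something to acknowledge, the segment is acknowledged immediately rather than after the delayed-ACK timer fires.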
*) let delay ack = (tcp sock .st ∈ {ESTABLISHED;CLOSE WAIT;FIN WAIT 1; CLOSING;LAST ACK;FIN WAIT 2} ∧ have stuff to ack ∧ ¬FIN trimmed ∧ cb.t segq = [ ] ∧ ¬cb.tf rxwin0sent ∧ cb.tt delack = ∗) in (* Check to see whether any data or a FIN can be reassembled. tcp reass returns the set of all possible reassemblies, one of which is chosen non-deterministically here. Note: a FIN can only be reassembled once all the data has been reassembled. The len result from tcp reass is the length of the reassembled data, data reass, plus the length of any out-of-line urgent data that is not included in the reassembled data but logically occurs within it. This is to ensure that control block variables such as rcv nxt are incremented by the correct amount, i.e., by the amount of data (whether urgent or not) received successfully by the socket. See tcp reass (p100) for further details. *) let rsegq = rseg :: cb.t segq in (chooseM(tcp reass cb.rcv nxt rsegq)λ(data reass, len,FIN reass0 ). (* Length (in sequence space) of reassembled data, counting a FIN as one byte and including any out-of-line urgent data previously removed *) let len reass = len + (if FIN reass0 then 1 else 0) in (* Add the reassembled data to the receive queue and increment rcv nxt to mark the sequence number of the byte past the last byte in the receive queue *) let rcvq ′ = tcp sock .rcvq @ data reass in let rcv nxt ′ = cb.rcv nxt + len reass in (* includes oob bytes as they occupy sequence space *) (* Prune the receive queue of any data or FIN s that were reassembled, keeping all segments that contain data at or past sequence number cb.rcv nxt + len reass. *) let t segq ′ = tcp reass prune rcv nxt ′ rsegq in (* Reduce the receive window in light of the data added to the receive queue. Do not include out-of-line urgent data because it does not store data in the receive queue.
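The asymmetry between the two counters may be easier to see with numbers. The Python sketch below is illustrative only: the function name is hypothetical, and sequence numbers are modelled as unbounded integers rather than 32-bit wrapping values. It advances rcv nxt by the full sequence-space length (including a FIN and any out-of-line urgent byte) but shrinks rcv wnd only by the bytes actually appended to the receive queue.

```python
def advance_after_reassembly(rcv_nxt: int, rcv_wnd: int,
                             data_reass: bytes, seq_len: int,
                             fin_reass: bool) -> tuple[int, int]:
    """Return the updated (rcv_nxt, rcv_wnd) after a reassembly step.

    seq_len is the sequence-space length reported by reassembly: the length
    of data_reass plus any out-of-line urgent bytes that were pulled out.
    """
    len_reass = seq_len + (1 if fin_reass else 0)   # a FIN occupies one sequence number
    new_rcv_nxt = rcv_nxt + len_reass               # counts urgent bytes and the FIN
    new_rcv_wnd = rcv_wnd - len(data_reass)         # counts only queued bytes
    return new_rcv_nxt, new_rcv_wnd
```

For example, a five-byte segment whose urgent byte was pulled out-of-line contributes four queued bytes: starting from rcv nxt = 1000 and rcv wnd = 8192, the sketch yields 1005 and 8188.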
*) let rcv wnd ′ = cb.rcv wnd − length data reass in (* Hack: assertion used to share values with later conditions *) assert(FIN reass = FIN reass0 ) andThen (* Update the socket state *) modify tcp sock(λs.s 〈[ rcvq := rcvq ′; (* the updated receive queue *) cb := s.cb 〈[ (* Start the delayed ack timer if decided to earlier, i.e., delay ack = T. *) tt delack :=ˆ ↑((())fast timer TCPTV DELACK)onlywhen delay ack ; (* Set if not delaying an ACK and have stuff to ACK *) tf shouldacknow :=ˆ¬delay ack onlywhen have stuff to ack ; t segq := t segq ′; (* updated reassembly queue, post-pruning *) rcv nxt := rcv nxt ′; rcv wnd := rcv wnd ′ ]〉 ]〉) )(* chooseM *) (* Case (2) The segment contains new out-of-order in-window data, possibly with a FIN , and the receive window is not closed. Note: it may also contain in-window urgent data that may have been pulled out-of-line but still require processing to keep reassembly happy. *) else if seq trimmed > cb.rcv nxt ∧ seq trimmed < cb.rcv nxt + cb.rcv wnd ∧ length data trimmed left right + (if FIN trimmed then 1 else 0) > 0 ∧ cb.rcv wnd > 0 then (* Hack: assertion used to share values with later conditions *) assert(FIN reass = F) andThen (* Update the socket’s TCP control block state *) modify cb(λcb.cb 〈[ (* Add the segment to the reassembly queue *) t segq := rseg :: cb.t segq ; (* Acknowledge out-of-order data immediately (per RFC2581 Ch4.2) *) tf shouldacknow :=T ]〉) (* Case (3) The segment is a pure ACK segment (contains no data) (and must be in-order). *) (* Invariant here that seq trimmed = seq if segment is a pure ACK . Note: the length of the original segment (not the trimmed segment) is used in the guard to ensure this really was a pure ACK segment. 
*) else if ACK ∧ seq trimmed = cb.rcv nxt ∧ length data + (if FIN then 1 else 0) = 0 then (* Hack: assertion used to share values with later conditions *) assert(FIN reass = F) (* Have not received a FIN *) (* Case (4) Segment contained no useful data: it was a completely old segment. Note: the original fields from the segment, i.e., seq , data and FIN are used in the guard below; the trimmed variants are useless here! *) (* Case (5) Segment is a window probe. Note: the original fields from the segment, i.e., data and FIN are used in the guard below; the trimmed variants are useless here! *) (* Case (6) Segment is completely beyond the window and is not a window probe *) else if (seq < cb.rcv nxt ∧ seq + length data + (if FIN then 1 else 0) ≤ cb.rcv nxt)∨ (* (4) *) (seq trimmed = cb.rcv nxt ∧ cb.rcv wnd = 0 ∧ length data + (if FIN then 1 else 0) > 0)∨ (* (5) *) T then (* (6) *) (* Hack: assertion used to share values with later conditions *) assert(FIN reass = F) andThen (* Definitely false: segment is outside window *) (* Update socket’s control block to assert that an ACK segment should be sent now.
*) (* Source: TCPIPv2p959 says "segment is discarded and an ack is sent as a reply" *) modify cb(λcb.cb 〈[ tf shouldacknow :=T]〉) else assert failure "di3 datastuff" (* impossible *) ) andThen (* Finished processing the segment’s data *) (* Thread the reassembled FIN flag through to di3 ststuff *) the ststuff FIN reass )(* chooseM FIN reass *) )(* get sock \sock *)

– deliver in 3 data processing :

di3 datastuff the ststuff tcp sock 0 seg ourfinisacked arch = (* Pull some fields out of the segment *) let ACK = seg .ACK in let FIN = seg .FIN in let PSH = seg .PSH in let URG = seg .URG in let ack = tcp seq flip sense seg .ack in let urp = w2n seg .urp in let data = seg .data in let seq = tcp seq flip sense seg .seq + (if seg .SYN then 1 else 0) in let win = w2n seg .win (tcp sock 0 ).cb.snd scale in get sockλsock . let tcp sock = tcp sock of sock in let cb = tcp sock .cb in (* Various things do not happen if BSD processes the segment using its header prediction (fast-path) code. Header prediction occurs only in the ESTABLISHED state, with segments that have only ACK and/or PSH flags set, are in-order, do not contain a window update, when data is not being retransmitted (no congestion is occurring) and either: (a) the segment is a valid pure ACK segment of new data, fewer than three duplicate ACK s have been received and the congestion window is at least as large as the send window, or (b) the segment contains new data, does not acknowledge any new data, the segment reassembly queue is empty and there is space for the segment’s data in the socket’s receive buffer.
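As an informal reading aid, the header-prediction conditions listed above can be written as a Python predicate. This is a sketch under stated assumptions, not the specification: the `Seg` and `Conn` records and their field names are hypothetical, and sequence numbers are compared as ordinary integers, ignoring 32-bit wraparound.

```python
from dataclasses import dataclass


@dataclass
class Seg:
    """Hypothetical view of an incoming segment."""
    syn: bool
    fin: bool
    rst: bool
    urg: bool
    ack_flag: bool
    seq: int
    ack: int
    win: int
    data: bytes


@dataclass
class Conn:
    """Hypothetical view of the socket and its control block."""
    st: str
    rcv_nxt: int
    snd_una: int
    snd_nxt: int
    snd_max: int
    snd_wnd: int
    snd_cwnd: int
    t_dupacks: int
    segq_empty: bool
    rcvbuf: int      # configured receive buffer size
    rcvq_len: int    # bytes currently in the receive queue


def bsd_fast_path(c: Conn, s: Seg) -> bool:
    """True when the segment would take the header-prediction path."""
    common = (c.st == "ESTABLISHED"
              and not (s.syn or s.fin or s.rst or s.urg)
              and s.ack_flag
              and s.seq == c.rcv_nxt          # in order
              and c.snd_wnd == s.win          # no window update
              and c.snd_max == c.snd_nxt)     # not retransmitting
    # (a) acknowledges new data, few duplicate ACKs, window fully open
    acks_new_data = (c.snd_una < s.ack <= c.snd_max
                     and c.snd_cwnd >= c.snd_wnd
                     and c.t_dupacks < 3)
    # (b) carries data only, nothing queued for reassembly, buffer has room
    data_only = (s.ack == c.snd_una
                 and c.segq_empty
                 and len(s.data) < c.rcvbuf - c.rcvq_len)
    return common and (acks_new_data or data_only)
```

Splitting the predicate into a common part and two alternatives mirrors the "and either (a) or (b)" shape of the prose condition.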
*) let bsd fast path = ((tcp sock .st = ESTABLISHED) ∧ ¬seg .SYN ∧ ¬FIN ∧ ¬seg .RST ∧ ¬URG ∧ACK ∧ seq = cb.rcv nxt ∧ cb.snd wnd = win ∧ cb.snd max = cb.snd nxt ∧ ( (ack > cb.snd una ∧ ack ≤ cb.snd max ∧ cb.snd cwnd ≥ cb.snd wnd ∧ cb.t dupacks < 3) ∨ (ack = cb.snd una ∧ cb.t segq = [ ] ∧ (length data) < (sock .sf .n(SO RCVBUF)− length tcp sock .rcvq)))) in (* Update the send window using the received segment if the segment will not be processed by BSD’s fast path, has the ACK flag set, is not to the right of the window, and either: (a) the last window update was from a segment with sequence number less than seq , i.e., an older segment than the current segment, or (b) the last window update was from a segment with sequence number equal to seq but with an acknowledgement number less than ack , i.e., this segment acknowledges newer data than the segment the last window update was taken from, or (c) the last window update was from a segment with sequence number equal to seq and acknowledgement number equal to ack , i.e., a segment similar to that the previous update came from, but this segment contains a larger window advertisement than was previously advertised, or (d) this segment is the third segment during connection establishment (state is SYN RECEIVED) and does not have the FIN flag set.
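Conditions (a) to (d) above amount to a freshness test on the segment that last updated the send window. The Python sketch below illustrates this under assumptions: the function and parameter names are hypothetical, and sequence and acknowledgement numbers are treated as plain integers without wraparound.

```python
def update_send_window(*, fast_path: bool, ack_flag: bool, fin: bool, st: str,
                       seq: int, ack: int, win: int,
                       rcv_nxt: int, rcv_wnd: int,
                       snd_wl1: int, snd_wl2: int, snd_wnd: int) -> bool:
    """Decide whether this segment's window advertisement should be adopted.

    snd_wl1 and snd_wl2 are the sequence and acknowledgement numbers of the
    segment used for the previous window update.
    """
    fresher = (snd_wl1 < seq                                  # (a) newer segment
               or (snd_wl1 == seq
                   and (snd_wl2 < ack                         # (b) newer ack
                        or (snd_wl2 == ack and win > snd_wnd)))  # (c) larger window
               or (st == "SYN_RECEIVED" and not fin))         # (d) end of handshake
    not_right_of_window = seq <= rcv_nxt + rcv_wnd
    return (not fast_path) and ack_flag and not_right_of_window and fresher
```

Keyword-only parameters are used here purely to make call sites self-describing, since the predicate takes many integers of the same type.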
*) let update send window = (¬bsd fast path ∧ seg .ACK ∧ seq ≤ cb.rcv nxt + cb.rcv wnd ∧ (cb.snd wl1 < seq ∨ (cb.snd wl1 = seq ∧ (cb.snd wl2 < ack ∨ cb.snd wl2 = ack ∧ win > cb.snd wnd)) ∨ (tcp sock .st = SYN RECEIVED ∧ ¬FIN ))) in (* This replaces BSD’s snd_wl1 := seq-1 hack; should perhaps be ¬FIN reass *) let seq trimmed =max seq(min cb.rcv nxt(seq + length data)) in (* Write back the window updates *) modify cb(λcb.cb 〈[ snd wnd :=ˆ win onlywhen update send window ; snd wl1 :=ˆ seq trimmed onlywhen update send window ; snd wl2 :=ˆ ack onlywhen update send window (* persist timer will be set by deliver out 1 if this updates the window to zero and there is data to send *) ]〉) andThen (* If in TIME WAIT or will transition to it from CLOSING, ignore any URG, data, or FIN. Note that in FIN WAIT 1 or FIN WAIT 2, we still process data, even if ourfinisacked . *) if tcp sock .st = TIME WAIT ∨ (tcp sock .st = CLOSING ∧ ourfinisacked) then (* pull along urgent pointer *) modify cb(λcb.cb 〈[ rcv up :=max cb.rcv up cb.rcv nxt ]〉) andThen the ststuff F else di3 datastuff really the ststuff tcp sock 0 seg bsd fast path arch

– deliver in 3 TCP state change processing :

di3 ststuff FIN reass ourfinisacked ack = (* The entirety of this function is an encoding of the TCP State Transition Diagram (as it is, not as it is traditionally depicted) post-SYN SENT state. It specifies for given start state and set of conditions (all or some of which are affected by the processing of the current segment), which state the TCP socket should be moved into next *) (* Get the TCP socket using the monadic state accessor get cb. *) (get sockλsock . let cb = (tcp sock of sock).cb in (* ...and its control block *) (* Several of the encoded transitions (below) require the socket to be moved into the TIME WAIT state, in which case the 2MSL timer is started, all other timers are cancelled and the socket’s state is changed to TIME WAIT.
This common idiom is defined monadically as a function here *) let enter TIME WAIT = modify tcp sock(λs.s 〈[ st :=TIME WAIT; cb := s.cb 〈[ tt 2msl := ↑((())slow timer(2∗TCPTV MSL)); tt rexmt := ∗; tt keep := ∗; tt delack := ∗; tt conn est := ∗; tt fin wait 2 := ∗ ]〉 ]〉) in (* If the processing of the current segment has led to FIN reass being asserted then the whole data stream from the other end has been received and reconstructed, including the final FIN flag. The socket should have its read-half flagged as shut down, i.e., cantrcvmore = T, otherwise the socket is not modified. *) (if FIN reass then modify sock(λs.s 〈[ cantrcvmore :=T]〉) else cont) andThen (* State Transition Diagram encoding: *) (* The state transition encoding, case-split on the current state and whether a FIN from the remote end has been reassembled *) case ((tcp sock of sock).st ,FIN reass) of (SYN RECEIVED,F)→ (* In SYN RECEIVED and have not received a FIN *) if ack ≥ cb.iss + 1 then (* This socket’s initial SYN has been acknowledged *) modify tcp sock(λs.s 〈[ st := if ¬sock .cantsndmore then ESTABLISHED (* socket is now fully connected *) else (* The connecting socket had its write-half shut down by shutdown(), forcing a FIN to be emitted to the other end *) if ourfinisacked then (* The emitted FIN has been acknowledged *) FIN WAIT 2 else (* Still waiting for the emitted FIN to be acknowledged *) FIN WAIT 1 ]〉) else (* Not a valid path *) stop ‖ (SYN RECEIVED,T)→ (* In SYN RECEIVED and have received a FIN *) (* Enter the CLOSE WAIT state, missing out ESTABLISHED *) modify tcp sock(λs.s 〈[ st :=CLOSE WAIT]〉) ‖ (ESTABLISHED,F)→ (* In ESTABLISHED and have not received a FIN *) (* Doing common-case data delivery and acknowledgements. Remain in ESTABLISHED.
*) cont ‖ (ESTABLISHED,T)→ (* In ESTABLISHED and received a FIN *) (* Move into the CLOSE WAIT state *) modify tcp sock(λs.s 〈[ st :=CLOSE WAIT]〉) ‖ (CLOSE WAIT,F)→ (* In CLOSE WAIT and have not received a FIN *) (* Do nothing and remain in CLOSE WAIT. The socket has its receive-side shut down due to the FIN it received previously from the remote end. It can continue to emit segments containing data and receive acknowledgements back until such a time that it closes down and emits a FIN *) cont ‖ (CLOSE WAIT,T)→ (* In CLOSE WAIT and received (another) FIN *) (* The duplicate FIN will have had a new sequence number to be valid and reach this point; RFC793 says "ignore" it so do not change state! If it were a duplicate with the same sequence number as the previously accepted FIN , then the deliver in 3 acknowledgement processing function di3 ackstuff would have dropped it. *) cont ‖ (FIN WAIT 1,F)→ (* In FIN WAIT 1 and have not received a FIN *) (* This socket will have emitted a FIN to enter FIN WAIT 1. *) if ourfinisacked then (* If this socket’s FIN has been acknowledged, enter state FIN WAIT 2 and start the FIN WAIT 2 timer. The timer ensures that if the other end has gone away without emitting a FIN and does not transmit any more data the socket is closed rather than left dangling. *) modify tcp sock(λs.s 〈[ st :=FIN WAIT 2; cb := s.cb 〈[ tt fin wait 2 :=ˆ ↑((())slow timer TCPTV MAXIDLE) onlywhen sock .cantrcvmore (* believe always true *) ]〉 ]〉) else (* If this socket’s FIN has not been acknowledged then remain in FIN WAIT 1 *) cont ‖ (FIN WAIT 1,T)→ (* In FIN WAIT 1 and received a FIN *) if ourfinisacked then (* ...and this socket’s FIN has been acknowledged then the connection has been closed successfully so enter TIME WAIT.
Note: this differs slightly from the behaviour of BSD which momentarily enters the FIN WAIT 2 and after a little more processing enters TIME WAIT *) enter TIME WAIT else (* If this socket’s FIN has not been acknowledged then the other end is attempting to close the connection simultaneously (a simultaneous close). Move to the CLOSING state *) modify tcp sock(λs.s 〈[ st :=CLOSING]〉) ‖ (FIN WAIT 2,F)→ (* In FIN WAIT 2 and have not received a FIN *) (* This socket has previously emitted a FIN which has already been acknowledged. It can continue to receive data from the other end which it must acknowledge. During this time the socket should remain in FIN WAIT 2 until such a time that it receives a valid FIN from the remote end, or if no activity occurs on the connection the FIN WAIT 2 timer will fire, eventually closing the socket *) cont ‖ (FIN WAIT 2,T)→ (* In FIN WAIT 2 and have received a FIN *) (* Connection has been shut down so enter TIME WAIT *) enter TIME WAIT ‖ (CLOSING,F)→ (* In CLOSING and have not received a FIN *) if ourfinisacked then (* If this socket’s FIN has been acknowledged (common-case), enter TIME WAIT as the connection has been successfully closed *) enter TIME WAIT else (* Otherwise, the other end has not yet received or processed the FIN emitted by this socket. Remain in the CLOSING state until it does so. Note: if the previously emitted FIN is not acknowledged this socket’s retransmit timer will eventually fire causing retransmission of the FIN . *) cont ‖ (CLOSING,T)→ (* In CLOSING and have received a FIN *) (* The received FIN is a duplicate FIN with a new sequence number so as per RFC793 is ignored – if it were a duplicate with the same sequence number as the previously accepted FIN , then the deliver in 3 acknowledgement processing function di3 ackstuff would have dropped it.
*) if ourfinisacked then (* If this socket’s FIN has been acknowledged then the connection is now successfully closed, so enter TIME WAIT state *) enter TIME WAIT else (* Otherwise, ignore the new FIN and remain in the same state *) cont ‖ (LAST ACK,F)→ (* In LAST ACK and have not received a FIN *) (* Remain in LAST ACK until this socket’s FIN is acknowledged. Note: eventually the retransmit timer will fire forcing the FIN to be retransmitted. *) cont ‖ (LAST ACK,T)→ (* In LAST ACK and have received a FIN *) (* This transition is handled specially at the end of di3 newackstuff at which point processing stops, thus this transition is not possible *) assert failure "di3 ststuff" (* impossible *) ‖ (TIME WAIT,F)→ (* In TIME WAIT and have not received a FIN *) (* Remaining in TIME WAIT until the 2MSL timer expires *) cont ‖ (TIME WAIT,T)→ (* In TIME WAIT and have received a FIN *) (* Remaining in TIME WAIT until the 2MSL timer expires *) cont )

– deliver in 3 socket update processing :

di3 socks update sid socks socks ′ = let sock 1 = socks[sid] in ∃tcp sock 1 . TCP PROTO(tcp sock 1 ) = sock 1 .pr ∧ (* Socket sock 1 referenced by identifier sid has just finished connection establishment and either there is another socket with sock 1 on its pending connections queue and this is the completion of a passive open, or there is not another socket and this is the completion of a simultaneous open. See the inline comment in deliver in 3 (p292) for further details. *) let interesting = λsid ′. sid ′ ≠ sid ∧ case (socks[sid ′]).pr of UDP PROTO udp sock → F ‖ TCP PROTO(tcp sock ′)→ case tcp sock ′.lis of ∗ → F ‖ ↑ lis → sid ∈ lis.q0 in let interesting sids = (dom(socks)) ∩ interesting in if interesting sids ≠ {} then (* There exists another socket sock ′ that is listening and has socket sock 1 referenced by sid on its queue of incomplete connections lis.q0.
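The queue bookkeeping performed for such a listening socket can be illustrated with ordinary lists. The Python sketch below is an assumption-laden illustration rather than the specification: the function name is hypothetical, and the non-deterministic choice of whether the completed-connections queue has room is passed in as the Boolean `room`.

```python
def complete_passive_open(q0: list, q: list, sid, room: bool) -> tuple[list, list]:
    """Move sid from the incomplete queue q0 to the completed queue q.

    Returns the new (q0, q). If there is no room, sid is still removed
    from q0 but is not added to q, i.e. the new connection is dropped.
    """
    i = q0.index(sid)               # raises ValueError if sid is not on q0
    new_q0 = q0[:i] + q0[i + 1:]    # q0L @ q0R
    new_q = [sid] + q if room else list(q)
    return new_q0, new_q
```

Note that the newly completed connection is placed at the head of the completed queue, matching the `sid :: lis.q` form used in the specification.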
*) ∃sid ′ sock ′ tcp sock ′ lis q0L q0R. sid ′ ∈ interesting sids ∧ sock ′ = socks[sid ′] ∧ sock ′.pr = TCP PROTO tcp sock ′ ∧ sid ′ ≠ sid ∧ tcp sock ′.lis = ↑ lis ∧ lis.q0 = q0L@ (sid :: q0R) ∧ (* Choose non-deterministically whether there is room on the queue of completed connections *) choose ok :: accept incoming q lis. if ok then (* If there is room, then remove socket sid from the queue of incomplete connections and add it to the queue of completed connections. *) let lis ′ = lis 〈[ q0 := q0L@ q0R; q := sid :: lis.q ]〉 in (* Update the newly connected socket’s receive window *) let rcv window = calculate bsd rcv wnd sock 1 .sf tcp sock 1 in (* BSD bug - rcv adv gets incorrectly set using the old value of rcv wnd , as this is done by the syncache, which is called from tcp_input() before the rcv wnd update takes place. Note that we have the following: SYN_SENT->ESTABLISHED => update rcv wnd then rcv adv SYN_RCVD->ESTABLISHED => update rcv adv then rcv wnd *) let cb′ = tcp sock 1 .cb 〈[ rcv wnd := rcv window ; rcv adv := tcp sock 1 .cb.rcv nxt + tcp sock 1 .cb.rcv wnd ]〉 in (* Update both the newly connected socket and the listening socket *) socks ′ = socks ⊕ [(sid, sock 1 〈[ pr :=TCP PROTO(tcp sock 1 〈[ cb := cb′]〉)]〉); (sid ′, sock ′ 〈[ pr :=TCP PROTO(tcp sock ′ 〈[ lis := ↑ lis ′]〉)]〉)] else (* ...otherwise there is no room on the listening socket’s completed connections queue, so drop the newly connected socket and remove it from the listening socket’s queue of incomplete connections. Note: the dropped connection is not sent a RST but a RST is sent upon receipt of further segments from the other end as the socket entry has gone away. *) (* Note that the above note needs to be verified by testing.
*) let lis ′ = lis 〈[ q0 := q0L@ q0R]〉 in socks ′ = socks ⊕ (sid ′, sock ′ 〈[ pr :=TCP PROTO(tcp sock ′ 〈[ lis := ↑ lis ′]〉)]〉) else (* There is no such socket with socket sid on its queue of incomplete connections, thus socket sid was involved in a simultaneous open. Do not update any socket. *) socks ′ = socks

deliver in 3a tcp: network nonurgent Receive data with invalid checksum or offset

h 〈[socks := socks; iq := iq ]〉 τ−→ h 〈[socks := socks; iq := iq ′]〉

(* Summary: This rule is a placeholder for the case where a received segment has an invalid checksum or offset, in which case implementations should drop it on the floor. The model of TCP segments does not contain checksum or offset, however, hence the F below. *)

sid ∈ dom(socks) ∧ sock 0 = socks[sid ] ∧ sock 0 .is1 = ↑ i1 ∧ sock 0 .ps1 = ↑ p1 ∧ sock 0 .is2 = ↑ i2 ∧ sock 0 .ps2 = ↑ p2 ∧ sock 0 .pr = TCP PROTO(tcp sock 0 ) ∧ dequeue iq(iq , iq ′, ↑(TCP seg)) ∧ (∃win urp ws discard mss discard .
win = w2n win tcp sock 0 .cb.snd scale ∧ urp = w2n urp ∧ seg =〈[ is1 := ↑ i2; is2 := ↑ i1; ps1 := ↑ p2; ps2 := ↑ p1; seq := tcp seq flip sense(seq : tcp seq foreign); ack := tcp seq flip sense(ack : tcp seq local); URG :=URG ; ACK :=ACK ; PSH :=PSH ; RST :=F; SYN :=F; FIN :=FIN ; win :=win ; ws :=ws discard ; urp := urp ; mss :=mss discard ; ts := ts; data := data ]〉) ∧ (* Note that there does not exist a better socket match to which the segment should be sent, as the whole quad is matched exactly *) tcp sock 0 .st ∉ {CLOSED;LISTEN;SYN SENT} ∧ tcp sock 0 .st ∈ {SYN RECEIVED;ESTABLISHED;CLOSE WAIT;FIN WAIT 1;FIN WAIT 2; CLOSING;LAST ACK;TIME WAIT} ∧ F (* invalid checksum or offset *)

deliver in 3b tcp: network nonurgent Receive data after process has gone away

h 〈[socks := socks; iq := iq ; oq := oq ; bndlm := bndlm]〉 τ−→ h 〈[socks := socks ′; iq := iq ′; oq := oq ′; bndlm := bndlm ′]〉

(* Summary: if data arrives after the process associated with a socket has gone away, close socket and emit RST segment. *)

sid ∈ dom(socks) ∧ sock 0 = socks[sid ] ∧ sock 0 .is1 = ↑ i1 ∧ sock 0 .ps1 = ↑ p1 ∧ sock 0 .is2 = ↑ i2 ∧ sock 0 .ps2 = ↑ p2 ∧ sock 0 .pr = TCP PROTO(tcp sock 0 ) ∧ dequeue iq(iq , iq ′, ↑(TCP seg)) ∧ (∃win urp ws discard mss discard . win = w2n win tcp sock 0 .cb.snd scale ∧ urp = w2n urp ∧ seg =〈[ is1 := ↑ i2; is2 := ↑ i1; ps1 := ↑ p2; ps2 := ↑ p1; seq := tcp seq flip sense(seq : tcp seq foreign); ack := tcp seq flip sense(ack : tcp seq local); URG :=URG ; ACK :=ACK ; PSH :=PSH ; RST :=F; SYN :=F; FIN :=FIN ; win :=win ; ws :=ws discard ; urp := urp ; mss :=mss discard ; ts := ts; data := data ]〉) ∧ (* Note that there does not exist a better socket match to which the segment should be sent, as the whole quad is matched exactly.
*) (* test that this is data arriving after process has gone away *) tcp sock 0 .st ∈ {FIN WAIT 1;CLOSING;LAST ACK;FIN WAIT 2;TIME WAIT} ∧ sock 0 .fid = ∗ ∧ seq + length data > tcp sock 0 .cb.rcv nxt ∧ (* close socket and emit RST segment *) socks ′ = socks ⊕ (sid , tcp close h.arch sock 0 ) ∧ dropwithreset ignore fail seg h.arch h.ifds h.rttab(ticks of h.ticks) BANDLIM UNLIMITED bndlm bndlm ′ outsegs ∧ enqueue oq list qinfo(oq , outsegs, oq ′)

deliver in 3c tcp: network nonurgent Receive stupid ACK or LAND DoS in SYN RECEIVED state

h 〈[socks := socks; iq := iq ; oq := oq ; bndlm := bndlm]〉 τ−→ h 〈[socks := socks ′; iq := iq ′; oq := oq ′; bndlm := bndlm ′]〉

(* Summary: if we receive a stupid ACK or a LAND DoS in SYN RECEIVED state then update timers and emit a RST appropriately. *)

sid ∈ dom(socks) ∧ sock 0 = socks[sid ] ∧ sock 0 .is1 = ↑ i1 ∧ sock 0 .ps1 = ↑ p1 ∧ sock 0 .is2 = ↑ i2 ∧ sock 0 .ps2 = ↑ p2 ∧ sock 0 .pr = TCP PROTO(tcp sock 0 ) ∧ dequeue iq(iq , iq ′, ↑(TCP seg)) ∧ (∃win urp ws discard mss discard . win = w2n win tcp sock 0 .cb.snd scale ∧ urp = w2n urp ∧ seg =〈[ is1 := ↑ i2; is2 := ↑ i1; ps1 := ↑ p2; ps2 := ↑ p1; seq := tcp seq flip sense(seq : tcp seq foreign); ack := tcp seq flip sense(ack : tcp seq local); URG :=URG ; ACK :=ACK ; PSH :=PSH ; RST :=F; SYN :=F; FIN :=FIN ; win :=win ; ws :=ws discard ; urp := urp ; mss :=mss discard ; ts := ts; data := data ]〉) ∧ (* Note that there does not exist a better socket match to which the segment should be sent, as the whole quad is matched exactly.
*) (* test for stupid ACK in SYN RECEIVED, and for LAND DoS attack *) tcp sock 0 .st = SYN RECEIVED ∧ ((ACK ∧ (ack ≤ tcp sock 0 .cb.snd una ∨ ack > tcp sock 0 .cb.snd max )) ∨ seq < tcp sock 0 .cb.irs) ∧ (* incoming segment; update timers *) let (t idletime ′, tt keep′, tt fin wait 2 ′) = update idle tcp sock 0 in let tcp sock ′ = tcp sock 0 〈[ cb := tcp sock 0 .cb 〈[ t idletime := t idletime ′; tt keep := tt keep′; tt fin wait 2 := tt fin wait 2 ′]〉]〉 in socks ′ = socks ⊕ (sid , sock 0 〈[ pr :=TCP PROTO(tcp sock ′)]〉) ∧ (* emit RST. See dropwithreset ignore fail (p120) and enqueue oq list qinfo (p??). *) dropwithreset ignore fail seg h.arch h.ifds h.rttab(ticks of h.ticks) BANDLIM UNLIMITED bndlm bndlm ′ outsegs ∧ enqueue oq list qinfo(oq , outsegs, oq ′)

deliver in 4 tcp: network nonurgent Receive and drop (silently) a non-sane or martian segment

h 〈[iq := iq ]〉 τ−→ h 〈[iq := iq ′]〉

(* Summary: Receive and drop any segment for this host that does not have sensible checksum or offset fields, or one that originates from a martian address. The first part of this condition is a placeholder, awaiting the day when we switch to a non-lossy segment representation, hence the F.
*) dequeue iq(iq , iq ′, ↑(TCP seg)) ∧ seg .is2 = ↑ i2 ∧ is1 = seg .is1 ∧ i2 ∈ local ips(h.ifds) ∧ (F∨ (* placeholder for segment checksum and offset field not sensible *) ¬( T∧ (* placeholder for not a link-layer multicast or broadcast *) ¬(is broadormulticast h.ifds i2)∧ (* seems unlikely, since i1 ∈ local ips h.ifds *) ¬(is1 = ∗) ∧ ¬ is broadormulticast h.ifds(the is1) ) )

deliver in 5 tcp: network nonurgent Receive and drop (maybe with RST) a sane segment that does not match any socket

h 〈[iq := iq ; oq := oq ; bndlm := bndlm]〉 τ−→ h 〈[iq := iq ′; oq := oq ′; bndlm := bndlm ′]〉

(* Summary: Receive and drop any segment for this host that does not match any sockets (but does have sensible checksum and offset fields). Typically, generate RST in response, computing ack and seq to supposedly make the other end see this as an ’acceptable ack’. *)

dequeue iq(iq , iq ′, ↑(TCP seg)) ∧ seg .is2 = ↑ i1 ∧ i1 ∈ local ips(h.ifds) ∧ seg .ps2 = ↑ p1 ∧ seg .is1 ≠ ∗ ∧ seg .ps1 ≠ ∗ ∧ T∧ (* placeholder for segment checksum and offset field sensible *) ¬(∃((sid, sock) :: h.socks)tcp sock . sock .pr = TCP PROTO(tcp sock) ∧ match score(sock .is1, sock .ps1, sock .is2, sock .ps2) (the seg .is1, seg .ps1, the seg .is2, seg .ps2) > 0 ) ∧ dropwithreset seg h.ifds(ticks of h.ticks)BANDLIM RST CLOSEDPORT bndlm bndlm ′ outsegs ′ ∧ enqueue and ignore fail h.arch h.rttab h.ifds outsegs ′ oq oq ′

deliver in 6 tcp: network nonurgent Receive and drop (silently) a sane segment that matches a CLOSED socket

h 〈[iq := iq ]〉 τ−→ h 〈[iq := iq ′]〉

(* Summary: Receive and drop any segment for this host that does not match any sockets (but does have sensible checksum or offset fields). Note that pathological segments where is1, ps1, or ps2 are not set in the segment are not dealt with here but need to be. *)

dequeue iq(iq , iq ′, ↑(TCP seg)) ∧ (∃((sid, sock) :: h.socks)tcp sock .
sock .pr = TCP PROTO(tcp sock) ∧ match score(sock .is1, sock .ps1, sock .is2, sock .ps2) (the seg .is1, seg .ps1, the seg .is2, seg .ps2) > 0 ∧ tcp socket best match h.socks(sid, sock)seg h.arch ∧ tcp sock .st = CLOSED) ∧ seg .is2 = ↑ i1 ∧ i1 ∈ local ips(h.ifds) ∧ T (* placeholder for segment checksum and offset field sensible *)

deliver in 7 tcp: network nonurgent Receive RST and zap non-{CLOSED; LISTEN; SYN SENT; SYN RECEIVED; TIME WAIT} socket

h 〈[ts := ts ⊕ (tid ↦ (tsst)d); socks := socks ⊕ [(sid , sock)]; iq := iq ]〉 τ−→ h 〈[ts := ts ⊕ (tid ↦ (tsst)d); socks := socks ⊕ [(sid , sock ′)]; iq := iq ′]〉

(* Summary: receive RST and silently zap non-{CLOSED; LISTEN; SYN SENT; SYN RECEIVED; TIME WAIT} socket *)

dequeue iq(iq , iq ′, ↑(TCP seg)) ∧ sock = Sock(↑ fid , sf , ↑ i1, ↑ p1, ↑ i2, ↑ p2, es, cantsndmore, cantrcvmore, TCP Sock(st , cb, ∗, sndq , sndurp, rcvq , rcvurp, iobc)) ∧ st ∉ {CLOSED;LISTEN;SYN SENT;SYN RECEIVED;TIME WAIT} ∧ (∃seq discard ack discard URG discard ACK discard PSH discard SYN discard FIN discard win discard ws discard urp discard mss discard ts discard data discard .
seg =〈[ is1 := ↑ i2; is2 := ↑ i1; ps1 := ↑ p2; ps2 := ↑ p1; seq := tcp seq flip sense(seq discard : tcp seq foreign); ack := tcp seq flip sense(ack discard : tcp seq local); URG :=URG discard ; ACK :=ACK discard ; PSH :=PSH discard ; RST :=T; SYN :=SYN discard ; FIN :=FIN discard ; win :=win discard ; ws :=ws discard ; urp := urp discard ; mss :=mss discard ; ts := ts discard ; data := data discard ]〉 ) ∧ ( (* sock .st ∈ {CLOSED;LISTEN;SYN SENT;SYN RECEIVED;TIME WAIT} excluded already above *) if st ∈ {ESTABLISHED;FIN WAIT 1;FIN WAIT 2;CLOSE WAIT} then err = ↑ ECONNRESET else (* sock .st ∈ {CLOSING;LAST ACK} – leave existing error *) err = sock .es) ∧ (* see tcp close (p121) *) sock ′ = tcp close h.arch(sock 〈[ es := err ]〉)

deliver in 7a tcp: network nonurgent Receive RST and zap SYN RECEIVED socket

h 〈[socks := socks ⊕ [(sid , sock)]; iq := iq ]〉 τ−→ h 〈[socks := socks ⊕ socks update ′; iq := iq ′]〉

(* Summary: receive RST and zap SYN RECEIVED socket, removing from listen queue etc. *)

dequeue iq(iq , iq ′, ↑(TCP seg)) ∧ (∃seq discard ack discard URG discard ACK discard PSH discard SYN discard FIN discard win discard ws discard urp discard mss discard ts discard data discard . seg =〈[ is1 := ↑ i2; is2 := ↑ i1; ps1 := ↑ p2; ps2 := ↑ p1; seq := tcp seq flip sense(seq discard : tcp seq foreign); ack := tcp seq flip sense(ack discard : tcp seq local); URG :=URG discard ; ACK :=ACK discard ; PSH :=PSH discard ; RST :=T; SYN :=SYN discard ; FIN :=FIN discard ; win :=win discard ; ws :=ws discard ; urp := urp discard ; mss :=mss discard ; ts := ts discard ; data := data discard ]〉 ) ∧ sid ∉ dom(socks) ∧ sock = Sock(↑ fid , sf , ↑ i1, ↑ p1, ↑ i2, ↑ p2, es, cantsndmore, cantrcvmore, TCP Sock(SYN RECEIVED, cb, ∗, sndq , sndurp, rcvq , rcvurp, iobc)) ∧ ( (* There is a corresponding listening socket – passive open *) (∃(sid ′, lsock) :: socks\\sid .
∃tcp lsock lis q0L q0R lsock ′. lsock .pr = TCP PROTO(tcp lsock) ∧ tcp lsock .st = LISTEN ∧ tcp lsock .lis = ↑ lis ∧ lis.q0 = q0L@ (sid :: q0R) ∧ lsock ′ = lsock 〈[ pr :=TCP PROTO(tcp lsock 〈[ lis := ↑(lis 〈[ q0 := q0L@ q0R]〉)]〉)]〉 ∧ socks update ′ = [(sid ′, lsock ′); (sid , sock ′)] ) ∨ ( (* No corresponding socket – simultaneous open *) socks update ′ = [(sid , sock ′)])) ∧ (* We do not delete the socket entry here because of simultaneous opens. Keep existing error for SYN RECEIVED socket on RST *) sock ′ = (tcp close h.arch sock)〈[ ps1 := if bsd arch h.arch then ∗ else sock .ps1]〉

deliver in 7b tcp: network nonurgent Receive RST and ignore for LISTEN socket

h 〈[socks := socks ⊕ [(sid , sock)]; iq := iq ]〉 τ−→ h 〈[socks := socks ⊕ [(sid , sock)]; iq := iq ′]〉

(* Summary: receive RST and ignore for LISTEN socket *)

dequeue iq(iq , iq ′, ↑(TCP seg)) ∧ sock = Sock(↑ fid , sf , is1, ↑ p1, is2, ps2, es, cantsndmore, cantrcvmore, TCP Sock(LISTEN, cb, lis, sndq , sndurp, rcvq , rcvurp, iobc)) ∧ (* BSD listen bug – since we can call listen() from any state, the peer IP/port may have been set *) ((is2 = ∗ ∧ ps2 = ∗) ∨ (bsd arch h.arch ∧ is2 = ↑ i2 ∧ ps2 = ↑ p2)) ∧ i1 ∈ local ips h.ifds ∧ T∧ (* placeholder for not a link-layer multicast or broadcast *) (* seems unlikely, since i1 ∈ local ips h.ifds *) ¬(is broadormulticast h.ifds i1) ∧ ¬(is broadormulticast h.ifds i2) ∧ (case is1 of ↑ i1 ′ → i1 ′ = i1 ‖ ∗ → T) ∧ (∃seq discard ack discard URG discard ACK discard PSH discard SYN discard FIN discard win discard ws discard urp discard mss discard ts discard data discard .
seg =〈[ is1 := ↑ i2; is2 := ↑ i1; ps1 := ↑ p2; ps2 := ↑ p1; seq := tcp seq flip sense(seq discard : tcp seq foreign); ack := tcp seq flip sense(ack discard : tcp seq local); URG :=URG discard ; ACK :=ACK discard ; PSH :=PSH discard ; RST :=T; SYN :=SYN discard ; FIN :=FIN discard ; win :=win discard ; ws :=ws discard ; urp := urp discard ; mss :=mss discard ; ts := ts discard ; data := data discard ]〉 ) ∧ tcp socket best match(socks\\sid)(sid , sock)seg h.arch (* there does not exist a better socket match to which the segment should be sent *) Rule version: $Id: TCP1 hostLTSScript.sml,v 1.961 2005/03/18 10:34:36 kw217 Exp $ deliver in 7c 317 deliver in 7c tcp: network nonurgent Receive RST and ignore for SYN SENT(unacceptable ack) or TIME WAIT socket h 〈[socks := socks ⊕ [(sid , sock)]; iq := iq ]〉 τ−→ h 〈[socks := socks ⊕ [(sid , sock ′)]; iq := iq ′]〉 (* Summary: receive RST and ignore for SYN SENT(unacceptable ack) or TIME WAIT socket *) dequeue iq(iq , iq ′, ↑(TCP seg)) ∧ sid /∈ dom(socks) ∧ sock = Sock(↑ fid , sf , ↑ i1, ↑ p1, ↑ i2, ↑ p2, es, cantsndmore, cantrcvmore, TCP Sock(st , cb, ∗, sndq , sndurp, rcvq , rcvurp, iobc)) ∧ st ∈ {SYN SENT;TIME WAIT} ∧ (∃seq discard URG discard PSH discard SYN discard FIN discard win discard ws discard urp discard mss discard ts discard data discard . seg =〈[ is1 := ↑ i2; is2 := ↑ i1; ps1 := ↑ p2; ps2 := ↑ p1; seq := tcp seq flip sense(seq discard : tcp seq foreign); ack := tcp seq flip sense(ack : tcp seq local); URG :=URG discard ; ACK :=ACK ; PSH :=PSH discard ; RST :=T; SYN :=SYN discard ; FIN :=FIN discard ; win :=win discard ; ws :=ws discard ; urp := urp discard ; mss :=mss discard ; ts := ts discard ; data := data discard ]〉 ) ∧ (* no- or unacceptable- ACK *) (st = SYN SENT =⇒ (¬ACK ∨ (ACK ∧ ¬(cb.iss < ack ∧ ack ≤ cb.snd max )))) ∧ sock .pr = TCP PROTO(tcp sock) ∧ (if st = TIME WAIT then (* only update if ≥ ESTABLISHED, c.f. 
tcp\_input.c:887 *) sock ′ = sock 〈[ pr :=TCP PROTO(tcp sock 〈[ cb := cb 〈[ t idletime := stopwatch zero; (* just received segment *) tt keep := ↑((())slow timer TCPTV KEEP IDLE)]〉 ]〉)]〉 else (* st = SYN SENT *) (* BSD rcv_wnd bug: the receive window updated code in tcp_input gets executed before the segment is processed, so even for bad segments, it gets updated *) let rcv window = calculate bsd rcv wnd sf tcp sock in sock ′ = sock 〈[ pr :=TCP PROTO(tcp sock 〈[ cb := cb Rule version: $Id: TCP1 hostLTSScript.sml,v 1.961 2005/03/18 10:34:36 kw217 Exp $ deliver in 7d 318 〈[ rcv wnd := if bsd arch h.arch then rcv window else tcp sock .cb.rcv wnd ; rcv adv := if bsd arch h.arch then tcp sock .cb.rcv nxt + rcv window else tcp sock .cb.rcv adv ]〉 ]〉)]〉) deliver in 7d tcp: network nonurgent Receive RST and zap SYN SENT(acceptable ack) socket h 〈[socks := socks ⊕ [(sid , sock)]; iq := iq ]〉 τ−→ h 〈[socks := socks ⊕ [(sid , sock ′)]; iq := iq ′]〉 (* Summary Receiving an acceptable-ack RST segment: kill the connection and set the socket’s error field appropri- ately, unless we are WinXP where we simply ignore the RST. *) dequeue iq(iq , iq ′, ↑(TCP seg)) ∧ sid /∈ dom(socks) ∧ sock = Sock(↑ fid , sf , ↑ i1, ↑ p1, ↑ i2, ↑ p2, es, cantsndmore, cantrcvmore, TCP Sock(SYN SENT, cb, ∗, sndq , sndurp, rcvq , rcvurp, iobc)) ∧ (∃seq discard URG discard PSH discard SYN discard FIN discard win discard ws discard urp discard mss discard ts discard data discard . 
seg =〈[ is1 := ↑ i2; is2 := ↑ i1; ps1 := ↑ p2; ps2 := ↑ p1; seq := tcp seq flip sense(seq discard : tcp seq foreign); ack := tcp seq flip sense(ack : tcp seq local); URG :=URG discard ; ACK :=T; PSH :=PSH discard ; RST :=T; SYN :=SYN discard ; FIN :=FIN discard ; win :=win discard ; ws :=ws discard ; urp := urp discard ; mss :=mss discard ; ts := ts discard ; data := data discard ]〉 ) ∧ cb.iss < ack ∧ ack ≤ cb.snd max∧ (* acceptable ack *) (if windows arch h.arch then sock ′ = sock (* Windows XP just ignores RST’s with a valid ack during connection establishment *) else (∃err . err ∈ {ECONNREFUSED;ECONNRESET}∧ (* Note it is unclear whether or not this error will overwrite any existing error on the socket *) sock ′ = (tcp close h.arch sock)〈[ ps1 := if bsd arch h.arch then ∗ else sock .ps1; es := ↑ err ]〉)) Rule version: $Id: TCP1 hostLTSScript.sml,v 1.961 2005/03/18 10:34:36 kw217 Exp $ deliver in 8 319 deliver in 8 tcp: network nonurgent Receive SYN in non-{CLOSED; LISTEN; SYN SENT; TIME WAIT} state h 〈[socks := socks ⊕ [(sid , sock)]; iq := iq ; oq := oq ; bndlm := bndlm]〉 τ−→ h 〈[socks := socks ⊕ [(sid , sock ′)]; iq := iq ′; oq := oq ′; bndlm := bndlm ′]〉 (* Summary: Receive a SYN in non-{CLOSED; LISTEN; SYN SENT; TIME WAIT} state. Drop it and (de- pending on the architecture) generate a RST. *) dequeue iq(iq , iq ′, ↑(TCP seg)) ∧ sid /∈ dom(socks) ∧ sock = Sock(↑ fid , sf , ↑ i1, ↑ p1, ↑ i2, ↑ p2, es, cantsndmore, cantrcvmore, TCP Sock(st , cb, ∗, sndq , sndurp, rcvq , rcvurp, iobc)) ∧ (∃ws discard mss discard . 
seg =〈[ is1 := ↑ i2; is2 := ↑ i1; ps1 := ↑ p2; ps2 := ↑ p1; seq := tcp seq flip sense(seq : tcp seq foreign); ack := tcp seq flip sense(ack : tcp seq local); URG :=URG ; ACK :=ACK ; PSH :=PSH ; RST :=F; SYN :=T; FIN :=FIN ; win :=win; ws :=ws discard ; urp := urp; mss :=mss discard ; ts := ts; data := data ]〉 ) ∧ (* Note that it may be the case that this rule should only apply when the SYN is in the trimmed window, should not it?; it’s OK if there’s a SYN bit set, for example in a retransmission. *) st /∈ {CLOSED;LISTEN;SYN SENT;TIME WAIT} ∧ sock .pr = TCP PROTO(tcp sock) ∧ let t idletime ′ = stopwatch zero in let tt keep′ = if tcp sock .st 6= SYN RECEIVED then ↑((())slow timer TCPTV KEEP IDLE) else tcp sock .cb.tt keep in let tt fin wait 2 ′ = if tcp sock .st = FIN WAIT 2 then ↑((())slow timer TCPTV MAXIDLE) else tcp sock .cb.tt fin wait 2 in sock ′ = sock 〈[ pr :=TCP PROTO(tcp sock 〈[ cb := tcp sock .cb 〈[ tt keep := tt keep′; Rule version: $Id: TCP1 hostLTSScript.sml,v 1.961 2005/03/18 10:34:36 kw217 Exp $ deliver in 9 320 tt fin wait 2 := tt fin wait 2 ′; t idletime := t idletime ′]〉 ]〉)]〉 ∧ (if bsd arch h.arch then make rst segment from cb tcp sock .cb(i1, i2, p1, p2)seg ′ else T) ∧ dropwithreset seg h.ifds(ticks of h.ticks)BANDLIM UNLIMITED bndlm bndlm ′ outsegs ∧ outsegs ′ = (if bsd arch h.arch then (TCP(seg ′)) :: outsegs else outsegs) ∧ enqueue each and ignore fail h.arch h.rttab h.ifds outsegs ′ oq oq ′ deliver in 9 tcp: network nonurgent Receive SYN in TIME WAIT state if there is no matching LISTEN socket or sequence number has not increased h 〈[socks := socks ⊕ [(sid , sock)]; iq := iq ; oq := oq ; bndlm := bndlm]〉 τ−→ h 〈[socks := socks ⊕ [(sid , sock)]; iq := iq ′; oq := oq ′; bndlm := bndlm ′]〉 (* Summary: Receive a SYN in TIME WAIT} state where there is no matching LISTEN socket. Drop it and (depending on the architecture) generate a RST. 
*)
dequeue_iq(iq, iq′, ↑(TCP seg)) ∧
sid ∉ dom(socks) ∧
sock = Sock(↑fid, sf, ↑i1, ↑p1, ↑i2, ↑p2, es, cantsndmore, cantrcvmore,
            TCP_Sock(TIME_WAIT, cb, ∗, sndq, sndurp, rcvq, rcvurp, iobc)) ∧
(∃ws_discard mss_discard.
   seg = 〈[ is1 := ↑i2; is2 := ↑i1; ps1 := ↑p2; ps2 := ↑p1;
           seq := tcp_seq_flip_sense(seq : tcp_seq_foreign);
           ack := tcp_seq_flip_sense(ack : tcp_seq_local);
           URG := URG; ACK := ACK; PSH := PSH; RST := F; SYN := T; FIN := FIN;
           win := win; ws := ws_discard; urp := urp; mss := mss_discard; ts := ts; data := data ]〉
) ∧
(* no matching LISTEN socket, or the sequence number has not increased *)
((seq ≤ (tcp_sock_of sock).cb.rcv_nxt) ∨
 ¬(∃((sid, sock) :: socks) tcp_sock.
      sock.pr = TCP_PROTO(tcp_sock) ∧
      tcp_sock.st = LISTEN ∧
      sock.is1 ∈ {∗; ↑i1} ∧
      sock.ps1 = ↑p1)
) ∧
(if bsd_arch h.arch then make_rst_segment_from_cb cb (i1, i2, p1, p2) seg′ else T) ∧
dropwithreset seg h.ifds (ticks_of h.ticks) BANDLIM_RST_CLOSEDPORT bndlm bndlm′ outsegs ∧
outsegs′ = (if bsd_arch h.arch then (TCP(seg′)) :: outsegs else outsegs) ∧
enqueue_each_and_ignore_fail h.arch h.rttab h.ifds outsegs′ oq oq′

(* This rule does not appear in the BSD code; what happens there is that the old TIME_WAIT state socket is closed, and then the code jumps back to the top. So this rule covers the case where it then discovers nothing else is listening, like deliver_in_5. *)

Chapter 17
Host LTS: TCP Output

17.1 Output (TCP only)

A TCP implementation would typically perform output deterministically, e.g. during the processing of a received segment it may construct and enqueue an acknowledgement segment to be emitted. This means that the detailed behaviour of a particular implementation depends on exactly where the output routines are called, affecting when segments are emitted.
The contents of an emitted segment, on the other hand, must usually be determined by the socket state (especially the tcpcb), not from transient program variables, so that retransmissions can be performed.

In this specification we choose to be somewhat nondeterministic, loosely specifying when common-case TCP output occurs. This simplifies the modelling of existing implementations (avoiding the need to capture the code points at which the output routines are called) and should mean the specification is closer to capturing the set of all reasonable implementations.

A significant defect in the current specification is that it does not impose a very tight lower bound on how often output takes place. The satisfactory dynamic behaviour of TCP connections depends on an “ACK clock” property, with receivers acknowledging data sufficiently often to update the sender’s send window. Characterising this may need additional constraints.

The rule presented in this chapter describes TCP output in the common case, i.e. the behaviour of TCP when emitting a non-SYN, non-RST segment. The whole behaviour is captured by the single rule deliver_out_1, which relies upon the auxiliary functions tcp_output_required (p111) and tcp_output_really (p113). Output (strictly, adding segments to the host’s output queue) may take place whenever this rule can fire; it does construct the output segments purely from the socket state.

The two auxiliary functions are loosely based on BSD’s TCP output function, which can be logically divided into two halves. The first of these, to some approximation, is a guard that prevents output from occurring unless it is valid to do so, and the second actually creates a segment and passes it to the IP layer for output. This distinction is mirrored in the specification, with tcp_output_required acting as the guard and tcp_output_really forming the segment ready to be appended to the host’s output queue.
Unfortunately it is not possible to be as clean here as one might hope, because under some circumstances tcp_output_required may have side-effects. It should be noted that tcp_output_really only creates a segment and does not perform any “output”; the act of adding the segment (perhaps unreliably) to the host’s output queue is the job of the caller.

The output cases not covered by deliver_out_1 are handled specially and often in a more deterministic way. Segments with the SYN flag set are created by the auxiliary functions make_syn_segment (p106) and make_syn_ack_segment (p107) and are output deterministically in response to either user events or segment input. SYN segments are emitted by the rules commonly involved in connection establishment, namely connect_1, deliver_in_1, deliver_in_2, timer_tt_rexmtsyn_1 and timer_tt_rexmt_1, and are special-cased in this way for clarity because connection establishment performs extra work such as option negotiation and state initialisation.

RST segments are created by the auxiliaries make_rst_segment_from_cb (p109) and make_rst_segment_from_seg (p110), which are used by the rules that require a reset segment to be emitted in response to a user event, e.g. a close() call on a socket with a zero linger time, or as a socket’s response to receiving some types of invalid segment.

In a few places, mainly in the specification of certain congestion control methods, some rules use tcp_output_really (p113) or the wrapper functions tcp_output_perhaps (p116) and mlift_tcp_output_perhaps_or_fail (p118) directly and, more importantly, deterministically. This is partly for clarity, perhaps because an RFC states that output “MUST” occur at that point, and partly for convenience, possibly because the model would require much extra state (hence adding unnecessary complexity) if the output function was not used in-place.
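The division of labour between the guard and the segment constructor described in this section can be sketched in Python. This is only a loose illustration under stated assumptions, not the HOL definitions: the `Socket` fields, the toy guard condition, the dictionary used as a segment, and the `output_step` caller are all hypothetical simplifications. Only the two function names, the idea that the guard may return a side-effect, and the fact that enqueueing is left to the caller come from the text.

```python
from dataclasses import dataclass, replace
from typing import Callable, List, Optional, Tuple


@dataclass(frozen=True)
class Socket:
    """Toy socket state; every field here is a hypothetical simplification."""
    snd_una: int                    # oldest unacknowledged sequence number
    snd_nxt: int                    # next sequence number to send
    snd_wnd: int                    # bytes the peer will currently accept
    sndq: bytes                     # send queue, starting at snd_una
    persist_running: bool = False   # stand-in for the persist timer


SideEffect = Callable[[Socket], Socket]


def tcp_output_required(sock: Socket) -> Tuple[bool, Optional[SideEffect]]:
    """Guard: may output happen, and is there a side-effect to apply?"""
    unsent = len(sock.sndq) - (sock.snd_nxt - sock.snd_una)
    if unsent > 0 and sock.snd_wnd > 0:
        return True, None
    if unsent > 0 and not sock.persist_running:
        # No output is possible, but the (toy) persist timer should start.
        return False, lambda s: replace(s, persist_running=True)
    return False, None


def tcp_output_really(sock: Socket) -> Tuple[Socket, List[dict]]:
    """Constructor: build a segment from socket state only; never enqueues."""
    offset = sock.snd_nxt - sock.snd_una
    data = sock.sndq[offset:offset + sock.snd_wnd]
    segment = {"seq": sock.snd_nxt, "data": data}
    return replace(sock, snd_nxt=sock.snd_nxt + len(data)), [segment]


def output_step(sock: Socket, oq: List[dict]) -> Optional[Tuple[Socket, List[dict]]]:
    """Caller: enabled only if output is required or a side-effect is pending."""
    do_output, side_effect = tcp_output_required(sock)
    if not do_output and side_effect is None:
        return None                       # nothing to do: the step is not enabled
    sock0 = side_effect(sock) if side_effect is not None else sock
    if do_output:
        sock1, segments = tcp_output_really(sock0)
        return sock1, oq + segments       # enqueueing is the caller's job
    return sock0, oq
```

Keeping the constructor free of any queue manipulation means the same function can be reused by callers with different failure-handling policies, which is the design point the text makes about tcp_output_really.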
The tcp_output_perhaps function almost entirely mimics an implementation’s TCP output function. It calls tcp_output_required to check that output can take place, applying any side-effects that it returns, and finally creates the segment with tcp_output_really. See tcp_output_perhaps (p116) and mlift_tcp_output_perhaps_or_fail (p118) for more information.

Other auxiliary functions are involved in TCP output and are described earlier. Once a segment has been constructed it is added to the host’s output queue by one of enqueue_or_fail (p118), enqueue_or_fail_sock (p118), enqueue_and_ignore_fail (p118), enqueue_each_and_ignore_fail (p118) or mlift_tcp_output_perhaps_or_fail (p118). These functions are used by deliver_out_1 and other rules in the specification to non-deterministically add a segment to the host’s output queue. In the common case, a segment is added to the host’s output queue successfully. In other cases, the auxiliary function rollback_tcp_output (p117) may assert that a segment is unroutable and prevent the segment from being added to the queue. Some failures are non-deterministic in order to model “out of resource” style errors, although most are deterministic routing failures determined from the socket and host states. rollback_tcp_output has a second task: to “undo” several of the socket’s control block changes upon an error condition. Some of the enqueue functions ignore failure, e.g. enqueue_and_ignore_fail; upon an error they simply fail to queue the segment and do not update the socket with the “rolled-back” control block returned by rollback_tcp_output.

17.1.1 Summary

deliver_out_1   tcp: network nonurgent   Common case TCP output

17.1.2 Rules

deliver_out_1   tcp: network nonurgent   Common case TCP output

h 〈[socks := socks ⊕ [(sid, sock)]; oq := oq]〉
  τ−→
h 〈[socks := socks ⊕ [(sid, sock′′)]; oq := oq′]〉

(* Summary: output TCP segment if possible. In some cases update the socket’s persist timer without performing output.
*)
(* The TCP socket is connected *)
sid ∉ dom(socks) ∧
sock = Sock(fid, sf, ↑i1, ↑p1, ↑i2, ↑p2, es, cantsndmore, cantrcvmore, TCP_PROTO(tcp_sock)) ∧
tcp_sock = TCP_Sock0(st, cb, ∗, sndq, sndurp, rcvq, rcvurp, iobc) ∧
(* and either is in a synchronised state with initial SYN acknowledged... *)
((st ∈ {ESTABLISHED; CLOSE_WAIT; FIN_WAIT_1; FIN_WAIT_2; CLOSING; LAST_ACK; TIME_WAIT} ∧
  cb.snd_una ≠ cb.iss) ∨
 (* ...or is in the SYN_SENT or SYN_RECEIVED state and a FIN needs to be emitted *)
 (st ∈ {SYN_SENT; SYN_RECEIVED} ∧ cantsndmore ∧ cb.tf_shouldacknow)
) ∧
(* A segment will be emitted if tcp_output_required asserts that a segment can be output (do_output). If tcp_output_required returns a function to alter the socket’s persist timer (persist_fun), then this does not of itself mean that a segment is required, however deliver_out_1 should still fire to allow the update to take place. *)
let (do_output, persist_fun) = tcp_output_required h.arch h.ifds sock in
(do_output ∨ persist_fun ≠ ∗) ∧
(* Apply any persist timer side-effect from tcp_output_required *)
let sock0 = option_case sock (λf. sock 〈[ pr := TCP_PROTO(tcp_sock 〈[ cb :=ˆ f ]〉) ]〉) persist_fun in
(if do_output then (* output a segment *)
   (* Construct the segment to emit, updating the socket’s state *)
   tcp_output_really h.arch F (ticks_of h.ticks) h.ifds sock0 (sock′, outsegs′) ∧
   sock′.pr = TCP_PROTO(tcp_sock′) ∧
   (* Add the segment to the host’s output queue, rolling back the socket’s control block state if an error occurs *)
   enqueue_or_fail_sock (tcp_sock′.st ∈ {CLOSED; LISTEN; SYN_SENT}) h.arch h.rttab h.ifds outsegs′ oq sock0 sock′ (sock′′, oq′)
 else
   (* Do not output a segment, but ensure things are tidied up *)
   oq = oq′ ∧
   sock′′ = sock0
)

Chapter 18
Host LTS: TCP Timers

18.1 Timers (TCP only)

18.1.1 Summary

timer_tt_rexmtsyn_1     tcp: misc nonurgent      SYN retransmit timer expires
timer_tt_rexmt_1        tcp: misc nonurgent      retransmit timer expires
timer_tt_persist_1      tcp: misc nonurgent      persist timer expires
timer_tt_keep_1         tcp: network nonurgent   keepalive timer expires
timer_tt_2msl_1         tcp: misc nonurgent      2*MSL timer expires
timer_tt_delack_1       tcp: misc nonurgent      delayed-ACK timer expires
timer_tt_conn_est_1     tcp: misc nonurgent      connection establishment timer expires
timer_tt_fin_wait_2_1   tcp: misc nonurgent      FIN_WAIT_2 timer expires

18.1.2 Rules

timer_tt_rexmtsyn_1   tcp: misc nonurgent   SYN retransmit timer expires

h 〈[socks := socks ⊕ [(sid, sock)]; oq := oq]〉
  τ−→
h 〈[socks := socks ⊕ [(sid, sock′)]; oq := oq′]〉

sock.pr = TCP_PROTO(tcp_sock) ∧
tcp_sock.cb.tt_rexmt = ↑(((RexmtSyn, shift))d) ∧
timer_expires d ∧ (* timer has expired *)
tcp_sock.st = SYN_SENT ∧ (* this rule is incomplete: RexmtSyn is possible in other states, since deliver_in_2 may change state without clearing tt_rexmt *)
cb = tcp_sock.cb ∧
(if shift + 1 ≥ TCP_MAXRXTSHIFT then
   (* Timer has expired too many times.
Drop and close the connection *) (* since socket state is SYN SENT, no segments can be output *) tcp drop and close h.arch(↑ ETIMEDOUT)sock(sock ′, [ ]) ∧ oq ′ = oq else (* Update the control block based upon the number of occasions on which the timer expired *) (if shift + 1 = 1 ∧ cb.t rttinf .tf srtt valid then (* On the first retransmit store values for recovery from a bad retransmit *) (* we cannot guess the safe window for this if we do not know the RTT, hence the second condition *) 325 timer tt rexmtsyn 1 326 snd cwnd prev ′ = cb.snd cwnd ∧ snd ssthresh prev ′ = cb.snd ssthresh ∧ t badrxtwin ′ = (())TimeWindowkern timer(time(cb.t rttinf .t srtt/2)) (* kern timer for a ticks-based deadline *) else (* Otherwise keep the previous values *) snd cwnd prev ′ = cb.snd cwnd prev ∧ snd ssthresh prev ′ = cb.snd ssthresh prev ∧ t badrxtwin ′ = cb.t badrxtwin (* should be TimeWindowClosed, since retransmit timer is always longer than t srtt/2 *) ) ∧ (if (shift + 1 = 3) ∧ ¬(linux arch h.arch) then (* On the third retransmit turn off window scaling and times- tamping options *) tf req tstmp′ = F ∧ request r scale ′ = ∗ else (* Otherwise keep the previous values *) tf req tstmp′ = cb.tf req tstmp ∧ request r scale ′ = cb.request r scale ) ∧ let t rttinf ′ = (if shift + 1 > TCP MAXRXTSHIFT div 4 then (* Invalidate the recorded smoothed round-trip time for the connection after TCP MAXRXTSHIFT div 4 retransmits *) (* Note that the BSD code adjusts the srtt and rttvar values here to ensure that if it does not get a new rtt measurement before the next retransmit it can still use the existing values. 
We do not need to do this for two reasons: (1) we have a flag to invalidate the srtt values (the only reason BSD updates srtt to be zero and hacks rrttvar is to mark it invalid and request a new rtt update), and (2) the BSD RTTVAR BUG does not affect SYN retransmits in any case (because for SYN retransmits srtt is zero and BSD hacks up rttvar appropriately at the start of a new connection to make everything just work) *) (* Note that the socket’s route should be discarded. *) cb.t rttinf 〈[ tf srtt valid :=F]〉 else cb.t rttinf ) in cb′ = cb 〈[ (* Restart the rexmt timer to time the retransmitted SYN *) tt rexmt := start tt rexmtsyn h.arch(shift + 1)F cb.t rttinf ; (* reset to next backoff point *) t badrxtwin := t badrxtwin ′; t rttinf := t rttinf ′ 〈[ t lastshift := shift + 1; t wassyn :=T]〉; tf req tstmp := tf req tstmp′; request r scale := request r scale ′; snd nxt := cb.iss + 1; (* value after sending SYN *) snd recover := cb.iss + 1; (* value after sending SYN *) t rttseg := ∗; snd cwnd := cb.t maxseg ; (* Calculation as per BSD *) snd ssthresh := cb.t maxseg ∗max 2(min cb.snd wnd cb.snd cwnd div(2 ∗ cb.t maxseg)); snd cwnd prev := snd cwnd prev ′; snd ssthresh prev := snd ssthresh prev ′; t dupacks := 0]〉 ∧ (∃i1 i2 p1 p2.(sock .is1, sock .is2, sock .ps1, sock .ps2) = (↑ i1, ↑ i2, ↑ p1, ↑ p2) ∧ (* Create the segment to be retransmitted *) choose seg ′ :: (make syn segment cb′(i1, i2, p1, p2)(ticks of h.ticks)). 
Rule version: $Id: TCP1 hostLTSScript.sml,v 1.961 2005/03/18 10:34:36 kw217 Exp $ timer tt rexmt 1 327 (* Attempt to add the new segment to the host’s output queue, constraining the final control block state *) enqueue or fail F h.arch h.rttab h.ifds[TCP seg ′]oq (cb 〈[ snd nxt := cb.iss; tt delack := ∗; last ack sent := tcp seq foreign 0w; rcv adv := tcp seq foreign 0w ]〉)cb′(cb′′, oq ′) ) ∧ sock ′ = sock 〈[ pr :=TCP PROTO(tcp sock 〈[ cb := cb′′]〉)]〉 ) timer tt rexmt 1 tcp: misc nonurgent retransmit timer expires h 〈[socks := socks ⊕ [(sid , sock)]; oq := oq ]〉 τ−→ h 〈[socks := socks ⊕ [(sid , sock ′′)]; oq := oq ′]〉 sock .pr = TCP PROTO(tcp sock) ∧ sock ′.pr = TCP PROTO(tcp sock ′) ∧ (tcp sock .st /∈ {CLOSED;LISTEN;SYN SENT;CLOSE WAIT;FIN WAIT 2;TIME WAIT} ∨ (tcp sock .st = LISTEN ∧ bsd arch h.arch)) ∧ tcp sock .cb.tt rexmt = ↑(((Rexmt, shift))d) ∧ timer expires d ∧ cb = tcp sock .cb ∧ (if shift + 1 > (if tcp sock .st = SYN RECEIVED then TCP SYNACKMAXRXTSHIFT else TCP MAXRXTSHIFT) then (* Note that BSD’s syncaches have a much lower threshold for retransmitting SYN,ACKs than normal *) (* drop connection *) tcp drop and close h.arch(↑ ETIMEDOUT)sock(sock ′, [TCP seg ′]) (* will always get exactly one segment *) else (* on first retransmit, store values for recovery from bad retransmit *) (* we cannot guess the safe window for this if we do not know the RTT, hence the second condition *) (if shift + 1 = 1 ∧ cb.t rttinf .tf srtt valid then snd cwnd prev ′ = cb.snd cwnd ∧ snd ssthresh prev ′ = cb.snd ssthresh ∧ t badrxtwin ′ = (())TimeWindowkern timer(time(cb.t rttinf .t srtt/2)) (* kern timer for a ticks-based deadline *) else snd cwnd prev ′ = cb.snd cwnd prev ∧ snd ssthresh prev ′ = cb.snd ssthresh prev ∧ t badrxtwin ′ = cb.t badrxtwin)∧ (* should be TimeWindowClosed, since retransmit timer is always longer than t srtt/2 *) (* NB: The socket is not in SYN SENT here; the rexmt timer has been split into two, and SYN SENT uses tt rexmtsyn. 
*) let t rttinf ′ = (if shift + 1 > TCP MAXRXTSHIFT div 4 then (* Note that the socket’s route should be discarded. *) cb.t rttinf 〈[ tf srtt valid :=F; t srtt :=ˆ(cb.t rttinf .t srtt/4) onlywhen(bsd arch h.arch ∧ BSD RTTVAR BUG) Rule version: $Id: TCP1 hostLTSScript.sml,v 1.961 2005/03/18 10:34:36 kw217 Exp $ timer tt rexmt 1 328 ]〉 else cb.t rttinf ) in (* backoff the timer and do a retransmit *) cb′ = cb 〈[ tt rexmt := start tt rexmt h.arch(shift + 1)F cb.t rttinf ; (* reset to next backoff point *) (* tcp output really touches this again, but actually leaves it the same, unless sock .snd urp is set and win0 6= 0, weirdly *) t badrxtwin := t badrxtwin ′; t rttinf := t rttinf ′ 〈[ t lastshift := shift + 1; t wassyn :=F ]〉; snd nxt := cb.snd una; (* want to retransmit from snd una *) snd recover := cb.snd max ; t rttseg := ∗; snd cwnd := cb.t maxseg ; snd ssthresh := cb.t maxseg ∗max 2(min cb.snd wnd cb.snd cwnd div(2 ∗ cb.t maxseg)); snd cwnd prev := snd cwnd prev ′; snd ssthresh prev := snd ssthresh prev ′; t dupacks := 0]〉 ∧ (if tcp sock .st = SYN RECEIVED then (∃i1 i2 p1 p2. (* If we’re Linux doing a simultaneous open and support timestamping then ensure timestamping is enabled in any retransmitted SYN,ACK segments. See deliver in 2 for the rationale in full, but in short Linux is RFC1323 compliant and makes a hash of option negotiation during a simultaneous open. We make the option decision early (as per the RFC and BSD) and have to hack up SYN,ACK segments to contain timestamp options if the Linux host supports timestamping. *) (* Note: this behaviour is also safe if we are here due to a passive open. In this case, if the remote end does not support timestamping, tf req tstmp is F due to the option negotiation in deliver in 1 . Then tf doing tstmp is necessarily F too and the retransmitted SYN,ACK segment does not contain a timestamp. OTOH, if tf req tstmp is still T then so is tf doing tstmp and the faked up cb below is safe. 
*) (* Note that similar to the above note on timestamping, window scaling may also have to be dealt with here. *) let cb′′′ = (if ((linux arch h.arch) ∧ cb.tf req tstmp) then cb′ 〈[ tf req tstmp :=T; tf doing tstmp :=T]〉 else cb′) in (* Note that tt delack and possibly other timers should be cleared here *) (sock .is1, sock .is2, sock .ps1, sock .ps2) = (↑ i1, ↑ i2, ↑ p1, ↑ p2) ∧ (* We are in SYN RECEIVED and want to retransmit the SYN,ACK, so we either got here via deliver in 1 or deliver in 2 . In both cases, calculate buf sizes was used to set cb.t maxseg to the correct value (as per tcp_mss() in BSD), however, we need to use the old values in retransmitting the SYN,ACK, as per tcp_mssopt() in BSD. make syn ack segment therefore uses the value stored in cb.t advmss to set the same mss option in the segment, so we do not need to do anything special here. *) seg ′ ∈ make syn ack segment cb′′′(i1, i2, p1, p2)(ticks of h.ticks) ∧ (* We need to remember to add the length of the segment data (i.e. 1 for a SYN) back onto snd nxt in the cb, since this is what tcp output really does for normal retransmits. If we do not do this, then we’ll end up trying to send the first lot of data with a seq of iss, rather than iss + 1 *) sock ′ = sock 〈[ pr :=TCP PROTO(tcp sock 〈[ cb := cb′ 〈[ snd nxt := cb′.snd nxt + 1]〉]〉)]〉 ) Rule version: $Id: TCP1 hostLTSScript.sml,v 1.961 2005/03/18 10:34:36 kw217 Exp $ timer tt keep 1 329 else if tcp sock .st = LISTEN then (* BSD LISTEN bug: in BSD it is possible to transition a socket to the LISTEN state without cancelling the rexmt timer. In this case, segments are emitted with no flags set. *) bsd arch h.arch ∧ (∃i1 i2 p1 p2. (sock .is1, sock .is2, sock .ps1, sock .ps2) = (↑ i1, ↑ i2, ↑ p1, ↑ p2) ∧ seg ′ ∈ bsd make phantom segment cb′(i1, i2, p1, p2)(ticks of h.ticks)(sock .cantsndmore)) ∧ (* Retransmission only continues if FIN is set in the outgoing segment (really!) 
*) sock ′ = sock 〈[ pr :=TCP PROTO(tcp sock 〈[ cb := cb′ 〈[ tt rexmt :=ˆ ∗ onlywhen¬seg ′.FIN ]〉]〉)]〉 else (* ESTABLISHED,FIN WAIT 1,CLOSING,LAST ACK *) (* i.e., cannot be CLOSED,LISTEN,SYN SENT,CLOSE WAIT,FIN WAIT 2,TIME WAIT *) tcp output really h.arch F(ticks of h.ticks)h.ifds (sock 〈[ pr :=TCP PROTO(tcp sock 〈[ cb := cb′]〉)]〉) (sock ′, [TCP seg ′]) (* always emits exactly one segment *) ) ) ∧ enqueue or fail T h.arch h.rttab h.ifds[TCP seg ′]oq cb′ tcp sock ′.cb(cb′′, oq ′) ∧ sock ′′ = sock ′ 〈[ pr :=TCP PROTO(tcp sock ′ 〈[ cb := cb′′]〉)]〉 timer tt persist 1 tcp: misc nonurgent persist timer expires h 〈[socks := socks ⊕ [(sid , sock)]; oq := oq ]〉 τ−→ h 〈[socks := socks ⊕ [(sid , sock ′′)]; oq := oq ′]〉 sock .pr = TCP PROTO(tcp sock) ∧ sock ′.pr = TCP PROTO(tcp sock ′) ∧ tcp sock .cb.tt rexmt = ↑(((Persist, shift))d) ∧ timer expires d ∧ let sock0 = sock 〈[ pr :=TCP PROTO(tcp sock 〈[ cb := tcp sock .cb 〈[ tt rexmt := start tt persist(shift + 1)tcp sock .cb.t rttinf h.arch]〉]〉)]〉 in tcp output really h.arch T (* T indicates a window probe is requested *) (ticks of h.ticks)h.ifds sock0 (sock ′, outsegs ′) ∧ enqueue or fail sock(tcp sock ′.st ∈ {CLOSED;LISTEN;SYN SENT})h.arch h.rttab h.ifds outsegs ′ oq sock0 sock ′(sock ′′, oq ′) timer tt keep 1 tcp: network nonurgent keepalive timer expires h 〈[socks := socks ⊕ [(sid ,Sock(↑ fid , sf , ↑ i1, ↑ p1, ↑ i2, ↑ p2, es, cantsndmore, cantrcvmore, TCP Sock(st , cb, ∗, sndq , sndurp, rcvq , rcvurp, iobc)))]; oq := oq ]〉 Rule version: $Id: TCP1 hostLTSScript.sml,v 1.961 2005/03/18 10:34:36 kw217 Exp $ timer tt 2msl 1 330 τ−→ h 〈[socks := socks ⊕ [(sid ,Sock(↑ fid , sf , ↑ i1, ↑ p1, ↑ i2, ↑ p2, es, cantsndmore, cantrcvmore, TCP Sock(st , cb′, ∗, sndq , sndurp, rcvq , rcvurp, iobc)))]; oq := oq ′]〉 (* Note that in another rule the following needs to be specified: if the timer has expired for the last time, then (in another rule): (if HAVERCVDSYN (i.e., not CLOSED/LISTEN/SYN SENT) then send a RST else do not do anything yet) ∧ 
copy soft error to es ∧ free tcpcb, saving RTT *) cb.tt keep = ↑((())d) ∧ timer expires d ∧ (* Note the following condition also needs to be investigated: cb.t rcvtime+tcp keepidle+tcp keepcnt ∗tcp keepintvl < NOW ∧ – still probing *) (∃win . w2n win = cb.rcv wnd cb.rcv scale ∧ let ts = if cb.tf doing tstmp then let ts ecr ′ = option case (ts seq 0w) I (timewindow val of cb.ts recent) in ↑((ticks of h.ticks), ts ecr ′) else ∗ in seg =〈[ is1 := ↑ i2; is2 := ↑ i1; ps1 := ↑ p2; ps2 := ↑ p1; seq := cb.snd una − 1; (* deliberately outside window *) ack := cb.rcv nxt ; URG :=F; ACK :=T; PSH :=F; RST :=F; SYN :=F; FIN :=F; win :=win ; ws := ∗; urp := 0w; mss := ∗; ts := ts; data :=[ ] ]〉) ∧ enqueue and ignore fail h.arch h.rttab h.ifds[TCP seg ]oq oq ′ ∧ cb′ = cb 〈[ tt keep := ↑((())slow timer TCPTV KEEPINTVL); last ack sent := seg .ack ]〉 timer tt 2msl 1 tcp: misc nonurgent 2*MSL timer expires h 〈[socks := socks ⊕ [(sid , sock)]]〉 τ−→ h 〈[socks := socks ⊕ [(sid , sock ′)]]〉 (* Summary: When the 2MSL TIME WAIT period expires, the socket is closed. *) Rule version: $Id: TCP1 hostLTSScript.sml,v 1.961 2005/03/18 10:34:36 kw217 Exp $ timer tt fin wait 2 1 331 sock .pr = TCP PROTO(tcp sock) ∧ tcp sock .cb.tt 2msl = ↑((())d) ∧ timer expires d ∧ sock ′ = tcp close h.arch sock timer tt delack 1 tcp: misc nonurgent delayed-ACK timer expires h 〈[socks := socks ⊕ [(sid , sock)]; oq := oq ]〉 τ−→ h 〈[socks := socks ⊕ [(sid , sock ′′)]; oq := oq ′]〉 sock .pr = TCP PROTO(tcp sock) ∧ sock ′.pr = TCP PROTO(tcp sock ′) ∧ tcp sock .cb.tt delack = ↑((())d) ∧ timer expires d ∧ let sock0 = sock 〈[ pr :=TCP PROTO(tcp sock 〈[ cb := tcp sock .cb 〈[ tt delack := ∗]〉]〉)]〉 in tcp output really h.arch F(ticks of h.ticks)h.ifds sock0(sock ′, outsegs ′) ∧ enqueue or fail sock(tcp sock ′.st ∈ {CLOSED;LISTEN;SYN SENT})h.arch h.rttab h.ifds outsegs ′ oq sock0 sock ′(sock ′′, oq ′) Description This overlaps with deliver out 1 . 
This is a bit odd, but is a consequence of our liberal nondeterministic TCP output.

timer_tt_conn_est_1   tcp: misc nonurgent   connection establishment timer expires

h 〈[socks := socks ⊕ [(sid, sock)]; oq := oq]〉
  τ−→
h 〈[socks := socks ⊕ [(sid, sock′)]; oq := oq′]〉

(* Summary: If the connection-establishment timer goes off, drop the connection (possibly RSTing the other end). *)
sock.pr = TCP_PROTO(tcp_sock) ∧
tcp_sock.cb.tt_conn_est = ↑((())d) ∧
timer_expires d ∧
tcp_drop_and_close h.arch (↑ETIMEDOUT)
  (sock 〈[ pr := TCP_PROTO(tcp_sock 〈[ cb := tcp_sock.cb 〈[ tt_conn_est := ∗ ]〉 ]〉) ]〉) (sock′, outsegs) ∧
(* Note it should be the case that the socket is in SYN_SENT, and so outsegs will be empty, but that is not definite. *)
enqueue_and_ignore_fail h.arch h.rttab h.ifds outsegs oq oq′

Description
POSIX says, in the INFORMATIVE section APPLICATION USAGE, that the state of the socket is unspecified if connect() fails. We could (in the POSIX “architecture”) model this accurately.

timer_tt_fin_wait_2_1   tcp: misc nonurgent   FIN_WAIT_2 timer expires

h 〈[socks := socks ⊕ [(sid, sock)]]〉
  τ−→
h 〈[socks := socks ⊕ [(sid, sock′)]]〉

sock.pr = TCP_PROTO(tcp_sock) ∧
tcp_sock.cb.tt_fin_wait_2 = ↑((())d) ∧
timer_expires d ∧
sock′ = tcp_close h.arch sock

Description
This stops the timer and closes the socket. Unlike BSD, we take steps to ensure that this timer only fires when it is really time to close the socket. Specifically, we reset it to TCPTV_MAXIDLE every time we receive a segment while in FIN_WAIT_2. Consequently we do not need any guarding conditions here; we just do it. It also means that we do not directly model the BSD behaviour of “sleep for 10 minutes, then check every 75 seconds to see if the connection has been idle for 10 minutes”.
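The reset-on-receive approach to the FIN_WAIT_2 timer described above can be illustrated with a small Python sketch. It is a simplification under stated assumptions: the class and function names are hypothetical, time is a plain integer number of seconds, the timer is an exact deadline rather than the fuzzy timer of the specification, and the value 600 for TCPTV_MAXIDLE is assumed from the “10 minutes” mentioned in the text.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed value in seconds, taken from the "10 minutes" in the description.
TCPTV_MAXIDLE = 600


@dataclass
class Fw2Socket:
    """Hypothetical minimal socket: a state plus one optional deadline."""
    state: str = "FIN_WAIT_2"
    tt_fin_wait_2: Optional[int] = None   # absolute deadline, None if not running


def on_segment_received(sock: Fw2Socket, now: int) -> None:
    """Every segment received in FIN_WAIT_2 pushes the deadline out again."""
    if sock.state == "FIN_WAIT_2":
        sock.tt_fin_wait_2 = now + TCPTV_MAXIDLE


def on_tick(sock: Fw2Socket, now: int) -> bool:
    """Fire the timer if it has expired; no further guard is needed."""
    if sock.tt_fin_wait_2 is not None and now >= sock.tt_fin_wait_2:
        sock.tt_fin_wait_2 = None         # stop the timer
        sock.state = "CLOSED"             # and close the socket
        return True
    return False
```

Because the deadline is refreshed on every received segment, expiry already implies the connection has been idle for the full period, so the firing function needs no idle-time check of its own.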
Chapter 19
Host LTS: UDP Input Processing

19.1 Input Processing (UDP only)

19.1.1 Summary
deliver_in_udp_1  udp: network nonurgent  Get UDP datagram from host’s in-queue and deliver it to a matching socket
deliver_in_udp_2  udp: network nonurgent  Get UDP datagram from host’s in-queue but generate ICMP, as no matching socket
deliver_in_udp_3  udp: network nonurgent  Get UDP datagram from host’s in-queue and drop as from a martian address

19.1.2 Rules

deliver_in_udp_1  udp: network nonurgent  Get UDP datagram from host’s in-queue and deliver it to a matching socket
h0 --τ--> h0 〈[iq := iq′; socks := socks ⊕ [(sid, sock 〈[ pr := UDP_Sock(rcvq′) ]〉)]]〉
h0 = h 〈[ iq := iq; socks := socks ⊕ [(sid, sock 〈[ pr := UDP_Sock(rcvq) ]〉)] ]〉 ∧
rcvq′ = rcvq @ [Dgram_msg(〈[ data := data; is := ↑ i3; ps := ps3 ]〉)] ∧
dequeue_iq(iq, iq′, ↑(UDP(〈[ is1 := ↑ i3; is2 := ↑ i4; ps1 := ps3; ps2 := ps4; data := data ]〉))) ∧
(∃(ifid, ifd) :: (h0.ifds). i4 ∈ ifd.ipset) ∧
sid ∈ lookup_udp h0.socks (i3, ps3, i4, ps4) h0.bound h0.arch ∧
T ∧ (* placeholder for "not a link-layer multicast or broadcast" *)
¬(is_broadormulticast h0.ifds i4) ∧ (* seems unlikely, since i1 ∈ local_ips h.ifds *)
¬(is_broadormulticast h0.ifds i3)
Description
At the head of the host’s in-queue is a UDP datagram with source address (↑ i3, ps3), destination address (↑ i4, ps4), and data data. The destination IP address, i4, is an IP address for one of the host’s interfaces; neither it nor the source IP address, i3, is an IP- or link-layer broadcast or multicast address. The UDP socket sid matches the address quad of the datagram (see lookup_udp (p86) for details).
A τ transition is made. The datagram is removed from the host’s in-queue, iq, and appended to the tail of the socket’s receive queue, rcvq, leaving the host with in-queue iq′ and the socket with receive queue rcvq′.
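As a reading aid, the delivery step of deliver_in_udp_1 can be sketched in Python. This is only an illustration under stated assumptions: the `UdpDatagram` and `UdpSocket` classes, the port-and-address matching used in place of lookup_udp, and the `is_broad_or_multicast` callback are all hypothetical simplifications. Where the rule allows any sid in the lookup result, the sketch simply picks the first match.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Set, Tuple


@dataclass
class UdpDatagram:
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    data: bytes


@dataclass
class UdpSocket:
    local_ip: Optional[str]   # None plays the role of a wildcard local address
    local_port: int
    rcvq: List[Tuple[bytes, str, int]] = field(default_factory=list)


def deliver_in_udp_1(iq: List[UdpDatagram],
                     socks: List[UdpSocket],
                     local_ips: Set[str],
                     is_broad_or_multicast: Callable[[str], bool]) -> bool:
    """Try to take the delivery step; return True if it fired."""
    if not iq:
        return False
    d = iq[0]
    # The destination must be one of our own addresses.
    if d.dst_ip not in local_ips:
        return False
    # Neither address may be a broadcast or multicast address.
    if is_broad_or_multicast(d.dst_ip) or is_broad_or_multicast(d.src_ip):
        return False
    # Simplified stand-in for lookup_udp: match on local port and address.
    matches = [s for s in socks
               if s.local_port == d.dst_port and s.local_ip in (None, d.dst_ip)]
    if not matches:
        return False
    iq.pop(0)                                                  # dequeue from iq
    matches[0].rcvq.append((d.data, d.src_ip, d.src_port))     # append to rcvq tail
    return True


sock = UdpSocket(local_ip=None, local_port=53)
iq = [UdpDatagram("10.0.0.2", 4000, "10.0.0.1", 53, b"hi")]
fired = deliver_in_udp_1(
    iq, [sock], {"10.0.0.1"},
    lambda ip: ip.startswith("224.") or ip == "255.255.255.255")
```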
deliver_in_udp_2  udp: network nonurgent  Get UDP datagram from host’s in-queue but generate ICMP, as no matching socket
h 〈[iq := iq]〉 --τ--> h 〈[iq := iq′; oq := if icmp_to_go then oq′ else h.oq]〉
dequeue_iq(iq, iq′, ↑(UDP(〈[ is1 := ↑ i3; is2 := ↑ i4; ps1 := ps3; ps2 := ps4; data := data ]〉))) ∧
lookup_udp h.socks (i3, ps3, i4, ps4) h.bound h.arch = ∅ ∧
icmp = ICMP(〈[ is1 := ↑ i4; is2 := ↑ i3; is3 := ↑ i3; is4 := ↑ i4; ps3 := ps3; ps4 := ps4; proto := PROTO_UDP; seq := ∗; t := ICMP_UNREACH(PORT) ]〉) ∧
(enqueue_oq(h.oq, icmp, oq′, T) ∨ icmp_to_go = F) (* non-deterministic ICMP generation *) ∧
i4 ∈ local_ips h.ifds ∧
T ∧ (* placeholder for "not a link-layer multicast or broadcast" *)
¬(is_broadormulticast h.ifds i4) ∧ (* seems unlikely, since i1 ∈ local_ips h.ifds *)
¬(is_broadormulticast h.ifds i3)
Description
At the head of the host’s in-queue, iq, is a UDP datagram with source address (↑ i3, ps3), destination address (↑ i4, ps4), and data data. The destination IP address, i4, is an IP address for one of the host’s interfaces and is neither a broadcast nor a multicast address; the source IP address, i3, is also not a broadcast or multicast address. None of the sockets in the host’s finite map of sockets, h.socks, matches the datagram (see lookup_udp (p86) for details).
A τ transition is made. The datagram is removed from the host’s in-queue, leaving it with in-queue iq′. An ICMP Port-unreachable message may be generated and appended to the tail of the host’s out-queue in response to the datagram.
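The non-deterministic ICMP generation in deliver_in_udp_2 means that one pre-state has two permitted successor states. A minimal Python sketch of this, assuming hypothetical helper callbacks `has_matching_socket` and `make_icmp` and omitting the local-address and broadcast/multicast checks, enumerates both outcomes:

```python
from typing import Any, Callable, List, Tuple


def deliver_in_udp_2_outcomes(
        iq: List[Any],
        oq: List[Any],
        has_matching_socket: Callable[[Any], bool],
        make_icmp: Callable[[Any], Any]) -> List[Tuple[List[Any], List[Any]]]:
    """Return every permitted (iq', oq') successor; empty if the rule is not enabled."""
    if not iq or has_matching_socket(iq[0]):
        return []
    dgram, rest = iq[0], iq[1:]
    icmp = make_icmp(dgram)
    return [
        (rest, list(oq)),           # icmp_to_go = F: datagram dropped, no ICMP sent
        (rest, list(oq) + [icmp]),  # icmp_to_go = T: ICMP appended to the out-queue
    ]


outcomes = deliver_in_udp_2_outcomes(
    ["d"], [],
    has_matching_socket=lambda d: False,
    make_icmp=lambda d: ("ICMP_UNREACH_PORT", d))
not_enabled = deliver_in_udp_2_outcomes(
    ["d"], [],
    has_matching_socket=lambda d: True,
    make_icmp=lambda d: ("ICMP_UNREACH_PORT", d))
```

Returning a list of successors, rather than choosing one, mirrors how a checker for such a rule has to accept either behaviour from a real host.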
deliver_in_udp_3  udp: network nonurgent  Get UDP datagram from host’s in-queue and drop as from a martian address
h 〈[iq := iq]〉 --τ--> h 〈[iq := iq′]〉
dequeue_iq(iq, iq′, ↑(UDP dgram)) ∧
dgram.is2 = ↑ i2 ∧
is1 = dgram.is1 ∧
i2 ∈ local_ips(h.ifds) ∧
(F ∨
 ¬(T ∧
   ¬(is_broadormulticast h.ifds i2) ∧ (* seems unlikely, since i1 ∈ local_ips h.ifds *)
   ¬(is1 = ∗) ∧
   ¬ is_broadormulticast h.ifds (the is1)))
Description
At the head of the host’s in-queue, iq, is a UDP datagram with destination IP address ↑ i2, which is an IP address for one of the host’s interfaces. Either i2 is an IP-layer broadcast or multicast address, or the source IP address, is1, is not set or is an IP-layer broadcast or multicast address.
A τ transition is made. The datagram is dropped from the host’s in-queue, leaving it with in-queue iq′.

Chapter 20
Host LTS: ICMP Input Processing

20.1 Input Processing (ICMP only)

20.1.1 Summary
deliver_in_icmp_1  all: network nonurgent  Receive ICMP_UNREACH_NET etc for known socket
deliver_in_icmp_2  all: network nonurgent  Receive ICMP_UNREACH_NEEDFRAG for known socket
deliver_in_icmp_3  all: network nonurgent  Receive ICMP_UNREACH_PORT etc for known socket
deliver_in_icmp_4  all: network nonurgent  Receive ICMP_PARAMPROB etc for known socket
deliver_in_icmp_5  all: network nonurgent  Receive ICMP_SOURCE_QUENCH for known socket
deliver_in_icmp_6  all: network nonurgent  Receive and ignore other ICMP
deliver_in_icmp_7  all: network nonurgent  Receive and ignore invalid or unmatched ICMP

20.1.2 Rules

deliver_in_icmp_1  all: network nonurgent  Receive ICMP_UNREACH_NET etc for known socket
h0 --τ--> h 〈[socks := socks ⊕ [(sid, sock′)]; iq := iq′; oq := oq′]〉
h0 = h 〈[ socks := socks ⊕ [(sid, sock)]; iq := iq; oq := oq ]〉 ∧
dequeue_iq(iq, iq′, ↑(ICMP icmp)) ∧
icmp.t ∈ {ICMP_UNREACH c | c ∈ {NET; HOST; SRCFAIL; NET_UNKNOWN; HOST_UNKNOWN; ISOLATED; TOSNET; TOSHOST; PREC_VIOLATION; PREC_CUTOFF}} ∧
icmp.is3 = ↑
i3 ∧
i3 ∉ IN_MULTICAST ∧
sid ∈ lookup_icmp h0.socks icmp h0.arch h0.bound ∧
(case sock.pr of
   TCP_PROTO(tcp_sock) →
     (∃icmpseq. icmp.seq = ↑ icmpseq ∧
      if tcp_sock.cb.snd_una ≤ icmpseq ∧ icmpseq < tcp_sock.cb.snd_max then
        if tcp_sock.st = ESTABLISHED then
          sock′ = sock ∧ (* ignore transient error while connected *)
          oq′ = oq
        else if tcp_sock.st ∈ {CLOSED; LISTEN; SYN_SENT; SYN_RECEIVED} ∧
                tcp_sock.cb.tt_rexmt ≠ ∗ ∧ shift_of tcp_sock.cb.tt_rexmt > 3 ∧
                tcp_sock.cb.t_softerror ≠ ∗ then
          tcp_drop_and_close h.arch (↑ EHOSTUNREACH) sock (sock′, outsegs) ∧
          enqueue_and_ignore_fail h.arch h.rttab h.ifds outsegs oq oq′
        else
          sock′ = sock 〈[ pr := TCP_PROTO(tcp_sock 〈[ cb := tcp_sock.cb 〈[ t_softerror := ↑ EHOSTUNREACH ]〉 ]〉) ]〉 ∧
          oq′ = oq
      else
        (* Note the case where it is a syncache entry is not dealt with here: a syncache_unreach() should be done instead *)
        sock′ = sock ∧ oq′ = oq)
 ‖ UDP_PROTO(udp_sock) →
     if windows_arch h.arch then
       sock′ = sock 〈[ pr := UDP_PROTO(udp_sock 〈[ rcvq := udp_sock.rcvq @ [(Dgram_error(〈[ e := ECONNRESET ]〉))] ]〉) ]〉 ∧
       oq′ = oq
     else
       sock′ = sock 〈[ es :=ˆ ↑ ECONNREFUSED onlywhen ((sock.is2 ≠ ∗) ∨ ¬(SO_BSDCOMPAT ∈ sock.sf.b)) ]〉 ∧
       oq′ = oq)
Description
Corresponds to FreeBSD 4.6-RELEASE’s PRC_UNREACH_NET.
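The nested conditions of the TCP branch of deliver_in_icmp_1 can be summarised as a small decision function. The Python sketch below is illustrative only: the function name, the string results, and the use of plain integer comparison for sequence numbers (the specification compares sequence numbers in its own 32-bit sequence arithmetic) are assumptions.

```python
from typing import Optional

PRE_ESTABLISHED = {"CLOSED", "LISTEN", "SYN_SENT", "SYN_RECEIVED"}


def icmp_unreach_tcp_action(st: str,
                            snd_una: int,
                            snd_max: int,
                            icmpseq: int,
                            rexmt_shift: Optional[int],
                            softerror: Optional[str]) -> str:
    """Classify what the TCP branch does with an ICMP unreachable error.

    rexmt_shift is the retransmit timer's shift, or None if the timer is not
    running; softerror is the current t_softerror, or None if unset.
    """
    # The quoted sequence number must lie in [snd_una, snd_max).
    if not (snd_una <= icmpseq < snd_max):
        return "ignore"
    # Transient errors are ignored while connected.
    if st == "ESTABLISHED":
        return "ignore"
    # Before ESTABLISHED, after more than three retransmit backoffs and with a
    # soft error already recorded, the connection is dropped.
    if (st in PRE_ESTABLISHED and rexmt_shift is not None
            and rexmt_shift > 3 and softerror is not None):
        return "drop_and_close EHOSTUNREACH"
    # Otherwise the error is only recorded as a soft error.
    return "set t_softerror := EHOSTUNREACH"
```

For example, under these assumptions `icmp_unreach_tcp_action("SYN_SENT", 100, 200, 150, 4, "EHOSTUNREACH")` selects the drop-and-close branch, while the same call with `rexmt_shift=2` only records the soft error.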
deliver_in_icmp_2  all: network nonurgent  Receive ICMP_UNREACH_NEEDFRAG for known socket
h0 --τ--> h 〈[socks := socks ⊕ [(sid, sock′)]; iq := iq′; oq := oq′]〉
h0 = h 〈[ socks := socks ⊕ [(sid, sock)]; iq := iq; oq := oq ]〉 ∧
dequeue_iq(iq, iq′, ↑(ICMP icmp)) ∧
icmp.t = ICMP_UNREACH(NEEDFRAG icmpmtu) ∧
(icmp.is3 = ∗ ∨ the icmp.is3 ∉ IN_MULTICAST) ∧
sid ∈ lookup_icmp h0.socks icmp h0.arch h0.bound ∧
let nextmtu =
  if F ∧ (* Note this is a placeholder for "there is a host (not net) route for icmp.is4" *)
     F then (* Note this is a placeholder for "rmx.mtu not locked" *)
    let curmtu = 1492 in (* Note this value should be taken from rmx.mtu *)
    let nextmtu = case icmpmtu of
                    ↑ mtu → w2n mtu
                  ‖ ∗ → next_smaller (mtu_tab h0.arch) curmtu in
    if nextmtu < 296 then
      (* Note this should lock curmtu in rmxcache; and not change rmxcache MTU from curmtu *)
      ↑ curmtu
    else
      (* Note here, nextmtu should be stored in rmxcache *)
      ↑ nextmtu
  else ∗ in
(case sock.pr of
   TCP_PROTO(tcp_sock) →
     (∃icmpseq. icmp.seq = ↑ icmpseq ∧
      if is_some icmp.is3 then
        (if tcp_sock.cb.snd_una ≤ icmpseq ∧ icmpseq < tcp_sock.cb.snd_max then
           if nextmtu = ∗ then
             sock′ = sock 〈[ pr := TCP_PROTO(tcp_sock 〈[ cb := tcp_sock.cb 〈[ t_maxseg := MSSDFLT ]〉 ]〉) ]〉 ∧
             oq′ = oq
           else
             let mss = min (sock.sf.n(SO_SNDBUF))
                           (rounddown MCLBYTES (the nextmtu − 40 − (if tcp_sock.cb.tf_doing_tstmp then 12 else 0))) in
             (* BSD: TS, plus NOOP for alignment *)
             if mss ≤ tcp_sock.cb.t_maxseg then
               let sock′′ = sock 〈[ pr := TCP_PROTO(tcp_sock 〈[ cb := tcp_sock.cb 〈[ t_maxseg := mss; t_rttseg := ∗; snd_nxt := tcp_sock.cb.snd_una ]〉 ]〉) ]〉 in
               ∃sock′′′ outsegs tcp_sock′′′.
               sock′′′.pr = TCP_PROTO(tcp_sock′′′) ∧
               tcp_output_perhaps h.arch (ticks_of h.ticks) h.ifds sock′′ (sock′′′, outsegs) ∧
               enqueue_or_fail_sock (tcp_sock′′′.st ∉ {CLOSED; LISTEN; SYN_SENT}) h.arch h.rttab h.ifds outsegs oq sock′′ sock′′′ (sock′, oq′)
             else
               sock′ = sock ∧ oq′ = oq
         else
           (* Note the case where it is a syncache entry is not dealt with here: a syncache_unreach() should be done instead *)
           sock′ = sock ∧ oq′ = oq)
      else
        sock′ = sock ∧ oq′ = oq)
 ‖ UDP_PROTO(udp_sock) →
     if windows_arch h.arch then
       sock′ = sock 〈[ pr := UDP_PROTO(udp_sock 〈[ rcvq := udp_sock.rcvq @ [(Dgram_error(〈[ e := EMSGSIZE ]〉))] ]〉) ]〉 ∧
       oq′ = oq
     else
       sock′ = sock 〈[ es := ↑ EMSGSIZE ]〉 ∧ oq′ = oq)
Description
Corresponds to FreeBSD 4.6-RELEASE’s PRC_MSGSIZE.

deliver_in_icmp_3  all: network nonurgent  Receive ICMP_UNREACH_PORT etc for known socket
h0 --τ--> h 〈[socks := socks ⊕ [(sid, sock′)]; iq := iq′; oq := oq′]〉
h0 = h 〈[ socks := socks ⊕ [(sid, sock)]; iq := iq; oq := oq ]〉 ∧
dequeue_iq(iq, iq′, ↑(ICMP icmp)) ∧
icmp.t ∈ {ICMP_UNREACH c | c ∈ {PROTOCOL; PORT; NET_PROHIB; HOST_PROHIB; FILTER_PROHIB}} ∧
icmp.is3 = ↑ i3 ∧
i3 ∉ IN_MULTICAST ∧
sid ∈ lookup_icmp h0.socks icmp h0.arch h0.bound ∧
(case sock.pr of
   TCP_PROTO(tcp_sock) →
     (∃icmpseq. icmp.seq = ↑ icmpseq ∧
      if tcp_sock.cb.snd_una ≤ icmpseq ∧ icmpseq < tcp_sock.cb.snd_max then
        if tcp_sock.st = SYN_SENT then
          tcp_drop_and_close h.arch (↑ ECONNREFUSED) sock (sock′, [ ])
          (* know from definition of tcp_drop_and_close that no segs will be emitted *)
        else
          sock′ = sock ∧ oq′ = oq
      else
        (* Note the case where it is a syncache entry is not dealt with here: a syncache_unreach() should be done instead *)
        sock′ = sock ∧ oq′ = oq)
 ‖ UDP_PROTO(udp_sock) →
     (if windows_arch h.arch then
        sock′ = sock 〈[ pr := UDP_PROTO(udp_sock 〈[ rcvq := udp_sock.rcvq @ [(Dgram_error(〈[ e := ECONNRESET ]〉))] ]〉) ]〉 ∧
        oq′ = oq
      else
        sock′ = sock 〈[ es :=ˆ
          ↑(ECONNREFUSED) onlywhen ((sock.is2 ≠ ∗) ∨ ¬(SO_BSDCOMPAT ∈ sock.sf.b)) ]〉 ∧
        oq′ = oq))
Description
Corresponds to FreeBSD 4.6-RELEASE’s PRC_UNREACH_PORT and PRC_UNREACH_ADMIN_PROHIB.

deliver_in_icmp_4  all: network nonurgent  Receive ICMP_PARAMPROB etc for known socket
h0 --τ--> h 〈[socks := socks ⊕ [(sid, sock′)]; iq := iq′; oq := oq′]〉
h0 = h 〈[ socks := socks ⊕ [(sid, sock)]; iq := iq; oq := oq ]〉 ∧
dequeue_iq(iq, iq′, ↑(ICMP icmp)) ∧
icmp.t ∈ {ICMP_PARAMPROB c | c ∈ {BADHDR; NEEDOPT}} ∧
icmp.is3 = ↑ i3 ∧
i3 ∉ IN_MULTICAST ∧
sid ∈ lookup_icmp h0.socks icmp h0.arch h0.bound ∧
(case sock.pr of
   TCP_PROTO(tcp_sock) →
     (∃icmpseq. icmp.seq = ↑ icmpseq ∧
      if tcp_sock.cb.snd_una ≤ icmpseq ∧ icmpseq < tcp_sock.cb.snd_max then
        if tcp_sock.st ∈ {CLOSED; LISTEN; SYN_SENT; SYN_RECEIVED} ∧
           tcp_sock.cb.tt_rexmt ≠ ∗ ∧ shift_of tcp_sock.cb.tt_rexmt > 3 ∧
           tcp_sock.cb.t_softerror ≠ ∗ then
          tcp_drop_and_close h.arch (↑ ENOPROTOOPT) sock (sock′, outsegs) ∧
          enqueue_and_ignore_fail h.arch h.rttab h.ifds outsegs oq oq′
        else
          sock′ = sock 〈[ pr := TCP_PROTO(tcp_sock 〈[ cb := tcp_sock.cb 〈[ t_softerror := ↑ ENOPROTOOPT ]〉 ]〉) ]〉 ∧
          oq′ = oq
      else
        sock′ = sock ∧ oq′ = oq)
 ‖ UDP_PROTO(udp_sock) →
     (if windows_arch h.arch then
        sock′ = sock 〈[ pr := UDP_PROTO(udp_sock 〈[ rcvq := udp_sock.rcvq @ [(Dgram_error(〈[ e := ENOPROTOOPT ]〉))] ]〉) ]〉 ∧
        oq′ = oq
      else
        sock′ = sock 〈[ es := ↑(ENOPROTOOPT) ]〉 ∧ oq′ = oq))
Description
Corresponds to FreeBSD 4.6-RELEASE’s PRC_PARAMPROB.
deliver_in_icmp_5  all: network nonurgent  Receive ICMP_SOURCE_QUENCH for known socket
h0 --τ--> h 〈[socks := socks ⊕ [(sid, sock′)]; iq := iq′]〉
h0 = h 〈[ socks := socks ⊕ [(sid, sock)]; iq := iq ]〉 ∧
dequeue_iq(iq, iq′, ↑(ICMP icmp)) ∧
icmp.t = ICMP_SOURCE_QUENCH QUENCH ∧
icmp.is3 = ↑ i3 ∧
i3 ∉ IN_MULTICAST ∧
sid ∈ lookup_icmp h0.socks icmp h0.arch h0.bound ∧
(case sock.pr of
   TCP_PROTO(tcp_sock) →
     (∃icmpseq. icmp.seq = ↑ icmpseq ∧
      if tcp_sock.cb.snd_una ≤ icmpseq ∧ icmpseq < tcp_sock.cb.snd_max then
        sock′ = sock 〈[ pr := TCP_PROTO(tcp_sock 〈[ cb := tcp_sock.cb 〈[ snd_cwnd := 1 ∗ tcp_sock.cb.t_maxseg ]〉 ]〉) ]〉
        (* Note the state of the TCP socket should be checked here. *)
        (* Note it might be necessary to make an allowance for local/remote connection? *)
      else
        (* Note the case where it is a syncache entry is not dealt with here: a syncache_unreach() should be done instead *)
        sock′ = sock)
 ‖ UDP_PROTO(udp_sock) →
     (if windows_arch h.arch then
        sock′ = sock 〈[ pr := UDP_PROTO(udp_sock 〈[ rcvq := udp_sock.rcvq @ [(Dgram_error(〈[ e := EHOSTUNREACH ]〉))] ]〉) ]〉
      else
        sock′ = sock 〈[ es := ↑(EHOSTUNREACH) ]〉))
Description
Corresponds to FreeBSD 4.6-RELEASE’s PRC_QUENCH.

deliver_in_icmp_6  all: network nonurgent  Receive and ignore other ICMP
h 〈[iq := iq]〉 --τ--> h 〈[iq := iq′]〉
dequeue_iq(iq, iq′, ↑(ICMP icmp)) ∧
(icmp.t ∈ {ICMP_TIME_EXCEEDED INTRANS; ICMP_TIME_EXCEEDED REASS} ∨
 icmp.t ∈ {ICMP_UNREACH(OTHER x) | x ∈ UNIV} ∨
 icmp.t ∈ {ICMP_SOURCE_QUENCH(OTHER x) | x ∈ UNIV} ∨
 icmp.t ∈ {ICMP_TIME_EXCEEDED(OTHER x) | x ∈ UNIV} ∨
 icmp.t ∈ {ICMP_PARAMPROB(OTHER x) | x ∈ UNIV})
Description
If an ICMP_TIME_EXCEEDED message (either INTRANS or REASS) or an ICMP message with a bad code is received, it is silently ignored.
deliver_in_icmp_7  all: network nonurgent  Receive and ignore invalid or unmatched ICMP
h 〈[iq := iq]〉 --τ--> h 〈[iq := iq′]〉
dequeue_iq(iq, iq′, ↑(ICMP icmp)) ∧
(icmp.t ∈ {ICMP_UNREACH c | ¬∃x. c = OTHER x} ∨
 icmp.t ∈ {ICMP_PARAMPROB c | c ∈ {BADHDR; NEEDOPT}} ∨
 icmp.t = ICMP_SOURCE_QUENCH QUENCH) ∧
(if ∃icmpmtu. icmp.t = ICMP_UNREACH(NEEDFRAG icmpmtu) then
   ∃i3. icmp.is3 = ↑ i3 ∧ i3 ∈ IN_MULTICAST
 else
   (icmp.is3 = ∗ ∨
    the icmp.is3 ∈ IN_MULTICAST ∨
    ¬(∃(sid, s) :: (h.socks).
        s.is1 = icmp.is3 ∧ s.is2 = icmp.is4 ∧
        s.ps1 = icmp.ps3 ∧ s.ps2 = icmp.ps4 ∧
        proto_of s.pr = icmp.proto)))
Description
If the ICMP is of a type we handle, but the source IP is IP 0 0 0 0 or a multicast address, or there is no matching socket, then drop it silently. ICMP_UNREACH NEEDFRAG is handled specially, since we do not care if the source is IP 0 0 0 0, only if it is multicast.

Chapter 21
Host LTS: Network Input and Output

21.1 Input and Output (Network only)

21.1.1 Summary
deliver_in_99  all: network nonurgent  Really receive things
deliver_in_99a  all: network nonurgent  Ignore things not for us
deliver_out_99  all: network nonurgent  Really send things
deliver_loop_99  all: network nonurgent  Loop back a loopback message

21.1.2 Rules

deliver_in_99  all: network nonurgent  Really receive things
h 〈[iq := iq]〉 --msg--> h 〈[iq := iq′]〉
sane_msg msg ∧
↑ i1 = msg.is2 ∧
i1 ∈ local_ips(h.ifds) ∧
enqueue_iq(iq, msg, iq′, queued)
Description
Actually receive a message from the wire into the input queue. Note that if it cannot be queued (because the queue is full), it is silently dropped.
We only accept messages that are for this host. We also assert that any message we receive is well-formed (this excludes elements of type msg that have no physical realisation).
Note the delay in in-queuing the datagram is not modelled here.
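The accept-or-silently-drop behaviour of deliver_in_99 can be pictured with a short Python sketch. It is a simplification under stated assumptions: messages are plain dictionaries with an `is2` destination field, the well-formedness check sane_msg is omitted, and a fixed `capacity` parameter (hypothetical; the specification leaves queue limits to enqueue_iq) decides when the queue is full.

```python
from typing import Any, Dict, List, Optional, Set

Msg = Dict[str, Any]


def deliver_in_99(iq: List[Msg],
                  msg: Msg,
                  local_ips: Set[str],
                  capacity: int) -> Optional[List[Msg]]:
    """Return the new in-queue, or None if the rule is not enabled for msg."""
    dst = msg.get("is2")
    # Only messages addressed to one of this host's IP addresses are received.
    if dst is None or dst not in local_ips:
        return None
    # A full queue means the message is dropped without any notification;
    # the transition still happens, with the queue unchanged.
    if len(iq) >= capacity:
        return list(iq)
    return list(iq) + [msg]


m = {"is2": "10.0.0.1", "data": b"x"}
queued = deliver_in_99([], m, {"10.0.0.1"}, capacity=2)
dropped = deliver_in_99([m, m], m, {"10.0.0.1"}, capacity=2)
not_for_us = deliver_in_99([], m, {"10.0.0.9"}, capacity=2)
```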
deliver_in_99a  all: network nonurgent  Ignore things not for us
h 〈[iq := iq]〉 --msg--> h 〈[iq := iq′]〉
↑ i1 = msg.is2 ∧
i1 ∉ local_ips(h.ifds) ∧
iq = iq′
Description
Do not accept messages that are not for this host.

deliver_out_99  all: network nonurgent  Really send things
h 〈[oq := oq]〉 --msg--> h 〈[oq := oq′]〉
dequeue_oq(oq, oq′, ↑ msg) ∧
(∃i2. msg.is2 = ↑ i2 ∧ i2 ∉ local_ips h.ifds)
Description
Actually emit a segment from the output queue.
Note the delay in dequeuing the datagram is not modelled here.

deliver_loop_99  all: network nonurgent  Loop back a loopback message
h 〈[iq := iq; oq := oq]〉 --lbl--> h 〈[iq := iq′; oq := oq′]〉
dequeue_oq(oq, oq′, ↑ msg) ∧
(∃i2. msg.is2 = ↑ i2 ∧ i2 ∈ local_ips h.ifds) ∧
(lbl = if windows_arch h.arch then τ else ←→msg) ∧
enqueue_iq(iq, msg, iq′, queued)
Description
Deliver a loopback message (for the loopback address, or any of our addresses) from the out-queue to the in-queue. If each message in the out-queue were tagged with its interface, we would simply pick the loopback-interface segments; since it is not, we discriminate on IP addresses instead.

Chapter 22
Host LTS: BSD Trace Records and Interface State Changes

22.1 Trace Records and Interface State Changes (BSD only)

22.1.1 Summary
trace_1  all: misc nonurgent  Trace TCPCB state, ESTABLISHED or later
trace_2  all: misc nonurgent  Trace TCPCB state, pre-ESTABLISHED
interface_1  all: misc nonurgent  Change connectivity

22.1.2 Rules

trace_1  all: misc nonurgent  Trace TCPCB state, ESTABLISHED or later
h --Lh_trace tr--> h
sid ∈ dom(h.socks) ∧
tr = (flav, sid, quad, st, cb) ∧
st ∈ {ESTABLISHED; FIN_WAIT_1; FIN_WAIT_2; CLOSING; CLOSE_WAIT; LAST_ACK; TIME_WAIT} ∧
tracesock_eq tr sid (h.socks[sid])
Description
This rule exposes certain of the fields of the socket and TCPCB, to allow open-box testing.
Note that although the label carries an entire TCPCB, only certain selected fields are constrained to be equal to the actual TCPCB. See tracesock_eq (p63) and tracecb_eq (p62) for details.
Checking trace equality is problematic as BSD generates trace records that fall logically in between the atomic transitions in this model. This happens frequently when in a state before ESTABLISHED. We only check for equality when we are in ESTABLISHED or later states.

trace_2  all: misc nonurgent  Trace TCPCB state, pre-ESTABLISHED
h --Lh_trace tr--> h
sid ∈ dom(h.socks) ∧
tr = (flav, sid, quad, st, cb) ∧
st ∉ {ESTABLISHED; FIN_WAIT_1; FIN_WAIT_2; CLOSING; CLOSE_WAIT; LAST_ACK; TIME_WAIT} ∧
(st = CLOSED ∨ (* BSD emits one of these each time a tcpcb is created, eg at end of 3WHS *)
 ((∃sock tcp_sock.
     sock = (h.socks[sid]) ∧
     proto_of sock.pr = PROTO_TCP ∧
     tcp_sock = tcp_sock_of sock ∧
     (case quad of
        ↑(is1, ps1, is2, ps2) →
          if flav = TA_DROP ∨ tcp_sock.st = CLOSED then T
          else is1 = sock.is1 ∧ ps1 = sock.ps1 ∧ is2 = sock.is2 ∧ ps2 = sock.ps2
      ‖ ∗ → T) ∧
     (st = tcp_sock.st ∨ tcp_sock.st = CLOSED))))

interface_1  all: misc nonurgent  Change connectivity
h 〈[ifds := ifds]〉 --Lh_interface(ifid, up)--> h 〈[ifds := ifds′]〉
ifid ∈ dom(ifds) ∧
ifds′ = ifds ⊕ (ifid, (ifds[ifid]) 〈[ up := up ]〉)
Description
Allow interfaces to be externally brought up or taken down.

Chapter 23
Host LTS: Time Passage

23.1 Time Passage auxiliaries (TCP and UDP)
Time passage is a function, completely deterministic. Any nondeterminism must occur as a result of a tau (or other) transition.
In the present semantics, time passage merely:
1. decrements all timers uniformly
2. prevents time passage if a timer reaches zero
3. prevents time passage if an urgent action is enabled.
We model the first two points with functions Time_Pass_∗, for various types ∗.
These functions return an option type: if the result is NONE then time may not pass for the given duration. Essentially they pick out everything in a host state of type ′a timed, and do something with it.
We treat the last point in the rule epsilon_1 (p348) itself, below.

23.1.1 Summary
Time_Pass_timedoption  time passes for an ′a timed option value
Time_Pass_tcpcb  time passes for a tcp control block
Time_Pass_socket  time passes for a socket
fmap_every  apply f to range of finite map, and succeed if each application succeeds
fmap_every_pred  apply f to range of finite map, and succeed if each application succeeds
Time_Pass_host  time passes for a host

23.1.2 Rules

– time passes for an ′a timed option value :
(Time_Pass_timedoption : duration → ′a timed option → ′a timed option option) dur x0 =
  case x0 of
    ∗ → ↑ ∗
  ‖ ↑ x → (case Time_Pass_timed dur x of
             ∗ → ∗
           ‖ ↑ x0′ → ↑(↑ x0′))

– time passes for a tcp control block :
(Time_Pass_tcpcb : duration → tcpcb → tcpcb set option) (* recall: 'a set == 'a -> bool *) dur cb =
  let tt_rexmt′ = Time_Pass_timedoption dur cb.tt_rexmt
  and tt_keep′ = Time_Pass_timedoption dur cb.tt_keep
  and tt_2msl′ = Time_Pass_timedoption dur cb.tt_2msl
  and tt_delack′ = Time_Pass_timedoption dur cb.tt_delack
  and tt_conn_est′ = Time_Pass_timedoption dur cb.tt_conn_est
  and tt_fin_wait_2′ = Time_Pass_timedoption dur cb.tt_fin_wait_2
  and ts_recent′s = Time_Pass_timewindow dur cb.ts_recent
  and t_badrxtwin′s = Time_Pass_timewindow dur cb.t_badrxtwin
  and t_idletime′s = Time_Pass_stopwatch dur cb.t_idletime
  in
  if is_some tt_rexmt′ ∧ is_some tt_keep′ ∧ is_some tt_2msl′ ∧
     is_some tt_delack′ ∧ is_some tt_conn_est′ ∧ is_some tt_fin_wait_2′ then
    ↑(λcb′.
       choose ts_recent′ :: ts_recent′s.
       choose t_badrxtwin′ :: t_badrxtwin′s.
       choose t_idletime′ :: t_idletime′s.
       cb′ = cb 〈[ (* not going to list everything here; too much!
                   *)
                  tt_rexmt := the tt_rexmt′;
                  tt_keep := the tt_keep′;
                  tt_2msl := the tt_2msl′;
                  tt_delack := the tt_delack′;
                  tt_conn_est := the tt_conn_est′;
                  tt_fin_wait_2 := the tt_fin_wait_2′;
                  ts_recent := ts_recent′;
                  t_badrxtwin := t_badrxtwin′;
                  t_idletime := t_idletime′ ]〉)
  else ∗

– time passes for a socket :
(Time_Pass_socket : duration → socket → socket set option) dur s =
  case s.pr of
    UDP_PROTO(udp) → ↑{s}
  ‖ TCP_PROTO(tcp_s) →
      let cb′s = Time_Pass_tcpcb dur tcp_s.cb in
      if is_some cb′s then
        ↑(λs′. choose cb′ :: the cb′s.
               s′ = s 〈[ (* fid unchanged *)
                        (* sf unchanged *)
                        (* is1,ps1,is2,ps2 unchanged *)
                        (* es unchanged *)
                        pr := TCP_PROTO(tcp_s 〈[ cb := cb′ ]〉) ]〉)
      else ∗

– apply f to range of finite map, and succeed if each application succeeds :
(fmap_every : (′a → ′b option) → (′c ↦ ′a) → (′c ↦ ′b) option) f fm =
  let fm′ = f o_f fm in
  if ∗ ∈ rng(fm′) then ∗ else ↑(the o_f fm′)

– apply f to range of finite map, and succeed if each application succeeds :
(fmap_every_pred : (′a → ′b set option) → (′c ↦ ′a) → (′c ↦ ′b) set option) f fm =
  if ∃y. y ∈ rng(fm) ∧ f y = ∗ then ∗
  else ↑{fm′ | dom(fm) = dom(fm′) ∧ ∀x. x ∈ dom(fm) =⇒ fm′[x] ∈ (the (f (fm[x])))}

– time passes for a host :
(Time_Pass_host : duration → host → host set option) dur h =
  let ts′ = fmap_every (Time_Pass_timed dur) h.ts
  and socks′s = fmap_every_pred (Time_Pass_socket dur) h.socks
  and iq′ = Time_Pass_timed dur h.iq
  and oq′ = Time_Pass_timed dur h.oq
  and ticks′s = Time_Pass_ticker dur h.ticks
  in
  if is_some ts′ ∧ is_some socks′s ∧ is_some iq′ ∧ is_some oq′ then
    ↑(λh′.
       choose socks′ :: the socks′s.
       choose ticks′ :: ticks′s.
       h′ = h 〈[ (* arch unchanged *)
                (* ifds unchanged *)
                ts := the ts′;
                (* files unchanged *)
                socks := socks′;
                (* listen unchanged *)
                (* bound unchanged *)
                iq := the iq′;
                oq := the oq′;
                ticks := ticks′
                (* fds unchanged *)
              ]〉)
  else ∗

23.2 Host transitions with time (TCP and UDP)
We now build the relation ==>, which includes time transitions, from the relation -->, which is instantaneous. This avoids circularity (or at best inductiveness) in the definition of the transition relation.

23.2.1 Summary
epsilon_1  all: misc nonurgent  Time passes
epsilon_2  all: misc nonurgent  Inductively defined time passage
rn  rp: rc

23.2.2 Rules

epsilon_1  all: misc nonurgent  Time passes
h ==dur==> h′
let hs′ = Time_Pass_host dur h in
is_some hs′ ∧
h′ ∈ (the hs′) ∧
¬(∃rn rp rc lbl h′. rn /* rp, rc */ h --lbl--> h′ ∧ is_urgent rc)
Description
Allow time to pass for dur seconds. This is only enabled if the host state is not urgent, i.e. if no urgent rule can fire. Notice that, apart from when a timer becomes zero, a host state never becomes urgent due merely to time passage. This means we need only test for urgency at the beginning of the time interval, not throughout it.

epsilon_2  all: misc nonurgent  Inductively defined time passage
h ==dur==> h′
(∃h1 h2 dur′ dur′′.
   dur′ < dur ∧
   (∃rn rp rc. rn /* rp, rc */ h ==dur′==> h1) ∧
   (∃rn rp rc. rn /* rp, rc */ h1 ==τ==> h2) ∧
   dur′ + dur′′ = dur ∧
   (∃rn rp rc. rn /* rp, rc */ h2 ==dur′′==> h′))
Description
Combine time passage and τ transitions.
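The option-returning style of the Time_Pass functions, and the way fmap_every lifts a per-element function over a finite map, can be sketched in Python with `None` standing in for ∗. This is an informal illustration only: timers are modelled as plain floats counting down to zero (ignoring the fuzziness of the specification's timers), and `time_pass_timers` is a hypothetical helper rather than a function of the specification.

```python
from typing import Callable, Dict, Optional, TypeVar

K = TypeVar("K")
A = TypeVar("A")
B = TypeVar("B")


def time_pass_timed(dur: float, remaining: float) -> Optional[float]:
    """Decrement a timer by dur; None means time may not pass that far."""
    if dur > remaining:
        return None
    return remaining - dur


def fmap_every(f: Callable[[A], Optional[B]],
               fm: Dict[K, A]) -> Optional[Dict[K, B]]:
    """Apply f to every value of fm; succeed only if every application succeeds."""
    out: Dict[K, B] = {}
    for key, value in fm.items():
        result = f(value)
        if result is None:
            return None
        out[key] = result
    return out


def time_pass_timers(dur: float,
                     timers: Dict[str, float]) -> Optional[Dict[str, float]]:
    # All timers are decremented uniformly, or time does not pass at all.
    return fmap_every(lambda d: time_pass_timed(dur, d), timers)


timers = {"tt_keep": 5.0, "tt_delack": 2.0}
after_two = time_pass_timers(2.0, timers)     # tt_delack just reaches zero
after_three = time_pass_timers(3.0, timers)   # would overshoot tt_delack
```

In this sketch a 3-second step is refused because one timer would pass zero, so time must first advance 2 seconds, let the expired timer's rule fire, and only then continue, which is the shape of the inductive rule epsilon_2.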
rn  rp: rc
h ==lbl==> h′
rn /* rp, rc */ h --lbl--> h′
Description
Embed all non-time transitions in the full LTS.

Part XIV
TCP1_evalSupport

Chapter 24
Initial state

This file defines a function to construct certain initial host states for use in automated trace checking, along with other constants used in typical traces. The interfaces, routing table and some host fields are taken from the initial_host line at the start of a valid trace.

24.1 Initial state (TCP and UDP)
The initial state of a host.

24.1.1 Summary
simple_ifd_eth  simple ethernet interface
simple_ifd_lo  simple loopback interface
simple_rttab  simple routing table
tid_initial  initial thread id
simple_host  simple host state
dummy_cb
dummy_socket  minimal socket
dummy_sockets
initial_host  function to construct an initial host for trace checking

24.1.2 Rules

– simple ethernet interface :
simple_ifd_eth i = (ETH 0, 〈[ ipset := {i}; primary := i; netmask := NETMASK 24; up := T ]〉)

– simple loopback interface :
simple_ifd_lo = (LO, 〈[ ipset := LOOPBACK_ADDRS; primary := ip_localhost; netmask := NETMASK 8; up := T ]〉)

– simple routing table :
simple_rttab =
  [〈[ destination_ip := ip_localhost; destination_netmask := NETMASK 8; ifid := LO ]〉;
   〈[ destination_ip := IP 0 0 0 0; destination_netmask := NETMASK 0; ifid := ETH 0 ]〉]

– initial thread id :
tid_initial = TID 0

– simple host state :
simple_host i tick0 remdr0 =
  〈[ arch := FreeBSD_4_6_RELEASE;
    privs := F;
    ifds := ∅ ⊕ [simple_ifd_lo; simple_ifd_eth i];
    rttab := simple_rttab;
    ts := ∅ ⊕ (tid_initial ↦ (Run)_{never_timer});
    files := ∅;
    socks := ∅;
    listen := [ ];
    bound := [ ];
    iq := ([ ])_{never_timer};
    oq := ([ ])_{never_timer};
    bndlm := bandlim_state_init;
    ticks := Ticker(tick0, remdr0, tickintvlmin, tickintvlmax);
    fds := ∅ ]〉

– :
dummy_cb =
  〈[ tt_rexmt := ∗; tt_2msl := ∗; tt_conn_est := ∗; tt_delack := ∗; tt_keep := ∗; tt_fin_wait_2 := ∗;
    t_idletime := Stopwatch(0, 1, 1);
    t_badrxtwin := TimeWindowClosed;
    ts_recent := TimeWindowClosed ]〉

– minimal socket :
dummy_socket(is, p) =
  〈[ fid := ∗;
    sf := 〈[ b := λx. F; n := λx. 0; t := λx. ∞ ]〉;
    is1 := is; ps1 := ↑ p; is2 := ∗; ps2 := ∗;
    pr := TCP_PROTO(〈[ st := LISTEN; cb := dummy_cb; lis := ↑〈[ q0 := [ ]; q := [ ]; qlimit := 10 ]〉 ]〉) ]〉
Rule version: $Id: TCP1_evalSupportScript.sml,v 1.31 2005/01/13 06:04:38 mn200 Exp $
Description
This is a pretty minimally-defined socket, just enough to say "this port is bound".

– :
dummy_sockets n [ ] = [ ] ∧
dummy_sockets n (p :: ps) = (SID n, dummy_socket p) :: dummy_sockets (n + 1) ps

– function to construct an initial host for trace checking :
initial_host (i : ip) (t : tid) (arch : arch) (ispriv : bool)
             (heldports : (ip option # port) list) (ifaces : (ifid # ifd) list)
             (rt : routing_table) (init_tick : ts_seq) (init_tick_remdr : duration) =
  simple_host i init_tick init_tick_remdr
  〈[ arch := arch;
    privs := ispriv;
    ifds := ∅ ⊕ ifaces;
    rttab := rt;
    ts := ∅ ⊕ (t ↦ (Run)_{never_timer});
    fds := case arch of
             (* per architecture, note down FDs preallocated for internal use by OCaml or the test harness *)
             Linux_2_4_20_8 → ∅ ⊕ [(FD 0, FID 0); (FD 1, FID 0); (FD 2, FID 0); (FD 3, FID 0); (FD 4, FID 0); (FD 5, FID 0); (FD 6, FID 0); (FD 1000, FID 0)]
           ‖ FreeBSD_4_6_RELEASE → ∅ ⊕ [(FD 0, FID 0); (FD 1, FID 0); (FD 2, FID 0); (FD 3, FID 0); (FD 4, FID 0); (FD 5, FID 0); (FD 6, FID 0); (FD 7, FID 0)]
           ‖ WinXP_Prof_SP1 → ∅;
             (* Windows FDs are not allocated in order, so there’s no need to specify anything here.
              *)
    files := ∅ ⊕ (FID 0, File(FT_Console, 〈[ b := λx. F ]〉));
    socks := ∅ ⊕ (dummy_sockets 0 heldports) ]〉

Index
abstime, 20
accept_1, 126
accept_2, 127
accept_3, 127
accept_4, 128
accept_5, 129
accept_6, 129
accept_7, 130
accept_incoming_q, 91
accept_incoming_q0, 91
andThen, 104
arch, 60
assert, 104
assert_failure, 104
ASSERTION_FAILURE, 4
auto_outroute, 82
autobind, 85
backlog_fudge, 75
badf_1, 274
bandlim_reason, 61
bandlim_rst_ok, 95
bandlim_rst_ok_always, 94
bandlim_rst_ok_simple, 94
bandlim_state_init, 94
bind_1, 133
bind_2, 134
bind_3, 134
bind_5, 135
bind_7, 135
bind_9, 135
bound_after, 85
bound_port_allowed, 85
bound_ports_protocol_autobind, 85
bsd_arch, 79
bsd_make_phantom_segment, 109
BSD_RTTVAR_BUG, 66
calculate_bsd_rcv_wnd, 93
calculate_buf_sizes, 93
calculate_tcp_options_len, 92
chooseM, 104
clip_int_to_num, 2
close_1, 138
close_10, 144
close_2, 138
close_3, 139
close_4, 140
close_5, 141
close_6, 142
close_7, 142
close_8, 143
computed_rto, 97
computed_rxtcur, 97
CONCAT_OPTIONAL, 3
connect_1, 148
connect_10, 161
connect_2, 152
connect_3, 152
connect_4, 153
connect_4a, 154
connect_5, 154
connect_5a, 155
connect_5b, 156
connect_5c, 157
connect_5d, 157
connect_6, 158
connect_7, 158
connect_8, 159
connect_9, 160
cont, 104
decr_list, 3
deliver_in_1, 279
deliver_in_1b, 283
deliver_in_2, 285
deliver_in_2a, 290
deliver_in_3, 291
deliver_in_3a, 309
deliver_in_3b, 310
deliver_in_3c, 311
deliver_in_4, 312
deliver_in_5, 313
deliver_in_6, 313
deliver_in_7, 314
deliver_in_7a, 315
deliver_in_7b, 316
deliver_in_7c, 317
deliver_in_7d, 318
deliver_in_8, 319
deliver_in_9, 320
deliver_in_99, 341
deliver_in_99a, 341
deliver_in_icmp_1, 335
deliver_in_icmp_2, 336
deliver_in_icmp_3, 337
deliver_in_icmp_4, 338
deliver_in_icmp_5, 339
deliver_in_icmp_6, 339
deliver_in_icmp_7, 340
deliver_in_udp_1, 333
deliver_in_udp_2, 333
deliver_in_udp_3, 334
deliver_loop_99, 342
deliver_out_1, 323
deliver_out_99, 341
dequeue, 90
dequeue_iq, 90
dequeue_oq, 90
dgram, 58
dgram_error, 58
dgram_msg, 58
di3_ackstuff, 298
di3_datastuff, 304
di3_datastuff_really, 300
di3_newackstuff, 295
di3_socks_update, 308
di3_ststuff, 305
di3_topstuff, 294
diqmax, 67
disconnect_1, 164
disconnect_2, 165
disconnect_3, 166
disconnect_4, 163
disconnect_5, 164
do_tcp_options, 92
doqmax, 67
dosend, 96
DROP, 3
drop_from_q0, 91
dropwithreset, 120
dropwithreset_ignore_fail, 120
dschedmax, 67
dtsinval, 73
dummy_cb, 352
dummy_socket, 352
dummy_sockets, 353
dup_1, 167
dup_2, 167
dupfd_1, 169
dupfd_3, 170
dupfd_4, 170
duration, 20
emit_segs, 105
emit_segs_pred, 105
enqueue, 90
enqueue_and_ignore_fail, 118
enqueue_each_and_ignore_fail, 118
enqueue_iq, 90
enqueue_list, 91
enqueue_list_qinfo, 91
enqueue_oq, 90
enqueue_oq_bndlim_rst, 95
enqueue_oq_list, 91
enqueue_oq_list_qinfo, 91
enqueue_or_fail, 118
enqueue_or_fail_sock, 118
ephemeral_ports, 69
epsilon_1, 348
epsilon_2, 348
err, 16
error, 7
expand_cwnd, 99
fast_timer, 88
FAST_TIMER_INTVL, 68
FAST_TIMER_MODEL_INTVL, 68
fd, 14
fd_op, 35
FD_SETSIZE, 69
fd_sockop, 35
fdle, 83
fdlt, 83
ff_default, 71
ff_default_b, 71
fid, 53
fid_ref_count, 84
File, 53
file, 53
filebflag, 14
fileflags, 53
filetype, 53
fm_exists, 2
fmap_every, 347
fmap_every_pred, 347
funupd, 2
funupd_list, 2
fuzzy_timer, 47
get_cb, 104
get_sock, 104
get_tcp_sock, 104
getfileflags_1, 171
getifaddrs_1, 173
getpeername_1, 175
getpeername_2, 176
getsockbopt_1, 178
getsockbopt_2, 178
getsockerr_1, 180
getsockerr_2, 180
getsocklistening_1, 182
getsocklistening_2, 183
getsocklistening_3, 182
getsockname_1, 185
getsockname_2, 185
getsockname_3, 186
getsocknopt_1, 188
getsocknopt_4, 188
getsocktopt_1, 190
getsocktopt_4, 191
host, 61
hostThreadState, 61
HZ, 68
icmp_paramprob_code, 30
icmp_redirect_code, 29
icmp_source_quench_code, 29
icmp_time_exceeded_code, 30
icmp_unreach_code, 29
icmpDatagram, 30
icmpType, 30
if any, 80
if broadcast, 80
ifd, 60
ifid, 13
ifid up, 82
in local, 80
in loopback, 80
IN MULTICAST, 80
INADDR BROADCAST, 80
INFINITE RESOURCES, 66
initial cb, 101
initial host, 353
inqueue timer, 88
INSERT ORDERED, 3
interface 1, 344
intr 1, 275
iobc, 57
IP, 80
ip, 13
ip localhost, 80
is broadormulticast, 81
is localnet, 80
is urgent, 39
kern timer, 88
KERN TIMER INTVL, 68
KERN TIMER MODEL INTVL, 68
leastfd, 83
left shift num, 2
Lhost0, 38
LIB interface, 33
linux arch, 79
listen 1, 193
listen 1b, 194
listen 1c, 194
listen 2, 195
listen 3, 195
listen 4, 196
listen 5, 197
listen 7, 197
local ips, 80
local primary ips, 80
lookup icmp, 87
lookup udp, 86
LOOPBACK ADDRS, 80
loopback on wire, 83
make ack segment, 108
make rst segment from cb, 109
make rst segment from seg, 110
make syn ack segment, 107
make syn segment, 106
MAP OPTIONAL, 3
mask, 80
mask bits, 80
match score, 85
MCLBYTES, 70
mlift dropafterack or fail, 120
mlift tcp output perhaps or fail, 118
mliftc, 105
mliftc bndlm, 105
mode of, 97
modify cb, 104
modify sock, 104
modify tcp sock, 104
msg, 31
msg is1, 31
msg is2, 31
msgbflag, 15
MSIZE, 70
MSSDFLT, 74
mtu tab, 99
netmask, 14
never timer, 47
next smaller, 99
nextfd, 83
nonurgent, 39
NOTIN′, 3
notsock 1, 275
num floor, 2
num floor and frac, 2
onlywhen, 2
oob extra sndbuf, 70
OPEN MAX, 69
OPEN MAX FD, 69
opttorel, 46
ORDERINGS, 3
outqueue timer, 88
outroute, 82
outroute ifids, 81
port, 13
privileged ports, 69
proto eq, 59
proto of, 59
protocol, 29
protocol info, 58
pselect 1, 200
pselect 2, 203
pselect 3, 203
pselect 4, 204
pselect 5, 205
pselect 6, 205
pselect timeo t max, 73
real mult time, 19
real of int, 2
realopt of time, 20
recv 1, 209
recv 11, 221
recv 12, 222
recv 13, 222
recv 14, 223
recv 15, 224
recv 16, 224
recv 17, 225
recv 2, 211
recv 20, 225
recv 21, 227
recv 22, 227
recv 23, 228
recv 24, 228
recv 3, 211
recv 4, 213
recv 5, 214
recv 6, 214
recv 7, 215
recv 8, 215
recv 8a, 216
recv 9, 217
REPLICATE, 3
resourcefail 1, 276
resourcefail 2, 276
retType, 34
return 1, 274
rexmtmode, 55
right shift num, 2
rn, 348
rollback tcp output, 117
rounddown, 2
roundup, 2
route and enqueue oq, 91
routeable, 81
routing table entry, 60
rttinf, 55
rule cat, 39
rule ids, 42
rule proto, 39
rule status, 39
sane msg, 31
sane seg, 27
sane socket, 84
sane udpdgm, 27
SB MAX, 70
sched timer, 88
send 1, 231
send 10, 244
send 11, 245
send 12, 246
send 13, 247
send 14, 247
send 15, 248
send 16, 249
send 17, 249
send 18, 250
send 19, 250
send 2, 234
send 21, 251
send 22, 252
send 23, 253
send 3, 235
send 3a, 235
send 4, 236
send 5, 237
send 5a, 237
send 6, 237
send 7, 238
send 8, 239
send 9, 243
send queue space, 93
seq32, 21
seq32 coerce, 21
seq32 diff, 21
seq32 fromto, 21
seq32 geq, 21
seq32 gt, 21
seq32 leq, 21
seq32 lt, 21
seq32 max, 21
seq32 min, 21
seq32 minus, 21
seq32 minus′, 21
seq32 plus, 21
seq32 plus′, 21
setfileflags 1, 254
setsockbopt 1, 256
setsockbopt 2, 257
setsocknopt 1, 259
setsocknopt 2, 259
setsocknopt 4, 260
setsocktopt 1, 262
setsocktopt 4, 262
setsocktopt 5, 263
sf default, 72
sf default b, 71
sf default n, 71
sf default t, 72
sf max n, 72
sf min n, 72
sharp timer, 47
shift of, 97
shutdown 1, 265
shutdown 2, 266
shutdown 3, 266
shutdown 4, 267
sid, 53
signal, 10
simple host, 352
simple ifd eth, 351
simple ifd lo, 351
simple limit, 94
simple rttab, 351
slow timer, 88
SLOW TIMER INTVL, 68
SLOW TIMER MODEL INTVL, 68
sndrcv timeo t max, 73
Sock, 59
sockatmark 1, 269
sockatmark 2, 269
sockbflag, 14
socket, 59
socket 1, 272
socket 2, 273
socket listen, 57
sockflags, 58
socknflag, 15
socktflag, 15
socktype, 16
soexceptional, 203
SOMAXCONN, 70
soreadable, 202
sowriteable, 202
SPLIT, 3
SPLIT REV, 3
SPLIT REV 0, 3
SS FLTSZ, 74
SS FLTSZ LOCAL, 74
start tt persist, 97
start tt rexmt, 97
start tt rexmt gen, 97
start tt rexmtsyn, 97
stop, 104
stopwatch, 51
stopwatch val of, 51
stopwatch zero, 68
stopwatchfuzz, 68
TAKE, 3
TAKEWHILE, 3
TAKEWHILE REV, 3
tcp backoffs, 96
TCP BSD BACKOFFS, 76
tcp close, 121
TCP DO NEWRENO, 74
tcp drop and close, 121
TCP LINUX BACKOFFS, 76
TCP MAXRXTSHIFT, 77
TCP MAXWIN, 73
TCP MAXWINSCALE, 73
tcp output perhaps, 116
tcp output really, 113
tcp output required, 111
TCP Q0MAXLIMIT, 74
TCP Q0MINLIMIT, 74
tcp reass, 100
tcp reass prune, 101
tcp seq foreign, 22
tcp seq foreign to local, 22
tcp seq local, 22
tcp seq local to foreign, 22
TCP Sock, 59
TCP Sock0, 59
tcp sock of, 59
tcp socket, 58
tcp socket best match, 86
tcp syn backoffs, 96
TCP SYN BSD BACKOFFS, 77
TCP SYN LINUX BACKOFFS, 77
TCP SYN WINXP BACKOFFS, 77
TCP SYNACKMAXRXTSHIFT, 77
TCP WINXP BACKOFFS, 76
tcpcb, 55
tcpForeign, 22
tcpLocal, 22
tcpReassSegment, 54
tcpSegment, 26
tcpstate, 54
TCPTV DELACK, 75
TCPTV KEEP IDLE, 76
TCPTV KEEP INIT, 76
TCPTV KEEPCNT, 76
TCPTV KEEPINTVL, 76
TCPTV MAXIDLE, 76
TCPTV MIN, 75
TCPTV MSL, 76
TCPTV PERSMAX, 76
TCPTV PERSMIN, 76
TCPTV REXMTMAX, 75
TCPTV RTOBASE, 75
TCPTV RTTVARBASE, 75
test outroute, 82
test outroute ip, 82
the time, 20
tick imax, 50
tick imin, 50
ticker, 50
ticker ok, 50
tickintvlmax, 68
tickintvlmin, 68
ticks of, 50
tid, 16
tid initial, 352
time, 19
time gt, 19
time gte, 19
time lt, 19
time lte, 19
time max, 19
time min, 19
time minus dur, 19
time of tltime, 89
time of tltimeopt, 89
time pass additive, 45
Time Pass host, 347
Time Pass socket, 346
Time Pass stopwatch, 51
Time Pass tcpcb, 345
Time Pass ticker, 50
Time Pass timed, 48
Time Pass timedoption, 345
Time Pass timer, 47
Time Pass timewindow, 49
time pass trajectory, 46
time plus dur, 19
time zero, 20
timed, 48
timed expires, 48
timed timer of, 48
timed val of, 48
timer, 47
timer expires, 47
timer tt 2msl 1, 330
timer tt conn est 1, 331
timer tt delack 1, 331
timer tt fin wait 2 1, 331
timer tt keep 1, 329
timer tt persist 1, 329
timer tt rexmt 1, 327
timer tt rexmtsyn 1, 325
timewindow, 49
timewindow open, 49
timewindow val of, 49
TLang, 17
TLang type, 16
tlang typing, 17
tltimeopt of time, 89
tltimeopt wf, 89
trace 1, 343
trace 2, 343
tracecb eq, 62
traceflavour, 62
tracesock eq, 63
ts seq, 23
tstamp, 22
type abbrev bandlim state, 61
type abbrev byte, 21
type abbrev duration, 19
type abbrev routing table, 61
type abbrev tcp seq foreign, 22
type abbrev tcp seq local, 22
type abbrev tracerecord, 62
type abbrev ts seq, 23
UDP Sock, 59
UDP Sock0, 59
udp sock of, 59
udp socket, 58
udpDatagram, 27
UDPpayloadMax, 70
unix arch, 79
update idle, 119
update rtt, 98
upper timer, 47
urgent, 39
windows arch, 79