--------------[ Readings ]--------------

Read the textbook's chapter 12 (TCP preliminaries). You may read ahead to chapters 13--15, which cover more of TCP. We followed Ch.12 pretty closely today in class, with pcaps.

As you read about the TCP handshake, look through the packet capture http://www.cs.dartmouth.edu/~sergey/cs60/lec5-tcp-3way-hadshake-and-fails.pcapng (I accidentally deleted it but re-created it after class, with the same sessions and payload, in case you wonder about the timestamps).

Packets 1--9 are my session "nc www.cs.dartmouth.edu 80" that expired before I typed anything. Netcat called connect(), which resulted in the SYN packet being sent; the server, waiting in listen()/accept(), replied with SYN+ACK, and my kernel completed the handshake with an ACK. No typing meant no write()s, and the server resent SYN+ACK just in case; then it timed out and sent a FIN for a graceful teardown of the connection. That was ACK-ed, and it took till packet 9.

Packets 10--20 are from another connection, which gave me a 302 HTTP redirect:

  $ nc www.cs.dartmouth.edu 80
  GET /
  302 Found
  Found
  The document has moved here.
  Apache/2.4.23 (Fedora) Server at www.cs.dartmouth.edu Port 80

See this stream reconstructed by Wireshark by going through "Analyze > Follow > TCP stream" with one of these packets selected. You will get a representation of the full stream going both ways ("full duplex").

Finally, packet 21 is a SYN from connect() caused by "nc www.cs.dartmouth.edu 81". With no program listening at port 81/tcp, the other side sends a RST (packet 22). TCP uses its own packets to signal "no one is listening, go away".

For UDP, this is different: when a UDP packet arrives for a port no program has claimed with bind(), the response is an ICMP Port Unreachable (packets 23 & 24). I caused the UDP packet to be sent with the -u option for nc: "nc -u www.cs.dartmouth.edu 81". Note that packet 24 contains a part (the headers) of the offending UDP packet 23 as its payload. This is done to make 'net diagnosis easier.

----------------[ TCP server skeleton ]----------------

See http://www.cs.dartmouth.edu/~sergey/cs60/tcpserv.c for the typical TCP server skeleton (socket -> bind -> listen -> accept, then fork and continue listening in the parent, process data in the child). Note that the original listening socket is closed in the child, and the new socket returned by accept() is closed in the parent, since each process has no use for the other's socket.

For a server without a fork(), see http://www.cs.dartmouth.edu/~sergey/cs60/tcpserver-nofork.c

Note the slightly different styles; also note that ALL system calls have their return values checked! This is a style requirement not to be ignored.

----------------[ Parsing a binary TCP protocol ]----------------

In http://www.cs.dartmouth.edu/~sergey/cs60/tcp-pow/ find client and server code for a silly client/server pair that uses a binary protocol with unnaturally aligned uint32_t fields; there is also a uint16_t length field. This protocol is written so that forgetting ntohl()/htonl() or htons()/ntohs() anywhere would lead to bugs such as too much memory being printed, wrong numbers being multiplied, or the result being interpreted wrongly.
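
To make the byte-order and alignment point concrete, here is a minimal sketch of parsing and building one message. The wire layout, the struct, and the names msg_parse()/msg_build() are all made up for illustration; this is not the actual tcp-pow format or code. It copies the multi-byte fields out with memcpy() rather than casting a pointer, so the unaligned uint32_t is never dereferenced in place, and it converts each field exactly once.

```c
#include <arpa/inet.h>  /* ntohs(), ntohl(), htons(), htonl() */
#include <assert.h>
#include <stdint.h>
#include <string.h>     /* memcpy(), size_t */

/* Hypothetical wire layout (an assumption, NOT the tcp-pow format):
 *   offset 0: uint8_t  type
 *   offset 1: uint16_t len    (network byte order)
 *   offset 3: uint32_t value  (network byte order, unaligned)
 */
enum { MSG_WIRE_SIZE = 7 };

struct msg {
    uint8_t  type;
    uint16_t len;    /* host byte order once parsed */
    uint32_t value;  /* host byte order once parsed */
};

/* Parse buf into *out. Returns 0 on success, -1 if buf is too short. */
int msg_parse(const unsigned char *buf, size_t buflen, struct msg *out)
{
    uint16_t len_net;
    uint32_t value_net;

    assert(buf != NULL && out != NULL);
    if (buflen < MSG_WIRE_SIZE)
        return -1;

    out->type = buf[0];
    /* memcpy avoids dereferencing a misaligned uint16_t* or uint32_t* */
    memcpy(&len_net, buf + 1, sizeof len_net);
    memcpy(&value_net, buf + 3, sizeof value_net);
    out->len = ntohs(len_net);      /* network -> host, 16 bits */
    out->value = ntohl(value_net);  /* network -> host, 32 bits */
    return 0;
}

/* Serialize *in into buf. Returns bytes written, -1 if buf is too small. */
int msg_build(const struct msg *in, unsigned char *buf, size_t buflen)
{
    uint16_t len_net;
    uint32_t value_net;

    assert(in != NULL && buf != NULL);
    if (buflen < MSG_WIRE_SIZE)
        return -1;

    len_net = htons(in->len);       /* host -> network, 16 bits */
    value_net = htonl(in->value);   /* host -> network, 32 bits */
    buf[0] = in->type;
    memcpy(buf + 1, &len_net, sizeof len_net);
    memcpy(buf + 3, &value_net, sizeof value_net);
    return MSG_WIRE_SIZE;
}
```

With this layout, the value 256 travels as the bytes 00 00 01 00. On a little-endian host, dropping the ntohl() call would make the same bytes read as 65536, which is the kind of "wrong numbers being multiplied" bug described above.
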
Try recompiling the tcp-pow code without any of these byte-order conversion calls and observe the results; see also README.txt.

The lesson of this simple protocol is that in network programming C makes one care extra hard about its memory model: how values and variables are laid out byte-wise in memory, and how they are aligned. C network programming is more about its memory model than about the socket()/connect()/bind()/listen()/accept() system calls!

These two files also show basic approaches to parsing binary content (neither of which is my favorite; I prefer to write declarative code that explains what valid input is, not imperative code where one can easily forget to step forward a char* pointer).

The approach of defining structs that correspond to the per-byte layout of arriving bytes in memory is slightly less error-prone. Yet even so you must explicitly (and continually) convert multi-byte fields from/to network/host byte order. This is what the client mostly does.

The approach that steps a pointer through an array of bytes to point at where fields start, casts the pointer to the respective field's type, and dereferences it is somewhat more error-prone. What if you forget to step the pointer, or step it off by one? This is what the server mostly does.

Newer architectures try to wrap binary data marshaling & unmarshaling in less error-prone code. See Google's Protocol Buffers (protobuf) as an example. The more general approach is the Parser Combinator approach, pursued for systems languages by Hammer (C/C++ & more) and Nom (for the up-and-coming systems language Rust). Still, the internals of these systems are programmed in C or similar; there is no getting around the memory model when working with packets.

Extra credit will be given for finding any bugs still left in my code :)

----------------[ Read ahead: DNS ]----------------

A good concise intro to DNS is found in Shalunov; a great summary with a dive into specific fields is at http://unixwiz.net/techtips/iguide-kaminsky-dns-vuln.html.
The latter describes the format at enough depth to cover the caching vulnerability discovered by Dan Kaminsky in 2008. DNS failed to resist spoofing where TCP generally succeeded; the difference was down to using 2-byte integers (the DNS transaction ID) vs 4-byte ones (the sequence and acknowledgment numbers in TCP).

DNS is the very devil to parse (and has seen many parsing vulns over the years). I wrote some code (in http://www.cs.dartmouth.edu/~sergey/cs60/dns/) to parse a simple DNS response, lifting the request from a pcap (dns-tcp.pcap). This will give you some idea of how hard manual DNS parsing is!

This code parses only one particular case of DNS packet composition, but it already has to deal with domain names encoded as

  <0x04> cs60 <0x02> cs <0x09> dartmouth <0x03> edu <0x00>

with the next part's length byte being used where you expect the dot.

Moreover, instead of repeating a name or a part of it, DNS uses <0xc0> <0xYY>, where YY is the offset from the start of the DNS payload (the Transaction ID field). In general, the two top bits being set mark a pointer, and the remaining 14 bits of the two bytes are the offset. Hence <0xc0> <0x0c>, offset 12, is used the second time the above domain name appears in the packet.

Have fun with parsing!
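
The name encoding described above, including the <0xc0> pointers, can be decoded with a loop like the following. This is a minimal sketch and not the code from the dns/ directory: the function name dns_name_decode(), its signature, and the limit of 16 followed pointers are my own assumptions for illustration. It treats the input as hostile: every offset is bounds-checked, and pointer loops are cut off.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Decode the DNS name starting at msg[off] into out as dotted text.
 * msg/msglen is the whole DNS payload, starting at the Transaction ID,
 * because compression pointers are offsets from that point.
 * Returns the length of the decoded string, or -1 on malformed input
 * or if out is too small. */
int dns_name_decode(const unsigned char *msg, size_t msglen, size_t off,
                    char *out, size_t outlen)
{
    size_t pos = 0;   /* write position in out */
    int jumps = 0;    /* pointers followed so far */

    assert(msg != NULL && out != NULL);

    for (;;) {
        unsigned len;

        if (off >= msglen)
            return -1;                  /* ran off the end of the packet */
        len = msg[off];

        if (len == 0)
            break;                      /* <0x00>: end of the name */

        if ((len & 0xc0) == 0xc0) {     /* compression pointer */
            /* 16 is an arbitrary cap that stops pointer loops */
            if (off + 1 >= msglen || ++jumps > 16)
                return -1;
            off = ((size_t)(len & 0x3f) << 8) | msg[off + 1];
            continue;
        }

        if (len & 0xc0)
            return -1;                  /* 01/10 prefixes are not labels */
        if (len > msglen - off - 1)
            return -1;                  /* label extends past the packet */
        if (pos + len + 2 > outlen)
            return -1;                  /* no room for '.', label, '\0' */

        if (pos > 0)
            out[pos++] = '.';           /* the dot the length byte replaced */
        memcpy(out + pos, msg + off + 1, len);
        pos += len;
        off += 1 + len;                 /* step over length byte and label */
    }

    if (pos >= outlen)
        return -1;
    out[pos] = '\0';
    return (int)pos;
}
```

Called with off = 12 on a payload whose first name is the example above, it fills out with "cs60.cs.dartmouth.edu"; called at the offset of a later <0xc0> <0x0c>, it follows the pointer back to offset 12 and produces the same string. Note how many separate checks even this one case needs, and that a real parser must also remember where the name ended in the packet so it can continue with the fields that follow.
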