Understanding complex tcpdump filters
I finally understood more complex tcpdump (or rather pcap) filter rules. For example, straight from the pcap-filter(7) man page:
To select all IPv4 HTTP packets to and from port 80, i.e. print only packets that contain data, not, for example, SYN and FIN packets and ACK-only packets.
tcp port 80 and (((ip[2:2] - ((ip[0]&0xf)<<2)) - ((tcp[12]&0xf0)>>2)) != 0)
The port filter should be obvious. Taking apart the expression inside the parentheses, we get:
ip[2:2] At offset 2 in the IP packet, read 2 bytes. This is the total length of the IP packet in bytes, headers included (the IP packet wraps the TCP segment).
(ip[0] & 0xf) << 2 Get the first byte of the IP packet. The first byte encodes the version and the header length in the high and low nibble respectively. We mask it with 0xf (00001111 in binary), throwing away the version in the higher nibble. The header length is stored as a count of 32-bit words, so we left-shift by 2 (that is, multiply by 4) to convert it to bytes.
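(tcp[12] & 0xf0) >> 2 Get the byte at offset 12 of the TCP segment, which encodes the TCP header length (the data offset) in the higher nibble. Mask it with 0xf0 (11110000 in binary) to zero out the lower nibble. This length is also counted in 32-bit words. Shifting right by 4 would give the word count, and multiplying by 4 (a left shift by 2) would turn it into bytes, so a single right-shift by 2 does both at once.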
Any packet for which subtracting both header lengths from the total length yields 0 contains no data, so the != 0 comparison keeps only packets with a payload, just like the man page advertises.
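To check the arithmetic by hand, here is a minimal Python sketch that applies the same three sub-expressions to raw packet bytes. The names payload_length and build_packet are made up for this illustration, and build_packet only produces a bare-bones IPv4 + TCP packet (no options, zeroed checksums), which is enough for the length fields but not something you could put on the wire.

```python
import struct


def payload_length(ip_packet: bytes) -> int:
    """Mirror (ip[2:2] - ((ip[0]&0xf)<<2)) - ((tcp[12]&0xf0)>>2)."""
    total_length = struct.unpack("!H", ip_packet[2:4])[0]  # ip[2:2]
    ip_header_len = (ip_packet[0] & 0x0F) << 2             # (ip[0]&0xf)<<2
    tcp = ip_packet[ip_header_len:]                        # TCP starts after the IP header
    tcp_header_len = (tcp[12] & 0xF0) >> 2                 # (tcp[12]&0xf0)>>2
    return total_length - ip_header_len - tcp_header_len


def build_packet(payload: bytes) -> bytes:
    """Hypothetical helper: minimal IPv4 + TCP packet, for illustration only."""
    total_length = 20 + 20 + len(payload)
    ip_header = struct.pack(
        "!BBHHHBBH4s4s",
        0x45,                      # version 4, header length 5 words (20 bytes)
        0, total_length, 0, 0,
        64, 6, 0,                  # TTL, protocol 6 (TCP), checksum left at 0
        bytes([10, 0, 0, 1]), bytes([10, 0, 0, 2]),
    )
    tcp_header = struct.pack(
        "!HHIIBBHHH",
        12345, 80, 0, 0,
        0x50,                      # data offset 5 words (20 bytes) in the high nibble
        0x10,                      # ACK flag
        65535, 0, 0,
    )
    return ip_header + tcp_header + payload


print(payload_length(build_packet(b"")))                    # 0: ACK-only, filtered out
print(payload_length(build_packet(b"GET / HTTP/1.1\r\n")))  # 16: carries data, kept
```

The filter keeps a packet exactly when this value is non-zero.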
For a second example, here's a filter for HTTP POST requests:
tcp dst port 80 and tcp[((tcp[12] & 0xf0) >> 2):4] = 0x504f5354
Again, the port filter is obvious. The more interesting part breaks down to:
(tcp[12] & 0xf0) >> 2 This is exactly the same expression as explained above, and it returns the TCP header length in bytes. It is used here as the offset of another TCP byte index, which effectively picks the first byte after the header. If we call this expression headerLength, the filter simplifies to tcp[headerLength:4]: we read the first four bytes of the TCP payload, which in an HTTP request hold the start of the verb. The result is compared against 0x504f5354, which is simply the ASCII bytes of POST written in hex (P = 0x50, O = 0x4f, S = 0x53, T = 0x54).
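The same lookup can be sketched in a few lines of Python. The function name starts_with_post and the hand-built segments are assumptions made for this example; the segments are just zero-filled headers with the data offset byte set, not real captures.

```python
# The magic constant is just the ASCII bytes of "POST" read as a big-endian integer.
print(hex(int.from_bytes(b"POST", "big")))  # 0x504f5354


def starts_with_post(tcp_segment: bytes) -> bool:
    """Mirror tcp[((tcp[12] & 0xf0) >> 2):4] = 0x504f5354."""
    header_length = (tcp_segment[12] & 0xF0) >> 2          # TCP header length in bytes
    first_four = tcp_segment[header_length:header_length + 4]
    # Fewer than four payload bytes can never match.
    return len(first_four) == 4 and int.from_bytes(first_four, "big") == 0x504F5354


# Zero-filled 20-byte header (data offset 5 words -> byte 12 is 0x50) plus payload.
plain = bytes(12) + b"\x50" + bytes(7) + b"POST /form HTTP/1.1\r\n"
# Zero-filled 32-byte header, as with TCP options (data offset 8 words -> 0x80).
with_options = bytes(12) + b"\x80" + bytes(19) + b"POST /form HTTP/1.1\r\n"
get_request = bytes(12) + b"\x50" + bytes(7) + b"GET / HTTP/1.1\r\n"

print(starts_with_post(plain), starts_with_post(with_options), starts_with_post(get_request))
```

The second segment shows why the filter computes the offset instead of hard-coding 20: with TCP options present, the payload starts later.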