How to use PKTMON to troubleshoot TCP. Use the built-in Windows packet sniffer to display TCP segments directly on the command line. Learn how to interpret the PKTMON output for TCP session setup, session tear-down, and data transfer.
TCP (Transmission Control Protocol) carries the majority of all data traffic, both on the public Internet and in internal datacenters. With PKTMON, we can investigate many of the protocol details in the packets using CMD or PowerShell.
This article assumes that the reader already has at least intermediate knowledge of how TCP works.
Once the proper capture filters are applied and PKTMON is configured to run in real-time mode, the captured packets are displayed on the command line. In the following text, the various items shown in the output are explained.
Total frame length:
We will see several length fields, which indicate different properties. The leftmost length is the total frame size, minus the four-byte Frame Check Sequence. Here the full frame size was 1518 bytes, including the trailing four-byte Ethernet checksum.
This size includes all the headers (Ethernet, IP, and TCP) plus the actual TCP payload.
Source and destination IP addresses:
After the full frame length, we see the source IP and the source port (here TCP/443), as well as the destination IP address and the destination port (here a high dynamic client port).
After the IP addresses and ports, we will find the TCP flags. The representation of these is mostly intuitive, for example, the “P” equals the TCP Push flag. Further, S is the Synchronize flag, R is the Reset flag, etc.
One of the flags looks different than the others. This is the Acknowledgement flag, which is represented only by a dot. We will not see an A for this flag. The dot can appear alone or combined with other flags.
One interesting fact is that, regardless of whether the session is very short-lived or carries terabytes of data, the Acknowledgement flag is present in all segments of a TCP session except the very first one. The initial SYN packet is the only exception and will not have the ACK flag set. (This is by design and very natural, since no data can be “acknowledged” in packet 1.)
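The flag notation described above can be sketched as a small lookup table. This is only an illustration: the letters P, S, R, E, W and the dot are taken from this article, while F (FIN) and U (URG) are assumed to follow the same tcpdump-style convention. The function name `decode_flags` is hypothetical, and the square-bracket handling is an assumption about how the flag field may be wrapped in the output.

```python
# Flag letters as described in this article. "F" (FIN) and "U" (URG) are
# assumptions based on the tcpdump-style convention.
FLAG_NAMES = {
    "S": "SYN",
    "F": "FIN",
    "R": "RST",
    "P": "PSH",
    ".": "ACK",   # the Acknowledgement flag is shown as a dot, not as "A"
    "U": "URG",
    "E": "ECE",
    "W": "CWR",
}


def decode_flags(flag_field):
    """Translate a flag string such as "S." or "[P.]" into flag names."""
    names = []
    # Strip optional surrounding square brackets before decoding.
    for char in flag_field.strip("[]"):
        if char not in FLAG_NAMES:
            raise ValueError(f"unknown flag character: {char!r}")
        names.append(FLAG_NAMES[char])
    return names


if __name__ == "__main__":
    print(decode_flags("S"))    # first packet: SYN only, no ACK
    print(decode_flags("S."))   # second packet: SYN and ACK
    print(decode_flags("P."))   # data segment: PSH and ACK
```

A quick way to use this when reading a capture is to check that "ACK" is missing only for the very first SYN packet of a session.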
TCP three-way handshake in PKTMON
Above, we see the famous TCP three-way handshake. (For now, we can ignore the E and W flags, which will be explained later.) Note that the ACK flag is represented as a dot on packets two and three, and is not present, as expected, on packet one.
|Note to troubleshooters: It is crucial to verify that the third packet, the ACK represented in the output by a dot, is indeed visible. It is an easy oversight to observe only the SYN and SYN/ACK packets and assume that communication works end to end. However, when running PKTMON on the target server, it is fully possible to see the SYN and SYN/ACK packets without proper network communication being in place. This can happen when routing between the client and the server works in only one direction: the SYN is delivered correctly and the target sends the SYN/ACK, but if the return routing is not functional, packet number two never reaches the client and the session is not established.
Note that multiple TCP sessions can be established, or attempt to be established, simultaneously. The TCP client and server port numbers can be used to correlate packets and verify that they belong to the same session. The pair of source and destination ports will be mirrored in the opposite direction.
|Note to troubleshooters: Some network equipment, e.g., firewalls, can rewrite the client's dynamic TCP port in the path, even without the traffic being “NATed”. When the traffic returns, the port is changed back to the original port number. This can cause confusion when looking at the same TCP session setup on both sides simultaneously.
TCP packet one:
The SYN flag is mostly known for being the “session opener”. However, it also has a specific meaning, which is to signal the initial sequence number that the client will use to send upcoming data.
TCP packet two:
In packet two, the server side responds and confirms that it received the client's initial sequence number. This is done by setting the ACK flag (represented here as a dot) and setting the acknowledgement field to the received sequence number plus one.
|Note to troubleshooters: Some network equipment has the “feature” of providing “sequence number randomization”, meaning the firewall changes the sequence number as packets pass through the device. The return packets are then rewritten as they pass through the same device. This behavior can cause confusion when looking at both ends simultaneously. With multiple stateful devices in the data path, or certain tunnel setups, the in-flight change of the sequence number can cause failures. Today, the end-points typically randomize their TCP sequence numbers properly themselves, so intermediate devices should not need to do it.
Still in packet two, the server also enables the SYN flag, meaning it desires to synchronize its selected server initial sequence number.
In packet number three of the three-way handshake, the client acknowledges the received sequence number by setting the ACK flag (represented by a dot) and setting the acknowledgement field to the received value plus one.
The ACK in packet two confirms the SYN in packet one.
The ACK in packet three confirms the SYN in packet two.
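The "plus one" relationship between the sequence numbers and acknowledgement numbers in the handshake can be checked with a few lines of code. The sketch below assumes we have read the absolute (not relative) sequence and acknowledgement values from the three packets. The function names and the sample numbers are hypothetical.

```python
SEQ_MODULUS = 2 ** 32  # TCP sequence numbers are 32 bits wide and wrap around


def expected_ack(initial_seq):
    """Return the acknowledgement number that confirms a SYN."""
    return (initial_seq + 1) % SEQ_MODULUS


def handshake_is_consistent(client_isn, server_isn, synack_ack, final_ack):
    """Check that packet two ACKs packet one, and packet three ACKs packet two."""
    return (synack_ack == expected_ack(client_isn)
            and final_ack == expected_ack(server_isn))


if __name__ == "__main__":
    # Hypothetical values read from a capture:
    # packet 1: SYN,     seq 1000
    # packet 2: SYN/ACK, seq 5000, ack 1001
    # packet 3: ACK,     ack 5001
    print(handshake_is_consistent(1000, 5000, 1001, 5001))
```

If a device in the path rewrites sequence numbers, this check can pass on each end-point individually while the raw numbers differ between the two captures.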
There are a number of possible TCP options that could be exchanged in the TCP handshake. The options are sent in packet number one and packet number two.
If we are certain that the three-way handshake completes correctly, it is possible to make the capture filter more fine-grained by capturing only the first two packets of the session. This can be accomplished by first clearing the current filter and then adding a new filter that matches only the first two, SYN-enabled, packets:
pktmon filter remove
pktmon filter add -t TCP SYN -p 443
pktmon filter list
(Note that the -p could be omitted to capture all TCP session establishment, or, obviously, the port number replaced with any arbitrary TCP port number.)
The MSS field announces the Maximum Segment Size. The MSS is the number of bytes remaining in the frame after the various headers have been removed. For example, on an Ethernet-based network, the original maximum frame size is 1518 bytes. After removing the overhead for Ethernet, IP, and TCP, the remaining bytes can be used as application payload. For TCP carried over IPv4 on Ethernet, this number is 1460 bytes.
The reason for signaling the MSS value is to avoid potential IP fragmentation if one of the nodes is located on a network with a lower MTU value.
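The 1460-byte figure can be reproduced with simple arithmetic. The sketch below assumes the standard header sizes without any IP or TCP options (14-byte Ethernet header, 4-byte Frame Check Sequence, 20-byte IPv4 header, 20-byte TCP header). These individual sizes are not stated in the text above, and the function name is hypothetical.

```python
ETHERNET_HEADER = 14  # destination MAC, source MAC, EtherType
ETHERNET_FCS = 4      # trailing Frame Check Sequence
IPV4_HEADER = 20      # assuming no IP options
TCP_HEADER = 20       # assuming no TCP options


def mss_for_frame(max_frame_size, ip_header=IPV4_HEADER):
    """Return the TCP payload space left in a frame of the given size."""
    mtu = max_frame_size - ETHERNET_HEADER - ETHERNET_FCS
    return mtu - ip_header - TCP_HEADER


if __name__ == "__main__":
    print(mss_for_frame(1518))  # 1460 for IPv4 on standard Ethernet
```

The same calculation shows why a tunnel in the path needs a lower MSS: every extra header that is added to the packet reduces the space left for the payload.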
|Note for troubleshooters: At times, equipment in the data path, especially VPN concentrators, can modify the MSS value in flight. That is, the original MSS set by one node can be altered on the way to the destination, “invisible” to the sender. This is done to decrease the MSS further, making room for the additional overhead of the VPN tunnel headers. If the tunnel devices do not reduce the MSS, performance issues caused by IP fragmentation might occur.
Worse, if the endpoint has set the IPv4 “Don't Fragment” flag, packets that are too large for the tunnel will be dropped.
Ensure that proper MSS adjustments are indeed performed by the VPN devices. Note that this will not be visible on the sender side, but must be verified on the remote end.
To verify if this occurs, use PKTMON on both end points and validate if the MSS sent is the same as the one being received, or has been adjusted.
The “nop” option means No Operation. It is inserted as padding, because the TCP options must be aligned so that the total TCP header length is a multiple of four bytes. These details will be explained in a separate article about the TCP header.
The PKTMON output “wscale” means the Window Scale option. It will define a number, e.g., 8. The receiving machine will calculate 2 raised to the power of 8.
For example, if a TCP node signals “wscale 8”, the remote machine will calculate this as 256.
This calculated value will then be used to calculate the “true value” of the Window Size field on all incoming frames.
The Window Size, simplified, informs the sender of the maximum number of bytes that the receiver is prepared to accept at a given point in time. This was used in the very early days of TCP/IP, when machines were typically equipped with little memory and slow CPUs. To avoid being overwhelmed, the receiver could reduce the “window size”, demanding that the sender slow down. The TCP protocol states that the sender cannot send more data than fits into the signaled window size.
The window size field is a fixed-size field in the TCP header, 16 bits long. With 16 bits, the highest number possible to express is 65535.
As hardware capabilities grew over time and network bandwidth improved, the TCP window size became a bottleneck. At no point could the sender have more than 64 KB of data in flight. The size of the field could not be altered, since millions of devices expect this exact field. However, a solution was found that uses the same field but asks the receiver to interpret it differently.
The two nodes have exactly one chance to inform the other party of the desire for “scaling”. This must be expressed in packets 1 and 2 of the TCP three-way handshake.
In the picture above, one of the nodes signals the window scale value of 8. The receiver calculates 2 raised to the power of 8, which equals 256. For all incoming packets in this session, the value in the “window size” field is automatically multiplied by 256.
Here, the remote side announces the capability to receive 128 KB of data in flight. Note that this value can now dynamically increase and decrease during the session lifetime. The only thing that cannot be altered is the scaling value itself.
The two sides will each signal their respective scale value. It is fully natural that the values are different. Each end-point in the TCP connection will interpret the incoming window-size values according to the initial scale-value from the remote side.
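The window scale calculation can be sketched as a left shift of the raw 16-bit window value. The function name is hypothetical, and the raw window value of 512 is an assumed example chosen so that the result matches the 128 KB mentioned above. The upper limit of 14 for the scale value comes from the TCP window scale specification (RFC 7323), not from this article.

```python
MAX_WINDOW_SCALE = 14  # largest shift count allowed by RFC 7323


def effective_window(raw_window, wscale):
    """Return the true receive window in bytes.

    raw_window is the 16-bit value from the TCP header, and wscale is the
    scale value the same node announced in its SYN or SYN/ACK packet.
    """
    if not 0 <= raw_window <= 0xFFFF:
        raise ValueError("the window field is 16 bits wide")
    if not 0 <= wscale <= MAX_WINDOW_SCALE:
        raise ValueError("the window scale must be between 0 and 14")
    # Shifting left by wscale is the same as multiplying by 2 ** wscale.
    return raw_window << wscale


if __name__ == "__main__":
    print(effective_window(512, 8))    # 131072 bytes, i.e. 128 KB
    print(effective_window(65535, 0))  # 65535 bytes without scaling
```

Remember that each direction uses its own scale value, so the window values sent by the client must be scaled with the client's wscale, and the server's values with the server's wscale.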
|Note to troubleshooters 1: If the “wscale” value is not set, or is set very low, e.g., 1, this will typically degrade network performance significantly. The effect is worse on paths that combine high delay with high bandwidth. Ensure that the wscale value is set on both sides and is set to a sufficiently high value.
|Note to troubleshooters 2: It is crucial to capture the initial SYN and SYN/ACK packets to be able to read the “true values” of the window size field in a TCP session. If the first two packets were not captured, it is not possible to determine whether a window scale issue is at hand. That is, without access to the Window Scale option value, a window size value seen in the conversation could look drastically low, while in reality, combined with the scale value, it signals a large receive buffer.
The third of the common TCP options is Selective Acknowledgement Permitted. PKTMON writes this as “sackOK”. This option may only be set in packets one and two of the TCP session, i.e., only in the SYN and the SYN-ACK packets.
The ability to support Selective Acknowledgement reduces the need for excessive resends of packets when a single segment is lost in transmission.
|Note to troubleshooters: Both sides must announce their support for Selective Acks for the feature to be used. If one of the sides does not signal “sackOK” in the first two packets of the TCP session setup, the feature will be unavailable. Especially on networks with high bandwidth, but also a higher round-trip time, the lack of SACK will be noticeable performance-wise. Ensure the “sackOK” is visible on both the SYN as well as the SYN-ACK packet.
Explicit Congestion Notification TCP flags:
Explicit Congestion Notification, ECN, is a relatively “new” addition to TCP. This feature, if supported by both ends in a TCP session, as well as in intermediate routers, can signal to the nodes the requirement to temporarily decrease their send rate to avoid drops in router queues.
The “new” TCP flags involved are:
PKTMON output “E”, is the ECE bit. ECE = Explicit Congestion Notification Echo.
PKTMON output “W”, is the CWR bit. CWR = Congestion Window Reduced.
This article will not describe the session usage of the Explicit Congestion Notification protocol, which also involves previously unused parts of the IPv4 header. However, we will discuss the TCP session setup.
During the session life-time, the ECE and the CWR bits have specific meaning, signaling detected congestion in the data path. However, when used in the TCP session establishment, they are used as a “negotiation”.
To negotiate support for ECN, the initiating node sets both the E and the W flag in the TCP SYN packet. This could be read as: “the sending node supports ECN”. If the receiving node also has ECN enabled, this is signaled by setting only the E flag in the SYN-ACK packet. The combination of SYN + ACK + ECE could be read as: “the target confirms its ability to use ECN”.
The target will never set the E flag in the SYN-ACK packet if the client did not set both the E and W flags in the TCP SYN packet.
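The negotiation rules above can be expressed as a short check on the flag strings of the first two packets. This is a minimal sketch that assumes the flags are written with the letters used in this article (S, E, W and a dot for ACK). The function name is hypothetical.

```python
def ecn_negotiated(syn_flags, synack_flags):
    """Return True if the first two handshake packets agree on using ECN.

    syn_flags and synack_flags are flag strings such as "SEW" or "S.E".
    """
    # The client offers ECN by setting SYN together with both E and W.
    client_offers = all(flag in syn_flags for flag in "SEW")
    # The server accepts by answering with SYN, ACK and E, but without W.
    server_accepts = (all(flag in synack_flags for flag in "S.E")
                      and "W" not in synack_flags)
    return client_offers and server_accepts


if __name__ == "__main__":
    print(ecn_negotiated("SEW", "S.E"))  # both sides support ECN
    print(ecn_negotiated("SEW", "S."))   # the server does not confirm ECN
    print(ecn_negotiated("S", "S."))     # the client never offered ECN
```

If the client sends "SEW" but the capture on the server side shows only "S", a device in the path may be stripping the flags.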
|Note to troubleshooters: The ECE and CWR flags in the TCP header are relatively “new” in the world of TCP. Various inspecting firewalls could, incorrectly, detect the presence of these flags to be suspicious, and drop the packets.
The length field at the far right of the command line is the TCP payload size. Note that this is different from the first length field on the left side of the command line, which displays the total frame size.
Typically, on an Ethernet-based network, the TCP payload size will be 1460 bytes. The maximum frame size is 1518 bytes, where the various headers will consume space, leaving 1460 for the actual application data.
Note that TCP does not need to always carry 1460 bytes in every frame. This is dependent on the application. Especially with “real-time” applications like Telnet, SSH, and RDP, very small payload sizes would be expected. However, for file transfers and similar, the full 1460 size should be used for optimal performance.
Note that the sequence numbers are not counters “per packet”, but per byte. For every byte carried by TCP, the sequence number increases by one, so a segment carrying 1460 bytes of payload advances the sequence number by 1460.
In the picture above, we can see how the PKTMON real-time output displays the sequence number range carried inside this very packet.
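The per-byte counting can be illustrated with a small calculation. The sketch assumes that the sequence number range is written in the tcpdump-style form `first:next`, where the second number is the sequence number that follows the last byte in the packet. This notation is an assumption, since the exact output format is not reproduced here, and the function names and sample values are hypothetical.

```python
SEQ_MODULUS = 2 ** 32  # sequence numbers wrap around after 32 bits


def next_sequence_number(first_seq, payload_length):
    """Return the sequence number expected after this segment."""
    return (first_seq + payload_length) % SEQ_MODULUS


def payload_length_from_range(seq_range):
    """Return the number of payload bytes in a range such as "1:1461"."""
    first, next_seq = (int(part) for part in seq_range.split(":"))
    return (next_seq - first) % SEQ_MODULUS


if __name__ == "__main__":
    print(next_sequence_number(1, 1460))        # 1461
    print(payload_length_from_range("1:1461"))  # 1460
```

With this notation, the length derived from the sequence number range should match the payload length shown at the far right of the same line.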
If you have any questions, please leave a comment below.