The TcpAnalyzer is nearly complete. I finally solved some comprehension and algorithmic issues with TcpAnalyzer and now have everything working perfectly, both conceptually and in the implementation. I wrote it in Java, but I plan to reimplement it in native C for performance reasons once it's all done and working correctly.
There were a few areas of the analyzer where it felt like I was chasing my tail; with bi-directional streams it's easy to spin yourself completely around in the wrong direction. But what was giving me all the headaches turned out to be simple: I forgot to copy the packet buffer to a new memory location before putting the packet on the queue. Packet objects placed on segment timeout queues would change their internal state "magically" and cause all kinds of havoc.
I suspect this kind of gotcha is going to be very common with jNetPcap: people forget to make a copy of the packet from the shared buffer and then spend hours or days hunting a phantom problem. I'm considering changing the API so that the default behavior is to make the copy before returning packets to the user from the Pcap.loop and Pcap.dispatch methods, while providing a way to disable the copy for advanced users who want to utilize the shared buffers. I think that is the best approach for general users who may not be aware that they have to make a copy first. Advanced users won't lose that neat feature, or the performance gain of no-copy mode, since they will have a way to enable or disable the default copy.
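The pitfall is easy to reproduce without jNetPcap at all. The sketch below uses a plain java.nio.ByteBuffer as a stand-in for the shared capture buffer. SharedBufferDemo and its methods are hypothetical names made up for illustration; they are not part of the jNetPcap API.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.Queue;

class SharedBufferDemo {

    /** Returns a view that still points into the shared buffer (the bug). */
    static ByteBuffer withoutCopy(ByteBuffer shared) {
        return shared.duplicate(); // same backing memory, separate position/limit
    }

    /** Copies the bytes into private memory before queuing (the fix). */
    static ByteBuffer withCopy(ByteBuffer shared) {
        ByteBuffer src = shared.duplicate();
        ByteBuffer copy = ByteBuffer.allocate(src.remaining());
        copy.put(src);
        copy.flip();
        return copy;
    }

    /** Simulates the capture library reusing the buffer for the next packet. */
    static void overwrite(ByteBuffer shared, byte value) {
        for (int i = 0; i < shared.limit(); i++) {
            shared.put(i, value); // absolute put, position is not changed
        }
    }

    public static void main(String[] args) {
        ByteBuffer shared = ByteBuffer.allocate(4);
        overwrite(shared, (byte) 1);            // "packet 1" arrives

        Queue<ByteBuffer> timeoutQueue = new ArrayDeque<>();
        timeoutQueue.add(withoutCopy(shared));  // queued without a copy
        timeoutQueue.add(withCopy(shared));     // queued with a copy

        overwrite(shared, (byte) 2);            // "packet 2" reuses the buffer

        System.out.println("no copy: " + timeoutQueue.remove().get(0));
        System.out.println("copy:    " + timeoutQueue.remove().get(0));
    }
}
```

After the buffer is reused, the entry queued without a copy reads back the new packet's bytes, while the copied entry still holds the original ones. That is exactly the "magic" state change described above.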
I'm using Wireshark to decode and analyze capture files and then comparing its analysis to my own. I'm finding errors in the analysis Wireshark produces when decoding TCP streams with packets that are out of order.
Specifically, in the "tests/test-http-jpeg.pcap" file, Wireshark reports "TCP previous segment lost" and then, shortly after, a "TCP retransmission" that supposedly retransmitted the last segment. This analysis is incorrect.
I'm not trying to put down Wireshark in any way. It's an awesome piece of software and I depend on it heavily for the development of jNetPcap, and the capture file in question has incomplete data, which creates a very complex case for analysis. That is exactly why I'm interested in it: jNetPcap should be able to handle it.
In particular I'm analyzing frame numbers 9-23 + 74 (which contains the last TCP FIN packet).
ACKs   | Frame #s          | Seq #/Length
------------------------------------------
       +-----+
16     |15,17|               1/0
       +-----+--+--+
20     |     |  | ?|         1/1460
       +-----+--+--+--+
       |     |  |  |21|      1461/1219
       +-----+--+--+--+
22,23  |     |19|  |  |      2681/5
       +-----+--+--+--+--+
       |     |  |  |  |74|   2686/1
       +-----+--+--+--+--+
         [time ==>]
According to my own analysis, no segments are being lost. Instead, one TCP segment in particular is fragmented by the IP layer and the two pieces are sent via different routes. The second IP fragment is what is recorded in the file (#18), while the first fragment (which carries the TCP header, I might add) is sent over an alternate path and is not present in the capture file.
I have the TCP analyzer working. It tracks seq/ack and a bunch of other stuff. It currently works well with the happy case, where everything goes smoothly between client and server.
Now I'm working on making it handle all the unhappy cases: duplicate ACKs, out-of-order segments, timed-out segments, etc.
It's a very complex protocol, and from an analyzer's perspective it's even more complex. An analyzer sits between client and server, sender and receiver. It sees things that the sender sends out but that the receiver potentially never receives. It has to keep virtual state for both sender and receiver, and vice versa in the opposite direction.
Nonetheless, it's starting to look really good. It keeps track of seq and ack numbers and receive windows. There is a ton of events and analysis it can attach. For now, every TCP packet gets a TcpDuplexStream analysis tag. This analysis object contains 2 TcpStream objects, one for each direction of the duplex stream. Each TcpStream keeps track of various TCP parameters and their progress as TCP stream time ticks on (i.e. as packets are received). Detected errors will also be attached, as well as (if the user requests it) a snapshot of the TCP stream progress for every packet. This would allow one to graph any of the TCP parameters on a stream-by-stream basis. For example, you could graph how much unacknowledged data remains in the sender's send buffer and its relationship to the receiver's receive window. Time-based analysis and graphing also become possible, since you now know exactly how long it took to acknowledge sent segments, and when and how retransmissions occurred. All of this is available per packet, on a per-stream basis.
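To make the "two streams per connection" idea concrete, here is a minimal sketch of how unacknowledged data could be tracked per direction. It is not the actual TcpDuplexStream/TcpStream code: the class names TcpStreamSketch and TcpDuplexStreamSketch and all their methods are hypothetical, and the sketch assumes relative sequence numbers and ignores wraparound, SYN/FIN accounting and window tracking.

```java
/** Tracks one direction of a connection (hypothetical, simplified). */
class TcpStreamSketch {
    private long nextSeq;     // highest sequence number sent so far, plus one
    private long lastAck;     // highest cumulative ACK seen from the peer
    private boolean started;  // true once a segment was seen in this direction

    /** Called for a segment travelling in this direction. */
    void onSegment(long seq, int length) {
        if (!started) {
            nextSeq = seq;
            lastAck = seq;
            started = true;
        }
        nextSeq = Math.max(nextSeq, seq + length);
    }

    /** Called for an ACK travelling in the opposite direction. */
    void onAck(long ack) {
        if (started) {
            lastAck = Math.max(lastAck, ack);
        }
    }

    /** Bytes sent in this direction that the peer has not acknowledged yet. */
    long unackedBytes() {
        return Math.max(0L, nextSeq - lastAck);
    }
}

/** Pairs the two directions, as seen from the analyzer in the middle. */
class TcpDuplexStreamSketch {
    final TcpStreamSketch clientToServer = new TcpStreamSketch();
    final TcpStreamSketch serverToClient = new TcpStreamSketch();

    /** Every packet advances one direction and acknowledges the other. */
    void onPacket(boolean fromClient, long seq, int length, long ack) {
        TcpStreamSketch forward = fromClient ? clientToServer : serverToClient;
        TcpStreamSketch reverse = fromClient ? serverToClient : clientToServer;
        forward.onSegment(seq, length);
        reverse.onAck(ack);
    }
}
```

Sampling unackedBytes() after every packet gives the kind of per-packet snapshot that could be graphed against the receive window.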
I didn't get to work on TCP today due to some other tasks that came up. I have been thinking heavily over the last few days, though, about TCP analyzers and how I want to design them.
There is lots and lots of analysis that TCP can provide, but we have to take performance into consideration. The TCP analysis will be finely partitioned, with the user having the option of enabling as much or as little of it as is necessary.
My priority is to create the main stream and TCP ack analyzers that the fragmentation analyzer and segment reassembler will require. Higher-level protocols will be able to tell TCP how much of the data to reassemble.
I start work on this tomorrow.
I have the Ip4 analyzer and reassembler working. These are 2 separate classes. Ip4FragmentationAnalyzer tags IP packets with FragmentSequence analysis objects, while Ip4Reassembler takes those fragments, reassembles them in a new buffer and creates a new IP PDU packet. Here is code from a JUnit test case:
public void testIp4Reasemble() throws IOException {
    ip4Defrag = new Ip4Reassembler(ip4Analyzer);
    controller.add(new JPacketHandler
and output from the last reassembled packet:
Frame #243
Ip: ******* Ip4 - "ip version 4" - offset=0 (0x0) length=20
Ip:
Ip: version = 4
Ip: hlen = 5 [5 * 4 = 20 bytes, No Ip Options]
Ip: diffserv = 0x0 (0)
Ip:     0000 00.. = [0] code point: not set
Ip:     .... ..0. = [0] ECN bit: not set
Ip:     .... ...0 = [0] ECE bit: not set
Ip: length = 5720
Ip: id = 0x252 (594)
Ip: flags = 0x2 (2)
Ip:     0.. = [0] reserved
Ip:     .1. = [1] DF: do not fragment: set
Ip:     ..0 = [0] MF: more fragments: not set
Ip: offset = 0
Ip: ttl = 254 [time to live]
Ip: type = 17 [next: udp]
Ip: checksum = 0x0 (0)
Ip: source = 131.151.1.146
Ip: destination = 131.151.32.21
Ip:
So far I have everything working better than expected. Ip4Analyzer is working. It's not done yet, but all the mundane "package" details for analysis are in place, and the actual analyzers are now the easy part.
Here is output from 4 Ip fragments. Not reassembled yet, but already analyzed and prepared for reassembler:
Ip: ******* Ip4 - "ip version 4" - offset=14 (0xE) length=20
Ip:
Ip: version = 4
Ip: hlen = 5 [5 * 4 = 20 bytes, No Ip Options]
Ip: diffserv = 0x0 (0)
Ip:     0000 00.. = [0] code point: not set
Ip:     .... ..0. = [0] ECN bit: not set
Ip:     .... ...0 = [0] ECE bit: not set
Ip: length = 1280
Ip: id = 0x23D (573)
Ip: flags = 0x2 (2)
Ip:     0.. = [0] reserved
Ip:     .1. = [1] DF: do not fragment: set
Ip:     ..0 = [0] MF: more fragments: not set
Ip: offset = 555 [555 * 8 = 4440 bytes]
Ip: ttl = 254 [time to live]
Ip: type = 17 [ip fragment of udp PDU]
Ip: checksum = 0x4AAF (19119)
Ip: source = 131.151.1.146
Ip: destination = 131.151.32.21
Ip:
Ip: *** Fragment Sequence analysis ***
Ip: Status: all fragments found
Ip: Frame #0: offset=   0-1479, len=1500, dts=0.00 us, flags=[DF, MF]
Ip: Frame #1: offset=1480-2959, len=1500, dts=457.00 us, flags=[DF, MF]
Ip: Frame #2: offset=2960-4439, len=1500, dts=193.00 us, flags=[DF, MF]
Ip: Frame #3: offset=4440-5699, len=1280, dts=85.00 us, flags=[DF]
I am very happy with how this is working out. I will finish the Ip4 and tcp analyzers and reassemblers this weekend.
PS: I will be increasing the timestamp resolution to nanoseconds. This is a little moot with Pcap, since both libpcap live captures and offline captures only support microsecond resolution.
Have been working on analyzers and the new package org.jnetpcap.analysis.
Ip4FragmentAnalyzer is nearly complete. It keeps track of IP fragments and attaches FragmentSequence analysis information to each fragment. TCP segments are going to be handled the same way, reusing the same analysis class. Ip4Reassembler is going to be trivial, because all it does is watch for packets that have a complete FragmentSequence attached. When one does, the call FragmentSequence.getPacketSequence() returns a list of the packets that make up the complete datagram. All the reassembler has to do is copy them to a new buffer and scan the result as a new packet. All the analysis is already done by Ip4FragmentAnalyzer.
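The copy step itself can be sketched in a few lines. This is an illustration only, not the Ip4Reassembler code: ReassemblySketch and its Fragment class are hypothetical, and the sketch assumes the fragment list is already known to be complete, which is the job the fragment analyzer does beforehand.

```java
import java.util.List;

class ReassemblySketch {

    /** One fragment: its byte offset in the original datagram and its payload. */
    static final class Fragment {
        final int offset;
        final byte[] payload;

        Fragment(int offset, byte[] payload) {
            this.offset = offset;
            this.payload = payload;
        }
    }

    /**
     * Copies every fragment payload to its offset in a new buffer.
     * Assumes the sequence is complete; arrival order does not matter.
     */
    static byte[] reassemble(List<Fragment> fragments) {
        int total = 0;
        for (Fragment f : fragments) {
            total = Math.max(total, f.offset + f.payload.length);
        }
        byte[] datagram = new byte[total];
        for (Fragment f : fragments) {
            System.arraycopy(f.payload, 0, datagram, f.offset, f.payload.length);
        }
        return datagram;
    }
}
```

Fed with four payloads at offsets 0, 1480, 2960 and 4440 (the layout in the fragment sequence output above), this produces a single 5700-byte payload that can then be scanned as a new packet.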
The hard part was actually how to attach information to packets. Packets can be copied into a raw byte buffer or byte[] and copied back out to create new packets that are fully scanned. I also like the "peer" API a lot, with how the headers are bound today. JAnalysis objects are bound the same way now, even when they contain references to other Java objects.
I will tidy things up tomorrow morning and check in what I have. Once the Ip4 protocol is done with analysis and reassembly, I'll do TCP next.
The way analyzers come into play is that the user instantiates an AnalyzerController object, adds whatever JAnalyzer-based classes are wanted to it, and then passes the controller to Pcap.loop or Pcap.dispatch. The controller implements the JPacketHandler interface. It also accepts a JHandlerHandler as a listener, which is where the user gets to see the analyzed packets. It sits right in between Pcap.loop and the user's JPacketHandler. Analyzers can also dispatch numerous different events, depending on the analyzer; listening for those is another way to attach to the analyzers.
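The arrangement described here is a handler that wraps another handler. The sketch below shows only that shape. The interfaces and the ControllerSketch class are hypothetical stand-ins with simplified signatures, not the real AnalyzerController, JAnalyzer or JPacketHandler types.

```java
import java.util.ArrayList;
import java.util.List;

/** Stand-in for a packet callback (hypothetical, simplified signature). */
interface PacketHandlerSketch<P> {
    void nextPacket(P packet);
}

/** Stand-in for an analyzer that inspects or annotates a packet. */
interface AnalyzerSketch<P> {
    void analyze(P packet);
}

/** Sits between the capture loop and the user's handler. */
class ControllerSketch<P> implements PacketHandlerSketch<P> {
    private final List<AnalyzerSketch<P>> analyzers = new ArrayList<>();
    private final PacketHandlerSketch<P> listener;

    ControllerSketch(PacketHandlerSketch<P> listener) {
        this.listener = listener;
    }

    /** Registers an analyzer; analyzers run in registration order. */
    ControllerSketch<P> add(AnalyzerSketch<P> analyzer) {
        analyzers.add(analyzer);
        return this;
    }

    /** Runs every analyzer, then hands the analyzed packet to the listener. */
    @Override
    public void nextPacket(P packet) {
        for (AnalyzerSketch<P> analyzer : analyzers) {
            analyzer.analyze(packet);
        }
        listener.nextPacket(packet);
    }
}
```

Because the controller is itself a packet handler, the capture loop does not need to know that analysis is happening; it just calls nextPacket as it would on the user's own handler.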
Here is what has been found in rc4.
.so soft link to .so.rc4 main lib file. Will add that link in post-install to debian package control file.
Added the org.jnetpcap.nio.JReference class. This class is needed to track JNI global object references. For those who are familiar with how JNI works: once you allocate a JNI global reference, it's up to your application to release it when it's no longer needed.
The JMemory class already keeps track of allocated memory very well, but it needed more in order to keep track of JNI global references. Especially with how the packet_state_t and header_state_t structures are set up, you couldn't just call JNIEnv->NewGlobalRef and set the reference in the packet or header state structures. Now JMemory works with JReference (allocating one if it's needed) to keep track of JNI global references. During the peering process, the memory owner's JReference object is passed to the peer. When all the objects that used those JNI references natively expire, the JNI references are deleted by the JReference destructor method.
Anyway, it's a little complex, but the bottom line is that JNI global references to Java objects can now be safely created and they will be released, just like native memory is released.
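The ownership idea (one shared token handed to every peer, with cleanup running once the last user is gone) can be illustrated with explicit reference counting. To be clear, this is a swapped-in technique for illustration: SharedReferenceSketch is a hypothetical class, it does not touch JNI, and it does not claim to match how JReference decides that its users have expired.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Runs a release action once the last holder lets go (hypothetical sketch). */
class SharedReferenceSketch {
    private final AtomicInteger holders = new AtomicInteger(1); // the creator
    private final Runnable releaseAction;

    SharedReferenceSketch(Runnable releaseAction) {
        this.releaseAction = releaseAction;
    }

    /** Called when the reference is passed on to a peer. */
    SharedReferenceSketch share() {
        holders.incrementAndGet();
        return this;
    }

    /** Called by each holder when it no longer needs the reference. */
    void release() {
        if (holders.decrementAndGet() == 0) {
            // In the JNI case this is where the global reference would be deleted.
            releaseAction.run();
        }
    }

    int holderCount() {
        return holders.get();
    }
}
```

The owner creates the token, each peer obtains it through share(), and the release action runs exactly once, after the final release() call.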
Also added a JAnalysis jobject reference to the packet_state_t and header_t structures. This allows hierarchical analysis objects to be attached to packets and headers.
Here is debug output from a test method that sets 2 JAnalysis objects using JPacket.State.setAnalysis() and JHeader.State.setAnalysis() methods:
JMemory: JMemory@af4baee: size=168 bytes
JMemory: owner=nio.JMemoryPool$Block.class(size=10240/offset=2590)
JMemory: references(capacity=3, @25c798, @256cf4, @0)
JPacket.State: sizeof(packet_state_t)=104
JPacket.State: sizeof(header_t)=16 and *4=64
JPacket.State: pkt_header_map=0x27
JPacket.State: pkt_header_count=4
JPacket.State: pkt_headers[0]=[hdr_id=1 ETHERNET ,hdr_offset=0 ,hdr_length=14]
JPacket.State: pkt_headers[1]=[hdr_id=2 IP4 ,hdr_offset=14 ,hdr_length=20]
Flows are done. Now I'm working on analysis. What we want is a set of analyzers that can monitor packet streams, analyze them and record information about them down to the header and field level.
The analysis may happen in native land by native analyzers, or in java land by java analyzers. The core protocols would have native analyzers provided while user defined ones will be in java.
I think I have an API that will work well with the headers. It uses similar accessor methods: instead of hasHeader you ask hasAnalysis. You preallocate the types of analysis objects you are interested in, and if analyzers created output and attached it to packets, headers or fields, you can retrieve it.
The best and easiest example of this would be packet sequences. A packet sequence analyzer discovers which packets are IP fragments or TCP segments. This information is attached to the packet. So if an IP datagram is broken up into 4 fragments, each packet that is an IP fragment will have a FragmentSequence analysis object attached to it. That object holds a linked list of the other 3 packets that make up the rest of the fragments. At that point you can easily access any of the other 3 packets and do further analysis or whatever else you need. An IP reassembler would be able to reassemble the IP datagram from all the fragments without having to do any analysis of its own. It's a way to break the complex tasks that protocol analysis involves into smaller chunks, while still allowing the user to read that analysis information very easily. How easily? Let's take a look at an example.
public void testAnalysisSyntax() {
    JPacket packet = TestUtils.getPcapPacket(HTTP, 5);
    Ethernet eth = new Ethernet();
    Ip4 ip = new Ip4();
    Tcp tcp = new Tcp();
    FragmentSequence ipSequence = new FragmentSequence();
    FragmentSequence tcpSequence = new FragmentSequence();
    final HeaderAnalysis ethValidation = new HeaderAnalysis();
/*