Network traffic analysis can be overwhelming. Even with a solid foundation, it’s not unusual for a packet capture to contain so much data that it’s difficult to get a sense for what’s going on. Improving your analysis techniques can mean the difference between wasting hours on a challenge and solving it in five minutes.

Prerequisites

This guide is aimed at players who have some networking experience. In particular:

You should have at least some experience with Wireshark, enough to know the basic UI.
You should understand how routing and IP addresses work.
You should understand the client-server model.
You should understand the OSI model.

If you don’t yet meet these prerequisites, check out John’s guide for newcomers.

Weeding out the junk

Harder challenges tend to contain realistic packet captures, meaning that most of the packets will be unrelated to your task. Usually there will be some hint as to what sort of traffic you’re after; your first step should always be to filter out anything that’s irrelevant.

In an ideal scenario, you can start with some assumptions about what traffic you’re after. For example, if the challenge pertains to an individual’s interactions with a specific website, you can be fairly certain you’re only looking for HTTP traffic on port 80. In this scenario, it wouldn’t make sense to include encrypted traffic, since you wouldn’t be able to read it. If the questions simply ask which websites someone visited, you would have to broaden your search a bit; you may need to include DNS (port 53) and HTTPS (port 443). Of course, you may find down the line that these assumptions are incorrect; perhaps there’s a web server running on port 8080. Even so, start your search with a reasonable set of assumptions, then broaden it if you can’t find the answers you need.

If you’re not sure exactly what protocol you’re after, you may have to narrow down the list manually. For example, if you’re after a document that someone printed, it could be using LPD, IPP, or something else entirely. Wireshark does allow you to view statistics about what protocols were used (Statistics -> Protocol Hierarchy), but I often find that it’s helpful to manually filter out traffic protocol-by-protocol; this gives me a chance to look at a selection of packets from each protocol to get a sense of what might be relevant to the challenge. To accomplish this, I’ll start by viewing all packets, then filtering out the protocols that account for the majority, typically HTTP, SSL/TLS, and DNS. I’ll start building a query that looks like !ssl && !http && !dns && !(tcp.port == 80), adding each filter to the list one at a time. (Note that I’ve included both !http and !(tcp.port == 80). Wireshark doesn’t always know all the protocols with which any given packet is associated, so multiple filters may be necessary. Also note that Wireshark 3.0 renamed the ssl filter to tls, so on recent versions you may need !tls instead.) As I add protocols to the list, I’ll carefully examine a selection of the remaining packets to see if there’s anything interesting; if there isn’t, I’ll continue filtering them out.
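The one-at-a-time approach above can be sketched in a few lines of Python. This is only an illustration of how the exclusion query grows: build_exclusion_filter is a hypothetical helper, not part of Wireshark, and the output is meant to be pasted into the display filter bar by hand.

```python
def build_exclusion_filter(excluded):
    """Join display-filter expressions into one 'everything except these' filter."""
    parts = []
    for expr in excluded:
        # A bare name such as "dns" can be negated directly; anything with
        # spaces or operators is wrapped in parentheses before negating.
        if expr.replace(".", "").replace("_", "").isalnum():
            parts.append("!" + expr)
        else:
            parts.append("!(" + expr + ")")
    return " && ".join(parts)


# Add one exclusion at a time, printing the filter to try after each step.
excluded = []
for expr in ["ssl", "http", "dns", "tcp.port == 80"]:
    excluded.append(expr)
    print(build_exclusion_filter(excluded))
```

The last line printed is the same query shown above. After each step, the idea is to apply the filter and skim what remains before deciding what to exclude next.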

Analyzing what’s left

While Wireshark is usually pretty good about dissecting packets, it’s not perfect, and it can’t always reassemble everything. This means you may have to consult the specifications for the relevant protocols and manually decode packets on your own. If some packets aren’t being reassembled correctly, you may need to alter your filters a bit so you can see them, especially if Wireshark isn’t grouping them under the correct layer-7 protocol.
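As a small example of decoding by hand, here is a sketch that parses the 8-byte UDP header laid out in RFC 768 (source port, destination port, length, and checksum, each 16 bits in network byte order). The sample bytes are made up for illustration; in practice you would copy them from Wireshark’s hex pane, and a real challenge would call for the layout of whatever protocol is actually in play.

```python
import struct


def parse_udp_header(segment: bytes) -> dict:
    """Decode the 8-byte UDP header defined in RFC 768."""
    if len(segment) < 8:
        raise ValueError("a UDP header is 8 bytes long")
    # Four unsigned 16-bit fields in network (big-endian) byte order.
    src_port, dst_port, length, checksum = struct.unpack("!HHHH", segment[:8])
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "length": length,  # header plus payload, in bytes
        "checksum": checksum,
        "payload": segment[8:length],
    }


# Hypothetical datagram: port 54321 -> port 9999, carrying b"hello".
raw = bytes.fromhex("d431 270f 000d 0000 68656c6c6f")
print(parse_udp_header(raw))
```

The same pattern (read the spec, write down the field sizes, unpack with struct) carries over to most binary protocols, though variable-length fields take more care.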

When Wireshark’s dissectors do work correctly, you may find that you want to quickly view the value of a field for a lot of packets, but you don’t want to manually scroll through all of them to look at the detailed dissection. Instead, you can add your own custom columns to Wireshark that show the value of any protocol field. This can be particularly useful for dealing with SNI or HTTP headers, for example.
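Custom columns are set up in the GUI, but a similar per-packet view of a field can be approximated with tshark’s field output. The sketch below only builds the command line, using tshark’s -r, -Y, -T fields, and -e options; capture.pcap is a placeholder file name, and the SNI field name shown is the one used by recent Wireshark versions (older releases used an ssl. prefix instead of tls.).

```python
import shlex


def tshark_fields_command(pcap, display_filter, fields):
    """Build a tshark command that prints the given fields, one line per packet."""
    cmd = ["tshark", "-r", pcap, "-Y", display_filter, "-T", "fields"]
    for field in fields:
        cmd += ["-e", field]
    return cmd


sni = "tls.handshake.extensions_server_name"
cmd = tshark_fields_command(
    "capture.pcap",  # placeholder capture file name
    sni,             # keep only packets where the SNI field is present
    ["frame.number", "ip.dst", sni],
)
print(shlex.join(cmd))
```

Running the printed command in a shell should list the frame number, destination address, and server name for each matching packet, which is roughly what the equivalent custom columns would show.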

Extracting data from Wireshark

The hardest challenges often require you to reassemble a file from the packet capture. For example, if the packet capture shows someone downloading a file from a web server, you might need to obtain that file.

In an ideal scenario, you can accomplish this via Wireshark’s File -> Export Objects menu. This works well for a handful of common protocols, but it won’t work if Wireshark is having trouble reassembling packets, and if you’re looking at a more obscure protocol, you’ll be out of luck.

The approaches for handling these scenarios vary:

If you had more time, you could write your own dissector, but that’s not ideal for a timed competition.
If you’re in a hurry and the file is small, you may be able to simply copy the bytes from the relevant packets and manually reassemble them in a hex editor. You’ll need to pay close attention to the protocols at hand; packets are often sent out-of-order or transmitted multiple times, and different protocols have different ways of handling these scenarios. The Analyze -> Follow menu may be able to help with TCP-based protocols, but if you’re dealing with UDP, you’ll likely need to go packet-by-packet.
Wireshark’s command-line counterpart, tshark, can be used to automatically extract data from relevant packets. tshark is quite powerful and deserves a guide of its own, but the basic idea is to perfect your filter in Wireshark, run it in tshark, and pipe the output to other tools like jq that will allow you to automatically sort and reassemble the data into a single file. This has a bit of a learning curve, so you should practice with tshark and jq prior to the actual competition. You’ll need to be comfortable with command-line tools and capable of some basic scripting.
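To make the sort-and-reassemble step concrete, here is a rough sketch with Python standing in for jq. It assumes the input lines came from something like tshark -T fields -e tcp.seq -e tcp.payload for a single direction of a single TCP stream, which gives a sequence number, a tab, and a hex payload per packet. The sample lines are invented, and the sketch deliberately ignores overlapping segments and sequence number wraparound.

```python
def reassemble(lines):
    """Rebuild a byte stream from 'sequence<TAB>hex payload' lines."""
    segments = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        seq_text, _, hex_payload = line.partition("\t")
        if not hex_payload:
            continue  # no payload on this packet (for example, a bare ACK)
        # Some tshark versions print payload bytes separated by colons.
        data = bytes.fromhex(hex_payload.replace(":", ""))
        # Keep the first copy of each sequence number; repeats are retransmissions.
        segments.setdefault(int(seq_text), data)
    # Sorting by sequence number undoes out-of-order delivery.
    return b"".join(segments[seq] for seq in sorted(segments))


# Invented sample: out of order, one retransmission, one empty packet.
sample = ["6\t776f726c64", "1\t68:65:6c:6c:6f", "6\t776f726c64", "11\t"]
print(reassemble(sample))
```

In practice you would feed the function tshark’s output line by line and write the returned bytes to a file in binary mode, then check the result with a tool such as file before trusting it.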