PAS-26: Generate list of observed unresolvable FQDNs



Issue Information

Issue Type: New Feature
Priority: Major
Status: Open

Reported By: Ben Tasker
Assigned To: Ben Tasker
Project: PCAP Analysis Script (PAS)
Resolution: Unresolved
Affects Version: 0.1
Target version: 0.1
Components: SSL/TLS

Created: 2016-02-03 11:04:59
Time Spent Working
Estimated: 90 minutes
Remaining: 4 minutes
Logged: 86 minutes


Description
I want the script to extract a list of FQDNs from SNI, as well as from any returned certificate Common Names (and SANs), and then attempt to resolve each of them.

For any that are unresolvable, the FQDN and the associated IP and port should be recorded.
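
A minimal sketch of that resolve-and-record step, assuming a system where `getent hosts` is available (host or dig could be swapped in). The function names and the tab-separated input layout (FQDN, IP, port) are hypothetical and are not taken from the script.

```bash
# Hypothetical helper: succeeds if the name resolves via the system resolver.
# Assumes getent is available; the real script may use a different tool.
is_resolvable() {
    getent hosts "$1" > /dev/null 2>&1
}

# Reads tab-separated "FQDN<TAB>IP<TAB>port" rows on stdin (assumed layout)
# and prints a CSV row for each FQDN that does not resolve.
list_unresolvable() {
    tab=$(printf '\t')
    while IFS="$tab" read -r fqdn ip port
    do
        # Skip blank lines
        [ -n "$fqdn" ] || continue
        if ! is_resolvable "$fqdn"
        then
            printf '%s,%s,%s\n' "$fqdn" "$ip" "$port"
        fi
    done
}
```

Usage would be along the lines of `list_unresolvable < names.tsv > unresolvable.csv`, where both file names are placeholders.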



Activity


There are two reasons why this information may be interesting:

Unadvertised Services

Seeing (for example) an HTTPS connection go out with foo.bar.invalid as the SNI FQDN suggests that the destination server has a service responding to that FQDN. That it's not being advertised in DNS makes it potentially interesting, as an apparent attempt has been made to keep it hidden from public view.


Identifying Tor Usage

Depending on the version of the Tor client the user is using, the value of the SNI and the certificate Common Name during an SSL handshake can help identify connections to a Tor Entry Guard.

The SNI FQDN tends to be a random (at least in appearance) string (e.g. www.3avkpvvrqtgkdk.com), with the guard then returning a different string as the cert Common Name (e.g. www.52iaby6bzurz7c4gugy.net).

So within tshark we'd be looking at the following
- ssl.handshake.extensions_server_name
- x509sat.printableString
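
As a rough sketch, the two fields listed above could be pulled out with something like the following. The function name and the choice of accompanying columns (stream index, addresses, ports) are assumptions for illustration, not what the script currently does. Newer Wireshark releases renamed the ssl.* fields to tls.*, so the field names may need adjusting.

```bash
# Hypothetical wrapper: prints one tab-separated row per packet that carries
# either an SNI name or a certificate printableString.
# $1: path to the PCAP file
extract_handshake_names() {
    tshark -r "$1" \
        -Y 'ssl.handshake.extensions_server_name or x509sat.printableString' \
        -T fields -E separator=/t \
        -e tcp.stream \
        -e ip.src -e ip.dst \
        -e tcp.srcport -e tcp.dstport \
        -e ssl.handshake.extensions_server_name \
        -e x509sat.printableString
}
```

The SNI name arrives in the client's hello and the certificate strings in the server's reply, so the two show up on separate rows with source and destination swapped. The tcp.stream column is there so they can be tied back together.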

The presence of two non-resolvable FQDNs in one handshake is a reasonable starting indicator that the connection may be Tor related. For further confirmation, any IPs meeting that criterion could then be cross-compared against the publicly available list of Tor nodes.
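
The cross-comparison could be as simple as the sketch below. It assumes the candidate IPs and a locally saved copy of the Tor node list are each stored one IP per line; the function name and both file arguments are hypothetical.

```bash
# Prints candidate IPs that also appear in a locally saved list of Tor nodes.
# $1: file of candidate IPs, $2: file of known Tor node IPs (one per line each)
match_tor_nodes() {
    # -F: fixed strings, -x: whole-line matches only, -f: read patterns from file
    sort -u "$1" | grep -Fxf "$2"
}
```

The -x matters here: without it, 192.0.2.1 in the node list would also match 192.0.2.10 in the candidates.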

Whilst this method is more reliable than relying on destination port numbers (as a node's ORPort is configurable), it is contingent on tshark recognising that the SSL dissector should be applied to the stream, which won't be the case for some ports. We may need to look into sane ways of forcing that. To begin with, though, simply grabbing the low-hanging fruit is a reasonable starting point.
Note that PAS-17 implemented a configuration option to only allow passive checks. Given that this feature will be contingent on attempting to resolve any FQDN observed in the PCAP, it falls firmly in the active check category, so will need to honour the configuration option PASSIVE_ONLY.
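
A sketch of how the lookups might be gated. PASSIVE_ONLY is the option named above, but the "y" value convention and both function names are assumptions, as the issue doesn't show how the script tests the option.

```bash
# Assumption: PASSIVE_ONLY is set to "y" in the config to disable active checks.
active_checks_allowed() {
    [ "${PASSIVE_ONLY:-n}" != "y" ]
}

# Hypothetical entry point for the lookup stage
run_fqdn_lookups() {
    if ! active_checks_allowed
    then
        printf '\tSkipping FQDN lookups (PASSIVE_ONLY is set)\n'
        return 0
    fi
    printf '\tResolving observed FQDNs\n'
    # ... the DNS lookups would go here ...
}
```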
Need to double check that it won't break anything, but we should be able to add extraction of CNs/SANs here: https://github.com/bentasker/PCAPAnalyseandReport/blob/c157a45136b1fa11f9a2c09b67e1779530d24c01/PCAP_Analysis.sh#L412

We would then need to adjust processing of that temp file (https://github.com/bentasker/PCAPAnalyseandReport/blob/c157a45136b1fa11f9a2c09b67e1779530d24c01/PCAP_Analysis.sh#L499) to pull them out.

If we add an additional printf into that loop, we can simply write the IP, Port, SNI Name, CN and SANs out to a separate report file for processing later.

In effect, it'll be very similar to the existing visitedsites.csv report, except that that report doesn't include certificate CN/SANs. An alternative might simply be to update that report to include the source for an FQDN (i.e. Host Header, SNI Hostname, Cert CN, Cert SAN) and then use that as an input to avoid duplication.

No part of that is active, so we don't need to honour PASSIVE_ONLY at this point.
That looks like a reasonable way of approaching it, actually. visitedsites.csv is currently generated as follows:
printf '\tBuilding FQDN list\n'
cat ${TMPDIR}/httprequests.txt ${TMPDIR}/sslrequests.txt | awk -F '	' '{print $8}' | sort | uniq > "${REPORTDIR}/visitedsites.csv"


So splitting that into multiple parts so that we can include a source identifier makes sense. We can then look at adjusting the generation of sslrequests.txt so that there's a column for CN, SANs etc. (or, if needed, dump them out to a separate temp file).
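
Roughly, the split might look like the sketch below. TMPDIR, REPORTDIR and column 8 come from the existing command, but the function name, the source labels and the comma-separated two-column output are assumptions. Rows with an empty name are skipped so that a source label never appears on its own.

```bash
# Sketch only: writes "FQDN,source" rows, one awk pass per input file so
# that each name can be tagged with where it was seen.
build_visited_sites() {
    {
        awk -F '\t' -v src='Host Header' '$8 != "" { print $8 "," src }' \
            "${TMPDIR}/httprequests.txt"
        awk -F '\t' -v src='SNI' '$8 != "" { print $8 "," src }' \
            "${TMPDIR}/sslrequests.txt"
    } | sort -u > "${REPORTDIR}/visitedsites.csv"
}
```

Certificate CNs and SANs could then be added as a third pass once they're available in a temp file.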

Repo: PCAPAnalyseandReport
Commit: 291593330f0a39ca07fc53d04a3b0a8943fb0028
Author: Ben Tasker <github@<Domain Hidden>>

Date: Wed Feb 03 11:50:55 2016 +0000
Commit Message: Added source identifier to visitedsites.csv. See PAS-26



Modified (-)(+)
-------
Docs/Reports.md
PCAP_Analysis.sh




Webhook User-Agent

GitHub-Hookshot/21f57ba



btasker changed timespent from '0 minutes' to '46 minutes'
Now that the source identifier has been added, need to adjust the processing run to include the CN/SAN.
Test run going at the moment for extraction of CNs/SANs.

One option for a little further down the road: PAS-13 will be implementing a DNS transaction log, so we could conceivably turn this into a passive check by searching the output of that for lookups for extracted FQDNs. We can only extract SNI/CN etc at time of the handshake, so for the average connection there will likely have been a DNS lookup just before (assuming a previous query hasn't been cached).

But for the connections that this issue is interested in, there will likely have been no lookup (or at the very least, an NXDOMAIN). So, we could create a "lite" version of this feature which uses the information PAS-13 will ultimately capture.
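
As an illustration of the "lite" check: PAS-13 hasn't defined its output yet, so the log layout here (queried name, tab, response code) is purely hypothetical, as is the function name. The idea is just to list extracted FQDNs for which the capture contains no successful lookup.

```bash
# Hypothetical log layout: "queried name<TAB>response code", one per line.
# $1: DNS transaction log, $2: file of extracted FQDNs (one per line)
names_without_lookup() {
    awk -F '\t' '
        # First file: remember every name that was answered successfully
        FILENAME == ARGV[1] { if ($2 == "NOERROR") resolved[$1] = 1; next }
        # Second file: print names with no successful lookup on record
        !($0 in resolved) { print }
    ' "$1" "$2"
}
```

Names that were only ever answered with NXDOMAIN are treated the same as names that were never looked up at all.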

I think it needs to be optional - so the active version of the check still needs to be implemented. As well as being controlled by PASSIVE_ONLY it might be worth defining a configuration option to allow other active checks to be used alongside the passive version of this check.

Either way, the passive version is currently blocked by PAS-13
Commit 2b59001 adds Certificate names to visitedsites.csv

It does currently include an empty SNI line:
        SNI


But that should be reasonably easy to correct, so I'm going to come back to it once the lookups are implemented.

Repo: PCAPAnalyseandReport
Commit: 2b5900139dc4db50b90ad5f4956e604f63a49038
Author: Ben Tasker <github@<Domain Hidden>>

Date: Wed Feb 03 12:22:06 2016 +0000
Commit Message: Added extraction of certificate names. See PAS-26



Modified (-)(+)
-------
PCAP_Analysis.sh





btasker changed timespent from '46 minutes' to '71 minutes'
btasker changed status from 'Open' to 'In Progress'
btasker changed status from 'In Progress' to 'Open'
btasker changed timespent from '71 minutes' to '86 minutes'
Test run of the lookup based functionality going at the moment
The current implementation generates a new report called unresolvabledomains.csv with the following fields

- Src IP
- Dest IP
- Src IPv6
- Dest IPv6
- Src Port
- Dest Port
- SNI Name
- Certificate Names

Rows should all be unique, though there may be some duplication (for example, where Certificate Names contains two unresolvable names). For a Tor connection, that'll almost certainly be the case, as the issuer name will likely also be unresolvable.

Repo: PCAPAnalyseandReport
Commit: 491ab4c6c027cf29e6e8c9eb89825436344ae110
Author: Ben Tasker <github@<Domain Hidden>>

Date: Wed Feb 03 13:27:23 2016 +0000
Commit Message: Implemented generation of unresolvabledomains.csv. See PAS-26



Modified (-)(+)
-------
Docs/Reports.md
PCAP_Analysis.sh





The logical next step would be to walk the SSL connections and pick out those where the (unresolvable) SNI name differs from the (unresolvable) Certificate Names for any given connection. There's a very good chance those are connections into the Tor network.

I'll raise a separate FR for that though as I don't want this issue becoming too Tor specific - Raised PAS-28

Work log


Ben Tasker
2016-02-03 11:51:29

Time Spent: 46 minutes
Log Entry: Designing and starting changes

Ben Tasker
2016-02-03 12:23:42

Time Spent: 25 minutes
Log Entry: Implementing and testing Certname extraction

Ben Tasker
2016-02-03 12:57:36

Time Spent: 15 minutes
Log Entry: Building lookup functionality