Proposal

Given a stream of TCP/IP packets, it is a non-trivial problem to determine which operating system emitted those packets from the packet stream alone. Operating system detection from a packet stream usually relies on how the emitting operating system implements various optional parts of the TCP and IP protocols.

However, there is another way of attacking the detection problem that does not rely on implementation details of optional TCP/IP features -- timing. Operating systems will emit TCP/IP packets in response to packet loss (actual or simulated) at different times, depending on how they are configured (see the paper http://www.usenix.org/events/sec00/full_papers/smart/smart_html/ for basic details on this attack). This configuration almost always comes down to hard-coded values in the operating system source.
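
As a concrete illustration of the timing signal, here is a minimal sketch (assuming Scapy and root privileges; the target address and port are placeholders, and this is only one way to collect the signal, not necessarily the cited paper's exact method). It sends a single SYN, never ACKs the SYN-ACK -- so the target behaves as if its reply was lost -- and records the intervals between the target's retransmitted SYN-ACKs. In practice the probing host's kernel must also be prevented (e.g. by a firewall rule) from sending a RST that would tear down the half-open connection.

    import time
    from scapy.all import IP, TCP, AsyncSniffer, send

    TARGET = "192.0.2.10"   # placeholder target address
    PORT = 80               # assumed-open port on the target

    def probe_retransmissions(timeout=60):
        # Capture everything the target sends back from the probed port.
        sniffer = AsyncSniffer(filter=f"tcp and src host {TARGET} and src port {PORT}")
        sniffer.start()

        # A lone SYN: the target answers with a SYN-ACK and, hearing nothing
        # back, retransmits it on an OS-specific schedule.
        send(IP(dst=TARGET) / TCP(dport=PORT, sport=40123, flags="S", seq=1000),
             verbose=False)

        time.sleep(timeout)
        sniffer.stop()
        times = [float(p.time) for p in sniffer.results
                 if p.haslayer(TCP) and p[TCP].flags.S and p[TCP].flags.A]

        # The raw timing signal: gaps between successive retransmissions.
        return [t2 - t1 for t1, t2 in zip(times, times[1:])]

    if __name__ == "__main__":
        print(probe_retransmissions())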

I propose a machine learning approach to determining operating system fingerprints by simulating TCP/IP packet loss. This is a problem that would be difficult, if not impossible, to program by hand without machine learning and/or some form of signal processing. As far as I am aware, it has also not been done before.

The training data will be generated by running simulated packet loss against machines running different operating systems. Since packet timing will always vary somewhat with machine and network load, it seems acceptable to use the same simulated packet loss both to generate the training data set and to run the learned classifier "for real".
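
One possible shape for the collected data, assuming a probe routine like the sketch above (or any function returning a list of retransmission gaps in seconds): each run against a machine with a known operating system becomes one hand-labeled row of fixed-length timing features. The file name, padding length, and label string below are arbitrary choices for illustration.

    import csv

    NUM_FEATURES = 6  # pad/truncate gap lists to a fixed length

    def gaps_to_features(gaps, n=NUM_FEATURES):
        gaps = list(gaps)[:n]
        return gaps + [0.0] * (n - len(gaps))  # pad short runs with zeros

    def record_run(label, gaps, path="training_data.csv"):
        with open(path, "a", newline="") as f:
            csv.writer(f).writerow([label] + gaps_to_features(gaps))

    # Example: repeatedly probe a machine known to run Linux 2.6.31.
    # for _ in range(50):
    #     record_run("linux-2.6.31", probe_retransmissions())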

I am interested in using a Bayesian network to build my classifier. There are a number of independent features that can be tested with a network traffic generator and loss simulator, including many different parameters of the loss itself, so with my (limited) understanding a Bayesian network seems like a natural way to model this belief. In addition, updating belief is a common and well-defined operation in Bayesian networks, and belief should be updated on every run -- not only during the training runs. When the evidence from many features points to one operating system and a single feature points to another, the belief associated with that single feature needs to be updated.
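
As a first cut, the belief computation could be prototyped with a naive Bayes classifier -- the simplest Bayesian network, under the assumption that the timing features are conditionally independent given the operating system. The sketch below is illustrative only: the Gaussian parameters would come from the training runs, and a full Bayesian network would relax the hard-coded independence assumption and support richer belief updating.

    import math
    from collections import defaultdict

    class NaiveBayesOSClassifier:
        def __init__(self):
            # per-OS, per-feature (mean, variance) of retransmission gaps
            self.params = defaultdict(dict)
            self.prior = {}

        def fit(self, rows):
            # rows: list of (label, feature_vector) from the training runs
            by_os = defaultdict(list)
            for label, feats in rows:
                by_os[label].append(feats)
            total = len(rows)
            for label, vecs in by_os.items():
                self.prior[label] = len(vecs) / total
                for i in range(len(vecs[0])):
                    col = [v[i] for v in vecs]
                    mean = sum(col) / len(col)
                    var = sum((x - mean) ** 2 for x in col) / len(col) + 1e-6
                    self.params[label][i] = (mean, var)

        def posterior(self, feats):
            # combine the prior with each feature's Gaussian likelihood
            # (in log space, to avoid underflow)
            logpost = {}
            for label in self.prior:
                lp = math.log(self.prior[label])
                for i, x in enumerate(feats):
                    mean, var = self.params[label][i]
                    lp += -0.5 * math.log(2 * math.pi * var) \
                          - (x - mean) ** 2 / (2 * var)
                logpost[label] = lp
            # normalise back into a probability distribution over OSes
            m = max(logpost.values())
            unnorm = {k: math.exp(v - m) for k, v in logpost.items()}
            z = sum(unnorm.values())
            return {k: v / z for k, v in unnorm.items()}

The normalised posterior returned for each new probe is the "belief" over operating systems; in a full Bayesian network that belief, and the conditional distributions behind it, would be updated after every run rather than fixed at training time.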

By the milestone, I would like to have implemented the packet loss simulator and a complete enough version of the algorithm to distinguish between three vastly different operating systems -- i.e. Windows 7, Mac OS X (some version), and Linux 2.6.31. After the milestone I'd like to see just how finely the detector can classify operating systems -- can it detect service packs? What about minor patch levels? The challenge will be distinguishing minor patch levels in unrelated components from network background noise, without using any signal processing techniques for noise reduction.