Nmap/IPv6 OS Integration
IPv6 has its own database, classification engine, and integration tools. See the book sections on IPv6 matching and IPv6 probes and features.
The integration tools are in https://svn.nmap.org/nmap-exp/luis/ipv6tests. See the README for some commands.
First build liblinear and the Python binding.
cd liblinear-1.8
make
cd python
make
cd ../..
Submissions come in an mbox with wrapped fingerprints that look like this.
There's a simple run-length encoding where bb{N} stands for the byte bb repeated N times. For example, ff{4} stands for ffffffff.
$ cat t.fp
OS:SCAN(V=6.46%E=6%D=6/11%OT=22%CT=1%CU=37709%PV=N%DS=0%DC=L%G=Y%TM=5398B1
OS:94%P=x86_64-unknown-linux-gnu)S1(P=6000{4}280640XX{32}0016926bfb9a5d888
OS:face9b5a012aaaa003000000204ffc40402080a5cf62431ff{4}01030307%ST=0.06965
OS:%RT=0.26963)S2(P=6000{4}280640XX{32}0016926c73e9c2938face9b6a012aaaa003
OS:000000204ffc40402080a5cf6244aff{4}01030307%ST=0.169595%RT=0.269659)S3(P
OS:=6000{4}280640XX{32}0016926d1d007f858face9b7a012aaaa003000000204ffc4010
OS:1080a5cf62463ff{4}01030307%ST=0.269585%RT=0.469605)S4(P=6000{4}280640XX
OS:{32}0016926e0918ef9b8face9b8a012aaaa003000000204ffc40402080a5cf6247cff{
OS:4}01030307%ST=0.369603%RT=0.469627)S5(P=6000{4}280640XX{32}0016926f0271
OS:b9cd8face9b9a012aaaa003000000204ffc40402080a5cf62495ff{4}01030307%ST=0.
OS:469583%RT=0.661085)S6(P=6000{4}240640XX{32}00169270082a9c9b8face9ba9012
OS:aaaa002c00000204ffc40402080a5cf624aeff{4}%ST=0.569583%RT=0.661118)IE1(P
OS:=6000{4}803a40XX{32}8109d26cabcd00{122}%ST=0.611868%RT=0.661144)IE2(P=6
OS:000{4}583a40XX{32}0401d3d300{3}386001234500280026XX{32}3c00010400{4}2b0
OS:0010400{12}3a00010400{4}8000d3ecabcd0001%ST=0.661051%RT=0.858067)NS(P=6
OS:000{4}183affXX{32}8800b7a9c000{3}XX{16}%ST=0.759501%RT=0.858103)U1(P=60
OS:00{3}01643a40XX{32}010468ee00{4}6001234501341138XX{32}9240934d01346e8d4
OS:3{300}%ST=0.80877%RT=0.858122)TECN(P=6000{4}200640XX{32}001692719c25dc1
OS:e8face9bb8052aaaa002800000204ffc40101040201030307%ST=0.858013%RT=1.0558
OS:2)T4(P=6000{4}140640XX{32}00169274a879a96100{4}500400{3}1c0000%ST=1.006
OS:62%RT=1.05586)T5(P=6000{4}140640XX{32}0001927500{4}8face9bf501400{3}1c0
OS:000%ST=1.05579%RT=1.35385)T6(P=6000{4}140640XX{32}0001927650d5e2c900{4}
OS:500400{3}1c0000%ST=1.10505%RT=1.35388)T7(P=6000{4}140640XX{32}000192770
OS:0{4}8face9c1501400{3}1c0000%ST=1.15422%RT=1.35389)EXTRA(FL=12345)
You can unwrap it:
$ ./unwrap t.fp SCAN(V=6.46%E=6%D=6/11%OT=22%CT=1%CU=37709%PV=N%DS=0%DC=L%G=Y%TM=5398B194%P=x86_64-unknown-linux-gnu) S1(P=6000000000280640XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX0016926bfb9a5d888face9b5a012aaaa003000000204ffc40402080a5cf62431ffffffff01030307%ST=0.06965%RT=0.26963) S2(P=6000000000280640XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX0016926c73e9c2938face9b6a012aaaa003000000204ffc40402080a5cf6244affffffff01030307%ST=0.169595%RT=0.269659) S3(P=6000000000280640XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX0016926d1d007f858face9b7a012aaaa003000000204ffc40101080a5cf62463ffffffff01030307%ST=0.269585%RT=0.469605) S4(P=6000000000280640XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX0016926e0918ef9b8face9b8a012aaaa003000000204ffc40402080a5cf6247cffffffff01030307%ST=0.369603%RT=0.469627) S5(P=6000000000280640XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX0016926f0271b9cd8face9b9a012aaaa003000000204ffc40402080a5cf62495ffffffff01030307%ST=0.469583%RT=0.661085) S6(P=6000000000240640XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX00169270082a9c9b8face9ba9012aaaa002c00000204ffc40402080a5cf624aeffffffff%ST=0.569583%RT=0.661118) IE1(P=6000000000803a40XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX8109d26cabcd0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000%ST=0.611868%RT=0.661144) IE2(P=6000000000583a40XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX0401d3d3000000386001234500280026XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX3c000104000000002b0001040000000000000000000000003a000104000000008000d3ecabcd0001%ST=0.661051%RT=0.858067) 
NS(P=6000000000183affXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX8800b7a9c0000000XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX%ST=0.759501%RT=0.858103) U1(P=6000000001643a40XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX010468ee000000006001234501341138XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX9240934d01346e8d434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343%ST=0.80877%RT=0.858122) TECN(P=6000000000200640XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX001692719c25dc1e8face9bb8052aaaa002800000204ffc40101040201030307%ST=0.858013%RT=1.05582) T4(P=6000000000140640XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX00169274a879a9610000000050040000001c0000%ST=1.00662%RT=1.05586) T5(P=6000000000140640XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX00019275000000008face9bf50140000001c0000%ST=1.05579%RT=1.35385) T6(P=6000000000140640XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX0001927650d5e2c90000000050040000001c0000%ST=1.10505%RT=1.35388) T7(P=6000000000140640XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX00019277000000008face9c150140000001c0000%ST=1.15422%RT=1.35389) EXTRA(FL=12345)
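If you want to see what the unwrapping involves, here is a rough Python sketch. It is not the code of the unwrap tool; the names expand_rle and unwrap are made up for this example. It assumes that every wrapped line starts with OS: and that each probe response has the form NAME(...). Note that the lines have to be joined before the run-length encoding is expanded, because the wrapping can split a token such as ff{4} across two lines.

```python
import re

# One run-length-encoded byte: two hex digits (or the XX mask)
# followed by a repeat count in braces, e.g. ff{4} or XX{32}.
RLE = re.compile(r"([0-9A-Fa-fX]{2})\{(\d+)\}")


def expand_rle(text):
    """Expand every bb{N} into the byte bb repeated N times."""
    return RLE.sub(lambda m: m.group(1) * int(m.group(2)), text)


def unwrap(wrapped, expand=True):
    """Join OS:-prefixed lines and return one probe response per list item."""
    parts = []
    for line in wrapped.splitlines():
        line = line.strip()
        if line.startswith("OS:"):
            parts.append(line[3:])  # drop the OS: prefix
    joined = "".join(parts)         # join first: tokens may be split by wrapping
    if expand:
        joined = expand_rle(joined)
    # Assumed shape of each response: NAME(FIELD=VALUE%FIELD=VALUE...)
    return re.findall(r"[A-Z0-9]+\([^)]*\)", joined)
```

Calling unwrap(open("t.fp").read()) should give a list with one entry per probe response; passing expand=False keeps the bb{N} runs as they are.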
Each line other than EXTRA is the response to a probe. P= is the IP packet contents.
Some sensitive bytes such as the source and destination addresses are blocked out with XX.
ST= is the send time and RT= is the receive time.
EXTRA contains one field, FL, which is the sent flow label. Not every OS supports changing the flow label, so some fingerprints will have FL=12345 and some will have FL=00000.
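To make the field layout concrete, here is a small Python sketch that splits one unwrapped line into its name and its fields. The function names parse_response and delay are invented for this example, and the only assumption about the format is what the samples above show: fields are separated by % and written as KEY=VALUE.

```python
def parse_response(line):
    """Split NAME(K=V%K=V...) into (name, {K: V})."""
    name, _, rest = line.partition("(")
    body = rest.rstrip(")")
    fields = {}
    for item in body.split("%"):
        key, _, value = item.partition("=")
        fields[key] = value
    return name, fields


def delay(fields):
    """Difference between the receive time (RT) and the send time (ST)."""
    return float(fields["RT"]) - float(fields["ST"])
```

For instance, parse_response("EXTRA(FL=12345)") returns the name EXTRA and a dictionary holding the single FL field.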
Files
nmap.groups is the database file.
Each group is a class for classification purposes. Each group may contain multiple prints.
group Linux 2.6.38 - 3.2
nmapclass Linux | Linux | 2.6.X | general purpose
cpe cpe:/o:linux:linux_kernel:2.6
nmapclass Linux | Linux | 3.X | general purpose
cpe cpe:/o:linux:linux_kernel:3
print
# Linux scanme 2.6.39.1-linode34 #1 SMP Tue Jun 21 10:29:24 EDT 2011 i686 GNU/Linux, Ubuntu 10.04, from david
SCAN(V=5.61TEST4%OT=22%CT=1%CU=38935%DS=5%DC=I)
S1(P=6000{4}2806fbXX{32}0016bfd19de75ded63fd0ed7a01237c841010000020405a00402080a56a24149ff{4}01030305%ST=0.075288%RT=0.088383)
...
print
# Linux scanme 2.6.39.1-linode34 #1 SMP Tue Jun 21 10:29:24 EDT 2011 i686 GNU/Linux, Ubuntu 10.04, from web
SCAN(V=5.61TEST4%OT=22%CT=1%CU=33901%DS=1%DC=D)
S1(P=6000{4}280640XX{32}001687c316611e7ecaeccd92a01237c884aa0000020405a00402080a56a2e029ff{4}01030305%ST=0.008097%RT=0.008497)
...
group HP ProCurve 2520G switch
nmapclass HP | embedded || switch
cpe cpe:/h:hp:procurve_switch_2520g
print
S1(P=6000{4}2c063a2a01034802c30000c29134fffe814980200104701f08102e00{7}020017bded6c8be3e05f2ecb24b012ffff1ee10000020404c401030301{3}04020101080ac53856baff{4}%ST=0.1000{7}1%RT=0.1000{7}1)
...
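The structure of the file can be illustrated with a short Python sketch that reads it into a list of groups. This is not how train.py reads the database; parse_groups and the dictionary keys are hypothetical, and the sketch assumes that each of group, nmapclass, cpe, and print sits on its own line, that lines starting with # are comments, and that everything else belongs to the most recent print.

```python
def parse_groups(text):
    """Return a list of groups, each with its classes, CPEs, and prints."""
    groups = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("group "):
            groups.append({"name": line[6:], "nmapclass": [],
                           "cpe": [], "prints": []})
        elif line.startswith("nmapclass "):
            # Fields are separated by |; an empty field stays empty.
            fields = [f.strip() for f in line[10:].split("|")]
            groups[-1]["nmapclass"].append(fields)
        elif line.startswith("cpe "):
            groups[-1]["cpe"].append(line[4:])
        elif line == "print":
            groups[-1]["prints"].append([])  # start a new print
        else:
            groups[-1]["prints"][-1].append(line)  # a SCAN or probe line
    return groups
```

The position of a group in the returned list is then its zero-indexed class number.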
nmap.set describes what features are used for training. It has a simple $variable syntax that enables setting multiple features with similar names. For example,
$IPV6 = [ S1 S2 S3 ]
$IPV6 * [ PLEN TC ]
stands for
S1.PLEN
S1.TC
S2.PLEN
S2.TC
S3.PLEN
S3.TC
The meaning of each feature name is described in vectorize.py.
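The expansion of the $variable syntax can be sketched in Python like this. It only covers the two line forms shown in the example above, and it assumes that any other non-empty line is a plain feature name; expand_set is a made-up name, not a function from the integration tools.

```python
import re


def expand_set(text):
    """Expand $VAR = [ ... ] and $VAR * [ ... ] lines into feature names."""
    variables = {}
    features = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        # $VAR = [ a b c ] defines a list of prefixes.
        m = re.fullmatch(r"\$(\w+)\s*=\s*\[(.*)\]", line)
        if m:
            variables[m.group(1)] = m.group(2).split()
            continue
        # $VAR * [ x y ] combines every prefix with every suffix.
        m = re.fullmatch(r"\$(\w+)\s*\*\s*\[(.*)\]", line)
        if m:
            for prefix in variables[m.group(1)]:
                for suffix in m.group(2).split():
                    features.append(prefix + "." + suffix)
            continue
        features.append(line)  # assumption: a plain feature name
    return features
```

Feeding it the two example lines yields the six names listed above, in the same order.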
The program train.py reads nmap.groups, trains a classifier, and outputs nmap.model.
Finally, c_struct.py reads nmap.model and converts it to C++ source code, which should be copied to FPModel.cc.
Integration procedure
Open a submission from the mbox. Copy and paste the wrapped OS: lines to a file, t.fp.
Do a first trial prediction:
$ ./predict.py -m nmap.model <(./nmap26fp.py t.fp)
== /dev/fd/63 ==
nmapclasses: predictions
62. 93.46% 23.43 Linux 3.7 - 3.9
60. 23.62% 24.96 Linux 3.2
61. 2.96% 457.42 Linux 3.2 - 3.8
38. 2.05% 105.35 Apple Mac OS X 10.6.8 - 10.7.3 (Snow Leopard - Lion) (Darwin 10.8.0 - 11.3.0) or iOS 4.3.3
...
The nmap26fp.py program converts a fingerprint from Nmap's "fp" format to a different format called "6fp". The predict program requires a 6fp input.
The first column is the class number, which is the zero-indexed ordinal of the corresponding group in nmap.groups. The second column is the matching score. The class with the highest score is declared the OS match. The third column is the novelty, which is a measure of how far the observed feature vector is from the center of the other vectors in the class.
It looks like we want to add this observed print to class 62. We can see how it differs from some elements of the class. Find the line number where class 62 begins (remember it is zero-indexed):
$ grep -n ^group nmap.groups | head -n 63 | tail -n 1
4243:group Linux 3.7 - 3.9
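The same lookup can be written in Python, which makes the off-by-one explicit: class 62 is the 63rd group line. This is just an equivalent of the pipeline above; group_line is a hypothetical helper, not part of the tools.

```python
def group_line(lines, class_number):
    """Return (1-based line number, text) of the zero-indexed class."""
    seen = -1
    for number, line in enumerate(lines, start=1):
        if line.startswith("group"):  # same test as grep ^group
            seen += 1
            if seen == class_number:
                return number, line.rstrip("\n")
    raise IndexError("no such class: %d" % class_number)
```

Calling group_line(open("nmap.groups"), 62) should report the same line as the grep command.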
Copy each fingerprint into its own file, for example 62.1.fp, 62.2.fp, etc. You have to add OS: to the beginning of each line after pasting into a new file, because the vectorize.py program uses OS: as a format hint. You can look at the feature vector:
$ ./vectorize.py -s nmap.set t.fp
40 S1.PLEN
0 S1.TC
40 S2.PLEN
0 S2.TC
40 S3.PLEN
0 S3.TC
But what you really want is a diff with a reference print:
$ ./vecdiff 62.1.fp t.fp
  0 S6.TC
- UNKNOWN IE1.PLEN
- UNKNOWN IE1.TC
- UNKNOWN IE2.PLEN
- UNKNOWN IE2.TC
- UNKNOWN NS.PLEN
- UNKNOWN NS.TC
+ 128 IE1.PLEN
+ 0 IE1.TC
+ 88 IE2.PLEN
+ 0 IE2.TC
+ 24 NS.PLEN
+ 0 NS.TC
  356 U1.PLEN
...
  0 T7.TC
-24340034693.2 TCP_ISR
+26194873735.1 TCP_ISR
  43690 S1.TCP_WINDOW
...
  1 S1.TCP_SACKOK
- 5 S1.TCP_WSCALE
+ 7 S1.TCP_WSCALE
  43690 S2.TCP_WINDOW
The reference print doesn't have responses to the IE1, IE2, and NS probes, but that is probably a firewall/LAN issue rather than a characteristic of the OS. TCP_ISR is essentially a match. The only other differences are in TCP_WSCALE features, which happen not to be meaningful for Linux.
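The idea behind this kind of diff can be sketched in a few lines of Python. This is not the vecdiff program; it is a hypothetical function that takes two feature vectors as lists of (name, value) pairs in the same order (roughly what vectorize.py prints) and groups runs of differing features the way the output above does.

```python
def vecdiff(ref, new):
    """Diff two feature vectors given as lists of (name, value) pairs."""
    out, minus, plus = [], [], []
    for (name, a), (_, b) in zip(ref, new):
        if a == b:
            # A matching feature ends the current run of differences.
            out += minus + plus
            minus, plus = [], []
            out.append("  %s %s" % (a, name))
        else:
            minus.append("- %s %s" % (a, name))  # value in the reference print
            plus.append("+ %s %s" % (b, name))   # value in the observed print
    return out + minus + plus
```

Lines starting with - come from the reference print and lines starting with + come from the observed print, so a feature that shows up with both markers is one you should look at.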
Repeat the process with 62.2.fp, 62.3.fp, etc. until you are convinced that you should add the new observed print to the group, or else that it should go into a new group. A large number of differences does not disqualify an observed print from being added to a group: you want diversity within classes.
Format the observed print for addition to the database:
$ ./unwrap.py -r -s t.fp
SCAN(V=6.46%OT=22%CT=1%CU=37709%DS=0%DC=L)
S1(P=6000{4}280640XX{32}0016926bfb9a5d888face9b5a012aaaa003000000204ffc40402080a5cf62431ff{4}01030307%ST=0.06965%RT=0.26963)
S2(P=6000{4}280640XX{32}0016926c73e9c2938face9b6a012aaaa003000000204ffc40402080a5cf6244aff{4}01030307%ST=0.169595%RT=0.269659)
S3(P=6000{4}280640XX{32}0016926d1d007f858face9b7a012aaaa003000000204ffc40101080a5cf62463ff{4}01030307%ST=0.269585%RT=0.469605)
S4(P=6000{4}280640XX{32}0016926e0918ef9b8face9b8a012aaaa003000000204ffc40402080a5cf6247cff{4}01030307%ST=0.369603%RT=0.469627)
S5(P=6000{4}280640XX{32}0016926f0271b9cd8face9b9a012aaaa003000000204ffc40402080a5cf62495ff{4}01030307%ST=0.469583%RT=0.661085)
S6(P=6000{4}240640XX{32}00169270082a9c9b8face9ba9012aaaa002c00000204ffc40402080a5cf624aeff{4}%ST=0.569583%RT=0.661118)
IE1(P=6000{4}803a40XX{32}8109d26cabcd00{122}%ST=0.611868%RT=0.661144)
IE2(P=6000{4}583a40XX{32}0401d3d300{3}386001234500280026XX{32}3c00010400{4}2b00010400{12}3a00010400{4}8000d3ecabcd0001%ST=0.661051%RT=0.858067)
NS(P=6000{4}183affXX{32}8800b7a9c000{3}XX{16}%ST=0.759501%RT=0.858103)
U1(P=6000{3}01643a40XX{32}010468ee00{4}6001234501341138XX{32}9240934d01346e8d43{300}%ST=0.80877%RT=0.858122)
TECN(P=6000{4}200640XX{32}001692719c25dc1e8face9bb8052aaaa002800000204ffc40101040201030307%ST=0.858013%RT=1.05582)
T4(P=6000{4}140640XX{32}00169274a879a96100{4}500400{3}1c0000%ST=1.00662%RT=1.05586)
T5(P=6000{4}140640XX{32}0001927500{4}8face9bf501400{3}1c0000%ST=1.05579%RT=1.35385)
T6(P=6000{4}140640XX{32}0001927650d5e2c900{4}500400{3}1c0000%ST=1.10505%RT=1.35388)
T7(P=6000{4}140640XX{32}0001927700{4}8face9c1501400{3}1c0000%ST=1.15422%RT=1.35389)
EXTRA(FL=12345)
Copy and paste it into an existing group, or create a new group.
Now train the model with your new change:
$ ./train.py -c 100 -s nmap.set -g nmap.groups --scale > nmap.model
Training.
Accuracy 67.4698795181
The "Accuracy" number is not very meaningful, so don't pay too much attention to it. (It tracks how many training samples got assigned to their original class, and is penalized when a Linux training example falls into a Linux class other than the one it started in; also I suspect that the cross-validation may not work well when you have few members per class as we do.)
Run the prediction again:
$ ./predict.py -m nmap.model <(./nmap26fp.py t.fp)
== /dev/fd/63 ==
nmapclasses: predictions
62. 99.05% 5.70 Linux 3.7 - 3.9
60. 1.32% 24.96 Linux 3.2
61. 1.25% 457.42 Linux 3.2 - 3.8
52. 0.64% 81.64 Linux 2.6.16 - 3.2
...
You are looking for a high score (above 90%) and a low novelty (under 15 is okay).
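If you handle many submissions, you might want to check these two thresholds mechanically. Here is a rough Python sketch that parses prediction lines of the shape shown above and applies the rule of thumb to the top-scoring class. The function names are invented, the line format is inferred only from the sample output, and the default thresholds are the 90% and 15 mentioned above.

```python
import re

# class number, score in percent, novelty, group name
ROW = re.compile(r"\s*(\d+)\.\s+([\d.]+)%\s+([\d.]+)\s+(.*)")


def parse_predictions(output):
    """Return (class_number, score, novelty, name) for each prediction line."""
    rows = []
    for line in output.splitlines():
        m = ROW.match(line)
        if m:  # header lines and "..." do not match and are skipped
            rows.append((int(m.group(1)), float(m.group(2)),
                         float(m.group(3)), m.group(4).strip()))
    return rows


def looks_good(rows, min_score=90.0, max_novelty=15.0):
    """True if the best class has a high score and a low novelty."""
    best = max(rows, key=lambda r: r[1])
    return best[1] > min_score and best[2] < max_novelty
```

With the numbers from this walkthrough, the first trial prediction (93.46% with novelty 23.43) fails the check because of the novelty, while the prediction after retraining (99.05% with novelty 5.70) passes.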
After handling all the submissions, generate the FPModel.cc source file and copy it into the nmap source.
$ ./c_struct.py -m nmap.model > FPModel.cc