Nmap/IPv6 OS Integration

From SecWiki

IPv6 has its own database, classification engine, and integration tools. See the book sections on IPv6 matching and IPv6 probes and features.

The integration tools are in https://svn.nmap.org/nmap-exp/luis/ipv6tests. See the README for some commands.

First build liblinear and the Python binding.

 $ cd liblinear-1.8
 $ make
 $ cd python
 $ make
 $ cd ../..

Submissions come in an mbox with wrapped fingerprints that look like this. There's a simple run-length encoding where bb{N} stands for the byte bb repeated N times.

 $ cat t.fp
 OS:SCAN(V=6.46%E=6%D=6/11%OT=22%CT=1%CU=37709%PV=N%DS=0%DC=L%G=Y%TM=5398B1
 OS:94%P=x86_64-unknown-linux-gnu)S1(P=6000{4}280640XX{32}0016926bfb9a5d888
 OS:face9b5a012aaaa003000000204ffc40402080a5cf62431ff{4}01030307%ST=0.06965
 OS:%RT=0.26963)S2(P=6000{4}280640XX{32}0016926c73e9c2938face9b6a012aaaa003
 OS:000000204ffc40402080a5cf6244aff{4}01030307%ST=0.169595%RT=0.269659)S3(P
 OS:=6000{4}280640XX{32}0016926d1d007f858face9b7a012aaaa003000000204ffc4010
 OS:1080a5cf62463ff{4}01030307%ST=0.269585%RT=0.469605)S4(P=6000{4}280640XX
 OS:{32}0016926e0918ef9b8face9b8a012aaaa003000000204ffc40402080a5cf6247cff{
 OS:4}01030307%ST=0.369603%RT=0.469627)S5(P=6000{4}280640XX{32}0016926f0271
 OS:b9cd8face9b9a012aaaa003000000204ffc40402080a5cf62495ff{4}01030307%ST=0.
 OS:469583%RT=0.661085)S6(P=6000{4}240640XX{32}00169270082a9c9b8face9ba9012
 OS:aaaa002c00000204ffc40402080a5cf624aeff{4}%ST=0.569583%RT=0.661118)IE1(P
 OS:=6000{4}803a40XX{32}8109d26cabcd00{122}%ST=0.611868%RT=0.661144)IE2(P=6
 OS:000{4}583a40XX{32}0401d3d300{3}386001234500280026XX{32}3c00010400{4}2b0
 OS:0010400{12}3a00010400{4}8000d3ecabcd0001%ST=0.661051%RT=0.858067)NS(P=6
 OS:000{4}183affXX{32}8800b7a9c000{3}XX{16}%ST=0.759501%RT=0.858103)U1(P=60
 OS:00{3}01643a40XX{32}010468ee00{4}6001234501341138XX{32}9240934d01346e8d4
 OS:3{300}%ST=0.80877%RT=0.858122)TECN(P=6000{4}200640XX{32}001692719c25dc1
 OS:e8face9bb8052aaaa002800000204ffc40101040201030307%ST=0.858013%RT=1.0558
 OS:2)T4(P=6000{4}140640XX{32}00169274a879a96100{4}500400{3}1c0000%ST=1.006
 OS:62%RT=1.05586)T5(P=6000{4}140640XX{32}0001927500{4}8face9bf501400{3}1c0
 OS:000%ST=1.05579%RT=1.35385)T6(P=6000{4}140640XX{32}0001927650d5e2c900{4}
 OS:500400{3}1c0000%ST=1.10505%RT=1.35388)T7(P=6000{4}140640XX{32}000192770
 OS:0{4}8face9c1501400{3}1c0000%ST=1.15422%RT=1.35389)EXTRA(FL=12345)

You can unwrap it:

 $ ./unwrap t.fp
 SCAN(V=6.46%E=6%D=6/11%OT=22%CT=1%CU=37709%PV=N%DS=0%DC=L%G=Y%TM=5398B194%P=x86_64-unknown-linux-gnu)
 S1(P=6000000000280640XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX0016926bfb9a5d888face9b5a012aaaa003000000204ffc40402080a5cf62431ffffffff01030307%ST=0.06965%RT=0.26963)
 S2(P=6000000000280640XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX0016926c73e9c2938face9b6a012aaaa003000000204ffc40402080a5cf6244affffffff01030307%ST=0.169595%RT=0.269659)
 S3(P=6000000000280640XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX0016926d1d007f858face9b7a012aaaa003000000204ffc40101080a5cf62463ffffffff01030307%ST=0.269585%RT=0.469605)
 S4(P=6000000000280640XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX0016926e0918ef9b8face9b8a012aaaa003000000204ffc40402080a5cf6247cffffffff01030307%ST=0.369603%RT=0.469627)
 S5(P=6000000000280640XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX0016926f0271b9cd8face9b9a012aaaa003000000204ffc40402080a5cf62495ffffffff01030307%ST=0.469583%RT=0.661085)
 S6(P=6000000000240640XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX00169270082a9c9b8face9ba9012aaaa002c00000204ffc40402080a5cf624aeffffffff%ST=0.569583%RT=0.661118)
 IE1(P=6000000000803a40XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX8109d26cabcd0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000%ST=0.611868%RT=0.661144)
 IE2(P=6000000000583a40XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX0401d3d3000000386001234500280026XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX3c000104000000002b0001040000000000000000000000003a000104000000008000d3ecabcd0001%ST=0.661051%RT=0.858067)
 NS(P=6000000000183affXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX8800b7a9c0000000XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX%ST=0.759501%RT=0.858103)
 U1(P=6000000001643a40XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX010468ee000000006001234501341138XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX9240934d01346e8d434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343434343%ST=0.80877%RT=0.858122)
 TECN(P=6000000000200640XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX001692719c25dc1e8face9bb8052aaaa002800000204ffc40101040201030307%ST=0.858013%RT=1.05582)
 T4(P=6000000000140640XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX00169274a879a9610000000050040000001c0000%ST=1.00662%RT=1.05586)
 T5(P=6000000000140640XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX00019275000000008face9bf50140000001c0000%ST=1.05579%RT=1.35385)
 T6(P=6000000000140640XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX0001927650d5e2c90000000050040000001c0000%ST=1.10505%RT=1.35388)
 T7(P=6000000000140640XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX00019277000000008face9c150140000001c0000%ST=1.15422%RT=1.35389)
 EXTRA(FL=12345)
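
The unwrapping and run-length expansion can be sketched in a few lines of Python. This is only an illustration of the format as described here, not the code of the unwrap tool; the function names are made up, and it assumes that the contents of a test never contain a closing parenthesis.

```python
import re

def unwrap(wrapped_text):
    """Join wrapped "OS:" lines and return one NAME(...) test per list item."""
    # Join before splitting: a token such as ff{4} can straddle two wrapped lines.
    joined = "".join(
        line.strip()[len("OS:"):]
        for line in wrapped_text.splitlines()
        if line.strip().startswith("OS:")
    )
    # Assumption: the text inside NAME(...) never contains ")".
    return re.findall(r"[A-Z0-9]+\([^)]*\)", joined)

def expand_rle(text):
    """Expand bb{N} into the two-character byte bb repeated N times."""
    return re.sub(
        r"([0-9A-Fa-fX]{2})\{(\d+)\}",
        lambda m: m.group(1) * int(m.group(2)),
        text,
    )

# Shortened sample in the same layout as t.fp; note ff{4} split across lines.
wrapped = """
OS:SCAN(V=6.46%E=6%DS=0)S1(P=6000{4}280640XX{2}0016ff{
OS:4}01030307%ST=0.06965%RT=0.26963)EXTRA(FL=12345)
"""
for test in unwrap(wrapped):
    print(expand_rle(test))
```

Joining the lines before expanding matters because the 76-column wrapping can cut a bb{N} token in half.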

Each line other than SCAN and EXTRA is the response to a probe. SCAN holds metadata about the scan itself. In a probe line, P= is the IP packet contents. Some sensitive bytes such as the source and destination addresses are blocked out with XX. ST= is the send time and RT= is the receive time.
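
As a rough sketch of how one unwrapped line breaks down into its fields, the following splits on % and =. The function name is hypothetical, and treating ST and RT as plain decimal numbers that can be subtracted is an assumption based on the values shown above.

```python
def parse_test_line(line):
    """Split "NAME(K=V%K=V...)" into (name, {key: value})."""
    name, _, rest = line.partition("(")
    body = rest.rstrip(")")
    # Fields are separated by "%"; each field is KEY=VALUE.
    fields = dict(item.split("=", 1) for item in body.split("%") if item)
    return name, fields

line = ("T4(P=6000{4}140640XX{32}00169274a879a96100{4}500400{3}1c0000"
        "%ST=1.00662%RT=1.05586)")
name, fields = parse_test_line(line)

# Time between sending the probe and receiving the response.
elapsed = float(fields["RT"]) - float(fields["ST"])
print(name, sorted(fields), round(elapsed, 5))
```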

EXTRA contains one field, FL, which is the sent flow label. Not every OS supports changing the flow label, so some fingerprints will have FL=12345 and some will have FL=00000.

Files

nmap.groups is the database file. Each group is a class for classification purposes. Each group may contain multiple prints.

 group Linux 2.6.38 - 3.2
 nmapclass Linux | Linux | 2.6.X | general purpose
 cpe cpe:/o:linux:linux_kernel:2.6
 nmapclass Linux | Linux | 3.X | general purpose
 cpe cpe:/o:linux:linux_kernel:3
 
 print
 # Linux scanme 2.6.39.1-linode34 #1 SMP Tue Jun 21 10:29:24 EDT 2011 i686 GNU/Linux, Ubuntu 10.04, from david
 SCAN(V=5.61TEST4%OT=22%CT=1%CU=38935%DS=5%DC=I)
 S1(P=6000{4}2806fbXX{32}0016bfd19de75ded63fd0ed7a01237c841010000020405a00402080a56a24149ff{4}01030305%ST=0.075288%RT=0.088383)
 ...
 
 print
 # Linux scanme 2.6.39.1-linode34 #1 SMP Tue Jun 21 10:29:24 EDT 2011 i686 GNU/Linux, Ubuntu 10.04, from web
 SCAN(V=5.61TEST4%OT=22%CT=1%CU=33901%DS=1%DC=D)
 S1(P=6000{4}280640XX{32}001687c316611e7ecaeccd92a01237c884aa0000020405a00402080a56a2e029ff{4}01030305%ST=0.008097%RT=0.008497)
 ...
 
 group HP ProCurve 2520G switch
 nmapclass HP | embedded || switch
 cpe cpe:/h:hp:procurve_switch_2520g
 
 print
 S1(P=6000{4}2c063a2a01034802c30000c29134fffe814980200104701f08102e00{7}020017bded6c8be3e05f2ecb24b012ffff1ee10000020404c401030301{3}04020101080ac53856baff{4}%ST=0.1000{7}1%RT=0.1000{7}1)
 ...
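
To make the layout concrete, here is a minimal reader for a file shaped like the excerpt above. It is a sketch based only on that excerpt (the keywords group, nmapclass, cpe, and print, plus # comments); the real file may use syntax not shown here, and train.py has its own parser. The sample text is made up and abridged.

```python
def parse_groups(text):
    """Return a list of groups; a group's list index is its class number."""
    groups = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        keyword, _, rest = line.partition(" ")
        if keyword == "group":
            groups.append({"name": rest, "nmapclass": [], "cpe": [], "prints": []})
        elif keyword in ("nmapclass", "cpe"):
            groups[-1][keyword].append(rest)
        elif keyword == "print":
            groups[-1]["prints"].append([])  # start a new print
        else:
            groups[-1]["prints"][-1].append(line)  # a fingerprint line
    return groups

# Abridged, made-up sample in the layout shown above.
sample = """
group Linux 2.6.38 - 3.2
nmapclass Linux | Linux | 2.6.X | general purpose
cpe cpe:/o:linux:linux_kernel:2.6

print
# comment describing where the print came from
SCAN(V=5.61TEST4%OT=22%CT=1%CU=38935%DS=5%DC=I)

group HP ProCurve 2520G switch
nmapclass HP | embedded || switch
cpe cpe:/h:hp:procurve_switch_2520g

print
EXTRA(FL=12345)
"""
for number, group in enumerate(parse_groups(sample)):
    print(number, group["name"], len(group["prints"]))
```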

nmap.set describes what features are used for training. It has a simple $ variable syntax that enables setting multiple features with similar names. For example,

 $IPV6 = [
     S1
     S2
     S3
 ]
 $IPV6 * [
     PLEN
     TC
 ]

stands for

 S1.PLEN
 S1.TC
 S2.PLEN
 S2.TC
 S3.PLEN
 S3.TC
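
The expansion can be sketched as follows. This handles only the two forms shown above ($NAME = [ ... ] and $NAME * [ ... ]) and assumes that any other non-blank line names a single feature; the actual nmap.set syntax may allow more, so treat the function as an illustration rather than a replacement for the real reader.

```python
import re

def expand_set(text):
    """Expand "$NAME = [...]" and "$NAME * [...]" blocks into feature names."""
    variables = {}  # variable name -> list of values
    features = []   # expanded feature names, in order
    mode = None     # (operator, variable name) while inside [...]
    items = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        header = re.fullmatch(r"\$(\w+)\s*([=*])\s*\[", line)
        if header:
            mode, items = (header.group(2), header.group(1)), []
        elif line == "]" and mode:
            op, name = mode
            if op == "=":
                variables[name] = items
            else:
                # Every variable value is combined with every listed suffix.
                features += [f"{p}.{s}" for p in variables[name] for s in items]
            mode = None
        elif mode:
            items.append(line)
        else:
            features.append(line)  # assumption: a bare line is one feature
    return features

example = """
$IPV6 = [
    S1
    S2
    S3
]
$IPV6 * [
    PLEN
    TC
]
"""
print("\n".join(expand_set(example)))
```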

The meaning of each feature name is described in vectorize.py.
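
For orientation, PLEN and TC look like the Payload Length and Traffic Class fields of the fixed IPv6 header, which occupies the first bytes of P=. The sketch below decodes those header fields from an already expanded P= value. The field layout is the standard IPv6 header; that vectorize.py derives its features exactly this way is an assumption, and the function name is made up.

```python
def ipv6_header_fields(p_hex):
    """Decode the first 8 bytes of an IPv6 packet given as expanded hex."""
    word = int(p_hex[:8], 16)  # version, traffic class, flow label
    return {
        "version": word >> 28,
        "TC": (word >> 20) & 0xFF,
        "flow_label": word & 0xFFFFF,
        "PLEN": int(p_hex[8:12], 16),
        "next_header": int(p_hex[12:14], 16),
        "hop_limit": int(p_hex[14:16], 16),
    }

# Start of the S1 response from the unwrapped example (6000{4}280640 expanded).
print(ipv6_header_fields("6000000000280640"))
```

The address bytes that are blocked out with XX come after these eight bytes, so they do not get in the way of the decoding.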

The program train.py reads nmap.groups, trains a classifier, and outputs nmap.model.

Finally, c_struct.py reads nmap.model and converts it to C++ source code, which should be copied to FPModel.cc.

Integration procedure

Open a submission from the mbox. Copy and paste the wrapped OS: lines to a file, t.fp.

Do a first trial prediction:

 $ ./predict.py -m nmap.model <(./nmap26fp.py t.fp)
 == /dev/fd/63 ==
 nmapclasses:
 predictions
 62.  93.46%  23.43 Linux 3.7 - 3.9
 60.  23.62%  24.96 Linux 3.2
 61.   2.96% 457.42 Linux 3.2 - 3.8
 38.   2.05% 105.35 Apple Mac OS X 10.6.8 - 10.7.3 (Snow Leopard - Lion) (Darwin 10.8.0 - 11.3.0) or iOS 4.3.3
 ...

The nmap26fp.py program converts a fingerprint from Nmap's "fp" format to a different format called "6fp". The predict program requires a 6fp input.

The first column is the class number, which is the zero-indexed ordinal of the corresponding group in nmap.groups. The second column is the matching score. The class with the highest score is declared the OS match. The third column is the novelty, which is a measure of how far the observed feature vector is from the center of the other vectors in the class.

It looks like we want to add this observed print to class 62. We can see how it differs from some elements of the class. Find the line number where class 62 begins (remember it is zero-indexed):

 $ grep -n ^group nmap.groups | head -n 63 | tail -n 1
 4243:group Linux 3.7 - 3.9
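
The same lookup can be written in Python, which avoids the off-by-one arithmetic in head -n 63. This is a small sketch with a made-up function name; like the grep pattern, it counts every line that begins with "group".

```python
def group_line_number(lines, class_number):
    """Return (line number, line) of the zero-indexed class_number-th group."""
    seen = -1
    for lineno, line in enumerate(lines, start=1):
        if line.startswith("group"):
            seen += 1
            if seen == class_number:
                return lineno, line.rstrip("\n")
    raise IndexError(f"no group with class number {class_number}")

# With the real file this would be something like:
#   with open("nmap.groups") as f:
#       print(group_line_number(f, 62))
demo = ["group A\n", "print\n", "S1(...)\n", "group B\n", "group C\n"]
print(group_line_number(demo, 2))
```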

Copy each fingerprint in that group into its own file, for example 62.1.fp, 62.2.fp, etc. You have to add OS: to the beginning of each line after pasting into a new file, because the vectorize.py program uses OS: as a format hint. You can look at the feature vector:

 $ ./vectorize.py -s nmap.set t.fp
         40  S1.PLEN
          0  S1.TC
         40  S2.PLEN
          0  S2.TC
         40  S3.PLEN
          0  S3.TC

But what you really want is a diff with a reference print:

 $ ./vecdiff 62.1.fp t.fp
           0  S6.TC
 -   UNKNOWN  IE1.PLEN
 -   UNKNOWN  IE1.TC
 -   UNKNOWN  IE2.PLEN
 -   UNKNOWN  IE2.TC
 -   UNKNOWN  NS.PLEN
 -   UNKNOWN  NS.TC
 +       128  IE1.PLEN
 +         0  IE1.TC
 +        88  IE2.PLEN
 +         0  IE2.TC
 +        24  NS.PLEN
 +         0  NS.TC
         356  U1.PLEN
 ...
           0  T7.TC
 -24340034693.2  TCP_ISR
 +26194873735.1  TCP_ISR
       43690  S1.TCP_WINDOW
 ...
           1  S1.TCP_SACKOK
 -         5  S1.TCP_WSCALE
 +         7  S1.TCP_WSCALE
       43690  S2.TCP_WINDOW

The reference print doesn't have responses to the IE1, IE2, and NS probes, but that is probably a firewall/LAN issue rather than a characteristic of the OS. TCP_ISR is essentially a match. The only other differences are in TCP_WSCALE features, which happen not to be meaningful for Linux.
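
The idea behind this kind of comparison can be sketched as a feature-by-feature diff of two vectors. This is not how vecdiff is implemented: it is a simplified stand-in that assumes both vectors are dictionaries with the same feature names, and it prints each changed feature as an adjacent -/+ pair rather than grouping them the way the output above does.

```python
def vector_diff(ref, obs):
    """Return diff-style lines comparing two {feature: value} vectors."""
    lines = []
    for name, ref_value in ref.items():
        obs_value = obs.get(name, "UNKNOWN")
        if ref_value == obs_value:
            lines.append(f"  {ref_value!s:>10}  {name}")
        else:
            lines.append(f"- {ref_value!s:>10}  {name}")
            lines.append(f"+ {obs_value!s:>10}  {name}")
    return lines

# A few values taken from the diff above.
ref = {"S6.TC": 0, "IE1.PLEN": "UNKNOWN", "S1.TCP_WSCALE": 5}
obs = {"S6.TC": 0, "IE1.PLEN": 128, "S1.TCP_WSCALE": 7}
print("\n".join(vector_diff(ref, obs)))
```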

Repeat the process with 62.2.fp, 62.3.fp, etc. until you are convinced either that you should add the new observed print to the group, or that it should go into a new group. A large number of differences does not disqualify an observed print from being added to a group: you want diversity within classes.

Format the observed print for addition to the database:

 $ ./unwrap.py -r -s t.fp
 SCAN(V=6.46%OT=22%CT=1%CU=37709%DS=0%DC=L)
 S1(P=6000{4}280640XX{32}0016926bfb9a5d888face9b5a012aaaa003000000204ffc40402080a5cf62431ff{4}01030307%ST=0.06965%RT=0.26963)
 S2(P=6000{4}280640XX{32}0016926c73e9c2938face9b6a012aaaa003000000204ffc40402080a5cf6244aff{4}01030307%ST=0.169595%RT=0.269659)
 S3(P=6000{4}280640XX{32}0016926d1d007f858face9b7a012aaaa003000000204ffc40101080a5cf62463ff{4}01030307%ST=0.269585%RT=0.469605)
 S4(P=6000{4}280640XX{32}0016926e0918ef9b8face9b8a012aaaa003000000204ffc40402080a5cf6247cff{4}01030307%ST=0.369603%RT=0.469627)
 S5(P=6000{4}280640XX{32}0016926f0271b9cd8face9b9a012aaaa003000000204ffc40402080a5cf62495ff{4}01030307%ST=0.469583%RT=0.661085)
 S6(P=6000{4}240640XX{32}00169270082a9c9b8face9ba9012aaaa002c00000204ffc40402080a5cf624aeff{4}%ST=0.569583%RT=0.661118)
 IE1(P=6000{4}803a40XX{32}8109d26cabcd00{122}%ST=0.611868%RT=0.661144)
 IE2(P=6000{4}583a40XX{32}0401d3d300{3}386001234500280026XX{32}3c00010400{4}2b00010400{12}3a00010400{4}8000d3ecabcd0001%ST=0.661051%RT=0.858067)
 NS(P=6000{4}183affXX{32}8800b7a9c000{3}XX{16}%ST=0.759501%RT=0.858103)
 U1(P=6000{3}01643a40XX{32}010468ee00{4}6001234501341138XX{32}9240934d01346e8d43{300}%ST=0.80877%RT=0.858122)
 TECN(P=6000{4}200640XX{32}001692719c25dc1e8face9bb8052aaaa002800000204ffc40101040201030307%ST=0.858013%RT=1.05582)
 T4(P=6000{4}140640XX{32}00169274a879a96100{4}500400{3}1c0000%ST=1.00662%RT=1.05586)
 T5(P=6000{4}140640XX{32}0001927500{4}8face9bf501400{3}1c0000%ST=1.05579%RT=1.35385)
 T6(P=6000{4}140640XX{32}0001927650d5e2c900{4}500400{3}1c0000%ST=1.10505%RT=1.35388)
 T7(P=6000{4}140640XX{32}0001927700{4}8face9c1501400{3}1c0000%ST=1.15422%RT=1.35389)
 EXTRA(FL=12345)

Copy and paste it into an existing group, or create a new group.

Now train the model with your new change:

 $ ./train.py -c 100 -s nmap.set -g nmap.groups --scale > nmap.model
 Training.
 Accuracy 67.4698795181

The "Accuracy" number is not very meaningful, so don't pay too much attention to it. (It tracks how many training samples got assigned to their original class, and is penalized when a Linux training example falls into a Linux class other than the one it started in; also I suspect that the cross-validation may not work well when you have few members per class, as we do.)

Run the prediction again:

 $ ./predict.py -m nmap.model <(./nmap26fp.py t.fp)
 == /dev/fd/63 ==
 nmapclasses: 
 predictions
 62.  99.05%   5.70 Linux 3.7 - 3.9
 60.   1.32%  24.96 Linux 3.2
 61.   1.25% 457.42 Linux 3.2 - 3.8
 52.   0.64%  81.64 Linux 2.6.16 - 3.2
 ...

You are looking for a high score (above 90%) and a low novelty (under 15 is okay).
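
If you are checking many submissions, a rule of thumb like this one can be applied to the predict.py output mechanically. The sketch below parses result lines in the layout shown above and applies the 90% and 15 thresholds; the function names are made up, and the line layout is assumed from the sample output only.

```python
import re

# class number, score in percent, novelty, group name
LINE_RE = re.compile(r"\s*(\d+)\.\s+([\d.]+)%\s+([\d.]+)\s+(.*)")

def parse_predictions(output):
    """Return (class number, score, novelty, name) for each result line."""
    rows = []
    for line in output.splitlines():
        m = LINE_RE.match(line)
        if m:  # header lines and "..." do not match and are skipped
            rows.append((int(m.group(1)), float(m.group(2)),
                         float(m.group(3)), m.group(4).strip()))
    return rows

def looks_good(row, min_score=90.0, max_novelty=15.0):
    """Apply the rule of thumb: high score and low novelty."""
    _, score, novelty, _ = row
    return score > min_score and novelty < max_novelty

output = """\
== /dev/fd/63 ==
nmapclasses:
predictions
62.  99.05%   5.70 Linux 3.7 - 3.9
60.   1.32%  24.96 Linux 3.2
...
"""
best = max(parse_predictions(output), key=lambda row: row[1])
print(best, looks_good(best))
```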

After handling all the submissions, generate the FPModel.cc source file and copy it into the Nmap source tree.

 $ ./c_struct.py -m nmap.model > FPModel.cc