Nmap/Structured Script Output

This feature has been completed and merged. The documentation is at https://nmap.org/book/nse-api.html#nse-structured-output

Overview

Nmap's XML output is intended to be the official machine-readable format for programs that consume Nmap output. Unfortunately, the output of NSE scripts is currently handled as a blob of text and stuffed into the output attribute of the script element. This makes the output essentially inaccessible to an XML parser and does not encourage consistent formatting between scripts.

Because of these issues, there is a need for a method to structure script output in a way that facilitates parsing by scripts and programs. A few solutions have been suggested, and are described below. Some include patches, with varying degrees of testing and compatibility with the current codebase. Some outstanding questions are:

  • Should there be a single, internal representation of script output, from which XML and Normal output formats are derived?
  • Should script authors be able to specify different content for XML and Normal outputs?
  • Should the structured-output method chosen be backwards-compatible with existing scripts? To what degree?

Requirements

  1. Human-readable text output must continue to appear in //script/@output (XPath syntax) attributes.
    • This is the same output shown in normal screen output, with newlines and everything. In other words, keep backwards compatibility for tools like Ndiff.
    • It is probably fine for human-readable output to be derived automatically from structured output; that is, for the human-readable output to be essentially a pretty-printed Lua table.
      • Script authors should have a choice of formatting functions to use in generating human-readable output (e.g. tabular, indented, comma-separated, etc.)
      • Script authors may be allowed to specify a Lua function which generates human-readable output from the Lua table
    • It is probably not sufficient to automatically derive structured output from textual human-readable output.
  2. Scripts can't just make up their own new XML elements--it has to be possible to validate XML documents against a predefined DTD as always.
  3. It must be possible to represent
    • Single string outputs
    • Lists
    • Dictionary name-value mappings
    • Some degree of nesting of the above elements (arbitrary nesting is desirable but not necessary).
  4. It is sufficient if all parts of the output (list elements, dictionary keys and values) are strings. That is, it's not necessary to represent an integer explicitly, for example. It's fine if processing scripts need to know the expected data types of the fields that they process. IP addresses can be represented textually.

To be continued...

Use cases

End user

User wants to count how many times the http-title script had output containing "Linksys".

User wants to find all hosts that have ssh-hostkey output, and save a report to a text file where each line contains an IP address and an MD5 host key fingerprint.

User wants to make a copy of an XML output file, but remove output from any *-ls scripts.

User wants to find what hosts and ports have services with expired SSL certificates (the "Not valid after" key of ssl-cert).


User wants a mapping of which accounts are registered with the Messenger service on which machines (one of the Names from nbstat).

Here is an example of the XML that could be generated for the output of the nbstat script:

<script name="nbstat" output="...Flags: &lt;unique&gt;&lt;active&gt;&#x0A;">
    <dict>
        <elem key="NetBIOS name">NAS</elem>
        <elem key="NetBIOS user">&lt;unknown&gt;</elem>
        <elem key="NetBIOS MAC">&lt;unknown&gt;</elem>
    </dict>
    <list name="Names">
        <dict>
            <elem key="name">NAS</elem>
            <elem key="number">20</elem>
            <elem key="flags">
                <list>
                    <elem>unique</elem>
                    <elem>active</elem>
                </list>
            </elem>
        </dict>
        <dict>
            <elem key="name">JDOE</elem>
            <elem key="number">03</elem>
            <elem key="flags">
                <list>
                    <elem>unique</elem>
                    <elem>active</elem>
                </list>
            </elem>
        </dict>
    </list>
</script>

The Lua table that is converted into this XML could look something like this:

{
  { NetBIOS_name="NAS", NetBIOS_user="<unknown>", NetBIOS_MAC="<unknown>" },
  Names={
    { name="NAS", number="20", flags={ "unique", "active" } },
    { name="JDOE", number="03", flags={ "unique", "active" } },
  }
}

XPath to make the selection: //script[@name="nbstat"]/list[@name="Names"]/dict/elem[@key="number"][text()="20"]/../elem[@key="name"]
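For consumers that do not have a full XPath engine available, the same selection can be approximated with a few lines of Python. This is only a sketch: it assumes the hypothetical dict/list/elem layout shown in the example above (abbreviated here), and the helper name names_with_number is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the hypothetical nbstat example above.
SAMPLE = """<script name="nbstat" output="...">
    <dict>
        <elem key="NetBIOS name">NAS</elem>
        <elem key="NetBIOS user">&lt;unknown&gt;</elem>
    </dict>
    <list name="Names">
        <dict>
            <elem key="name">NAS</elem>
            <elem key="number">20</elem>
            <elem key="flags"><list><elem>unique</elem><elem>active</elem></list></elem>
        </dict>
        <dict>
            <elem key="name">JDOE</elem>
            <elem key="number">03</elem>
            <elem key="flags"><list><elem>unique</elem><elem>active</elem></list></elem>
        </dict>
    </list>
</script>"""


def names_with_number(script, number):
    """Return the "name" of every Names entry whose "number" equals `number`."""
    found = []
    for entry in script.findall("./list[@name='Names']/dict"):
        # Collect the direct elem children of this entry into a plain dict.
        fields = {e.get("key"): (e.text or "") for e in entry.findall("elem")}
        if fields.get("number") == number:
            found.append(fields.get("name"))
    return found


script = ET.fromstring(SAMPLE)
print(names_with_number(script, "20"))
```

The lookup walks each dict entry once and compares plain strings, which matches the requirement that all values may be treated as text.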


User wants to find Windows machines with SMB Message Signing disabled (from smb-security-mode).

Script author

The duplicates script wants to output a list of lists--each one a set of IP addresses that appear to be the same host for some reason.

The smb-ls script wants to output a list of file and directory names. Each element in the list is a dictionary with keys like mtime, size, and filename. The script also wants to output a compact human-readable format that looks like the output of ls.

Output proposals

Here we show the evolution of the output proposals so far and make room for more proposals.

Status quo

This is what XML output looks like as of Nmap 6.01. Line breaks added for readability.

$ nmap -oX - -n -Pn -p 443 secwiki.org --script=http-title,ssl-cert
<script id="ssl-cert"
        output="Subject: commonName=secwiki.org...&#xa;
Issuer: organizationName=Equifax/countryName=US&#xa;
Public Key type: rsa&#xa;
Public Key bits: 1024&#xa;
Not valid before: 2010-11-15 19:22:12&#xa;
Not valid after:  2012-11-17 05:18:45&#xa;
MD5:   c729 827b 8941 9bdc 20b0 43b4 9d9d 1595&#xa;
SHA-1: 157b 440e 3df4 2994 7a82 13d4 1856 5da6 f10f 3063"/>
<script id="http-title" output="SecWiki"/>

YAML for everything

YAML proposal of 2011-03-31 (thread continues). YAML patch.

$ nmap -oX - -n -Pn -p 443 secwiki.org --script=http-title,ssl-cert
<script id="ssl-cert"
        output="Subject: commonName=secwiki.org...&#xa;
Issuer: organizationName=Equifax/countryName=US&#xa;
Public Key type: rsa&#xa;
Public Key bits: 1024&#xa;
Not valid before: 2010-11-15 19:22:12&#xa;
Not valid after:  2012-11-17 05:18:45&#xa;
MD5: c729 827b 8941 9bdc 20b0 43b4 9d9d 1595&#xa;
SHA-1: 157b 440e 3df4 2994 7a82 13d4 1856 5da6 f10f 3063"/>
<script id="http-title" output="SecWiki"/>

(Well, I guess nothing changes except for some spacing, because the normal format_output output is already close to YAML, at least when you use colons to label things.)

XML elements representing YAML structure

The usual method of representing YAML as XML is no good because it creates arbitrary new elements. Instead David proposed to represent the structure using a finite number of new XML elements. http://seclists.org/nmap-dev/2011/q2/149.

$ nmap -oX - -n -Pn -p 443 secwiki.org --script=http-title,ssl-cert
<script id="ssl-cert" output="...">
  <dict>
    <elem key="Subject">commonName=secwiki.org...</elem>
    <elem key="Issuer">organizationName=Equifax/countryName=US</elem>
    <elem key="Public Key type">rsa</elem>
    <elem key="Public Key bits">1024</elem>
    <elem key="Not valid before">2010-11-15 19:22:12</elem>
    <elem key="Not valid after">2012-11-17 05:18:45</elem>
    <elem key="MD5">c729 827b 8941 9bdc 20b0 43b4 9d9d 1595</elem>
    <elem key="SHA-1">157b 440e 3df4 2994 7a82 13d4 1856 5da6 f10f 3063</elem>
  </dict>
</script>
<script id="http-title" output="SecWiki">
  <elem>SecWiki</elem>
</script>

Proposal alpha

http://seclists.org/nmap-dev/2012/q2/375 of 21 May 2012. Sample output changes. Relative to the previous proposal, it removes output attributes, and doesn't have a dict container around the dictionary.

$ nmap -oX - -n -Pn -p 443 secwiki.org --script=http-title,ssl-cert
<script id="ssl-cert">
  <elem key="Subject">commonName=secwiki.org...</elem>
  <elem key="Issuer">organizationName=Equifax/countryName=US</elem>
  <elem key="Public Key type">rsa</elem>
  <elem key="Public Key bits">1024</elem>
  <elem key="Not valid before">2010-11-15 19:22:12</elem>
  <elem key="Not valid after">2012-11-17 05:18:45</elem>
  <elem key="MD5">c729 827b 8941 9bdc 20b0 43b4 9d9d 1595</elem>
  <elem key="SHA-1">157b 440e 3df4 2994 7a82 13d4 1856 5da6 f10f 3063</elem>
</script>
<script id="http-title">
  <elem>SecWiki</elem>
</script>

Proposal beta

Proposal of 2012-06-13 to support parallel string and table outputs. String output without table output causes no structured output to be emitted. Table output without string output synthesizes a pretty-printed indented text output. If both outputs are provided, they are independent. Structured XML output uses dict, list, and elem elements. Each child of a dict must have a key attribute; children of list must not have such an attribute.

$ nmap -oX - -n -Pn -p 443 secwiki.org --script=http-title,ssl-cert
<script id="ssl-cert"
        output="Subject: commonName=secwiki.org...&#xa;
Issuer: organizationName=Equifax/countryName=US&#xa;
Public Key type: rsa&#xa;
Public Key bits: 1024&#xa;
Not valid before: 2010-11-15 19:22:12&#xa;
Not valid after:  2012-11-17 05:18:45&#xa;
MD5:   c729 827b 8941 9bdc 20b0 43b4 9d9d 1595&#xa;
SHA-1: 157b 440e 3df4 2994 7a82 13d4 1856 5da6 f10f 3063">
  <dict>
    <dict key="subject">
      <elem key="commonName">secwiki.org</elem>
    </dict>
    <dict key="issuer">
      <elem key="organizationName">Equifax</elem>
      <elem key="countryName">US</elem>
    </dict>
    <dict key="pubkey">
      <elem key="type">rsa</elem>
      <elem key="bits">1024</elem>
    </dict>
    <dict key="validity">
      <elem key="notBefore">2010-11-15 19:22:12</elem>
      <elem key="notAfter">2012-11-17 05:18:45</elem>
    </dict>
    <elem key="md5">c729827b89419bdc20b043b49d9d1595</elem>
    <elem key="sha1">157b440e3df429947a8213d418565da6f10f3063</elem>
  </dict>
</script>
<script id="http-title" output="SecWiki"/>
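The structural rules of proposal beta (children of a dict carry a key attribute, children of a list do not) can be sketched in Python as follows. This is an illustration only, not the proposed patch: it models Lua tables as Python dicts and lists, and the function name to_element is hypothetical.

```python
import xml.etree.ElementTree as ET


def to_element(value, key=None):
    """Convert a nested dict/list/scalar into a dict, list, or elem element.

    `key` is set only when the value is a child of a dict, so list
    children never receive a key attribute.
    """
    if isinstance(value, dict):
        node = ET.Element("dict")
        for child_key, child in value.items():
            node.append(to_element(child, key=str(child_key)))
    elif isinstance(value, (list, tuple)):
        node = ET.Element("list")
        for child in value:
            node.append(to_element(child))
    else:
        # Everything else is emitted as a string, per requirement 4.
        node = ET.Element("elem")
        node.text = str(value)
    if key is not None:
        node.set("key", key)
    return node


# A subset of the ssl-cert data from the example above.
cert = {
    "subject": {"commonName": "secwiki.org"},
    "pubkey": {"type": "rsa", "bits": "1024"},
    "md5": "c729827b89419bdc20b043b49d9d1595",
}
script = ET.Element("script", {"id": "ssl-cert"})
script.append(to_element(cert))
print(ET.tostring(script, encoding="unicode"))
```

Because only three element names are ever produced, the result can still be validated against a fixed DTD, which is the point of requirement 2.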

Proposal gamma

This proposal is slightly modified from proposal beta. The major difference is that both dict and list elements are replaced with table elements, which more closely reflects the Lua table structure. Also, since every script that has XML output returns a table, the outermost table element is not output:

 <script id="ssl-cert"
         output="Subject: commonName=secwiki.org...&#xa;
 Issuer: organizationName=Equifax/countryName=US&#xa;
 Public Key type: rsa&#xa;
 Public Key bits: 1024&#xa;
 Not valid before: 2010-11-15 19:22:12&#xa;
 Not valid after:  2012-11-17 05:18:45&#xa;
 MD5:   c729 827b 8941 9bdc 20b0 43b4 9d9d 1595&#xa;
 SHA-1: 157b 440e 3df4 2994 7a82 13d4 1856 5da6 f10f 3063">
     <table key="subject">
       <elem key="commonName">secwiki.org</elem>
     </table>
     <table key="issuer">
       <elem key="organizationName">Equifax</elem>
       <elem key="countryName">US</elem>
     </table>
     <table key="pubkey">
       <elem key="type">rsa</elem>
       <elem key="bits">1024</elem>
     </table>
     <table key="validity">
       <elem key="notBefore">2010-11-15 19:22:12</elem>
       <elem key="notAfter">2012-11-17 05:18:45</elem>
     </table>
     <elem key="md5">c729827b89419bdc20b043b49d9d1595</elem>
     <elem key="sha1">157b440e3df429947a8213d418565da6f10f3063</elem>
 </script>
 <script id="http-title" output="SecWiki"></script>

If a script returns a string (or a number), then the string is set as the @output attribute of the script element, which has no child nodes. If a script returns a table, it will be "pretty-printed" into a textual format using a recursive algorithm. This table:

 { "This is an example",
   name="thing one",
   isCool=false,
   ["Now for some numbers"] = {42, 13, 7, count=3}
 }

Results in this output:

| script-name:
|   This is an example
|   name: thing one
|   isCool: false
|   Now for some numbers: 
|     42
|     13
|     7
|     count: 3
|_
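The recursive pretty-printing described above can be sketched in Python. This is not the NSE implementation. It makes several assumptions: a Lua table is modelled as an ordered list of (key, value) pairs, with key set to None for positional entries; an unkeyed nested table simply has its children indented one level deeper; and the "| script-name:" framing lines are left out.

```python
def scalar(value):
    """Render a leaf value as text, printing booleans the way Lua does."""
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)


def format_table(items, depth=1):
    """Return indented text lines for a list of (key, value) pairs."""
    lines = []
    pad = "  " * depth
    for key, value in items:
        if isinstance(value, list):
            # Nested table: print its label (if any), then recurse one level deeper.
            if key is not None:
                lines.append(f"{pad}{key}:")
            lines.extend(format_table(value, depth + 1))
        elif key is None:
            lines.append(pad + scalar(value))
        else:
            lines.append(f"{pad}{key}: {scalar(value)}")
    return lines


# The example table from above, with an explicit ordering.
example = [
    (None, "This is an example"),
    ("name", "thing one"),
    ("isCool", False),
    ("Now for some numbers", [(None, 42), (None, 13), (None, 7), ("count", 3)]),
]
print("\n".join(format_table(example)))
```

Lua does not guarantee an iteration order for non-integer keys, so a real implementation would have to choose one; the explicit pair list here sidesteps that question.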

This proposal is getting traction, having been implemented at https://github.com/bonsaiviking/Nmap-script-XML. Next steps:

  • Implement some standard transforms and make a metatable-setter function in stdnse to expose them to script writers.
  • Begin adapting existing scripts to use this API, identifying bugs and improvements.

Proposal gamma-1

Proposed as extensions to proposal gamma. http://seclists.org/nmap-dev/2012/q2/973

  • Support errors as distinct elements in XML, with standardized handling for normal output. Considerations:
    • Should error elements be at top level only, or allowed to be nested in tables?
    • How can a script indicate something is an error? ("error" table index perhaps)
  • Possibly support warnings as distinct elements, too.
Let's handle errors as something distinct. I don't think they need to distract from solving just plain output. We'll add a special <error> element, and it will be set by a function call, not by a further complicated script return value. The same goes for <vuln>, which will have its own distinguished section distinct from script output. David (talk) 14:57, 19 July 2012 (PDT)

Nmap-script-XML c9c97295 2012-07-15

$ nmap -oX - -n -Pn -p 443 secwiki.org --script=http-title,ssl-cert
        <script id="http-title" output="SecWiki"></script>
        <script id="ssl-cert" output="issuer:   organizationalUnitName:
Equifax Secure Certificate Authority  countryName: US  organizationName: Equifax
md5: c729827b89419bdc20b043b49d9d1595 pubkey:   type: rsa  bits: 1024 sha1:
157b440e3df429947a8213d418565da6f10f3063 subject:   organizationalUnitName:
Domain Control Validated - RapidSSL(R)  countryName: US  organizationName:
secwiki.org  commonName: secwiki.org  serialNumber: UGigzK-7j79pxB7xW3MZMfPeYWY/gJnJ
validity:   notBefore:   day: 15  sec: 12  min: 22  hour: 19  month: 11  year: 2010
notAfter:   day: 17  sec: 45  min: 18  hour: 5  month: 11  year: 2012">
          <table key="subject">
            <elem key="organizationalUnitName">Domain Control Validated - RapidSSL(R)</elem>
            <elem key="countryName">US</elem>
            <elem key="organizationName">secwiki.org</elem>
            <elem key="commonName">secwiki.org</elem>
            <elem key="serialNumber">UGigzK-7j79pxB7xW3MZMfPeYWY/gJnJ</elem>
          </table>
          <table key="issuer">
            <elem key="organizationalUnitName">Equifax Secure Certificate Authority</elem>
            <elem key="countryName">US</elem>
            <elem key="organizationName">Equifax</elem>
          </table>
          <elem key="md5">c729827b89419bdc20b043b49d9d1595</elem>
          <table key="pubkey">
            <elem key="type">rsa</elem>
            <elem key="bits">1024</elem>
          </table>
          <table key="validity">
            <table key="notBefore">
              <elem key="day">15</elem>
              <elem key="sec">12</elem>
              <elem key="min">22</elem>
              <elem key="hour">19</elem>
              <elem key="month">11</elem>
              <elem key="year">2010</elem>
            </table>
            <table key="notAfter">
              <elem key="day">17</elem>
              <elem key="sec">45</elem>
              <elem key="min">18</elem>
              <elem key="hour">5</elem>
              <elem key="month">11</elem>
              <elem key="year">2012</elem>
            </table>
          </table>
          <elem key="sha1">157b440e3df429947a8213d418565da6f10f3063</elem>
        </script>


Merge candidate 1

$ nmap -oX - -n -Pn -p 443 secwiki.org --script=http-title,ssl-cert
<script id="http-title" output="SecWiki">
  <elem key="title">SecWiki</elem>
</script>
<script id="ssl-cert" output="Subject: commonName=secwiki.org/...">
  <elem key="sha1">157b440e3df429947a8213d418565da6f10f3063</elem>
  <elem key="pem">-----BEGIN CERTIFICATE-----
  MIIDazCCAtSgAwIBAgIDFTSzMA0GCSqGSIb3DQEBBQUAME4xCzAJBgNVBAYTAlVT
  MRAwDgYDVQQKEwdFcXVpZmF4MS0wKwYDVQQLEyRFcXVpZmF4IFNlY3VyZSBDZXJ0
  aWZpY2F0ZSBBdXRob3JpdHkwHhcNMTAxMTE1MTkyMjEyWhcNMTIxMTE3MDUxODQ1
  WjCB3TEpMCcGA1UEBRMgVUdpZ3pLLTdqNzlweEI3eFczTVpNZlBlWVdZL2dKbkox
  CzAJBgNVBAYTAlVTMRQwEgYDVQQKEwtzZWN3aWtpLm9yZzETMBEGA1UECxMKR1Q5
  NjkzNzIwMDExMC8GA1UECxMoU2VlIHd3dy5yYXBpZHNzbC5jb20vcmVzb3VyY2Vz
  L2NwcyAoYykxMDEvMC0GA1UECxMmRG9tYWluIENvbnRyb2wgVmFsaWRhdGVkIC0g
  UmFwaWRTU0woUikxFDASBgNVBAMTC3NlY3dpa2kub3JnMIGfMA0GCSqGSIb3DQEB
  AQUAA4GNADCBiQKBgQCtGfJ589J4p8HOqE1U8EfXaawS0Q5VURsvifPItd8hjt7O
  eQ4nVgYMFV5qRIG1ey9kPU32+6smtyCumXNq32GyDcJ3PHVKxWNSMOSsu7auy0VC
  KK7qM5tDE2Yjuwb16ZsGjMoPDgCLtljcx2CU1nykXBonTe54V0QzkhfkfHtt6QID
  AQABo4HGMIHDMB8GA1UdIwQYMBaAFEjmaPkr0rKV10fYIyAQTzOYkJ/UMA4GA1Ud
  DwEB/wQEAwIE8DAdBgNVHSUEFjAUBggrBgEFBQcDAQYIKwYBBQUHAwIwFgYDVR0R
  BA8wDYILc2Vjd2lraS5vcmcwOgYDVR0fBDMwMTAvoC2gK4YpaHR0cDovL2NybC5n
  ZW90cnVzdC5jb20vY3Jscy9zZWN1cmVjYS5jcmwwHQYDVR0OBBYEFDClKsFfPfqm
  nkl4AhfuBNlc2KivMA0GCSqGSIb3DQEBBQUAA4GBALRSt0QX4TsuOWQoBTX2ZQSE
  4jlxXev5IHXWSOuVdUnpdxrxMwpp1mQ7gl+KH/VNG2PeGwUdSXvuBgN8V+fCudCJ
  rgJJYpLtLqGfxXU+XaL34ICJH+fwG7shWb8k1W4Q0Z54JeTyjI56F/GpDWY9jtk0
  LffllcPwT+7t3GGbZOYS
  -----END CERTIFICATE-----
  </elem>
  <table key="issuer">
    <elem key="countryName">US</elem>
    <elem key="organizationalUnitName">Equifax Secure Certificate Authority</elem>
    <elem key="organizationName">Equifax</elem>
  </table>
  <table key="subject">
    <elem key="countryName">US</elem>
    <elem key="organizationName">secwiki.org</elem>
    <elem key="commonName">secwiki.org</elem>
    <elem key="organizationalUnitName">Domain Control Validated - RapidSSL(R)</elem>
    <elem key="serialNumber">UGigzK-7j79pxB7xW3MZMfPeYWY/gJnJ</elem>
  </table>
  <elem key="md5">c729827b89419bdc20b043b49d9d1595</elem>
  <table key="validity">
    <elem key="notBefore">2010-11-15T19:22:12Z</elem>
    <elem key="notAfter">2012-11-17T05:18:45Z</elem>
  </table>
  <table key="pubkey">
    <elem key="bits">1024</elem>
    <elem key="type">rsa</elem>
  </table>
</script>
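With the timestamp format shown in this candidate, the expired-certificate use case from earlier on this page could be handled along these lines. This is a sketch under assumptions: it relies on the validity/notAfter layout and the trailing "Z" format from the sample above, the sample below is abbreviated, and the function name cert_expired is hypothetical.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# Abbreviated copy of the ssl-cert element from the sample above.
SAMPLE = """<script id="ssl-cert" output="Subject: commonName=secwiki.org/...">
  <table key="validity">
    <elem key="notBefore">2010-11-15T19:22:12Z</elem>
    <elem key="notAfter">2012-11-17T05:18:45Z</elem>
  </table>
</script>"""


def cert_expired(script, now):
    """Return True if the notAfter timestamp of an ssl-cert element is before `now`."""
    node = script.find("./table[@key='validity']/elem[@key='notAfter']")
    if node is None or not node.text:
        return False  # No structured validity data to judge by.
    not_after = datetime.strptime(node.text.strip(), "%Y-%m-%dT%H:%M:%SZ")
    # The trailing "Z" marks UTC, so attach that zone before comparing.
    return not_after.replace(tzinfo=timezone.utc) < now


script = ET.fromstring(SAMPLE)
print(cert_expired(script, datetime.now(timezone.utc)))
```

Compare this with the status quo, where the same check requires scraping the "Not valid after" line out of the output attribute.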

Links

XPath

Examples on this page use XPath selector syntax to refer to XML elements and attributes. You can grep an XML file using these selectors with the xmlstarlet tool.

  • //script selects all script elements.
  • //hostscript/script selects all script elements that are immediate children of a hostscript element.
  • //script[@id="http-title"] selects all script elements with the id="http-title" attribute set.

Here are examples of xmlstarlet usage on a sample XML file, before.xml.

# Print all <script> elements.
$ xmlstarlet sel -t -m '//script' -c . -n before.xml
<script id="http-methods" output="No Allow or Public header in OPTIONS response (status code 501)"/>
<script id="http-title" output="301 Moved Permanently..."/>
<script id="ssl-cert" output="Subject: commonName=XXXXXXXXXXXX/organizationName=..."/>
...
# Print id and output of all hostscript outputs.
$ xmlstarlet sel -t -m '//hostscript/script' -v @id -o ': ' -v @output -n before.xml
nbstat: NetBIOS name: NAS, NetBIOS user: <unknown>, NetBIOS MAC: <unknown>
smbv2-enabled: Server supports SMBv2 protocol
nbstat: NetBIOS name: XXXX, NetBIOS user: <unknown>, NetBIOS MAC: xx:xx:xx:xx:xx:xx (unknown)
smb-os-discovery:
  OS: Windows Vista (TM) Enterprise 6002 Service Pack 2 (Windows Vista (TM) Enterprise 6.0)
  Computer name: XXXX
  NetBIOS computer name: XXXX
  Workgroup: WORKGROUP
  System time: 2012-05-27 21:58:06 UTC-5
# Grep out all HTML titles.
$ xmlstarlet sel -t -m '//script[@id="http-title"]' -v @output -n before.xml
301 Moved Permanently
Did not follow redirect to https://router/
Site doesn't have a title (text/html).
   PX-EH
XenServer 5.6.0
XenServer 5.6.0
XenServer 5.6.0
XenServer 5.6.0
Our Wiki
Moved
Teh Internets!
# Print the IPv4 addresses of all hosts having script output.
$ xmlstarlet sel -t -m '//port/script/../../../address[@addrtype="ipv4"]' -v @addr -n before.xml
192.168.1.1
192.168.1.3
192.168.1.5
192.168.1.6
192.168.1.10
192.168.1.81
192.168.1.200
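Where xmlstarlet is not available, roughly the same queries can be written with Python's standard library. The sketch below is illustrative only: the embedded document is a small made-up stand-in for before.xml (the pairing of addresses and titles is hypothetical), and the helper names are not part of any Nmap tool.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed stand-in for before.xml.
SAMPLE = """<nmaprun>
  <host>
    <address addr="192.168.1.1" addrtype="ipv4"/>
    <ports>
      <port protocol="tcp" portid="80">
        <script id="http-title" output="301 Moved Permanently"/>
      </port>
    </ports>
  </host>
  <host>
    <address addr="192.168.1.3" addrtype="ipv4"/>
    <ports>
      <port protocol="tcp" portid="80">
        <script id="http-title" output="XenServer 5.6.0"/>
      </port>
    </ports>
  </host>
  <host>
    <address addr="192.168.1.4" addrtype="ipv4"/>
    <ports>
      <port protocol="tcp" portid="22"/>
    </ports>
  </host>
</nmaprun>"""


def outputs(root, script_id):
    """Like //script[@id=...]/@output: the output of every matching script."""
    return [s.get("output", "") for s in root.iter("script") if s.get("id") == script_id]


def count_matching(root, script_id, needle):
    """Count outputs of `script_id` that contain the substring `needle`."""
    return sum(needle in out for out in outputs(root, script_id))


def hosts_with_port_scripts(root):
    """Like //port/script/../../../address[@addrtype="ipv4"]/@addr, one entry per host."""
    addrs = []
    for host in root.iter("host"):
        if host.find("./ports/port/script") is None:
            continue  # This host has no port script output.
        addr = host.find("./address[@addrtype='ipv4']")
        if addr is not None:
            addrs.append(addr.get("addr"))
    return addrs


root = ET.fromstring(SAMPLE)
print(outputs(root, "http-title"))
print(count_matching(root, "http-title", "XenServer"))
print(hosts_with_port_scripts(root))
```

The count_matching helper corresponds to the first end-user use case (counting http-title outputs that contain a given string).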