Nmap/Structured Script Output
This feature has been completed and merged. The documentation is at https://nmap.org/book/nse-api.html#nse-structured-output
Overview
Nmap's XML output is intended to be the official machine-readable format for programs that consume Nmap output. Unfortunately, the output of NSE scripts is currently handled as a blob of text and stuffed into the output attribute of the script tag. This makes the output essentially inaccessible to an XML parser and does not encourage consistent formatting between scripts.
Because of these issues, there is a need for a method to structure script output in a way that facilitates parsing by scripts and programs. A few solutions have been suggested, and are described below. Some include patches, with varying degrees of testing and compatibility with the current codebase. Some outstanding questions are:
- Should there be a single, internal representation of script output, from which XML and Normal output formats are derived?
- Should script authors be able to specify different content for XML and Normal outputs?
- Should the structured-output method chosen be backwards-compatible with existing scripts? To what degree?
Requirements
- Human-readable text output must continue to appear in //script/@output attributes (XPath syntax; see the XPath section at the end of this page).
  - This is the same output shown in normal screen output, newlines and all. In other words, keep backwards compatibility for tools like Ndiff.
- It is probably fine for human-readable output to be derived automatically from structured output; that is, for the human-readable output to be essentially a pretty-printed Lua table.
- Script authors should have a choice of formatting functions to use in generating human-readable output (e.g. tabular, indented, comma-separated, etc.)
- Script authors may be allowed to specify a Lua function which generates human-readable output from the Lua table
- It is probably not sufficient to automatically derive structured output from textual human-readable output.
- Scripts can't just make up their own new XML elements--it has to be possible to validate XML documents against a predefined DTD as always.
- It must be possible to represent
- Single string outputs
- Lists
- Dictionary name-value mappings
- Some degree of nesting of above elements (Arbitrary nesting is not necessary, though desirable).
- It is sufficient if all parts of the output (list elements, dictionary keys and values) are strings. That is, it's not necessary to represent an integer explicitly, for example. It's fine if processing scripts need to know the expected data types of the fields that they process. IP addresses can be represented textually.
To be continued...
Use cases
End user
User wants to count how many times the http-title script had output containing "Linksys".
User wants to find all hosts that have ssh-hostkey output, and save a report to a text file where each line contains an IP address and an MD5 host key fingerprint.
User wants to make a copy of an XML output file, but remove output from any *-ls scripts.
User wants to find what hosts and ports have services with expired SSL certificates (the "Not valid after" key of ssl-cert).
User wants a mapping of which accounts are registered with the Messenger service on what machines (one of the Names from nbstat).
Here is an example of the XML that could be generated for the output of the nbstat script:
<script name="nbstat" output="...Flags: &lt;unique&gt;&lt;active&gt;&#xa;">
  <dict>
    <elem key="NetBIOS name">NAS</elem>
    <elem key="NetBIOS user">&lt;unknown&gt;</elem>
    <elem key="NetBIOS MAC">&lt;unknown&gt;</elem>
  </dict>
  <list name="Names">
    <dict>
      <elem key="name">NAS</elem>
      <elem key="number">20</elem>
      <elem key="flags">
        <list>
          <elem>unique</elem>
          <elem>active</elem>
        </list>
      </elem>
    </dict>
    <dict>
      <elem key="name">JDOE</elem>
      <elem key="number">03</elem>
      <elem key="flags">
        <list>
          <elem>unique</elem>
          <elem>active</elem>
        </list>
      </elem>
    </dict>
  </list>
</script>
The Lua table that is converted into this XML could look something like this:
{
  { NetBIOS_name="NAS", NetBIOS_user="<unknown>", NetBIOS_MAC="<unknown>" },
  Names={
    { name="NAS", number="20", flags={ "unique", "active" } },
    { name="JDOE", number="03", flags={ "unique", "active" } },
  }
}
XPath to make the selection: //script[@name="nbstat"]/list[@name="Names"]/dict/elem[@key="number"][text()="20"]/../elem[@key="name"]
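As a rough illustration of how a consuming program might make the same selection, here is a minimal Python sketch using the standard library's xml.etree.ElementTree. The sample document is abbreviated from the nbstat example above, and the helper name names_with_number is hypothetical. ElementTree supports only a subset of XPath, so the filtering on the number field is done in Python rather than in the selector.

```python
import xml.etree.ElementTree as ET

# Abbreviated from the nbstat example above (flags omitted for brevity).
SAMPLE = """
<script name="nbstat" output="...">
  <dict>
    <elem key="NetBIOS name">NAS</elem>
  </dict>
  <list name="Names">
    <dict>
      <elem key="name">NAS</elem>
      <elem key="number">20</elem>
    </dict>
    <dict>
      <elem key="name">JDOE</elem>
      <elem key="number">03</elem>
    </dict>
  </list>
</script>
"""

def names_with_number(script, number):
    """Return the 'name' of every Names entry whose 'number' equals `number`."""
    found = []
    for entry in script.findall("./list[@name='Names']/dict"):
        # Collect the entry's string-valued fields into a plain dict.
        fields = {e.get("key"): e.text for e in entry.findall("elem")}
        if fields.get("number") == number:
            found.append(fields.get("name"))
    return found

script = ET.fromstring(SAMPLE.strip())
print(names_with_number(script, "20"))  # ['NAS']
print(names_with_number(script, "03"))  # ['JDOE']
```

Note that the values are compared as strings, in line with the requirement that all parts of the output may be represented as strings.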
User wants to find Windows machines with SMB Message Signing disabled (from smb-security-mode).
Script author
The duplicates script wants to output a list of lists--each one a set of IP addresses that appear to be the same host for some reason.
The smb-ls script wants to output a list of file and directory names. Each element in the list is a dictionary with keys like mtime, size, and filename. The script also wants to output a compact human-readable format that looks like the output of ls.
Output proposals
Here we show the evolution of the output proposals so far and make room for more proposals.
Status quo
This is what XML output looks like as of Nmap 6.01. Line breaks added for readability.
$ nmap -oX - -n -Pn -p 443 secwiki.org --script=http-title,ssl-cert
<script id="ssl-cert" output="Subject: commonName=secwiki.org...&#xa;
 Issuer: organizationName=Equifax/countryName=US&#xa;
 Public Key type: rsa&#xa;
 Public Key bits: 1024&#xa;
 Not valid before: 2010-11-15 19:22:12&#xa;
 Not valid after: 2012-11-17 05:18:45&#xa;
 MD5: c729 827b 8941 9bdc 20b0 43b4 9d9d 1595&#xa;
 SHA-1: 157b 440e 3df4 2994 7a82 13d4 1856 5da6 f10f 3063"/>
<script id="http-title" output="SecWiki"/>
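To make the problem with the status quo concrete, the following Python sketch shows the kind of ad hoc text splitting a consumer has to do to recover a single field from the output attribute. The sample is abbreviated from the output above, and parse_blob is a hypothetical helper, not part of Nmap. It assumes every line has the form "Label: value", which nothing in the current format guarantees.

```python
import xml.etree.ElementTree as ET

# Abbreviated from the status quo sample; &#xa; is an encoded newline.
SAMPLE = (
    '<script id="ssl-cert" output="Subject: commonName=secwiki.org...&#xa;'
    'Issuer: organizationName=Equifax/countryName=US&#xa;'
    'Not valid before: 2010-11-15 19:22:12&#xa;'
    'Not valid after: 2012-11-17 05:18:45"/>'
)

def parse_blob(output):
    """Split 'Label: value' lines from a text blob into a dict (best effort)."""
    fields = {}
    for line in output.splitlines():
        label, sep, value = line.strip().partition(": ")
        if sep:  # silently skip lines that do not look like 'Label: value'
            fields[label] = value
    return fields

script = ET.fromstring(SAMPLE)
fields = parse_blob(script.get("output"))
print(fields["Not valid after"])  # 2012-11-17 05:18:45
```

A parser like this breaks as soon as a script changes its labels or wraps a value across lines, which is the fragility that structured output is meant to remove.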
YAML for everything
YAML proposal of 2011-03-31 (thread continues). YAML patch.
$ nmap -oX - -n -Pn -p 443 secwiki.org --script=http-title,ssl-cert
<script id="ssl-cert" output="Subject: commonName=secwiki.org...&#xa;
 Issuer: organizationName=Equifax/countryName=US&#xa;
 Public Key type: rsa&#xa;
 Public Key bits: 1024&#xa;
 Not valid before: 2010-11-15 19:22:12&#xa;
 Not valid after: 2012-11-17 05:18:45&#xa;
 MD5: c729 827b 8941 9bdc 20b0 43b4 9d9d 1595&#xa;
 SHA-1: 157b 440e 3df4 2994 7a82 13d4 1856 5da6 f10f 3063"/>
<script id="http-title" output="SecWiki"/>
(Well, I guess nothing changes except for some spacing, because the normal format_output output is already close to YAML, at least when you use colons to label things.)
XML elements representing YAML structure
The usual method of representing YAML as XML is no good because it creates arbitrary new elements. Instead David proposed to represent the structure using a finite number of new XML elements. http://seclists.org/nmap-dev/2011/q2/149.
$ nmap -oX - -n -Pn -p 443 secwiki.org --script=http-title,ssl-cert
<script id="ssl-cert" output="...">
  <dict>
    <elem key="Subject">commonName=secwiki.org...</elem>
    <elem key="Issuer">organizationName=Equifax/countryName=US</elem>
    <elem key="Public Key type">rsa</elem>
    <elem key="Public Key bits">1024</elem>
    <elem key="Not valid before">2010-11-15 19:22:12</elem>
    <elem key="Not valid after">2012-11-17 05:18:45</elem>
    <elem key="MD5">c729 827b 8941 9bdc 20b0 43b4 9d9d 1595</elem>
    <elem key="SHA-1">157b 440e 3df4 2994 7a82 13d4 1856 5da6 f10f 3063</elem>
  </dict>
</script>
<script id="http-title" output="SecWiki">
  <elem>SecWiki</elem>
</script>
Proposal alpha
http://seclists.org/nmap-dev/2012/q2/375 of 21 May 2012. Sample output changes. Relative to the previous proposal, it removes output attributes and doesn't have a dict container around the dictionary.
$ nmap -oX - -n -Pn -p 443 secwiki.org --script=http-title,ssl-cert
<script id="ssl-cert">
  <elem key="Subject">commonName=secwiki.org...</elem>
  <elem key="Issuer">organizationName=Equifax/countryName=US</elem>
  <elem key="Public Key type">rsa</elem>
  <elem key="Public Key bits">1024</elem>
  <elem key="Not valid before">2010-11-15 19:22:12</elem>
  <elem key="Not valid after">2012-11-17 05:18:45</elem>
  <elem key="MD5">c729 827b 8941 9bdc 20b0 43b4 9d9d 1595</elem>
  <elem key="SHA-1">157b 440e 3df4 2994 7a82 13d4 1856 5da6 f10f 3063</elem>
</script>
<script id="http-title">
  <elem>SecWiki</elem>
</script>
Proposal beta
Proposal of 2012-06-13 to support parallel string and table outputs. String output without table output causes no structured output to be emitted. Table output without string output synthesizes a pretty-printed indented text output. If both outputs are provided, they are independent. Structured XML output uses dict, list, and elem elements. Each child of a dict must have a key attribute; children of a list must not have such an attribute.
$ nmap -oX - -n -Pn -p 443 secwiki.org --script=http-title,ssl-cert
<script id="ssl-cert" output="Subject: commonName=secwiki.org...&#xa;
 Issuer: organizationName=Equifax/countryName=US&#xa;
 Public Key type: rsa&#xa;
 Public Key bits: 1024&#xa;
 Not valid before: 2010-11-15 19:22:12&#xa;
 Not valid after: 2012-11-17 05:18:45&#xa;
 MD5: c729 827b 8941 9bdc 20b0 43b4 9d9d 1595&#xa;
 SHA-1: 157b 440e 3df4 2994 7a82 13d4 1856 5da6 f10f 3063">
  <dict>
    <dict key="subject">
      <elem key="commonName">secwiki.org</elem>
    </dict>
    <dict key="issuer">
      <elem key="organizationName">Equifax</elem>
      <elem key="countryName">US</elem>
    </dict>
    <dict key="pubkey">
      <elem key="type">rsa</elem>
      <elem key="bits">1024</elem>
    </dict>
    <dict key="validity">
      <elem key="notBefore">2010-11-15 19:22:12</elem>
      <elem key="notAfter">2012-11-17 05:18:45</elem>
    </dict>
    <elem key="md5">c729827b89419bdc20b043b49d9d1595</elem>
    <elem key="sha1">157b440e3df429947a8213d418565da6f10f3063</elem>
  </dict>
</script>
<script id="http-title" output="SecWiki"/>
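The keying rule for Proposal beta (children of a dict carry a key attribute, children of a list do not) can be sketched as a small recursive serializer. This is only an illustration in Python, not the Nmap implementation: the function name to_xml is hypothetical, and Python dicts and lists stand in for the Lua tables a script would return.

```python
import xml.etree.ElementTree as ET

def to_xml(value, key=None):
    """Serialize nested dicts, lists, and scalars as dict/list/elem elements."""
    if isinstance(value, dict):
        node = ET.Element("dict")
        for k, v in value.items():
            node.append(to_xml(v, key=k))   # children of a dict carry a key
    elif isinstance(value, (list, tuple)):
        node = ET.Element("list")
        for v in value:
            node.append(to_xml(v))          # children of a list carry no key
    else:
        node = ET.Element("elem")
        node.text = str(value)              # every leaf is written as a string
    if key is not None:
        node.set("key", str(key))
    return node

cert = {
    "subject": {"commonName": "secwiki.org"},
    "pubkey": {"type": "rsa", "bits": 1024},
    "md5": "c729827b89419bdc20b043b49d9d1595",
}
print(ET.tostring(to_xml(cert), encoding="unicode"))
print(ET.tostring(to_xml(["unique", "active"]), encoding="unicode"))
```

Because only three element names are ever produced, the result can be validated against a fixed DTD regardless of what keys a script chooses.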
Proposal gamma
Slightly modified from Proposal beta. The major difference is the replacement of both dict and list elements with table elements, more closely reflecting the Lua table structure. Also, since every script that has XML output returns a table, the outermost table element is not output:
<script id="ssl-cert" output="Subject: commonName=secwiki.org...&#xa;
 Issuer: organizationName=Equifax/countryName=US&#xa;
 Public Key type: rsa&#xa;
 Public Key bits: 1024&#xa;
 Not valid before: 2010-11-15 19:22:12&#xa;
 Not valid after: 2012-11-17 05:18:45&#xa;
 MD5: c729 827b 8941 9bdc 20b0 43b4 9d9d 1595&#xa;
 SHA-1: 157b 440e 3df4 2994 7a82 13d4 1856 5da6 f10f 3063">
  <table key="subject">
    <elem key="commonName">secwiki.org</elem>
  </table>
  <table key="issuer">
    <elem key="organizationName">Equifax</elem>
    <elem key="countryName">US</elem>
  </table>
  <table key="pubkey">
    <elem key="type">rsa</elem>
    <elem key="bits">1024</elem>
  </table>
  <table key="validity">
    <elem key="notBefore">2010-11-15 19:22:12</elem>
    <elem key="notAfter">2012-11-17 05:18:45</elem>
  </table>
  <elem key="md5">c729827b89419bdc20b043b49d9d1595</elem>
  <elem key="sha1">157b440e3df429947a8213d418565da6f10f3063</elem>
</script>
<script id="http-title" output="SecWiki"></script>
If a script returns a string (or a number), then the string is set as the @output attribute of the script element, which has no child nodes. If a script returns a table, it will be "pretty-printed" into a textual format using a recursive algorithm. This table:
{ "This is an example", name="thing one", isCool=false, ["Now for some numbers"] = {42, 13, 7, count=3} }
Results in this output:
| script-name:
|   This is an example
|   name: thing one
|   isCool: false
|   Now for some numbers:
|     42
|     13
|     7
|     count: 3
|_
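The recursive pretty-printing idea can be sketched as follows. This is a Python approximation, not the algorithm from the implementation: the names format_value and pretty_lines are hypothetical, a Lua table is modeled as a Python dict whose integer keys stand for positional entries, and the two-space indentation step is an assumption. It also does not reproduce Nmap's "|_" marker.

```python
def format_value(value):
    """Render a leaf value; Lua prints booleans in lowercase."""
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)

def pretty_lines(table, depth=1):
    """Recursively turn a table into indented 'key: value' lines."""
    lines = []
    pad = "  " * depth
    for key, value in table.items():
        # Positional entries (integer keys) are printed without a label.
        label = "" if isinstance(key, int) else f"{key}: "
        if isinstance(value, dict):
            if label:
                lines.append(pad + label.rstrip())
            lines.extend(pretty_lines(value, depth + 1))  # recurse into subtable
        else:
            lines.append(f"{pad}{label}{format_value(value)}")
    return lines

example = {
    1: "This is an example",
    "name": "thing one",
    "isCool": False,
    "Now for some numbers": {1: 42, 2: 13, 3: 7, "count": 3},
}
print("\n".join(["script-name:"] + pretty_lines(example)))
```

Python dicts preserve insertion order, so the lines come out in the order written here; Lua makes no such guarantee for non-array keys, which is one reason a real implementation may need an explicit ordering rule.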
This proposal is getting traction, having been implemented at https://github.com/bonsaiviking/Nmap-script-XML. Next steps:
- Implement some standard transforms and make a metatable-setter function in stdnse to expose them to script writers.
- Begin adapting existing scripts to use this API, identifying bugs and improvements.
Proposal gamma-1
Proposed as extensions to proposal gamma. http://seclists.org/nmap-dev/2012/q2/973
- Support errors as distinct elements in XML, with standardized handling for normal output. Considerations:
- Should error elements be at top level only, or allowed to be nested in tables?
- How can a script indicate something is an error? ("error" table index perhaps)
- Possibly support warnings as distinct elements, too.
- Let's handle errors as something distinct. I don't think they need to distract from solving just plain output. We'll add a special <error> element, and it will be set by a function call, not by a further complicated script return value. The same goes for <vuln>, which will have its own distinguished section distinct from script output. David (talk) 14:57, 19 July 2012 (PDT)
Nmap-script-XML c9c97295 2012-07-15
$ nmap -oX - -n -Pn -p 443 secwiki.org --script=http-title,ssl-cert
<script id="http-title" output="SecWiki"></script>
<script id="ssl-cert" output="issuer: organizationalUnitName: Equifax Secure Certificate Authority countryName: US organizationName: Equifax md5: c729827b89419bdc20b043b49d9d1595 pubkey: type: rsa bits: 1024 sha1: 157b440e3df429947a8213d418565da6f10f3063 subject: organizationalUnitName: Domain Control Validated - RapidSSL(R) countryName: US organizationName: secwiki.org commonName: secwiki.org serialNumber: UGigzK-7j79pxB7xW3MZMfPeYWY/gJnJ validity: notBefore: day: 15 sec: 12 min: 22 hour: 19 month: 11 year: 2010 notAfter: day: 17 sec: 45 min: 18 hour: 5 month: 11 year: 2012">
  <table key="subject">
    <elem key="organizationalUnitName">Domain Control Validated - RapidSSL(R)</elem>
    <elem key="countryName">US</elem>
    <elem key="organizationName">secwiki.org</elem>
    <elem key="commonName">secwiki.org</elem>
    <elem key="serialNumber">UGigzK-7j79pxB7xW3MZMfPeYWY/gJnJ</elem>
  </table>
  <table key="issuer">
    <elem key="organizationalUnitName">Equifax Secure Certificate Authority</elem>
    <elem key="countryName">US</elem>
    <elem key="organizationName">Equifax</elem>
  </table>
  <elem key="md5">c729827b89419bdc20b043b49d9d1595</elem>
  <table key="pubkey">
    <elem key="type">rsa</elem>
    <elem key="bits">1024</elem>
  </table>
  <table key="validity">
    <table key="notBefore">
      <elem key="day">15</elem>
      <elem key="sec">12</elem>
      <elem key="min">22</elem>
      <elem key="hour">19</elem>
      <elem key="month">11</elem>
      <elem key="year">2010</elem>
    </table>
    <table key="notAfter">
      <elem key="day">17</elem>
      <elem key="sec">45</elem>
      <elem key="min">18</elem>
      <elem key="hour">5</elem>
      <elem key="month">11</elem>
      <elem key="year">2012</elem>
    </table>
  </table>
  <elem key="sha1">157b440e3df429947a8213d418565da6f10f3063</elem>
</script>
Merge candidate 1
$ nmap -oX - -n -Pn -p 443 secwiki.org --script=http-title,ssl-cert
<script id="http-title" output="SecWiki">
  <elem key="title">SecWiki</elem>
</script>
<script id="ssl-cert" output="Subject: commonName=secwiki.org/...">
  <elem key="sha1">157b440e3df429947a8213d418565da6f10f3063</elem>
  <elem key="pem">-----BEGIN CERTIFICATE-----
MIIDazCCAtSgAwIBAgIDFTSzMA0GCSqGSIb3DQEBBQUAME4xCzAJBgNVBAYTAlVT
MRAwDgYDVQQKEwdFcXVpZmF4MS0wKwYDVQQLEyRFcXVpZmF4IFNlY3VyZSBDZXJ0
aWZpY2F0ZSBBdXRob3JpdHkwHhcNMTAxMTE1MTkyMjEyWhcNMTIxMTE3MDUxODQ1
WjCB3TEpMCcGA1UEBRMgVUdpZ3pLLTdqNzlweEI3eFczTVpNZlBlWVdZL2dKbkox
CzAJBgNVBAYTAlVTMRQwEgYDVQQKEwtzZWN3aWtpLm9yZzETMBEGA1UECxMKR1Q5
NjkzNzIwMDExMC8GA1UECxMoU2VlIHd3dy5yYXBpZHNzbC5jb20vcmVzb3VyY2Vz
L2NwcyAoYykxMDEvMC0GA1UECxMmRG9tYWluIENvbnRyb2wgVmFsaWRhdGVkIC0g
UmFwaWRTU0woUikxFDASBgNVBAMTC3NlY3dpa2kub3JnMIGfMA0GCSqGSIb3DQEB
AQUAA4GNADCBiQKBgQCtGfJ589J4p8HOqE1U8EfXaawS0Q5VURsvifPItd8hjt7O
eQ4nVgYMFV5qRIG1ey9kPU32+6smtyCumXNq32GyDcJ3PHVKxWNSMOSsu7auy0VC
KK7qM5tDE2Yjuwb16ZsGjMoPDgCLtljcx2CU1nykXBonTe54V0QzkhfkfHtt6QID
AQABo4HGMIHDMB8GA1UdIwQYMBaAFEjmaPkr0rKV10fYIyAQTzOYkJ/UMA4GA1Ud
DwEB/wQEAwIE8DAdBgNVHSUEFjAUBggrBgEFBQcDAQYIKwYBBQUHAwIwFgYDVR0R
BA8wDYILc2Vjd2lraS5vcmcwOgYDVR0fBDMwMTAvoC2gK4YpaHR0cDovL2NybC5n
ZW90cnVzdC5jb20vY3Jscy9zZWN1cmVjYS5jcmwwHQYDVR0OBBYEFDClKsFfPfqm
nkl4AhfuBNlc2KivMA0GCSqGSIb3DQEBBQUAA4GBALRSt0QX4TsuOWQoBTX2ZQSE
4jlxXev5IHXWSOuVdUnpdxrxMwpp1mQ7gl+KH/VNG2PeGwUdSXvuBgN8V+fCudCJ
rgJJYpLtLqGfxXU+XaL34ICJH+fwG7shWb8k1W4Q0Z54JeTyjI56F/GpDWY9jtk0
LffllcPwT+7t3GGbZOYS
-----END CERTIFICATE-----
</elem>
  <table key="issuer">
    <elem key="countryName">US</elem>
    <elem key="organizationalUnitName">Equifax Secure Certificate Authority</elem>
    <elem key="organizationName">Equifax</elem>
  </table>
  <table key="subject">
    <elem key="countryName">US</elem>
    <elem key="organizationName">secwiki.org</elem>
    <elem key="commonName">secwiki.org</elem>
    <elem key="organizationalUnitName">Domain Control Validated - RapidSSL(R)</elem>
    <elem key="serialNumber">UGigzK-7j79pxB7xW3MZMfPeYWY/gJnJ</elem>
  </table>
  <elem key="md5">c729827b89419bdc20b043b49d9d1595</elem>
  <table key="validity">
    <elem key="notBefore">2010-11-15T19:22:12Z</elem>
    <elem key="notAfter">2012-11-17T05:18:45Z</elem>
  </table>
  <table key="pubkey">
    <elem key="bits">1024</elem>
    <elem key="type">rsa</elem>
  </table>
</script>
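With output shaped like the merge candidate, the expired-certificate use case becomes a few lines of code. The Python sketch below is one possible consumer, not part of Nmap: to_python and is_expired are hypothetical helper names, the sample is abbreviated from the output above, and the timestamp format is assumed to be exactly the UTC form shown there.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# Abbreviated from the merge candidate sample above.
SAMPLE = """
<script id="ssl-cert" output="...">
  <table key="validity">
    <elem key="notBefore">2010-11-15T19:22:12Z</elem>
    <elem key="notAfter">2012-11-17T05:18:45Z</elem>
  </table>
  <table key="pubkey">
    <elem key="bits">1024</elem>
    <elem key="type">rsa</elem>
  </table>
</script>
"""

def to_python(node):
    """Convert the table/elem children of `node` into nested dicts."""
    result = {}
    position = 0
    for child in node:
        key = child.get("key")
        if key is None:          # unkeyed children get 1-based positions
            position += 1
            key = position
        if child.tag == "table":
            result[key] = to_python(child)
        else:
            result[key] = child.text or ""
    return result

def is_expired(cert, now):
    """True if the certificate's notAfter time is earlier than `now`."""
    not_after = datetime.strptime(cert["validity"]["notAfter"], "%Y-%m-%dT%H:%M:%SZ")
    return not_after.replace(tzinfo=timezone.utc) < now

cert = to_python(ET.fromstring(SAMPLE.strip()))
print(cert["pubkey"])
print(is_expired(cert, datetime(2013, 1, 1, tzinfo=timezone.utc)))  # True
```

All leaf values remain strings (for example, bits is "1024"), so the consumer decides which fields to convert, in keeping with the stated requirements.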
Links
- http://seclists.org/nmap-dev/2011/q1/1129 Proposal of 31 March 2011 to use YAML both in normal output and in XML output; i.e., use YAML for everything. The human-readable and machine-readable forms would be the same.
- http://seclists.org/nmap-dev/2011/q2/149 Idea of 7 April 2011 for serializing YAML to XML instead of embedded YAML text in XML output.
- http://seclists.org/nmap-dev/2012/q2/375 Patch of 21 May 2012. Uses container and elem elements. Derives XML structure automatically from the structure given to stdnse.format_output.
- http://seclists.org/nmap-dev/2012/q2/747 Proposal of 13 June 2012. Parallel text and table output.
XPath
Examples on this page use XPath selector syntax to refer to XML elements and attributes. You can grep an XML file using these selectors with the xmlstarlet tool.
- //script selects all script elements.
- //hostscript/script selects all script elements that are immediate children of a hostscript element.
- //script[@id="http-title"] selects all script elements with the id="http-title" attribute set.
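The same three selectors can also be tried from Python with the standard library's xml.etree.ElementTree, which supports a subset of XPath (paths start with ".//" rather than "//"). The sample document below is invented for illustration, including the "Linksys WRT54G" title; it also covers the use case of counting http-title outputs that contain "Linksys".

```python
import xml.etree.ElementTree as ET

# Invented sample data, shaped like Nmap XML output.
SAMPLE = """
<nmaprun>
  <host>
    <ports>
      <port portid="80"><script id="http-title" output="Linksys WRT54G"/></port>
      <port portid="443"><script id="http-title" output="SecWiki"/></port>
    </ports>
    <hostscript><script id="nbstat" output="NetBIOS name: NAS"/></hostscript>
  </host>
</nmaprun>
"""

root = ET.fromstring(SAMPLE.strip())

all_scripts = root.findall(".//script")                   # //script
host_scripts = root.findall(".//hostscript/script")       # //hostscript/script
titles = root.findall(".//script[@id='http-title']")      # //script[@id="http-title"]

# Count http-title outputs that mention "Linksys".
linksys = sum(1 for s in titles if "Linksys" in s.get("output", ""))
print(len(all_scripts), len(host_scripts), len(titles), linksys)  # 3 1 2 1
```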
Here are examples of xmlstarlet usage with a sample XML file, before.xml.
# Print all <script> elements.
$ xmlstarlet sel -t -m '//script' -c . -n before.xml
<script id="http-methods" output="No Allow or Public header in OPTIONS response (status code 501)"/>
<script id="http-title" output="301 Moved Permanently..."/>
<script id="ssl-cert" output="Subject: commonName=XXXXXXXXXXXX/organizationName=..."/>
...
# Print id and output of all hostscript outputs.
$ xmlstarlet sel -t -m '//hostscript/script' -v @id -o ': ' -v @output -n before.xml
nbstat: NetBIOS name: NAS, NetBIOS user: <unknown>, NetBIOS MAC: <unknown>
smbv2-enabled: Server supports SMBv2 protocol
nbstat: NetBIOS name: XXXX, NetBIOS user: <unknown>, NetBIOS MAC: xx:xx:xx:xx:xx:xx (unknown)
smb-os-discovery: OS: Windows Vista (TM) Enterprise 6002 Service Pack 2 (Windows Vista (TM) Enterprise 6.0)
  Computer name: XXXX
  NetBIOS computer name: XXXX
  Workgroup: WORKGROUP
  System time: 2012-05-27 21:58:06 UTC-5
# Grep out all HTML titles.
$ xmlstarlet sel -t -m '//script[@id="http-title"]' -v @output -n before.xml
301 Moved Permanently
Did not follow redirect to https://router/
Site doesn't have a title (text/html).
PX-EH
XenServer 5.6.0
XenServer 5.6.0
XenServer 5.6.0
XenServer 5.6.0
Our Wiki
Moved
Teh Internets!
# Print the IPv4 addresses of all hosts having script output.
$ xmlstarlet sel -t -m '//port/script/../../../address[@addrtype="ipv4"]' -v @addr -n before.xml
192.168.1.1
192.168.1.3
192.168.1.5
192.168.1.6
192.168.1.10
192.168.1.81
192.168.1.200
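For comparison, the last query can be approximated in Python with xml.etree.ElementTree. ElementTree has no way to walk from a match back up to its ancestors, so this sketch iterates over host elements and tests each one instead of using "..". The sample document and the helper name hosts_with_port_scripts are invented for illustration; only the element layout (host, address, ports, port, script) follows the path used in the xmlstarlet command.

```python
import xml.etree.ElementTree as ET

# Invented sample: one host with port script output, one without.
SAMPLE = """
<nmaprun>
  <host>
    <address addr="192.168.1.1" addrtype="ipv4"/>
    <ports>
      <port protocol="tcp" portid="80">
        <script id="http-title" output="301 Moved Permanently"/>
      </port>
    </ports>
  </host>
  <host>
    <address addr="192.168.1.2" addrtype="ipv4"/>
    <ports>
      <port protocol="tcp" portid="22"/>
    </ports>
  </host>
</nmaprun>
"""

def hosts_with_port_scripts(root):
    """Return IPv4 addresses of hosts with at least one port-level script."""
    addrs = []
    for host in root.iter("host"):
        if host.find("./ports/port/script") is None:
            continue  # no script output under any port of this host
        for address in host.findall("./address[@addrtype='ipv4']"):
            addrs.append(address.get("addr"))
    return addrs

root = ET.fromstring(SAMPLE.strip())
print(hosts_with_port_scripts(root))  # ['192.168.1.1']
```

Unlike the xmlstarlet query, this lists each host once even if several of its ports have script output.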