Rapid7 Research

Project Sonar

Gaining insights into global exposure

An Introduction to Project Sonar

Project Sonar started in September of 2013 with the goal of improving security through the active analysis of public networks. While the first few months focused almost entirely on SSL, DNS, and HTTP enumeration, the discoveries and insights derived from these datasets, especially around the identification of systems unknown to IT teams, led to the expansion of Project Sonar to include the scanning of UDP services.

Today, Project Sonar conducts internet-wide surveys across more than 70 different services and protocols to gain insights into global exposure to common vulnerabilities. In turn, this informs Rapid7’s more focused studies such as the Quarterly Threat Reports and the National Exposure Index, as well as our product development and related research. The datasets are available to the public at opendata.rapid7.com in an effort to enable further security research.

How It Works

Sonar Diagram

For endpoint studies, Project Sonar gathers data in two stages. In the first stage, all public IPv4 addresses (about 3.6 billion of them, excluding those that have opted out) are scanned to determine which have the respective service port open. In the second stage, Sonar communicates with the endpoints found to have that port and protocol open, in the hope of extracting useful intelligence.

As part of these activities, Sonar discovers names that might represent DNS records. For example, Sonar obtains names from HTML links discovered during HTTP studies, and extracts the Common Name and other names included in SSL certificates. Sonar then performs weekly DNS studies using nearly 3 billion names as input, requesting several different DNS record types that carry useful intelligence.
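To make the name-discovery step concrete, here is a minimal sketch of how hostnames could be pulled out of HTML links. It is an illustration only and not Sonar's actual implementation; the class name LinkHostCollector, the function candidate_names, and the choice to look at href and src attributes are all assumptions made for this example.

```python
import ipaddress
from html.parser import HTMLParser
from urllib.parse import urlsplit


class LinkHostCollector(HTMLParser):
    """Hypothetical helper: collect hostnames found in href/src attributes."""

    def __init__(self):
        super().__init__()
        self.hostnames = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name not in ("href", "src") or not value:
                continue
            try:
                host = urlsplit(value).hostname
            except ValueError:
                continue  # malformed URL, skip it
            if not host:
                continue  # relative link or scheme without a host
            try:
                ipaddress.ip_address(host)
                continue  # a literal IP address is not a DNS name
            except ValueError:
                pass
            self.hostnames.add(host.lower().rstrip("."))


def candidate_names(html):
    """Return the sorted, de-duplicated hostnames linked from an HTML page."""
    collector = LinkHostCollector()
    collector.feed(html)
    collector.close()
    return sorted(collector.hostnames)
```

Names gathered this way are only candidates: each one still has to be queried in DNS to learn whether it resolves.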

Using this data, Project Sonar helps security practitioners and researchers:

    • Explore endpoint exposure, such as how many hosts on the public internet have port 25/TCP open.
    • Understand asset exposure, such as how many IPv4 addresses owned by a given organization are exposed on the internet, or how many DNS names under a domain owned by an organization were discovered and resolved on the public internet.
    • Understand risk exposure. Some questions that can be answered by Sonar data are, “how many hosts on the public internet are exposing a risky service like SMB?” and, “which operating systems are these endpoints running?”
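
As a rough illustration of the endpoint and asset exposure questions above, the sketch below counts how many distinct addresses inside an organization's CIDR blocks responded on a given port. The record layout (pairs of IP address and port) and the function name count_exposed_hosts are assumptions made for this example; they do not describe the actual format of the published datasets.

```python
import ipaddress


def count_exposed_hosts(records, org_cidrs, port):
    """Count distinct addresses within org_cidrs seen with `port` open.

    records   -- iterable of (ip_string, port_number) pairs (assumed layout)
    org_cidrs -- iterable of CIDR strings, e.g. "192.0.2.0/24"
    port      -- the port of interest, e.g. 25
    """
    networks = [ipaddress.ip_network(cidr, strict=False) for cidr in org_cidrs]
    exposed = set()
    for raw_ip, seen_port in records:
        if seen_port != port:
            continue
        try:
            addr = ipaddress.ip_address(raw_ip.strip())
        except ValueError:
            continue  # skip rows that do not hold a valid address
        if any(addr in net for net in networks):
            exposed.add(addr)
    return len(exposed)
```

For example, calling count_exposed_hosts(records, ["192.0.2.0/24"], 25) would report how many addresses in that documentation range appeared in the records with port 25 open.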

While Project Sonar exists to ultimately improve our collective security posture, you may opt to whitelist or blacklist the subnets from which it scans by emailing research@rapid7.com with your CIDR blocks or list of IP addresses and your affiliation.

For more detail on the inner workings of Project Sonar, visit opendata.rapid7.com.

Access the Datasets

Our team makes Project Sonar datasets available to the public, so that you can get started on your own security research.


A Bird's Eye View of Project Sonar

In this video, Bob Rudis, Chief Data Scientist at Rapid7, digs deeper into how Project Sonar data is being put to use by the Labs team, including a recent (and pretty impressive) impact story.

Data Mining the Undiscovered Country

In this week’s Whiteboard Wednesday, Bob Rudis, Chief Data Scientist at Rapid7, revisits the presentation he gave at Rapid7’s 2017 UNITED conference. He digs deeper into how the data from Project Sonar and Lorelei Cloud is being put to use by the Rapid7 Labs team, and previews the upcoming launch of a new study on headless browser HTTP scans.