Research: Project Sonar

Introduction

Project Sonar started in September of 2013 with the goal of improving security through the active analysis of public networks. While the first few months focused almost entirely on SSL, DNS, and HTTP enumeration, the discoveries and insights derived from these datasets, especially around the identification of systems unknown to IT teams, led to the expansion of Project Sonar to include the scanning of UDP services.

Impact

Today, Project Sonar conducts internet-wide surveys across more than 70 different services and protocols to gain insights into global exposure to common vulnerabilities. In turn, this informs Rapid7’s more focused studies such as the Quarterly Threat Reports and the National Exposure Index, as well as our product development and related research. The datasets are available to the public at opendata.rapid7.com in an effort to enable further security research.

How it Works

For endpoint studies, Project Sonar gathers data in two stages. In the first stage, all public IPv4 addresses (about 3.6 billion, excluding those that have opted out) are scanned to determine which have the respective service port open. In the second stage, Sonar communicates with each endpoint identified as having that port and protocol open, in an attempt to extract useful intelligence.

As part of these activities, Sonar discovers names that might represent DNS records. For example, Sonar obtains names from HTML links discovered during HTTP studies, and extracts the Common Name and other names included in SSL certificates. Sonar then performs weekly DNS studies using nearly 3 billion names as input, requesting several different DNS record types that yield useful intelligence.
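To make the name-discovery step concrete, here is a minimal sketch of how hostnames might be pulled out of HTML links. It is not Sonar's actual implementation: the class `LinkHostCollector` and the function `candidate_names` are hypothetical names chosen for illustration, and the sketch uses only Python's standard library.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkHostCollector(HTMLParser):
    """Hypothetical helper: collects hostnames found in href/src attributes."""

    def __init__(self):
        super().__init__()
        self.hosts = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name not in ("href", "src") or not value:
                continue
            try:
                host = urlparse(value).hostname
            except ValueError:
                # Malformed URLs (e.g. broken IPv6 brackets) are skipped.
                continue
            if host:
                # Normalize case and drop a trailing root dot.
                self.hosts.add(host.lower().rstrip("."))


def candidate_names(html):
    """Return a sorted list of hostnames referenced by links in `html`."""
    collector = LinkHostCollector()
    collector.feed(html)
    return sorted(collector.hosts)


if __name__ == "__main__":
    page = (
        '<a href="https://Example.com/a">a</a>'
        '<img src="//cdn.example.net/i.png">'
        '<a href="/relative">r</a>'
    )
    print(candidate_names(page))
```

Relative links carry no hostname and are ignored. In a pipeline like the one described, the resulting names would feed the list of inputs for a later DNS study.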

Data

Using this data, Project Sonar helps security practitioners and researchers:

Explore endpoint exposure, such as how many hosts on the public internet have port 25/TCP open.
Understand asset exposure, such as how many IPv4 addresses owned by a given organization are exposed on the internet. Another example is how many DNS names under an organization's domain were discovered and resolved on the public internet.
Understand risk exposure. Questions that Sonar data can answer include “how many hosts on the public internet are exposing a risky service like SMB?” and “which operating systems are these endpoints running?”
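The first two kinds of question can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the format of the published datasets: each record is assumed to be a dict with hypothetical `ip` and `port` keys, and the function names are invented for this example. The addresses come from the reserved documentation ranges.

```python
import ipaddress


def count_open_hosts(records, port):
    """Count unique hosts observed with `port` open (endpoint exposure)."""
    return len({r["ip"] for r in records if r["port"] == port})


def exposed_in_networks(records, cidr_blocks):
    """Return the set of observed IPs inside an organization's CIDR blocks
    (asset exposure)."""
    networks = [ipaddress.ip_network(block) for block in cidr_blocks]
    exposed = set()
    for r in records:
        addr = ipaddress.ip_address(r["ip"])
        if any(addr in net for net in networks):
            exposed.add(r["ip"])
    return exposed


if __name__ == "__main__":
    # Assumed record layout; real Sonar data files may differ.
    records = [
        {"ip": "192.0.2.10", "port": 25},
        {"ip": "192.0.2.10", "port": 445},
        {"ip": "198.51.100.7", "port": 25},
        {"ip": "203.0.113.5", "port": 445},
    ]
    print(count_open_hosts(records, 25))
    print(sorted(exposed_in_networks(records, ["192.0.2.0/24"])))
```

Sets are used so that a host seen on several ports, or in several scans, is counted only once.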

A Bird's Eye View of Project Sonar

In this video, Bob Rudis, Chief Data Scientist at Rapid7, digs deeper into how Project Sonar data is being put to use by the Labs team, including a recent (and pretty impressive) impact story.

Browse the Datasets

To fuel innovation in the field, we provide Project Sonar datasets for commercial purchase and via strategic partnerships with select academic institutions.

Additional Research Projects

Project Lorelei

Furthering our understanding of the attacker mindset.

While Project Sonar exists to ultimately improve our collective security posture, you may opt to whitelist or blacklist the subnets from which it scans by emailing [email protected] with your CIDR blocks/list of IP addresses and affiliation.

For more detail on the inner workings of Project Sonar, visit opendata.rapid7.com.