In today’s Whiteboard Wednesday, Bob Rudis, Chief Data Scientist at Rapid7, will discuss a recent research report released by Rapid7, “The National Exposure Index”.
This Rapid7 report offers an extensive and technical exploration of data derived from Project Sonar, our security research project that gains insights into global exposure to common vulnerabilities through internet-wide surveys across different services and protocols.
Given the increased reliance we all have on the internet - for everything from ecommerce, to monitoring the power grid, to adjusting our thermostats - we wanted to leverage the reach of Project Sonar to understand overall internet threat exposure at both a general level and at a country/region level.
Watch this week’s Whiteboard Wednesday to learn more.
Welcome to this week's Whiteboard Wednesday. My name is Bob Rudis, and I'm Rapid7's chief data scientist. Today we're talking about the release of our latest data-driven research report, The National Exposure Index. Tod Beardsley, security research manager, Jon Hart, senior security researcher, and I set out to try to understand overall threat exposure on the Internet at both the general level and at a country and region level. Now, we've defined exposure as an Internet node offering services that either potentially expose sensitive data over cleartext channels, or are widely recognized to be unwise to make available on the Internet, and we chose 30 well-known services to count using Rapid7's Project Sonar.
After launching all our studies and gathering the initial results, we came face to face with our first bit of discovery: the Internet is broken. How is the Internet broken, apart from the mere existence of Facebook and 4chan? Well, to breathe some life into IPv4, countries, carriers, content delivery networks and companies are all using modern tools to intercept traditional TCP and UDP connections and bend them to their collective wills. This makes it difficult to get guaranteed accurate data from our lightweight SYN/ACK handshakes, which is what we used for this inaugural survey.
Now, another discovery was that the Internet of 2016 looks a great deal like the Internet of 1996, except that much like San Francisco or Boston's T, it's bursting at the seams. That is, there are orders of magnitude more nodes to count. Now, the same insecure services, combined with some new ones, are being exposed en masse across the globe. The major difference is that there are virtually no barriers to standing up infrastructure, and the need or desire for rapid innovation means that there's no minimum standard of care when deploying these new services, and there won't be any time soon.
Let's take a look at some key findings in the report. First, Telnet, good old Telnet. The granddaddy of services providing access to remote systems since 1973 is still going strong. We detected the presence of 15 million nodes responding to TCP port 23. That's 10% of all the nodes in the survey and 5% of all the nodes that respond to ICMP ping requests. Given that Telnet is one of the most insecure methods of communication, it's amazing to see it's still in use at this scale. An even more head-scratching fact is that there appear to be over 14 million printers connected directly to the Internet. LPT1 for life, I guess. Now, to keep from being completely depressed, we only looked for two database servers, MySQL and Microsoft SQL Server, and unfortunately found a combined total of 11.2 million of them just waiting for your SQL queries.
Now, we could've just posted a bunch of service counts in a blog post, rather than go through the monumental effort of building a report. Over a dozen folks at Rapid7 were involved in the production of the National Exposure Index. The main part of the report is taking many of these services in pairs, since most of the insecure ports we studied have secure counterparts. Think port 80 versus port 443, that is, unencrypted web traffic versus encrypted web traffic. We compared the balance of these pairings by country and then ranked them from worst to best, i.e. from the ones that were most insecure percentagewise to the ones that offer more secure versions. And we then took all these rankings and fed them into a fancy cross-entropy Monte Carlo rank aggregation algorithm. That is, we let the computer do the hard work so Tod wouldn't get a headache having to think. That algorithm eventually came up with the final National Exposure Index of the top 50 most exposed countries.
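To make the pair-and-rank idea a bit more concrete, here is a minimal Python sketch. It is not the code used for the report. The country names and node counts are invented for illustration, the function names are hypothetical, and a simple Borda count stands in for the cross-entropy Monte Carlo rank aggregation mentioned above, which is a considerably more involved method.

```python
from collections import defaultdict

# Hypothetical node counts per country, keyed by TCP port.
# These numbers are invented for illustration; they are NOT from the report.
COUNTS = {
    "Country A": {80: 900, 443: 100, 23: 300, 22: 700},
    "Country B": {80: 500, 443: 500, 23: 100, 22: 900},
    "Country C": {80: 200, 443: 800, 23: 450, 22: 550},
}

# (insecure port, secure counterpart): HTTP vs HTTPS, Telnet vs SSH.
PAIRS = [(80, 443), (23, 22)]


def insecure_share(counts, insecure, secure):
    """Fraction of a pair's nodes that expose the insecure service."""
    total = counts.get(insecure, 0) + counts.get(secure, 0)
    return counts.get(insecure, 0) / total if total else 0.0


def rank_by_pair(all_counts, pair):
    """Order countries from worst (most insecure) to best for one pair."""
    insecure, secure = pair
    return sorted(
        all_counts,
        key=lambda c: insecure_share(all_counts[c], insecure, secure),
        reverse=True,
    )


def borda_aggregate(rankings):
    """Combine several worst-to-best rankings with a Borda count.

    This is a simple stand-in, NOT the cross-entropy Monte Carlo
    method named in the talk. A higher score means more exposed.
    """
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, country in enumerate(ranking):
            scores[country] += n - position
    # Sort by descending score, breaking ties alphabetically.
    return sorted(scores, key=lambda c: (-scores[c], c))


per_pair_rankings = [rank_by_pair(COUNTS, pair) for pair in PAIRS]
overall = borda_aggregate(per_pair_rankings)
print(overall)
```

With these made-up counts, Country A is worst on the web pair and second worst on the remote-access pair, so it comes out as the most exposed overall. A Borda count treats every pair as equally important, which is one reason a more sophisticated aggregation method can be preferable when there are many rankings that disagree with each other.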
Want to know which country is number one? You'll have to head on over to community.rapid7.com and get your own copy of the report. We'll be providing some follow-up blog posts and have provided the data and code used in the study at github.com/rapid7/data so you can try this at home and augment or improve upon our work. For those of you without a full data science stack available, keep an eye out for some interactive visualizations. We're developing them and we'll put them on community.rapid7.com to help you explore the data. Thank you.