Update: David Maldow of Human Productivity Lab wrote a response to the NYT article that presented an industry perspective on our findings; see "Mythical Videoconferencing Hackers" and why we stand behind our claims. Additionally, the archive of Tuesday's webcast on the same topic (with live demos) is now available. Thank you to everyone who provided feedback!
Today's issue of the New York Times contains an article describing the results of research I conducted over the last three months. In short, a large portion of video conferencing equipment is connected to the Internet without a firewall and is configured to automatically answer incoming video calls. This allows a remote intruder to monitor both audio and video information, often with little or no indication to the target. The interesting part of this research is who it affects; these units can cost anywhere from a few hundred dollars (used) to tens of thousands of dollars for high-end room systems. It is rare to find a high-end video conferencing system in an unimportant location. Examples identified by this research include corporate boardrooms, inmate-lawyer consultation areas, venture capital firms, and research facilities.
This research covered about 3% of the addressable Internet and focused on equipment that spoke the H.323 protocol. Of the 250,000 systems identified with this service, just under 5,000 were configured to automatically receive incoming calls. There are an estimated 150,000 systems on the Internet as a whole affected by this issue. This does not count the hundreds of thousands of video conferencing systems exposed on the internal networks of large corporations.
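The scaling behind an Internet-wide estimate like this can be sketched as simple arithmetic. The snippet below is only a back-of-the-envelope illustration using the rounded figures quoted above; it assumes affected systems are spread uniformly across the address space, and it is not necessarily the method used to arrive at the 150,000 figure. The function name `extrapolate` is a hypothetical helper.

```python
# Back-of-the-envelope extrapolation from the rounded figures quoted above.
SAMPLED_FRACTION = 0.03      # "about 3%" of the addressable Internet
H323_SYSTEMS_SEEN = 250_000  # systems identified with the H.323 service
AUTO_ANSWER_SEEN = 5_000     # "just under 5,000" accepted incoming calls


def extrapolate(count_in_sample: int, sampled_fraction: float) -> float:
    """Scale a sample count to the whole population, assuming uniform density."""
    if not 0 < sampled_fraction <= 1:
        raise ValueError("sampled_fraction must be in (0, 1]")
    return count_in_sample / sampled_fraction


# Share of the identified H.323 systems that auto-answered.
auto_answer_share = AUTO_ANSWER_SEEN / H323_SYSTEMS_SEEN

# Rounded inputs make this an upper bound ("just under" 5,000, "about" 3%).
upper_bound = extrapolate(AUTO_ANSWER_SEEN, SAMPLED_FRACTION)

print(f"auto-answer share of identified H.323 systems: {auto_answer_share:.1%}")
print(f"extrapolated auto-answer systems (upper bound): {upper_bound:,.0f}")
```

Because "just under 5,000" and "about 3%" are both rounded, the result of this sketch lands somewhat above the 150,000 estimate but in the same range.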
Even cheap video conferencing systems provide an incredible level of visual acuity and audio reception. In the Rapid7 lab, we were able to easily read a six-digit password from a sticky note over 20 feet away from the camera. In an otherwise quiet environment, it was possible to clearly hear conversations down the hallway from the video conferencing systems. In most cases, the remote user has the ability to drive the camera - controlling pan, tilt, and zoom - providing visibility into areas far away from where the system is actually installed. A separate test confirmed the ability to monitor a user's keyboard and accurately capture their password, simply by aiming the camera and using a high zoom level. Another test demonstrated the ability to read a user's email on their laptop screen. If the system is connected to a television set that has not been powered on, the only indicator that a call is active will be the movement of the camera itself or a small light on the base of the system. Many of the high-end models do not include a visual indicator of a call in progress on the camera at all.
Video conferencing vendors have taken steps to provide security features, but the leading vendor, Polycom, still ships most of its equipment with auto-answer enabled by default. Polycom provides a hardening guide, but default settings typically become the most common configuration because administrators lack the time, patience, or oversight required to secure these devices. Other vendors, such as Sony, Tandberg (Cisco), Lifesize (Logitech), and Codian, appear to require the user to specifically enable auto-answer mode. Devices from each of these vendors were found during the course of the research, but they made up a much smaller portion of the whole compared to Polycom. Polycom documentation specifically calls out the security risks of the auto-answer option, but one would have to read the documentation, notice this, and then specifically configure the device to avoid this issue.
One of the most worrying parts of the research is related to firewalls. Many simple firewalls fail to handle the H.323 protocol in a way that works with common video equipment. There are multiple ways of solving this, including the use of H.323 gatekeepers and firewall-friendly options within certain devices, but the easiest solution is to provide the device with a public address and call it done. The risk here is that many of these devices offer little to no security through their administrative interfaces, whether this is telnet, web, or a vendor-specific service.
All Metasploit editions contain an exploit module for a vulnerability in the LifeSize conferencing systems, published last year, which can be used to directly compromise unpatched LifeSize equipment through an exposed web service. Comparing the affected version numbers with the results of the research indicates that the majority of the LifeSize equipment identified on the Internet would be vulnerable to this exploit. Just like with other "network devices", IT departments typically ignore maintenance on video conferencing units unless there is a change to functionality. The picture below shows some of the LifeSize versions found in the wild, many of which are vulnerable to the exploit module included in Metasploit.
The web interfaces on video conferencing devices can often be used to initiate outbound calls to other parties. In some cases, the remote party may have adequately secured their system, but added an allowance for a particular device or organization. The ability to initiate calls on these devices can bypass the security of a third-party system in this manner. One example that was identified during the research process involved a law firm that had an address book entry for the board room of a well-known investment banking organization. A used device purchased from eBay arrived with an address book containing dozens of private sites, many of which were configured to auto-answer incoming calls from the Internet at large.
All shipping Metasploit editions contain a scanner module for quickly identifying H.323-enabled systems that accept incoming calls. This module is included in the default discovery mode of Metasploit Pro (free trial) and can be used to quickly inventory a large network to identify affected systems. This process also works for the free Metasploit Community Edition. The process for using Metasploit Pro to discover exposed H.323 devices is outlined below.
- Log in to the web interface at https://metasploit:3790/
- Create a new Project
- Choose the Scan option
- Expand "Advanced Options" and enter "1720" into the Custom TCP Ports parameter
- Uncheck UDP and SNMP discovery options to increase scanning speed
- Launch the Scan task
- Once complete, browse to Analysis -> Services
- Enter "h323" into the Search box on the upper right
If you have already used the Scan component with the default settings, after applying any update in the last month, you should already have results available. Any device that accepted an incoming call should be identified by a minimum of protocol version and Vendor ID. Most devices will return the Product ID and DisplayName (Caller ID).
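For a quick sanity check outside of Metasploit, the first stage of this discovery (finding hosts that listen on TCP port 1720) can be approximated with a few lines of Python. This is a minimal sketch, not the Metasploit scanner: it only tests whether the port accepts a TCP connection and does not speak H.323, so it cannot tell whether a device would accept a call or report its Vendor ID. The function names `port_is_open` and `find_listeners` are hypothetical helpers, and the script should only be pointed at systems you are authorized to test.

```python
import socket
import sys
from concurrent.futures import ThreadPoolExecutor

H323_CALL_SIGNALING_PORT = 1720  # the TCP port entered in the scan steps above


def port_is_open(host: str, port: int = H323_CALL_SIGNALING_PORT,
                 timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def find_listeners(hosts, port: int = H323_CALL_SIGNALING_PORT,
                   timeout: float = 2.0, workers: int = 32):
    """Check many hosts concurrently and return those with the port open."""
    hosts = list(hosts)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda h: port_is_open(h, port, timeout), hosts)
        return [h for h, is_open in zip(hosts, results) if is_open]


if __name__ == "__main__":
    # Hosts are taken from the command line, e.g.:
    #   python check_1720.py 192.0.2.10 192.0.2.11
    for host in find_listeners(sys.argv[1:]):
        print(f"{host}: TCP/{H323_CALL_SIGNALING_PORT} open")
```

An open port here is only a candidate; the H.323 identification described above is still needed to confirm that a device actually accepts incoming calls.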
To validate an identified service, connect with an H.323-capable client such as NetMeeting (Windows XP), Ekiga (cross-platform, but buggy), Mirial Softphone (commercial), or ClearSea In the Cloud (only able to reach internet-exposed devices). For internal systems, I still rely on NetMeeting in an XP virtual machine as the most reliable H.323 client; however, it lacks the Pan-Tilt-Zoom (PTZ) and keypad controls of a more advanced client like Mirial or ClearSea In the Cloud.
IP-based H.323 really just scratches the surface of video conferencing security issues. ISDN is still used to connect many of these devices and this is much more difficult to survey. A used ViewStation recently purchased from eBay contained an address book of two dozen sites - all listed via their ISDN dial-in and not an IP address. Many systems, including the Polycom demo sites, offer both IP and ISDN-based services. My gut feel is that for every exposed device found on the public internet, there are twenty more attached to an obscure ISDN number that the IT department may have forgotten about. This may be a job for WarVOX, but until more research into ISDN analog dialing is done, it is hard to tell whether this is a workable solution for detecting exposed ISDN-attached video conferencing systems (normal PSTN lines receive a busy signal or failed call from an ISDN VC endpoint).
Video conferencing systems are one of the most dangerous but least-known exposures to organizations conducting business of a sensitive nature. The popularity of video conferencing systems among the venture capital and finance industries leads to a small pool of incredibly high-value targets for any attacker intent on industrial espionage or obtaining an unfair business advantage. Although many vendors provide some security measures, these tend to be ignored in the real world, by both IT staff and security auditors. The additional awareness raised by the NYT article, along with the introduction of scanning tools inside of Metasploit, will hopefully drive vendors and end-users to take video conferencing security seriously.
If you are interested in viewing a live demo of this technique, please view the on-demand webcast "Board Room Spy Cams: How Attackers Take Over Your Video Conferencing Systems And How To Stop Them".