Last updated at Mon, 13 May 2019 13:02:00 GMT
This is part three in a three-part series on medical device risk management, particularly as it pertains to vulnerability assessment. In part one, we discuss the processes and procedures to implement inside of a clinical environment to position the security team for success. Part two gets into the weeds and examines how to directly perform assessments on medical devices. In part three, we put it all together with an example of how an organization would implement these ideas with a based-in-reality medical device.
In our previous two posts, we have built out some theory on how to approach safely bringing medical devices into the vulnerability management life cycle. We have done this with an eye toward patient safety, up to the point where we’re proactively preparing for mistakes to be made. What would this process look like in practice?
A new initiative
Welcome to Mooseville Medical! A major metropolitan hospital system, Mooseville Medical has a primary level-one trauma center, several connected satellite clinics, and administrative personnel spread out wherever they fit. The bulk of the organization is focused on the main hospital, and taken as a unit, the environment is littered with everyday workstations, a modest data center, and—of course—medical devices.
We in the information security team have recently been told, somewhat ungracefully, that we need to examine “the cyber stuff” with regard to the organization’s medical devices. The lone piece of guidance we’ve been given is to (all together now) avoid patient harm.
There are a few things we know already as we begin to tackle this new initiative. First, the medical devices are segmented. We verified this with the biomedical team when implementing our vulnerability management program. Biomedical actually maintains a handy inventory of devices they have deployed. Several of those don’t even have a connected component and therefore have a dramatically reduced risk profile. With that said, there are a couple of systems—including a newer infusion pump with a management server—that are very much connected.
As we are not experts in any of these devices, our colleagues in biomedical prove to be an invaluable resource. After we describe our initiative (and they’ve calmed down when we tell them we’re not going to touch their production network), they’re enthusiastic that we’re showing an interest in their world. We’re looking for something that’s going to be a good first test bed: high-impact, network-connected, and readily available to play with. It doesn’t take long to circle in on the infusion pumps. There’s a ton of them deployed, they’re connected, and they cycle in and out of maintenance constantly, allowing us to steal one away for some experimentation.
They’re also kind enough to give us some operational data on the pump. It runs a proprietary operating system, with a few extra gigabytes of storage beyond what the OS needs for updates and configuration. It has modest memory and processing power. The management server runs Windows Server 2008 and communicates with the pump via a web server on port 8080. From a connection perspective, the pump has a standard Ethernet port and a serial port that the biomedical team never touches. There are buttons on it for manual configuration of dosages, and a simple physical user interface for examining the current configuration of said dosages.
The biomedical team is also letting us play with it in their testing environment. It’s a modest setup; there’s a non-prod management server connected to a few pumps that are out of production for maintenance. It’s all connected via a consumer router.
Learn by doing
We should probably grab the documentation for the device and give it a read, but nuts to that. Learn by doing! Connecting to the router, we ping the pump and get a return. Cool, good to note. Telnet to 8080 succeeds, verifying that the device is currently in something resembling normal operation, at least according to the biomedical team. Throwing caution to the wind, we fire up Nmap and let a SYN scan fly against all TCP ports. The findings are ... unexpected. Port 21 comes back alive, but everything else is dead—including 8080. Huh. Sure enough, Telnet to 8080 is no longer connecting.
The biomedical team is equally confused. After a power cycle, the device appears to be working normally again, and we take another swing. This time, we introduce a rate limit on Nmap. Other than being slower, though, the results are the same. Bouncing the pump again, we change Nmap from a SYN scan to a full TCP connect and suddenly, the world is illuminated. Ports 21, 22, 23, and 8080 are live. Service enumeration identifies no surprises on those ports—an FTP server, an SSH server, a Telnet server, and a web server. They’re all open source and all a tad on the old side, but versions are clear.
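To make the difference concrete: a SYN scan sends the opening packet of the handshake and never completes it, while a full connect scan opens each connection properly and then closes it (in Nmap terms, roughly the difference between `-sS` and `-sT`). The sketch below is not what Nmap does internally. It is a minimal Python illustration of a rate-limited connect scan, and the function name `connect_scan` and its `delay_s` and `timeout_s` parameters are assumptions made up for this example.

```python
import socket
import time


def connect_scan(host, ports, delay_s=0.1, timeout_s=1.0):
    """Return the ports on `host` that complete a full TCP handshake.

    Hypothetical helper for illustration only. Probes run one at a
    time, with a pause between them, to keep load on the target low.
    """
    open_ports = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout_s)
            # connect_ex returns 0 when the handshake completes and an
            # error code otherwise, instead of raising an exception.
            if sock.connect_ex((host, port)) == 0:
                open_ports.append(port)
        # Crude rate limit: wait before touching the next port.
        time.sleep(delay_s)
    return open_ports
```

Pointed at the test pump with the ports `[21, 22, 23, 8080]`, a helper like this would be expected to report all four as open, since each probe finishes the handshake before moving on.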
Here, we resist the urge to start really pounding the device. Old versions of four services, with two of them unencrypted remote access services, are bound to have at least some vulnerabilities we can take advantage of. However, the SYN scan issue is a concern, indicating that this device isn’t the most stable thing in the world when scanning. We put a pin in this information for now.
Unleash the vuln scanner
Nmap and Telnet are great first tools, but we want to see how this thing plays with our vulnerability scanner. We know our vuln scanner is multi-threaded, so it’s going to introduce an extra load element that Nmap didn’t have. Our usual default template checks to see first whether a device is alive via ICMP and some well-known TCP ports. Next, it does service discovery on well-known TCP and UDP ports, as well as performs some operating system detection. Finally, it launches vulnerability checks—10 at a time—matched against the discovered services and OS. Before launching, we make one critical change to the template: switching from a SYN scan to a full TCP connect. Look at us, learning.
The scan finishes and, at first blush, we get what we expect. The four services are discovered, the scanner basically has no idea on the OS beyond some generic unix fingerprint, and we get a handful of generic vulnerabilities. Included are some gross-but-probably-false positives on username/password combos on the Telnet service. We’re only human, though, and when the vuln scanner tells us that “root/root” is going to work, well, what do you think we’re going to do?
The attempt fails, but not for the reason we expect. The Telnet service is unresponsive, and further attempts on the other known ports show that they’re all unresponsive. While muttering a few expletives, we power cycle the pump and head to the logs. It just looks like we lost connectivity in the middle of vulnerability checks. Iterating a few times, we see the same behavior, though in slightly different places in the logs each time.
There are only a handful of things it could be. An individual vulnerability check could potentially crash a poorly built service, but is unlikely to knock the whole system over—and the logs aren’t stopping on a specific check, anyway. The most likely issue is that we hit it too hard. We drop the template down to running a single vulnerability check at a time (instead of 10) and let it fly again. Bingo! The device survives, and of course we find a few more generic vulnerabilities this time since our checks were able to complete. (Incidentally, the default Telnet password didn’t work. We’re pleased, but disappointed. You understand.)
After a little more experimenting, we find that the trouble starts at around seven simultaneous threads for vuln checking. We drop our template down to five for safety purposes. Having satisfied our need for a safe and stable scan template, it’s time to figure out how secure this thing is. The couple of vulnerabilities that the vulnerability scanner found were fairly generic. We did get a few hits on the open source FTP and web server, but nothing remotely exploitable.
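The throttling idea can be sketched in a few lines. This is not our scanner's configuration format, just a rough Python illustration of capping how many checks run against one device at once. The names `run_checks`, `FAILURE_THRESHOLD`, and `SAFE_PARALLEL_CHECKS` are hypothetical, and each "check" is assumed to be a plain function that takes the target address.

```python
from concurrent.futures import ThreadPoolExecutor

# From the testing described above: trouble started at around seven
# simultaneous checks, so the template was capped at five.
FAILURE_THRESHOLD = 7
SAFE_PARALLEL_CHECKS = 5


def run_checks(checks, target, max_parallel=SAFE_PARALLEL_CHECKS):
    """Run every check against `target`, at most `max_parallel` at a time.

    Hypothetical helper: `checks` is a list of callables that each take
    the target and return a result. Results come back in input order.
    """
    if not 1 <= max_parallel < FAILURE_THRESHOLD:
        raise ValueError(
            f"max_parallel must be between 1 and {FAILURE_THRESHOLD - 1}"
        )
    # The pool size is the throttle: no more than max_parallel checks
    # are ever in flight against the device at the same moment.
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        futures = [pool.submit(check, target) for check in checks]
        return [future.result() for future in futures]
```

Refusing any value at or above the observed failure point is a deliberate choice here: it keeps someone from "temporarily" bumping the number back up to speed a scan along.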
Finding the real vulnerabilities
Luckily, there’s a highly sophisticated research tool that reveals that there are indeed some device-specific vulnerabilities with this particular pump that our vulnerability scanner is not aware of. They’re fun vulnerabilities, too, and include:
- A remote buffer overflow specific to the version we’re running. There’s no available exploit code, and we’re not exactly exploit writers, but the disclosure notes that full remote access is possible.
- A hardcoded password. And this one works. Sigh.
- No certificate validation, vulnerable to man-in-the-middle.
- Passwords are stored in the clear on the device. Another sigh.
There’s good news with the bad news, though. Updated software for the pumps exists to address these. In addition, we’ve verified with biomedical and the device manual that FTP, SSH, and Telnet are not required for regular operations. Already we have some recommendations for the organization that can be safely implemented to lock these devices down better.
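The "what can we turn off" recommendation boils down to comparing what the scan saw against what the device needs. Here is a small Python sketch of that comparison, assuming (per biomedical and the manual) that only the web service on port 8080 is required. `REQUIRED_PORTS` and `unneeded_services` are names invented for this example.

```python
# Assumption for this sketch: only the web service the management
# server talks to (port 8080) is needed for regular operation.
REQUIRED_PORTS = {8080}


def unneeded_services(observed):
    """Return {port: service} for open services the pump does not need.

    `observed` maps each open port to the service name seen in a scan.
    """
    return {
        port: name
        for port, name in sorted(observed.items())
        if port not in REQUIRED_PORTS
    }


# What our connect scan found on the test pump.
pump_services = {21: "ftp", 22: "ssh", 23: "telnet", 8080: "http"}
candidates_to_disable = unneeded_services(pump_services)
```

For the test pump, `candidates_to_disable` holds the FTP, SSH, and Telnet entries, which matches the recommendation above.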
In addition to those recommendations, we make a few changes to the way we scan for vulnerabilities generally. Although we’re assured it’s unlikely, it’s still possible that a pump could drop into our main network instead of the medical device network, and our testing indicates that our default scan templates are not going to play nicely. However, the performance hit we’d take by slowing down our scans would be highly impactful, so we decide to break our vulnerability scanning into two steps—something we probably should have done already, anyway.
For step one, we implement a discovery scan. No vulnerability checks, we go relatively slow, and we do full TCP connects. We also build a dynamic asset group that’s explicitly designed to identify these pumps. Any assets that have those four services running and a generic unix OS get thrown into the asset group automatically and tagged with “Medical Device?” for further investigation. For now, all other assets get tossed into “safe to scan” asset groups, appropriate to their location and scan schedule. Then, our regular vulnerability scans operate off of the new safe asset groups, where we do what we’ve been doing. Just for fun, we run the new discovery scan across the environment to validate our assumptions that these devices aren’t on the primary network.
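The dynamic asset group rule is simple enough to sketch. This is not our scanner's rule syntax. It is a rough Python rendering of the logic, assuming the discovery scan gives us a set of open ports and an OS guess per asset. The `Asset` class, the group names, and the choice to match on "at least those four ports" rather than "exactly those four" are all assumptions for illustration.

```python
from dataclasses import dataclass, field

# The four services found on the pump: FTP, SSH, Telnet, and the web server.
PUMP_PORTS = frozenset({21, 22, 23, 8080})


@dataclass
class Asset:
    """Hypothetical record for one host found by the discovery scan."""
    address: str
    open_ports: frozenset
    os_guess: str
    tags: set = field(default_factory=set)


def looks_like_pump(asset):
    """True when all four pump services are open and the OS is generic unix."""
    return PUMP_PORTS <= asset.open_ports and "unix" in asset.os_guess.lower()


def group_assets(assets):
    """Split assets into pump candidates and everything that is safe to scan."""
    groups = {"medical_device_candidates": [], "safe_to_scan": []}
    for asset in assets:
        if looks_like_pump(asset):
            asset.tags.add("Medical Device?")
            groups["medical_device_candidates"].append(asset)
        else:
            groups["safe_to_scan"].append(asset)
    return groups
```

A rule this loose will sweep up the occasional old Linux box, too. That is fine: the tag ends in a question mark for a reason, and a false positive only costs us a closer look before the full scan.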
The scan says otherwise: some pumps turn up on the primary network. It takes some figuring (we end up having to visit a specific floor of the hospital and take a look at a few patient rooms when they’re unoccupied), but we finally determine that the network ports are not properly labeled. There’s a primary network drop and a medical device network drop in these rooms, and the pumps are sometimes placed on the wrong drop. When we interview some of the medical personnel, they note that, oh yeah, sometimes the pumps are “flaky” with the network software, but they all know how to operate them offline so they just roll with it. This ends up being the worst of both worlds: the devices are exposed on the main network (and we’ve been scanning them and probably knocking them over!), and there’s no reporting channel from the medical personnel to the biomed team that would have helped us discover this.
What did we learn and, just as importantly, recommend by going through this exercise?
- The pumps we evaluated need to be updated and have non-critical services disabled (SSH, Telnet, FTP).
- There are some people processes to fix. Biomedical is working in conjunction with the clinical staff to ensure odd behavior gets reported rather than ignored.
- Medical devices end up in the wrong place, and “defending” against that by intentionally treading lightly on the main network is super important.
- Security is developing a notification process for known medical devices showing up in non-medical networks.
- Patches aren’t getting pushed regularly. Many of the vulns found through either the scanner or our independent research were resolved by a software update. Since those software updates didn’t introduce functional changes, though, and there weren’t issues being reported by the clinical users, biomedical wasn’t aware of a need to push updates. For known medical devices, partner with the biomedical team to review security updates and the implications of applying the updates.
- We need to get into the clinical environment more. Flaws in how pumps are deployed in some of the patient rooms directly led to risk to the patients, and there is likely much more to be learned by spending time assessing these rooms.
- This process needs to be repeated with the other medical devices in Mooseville Medical. Significant risk was identified with the first one we examined. It’s likely this was only the tip of the iceberg.
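The notification idea from the list above can start out very small. As a rough Python sketch, assuming we keep a list of the subnets that make up the medical device network and get device addresses from the discovery scan, the check is just a subnet membership test. The function name `misplaced_devices` and the addresses in the usage note are made up for illustration.

```python
import ipaddress


def misplaced_devices(device_addresses, medical_networks):
    """Return the device addresses that sit outside every medical subnet.

    `device_addresses` are IP strings for known or suspected medical
    devices; `medical_networks` are CIDR strings for the segments where
    those devices are supposed to live.
    """
    networks = [ipaddress.ip_network(cidr) for cidr in medical_networks]
    misplaced = []
    for address in device_addresses:
        ip = ipaddress.ip_address(address)
        # A device is fine if any approved segment contains it.
        if not any(ip in network for network in networks):
            misplaced.append(address)
    return misplaced
```

With a hypothetical medical segment of `10.20.0.0/24`, calling `misplaced_devices(["10.20.0.15", "10.1.4.77"], ["10.20.0.0/24"])` returns `["10.1.4.77"]`, which is the pump someone plugged into the wrong drop and the one that should trigger a call to biomedical.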
Biomedical is on board to work with us, and more than a little startled at the first round of findings. Management is pleased (if concerned) with our first set of findings, and we’ve got defenders high and low to back up our continued work with medical devices. The work continues.
While this is a fictional example (obviously), I’d like to note that it is based in reality. In building it, I worked off of the profiles for existing infusion pumps and environments that I’ve worked in before. The vulnerabilities are all real and have all been observed in this type of device. I’ve embellished a little throughout the story to make it more interesting, but the scenario itself is realistic.
This series is intended to provide both guidance and ammunition for organizations interested in taking a serious look at their medical devices. It is critical that we understand the risk these devices bring with their utility and the risk we introduce by having them as part of a connected network. Heavy-handed approaches of “scan everything” without a deliberate plan will (and should) be met with resistance. Coordinated, collaborative efforts ensure a better partnership and better security practices by everyone. It is important to realize that information security’s goal of serving and protecting can do more harm than good if not handled with care. In our zeal to protect patients, we can unintentionally become our own worst enemy. Let us not be the risk to the business: patient care is the business. Information security must respond accordingly.