What is Data Leakage? 

Data leakage occurs when an organization inadvertently exposes sensitive information to the public internet or unsecured networks, usually because of a mistake such as overlooking a critical vulnerability. This exposure increases the chances that malicious actors will take the data.

In a worst-case scenario, data “leaks” from the originating secure network into the hands of bad actors, who hold the sensitive data for ransom or leak it more widely onto more visible platforms and websites.

Data Leak vs. Data Breach: What's the Difference?

The National Institute of Standards and Technology (NIST) defines a breach as: 

“The loss of control, compromise, unauthorized disclosure, unauthorized acquisition, or any similar occurrence where: a person other than an authorized user accesses or potentially accesses personally identifiable information; or an authorized user accesses personally identifiable information for anything other than the authorized purpose.”

Simply put, a data breach is when data is knowingly accessed in an unauthorized manner. A data leak is when an authorized user mistakenly exposes data to the internet or unauthorized networks, but it technically hasn’t been stolen – yet.

The difference between these two terms is small but important when taking actions to secure the data in question or when reporting on the incident later.

How Does Data Leakage Occur? 

Data leakage can result from any number of mistakes or oversights, or from something no one in the organization ever anticipated. Let’s take a look at a few ways data leakage can occur:

  • Human error: Back in 2012, we said that a staggering number of cases involving human error were creating unprecedented challenges for governments in securing critical infrastructure, intellectual property, economic data, employee records, and other sensitive information. Twelve years later, this still holds true.
  • Legacy or outdated data: Keeping archived data can have benefits, but more often this outdated information becomes a significant vulnerability and liability for businesses around the world. However well this legacy data is secured, a crack will eventually appear in its armor and the data will be exposed. Whether malicious actors notice that the information is there for the taking is one question. The more critical one is: Is it absolutely necessary to keep this old data around?
  • Poor password hygiene: If IT and security organizations do not implement sophisticated identity and access management (IAM) solutions that consistently generate and rotate passwords, it is likely only a matter of time before something like a credential stuffing attack succeeds and bad actors begin exfiltrating data.
  • Vulnerabilities: It happens every day, everywhere: a vulnerability goes overlooked or undiscovered in the software development lifecycle (SDLC), and attackers take advantage in the blink of an eye. Depending on the size of a business or DevOps organization, limited resources may simply make it impossible to catch everything.
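
The question raised above about legacy data can be approached by first taking stock of what is being kept. The following is a minimal sketch, not a complete retention tool: the three-year window, the `find_stale_records` helper, and the inventory entries are all hypothetical and chosen only for illustration.

```python
from datetime import date, timedelta

# Hypothetical retention window; the right value depends on an
# organization's legal and business requirements.
RETENTION_DAYS = 365 * 3


def find_stale_records(records, today, retention_days=RETENTION_DAYS):
    """Return the names of records last accessed before the retention cutoff."""
    cutoff = today - timedelta(days=retention_days)
    return [name for name, last_accessed in records if last_accessed < cutoff]


# Illustrative inventory of (dataset name, last access date) pairs.
inventory = [
    ("customer_export_2015.csv", date(2016, 1, 10)),
    ("payroll_current.xlsx", date(2024, 5, 2)),
    ("old_crm_backup.sql", date(2018, 7, 30)),
]

# Datasets untouched for longer than the window are candidates for review.
stale = find_stale_records(inventory, today=date(2024, 6, 1))
print(stale)
```

A list like this does not decide anything on its own. It only surfaces the datasets for which someone should ask whether keeping them is still necessary.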

What are the Effects of Data Leaks? 

The effects of data leaks can be disastrous. But, as with anything in security, so much depends on timing. If analysts catch the cause of a data leak early, the business may be lucky enough to avoid negative fallout entirely. Or it may be able to minimize the damage. Or it may have to deal with business- or reputation-altering repercussions.

Damage to Reputation 

The priority shouldn’t be waiting until something happens; it should be planning for the event. Reputational damage can and should be scoped before any significant event occurs. That way, a business and its IT and security organizations will have a playbook to follow in such a situation, which will help minimize lasting reputational impact.

Damage to Finances

Following on from possible large-scale reputational damage, the effect on a business’s bottom line is two-pronged: potential ransom payments to threat actors, and customers taking their business elsewhere. Businesses could quickly find themselves bankrupt or out of business entirely if they aren’t prepared for the consequences of unintended data leakage.

Damage to Operations

The time it takes an organization to return to normal operations depends on the severity of the security event that follows a data leak, and on how many in-progress initiatives must be fully halted during an “all hands on deck” response. This can cause incredible disruption to a business and create an operational deficit from which recovery could be nearly impossible.

Damage to Talent Acquisition

The current cybersecurity talent shortage and skills gap only seems to worsen as more managed security service providers (MSSPs) are called upon to provide monitoring, detection, and response on behalf of clients. Hiring skilled in-house talent can already be a laborious enterprise. After a breach that causes catastrophic reputational damage, attracting that talent becomes harder still.

Types of Data Leakage

While certain data types are obviously of higher value to threat actors, such as personally identifiable information (PII) and financial and health-related data, what are some of the main vectors by which data leakage occurs? We’ve covered some of the causes, but let’s now group them by type.

Human Error

Whether it was initiated by an internal source or perhaps a supply chain partner, to be classified as human error in this sense the act/disclosure/exposure must be unintentional. The root cause of this data exposure or leak might have begun as a misconfiguration during the SDLC and turned into a gaping vulnerability through which high-value data was exposed.
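
Misconfigurations of this kind can sometimes be caught by an automated check before they turn into an exposure. The sketch below assumes a hypothetical, simplified resource description: the `public_read` and `encrypted` fields and the `find_risky_settings` helper are illustrative and do not correspond to any particular cloud provider's API.

```python
def find_risky_settings(resources):
    """Return (resource name, reason) pairs for settings that may expose data."""
    findings = []
    for res in resources:
        # A missing field is treated as the riskier setting for encryption,
        # and as the safer setting for public access.
        if res.get("public_read", False):
            findings.append((res["name"], "public read access enabled"))
        if not res.get("encrypted", False):
            findings.append((res["name"], "encryption at rest disabled"))
    return findings


# Illustrative storage resources, as they might appear in a parsed config file.
resources = [
    {"name": "marketing-assets", "public_read": True, "encrypted": True},
    {"name": "customer-records", "public_read": False, "encrypted": False},
    {"name": "audit-logs", "public_read": False, "encrypted": True},
]

findings = find_risky_settings(resources)
for name, reason in findings:
    print(f"{name}: {reason}")
```

Some findings, such as a deliberately public bucket of marketing assets, may be intentional. The point of a check like this is to force that decision to be made explicitly rather than by accident.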

The inciting incident could also be something much less technical. Workstations left unattended and accessible during remote work, and lost devices, are two examples of mishaps that occur every day and lead to unintended negative consequences.

Attacker-Initiated

For the purposes of this page, we are mainly discussing data leakage in a scenario whereby an internal actor – employee, visitor, contractor, vendor, etc. – would unknowingly leave data unprotected or exposed to potential theft or ransom.

However, if attackers leverage an exposure to more easily steal potentially sensitive data, then this type of leak would be attacker-initiated. The responsibility for the exposure, though, still lies with the person or people who were initially tasked with securing the data. If a door is left ajar, we can reasonably assume that few attackers would hesitate to throw it wide open and steal sensitive data.

How to Prevent Data Leakage

It's entirely possible to effectively prevent sensitive enterprise-level data from being exposed and subsequently leaking onto the public internet or into the data stores of malicious actors.

Whether one of the following preventive options is used as a standalone solution or as part of a larger product suite, each organization should keep its unique needs and goals in mind when researching which solution or product is best for its environment.

  • Institute a data loss prevention (DLP) solution: DLP solutions typically focus on the endpoint, network, and cloud. This functionality specifically addresses the issues we’ve discussed at length here, such as vulnerabilities as a result of misconfiguration and accidental exposure.
  • Leverage encryption: Data encryption protects data from unauthorized use or access. In symmetric encryption, a “key” is used to encrypt a message on one end of a transmission, and the same key is used to decrypt it on the other end. With this process, even if malicious actors are able to successfully exfiltrate data, there is a good chance it will be of no use to them if strong encryption protocols are used. Increasingly, machine learning and AI are being used to create more sophisticated encryption techniques.
  • Shift left: Ensuring security processes are part of the SDLC – and thereby a true DevSecOps workflow – can vastly cut down on the number of vulnerabilities that go out the door at the end of the build cycle. By integrating security checks into Infrastructure as Code (IaC) templates and other parts of the coding process, DevSecOps organizations reduce the chances of a critical data leak.
  • Train employees and partners: Engaging employee workforces in security awareness training that covers topics like basic password and authentication best practices can go a long way toward warding off a leak if, for instance, a device is lost or passwords are reused over a long period of time.
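
To make the DLP idea above more concrete, the sketch below shows the simplest form of content inspection: scanning outbound text for patterns that look like sensitive data and redacting them. It is an illustration under stated assumptions, not how any particular DLP product works. The two patterns, the `scan_text` and `redact` helpers, and the sample message are all hypothetical, and real solutions rely on far more robust detection than a pair of regular expressions.

```python
import re

# Illustrative patterns only; production detection needs validation,
# context, and many more data types.
PATTERNS = {
    "email": re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def scan_text(text):
    """Return a sorted list of (label, matched text) pairs found in text."""
    hits = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((label, match.group()))
    return sorted(hits)


def redact(text):
    """Replace every match of a known pattern with a labeled placeholder."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[REDACTED {label}]", text)
    return text


message = "Please update the record for jane.doe@example.com, SSN 123-45-6789."
hits = scan_text(message)
clean = redact(message)
print(hits)
print(clean)
```

A check like this could run wherever data leaves the organization's control, for example before an email is sent or a file is uploaded, with the choice between blocking, redacting, or alerting left to policy.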