What It Is All About
A core assumption underlying organizational security practices is that defenders are able to remediate known vulnerabilities in their systems in a timely fashion. Otherwise, attackers can simply follow the breadcrumbs laid out by security advisories and exploit known weaknesses. This is indeed what happens in many large breaches. While progress has been made for consumers, with automatic updates and default patching settings, this progress has not carried over to enterprises. They face a painful dilemma: patch too soon and incur potential downtime and failures; patch too late and risk being compromised. As a result, organizations take a long time to patch even critical security vulnerabilities. The central objective of the THESEUS project is to empower organizations to patch much faster. It aims to achieve this by radically changing the risk governance of patching.
An Unmitigated Disaster
Contrary to common belief, more than 99% of attacks do not use highly advanced exploitation techniques or so-called zero-days, but rather known vulnerabilities for which security patches have often been available for a long time [1]. The Rathenau Institute recently advised the Dutch government that new technologies for security, such as artificial intelligence and post-quantum cryptography, will not matter if a basic issue like security patching remains unsolved [2]. For this reason, the current practice of delayed patching is arguably the most important problem in security today.
The consequences of delayed patching can be disastrous. During the infamous wave of WannaCry and NotPetya ransomware in 2017, all major incidents (e.g., disruptions of Maersk, 16 UK hospitals shutting down, disruption of Renault factories) occurred several months after Microsoft had released a patch that would have prevented the outbreaks. Also in 2017, Equifax suffered a major data breach that leaked the personal data of over 140 million customers, making it one of the biggest data breaches in history. The attackers exploited a known vulnerability in Apache Struts that could and should have been patched in March 2017, when the patch was released and administrators were told to apply it to all affected systems. That process failed, and the vulnerability remained open in multiple systems until the end of July 2017 [3]. Even high-profile security vulnerabilities such as Heartbleed remain unpatched in many organizations for months after disclosure [4]. More recently, months after vulnerabilities in two enterprise solutions (Pulse Connect Secure VPN and Citrix ADC and Gateway) were made public and mitigations became available, Internet-wide scans found thousands of organizations that were still running vulnerable systems.
In essence, the lack of patching hygiene is a governance problem. As the impact of breaches often radiates from the individual organization towards supply chains, customers, and third parties, the slow response to known vulnerabilities is also a major issue for society as a whole and thus for the government. The Dutch Safety Board announced that it will investigate the Citrix incident to improve the “governance of digital security” [5]. The Dutch Minister of Justice and Security announced that he was considering top-down interventions for critical industries: mandating organizations by law to implement the security advisories of the government.
A mandatory patching regime could, however, well be a cure that is worse than the disease. In the last three years, 48,373 vulnerabilities were reported to the National Vulnerability Database (NVD), of which 11,628 were considered to have high severity. Mandating the patching of all of these vulnerabilities would potentially expose organizations to thousands of planned and unplanned interruptions, even though most of these vulnerabilities are never exploited in the wild [6]. That said, the government's announcement of such drastic measures underlines that we are dealing with a governance challenge. The societal impact of long vulnerability windows is too severe to leave the problem to the trial and error of individual organizations.
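To give a sense of scale, the following short Python sketch turns the three-year totals quoted above into rates. The function name and the choice of a 365-day year are assumptions made for illustration; only the two vulnerability counts and the three-year window come from the text.

```python
# Figures quoted in the text: three-year NVD totals.
TOTAL_VULNS = 48_373
HIGH_SEVERITY = 11_628
YEARS = 3


def summarize(total: int, high: int, years: int, days_per_year: int = 365) -> dict:
    """Convert multi-year totals into a share and per-year/per-day rates.

    days_per_year=365 is an illustrative assumption (calendar days,
    not working days).
    """
    days = years * days_per_year
    return {
        "high_share": high / total,        # fraction rated high severity
        "total_per_year": total / years,   # average new entries per year
        "total_per_day": total / days,     # average new entries per day
        "high_per_day": high / days,       # average high-severity entries per day
    }


stats = summarize(TOTAL_VULNS, HIGH_SEVERITY, YEARS)
print(f"High-severity share:     {stats['high_share']:.1%}")
print(f"All entries per year:    {stats['total_per_year']:.0f}")
print(f"All entries per day:     {stats['total_per_day']:.1f}")
print(f"High severity per day:   {stats['high_per_day']:.1f}")
```

Under these assumptions, roughly a quarter of the reported vulnerabilities are rated high severity, which works out to about ten new high-severity entries per calendar day. Not all of them apply to any single organization, but the rate illustrates why a blanket mandate would be hard to sustain.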
The challenge is daunting, as organizations lack both the means and the incentives to mitigate relevant vulnerabilities in a timely fashion. The CISO of a Dutch hospital recently told us that patch deployment in their organization takes three months on average. After seeing how WannaCry affected other health institutions, she initiated a crisis management procedure in order to patch that specific vulnerability faster than normal. Even with this peak effort, it would take the organization one month to deploy the patch. The question is: why does it take a month to patch a single vulnerability, despite the organization’s full effort and best intentions?
Patch your Software or Stay Vulnerable? What is Riskier?
Technically speaking, deploying a security fix is a problem that should not exist. Empirical studies [7] have shown that, in almost all cases, the lines of code that need to be fixed can be counted on the fingers of one hand. In the ideal fix process, a developer would patch these lines, eliminate the unwanted behavior, recompile the code, and provide users with the fixed version. Users, be they individuals or organizations, could then replace the old version with the fixed one with close to zero downtime and zero functional tests, as the two versions would be functionally identical except for the unwanted behavior.
Unfortunately, other substantial empirical evidence shows that this is not happening. A study on mobile applications [8] showed that seemingly trivial updates of the underlying libraries would have broken the application in almost 50% of the updates: toss a coin and, if it comes up heads, your application breaks. It is no surprise, then, that companies may delay the deployment of the non-vulnerable version by months. We might blame these companies, and security experts typically do, but we claim that there is a more fundamental obstacle. A vendor should fix only what is needed and nothing else, yet this option is often not available for a large population of products used by companies. Such software has a long life in the field (Figure 1 in the cited work [7] shows a mean lifetime of eight years and a long tail of 14 years). The product evolves, and suppliers in practice bundle the security fixes with “improvements” to keep abreast of the competition or gain new customers. Red Hat Enterprise Linux started in 2012 and will reach its extended end of life midway through this project. Siemens WinCC V4.0 (software for controlling industrial control systems) was supposed to have died in 2004 (after seven years in service according to Siemens’ plan), but the latest migration instructions from Siemens date from May 2020.
A Way Forward
In short, organizations face a fundamental risk tradeoff: balancing the risk of not patching against the risk of patching. The latter incurs ongoing business continuity disruptions; the former incurs a potentially catastrophic compromise event [9]. Since patching is known to be risky, while the risk of not patching is unknown and frequently nothing happens, the incentive is for organizations to patch slowly. It is a “devil you know versus the devil you don’t” situation. With the disruptions to IT management arising from the global pandemic of 2020, patching is at risk of being further neglected even while vulnerabilities continue to be discovered [10].
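The tradeoff can be made concrete with a deliberately simplified expected-cost comparison. The sketch below is a toy model for illustration only, not the model of the cited work or of the project: the function names, the assumption of independent daily exploit attempts, and every number in the example are hypothetical. In practice the daily exploit probability is precisely the quantity organizations do not know.

```python
def expected_patch_cost(p_break: float, downtime_cost: float) -> float:
    """Expected cost of patching now: chance the update breaks something
    times the cost of the resulting downtime."""
    return p_break * downtime_cost


def expected_delay_cost(p_exploit_per_day: float, delay_days: int,
                        breach_cost: float) -> float:
    """Expected cost of waiting: chance of at least one successful exploit
    during the delay (days assumed independent) times the breach cost."""
    p_compromise = 1.0 - (1.0 - p_exploit_per_day) ** delay_days
    return p_compromise * breach_cost


def patch_now(p_break: float, downtime_cost: float,
              p_exploit_per_day: float, delay_days: int,
              breach_cost: float) -> bool:
    """True when patching immediately has the lower expected cost."""
    return (expected_patch_cost(p_break, downtime_cost)
            < expected_delay_cost(p_exploit_per_day, delay_days, breach_cost))


# Hypothetical inputs: a coin-flip chance of breakage, a 90-day delay,
# and made-up cost figures. Only the exploit probability is varied.
common = dict(p_break=0.5, downtime_cost=20_000,
              delay_days=90, breach_cost=1_000_000)

for p in (0.001, 0.0001):
    decision = "patch now" if patch_now(p_exploit_per_day=p, **common) else "wait"
    print(f"daily exploit probability {p}: {decision}")
```

With these made-up inputs, a tenfold change in the assumed exploit probability flips the decision. This is the sense in which the certain cost of patching tends to win out over the uncertain cost of delay.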
The way out of this catch-22 is to radically change the risk governance of patching. That is the objective of our proposed THESEUS project. While major advances have been made with automatic updating in the consumer space, these have not translated, and more importantly cannot be translated, to the enterprise space. Changing the risk of patching for enterprises requires breakthroughs at three interdependent levels:
- Systems: drastically reducing risk of patching via new techniques in automatic vulnerability and patch triaging, as well as automatic patch generation with live update for cases where critical patches pose unacceptable availability risks.
- Enterprises: more efficiently quantifying risk of patching by assessing and aggregating the results of the patch triaging as a way to estimate exploit likelihood in a coherent picture that accounts for different attacker models and functional impact.
- Governance: more effectively managing risks of patching by introducing incentive mechanisms via notifications and information sharing, sector-wide benchmarks of patching speed, and potentially legal instruments.
Our proposed project sets out to (1) bring advances from the lab to real-world settings by working with a consortium of partners that contribute people, data, and pilots to the project; and (2) replace the infeasible and counterproductive idea of mandatory patching with a larger and more sophisticated set of governance interventions across different levels: system, enterprise, and sector. Our target is that enterprises can have a secure environment within which people can go about their work, without obstruction or the need for the organization to carry intractable risks. Progress in this area not only helps to thwart major impacts, but also generates benefits, for example in terms of making it easier and more secure to network within supply chain systems. Furthermore, it generates benefits for security providers, also present among our partners, who can market solutions to enlarge their client base.
[1] Moore, A. (2017). Focus on the Biggest Security Threats, Not the Most Publicized. At: https://www.gartner.com/smarterwithgartner/focus-on-the-biggest-security-threats-not-the-most-publicized.
[2] Van Boheemen, P., Munnichs, G., Kool, L., & Hamer, J. (2020). Cyberweerbaar met nieuwe technologie: Kans en noodzaak van digitale innovatie. At: https://www.rathenau.nl/nl/digitale-samenleving/cyberweerbaar-met-nieuwe-technologie.
[3] Fruhlinger, J. (2020). Equifax data breach FAQ: What happened, who was affected, what was the impact? CSOonline. At: https://www.csoonline.com/article/3444488/equifax-data-breach-faq-what-happened-who-was-affected-what-was-the-impact.html.
[4] Durumeric, Z., Li, F., Kasten, J., Amann, J., Beekman, J., Payer, M., … & Halderman, J. A. (2014). The matter of Heartbleed. In Proceedings of the 14th Internet Measurement Conference (IMC) (pp. 475-488).
[5] https://www.onderzoeksraad.nl/en/page/17171/beveiligingslek-citrix
[6] Allodi, L., & Massacci, F. (2014). Comparing Vulnerability Severity and Exploits Using Case-Control Studies. ACM Transactions on Systems Security. Vol.1. Also in industry: How CVSS is DOSsing your patching policy (and wasting your money). Black Hat USA 2013.
[7] Dashevskyi, S., Brucker, A. D., & Massacci, F. (2018). A screening test for disclosed vulnerabilities in FOSS components. IEEE Transactions on Software Engineering, 45(10), 945-966.
[8] Huang, J., Borges, N., Bugiel, S., & Backes, M. (2019). Up-To-Crash: Evaluating Third-Party Library Updatability on Android. In Proceedings of the 4th IEEE European Symposium on Security and Privacy (EuroS&P).
[9] Ioannidis, C., Pym, D., & Williams, J. (2012). Information security trade-offs and optimal patching policies. European Journal of Operational Research, 216(2), 434-444.
[10] PricewaterhouseCoopers (PwC) (2020). Managing the impact of COVID-19 on cyber security. March 2020. At: https://www.pwc.co.uk/cyber-security/pdf/impact-of-covid-19-on-cyber-security.pdf