Understanding the Meltdown and Spectre vulnerabilities
The new vulnerabilities are in microchip processors and affect just about every modern-day computing device, including desktops, cloud providers, and cell phones. One of the vulnerabilities (Spectre) can in theory be used remotely and potentially allow an attacker to access sensitive information. Spectre, while lower risk, is going to be a challenge to fix and will be problematic for the foreseeable future. Meltdown can be fixed by software updates and, despite a low CVSS score, is considered critical by many and poses a real risk specifically to cloud environments. Should you rush to do something in order to fix it? The answer is not so clear.
Many cloud vendors have already implemented patches without you even knowing it. Some software fixes that were quickly published by impacted vendors are rumored to have negative side effects. These include CPU slowdowns of potentially up to 30%, and some are more severe, with new Microsoft patches supposedly breaking systems and certain antivirus software even causing the dreaded BSOD (Blue Screen of Death). Reports and details have changed substantially over the past 36 hours since the initial media coverage. Intel has suggested they have “made significant progress” in rolling out security patches and firmware updates to protect against the issue. While more patching will be required for organizations to truly protect themselves, the truth is that this event is still unfolding, and not all of the information needed to make proper risk management decisions for your organization is available at this point.
The vulnerability details of Meltdown and Spectre
Google indicates that “every Intel processor which implements out-of-order execution is potentially affected, which is effectively every processor since 1995 (except Intel Itanium and Intel Atom before 2013).” They successfully tested Meltdown on Intel processor generations released as early as 2011 and note that it is unclear whether ARM and AMD processors are also affected by Meltdown. This attack is independent of the operating system, and it does not rely on any software vulnerabilities. On a press conference phone call on January 3rd, Intel emphasized that this is not a “procedural flaw or bug”; rather, it is a side-channel attack, while some news articles describe this as a “fundamental chip/processor design” flaw. As such, it is unknown if this can be patched by Intel; thus, software vendors are creating patches to work around the issues.
We published an initial entry in our VulnDB service on January 3rd that covered what was rumored to be one vulnerability based on existing information. We were holding off until more details became available because the initial information wasn’t actionable (e.g. there wasn’t certainty if it was a processor or OS vuln, which vendors were impacted, etc.) but ultimately published without as much detail as we would have liked due to the increased media attention. Within a couple of hours of publishing that entry, Google broke the January 9 embargo (more on this point later) and published considerable details. Once we had more detail, we ended up abstracting to create a total of three entries that have been updated several dozen times since. We continue to monitor the news, social media, and researcher offerings to ensure our coverage is accurate and timely.
The rush to patch
While Meltdown and Spectre were fully disclosed on January 3, 2018, many companies had been working behind the scenes to develop patches for their software and services. With the lead time before disclosure, the goal was to proactively patch services in a way that wouldn’t disrupt operations, and for software vendors to have patches ready for customers. One critical aspect of patch development is testing; every patch has to be tested for all supported cloud instances, operating systems, web browsers, and other software impacted. With the disclosure happening six days before planned, some of the patches seem incomplete and problematic. Despite some vendors having months to develop and test the patches, they didn’t work out so well for customers. In no particular order, a sampling of the problems these patches have caused:
- Symantec Endpoint customers may experience Blue-Screen Of Death (BSOD) after patching. Customers will have to wait a full day for a second patch. [Source]
- The Register reports that Azure cloud customers are similarly experiencing issues after Microsoft’s patch. [Source]
- Microsoft is warning customers that their patch may conflict with anti-virus software and require the anti-virus vendor to set a specific registry key. [Source] For those curious if you are impacted, Kevin Beaumont is maintaining a spreadsheet listing each anti-virus vendor’s disposition and/or response.
- There is a lot of speculation, along with some testing, showing that patches to mitigate the flaws may slow down computers by up to 30%. [Source] Intel disputes this, saying that over time the effect will be negligible.
- Many Amazon AWS customers are reporting significant CPU performance drops after the patches. [Source]
While patches are pushed out, end-users can look at workarounds to help mitigate the issue. For example, users of the Chrome browser can follow simple steps to enable an experimental feature that should stop such attacks until the next version of Chrome is released, with additional mitigations built in. Users of Microsoft IE / Edge can install the patches released by Microsoft last night. Users of Firefox are not as fortunate, as they have to wait until the release of the next version which is said to have mitigations in place.
Attribution and collisions
When word of these vulnerabilities broke in the news, largely on January 2, 2018, we didn’t know it was even more than one vulnerability. Further, we didn’t know who had discovered the vulnerabilities which is interesting. Since there was already word that an embargo was in place, it was pretty clear that security researcher(s) had discovered the issue and were coordinating with vendor(s). This is a much better alternative to a major flaw being discovered while being exploited in the wild by criminals. Once the details were released on January 3rd, the case of discovery became even more interesting. The vulnerability dubbed ‘Meltdown’ was independently discovered and reported by Jann Horn of Google Project Zero, Werner Haas and Thomas Prescher from Cyberus Technology, and a team of four researchers from Graz University of Technology in Austria.
The vulnerabilities dubbed ‘Spectre’ were independently discovered by Jann Horn of Google Project Zero and a group of five researchers from academia and a commercial company. On the surface, three separate researchers, or groups of researchers, finding the same vulnerability may seem incredible. However, as we have written about in the past, vulnerability rediscovery or vulnerability collisions are actually a lot more common than many realize. To us, the interesting part about this collision is that the discovery came from three relatively different areas: a researcher with a large service company that was given leeway to find vulnerabilities, a commercial company that specializes in cryptography, and an academic research group. This shows that a wide variety of cybersecurity researchers can be interested in the same type of research and ultimately find the same issues.
Last, one aspect we haven’t seen reported, that we find particularly interesting, is that the group of researchers (largely from academia) did this work under several government grants. According to their footnotes, their work was “supported in part by NSF awards #1514261 and #1652259, financial assistance award 70NANB15H328 from the U.S. Department of Commerce, National Institute of Standards and Technology, the 2017-2018 Rothschild Postdoctoral Fellowship, and the Defense Advanced Research Project Agency (DARPA) under Contract #FA8650-16-C-7622.” In short, we can thank the U.S. government for funding the research that led one group to discover these vulnerabilities, while not helping to coordinate the disclosure.
Exploiting Meltdown and Spectre
One more thing to ponder on the back of exploitation and vulnerability rediscovery is that we can’t assume the three parties are the only ones to have discovered these vulnerabilities. There may be several other parties that discovered this, and potentially a lot sooner, but did not reveal that fact. Governments and so-called “nation-state actors”, as well as criminal organizations that use computer crime as their business model, have the expertise to find such vulnerabilities and no reason to share the details once found. So if bad actors had discovered this vulnerability and been using it all along, would we know? The boilerplate disclaimer we frequently see, à la “Microsoft has not received any information to indicate that these vulnerabilities have been used to attack customers at this time”, doesn’t carry much weight in many cases. As the Meltdown site says:
Does the “cloud” mitigate risk?
Since “the cloud” became a household term many years back, the value of outsourcing services and infrastructure to the cloud has been hotly debated. There are clearly merits to using cloud-based services for businesses as well as for some aspects of your personal life (e.g. email). In the case of Meltdown and Spectre? There are advantages and disadvantages! On the upside, the companies that manage the cloud infrastructure are responsible for patching the vulnerabilities for you, and the big players were also part of the coordinated disclosure. That means by the time you learned of the vulnerability, a significant portion of major cloud providers had already mitigated the vulnerability or were close to completing the rollout of the patches.
On the downside? Your entire cloud infrastructure was just as vulnerable as your on-premises computers. Even worse? One of the two named vulnerabilities, Meltdown, poses a much more serious risk to cloud environments where a single host can be shared among multiple companies. Meltdown, despite being “only local”, allows someone with access to a system to potentially disclose arbitrary portions of the system’s memory. So a user with legitimate access to an Amazon AWS instance could in theory use this to disclose memory that contains sensitive information from a different company whose instance shares the same host.
With a well-designed exploit, an individual could rent an AWS instance for a few hours and potentially steal a wide variety of sensitive information, including customer details, passwords, and more. As always, using cloud resources has tradeoffs, and this is one more reminder of what should be in the ‘risk’ column when deciding.
The disclosure of Meltdown and Spectre
At some point in the week leading up to this article, rumors of a “big Intel vuln” started circulating. It is difficult to pinpoint where the rumors started, but there were signs of it well before the news articles started coverage. On November 17, 2017, and again on November 30, posts to the linux-arm-kernel mail list discussed large proposed patches to the Linux Kernel, referring to them as the ‘KAISER’ patches and linking to the associated paper. On December 4, a post to the unofficial Linux Kernel Mailing List discussed similar patches, calling them “Kernel Page Table Isolation (was KAISER)”.
The referenced paper, “KASLR is Dead: Long Live KASLR”, was published June 24, 2017 by Daniel Gruss et al. In hindsight, Gruss et al were one of the three parties to discover the vulnerabilities that would become known as Meltdown and Spectre. Despite all of that information being public, it wasn’t until January 2018 that the full impact was realized. Jump to December 14, when Amazon AWS customers were notified via email of an upcoming wave of reboots on January 5 to fix a security issue.
That and other hints of something big to come were made clear in a January 2, 2018 article by The Register. At the time, the article was comprehensive, well-written, and did a good job of explaining what the community collectively knew. Unfortunately, a day later that article became largely obsolete when Google, Intel, and Gruss et al broke the planned January 9, 2018 embargo on the vulnerability information. By this point, there were a dozen or more news articles and thousands of Tweets speculating about the “big Intel vulnerability”. Intel broke the embargo due to “inaccurate media reports” while Google broke the embargo due to “growing speculation”.
With the details released, we realized that there were two major vulnerabilities, Meltdown and Spectre, and that Spectre was itself two separate variants, giving us three distinct vulnerabilities. We subsequently learned that in the months leading up to the disclosure, Google had started patching their infrastructure in early 2017, a “majority” of Microsoft’s Azure cloud offering had been patched, macOS Server had been patched quietly as of version 10.13.2 (released December 21, 2017), and Microsoft’s “Windows Insiders” had received preliminary patches for these vulnerabilities in November. The rest of the Windows world had to wait until January 3, 2018 for the patches. Note that according to the Verge, the patches were to include only the “Intel vulns” for Windows 10. In reality, Microsoft decided to release their full “Patch Tuesday” set of patches, including the “Intel vulns” and 33 additional issues, six days ahead of schedule.
Meanwhile, more mainstream news outlets like the New York Times were covering the vulnerabilities, albeit with some errors. By mid-day January 4, 2018, many major vendors had released advisories with varying states of mitigation, including VMware, Citrix, Red Hat, Microsoft, and more, with some vendors still evaluating the impact. Those relying on CVE/NVD for vulnerability intelligence got a partial win. Unlike with some prior named vulnerabilities, MITRE opened up the three CVE IDs for these issues (CVE-2017-5754, CVE-2017-5753, CVE-2017-5715) in a timely fashion (January 3rd), but didn’t even include the names of the vulnerabilities in each entry. The sparse descriptions also give no real insight into the nature and real severity of the vulnerabilities. Meanwhile, NVD still hasn’t opened up its entries, let alone generated CVSS scores or CPE data. Unfortunately, MITRE did not offer the CNAs any guidance on CVE assignments either, prompting one CNA to consult Flashpoint for advice on assigning for their products.
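Since the CVE entries themselves do not carry the names, a small lookup can help when cross-referencing advisories. The sketch below is illustrative only: the mapping reflects the variant numbering commonly used in public coverage of these issues, and the `CVE_NAMES` table and `describe` helper are hypothetical names introduced here, not part of any CVE or VulnDB tooling.

```python
# Illustrative lookup from CVE ID to the publicly used name and variant.
# The table and helper names are hypothetical, introduced for this sketch.
CVE_NAMES = {
    "CVE-2017-5753": ("Spectre", "variant 1, bounds check bypass"),
    "CVE-2017-5715": ("Spectre", "variant 2, branch target injection"),
    "CVE-2017-5754": ("Meltdown", "variant 3, rogue data cache load"),
}


def describe(cve_id: str) -> str:
    """Return a short label for a CVE ID, or mark it as unknown."""
    normalized = cve_id.strip().upper()
    entry = CVE_NAMES.get(normalized)
    if entry is None:
        return f"{normalized}: unknown"
    name, detail = entry
    return f"{normalized}: {name} ({detail})"


if __name__ == "__main__":
    for cve in CVE_NAMES:
        print(describe(cve))
```

Normalizing the ID before the lookup means advisories that write the identifier in lowercase still match.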
Multi-vendor disclosures, especially to this scale, are tricky to say the least. Coordinating patches and disclosure across dozens of vendors, while leaving hundreds more out, will always be messy. In this case, it isn’t clear when the coordination between the various researchers, Google, AMD, and ARM began, but it seems the embargo lasted for several months. Ultimately, the information was disclosed six days shy of the targeted disclosure date (January 9, 2018). Regardless of the fallout, this is yet another clear reminder of the benefits and pitfalls of large-scale embargos.
The slow burn of Meltdown and Spectre: Exploits, lawsuits, and perspective
In this update, we continue to examine higher-level aspects of the disclosures a week after the initial fallout.
Patch and performance
Perhaps a bigger news story than the vulnerabilities themselves was the set of reports saying that patches to mitigate these flaws may slow down processors by up to 30%. At the same time, Intel claimed that “performance impact of these updates is highly workload-dependent and, for the average computer user, should not be significant”. Now that administrators have had time to monitor systems after various vendor patches, we’re seeing a more accurate picture of the impacts.
With many organizations using cloud services for a significant part of their public-facing offerings, many have been especially concerned about the patches’ potential hit to performance, which could increase monthly costs dramatically. Ian Chan shared a graph showing a spike after the patches were presumably deployed to their AWS EC2 instances, with increases ranging from 5 to 20%. Amanpreet Singh shared a graph showing his Redis / ElastiCache install on AWS Managed Services, with a similar look. Similarly, Ruben Berenguel noticed spikes but points out that in addition to the slowdown, the costs for the services will go up approximately 10% for the month. This is one aspect of patching in the cloud that many organizations have overlooked. In his case, they are noticing new instances spawning because the current ones can’t handle the workload. Companies that use the cloud for public-facing services, such as Epic Games, may face login issues and service instability after the patches.
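To make the link between a CPU overhead and a bigger bill concrete, here is a minimal back-of-the-envelope sketch. It assumes a fleet that scales out to keep average CPU utilization under a target, and every number in it is hypothetical; it does not reproduce any of the figures reported above.

```python
import math


def instances_needed(baseline_instances: int, baseline_cpu: float,
                     overhead: float, target_cpu: float) -> int:
    """Estimate the fleet size needed after a patch adds CPU overhead.

    baseline_cpu and target_cpu are utilization fractions between 0 and 1.
    overhead is the relative increase in CPU work (0.20 means 20% more).
    The fleet is assumed never to shrink below its current size.
    """
    total_load = baseline_instances * baseline_cpu * (1.0 + overhead)
    return max(baseline_instances, math.ceil(total_load / target_cpu))


def extra_cost_fraction(baseline_instances: int, new_instances: int) -> float:
    """Relative cost increase, assuming cost scales with instance count."""
    return (new_instances - baseline_instances) / baseline_instances


if __name__ == "__main__":
    # Hypothetical fleet: 10 instances at 60% CPU, scaling target of 70%.
    after = instances_needed(10, 0.60, 0.20, 0.70)
    print(after, f"{extra_cost_fraction(10, after):.0%}")
```

The point of the sketch is that the cost impact is a step function: a small overhead can be absorbed by existing headroom, while a larger one forces whole new instances to spawn.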
For traditional on-premises servers and home machines, the initial benchmarks and testing are also showing significant spikes. Peter Czanik mentioned increased compile times on his Fedora system, where syslog_ng went from a 4 minute compile time to a 21 minute compile time. He also mentions that he believes compiling Java is the most affected. The Guru of 3D blog did extensive testing and benchmarks on a Windows 10 machine. Their results show that while there are noticeable changes, some aspects of the system rumored to be impacted (e.g. memory performance) are not showing any real difference, while other aspects like file I/O and disk performance are taking a hit. They go on to test browser performance, and the results indicate minimal changes even under heavy usage, while gaming / GPU performance shows negligible change before and after the patch.
There are mixed results for individual users doing cryptocurrency mining. One particular Reddit thread has users suggesting there could be significant performance hits, while others do not see the same impact. Regardless, even a small performance hit on a mining system could have serious financial implications and in some cases make some systems unprofitable (e.g. electricity consumption costs exceed currency output). The thread also reminds users to think about threat modeling and consider if they even need the patches, while others say that configuration options could help offset the post-patch performance hits. Kirk Pepperdine has a great summary about PCID (Process-Context Identifiers) functionality and how that may help minimize patch performance issues.
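For readers who want to see whether a Linux machine advertises PCID, the following is a minimal sketch, assuming an x86 system where `/proc/cpuinfo` lists CPU features on a `flags` line. The helper names are hypothetical, and a CPU advertising the flag does not by itself guarantee that a given patched kernel makes use of it.

```python
def cpu_flags(cpuinfo_text: str) -> set:
    """Collect CPU feature flags from /proc/cpuinfo-style text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # Only lines of the form "flags : a b c" carry the feature list.
        if sep and key.strip() == "flags":
            flags.update(value.split())
    return flags


def has_pcid(cpuinfo_text: str) -> bool:
    """True if the exact 'pcid' flag appears among the CPU flags."""
    return "pcid" in cpu_flags(cpuinfo_text)
```

On a Linux host this could be called as `has_pcid(open("/proc/cpuinfo").read())`. Matching whole tokens rather than substrings avoids a false positive from related flags such as `invpcid`.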
Finally, as noted in our first blog, the Meltdown patch is still causing serious compatibility issues with some software. The latest is Microsoft’s Meltdown patch, which appears to break the PulseSecure VPN client. Microsoft’s patches seemingly play poorly with AMD-based PCs and have been confirmed as causing performance dips on older versions of Windows.
Exploitation, detection, and mitigation
Coming to grips with Meltdown and Spectre reminds us of a fundamental truth – focusing on how to respond to the vulnerabilities is critical. Understanding the risk from an exploitation angle is frequently first on the mind. The time between a vulnerability disclosure and the time for someone to publish a functional exploit, often referred to as Total Time to Exploit, is one metric we track in VulnDB. Unsurprisingly, the time for working proof-of-concepts / exploits to be released was fairly short, with exploits already being released for both Meltdown and Spectre. As always, when looking for public exploits, it is important that organizations evaluate the code before running it. For example, Jerry Gamblin tweeted about a Meltdown PoC on January 5th, receiving 349 Retweets and 626 likes. We have a feeling very few people actually read the code.
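As a rough illustration of the metric, the sketch below computes the number of days between a disclosure and the first known public exploit. This is an assumption about how such a figure could be derived, not a description of how VulnDB calculates it, and it uses the January 5th tweet date only as a stand-in for when a Meltdown PoC was publicly visible.

```python
from datetime import date


def time_to_exploit(disclosed: date, exploit_published: date) -> int:
    """Days from public disclosure to the first known public exploit.

    A negative result means the exploit was public before the disclosure.
    """
    return (exploit_published - disclosed).days


if __name__ == "__main__":
    # Dates taken from this article: disclosure on January 3, 2018 and a
    # PoC tweeted on January 5, 2018 (an upper bound, not the release date).
    print(time_to_exploit(date(2018, 1, 3), date(2018, 1, 5)))  # 2
```

Because the tweet may trail the actual publication of the code, a value computed this way is best read as an upper bound.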
The detection front has been more interesting, given initial speculation and statements that Meltdown could not be detected and Spectre would depend on how the attack was carried out. Since the disclosure, we’ve seen several tools that claim they can assist in detecting exploitation such as a set of Snort rules developed by the Talos team, a blog about detecting Meltdown using Capsule8, and an online Spectre detection utility from Tencent. In addition to detecting an attack, utilities are being released to verify you have the mitigations in place such as the SpecuCheck tool for Windows and exploit sample tracking via Hybrid-Analysis.
Finally, Intel has released a comprehensive analysis of the vulnerabilities in a paper titled “Intel Analysis of Speculative Execution Side Channels” (PDF). Perhaps more interesting is a detailed Twitter thread by Joe Fitz, a former Intel engineer, on why fixing these vulnerabilities at the chip level is so difficult. This may give perspective on why, despite knowing about the issues for six or more months, an Intel patch wasn’t immediately ready prior to the disclosure.
One aspect of the Meltdown and Spectre disclosures that has been fairly unique is the lawsuits. It’s typical to see a multitude of consumer lawsuits filed in the days immediately following a data breach announcement, but it’s rare to see legal activity arising out of vulnerability disclosures. Within days of disclosure, Intel found themselves facing the beginning of at least three separate class-action lawsuits. Based on discussions with more legally-knowledgeable associates, Flashpoint offers some general commentary for consideration about the merits of the three lawsuits in California, Oregon, and Indiana. We are not lawyers and this is not legal advice, just speculative commentary.
The cases out of Indiana and Oregon are primarily based on claims of state-level unfair or deceptive practices. These state-level unfair trade practices acts are often called the “Little FTC Acts.” In the Indiana case, the plaintiffs are also arguing a breach of implied warranty claim, which may offer an opportunity for the court to expand contract law to recognize that every contract includes an implicit promise of security. While we are not familiar with cases that include an implied warranty of the security of a product, legal scholars have advocated this approach as security becomes more of a concern to consumers. If Intel made no explicit promises of security to computer manufacturers and their forthcoming solution does not significantly impact the speed of processors on the market, Intel may find themselves victorious. Conversely, if they are found to have made explicit promises of security to manufacturers and especially consumers, or their ultimate solution significantly degrades the performance of processors already deployed, a case against Intel may have merit.
Legally, the case out of California is perhaps the most interesting, the most aggressive, and on the surface, the most likely to win of the three cases. This case brings up unfair competition as well as Little FTC Act claims with the most robust pleadings of the three. Further, the case points out a refusal to recall vulnerable processors and digs into the contracts behind it. If this case proceeds based on the available information, we may see a settlement with Intel sooner rather than later. Ultimately, Intel may opt to settle all of these cases and more, depending on the number of lawsuits brought against them. Settling and offering coupons for upgraded processors may be more cost effective than going to court, given the court costs, the potential of losing cases and having to pay on the class-action suits, and any fines they may also face from regulators (e.g. the Federal Trade Commission (FTC)).
Over the next several years, these three cases and any subsequent cases will be of interest to hardware and software vendors as security becomes more of an expectation and as vendors’ claims of security are examined more thoroughly in the courts. As Michael Scott argues in his paper titled “Tort Liability for Vendors of Insecure Software: Has the Time Finally Come?” from August 2007, “tort law can provide an ideal mechanism for enforcing the reasonable expectations of software licensees and users, particularly in the area of software intended to secure computer systems and networks.”
Disclosure history addendum
We already gave a very brief synopsis of the disclosure of Meltdown and Spectre, pointing out some prior works that were warning signs of what was to come. Building on that thread, several more bits have come to light that make the history of these issues more interesting. In no particular order:
- Simha Sethumadhavan Tweeted that he gave a presentation to Intel around six years ago on the “time tools and techniques to detect and mitigate microarchitectural side channels (Side Channel Vulnerability Factor measurement and method, and the TimeWarp mitigation from ISCA12).”
- Joanna Rutkowska and Rafał Wojtczuk did research in 2010 along these lines, but didn’t publish since it was under NDA and they did not have a working attack.
- In 2014, Immunity, Inc. wrote a paper with a synopsis of “Intel CPU information leak” that discusses similar issues.
- In August 2017, Intel SGX for Linux was found to be vulnerable to cache-timing attacks that disclosed privileged information to a local attacker, just like the latest issues. That issue didn’t even receive a CVE assignment.
- Similar research was done years back on an Xbox 360, but not published until now.
Anders Fogh published a blog titled “Behind the scenes of a bug collision” that gives some of his history and speculation on how so many parties found the same issue. He also cites prior work going back to 2005 that likely helped set the stage for this. Andy Greenberg of Wired wrote an article covering the triple-discovery as well, with some additional details about how it unfolded. These examples show that the idea and legwork were there many years ago, yet the full vulnerabilities weren’t discovered and made public until a week ago. This is a good reminder of the state of vulnerability research and that there are hundreds of highly-skilled researchers out there, all capable of finding such issues.
Since the disclosure of Meltdown and Spectre, Intel has faced some serious criticism and even lawsuits as mentioned previously. On one hand, as noted above, fixing this vulnerability is not simple by any means. On the other hand, the prior research leading up to this disclosure can be argued as a huge warning sign to Intel and that perhaps they should have been more cognizant of the threat and impact. While many are focusing on this one disclosure and using it to make more sweeping statements about vulnerability collisions, others are having to point out why such statements are incorrect. With that, here are some thoughts and questions that are central to this disclosure.
Could Intel have predicted or prevented these vulnerabilities? Predict, most likely yes, but prevent is still to be determined. Side-channel attacks against cryptography are fairly common, and cryptography algorithms are supposed to be resilient against such attacks. Yet those don’t seem to make the news with the same furor as the recent issues, even when that algorithm may be used in widely deployed or sensitive systems. Is Intel diligent in their handling of security vulnerabilities? Yes. They maintain an extensive set of pages on their website with security advisories, work with researchers, and frequently answer third-party questions asking for clarity on their disclosures. With AMD and Arm finding themselves vulnerable to a degree, it raises the question of whether they offer the same security resources and diligence. But so far, there seems to be very little finger-pointing in their direction. Granted, Intel could sometimes benefit from improving their public response to such issues (note the ITWire article calls F00F a remote issue, but public information around the time of its disclosure suggests otherwise).
Vulnerabilities in processors and related functionality happen more than you think. There were 33 vulnerabilities in Intel processors/software in 2017 alone. In fact, it could be easily argued that at least one of them was far more significant than Meltdown or Spectre, especially given all of the vendors impacted. In the middle of Meltdown and Spectre making the news, two days before technical details were published, Cfir Cohen of the Google Cloud Security Team disclosed a vulnerability in AMD’s PSP. AMD describes it as “a dedicated processor that features ARM TrustZone® technology, along with a software-based Trusted Execution Environment (TEE) designed to enable third-party trusted applications.” Cohen’s vulnerability allows for unauthenticated remote code execution due to a buffer overflow. Yet somehow, this was missed by most and didn’t enjoy the same mainstream press while likely being a bigger risk to AMD users.
Vulnerabilities are disclosed every day, to the tune of over 20,000 new disclosures in 2017 alone. Just because a vulnerability receives a name, a website, and/or a marketing campaign does not necessarily mean it is high risk or that it will impact your organization. As always, we strongly encourage organizations to cut through the noise and focus on the details relevant to them, and make a decision based on that alone.
In addition to the usual news articles covering these disclosures, there are many others that are pointing out interesting aspects:
- Alasdair Allan brings up a good point about who may be impacted by this vulnerability more than others. He notes that cryptocurrency exchanges may be particularly targeted, as the private keys to wallets would be of high value.
- Daniel Gruss et al, researchers behind the “KASLR is Dead: Long Live KASLR” paper, submitted their research to BlackHat Briefings, but were rejected. Sometimes research that may be fairly critical doesn’t seem appealing to conferences.
- Richard Grisenthwaite from Arm Limited published a whitepaper titled “Cache Speculation Side-channels” covering the susceptibility of Arm implementations.
- Robert O’Callahan points out the continuing problems with the disclosure of Meltdown and Spectre, some of which we have noted in vendor advisories and commentary about the vulnerabilities (primarily around Spectre since it is multiple issues). Alex Ionescu humorously summarizes this confusion.
- Artturi Lehtiö Tweeted that IBM mainframes are vulnerable to these issues. IBM is still investigating per their blog.
- Shawn Webb gives his perspective on these vulnerabilities from a FreeBSD and HardenedBSD slant.
- Hector Martin has created a GitHub repo to collect documentation and resources about the vulnerabilities and invites users to share by contributing pull requests.