By now most people have seen the three Bloomberg articles detailing the alleged conspiracy to install back-doors, via a tiny microchip, on servers assembled by SuperMicro. There are plenty of great takes already (1, 2, 3, 4, 5, 6, 7). Supply chain attacks are not new, nor are those using hardware implants. But this is high profile, alleged to be government perpetrated, and affects a large number of companies worldwide.
Several questions need to be answered here:
1) Is it possible from a purely technical standpoint?
2) Did it actually happen?
3) What do we do about it?
These articles make several interesting claims:
* A tiny microchip (size of a grain of rice) capable of installing remote back-doors
* Malicious hardware being installed at the SuperMicro factory
* Enabled network connectivity to a command and control server
* Network card firmware updates were backdoored in 2015
* Network connectors were found that contain hardware implants
Technical Feasibility
There are simply not enough technical details in the article to forensically disassemble how such an attack occurred. So, let’s turn this around: How would I implement such an attack from a hardware/firmware/software perspective?
Baseboard Management Controller
These are server-grade motherboards. Most such systems have a dedicated Baseboard Management Controller (BMC) on board, which is a secondary, dedicated computer for server management. This controller gives the administrators the capability to remotely write to the hard disks to install operating systems, upgrade the system BIOS, turn the power on and off, monitor health metrics such as temperatures, and view the video screen remotely just as though they were sitting at the console. It is well connected within the host system, sitting on the Peripheral Component Interconnect Express (PCIe) and Low Pin Count (LPC) buses and has direct connections to the network interface. This allows straightforward pivots from the BMC to the host system by either directly issuing DMA transactions over PCIe, or by first compromising another PCIe device such as a network card. In short, the BMC is the most privileged security domain in the system, and a compromise here would allow a compromise of any other part of the system, including the BIOS, hypervisor, and operating system.
BMCs come in all shapes, sizes, and brands. SuperMicro uses a very common BMC solution from ASPEED, HP has their iLO, Dell has their iDRAC, Facebook has created OpenBMC, and many companies still use their own custom solutions with in-house designed servers. All of these solutions have had vulnerabilities in the past, and IPMI itself has long been known to be a security disaster in both design and implementations.
Figure 1: BMC architectures for both ASPEED and HP
A BMC will typically load its firmware from an external flash memory device. To save board space, small serial flash memories are a common choice. These are small devices, often in a SOIC-8 package. On a secure embedded system, the images that are loaded from external flash would be cryptographically validated before they are used. Regrettably, not all BMC solutions implement such a Secure Boot process. HP introduced this with iLO5 and earlier versions are vulnerable. Dell iDRAC6 appears to only use checksums for integrity protection. ASPEED does not implement Secure Boot either.
Attacking the BMC
Without Secure Boot stopping us, the easiest attack would be to simply modify the BMC firmware in flash. No hardware implant is needed. But (a) the main article clearly states there was a hardware device, (b) this attack can be detected in an audit; pulling the flash chip and comparing its contents to a known-good copy would be easy to do, and (c) it does not work against the most modern devices that are implementing Secure Boot, so it lacks generality. Bloomberg’s second article does indicate that this sort of attack might have been attempted, and it is possible that it was just an earlier iteration before the attackers evolved into more advanced hardware implants.
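The audit mentioned in (b) can be sketched in a few lines. This is a minimal illustration, assuming you already have a raw dump of the flash chip and a known-good vendor image as byte strings; the function names and the synthetic images are hypothetical and stand in for real firmware.

```python
import hashlib


def sha256_of(image: bytes) -> str:
    """Hex digest of a firmware image."""
    return hashlib.sha256(image).hexdigest()


def first_difference(dump: bytes, reference: bytes):
    """Offset of the first differing byte, or None if the images are identical."""
    for offset, (a, b) in enumerate(zip(dump, reference)):
        if a != b:
            return offset
    if len(dump) != len(reference):
        # One image is a strict prefix of the other.
        return min(len(dump), len(reference))
    return None


def audit_flash(dump: bytes, reference: bytes) -> dict:
    """Compare a flash dump against a known-good copy."""
    return {
        "match": sha256_of(dump) == sha256_of(reference),
        "first_diff_offset": first_difference(dump, reference),
    }


# Synthetic stand-ins for a real dump and a real vendor image (assumption).
reference = bytes(range(256)) * 16
tampered = bytearray(reference)
tampered[0x123] ^= 0xFF  # simulate a single modified byte

clean_report = audit_flash(reference, reference)
tampered_report = audit_flash(bytes(tampered), reference)
```

A check like this only sees what is stored in the flash chip itself, which is exactly why it catches a modified image but not an attack that keeps the malicious data elsewhere.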
Similarly, using outdated firmware with known (or even intentionally crafted) vulnerabilities is a much simpler attack than a hardware implant, and provides a comfortable amount of plausible deniability. With most BMC implementations there are many vulnerabilities to choose from. But this does not get the attacker the remote access that they want. They would first need to compromise their way into the management network, or get lucky and find the end customer has misconfigured their network to expose the BMC to the Internet. This is unlikely to be the case for the large security-mature organizations mentioned in the article.
So, let’s try an interposer. Here I am talking about something like NCC Group’s TPM Genie where we use an inexpensive microcontroller inserted in the data path between the flash memory and the BMC controller. The interposer device will allow most memory transactions to pass through unscathed, but can selectively replace certain data with malicious values. By editing the code and data in this manner, a suitable first-stage payload can be loaded into the BMC. This keeps any malicious data out of the original flash device, so an audit of the flash chip will not detect it. It is still mitigated by Secure Boot however.
Now a common implementation flaw in embedded systems occurs when data is read from flash memory multiple times: once to validate it and again to use it. This is often referred to as a Time-of-Check-Time-of-Use (TOCTOU) issue, a double-read, or more generically a race-condition. All the attacker needs to do is allow the original data through for the first read where it will be validated, and replace the data on the second read after the validation. An interposer device like the one described above is ideal for implementing such an attack, even in the face of Secure Boot. This is not just theoretical; we have implemented tools to perform this attack.
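The double-read flaw can be modeled in software. The following is a toy simulation, not the tooling referred to above: the class and function names are hypothetical, and a SHA-256 comparison stands in for the signature check a real Secure Boot implementation would perform.

```python
import hashlib

GOOD = b"trusted-bmc-firmware"    # what is actually stored in flash
EVIL = b"malicious-first-stage"   # what the interposer wants executed
EXPECTED_DIGEST = hashlib.sha256(GOOD).hexdigest()


class Flash:
    """The real flash chip; its contents are never modified."""
    def __init__(self, contents: bytes):
        self.contents = contents

    def read(self) -> bytes:
        return self.contents


class Interposer:
    """Sits between flash and the BMC. Passes the first read through
    untouched and substitutes its payload on every later read."""
    def __init__(self, flash: Flash, payload: bytes):
        self.flash = flash
        self.payload = payload
        self.reads = 0

    def read(self) -> bytes:
        self.reads += 1
        data = self.flash.read()
        return data if self.reads == 1 else self.payload


def boot_double_read(bus) -> bytes:
    """Vulnerable: validates one read, then executes a second read."""
    if hashlib.sha256(bus.read()).hexdigest() != EXPECTED_DIGEST:
        raise ValueError("validation failed")
    return bus.read()  # time-of-use differs from time-of-check


def boot_single_read(bus) -> bytes:
    """Safer: read once into RAM, validate and execute that same copy."""
    image = bus.read()
    if hashlib.sha256(image).hexdigest() != EXPECTED_DIGEST:
        raise ValueError("validation failed")
    return image


flash = Flash(GOOD)
executed_vulnerable = boot_double_read(Interposer(flash, EVIL))
executed_fixed = boot_single_read(Interposer(flash, EVIL))
flash_still_clean = flash.read() == GOOD
```

In this model the vulnerable loader ends up running the substituted payload while the flash contents remain untouched, so an audit of the chip would still pass. The single-read loader runs only the copy it validated.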
Figure 2: Interposer connections
Aggressive miniaturization is our next challenge. But this is not actually a challenge at all. Years of human ingenuity and efficiency have already solved this. Very capable microcontrollers are available in CSP packages as small as 1.5mm. Such a thing could be hidden nearly anywhere on a board. To be effective though, it needs to be connected in the path of the BMC’s flash memory.
Figure 3: Interposer connections
This may be a case where supply chain logistics works in the attacker’s favor. For hardware makers, it is cheaper to change the specified list of populated components on the PCB than it is to create a whole new PCB for various options and product SKUs. So PCB designers often design in options for different footprints, second-source components, larger memory sizes, etc. The SuperMicro board shown in the original article certainly does have an unpopulated footprint in the vicinity of the BMC. As well, there are a host of resistors and test points, some of which may be useful as attachment points, or to reconfigure the signals to disconnect one memory and connect another.
Figure 4: Interposer connections
This is where the forensic analysis part comes in. If the attacker is modifying an existing design, then they will need to attach the implant using wires that are visible to a trained eye. If, however, they are modifying the board design before it is fabricated in an effort to compromise all boards, then this would largely go undetected by anyone without access to the original design source information.
The next steps are all really straightforward payload development and require only modest amounts of elbow grease. With the BMC suitably exploited, it can use its capabilities to communicate over the network interfaces, write to devices and memory over the PCIe bus, reach out to C&C servers, and download second-stage payloads. These payloads will do the heavier lifting of compromising the host system if the attacker so chooses.
Risks in the supply chain
Outsourcing anything, from an open source software library to purchasing a server from overseas, is a risk. You are offloading control of your system security to another entity, one that may not have the same motivations and appetite for risk as you. Your suppliers may in turn outsource things to yet others. It is turtles all the way down. And when you get nearer the bottom, it is astounding the types of creative shenanigans that can happen. Malware on factory test stations, refurbished components sold as new, cash bribes paid by organized crime to factory workers, etc. What is alleged in the story is completely within the realm of possibility, and in-line with my professional experience when investigating actual factory product security breaches.
But the big difference between what is alleged in the article and what we have seen before is the attacker’s motivation. What is commonplace are things like device theft, counterfeiting, and laundering of illegitimate parts. These activities are perpetrated by those trying to make a profit, frequently with links to organized crime. Government spying is the stuff of movies and conspiracy theories. And if the story is true, it stands to seriously undermine a significant portion of the Chinese manufacturing economy. Not to say that it doesn’t or can’t happen, just that it does not pass muster under Occam’s razor.
Establishing network connections
The BMC is frequently configurable to support either a dedicated management network connection or to piggyback over the host system’s network interface. This allows users flexibility in the deployment, but also makes the attacker’s job of reaching out to the Internet much less predictable. Malware is commonly designed to reach out to a command and control (C&C) server that the attacker controls. This type of rogue traffic is probably the easiest way to detect an attack such as the one described in the article. Any organization with a large robust security program, and certainly all of the victims named in the article, would catch this immediately. Now it is entirely possible that this backdoor operates silently until activated. It could monitor network traffic passively, waiting for a magic string to activate its payload. Researchers demonstrated such an attack in a hard drive controller in 2012. The ability to lie dormant makes detection much more difficult.
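Spotting this kind of rogue traffic can start with something very simple. The sketch below assumes flow records are available as (source, destination) address pairs and that BMCs live on a dedicated management subnet; the subnet, the allowed destination ranges, and the sample flows are all hypothetical values chosen for illustration.

```python
import ipaddress

# Hypothetical addressing plan (assumption, not from the article).
MGMT_NET = ipaddress.ip_network("10.20.0.0/24")       # where the BMCs live
ALLOWED_DESTS = [ipaddress.ip_network("10.0.0.0/8")]  # internal ranges only


def suspicious_flows(flows):
    """Return flows that originate from the management subnet and
    leave the allowed destination ranges."""
    hits = []
    for src, dst in flows:
        src_ip = ipaddress.ip_address(src)
        dst_ip = ipaddress.ip_address(dst)
        from_bmc = src_ip in MGMT_NET
        allowed = any(dst_ip in net for net in ALLOWED_DESTS)
        if from_bmc and not allowed:
            hits.append((src, dst))
    return hits


# Sample flow records using documentation address ranges.
flows = [
    ("10.20.0.15", "10.1.2.3"),      # BMC to internal host: expected
    ("10.20.0.15", "203.0.113.7"),   # BMC to external address: flagged
    ("10.30.0.9", "198.51.100.4"),   # not a BMC: out of scope here
]
alerts = suspicious_flows(flows)
```

A source-address filter like this only works when the BMC uses a dedicated management connection. When it piggybacks on the host's network interface, its traffic has to be separated from the host's by other means, and a dormant implant will produce nothing to flag until it is activated.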
The follow-up article makes additional claims about hardware implants embedded directly within the Ethernet connector itself. At first glance, this is an ideal place to put an implant if you want to be on the network, as you get network access for free. But again, monitoring the network traffic will catch it. The claim here is again not new, as a similar incident was reported on Twitter 6 months ago. But since that time no further proof has been found and reported anywhere. The most likely scenario is that the passive signal conditioning components that are normally present in high speed Ethernet connectors (and documented as such) were mistaken for something more. This new Bloomberg article offers similar claims, again with no technical details that can be impartially verified.
Figure 5: Image from Twitter report
Did it actually happen?
In short, we do not know. The articles do not give any actual technical details, publish no sources, and rely almost entirely on anonymous information. Furthermore, Apple, Amazon, SuperMicro, and the Chinese government have all published scathing denials of the story. The FBI, DHS, CSE, and NCSC, all organizations tasked with identifying this sort of thing, have all publicly stated they are unaware of any such hack. The follow-up article does name a source, who has clarified, but without additional technical details. There is speculation that the authors have conflated this with other stories involving bad firmware updates and common IPMI bugs. They have a history of publishing debunked stories about government hacking, and have clear perverse incentives at play. But the fact remains, nobody who understands the technology is saying that it is not possible, and indeed, we are saying that it IS possible. I think we can defend ourselves while maintaining a healthy amount of skepticism about the exact details.
What next?
First of all, don’t panic. There are enough arguments against the allegations in the article to warrant a healthy skepticism. But let’s recognize that a well-funded attacker with physical access to the hardware will always win. So our job as defenders is to elevate the cost of an attack beyond the point at which an attacker is willing to invest. Defense is not free, and so it becomes a business decision of how high you want to set this threshold. This is part of the normal threat modeling and risk management that you are (hopefully) already doing. One thing is clear: lots of people are looking closely at their servers now, so if there is evidence to come forward, it will do so very soon.
Secondly, if you are building products, then you should continue building them to be more robust against all sorts of attacks, including supply chain attacks. Attackers improve tools and techniques over time, and so must you with your defenses. You should be regularly theorizing how such an attack would play out against your products, and redesigning them with each generation to make this more difficult. Understand your product threat models, and build processes and features that mitigate these threats. Things like Secure Boot, authenticated debug, storage encryption, and transport encryption are table-stakes for embedded device security in 2018. Releasing modern, network-connected devices for use in critical business infrastructure without performing significant security due-diligence is both irresponsible and bad business. Regarding supply chain attacks specifically, here are some things that can help:
- Random product audits: Automated optical and X-ray inspections, precision weight and RF measurements, chemical analysis, and other techniques can help determine if there are rogue implants. Make the attacker work hard to hide their work. This is a cat-and-mouse game of obfuscation, but it is as useful for product quality as it is for security.
- Diversification: Multi-sourcing components with multiple vendors and multiple factories makes you harder for an attacker to predict. Diversification in your shipping logistics can help mitigate interdiction attacks.
- Strong monitoring: Your factory networks, processes, and people are all subject to exploitation. Detection can go a long way, but if you don’t look, you won’t find.
Lastly, if you worry that you might be vulnerable to a breach of this type, then start with the basics. Like most malware, the one described in the article reaches out to the Internet for command and control. This is traffic you should already be looking for, so keep doing that. Never expose your BMC to the Internet, ever. If you want to run simple spot checks against your server hardware or firmware to see if anything has changed, then you can do this too, but know that you are unlikely to detect the type of advanced targeted attack alleged in the article. Put a robust security incident response process in place so that if something does happen, you have a plan and know what to do. If you believe that you are already affected and have identified a server that is “different”, behaving strangely, or generating unknown traffic, then hire a forensics team to investigate in detail. It is NOT recommended that such a detailed forensic hardware investigation be performed proactively, as without actual symptoms to investigate, the reverse-engineering efforts are likely to be inconclusive, and cost-prohibitive.
Outsourcing, at the end of the day, is a business decision. You can rewrite all of your software libraries from scratch in-house, but your product will not ship on time. You can design and build all your servers in-house, but your profit margins will decrease. So, you outsource. You have ways to mitigate these risks using technical tools, contractual tools, legal tools, and reputational tools. Building secure devices in untrusted factories, and supply-chain management in general, are hard problems, but not insurmountable when treated as a risk management problem.