The Microcontroller Unit (MCU) is the heart of an embedded device, where the main firmware executes its instructions to carry out the system’s functions. These come in many varieties. Relatively simple microcontrollers with limited-resource processors may bundle only a few IO peripherals and a small amount of memory, and are intended to run a small real-time operating system (RTOS) or bare metal firmware. More complex System-on-Chip (SoC) devices may contain a dozen or more CPUs (some dedicated to specific functions), a wide assortment of peripheral controllers, and enough horsepower to run Android or a full-blown desktop operating system. Still others may be more exotic: they may use digital signal processors (DSP), leverage reconfigurable logic with the CPU implemented as a soft-core, or use application-specific architectures dedicated to specific workloads, like a neural processor or a GPU. To simplify the discussion, we will use the imprecise term “microcontroller” to refer to all of these. Additionally, microcontrollers are rarely sold in isolation, and most often include an entire Board Support Package (BSP) of development tools and firmware intended by their vendors to help get your product to market as fast as possible.
Understanding the security offered by the BSP and MCU before making a selection can make an enormous difference to the overall security posture of the final product and save expensive design iterations. Additionally, at the time of this writing, the lingering effects of the COVID-19 pandemic, trade disputes, and other factors have produced a semiconductor shortage with lead times of up to 24 months for some parts. This will force some device manufacturers to make hard decisions, rethink which chipsets power their products, and possibly limit their choices to the parts that are actually available. This is a compromise, but one that needs to be well understood. Ideally, an alternative MCU can be found that provides all the security features required by the product. During supply shortages prices inevitably rise, and the price increases are anticipated to incentivize component counterfeiting and recycling. Parts must therefore be purchased from a reputable vendor to avoid spending money on old recycled MCUs, which may be rebranded to appear new.
This article intends to illuminate important security criteria that must be evaluated when choosing the right component for an embedded systems project. It will help inform the questions engineers should ask chip vendors before deciding which is the best microcontroller for their new product.
Security Considerations
Some MCUs cost a fraction of a dollar and therefore can be used in very inexpensive (or even disposable) IoT products. These typically have minimal security features; at most a simple checksum of the firmware may be calculated at device boot to ensure firmware integrity. On the other end of the spectrum, MCUs such as the ones based on ARM Cortex-M33 are powerful microcontrollers with hardware cryptographic accelerators and support for ARM’s TrustZone technology, and application-class SoCs add enough capability to run embedded Linux.
The security requirements of a product must be adjusted based on the environment in which they operate. A device deployed in an environment with additional security controls, such as behind locked doors, may have vastly reduced requirements compared to a portable device that is more likely to be lost or stolen.
Additionally, the requirements must adjust based on the sensitivity of the data processed by the device. Two similarly powered microcontrollers might “do the job”, but the one supporting strong security will be a better option to protect sensitive assets such as private user data, a cryptocurrency wallet, or access to administrative functions affecting other users.
Desired Security Features
Last week, NCC Group published a blog showing the risks embedded devices face. We’d recommend the reader spend a few minutes reviewing it before diving into this section, as it explains the security concerns the OEM vendors and their customers are exposed to if device security is just an afterthought. Take your time, we’ll wait…
Okay… now armed with a good understanding of these risks, let us discuss what kind of security features and properties embedded devices should include to mitigate these risks. As the saying goes, “a chain is no stronger than its weakest link”. Attackers are opportunistic and will find the easiest way to compromise your device.
The security features presented below are not all equal in value. Features such as flash encryption are designed to protect against physical attacks and are therefore very useful for applications such as access control, where an attacker may have access to the badge used to enter a building. The same feature is still useful, but less valuable, for a device such as a consumer-grade router that is often behind locked doors. More valuable are the features that mitigate remote attacks or prevent code modification. Refer to your product threat model when evaluating the importance of each feature.
Firmware Verification
Here we refer to the microcontroller’s ability to validate software before it is executed. Devices supporting Secure Boot will perform a cryptographic validation of the firmware before executing it, thus ensuring that only code signed by the manufacturer (or other authorized party) is allowed to run. Each firmware component in turn validates the subsequent firmware component as it is loaded to ensure that the system only runs authenticated firmware. The foundation of trust must be rooted in hardware, which means the microcontroller must support a hardware Root of Trust.
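To make the chain of trust concrete, here is a minimal sketch of staged verification. It is a deliberate simplification: each stage carries a plain SHA-256 digest of the next stage, whereas real Secure Boot implementations verify signatures with a key anchored in hardware. The names `build_chain` and `boot_chain` are ours, purely for illustration.

```python
import hashlib
import hmac


def sha256(data):
    return hashlib.sha256(data).digest()


def build_chain(images):
    """Return (root_digest, stages); each stage is (image, digest_of_next_stage)."""
    stages = []
    next_digest = b"\x00" * 32  # the last stage has nothing further to verify
    for image in reversed(images):
        stages.append((image, next_digest))
        # The digest covers the image and its embedded expectation for the next stage.
        next_digest = sha256(image + next_digest)
    stages.reverse()
    return next_digest, stages


def boot_chain(root_digest, stages):
    """Return how many stages were allowed to run before a check failed."""
    expected = root_digest  # stands in for the value anchored in hardware
    for index, (image, next_digest) in enumerate(stages):
        if not hmac.compare_digest(sha256(image + next_digest), expected):
            return index  # refuse to execute this stage or anything after it
        expected = next_digest
    return len(stages)
```

Tampering with any stage changes its digest, so the stage before it refuses to hand over control.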
Additionally, vendor firmware should validate firmware updates before writing them to flash. On devices with Secure Boot an invalid update may not be able to compromise the device, but it may brick it. On devices without Secure Boot, however, verifying the firmware updates is critical as it is the only way to ensure legitimate code is running on the device. This is the case for many microcontrollers running from internal flash or EEPROM. There have been many instances in which attackers were able to force the device to retrieve the firmware update from an attacker-controlled location. As such, intruders were able to write their own firmware that set the stage for a remote attack against the users’ local network.
Hardware Root of Trust
A microcontroller is said to have a Root of Trust (RoT) when it starts by running trusted firmware, such as the ROM code which is set at chip manufacturing time, immutable, and therefore implicitly trusted. In some cases, the use of internal flash within the microcontroller can mimic this behavior, provided suitable flash protections are in place. This flash-based bootloader is programmed by the OEM and then permanently write-protected.
One-time-programmable (OTP) fuses are generally used to permanently configure the Secure Boot functionality, by enabling it, and burning the public key (or certificate, or hash of it) that is used to validate the firmware. In this case the OTP fuses are part of the RoT.
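As a rough illustration of how fuses anchor the verification key, the sketch below models a fuse bank that stores the SHA-256 hash of the OEM public key. The `OtpFuses` class and its methods are hypothetical; real parts expose fuses through vendor-specific registers and tools.

```python
import hashlib
import hmac


class OtpFuses:
    """Toy model of a one-time-programmable fuse bank (hypothetical API)."""

    def __init__(self):
        self._key_hash = None
        self._secure_boot = False

    def burn(self, public_key):
        # Fuses can be programmed once; a second attempt must fail.
        if self._key_hash is not None:
            raise PermissionError("fuses already burned")
        self._key_hash = hashlib.sha256(public_key).digest()
        self._secure_boot = True

    def key_is_trusted(self, candidate_key):
        if not self._secure_boot:
            return True  # Secure Boot not enabled: any key (and firmware) is accepted
        digest = hashlib.sha256(candidate_key).digest()
        return hmac.compare_digest(digest, self._key_hash)
```

Storing only a hash keeps the fuse budget small: the full public key travels with the firmware image and is checked against the fused hash before it is used to verify any signature.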
In other words, a device has a Root of Trust if it starts by running code from an internal/immutable memory with an immutable configuration. It represents the foundation for the security layers built on top. If the authenticity of this early loaded software cannot be enforced, all the subsequent security layers can be modified or bypassed.
At the inexpensive end of the spectrum microcontrollers tend not to have this feature, and allow any firmware to run. In this case, a bug in the code that allows writes to flash could persistently compromise the device and give attackers full control over the firmware.
Rollback Protection
This feature protects against attacks in which the attacker may update or replace the firmware with an older version in order to exploit known vulnerabilities in the older version. Since the older vulnerable firmware was legitimately signed by the vendor, the microcontroller will accept and execute the firmware. This defense is primarily of benefit when the user, owner, or administrator is considered a potential attacker. This is the case for example with jailbreaking of game consoles, or when no authorization is needed to perform firmware updates (which is the case for many Android devices).
An effective anti-rollback mechanism typically uses a monotonically increasing counter that represents the lowest version of the firmware that is accepted. This counter must be stored within the microcontroller’s secure storage (commonly OTP fuses or internal flash memory) that itself cannot be rolled back. The rollback check must be performed when the device boots up, but ideally should also be verified before a firmware update. An older version should be detected and disallowed. Some chip vendors offer this feature out of the box and make it relatively straightforward to enable and use by providing reference code.
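The counter logic can be sketched as follows. We assume a unary fuse counter (one fuse bit per version step), which is one common way to build a counter that can only move forward; `RollbackCounter` and `accept_firmware` are illustrative names rather than any vendor's API.

```python
class RollbackCounter:
    """Unary fuse counter: bits can be set but never cleared."""

    def __init__(self, width=32):
        self._width = width
        self._fuses = 0  # bitmask of burned fuses

    @property
    def minimum_version(self):
        return bin(self._fuses).count("1")

    def raise_minimum(self, version):
        if version > self._width:
            raise ValueError("counter exhausted")
        # OR-ing can only add bits, so a lower version never lowers the minimum.
        self._fuses |= (1 << version) - 1


def accept_firmware(counter, version, signature_ok):
    """Accept only correctly signed firmware at or above the minimum version."""
    return signature_ok and version >= counter.minimum_version
```

The same `accept_firmware` check would run at boot and again before an update is written, and the device would call `raise_minimum` only after the new firmware has booted successfully.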
Secure Storage and Data Protection
This section is more relevant for microcontrollers that run from an external flash. Here at NCC Group we routinely remove the flash memory from PCBs to extract credentials such as WiFi passwords and modify the firmware. Unfortunately, many product vendors do not take into consideration this type of attack. Some microcontrollers, such as the ESP32, make it easy to encrypt all external flash by provisioning a random 256-bit AES key at device manufacturing time. Others might encrypt only portions where user data is stored.
Choose a microcontroller that encrypts the flash to increase the difficulty of extracting user secrets or credentials. The encryption key should be unique to each device so that breaking one device will not compromise all similar devices. Recovering a key is usually a costly and time-consuming process, one that may only be worthwhile to an attacker if a successful key recovery compromises thousands of devices.
Additionally, encrypting the code may increase the time required to find vulnerabilities, as attackers first need to defeat the encryption mechanism before they can start analyzing the code. However, it should only be considered an obfuscation mechanism that increases the time until vulnerabilities are discovered. It should not be by any means a substitute for Secure Boot or static code analysis and professional audits of the source code.
Hardware Entropy
If the embedded application requires a source of strong entropy, it is best to select a microcontroller that has a hardware random number generator (HWRNG). Its entropy is derived from physical noise such as thermal or audio noise or, more commonly, the jitter of a ring oscillator. Note that some chip vendors certify their HWRNG implementations, a useful feature if your product also needs to be certified.
Strong entropy must be unpredictable and uniformly distributed. In the past, due to the lack of a strong RNG, developers used the current date or time since boot as seed for pseudo-random number generators (PRNG) to generate secrets. While the output of the PRNG is uniformly distributed, it is not unpredictable. Since the PRNG implementation is not secret, there are limited seeds that an attacker can try in order to replicate the results of the PRNG.
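The following sketch shows why a time-based seed is weak. It assumes a device that seeds Python's general-purpose `random` PRNG with a Unix timestamp; `make_key` and `recover_seed` are illustrative names. An attacker who can bound the boot time only has to try every seed in that window.

```python
import random


def make_key(seed):
    """What a naive device might do: derive a 128-bit key from a time-seeded PRNG."""
    return random.Random(seed).getrandbits(128).to_bytes(16, "big")


def recover_seed(key, earliest, latest):
    """Brute-force every candidate seed in the window the attacker can guess."""
    for seed in range(earliest, latest + 1):
        if make_key(seed) == key:
            return seed
    return None
```

Even a window of a full year is only about 31.5 million candidate seeds at one-second resolution, which is a trivial search.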
If the application requires entropy but it is too late to change the microcontroller in the product, at a minimum a strong random seed can be injected at manufacturing time from an external source. This seed needs to be securely stored to protect its confidentiality. This seed can later be used with a PRNG algorithm to generate random data that can be used for generating keys. The seed should be supplemented periodically with other sources such as battery voltage variations or user-based events, otherwise on each boot the random sequence will be repeated.
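A minimal sketch of the seed-plus-reseed idea follows. It is not a certified DRBG: the construction is our own simplification built from HMAC-SHA-256, and a production device should use a vetted generator from the vendor's library instead. `SeededGenerator` is a hypothetical name.

```python
import hashlib
import hmac


class SeededGenerator:
    """Toy generator keyed by a factory-injected seed (illustration only)."""

    def __init__(self, stored_seed):
        self._state = stored_seed

    def reseed(self, extra):
        """Mix in extra material; the returned state should be persisted."""
        self._state = hashlib.sha256(self._state + extra).digest()
        return self._state

    def random_bytes(self, n):
        out = b""
        while len(out) < n:
            out += hmac.new(self._state, b"output", hashlib.sha256).digest()
            # Advance the state so that successive blocks differ.
            self._state = hmac.new(self._state, b"update", hashlib.sha256).digest()
        return out[:n]
```

Two generators built from the same stored seed produce identical output, which is exactly the repeated-sequence problem; calling `reseed` with a boot counter or measured noise and persisting the result breaks the repetition.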
Choosing a microcontroller that generates poor entropy due to lack of hardware support can turn out to be costly. There are many examples of devices that were compromised due to generation of predictable keys.
Strong Cryptographic Algorithms
Some inexpensive processors do not have any support for cryptographic operations, or perhaps only support obsolete algorithms (MD5, SHA-1, DES/3DES, etc). Others have good support in the BSP, provided by the chip vendor. The more evolved microcontrollers can run cryptographic operations in hardware, with better performance and stronger secret protections.
Several aspects need to be considered. If the application requires access to cryptographic functions, it is better to choose a component for which the vendor already provides an implementation, in software or hardware. Creating your own secret cryptographic algorithms or implementations is always a bad idea and chipset vendors often provide robust open-source cryptography libraries. This also allows the product team to focus on the functionality of the application rather than adding basic features they can get for free.
Cryptographic operations can be CPU-intensive. Therefore, depending on the frequency of use, choose a part that can handle the cryptographic load as well as the functionality of the application.
The firmware verification (that was discussed previously) normally involves hashing the firmware and verifying the provided signature using a public key with an asymmetric algorithm. If a weak crypto algorithm is used in this process, attackers may be able to bypass the verification and run compromised firmware.
Resistance to Fault Injection Attacks
Even if the device under attack implements Secure Boot, it may not be sufficient to guarantee the integrity of the device. In the past, researchers as well as hackers have performed many successful fault injection attacks. The most common types are voltage and clock glitching. While different in nature, they both have the same goal: get the chipset into a state where a few instructions fail to execute properly. By carefully timing the glitch, the attacker can selectively skip important instructions such as Secure Boot verifications or debug disablement. Realistically expect that most microcontrollers are not hardened against this type of attack. Only recently have vendors started to treat this problem seriously and implement countermeasures, as attacker techniques and tools have become increasingly available and low cost.
If the product has a requirement to defend against physical attacks, it is recommended to select a microcontroller that implements countermeasures, from a vendor with a reputable security track record. Inquire if there are, at a minimum, mitigations in software. Mitigations in both the ROM code and the bootloader are important.
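To give a flavor of what software mitigations look like, the sketch below shows three common patterns: a fail-closed default, a redundant evaluation of the check, and status constants that are bitwise complements so a single flipped bit cannot turn one into the other. This is a conceptual illustration only. Real countermeasures live in C or assembly in the ROM and bootloader, where care is also needed to stop the compiler from removing the redundancy. The constants and function names are our own.

```python
VERIFIED = 0x3CA5965A
REJECTED = 0xC35A69A5  # bitwise complement of VERIFIED


def check_signature_hardened(verify, image):
    result = REJECTED  # fail closed: a skipped assignment leaves the safe value
    first = verify(image)
    second = verify(image)  # redundant evaluation; one glitch must not be enough
    if first is True and second is True:
        result = VERIFIED
    return result


def boot(verify, image):
    status = check_signature_hardened(verify, image)
    # Check the value and its relationship to the complement constant.
    if status == VERIFIED and (status ^ REJECTED) == 0xFFFFFFFF:
        return "boot"
    return "halt"
```

The `verify` callable is a placeholder for the actual signature check; any disagreement between the two evaluations results in a halt.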
Commitment to Firmware Patching
For a significant share of the inexpensive microcontrollers on the market it is safe to assume that they will never receive firmware updates. This is particularly dangerous for connected devices as, historically, many vulnerabilities have been discovered in communication libraries such as those supporting TLS.
Choose a microcontroller that has long-term support, from a company that has a proven record of maintaining the firmware with periodic patching. This includes the vendor’s own code and any third-party code that might be in use. A microcontroller running a long-term support version of a popular operating system will most likely receive the best support.
The only good firmware updates are the ones that are actually applied. After the device is launched, it is the responsibility of the product manufacturer to merge in available patches and build new versions of the firmware. This should be a periodic process in which the engineers review the latest changes or security bulletins. Making the installation process easy for users (or even automatic) is vital.
Security Track Record
Commitment to firmware patching is not the whole story. History tells us that some chip vendors have been better at writing secure firmware than others. Some have a long track record of providing quality code and public research reflects that. Others are at the other extreme and are plagued by a considerable number of CVEs.
Even when using an open-source project for the firmware, it is important to understand that the device security is rooted in the ROM code. If the ROM code has vulnerabilities, an attacker can use them to bypass all other verification mechanisms. It is important to understand whether the chip vendor is committed to security by performing external security audits of the ROM code and employing strategies such as fuzzing and static analysis to increase the code quality.
Security has a cost, but it is a cost worth paying. Chip vendors providing periodic security bulletins for their products tend to be the ones most committed to security.
Memory Protection
The memory management unit (MMU) and the memory protection unit (MPU) are hardware features of microcontrollers that allow memory protection for specified memory areas. There exist microcontrollers that do not have any memory protections in place, but those are not going to be covered in this section and their use is discouraged. Security-wise it is a good idea to avoid microcontrollers without at least some form of memory protection, as a single vulnerability can translate into full control of the device (an attacker could write anywhere in memory, including memory-mapped flash if supported). Of all microcontrollers providing memory protection, only a small percentage use an MMU.
An MPU is used to assign access permissions to memory regions; the number of regions available varies for each chipset. These regions can have subregions and can be configured to be as small as a few bytes long. Each of these regions can be restricted to only allow privileged access, set as read-only, or marked as non-executable.
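The region model can be sketched in a few lines. The `Region` and `Mpu` classes below are a toy model of our own, not any vendor's register layout; we assume that later regions take priority over earlier ones when they overlap, and that an address matching no region faults.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    base: int
    size: int
    read: bool = True
    write: bool = False
    execute: bool = False
    privileged_only: bool = False


class Mpu:
    def __init__(self, regions):
        self._regions = list(regions)

    def allows(self, address, access, privileged=False):
        """access is one of 'read', 'write', or 'execute'."""
        for region in reversed(self._regions):  # assumed: later regions win
            if region.base <= address < region.base + region.size:
                if region.privileged_only and not privileged:
                    return False
                return getattr(region, access)
        return False  # no matching region: treat as a fault
```

A typical layout marks flash as read and execute, RAM as read and write but never execute, and peripheral registers as privileged-only.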
While an MPU gives a crude level of protection, a microcontroller with an MMU can do a lot more, and an MMU is vital to support many software mitigations. The MMU uses a Page Table to store a mapping between virtual and physical addresses. Each entry in the table is known as a Page Table Entry. As with MPU regions, each entry carries permissions such as read, write, and execute, and a page can be accessed only by the owner process (or by more privileged code such as the kernel). Pages containing data can be marked writable but not executable, and pages containing executable code can be marked executable but not writable, to mitigate some buffer overflow attacks. Any violation causes a page fault exception.
Another benefit is process isolation: an application trying to access a page reserved for another application will not succeed and will, again, cause a page fault. This is particularly useful if less trusted third-party applications are running on the device as well.
Certain exploit mitigations such as Address Space Layout Randomization (ASLR) rely on the MMU. ASLR is a technique to randomize the locations of code and data structures so that attackers cannot rely on previously known memory locations. This helps mitigate some exploitation techniques such as return-to-libc and return-oriented programming (ROP). The translation layer provided by the virtual to physical address mapping makes it straightforward to implement ASLR.
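A toy page table makes the translation and permission checks easier to picture. The sketch assumes 4 KiB pages and a single-level table held in a dictionary; real MMUs use multi-level tables walked by hardware. `AddressSpace` and `PageFault` are illustrative names.

```python
PAGE_SIZE = 4096  # assumed page size


class PageFault(Exception):
    pass


class AddressSpace:
    """One page table per process: virtual page number -> (physical page, perms)."""

    def __init__(self):
        self._table = {}

    def map(self, vaddr, paddr, perms):
        # Enforce the writable-xor-executable policy described above.
        if "w" in perms and "x" in perms:
            raise ValueError("page may not be both writable and executable")
        self._table[vaddr // PAGE_SIZE] = (paddr // PAGE_SIZE, perms)

    def translate(self, vaddr, access):
        entry = self._table.get(vaddr // PAGE_SIZE)
        if entry is None or access not in entry[1]:
            raise PageFault(hex(vaddr))
        return entry[0] * PAGE_SIZE + vaddr % PAGE_SIZE
```

Because each process has its own table, an address that is valid in one `AddressSpace` simply has no entry in another, which is the isolation property; choosing the virtual addresses passed to `map` at random per boot is, in essence, what ASLR does.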
Embedded OS
Chip vendors have a vested interest in adding support for their products to popular embedded operating systems such as Linux, mbedOS, FreeRTOS, and Zephyr. This way they can benefit from the multitude of software tools in the open-source ecosystem without the cost of developing them in-house. This is also a great advantage for the OEM vendors, as the hardware layer is abstracted by the OS. This makes it easier to port the application to other (more powerful) microcontrollers without having to worry about the hardware-specific implementation, which also helps when a microcontroller reaches end of life and can no longer be purchased. Also, it may be easier to build a real-time application on an RTOS such as FreeRTOS.
Note however that not all operating systems are equal, and they have various levels of maturity. With support from chip vendors, some implement task isolation and have a separation of privileges, to minimize the effect of a single vulnerability. More basic firmware without an OS would only require one vulnerability for full system compromise. The ones that implement an OTA (over-the-air) update mechanism make it easy to deploy product updates. Secure storage may also be supported to complicate physical attacks. Some operating systems are open-source allowing a full analysis of the software stack and easier maintenance.
Programming Language and In-house Expertise
Bare metal firmware: While C is the most popular language used to develop bare metal firmware, there are some vendors who offer runtimes for higher-level language alternatives. MicroPython is one and it can be used on some Nuvoton microcontrollers as well as the ESP8266/ESP32 from Espressif. Embedded Lua is also seen in some cases.
Operating system: If the OEM BSP includes an OS such as Linux, it is possible to use many programming languages to develop the application: C, C++, Python, Java, Go, Rust, etc. Ideally, choose a memory safe programming language, such as Rust/Java/Go. This way entire classes of vulnerabilities can be largely avoided, such as buffer overflows. Note that even with languages like Rust some parts interfacing with the hardware need to be unsafe, as code needs to behave more like C. As such, the benefits of using a safe programming language do not apply to unsafe sections of code. Unsafe areas need to be carefully implemented, isolated, and rigorously tested.
It may be beneficial to consider your talent pool and choose a microcontroller supporting development in a safe programming language already used in the company. Since the developers are already experienced with the programming language, they will produce better quality code, faster.
Secure Debug and Authenticated Access
Debug interfaces such as UART and JTAG are of great help during product development. These can simplify firmware development and help find intermittent bugs, which are otherwise extremely hard to eliminate. But this advantage becomes a risk if these interfaces are unauthenticated and there is no way to disable them in production. As an example, devices without Secure Boot that perform over-the-air update validation may be protected against remote and local attacks but are easy targets for physical attacks. One common issue is that some vendors may protect a JTAG re-enable feature using a secret password shared across all devices. If that password leaks or is extracted from one device, it can be used to attack products from other vendors that are using the same chipset.
Some microcontrollers disable debugging using one-time-programmable fuses. Once the development phase is done, production devices are configured to burn debug-disable fuses on first boot. On microcontrollers with internal flash storage, this is sometimes a setting that can be reset after a full wipe of the storage space is performed.
Support for Secure Manufacturing
While some OEM vendors still produce their devices in-house, it is increasingly more economical to outsource production to manufacturing partners to benefit from their economy of scale. In the Original Design Manufacturing (ODM) model, the manufacturing partner designs and produces the final product according to the OEM specifications. In the Contract Manufacturer (CM) model the manufacturing partner builds the device that was designed in detail by the OEM thus giving the OEM more control over the design and process (but at a higher cost than with an ODM model).
No matter if the product is built in-house or by a manufacturing partner it is important to be able to ensure the authenticity of the device after it leaves the factory floor. Without such safeguards, counterfeit devices may be allowed to authenticate with the cloud services that support the device.
Typically, a per-device secret is provisioned to, or generated on, each device on the production line. That secret, normally a cryptographic key, is later used to identify the device to the backend infrastructure through a challenge/response mechanism. The secret needs to be protected so that an attacker cannot steal it and impersonate the device. It is best saved in internal storage or in an area not readable by users or, even better if the microcontroller supports it, in immutable e-fuses.
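The challenge/response exchange can be sketched as below. We assume a symmetric per-device key and HMAC-SHA-256 for brevity; many deployments instead use an asymmetric key pair with a device certificate, so that the backend holds no device secrets. The `Device` and `Backend` classes are hypothetical.

```python
import hashlib
import hmac
import secrets


class Device:
    def __init__(self, device_id, secret):
        self.device_id = device_id
        self._secret = secret  # would live in protected storage or e-fuses

    def respond(self, challenge):
        return hmac.new(self._secret, challenge, hashlib.sha256).digest()


class Backend:
    def __init__(self):
        self._secrets = {}

    def provision(self, device_id):
        """Run on the production line: create and record a unique secret."""
        secret = secrets.token_bytes(32)
        self._secrets[device_id] = secret
        return secret

    def new_challenge(self):
        return secrets.token_bytes(16)  # fresh nonce defeats replayed responses

    def verify(self, device_id, challenge, response):
        secret = self._secrets.get(device_id)
        if secret is None:
            return False  # unknown device, for example a counterfeit
        expected = hmac.new(secret, challenge, hashlib.sha256).digest()
        return hmac.compare_digest(expected, response)
```

A counterfeit device that was never provisioned has no entry in the backend, and a cloned identifier without the matching secret cannot produce a valid response.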
Regardless of the manufacturing model in use, the OEM should carefully choose or specify a microcontroller that can store the identity secret securely.
Conclusions
Creating a new product is not an easy task to begin with. However, it can be significantly simplified by starting with the right MCU/BSP combination. Choosing the right platform, provided by a vendor with a great history of security and patching, can reduce the risk of a compromise. Also, a platform that provides a comprehensive software development kit will reduce the development time, allowing the OEM to focus on developing their application rather than porting or (even worse) writing code for common functionalities such as TLS. Additionally, choosing a platform with good debug support can help improve the performance, quality, and stability of the product. Selecting a chipset to power the device solely on cost of hardware is bad news. A compromise, an increase in development time, and a poorly performing device that requires fixes at the end of the development process all translate to costs which, with proper planning, could have been avoided. Spending some time in the beginning understanding the benefits each MCU/BSP combination provides, and selecting one that best matches the product requirements, is a worthwhile investment.
Don’t Forget To Be Awesome!
I’d like to thank my colleagues Nick Galloway, Jeremy Boone and Rob Wood for their insightful suggestions and feedback provided while writing this blog.