To BMC or Not to BMC...

Notes from the Bare-Metal Trenches

Rony Michaely

6/29/2026 · 7 min read



I’ll start with a confession: I’ve been pushing Edge Firmware Orchestration designs for the last six years, and I’m unapologetically biased. In my vision of the tech future, BMCs don’t go away; they grow up. They stop being unaudited backdoors and become first-class, cryptographically anchored control planes that we treat as seriously as we treat our CPU microcode or our HSMs.

My design bias has been consistent:

  • OpenBMC as an auditable, attested, updatable, hardened firmware stack

  • Paired with a secure FPGA (Lattice MachXO3D/LFMNX or Intel MAX 10) acting as a Hardware Root of Trust (HRoT)

  • End-to-End Encrypted (E2EE) paths from silicon-level keys all the way to a tenant-bound, privately signed cloud control plane

The punchline?
No one has really shipped true hardware-level E2EE from HRoT to cloud tenants at scale. When I did the designs, the incremental BoM (bill of materials) cost was roughly $5 per redesigned chip. For that $5, you buy:

  • Measured boot of BMC + UEFI

  • Hardware-fused identities

  • Encrypted, attestable firmware orchestration

  • Instant platform remediation without trusting the host OS at all

  • BLE sensors for entity signing and proximity validation

And yet, the industry mostly shrugged.

I do believe in a world of heterogeneous compute singularity, where servers and edge devices can be managed, repaired, and remediated using the same secure primitives, from a rack in a hyperscale DC to a ruggedized gateway bolted to a lamppost.

To get there, though, we have to confront the uncomfortable middle: today’s BMC reality.

Why BMCs Are Beautifully Dangerous

Baseboard Management Controllers (BMCs) sit at the heart of the modern hardware supply chain and data center control plane. They are:

  • Always-on

  • Network-facing

  • Below-OS

  • With direct power and memory control

In other words: a perfect operations platform, and a perfect adversary beachhead.

A BMC is not “just a chip.” It’s:

  • A full SoC running Linux or a similar OS

  • With its own flash, RAM, network stack

  • Wired into power rails, reset lines, SPI flash, and often PCIe

That means:

If you compromise the BMC, you don’t just “hack the server”, you own the platform that the server depends on.

Over the last decade, threat intelligence around BMCs has quietly matured. What emerges is a fairly standard, but deeply worrying, multi-phase attack chain.

The Core BMC Exploitation Attack Chain

The typical firmware-level attack chain for BMC infrastructure looks like this:

  1. Phase 1: Initial Access
    Exploitation of web APIs / Redfish / IPMI

  2. Phase 2: Privilege Escalation
    Memory corruption / command injection → root on BMC

  3. Phase 3: Host Lateral Pivot
    KVM abuse / DMA-based host memory modifications

  4. Phase 4: Bare-Metal Persistence
    SPI flash rewriting / disabling secure boot

  5. Phase 5: Defensive Evasion
    Operating below the OS / complete EDR blind spot

I’ll walk through each, then extend the story into detection and design choices.

Phase 1: Initial Access & Pre-Authentication

When attackers go “to BMC,” they start at the exposed edge:

  • Management web GUIs

  • Redfish REST APIs

  • IPMI (and its graveyard of legacy endpoints)

Historically, this was about:

  • Default credentials

  • Password reuse and poor password hygiene

  • Management networks accidentally routable from the internet

That still happens, but the more interesting moves now are architectural exploitation.

Timing Side-Channels and Pre-Auth Enumeration

One pattern I’ve seen and reproduced:

  • Firmware uses memcmp() or a similar early-exit (non-constant-time) comparison for credential checks.

  • Error paths and timing differences reveal:

    • Whether a username exists

    • How many bytes of a password are correct

An attacker doesn’t need a password dump; they need a stopwatch. With enough timing samples, you can:

  • Enumerate valid user accounts

  • Perform a guided brute force while staying under rate-limits and alert thresholds

The result:
A noisy login storm becomes a slow, almost invisible credential oracle.
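To make the leak concrete, here is a minimal Python sketch. It is an illustration only: real BMC firmware is typically C, and `leaky_equal` is a hypothetical stand-in for an early-exit comparison, not code from any vendor. It reports how many bytes it inspected, which is the quantity a timing attack estimates. `hmac.compare_digest` is the standard-library comparison designed to avoid a content-dependent early exit.

```python
import hmac


def leaky_equal(secret: bytes, guess: bytes) -> tuple[bool, int]:
    """Early-exit comparison (hypothetical stand-in for a memcmp-style check).

    Returns (match, bytes_inspected). The second value models running time:
    it grows with the length of the correct prefix in the guess.
    """
    inspected = 0
    for s, g in zip(secret, guess):
        inspected += 1
        if s != g:
            return False, inspected  # bails out at the first wrong byte
    return len(secret) == len(guess), inspected


def constant_time_equal(secret: bytes, guess: bytes) -> bool:
    """For equal-length inputs, timing does not depend on where they differ.

    Note: the length of the inputs may still be observable.
    """
    return hmac.compare_digest(secret, guess)


if __name__ == "__main__":
    secret = b"hunter2!"
    for guess in (b"xxxxxxxx", b"huxxxxxx", b"huntxxxx"):
        match, work = leaky_equal(secret, guess)
        print(guess, match, work, constant_time_equal(secret, guess))
```

The work counter climbs as the guessed prefix gets longer, and that is the signal an attacker averages out of noisy network timings. Comparing fixed-length password hashes with a constant-time primitive removes that gradient.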

At this point, the attacker has credentials or a foothold in a pre-auth bug, and they move on to the next stage.

Phase 2: Arbitrary Code Execution (Root on BMC)

Once you have BMC interface access, the goal is to escape the velvet ropes of web forms and land on a shell.

The problem (for defenders) is structural:

  • Many BMC stacks are heavily customized Linux distros

  • Historically shipping without:

    • ASLR

    • Stack canaries

    • NX/DEP enforcement

  • And a lot of internal services run as root

Which means:

  • A single stack-based buffer overflow

  • Or a simple command injection in a REST handler

  • Translates directly into root shell on the BMC
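As an illustration of the command-injection half of that list, here is a minimal Python sketch of a hypothetical "ping this address" diagnostic handler. The function names and the ping use case are assumptions made for the example, not taken from any real BMC stack. The first function shows the anti-pattern; the second validates the input and builds an argument vector that never passes through a shell.

```python
import ipaddress


def unsafe_ping_command(target: str) -> str:
    """Anti-pattern: user input is pasted into a shell command line."""
    # If target contains "; reboot", a shell would run it as a second command.
    return f"ping -c 1 {target}"


def safe_ping_argv(target: str) -> list[str]:
    """Validate the input, then build an argument vector with no shell involved."""
    address = ipaddress.ip_address(target)  # raises ValueError for anything else
    return ["ping", "-c", "1", str(address)]


if __name__ == "__main__":
    payload = "192.0.2.10; reboot"
    print(unsafe_ping_command(payload))
    try:
        safe_ping_argv(payload)
    except ValueError as exc:
        print("rejected:", exc)
    print(safe_ping_argv("192.0.2.10"))
```

The list returned by `safe_ping_argv` is meant to be handed to something like `subprocess.run(argv)` without `shell=True`. Running the handler as an unprivileged user would further limit what a missed bug can reach.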

From there, the attacker can:

  • Drop persistent binaries or scripts

  • Reconfigure services and logging

  • Pivot quietly into the host system

This is the moment where your “management controller” stops being a safety net and becomes a shadow hypervisor.

Phase 3: Pivoting to the Host System

With root on the BMC, you’ve effectively acquired:

  • A sidecar computer with hard-wired privileges over the main host.

Two primary pivot mechanisms dominate:

3.1 Software-Defined Peripherals (KVM & Virtual Media)

BMCs implement:

  • Keyboard-Video-Mouse (KVM) redirection

  • Virtual USB / virtual CD-ROM / virtual disk mounting

These are features for admins.

For an attacker, they are a weaponized pre-boot console:

  • Inject keystrokes at BIOS/UEFI or GRUB

  • Modify kernel command line to spawn a root shell on TTY

  • Mount a malicious image as “virtual media” to install host malware

From the host’s point of view, it’s just “someone at the console.”
From a SOC’s point of view, it’s often completely invisible.

3.2 Direct Memory Access (DMA)

On more tightly integrated systems, BMCs or adjacent controllers can:

  • Drive DMA-capable interfaces into host memory (via PCIe or custom interconnects)

  • Read or write arbitrary regions of DRAM without host CPU involvement

That enables:

  • Credential and key theft directly from memory

  • In-place patching of kernel structures, EDR agents, or page tables

  • Covert channels that bypass OS logs and controls

At this point, the OS is a theme park built on untrusted concrete. You can re-skin the rides, but the foundation is already someone else’s.

Phase 4: Permanent Persistence – Breaking the Trust Chain

The smartest attackers don’t just “get in”; they become part of the hardware’s life cycle.

With BMC-level access, they can often:

  • Read and write host SPI flash (where UEFI/BIOS lives)

  • Patch or replace the boot firmware

  • Modify UEFI variables to disable Secure Boot or other protections

A few patterns show up frequently:

  • Bootkits: Custom UEFI components injected into the boot path.

  • Rollback Attacks: Reverting to older, vulnerable firmware that is still signed but no longer safe.

  • NVRAM Tampering: Flipping flags to turn off protections or enable debugging backdoors.
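The rollback case is worth a small sketch, because a valid signature alone does not stop it. The Python below assumes a hypothetical monotonic "security version" stored somewhere the BMC cannot rewrite (for example in the HRoT), and assumes signature verification has already happened elsewhere. All names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FirmwareImage:
    name: str
    security_version: int  # hypothetical monotonically increasing counter
    signature_valid: bool  # result of a signature check performed elsewhere


def accept_update(image: FirmwareImage, min_security_version: int) -> bool:
    """Accept only signed images that are not older than the stored floor."""
    if not image.signature_valid:
        return False
    return image.security_version >= min_security_version


def floor_after_install(image: FirmwareImage, min_security_version: int) -> int:
    """Ratchet the floor upward after an install; it never decreases."""
    return max(min_security_version, image.security_version)


if __name__ == "__main__":
    old = FirmwareImage("bmc-1.2", security_version=3, signature_valid=True)
    new = FirmwareImage("bmc-1.4", security_version=5, signature_valid=True)
    print(accept_update(new, 4), accept_update(old, 4))
```

The old image is still correctly signed, yet it is refused because its security version sits below the floor. The design question is where that floor lives: if the BMC can rewrite it, the attacker can too.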

Why this matters:

  • You can wipe disks.

  • You can reinstall OSes.

  • You can rotate keys and certificates.

If the firmware trust chain is owned, the attacker can re-own everything on every boot.

Phase 5: XDR Evasion & Strategic Impact

Traditional security lives in two privilege realms:

  • Userland (Ring 3): EDR agents, applications, many controls

  • Kernel (Ring 0): OS, kernel drivers, some advanced sensors

But firmware and management engines live in:

  • Ring -2 / Ring -3 equivalents: under and around the CPU

This is below:

  • The visibility envelope of EDR

  • The logging pipeline of the OS

  • The remediation mechanisms of standard IT workflows

So, a BMC implant can:

  • Operate even when the host is “off” (cold but still on standby power)

  • Bypass tenant boundaries in multi-tenant infrastructures

  • Exfiltrate data via out-of-band channels weeks or months after an “incident resolution”

This isn’t just a blind spot. It’s a different gravitational well, and we’ve been pretending our OS-centric telescopes are sufficient.

Defending the Bare-Metal Stratum

Now, to the constructive part. If we want to keep BMCs, they have to grow up into first-class secure platforms.

1. Network Micro-Segmentation (Non-Negotiable)

Some rules are absolute:

  • BMC/IPMI/Redfish interfaces must never be exposed to the internet.

  • They should live on dedicated, non-routed management VLANs.

  • Access should be via:

    • Hardened bastion/jump hosts

    • Strong MFA

    • Strict ACLs and just-in-time access
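A rule like this is easy to check against an inventory. Here is a minimal sketch, assuming you already have a list of BMC addresses and know the CIDR of your management network. The function name and the example subnet are hypothetical.

```python
import ipaddress


def audit_bmc_address(address: str, management_cidr: str) -> list[str]:
    """Return a list of findings for one BMC address (empty means no finding)."""
    findings = []
    ip = ipaddress.ip_address(address)
    network = ipaddress.ip_network(management_cidr)
    if ip.is_global:
        findings.append("publicly routable address")
    if ip not in network:
        findings.append("outside management subnet")
    return findings


if __name__ == "__main__":
    for bmc in ("10.20.0.5", "192.168.1.10", "8.8.8.8"):
        print(bmc, audit_bmc_address(bmc, "10.20.0.0/24"))
```

This only inspects addressing. It says nothing about whether the VLAN is actually unrouted or whether ACLs are enforced, so treat it as a first-pass inventory check rather than proof of segmentation.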

“Ping the BMC from my laptop” is not a convenience; it’s a structural flaw.

2. Cryptographic Supply-Chain Verification

Before firmware touches production:

  • Run it through an automated firmware analysis pipeline:

    • Verify signatures and certificate chains

    • Check for unpatched CVEs

    • Scan for insecure configurations and debug hooks

Combine this with:

  • Hardware/firmware SBOMs and HBOMs

  • Continuous comparison between what the vendor claims and what the platform actually runs

The goal is to make a changed BMC image harder to hide than a changed Linux binary.
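The "claims versus reality" comparison can be sketched in a few lines. The Python below assumes a hypothetical vendor manifest that maps component names to SHA-256 digests, and assumes you can read the component images back from the platform. How you obtain either side is outside the sketch.

```python
import hashlib


def sha256_hex(blob: bytes) -> str:
    """Hex-encoded SHA-256 digest of a firmware blob."""
    return hashlib.sha256(blob).hexdigest()


def diff_against_manifest(
    claimed: dict[str, str], observed: dict[str, bytes]
) -> dict[str, str]:
    """Return {component: reason} for every disagreement with the manifest."""
    problems = {}
    for name, expected in claimed.items():
        if name not in observed:
            problems[name] = "missing on platform"
        elif sha256_hex(observed[name]) != expected:
            problems[name] = "hash mismatch"
    for name in observed:
        if name not in claimed:
            problems[name] = "not declared by vendor"
    return problems


if __name__ == "__main__":
    claimed = {"bmc": sha256_hex(b"bmc-image"), "uefi": sha256_hex(b"uefi-image")}
    observed = {"bmc": b"bmc-image", "uefi": b"uefi-image-patched"}
    print(diff_against_manifest(claimed, observed))
```

A hash comparison is only as trustworthy as the manifest and the read-back path. If the compromised BMC is the one reporting its own image, the observed side needs to come from an independent reader such as the HRoT.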

3. Hardware-Enforced Roots of Trust

This is where my bias comes in strongly.

Modern platforms can and should:

  • Leverage hardware-validated boot:

    • Intel Boot Guard, AMD Hardware Validated Boot

    • CPU-enforced signature checks on early boot components

  • Embed a HRoT FPGA (MachXO3D/MAX 10, etc.) that:

    • Measures and controls the BMC boot path

    • Refuses to power up or release secrets unless:

      • BMC firmware matches a known-good measurement

      • UEFI has not been tampered with
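The "measure, then gate" logic above can be sketched in software, with the caveat that a real HRoT does this in hardware before the BMC is released from reset. The Python below uses a TPM-style extend operation (new value = SHA-256 of the old value concatenated with the component digest). The function names and the two-component chain are assumptions made for illustration.

```python
import hashlib
import hmac


def extend(register: bytes, component: bytes) -> bytes:
    """TPM-style extend: new = SHA-256(old || SHA-256(component))."""
    return hashlib.sha256(register + hashlib.sha256(component).digest()).digest()


def measure_boot_chain(components: list[bytes]) -> bytes:
    """Fold every component, in boot order, into one 32-byte measurement."""
    register = bytes(32)  # starts as all zeros
    for component in components:
        register = extend(register, component)
    return register


def may_release_secrets(components: list[bytes], known_good: bytes) -> bool:
    """Gate: release secrets only if the chain matches the golden measurement."""
    return hmac.compare_digest(measure_boot_chain(components), known_good)


if __name__ == "__main__":
    golden = measure_boot_chain([b"bmc-fw-v1", b"uefi-v1"])
    print(may_release_secrets([b"bmc-fw-v1", b"uefi-v1"], golden))
    print(may_release_secrets([b"bmc-fw-v1", b"uefi-v1-patched"], golden))
```

Because each step hashes over the previous value, the final measurement depends on both the content and the order of everything measured. A single flipped byte in either image changes the result, and the gate stays shut.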

This is how we stop treating BMCs as “a Linux box we hope is fine” and start treating them like programmable, attestable infrastructure primitives.

To BMC or Not to BMC

So, should we get rid of BMCs?

I don’t think that’s the right question.

“Are BMCs safe?” will always be answerable with:

It depends on the implementation, the vendor, the patch velocity, the config…

And that’s a lawyer’s answer, not an engineer’s answer.

The Better Question

The question I come back to is:

Do we want an always-on, out-of-band, hardware-adjacent control plane?
If yes, are we willing to pay the cost to make it provably trustworthy?

Because the alternatives aren’t pretty:

  • No BMCs → you lose:

    • Remote remediation at scale

    • Lights-out management

    • Fine-grained hardware telemetry and orchestration

  • Ad-hoc replacements (USB management dongles, external controllers)

    • Just recreate similar risks in less standardized, less scrutinized forms.

My Answer

I’m firmly in the pro-BMC camp, under two conditions:

  1. Open Firmware (Attested)

    • OpenBMC or equivalent auditable stacks

    • Reproducible builds, dynamically attested measurements, hardened toolchains

  2. Hardware-Anchored, Tenant-Bound E2EE

    • HRoT-enforced measured boot

    • Cryptographically bound identity from device → BMC → cloud tenant

    • Control and telemetry encrypted from hardware roots to tenant-specific keys
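One way to picture "tenant-bound" keys is a key derivation that mixes a device secret with the tenant identity. The sketch below is a simplification under stated assumptions: it uses HKDF (RFC 5869) built from the standard-library `hmac` module, a purely symmetric scheme, and hypothetical identifiers and label strings. A production design would more likely use asymmetric device certificates, and the device secret would never leave the HRoT.

```python
import hashlib
import hmac


def hkdf_sha256(key_material: bytes, salt: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF (RFC 5869) with SHA-256: extract, then expand."""
    if length > 255 * 32:
        raise ValueError("requested output too long")
    # Extract: an empty salt is replaced by a block of zeros, as the RFC specifies.
    prk = hmac.new(salt or bytes(32), key_material, hashlib.sha256).digest()
    # Expand: T(i) = HMAC(PRK, T(i-1) || info || i)
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def tenant_channel_key(device_secret: bytes, device_id: str, tenant_id: str) -> bytes:
    """Derive a 32-byte key bound to one device and one tenant (hypothetical labels)."""
    info = f"bmc-control|{device_id}|{tenant_id}".encode()
    return hkdf_sha256(device_secret, salt=b"", info=info)


if __name__ == "__main__":
    secret = bytes(range(32))  # placeholder; a real secret would be fused in hardware
    print(tenant_channel_key(secret, "node-17", "tenant-a").hex())
    print(tenant_channel_key(secret, "node-17", "tenant-b").hex())
```

The same device yields unrelated keys for different tenants, so re-assigning a node to a new tenant changes the key without touching the fused secret.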

Yes, it costs a few extra dollars per board.
Yes, vendors will try to shave that off for margin.

But those dollars buy:

  • Attestable management

  • Faster remediation

  • A platform we can defend below the OS, not just react from above.

Beyond Servers: Singular Heterogeneous Orchestration

The other angle I care about is coherence.

We’re hurtling into a world where:

  • DC servers, edge boxes, routers, gateways, smart NICs, accelerators

  • All have some variation of “BMC-like” management plane

If every one of these ships with:

  • A different, proprietary, opaque stack

  • Its own patching lifecycle

  • Its own side-channels and hard-coded root accounts

Then we haven’t just built a data center, we’ve built a zoo of semi-trusted micro-OSes.

I want the opposite:

  • A unified orchestration story where:

    • Servers and edge devices share the same secure management primitives

    • Attestation from HRoT → BMC → firmware → host → agents is composable

    • Repairability and remediation are first-class, not afterthoughts

The OpenBMC + HRoT pattern isn’t perfect, but it is a deliberately designed foundation for that world.

Every time we design a board without a strong HRoT for the BMC, every time we bolt on a proprietary blob without attestation, we are effectively writing a story that says:

“We trust this thing because we always have, and we hope no one clever cares enough to prove us wrong.”

The choice in front of us is:

  • Unattested, minimally secured BMCs as a necessary evil
    vs.

  • Attested, hardware-anchored, E2EE BMCs as a deliberate, defended control plane

I’ve spent six years nudging designs toward the latter.
Sometimes the answer was a clean prototype.
Sometimes it was a quiet “too expensive” in a budget review meeting.

But the cost delta is small, the risk delta is enormous, and the attackers are already living below the OS.

If we want a future where heterogeneous compute is manageable, repairable, and defensible from hyperscale racks to lonely edge boxes, we don’t walk away from BMCs. We upgrade them into the same security story we demand from CPUs, GPUs, TPMs and HSMs.

And we pay the extra $5...
