Meet Fractal, an OS made for microarchitecture reverse engineering
Probing how a CPU isolates user code from kernel code is messy work. Researchers patch kernels, write drivers, or boot stripped-down bare-metal programs, and each of those choices changes variables they were trying to hold still. Fractal, a new operating system from MIT CSAIL, was built to take that mess out of the loop, and its authors used it to surface previously undocumented behavior in the Apple M1 branch predictor.

Joseph Ravichandran and Mengjia Yan of MIT CSAIL developed Fractal to remove sources of measurement noise that interfere with microarchitecture reverse engineering. The system has been released as open source.
The trouble with general-purpose operating systems
Microarchitecture experiments measure subtle timing differences caused by shared CPU structures such as branch predictors, caches, and translation lookaside buffers. When researchers want to study how hardware isolates user code from kernel code, they typically need to run test instructions on both sides of the privilege boundary. On Linux this usually means patching the kernel or writing a driver. On macOS, where kernel extensions are deprecated and the open-source XNU code is incomplete, researchers have resorted to binary-patching the kernel.
Moving code from user space into a kernel extension changes the address space layout, the branch history path, and the contents of the return stack buffer. Any of those changes can alter the result of an experiment, making it hard to determine whether a hardware defense blocked an attack or whether some unrelated variable shifted.
How Fractal is structured
Fractal is a roughly 31,000-line kernel written from scratch, with support for X86_64, AARCH64, and RISC-V 64-bit systems. It runs on QEMU, Intel and AMD PCs, Raspberry Pi boards, and several Apple Silicon Macs.
The kernel introduces a model the authors call multi-privilege concurrency. A single task can host threads that share code and memory yet execute at configurable privilege levels. This is supported by shadow memory maps that present the same physical pages to user and kernel threads with different permission bits, and by a stack-aliasing scheme that lets a thread change its privilege level without altering the virtual layout of its runtime stack.
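As a rough illustration of the shadow-map idea, the toy model below builds two views of the same virtual-to-physical layout that differ only in a permission bit. It is a sketch under stated assumptions, not Fractal's code: the bit values, the `Mapping` type, and `build_shadow_maps` are all hypothetical names invented for this example.

```python
from dataclasses import dataclass

# Hypothetical permission bits for this sketch; not Fractal's real encoding.
READ, WRITE, EXEC, USER = 1, 2, 4, 8


@dataclass(frozen=True)
class Mapping:
    phys_page: int  # physical page number backing this virtual page
    perms: int      # permission bits for this view


def build_shadow_maps(pages):
    """Build a user view and a kernel view of the same layout.

    `pages` maps a virtual page number to a physical page number.
    Both views keep identical virtual and physical page numbers and
    differ only in whether the USER bit is set.
    """
    base = READ | WRITE | EXEC
    user_view = {v: Mapping(p, base | USER) for v, p in pages.items()}
    kernel_view = {v: Mapping(p, base) for v, p in pages.items()}
    return user_view, kernel_view


pages = {0x400: 0x1A2B, 0x401: 0x1A2C}
user_view, kernel_view = build_shadow_maps(pages)
```

In this model a thread that switches privilege level simply switches views: every virtual address still resolves to the same physical page, so the layout seen by the experiment stays fixed.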
Fractal also provides a cooperative scheduler so researchers can dictate the exact order in which threads run. Interrupts and system services do not preempt running experiments. A memory region called the gmap supplies a large virtual area backed by a single 2MB huge page, with sections that can be replaced to control specific physical addresses.
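The value of cooperative scheduling for experiments can be sketched with Python generators, where a thread gives up the CPU only at an explicit yield point and the caller chooses who runs next. This is a simplified analogy, not Fractal's scheduler; `make_thread`, `run_in_order`, and the thread names are hypothetical.

```python
def make_thread(name, steps, log):
    """A cooperative thread: record one step, then yield voluntarily."""
    def body():
        for i in range(steps):
            log.append((name, i))
            yield  # explicit yield point; nothing preempts between yields
    return body()


def run_in_order(threads, order):
    """Resume threads strictly in the caller-specified order."""
    for name in order:
        # The None default means a finished thread is silently skipped.
        next(threads[name], None)


log = []
threads = {
    "trainer": make_thread("trainer", 2, log),
    "victim": make_thread("victim", 1, log),
}
run_in_order(threads, ["trainer", "trainer", "victim"])
```

Because the interleaving is fixed by the `order` list, rerunning the experiment reproduces the same sequence of steps, which is the property a researcher wants when the microarchitectural state depends on execution order.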
“Fractal is a strong architecture contribution because it turns an often ad hoc microarchitectural reverse-engineering workflow into reusable research infrastructure,” says University of Southern California assistant professor Mengyuan Li, who wasn’t involved in the paper. “By reducing software noise and giving researchers tighter control across privilege boundaries, it makes difficult hardware experiments much easier to interpret.”
Findings on the Apple M1
The authors evaluated Fractal on a 2020 M1 Mac Mini by reverse engineering the conditional and indirect branch predictors on both performance and efficiency cores.
For the indirect branch predictor, user-trained targets are speculatively fetched in kernel mode, yet speculative execution of those targets is blocked. The authors attribute this to a pipeline race in which fetch begins before the privilege check completes. The same pattern holds across different ASIDs, consistent with ARM’s CSV2 specification covering both exception level and ASID.
The conditional branch predictor shows no privilege isolation on either core type. User code reliably mistrained the CBP across privilege levels and ASIDs, including cases where the receiver was running in kernel mode. This contradicts earlier findings by Tuby and Morrison, who reported that cross-training failed on the efficiency core. The Fractal authors attribute that earlier result to macOS not respecting core pinning for kernel extension code.
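What "no privilege isolation" means for a conditional branch predictor can be shown with a toy model. The sketch below uses textbook 2-bit saturating counters and an optional privilege tag; it is an assumption-laden illustration and does not describe how the M1 predictor is actually organized. `ToyPredictor` and `cross_trains` are hypothetical names.

```python
class ToyPredictor:
    """2-bit saturating counters indexed by branch address.

    With tag_by_privilege=False, user and kernel share entries, so
    training in one mode changes predictions in the other.
    """

    def __init__(self, tag_by_privilege):
        self.tag_by_privilege = tag_by_privilege
        self.counters = {}

    def _key(self, addr, privilege):
        return (addr, privilege) if self.tag_by_privilege else addr

    def predict(self, addr, privilege):
        # Counters start at 1 (weakly not-taken); 2 and 3 predict taken.
        return self.counters.get(self._key(addr, privilege), 1) >= 2

    def train(self, addr, privilege, taken):
        key = self._key(addr, privilege)
        c = self.counters.get(key, 1)
        self.counters[key] = min(3, c + 1) if taken else max(0, c - 1)


def cross_trains(predictor, addr=0x1000):
    """Train a branch as 'user', then read the 'kernel' prediction."""
    for _ in range(4):
        predictor.train(addr, "user", taken=True)
    return predictor.predict(addr, "kernel")


shared = cross_trains(ToyPredictor(tag_by_privilege=False))
isolated = cross_trains(ToyPredictor(tag_by_privilege=True))
```

In the untagged model, user training flips the kernel-mode prediction to taken; in the tagged model it does not. The behavior reported for the M1's conditional branch predictor corresponds to the untagged case.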
The team also tested for Phantom speculation, in which non-branch instructions are speculatively decoded as branches. Phantom fetches occurred on M1 in user-to-user, user-to-kernel, and cross-ASID configurations. Phantom-driven speculative execution did not occur. This is the first published evidence of Phantom behavior on Apple Silicon.
