"Shall Destroy, Did Not": Recovering ML-DSA Private Keys from wolfSSL's Heap
wolfSSL's ML-DSA signing implementation does not destroy private key material after use, violating FIPS 204 Section 3.6.3. The unzeroed heap block is recoverable via same-process allocation, enabling end-to-end signature forgery.
Summary
wolfSSL’s ML-DSA signing implementation frees a ~50 KB heap block containing private signing material (s1, s2, t0 in NTT form) without clearing it, violating FIPS 204 Section 3.6.3. For a same-process attacker able to allocate and read a same-size heap block, this material is recoverable – demonstrated on glibc across three Linux distributions. Recovery of s1 is sufficient for full signing-key compromise and arbitrary signature forgery, verified end-to-end against the compiled libwolfssl binary.
wolfSSL confirmed the finding, patched it (#10100, #10113), and credited the reporter. It declined to assign a CVE.
Affected: wolfSSL v5.7.2 – v5.9.0-stable, native WOLFSSL_WC_DILITHIUM builds (requires --enable-mldsa or --enable-dilithium; not included in --enable-all).
Fix: v5.9.1 (April 8, 2026). Update to v5.9.1 or later.
Off-process recovery primitives (core-dump ingest, cross-process /proc/$pid/mem) are covered in a follow-up post.
How It Works
The attacker is code running in the same process as the ML-DSA signing operation – a plugin, callback handler, co-loaded library, scripting engine, or any component that shares the process address space. The attacker can call malloc and read the returned buffer. No memory-corruption vulnerability, core dump, /proc/pid/mem access, or privilege beyond normal same-process execution is required.
The missed ForceZero
dilithium_sign_with_seed_mu() in wolfcrypt/src/dilithium.c, line 8417:
XFREE(y, key->heap, DYNAMIC_TYPE_DILITHIUM); // no ForceZero
return ret;
wolfSSL already fixed this exact pattern in dilithium keygen (643427040), ed25519 signing (5f7bc0f3a), and ed448 signing (109e765b5). The ML-DSA signing path was missed.
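For readers unfamiliar with the pattern, here is a minimal, self-contained sketch of "zero, then free". It is not wolfSSL's patch: secure_zero and zero_and_free are hypothetical names, and the volatile-pointer loop is one common way to keep the compiler from discarding the stores as dead writes just before free().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Overwrite len bytes at mem with zeros. The volatile pointer forces every
 * store to be emitted, even though the buffer is about to be released. */
void secure_zero(void *mem, size_t len)
{
    volatile unsigned char *p = (volatile unsigned char *)mem;

    assert(mem != NULL || len == 0);
    while (len--) {
        *p++ = 0;
    }
}

/* Hypothetical helper: clear a sensitive buffer, then hand it back to the
 * allocator. The caller must pass the full allocation length. */
void zero_and_free(void *mem, size_t len)
{
    if (mem != NULL) {
        secure_zero(mem, len);
        free(mem);
    }
}
```

A plain memset() immediately before free() is not a reliable substitute, because an optimizing compiler may treat it as a dead store and remove it.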
The freed block is 50,176 bytes for ML-DSA-44. With WC_DILITHIUM_CACHE_PRIV_VECTORS off (the default) it contains the private-key polynomials at fixed offsets:
| Offset | Contents | Note |
|---|---|---|
| 21504 | s1 – static secret signing key | NTT-small domain |
| 25600 | s2 | NTT domain |
| 29696 | t0 | NTT domain |
(Offsets are for ML-DSA-44 with default build configuration. ML-DSA-65/87 use a different block size but the same bug.)
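The spacing in the table is consistent with each vector being stored as 256-coefficient polynomials with 32-bit coefficients. The sketch below recomputes the s2 and t0 offsets from the s1 offset under that assumption. The 32-bit storage width and all identifier names are assumptions made for illustration; only the three offsets come from the table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
    MLDSA_N   = 256,   /* coefficients per polynomial (FIPS 204) */
    MLDSA44_L = 4,     /* polynomials in s1 for ML-DSA-44 */
    MLDSA44_K = 4,     /* polynomials in s2 and in t0 for ML-DSA-44 */
    S1_OFFSET = 21504  /* from the table above */
};

/* Bytes occupied by a vector of `polys` polynomials, assuming each
 * coefficient is stored as a 32-bit integer. */
size_t vec_bytes(size_t polys)
{
    return polys * MLDSA_N * sizeof(int32_t);
}

size_t s2_offset(void) { return S1_OFFSET + vec_bytes(MLDSA44_L); }
size_t t0_offset(void) { return s2_offset() + vec_bytes(MLDSA44_K); }

/* Cross-check the computed offsets against the table. */
void check_table_offsets(void)
{
    assert(s2_offset() == 25600);
    assert(t0_offset() == 29696);
}
```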
Exploit chain
- Application signs message M1. wolfSSL allocates the ~50 KB block, writes s1/s2/t0 in NTT form, frees it without zeroing.
- Any code in the same process calls malloc(50176). The chunk is too large for glibc’s tcache (~1 KB max) and goes through the unsorted/large-bin path; for a single-threaded process with no intervening same-size allocations, malloc(50176) returns the same chunk with its payload intact. Read s1 from offset 21504.
- Forge a signature on a different message M2 using s1 plus the public key.
s1 is the static signing key. One recovery = permanent key compromise for every message under that key.
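The reclamation step can be probed without wolfSSL at all. The sketch below is not the PoC: it stands in a marker byte for the secret, frees the block without clearing it, and checks whether a same-size malloc() hands the bytes back. The function name and marker value are hypothetical, and the result depends on the allocator and heap state, so the code reports what it observed instead of assuming reuse.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define BLOCK_SIZE 50176u  /* ML-DSA-44 block size from the text */
#define S1_OFFSET  21504u  /* s1 offset from the table */
#define PROBE_LEN  4096u   /* bytes inspected at that offset */
#define MARKER     0xA5u   /* stand-in for secret key bytes */

/* Returns 1 if the marker bytes are readable from a fresh allocation,
 * 0 if they are not, and -1 if an allocation failed. */
int probe_stale_heap(void)
{
    volatile unsigned char *victim;
    volatile unsigned char *attacker;
    size_t i;
    int stale = 1;

    assert(S1_OFFSET + PROBE_LEN <= BLOCK_SIZE);

    /* "Signer": write the marker where s1 would live, free without zeroing.
     * volatile keeps the compiler from dropping the stores. */
    victim = malloc(BLOCK_SIZE);
    if (victim == NULL) return -1;
    for (i = 0; i < PROBE_LEN; i++) victim[S1_OFFSET + i] = MARKER;
    free((void *)victim);

    /* "Attacker": request the same size and read before writing. */
    attacker = malloc(BLOCK_SIZE);
    if (attacker == NULL) return -1;
    for (i = 0; i < PROBE_LEN; i++) {
        if (attacker[S1_OFFSET + i] != MARKER) { stale = 0; break; }
    }
    free((void *)attacker);
    return stale;
}
```

A return value of 0 only means this particular primitive failed; the freed block may still hold the bytes elsewhere in the heap.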
Why the forged signature verifies (perturbation-bound detail)
The forgery uses hint reconstruction; s2 and t0 are not needed. For ML-DSA-44 (tau = 39, d = 13, eta = 2, gamma2 = 95,232), tau*(2^(d-1) + eta) = 39 * 4,098 = 159,822 < 2*gamma2 = 190,464, which guarantees the reconstructed hints are always correct and the forged signature passes wc_dilithium_verify_msg().
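The arithmetic behind that inequality is easy to check mechanically. The sketch below plugs in the ML-DSA-44 parameters from FIPS 204 (q = 8380417, d = 13, tau = 39, eta = 2, gamma2 = (q - 1)/88). The function names are hypothetical, and the code verifies only the two numbers, not the hint-reconstruction argument itself.

```c
#include <assert.h>
#include <stdint.h>

enum {
    MLDSA_Q        = 8380417,            /* modulus q */
    MLDSA_D        = 13,                 /* bits dropped from t */
    MLDSA44_TAU    = 39,                 /* nonzero coefficients in c */
    MLDSA44_ETA    = 2,                  /* secret coefficient range */
    MLDSA44_GAMMA2 = (MLDSA_Q - 1) / 88  /* low-order rounding range */
};

/* Left-hand side: tau * (2^(d-1) + eta). */
int32_t perturbation_bound(void)
{
    return MLDSA44_TAU * ((1 << (MLDSA_D - 1)) + MLDSA44_ETA);
}

/* Right-hand side: 2 * gamma2. */
int32_t hint_margin(void)
{
    return 2 * MLDSA44_GAMMA2;
}

/* Aborts if the inequality quoted in the text does not hold. */
void check_mldsa44_bound(void)
{
    assert(perturbation_bound() < hint_margin());
}
```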
PoC
The PoC (poc_heap_forgery_v2.c) includes wolfcrypt/src/dilithium.c directly for access to static NTT/expand functions. A companion verifier (verify_forged.c) links against the compiled libwolfssl binary – not an inlined copy – and independently confirms wc_dilithium_verify_msg() accepts the forged signature:
$ ./poc_heap_forgery_v2
...
FORGERY OK - wc_dilithium_verify_msg accepted forged sig on m2
$ ./verify_forged
VERIFIED - linked libwolfssl accepted the forged signature
Test results (wolfSSL v5.9.0-stable, -O2):
| Platform | Arch | libc | Result |
|---|---|---|---|
| Ubuntu 22.04.5 (Azure B2ls_v2) | x86_64 | glibc 2.35 | 10/10 |
| Amazon Linux 2023 | x86_64 | glibc 2.34 | 5/5 |
| Ubuntu 20.04 | x86_64 | glibc 2.31 | 5/5 |
Heap reclamation is allocator-dependent. macOS libmalloc did not return the same chunk on sequential free -> malloc (0/100); the freed block still contains s1 but recovery needs a different primitive (core dump or direct process-memory readback). musl and jemalloc were not tested.
Why It Matters
“implementations of ML-DSA shall ensure that any potentially sensitive intermediate data is destroyed as soon as it is no longer needed.”
– FIPS 204, Section 3.6.3
“shall” is normative under NIST conventions. The unzeroed heap block containing s1, s2, and t0 directly violates this requirement. wolfSSL’s active FIPS 140-3 certificate #4718 (wolfCrypt v5.2.1) does not include ML-DSA in its validated boundary, but wolfSSL has a pending FIPS 140-3 submission with ML-DSA in scope; the §3.6.3 violation is relevant to that submission.
wolfSSL confirmed and patched the finding but declined to assign a CVE, classifying it as a bug requiring a second vulnerability to exploit. The follow-up post addresses that objection.
Mitigations
Update to v5.9.1 or later. Process isolation, zero-on-free XMALLOC/XFREE hooks, and WC_DILITHIUM_CACHE_PRIV_VECTORS are partial workarounds only; none is a substitute for the patch.
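As an illustration of the zero-on-free workaround, here is a minimal sketch of allocator wrappers that remember each allocation's size in a header and clear the payload before releasing it. The zof_* names are hypothetical. wolfSSL exposes wolfSSL_SetAllocators() for registering custom allocation callbacks, but the callback signatures depend on build options, so these wrappers would need adapting before being wired in.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Header placed in front of every payload so zof_free() can learn its size.
 * The union pads the header so the payload keeps maximum alignment. */
typedef union {
    size_t size;
    max_align_t align;
} zof_header;

/* memset reached through a volatile function pointer, so the compiler
 * cannot prove the call is a dead store and remove it. */
static void *(*const volatile zof_memset)(void *, int, size_t) = memset;

void *zof_malloc(size_t size)
{
    zof_header *h;

    if (size > SIZE_MAX - sizeof(zof_header)) return NULL;
    h = malloc(sizeof(zof_header) + size);
    if (h == NULL) return NULL;
    h->size = size;
    return h + 1;
}

void zof_free(void *ptr)
{
    zof_header *h;

    if (ptr == NULL) return;
    h = (zof_header *)ptr - 1;
    zof_memset(ptr, 0, h->size);  /* clear the payload before release */
    free(h);
}

/* Always moves, so the old copy is cleared by zof_free(). */
void *zof_realloc(void *ptr, size_t size)
{
    zof_header *h;
    void *fresh;

    if (ptr == NULL) return zof_malloc(size);
    h = (zof_header *)ptr - 1;
    fresh = zof_malloc(size);
    if (fresh == NULL) return NULL;
    memcpy(fresh, ptr, h->size < size ? h->size : size);
    zof_free(ptr);
    return fresh;
}
```

This covers only allocations routed through the hooks and costs one header per allocation, which is part of why it is a workaround and not a fix.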
Timeline
| Date | Event |
|---|---|
| 2026-03-28 | Reported to wolfSSL with forgery PoC |
| 2026-03-30 | Confirmed and patched: PRs #10100 and #10113 |
| 2026-04-02 | CVE declined; ticket closed |
| 2026-04-13 | Public disclosure; posted to oss-security 2026-04-14 |
| 2026-04-17 | Follow-up post: core-dump and cross-process /proc/$pid/mem recovery |
References
- FIPS 204 (ML-DSA) – Section 3.6.3
- wolfSSL PR #10100 – stack + seedMu fix
- wolfSSL PR #10113 – heap block + seed fix
- oss-security post (2026-04-14) – public advisory thread
- CWE-226 – Sensitive Information in Resource Not Removed Before Reuse
- CWE-244 – Improper Clearing of Heap Memory Before Release