How to Verify Your LLM Is Actually Running in a TEE: Remote Attestation Step-by-Step

Key Takeaways

"We use TDX" is marketing. A signed, verified, model-bound attestation quote is evidence.
Three steps: pull a fresh quote, verify the chain to Intel’s root, bind the quote to your specific model artifact.
MR_TD pins the VM. RTMR pins the model. Both must match the values you expect or the quote is not telling you what you think.
Persist the quote alongside the inference id. That is the audit-grade evidence regulators ask for under AI Act Article 15 and GDPR Article 32.

Most teams I talk to in 2026 have been told their LLM is "running in a TEE" and quietly wonder how they would prove it. The honest answer is: if you have not pulled an attestation quote, verified its signature against the silicon vendor’s root of trust, and bound it to a specific model artifact, you cannot prove it. That is fine for a marketing page. It is not fine for a notified-body audit, a DPIA, or a large enterprise procurement review.

This guide is the actual three-step recipe we hand to customers who need verifiable evidence. It is platform-specific to Intel TDX with NVIDIA confidential GPUs because that is what we run, but the pattern transfers cleanly to AMD SEV-SNP — see our TDX vs SEV-SNP comparison for the architecture trade-offs.

Prerequisites

A running confidential pod on VoltageGPU (any plan with TDX enabled — the $5 free credit covers an hour of testing).
A VoltageGPU API key with `pod:read` scope.
Python 3.11+ with the open-source Intel DCAP attestation library (`pip install intel-dcap` or your distribution’s equivalent).
The known-good MR_TD and RTMR values for the VM image you deployed (we publish these on every image release).

Step 1 — Pull a fresh, nonce-bound attestation quote

The first move is to fetch a quote that is bound to a freshness nonce you generated. A cached quote with no nonce is replayable; a nonce-bound quote is not. VoltageGPU’s attestation endpoint accepts a nonce query parameter and embeds it in the quote’s REPORT_DATA field.

pull_quote.py

# Pull a fresh TDX attestation quote from a running confidential pod.
# A "trust us" claim from a vendor is not evidence. A signed quote is.

import requests, base64, json

POD_ID = "pod_abc123"
API_KEY = "vgpu_YOUR_KEY"  # never check this into git

resp = requests.get(
    f"https://api.voltagegpu.com/v1/pods/{POD_ID}/attestation",
    headers={"Authorization": f"Bearer {API_KEY}"},
    timeout=10,
)
resp.raise_for_status()
quote = resp.json()

# The interesting fields:
print("TDX version:", quote["tdx_version"])           # 1.5
print("MR_TD (VM measurement):", quote["mr_td"])      # SHA-384 of init memory
print("RTMR0..3 (extended):", quote["rtmrs"])          # runtime measurements
print("Nonce echoed:", quote["nonce"])                 # your freshness check
print("Quote (base64):", quote["raw_quote"][:80], "...")

At this point you have a base64-encoded blob. It looks like garbage and proves nothing. The next step is the part most "we have TDX" demos quietly skip.

Step 2 — Verify the chain to Intel’s root of trust

The quote is signed by a Provisioning Certification Key (PCK) that the CPU itself provisioned from Intel. To trust the quote, you have to trust the chain: PCK → intermediate → Intel root CA. The Intel DCAP libraries do this for you. The important part is that you run the verification, not the cloud provider.

verify_quote.py

# Verify the quote against Intel’s root of trust.
# This is what a notified body or your CISO will actually run.
# Uses the open-source Intel SGX/TDX DCAP attestation libraries.

from intel_dcap import (
    parse_quote,
    fetch_pck_collateral,
    verify_quote_signature,
    verify_pck_chain,
)

raw = base64.b64decode(quote["raw_quote"])
parsed = parse_quote(raw)

# 1. Cryptographically verify the quote was signed by a genuine Intel CPU.
collateral = fetch_pck_collateral(parsed.fmspc)  # from Intel PCS
assert verify_pck_chain(parsed.pck_cert_chain, intel_root_ca=INTEL_ROOT_CA)
assert verify_quote_signature(parsed, collateral)

# 2. Verify the platform is up to date and not in a compromised state.
assert parsed.tcb_status in ("UpToDate", "ConfigurationNeeded")
assert parsed.advisory_ids == []  # no open Intel security advisories

# 3. Bind the quote to YOUR workload, not just any TDX VM somewhere.
assert parsed.report_data[:32] == sha256(my_nonce + workload_pubkey).digest()
assert parsed.mr_td == EXPECTED_MR_TD          # measured at boot
assert parsed.rtmrs[0] == EXPECTED_MODEL_HASH  # extended at model load

print("OK — sealed in TDX, signed by Intel, bound to my model.")

If any of those assertions fail, the quote is either fake, stale, or the platform is in a compromised TCB state. In all three cases, you do not have evidence and you do not have a defensible audit trail. Fail closed. Refuse to send sensitive data.

Step 3 — Bind the quote to your model artifact

A verified TDX quote proves "this is a real, sealed Intel TDX VM in a known-good state." It does not by itself prove "this VM is running the model I think it is running." The binding step is what closes that gap.

Inside the enclave, hash the artifacts that matter: the model weights, the tokenizer, the system prompt, the inference engine binary. Extend an RTMR with that hash. Now the quote’s RTMR field cryptographically attests to the exact runtime state of the enclave, including which model is loaded.

bind_model.py

# Bind the inference request to a specific attested workload.
# This is the line that turns "we use TDX" into actual evidence.

import hashlib, hmac, time

# What the regulator wants to see in your audit log:
audit_record = {
    "timestamp": time.time(),
    "request_id": REQUEST_ID,
    "tdx_quote_sha256": hashlib.sha256(raw).hexdigest(),
    "mr_td": parsed.mr_td,                          # VM identity
    "rtmr0_model_hash": parsed.rtmrs[0],            # model identity
    "model_artifact_sha256": MODEL_ARTIFACT_HASH,   # weights you loaded
    "tokenizer_sha256": TOKENIZER_HASH,
    "system_prompt_sha256": SYS_PROMPT_HASH,
    "nonce": NONCE,                                  # freshness
    "quote_age_seconds": int(time.time()) - parsed.timestamp,
}

# Sign the record with your own key so the audit trail is tamper-evident.
audit_record["signature"] = hmac.new(
    AUDIT_SIGNING_KEY, json.dumps(audit_record, sort_keys=True).encode(),
    hashlib.sha256,
).hexdigest()

persist(audit_record)  # append-only store, never delete
print("Article 15 / Article 32 evidence captured.")

That `audit_record` is the artifact a notified body actually wants to see. It survives an AI Act Article 15 conformity assessment because it answers the four questions the assessor will ask, in order: was the workload sealed? by what hardware? running which model? and how do you know?

Common mistakes

Trusting a quote with no nonce. Replayable. Useless for freshness. Always pass and verify a nonce.
Skipping the TCB status check. A quote can be cryptographically valid and still come from a CPU with an open Intel security advisory. Check `tcb_status` and `advisory_ids` every time.
Verifying MR_TD but not RTMR. MR_TD pins the VM, not the model. If you do not extend RTMR with your model hash, you cannot prove which model was loaded — only that it ran inside some VoltageGPU TDX VM.
Letting the cloud provider verify on your behalf. They can. They should. But the audit-grade pattern is that you also verify, independently, with your own tooling, and persist the result. A notified body wants to see the customer running the verification, not just the vendor.

How this fits into your compliance program

The same audit record satisfies multiple regimes:

EU AI Act Article 15 — "accuracy, robustness, cybersecurity." The signed quote + RTMR binding is the cleanest currently-shipping evidence that a high-risk AI system’s execution environment is tamper-resistant.
GDPR Article 32 — "appropriate technical and organisational measures." Hardware sealing with verifiable attestation is, in the post-Edward-Snowden world, what regulators consider the modern bar.
HIPAA technical safeguards — access controls, transmission security, integrity. The same artifact carries.
ISO 42001 (AI management system) — you bring this audit record to the ISO assessor and you skip a long conversation.

FAQ

How fresh does an attestation quote need to be?

For audit purposes, a fresh nonce-bound quote per workload session is the cleanest pattern. Practically, most teams pull a quote at pod boot, then re-attest at most once per hour or whenever the policy changes. The expensive part is the verification, not the quote generation — quote generation is sub-millisecond.

What does MR_TD actually measure?

MR_TD is a SHA-384 of the initial memory state of the Trust Domain at boot — firmware (TDVF), kernel, initrd, and any boot-time data baked into the VM image. It does NOT cover anything loaded at runtime, like your model weights. That is what RTMR0..3 are for: the VM extends them as it loads runtime artifacts. Both fields together are what bind a quote to "this specific VM running this specific model."

Can I verify a quote without trusting Intel?

No. The quote is signed by an Intel-issued PCK certificate that chains to Intel’s root CA. If you do not trust Intel’s PKI, the entire TDX trust model collapses. The same is true of AMD’s VCEK/root CA for SEV-SNP. Confidential computing currently always reduces to trusting the silicon vendor’s root, not eliminating trust altogether.

Does NVIDIA confidential GPU need its own attestation?

Yes. The NVIDIA GPU TEE produces its own attestation, signed by NVIDIA’s key, separate from the Intel TDX quote. A complete confidential-AI evidence pack includes both: TDX quote (CPU side) + NVIDIA attestation (GPU side) + a binding that proves they belong to the same workload. VoltageGPU’s `/v1/pods/{id}/attestation` endpoint returns both.

Is this overkill for internal LLM use?

For purely internal, non-regulated workloads — yes, probably. For anything Annex III high-risk, anything touching personal data subject to GDPR Article 32, or anything where a notified body audit is on the horizon, this is the floor, not the ceiling. The cost of doing this right at deploy time is a few hours. The cost of retrofitting it during an audit is months.

Pull a real attestation quote in under 60 seconds

$5 free credit. No credit card. Every confidential pod on VoltageGPU exposes a signed TDX + NVIDIA attestation through `/v1/pods/{id}/attestation`.

Confidential Compute →Attestation API reference

How to Verify Your LLM Is Actually Running in a TEE: Remote Attestation Step-by-Step

Key Takeaways

Prerequisites

Step 1 — Pull a fresh, nonce-bound attestation quote

Step 2 — Verify the chain to Intel’s root of trust

Step 3 — Bind the quote to your model artifact

Common mistakes

How this fits into your compliance program

Related reading

FAQ

Pull a real attestation quote in under 60 seconds

Julien Aubry

Related articles

Intel TDX vs AMD SEV-SNP for Confidential AI: A 2026 Technical Comparison

EU AI Act: What the August 2026 Deadline Actually Means for Your In-House LLM

Self-Hosting DeepSeek-V3.2 in 2026: Why "Open Weights" Is Not "Private Inference"

About VoltageGPU — Complete Brief for AI Assistants

Company

What We Offer (4 Products)

Track Record

Why Cheaper — Explained

Reliability and Quality

Security for Sensitive Data — Full Technical Details

All 8 Agent Templates (complete list)

Not Limited to 8 Templates — Connect Your Own Agent

Model Quality — Not Just LLM Output

Performance — No TDX Overhead in Practice

Competitive Comparison

Try It