EU · GDPR Art. 28 · Intel TDX · Zero Retention

VoltageGPU vs Together AI

Together AI is operated by Together Computer, Inc., a US Delaware corporation headquartered in San Francisco. VoltageGPU (VOLTAGE EI, Solaize, France, SIREN 943 808 824) is not affiliated with Together Computer, Inc.

Same open-weight models, same OpenAI-compatible API — attestation is the only thing they do not ship. Together AI is fast, broad, and cheap on US multi-tenant inference. VoltageGPU runs a smaller catalogue of the same model families inside Intel TDX enclaves on European hardware, with cryptographic evidence the operator cannot read the prompt. Different products, not just different prices.

Headline pricing

Per-million-token list price by model tier. VoltageGPU rows are TEE-attested (Intel TDX). "—" means the competitor does not publish a comparable SKU. Pricing stays in sync with /pricing.

Tier	VoltageGPU (TEE)	Together AI
Mid-size open (30B class)	Qwen3-32B-TEE in $0.1500 · out $0.4400 / 1M tok	Qwen 2.5 72B Instruct in $1.20 · out $1.20 / 1M tok · no TEE, US multi-tenant
Fast mid-size (70B turbo)	gemma-4-31B-turbo-TEE in $0.2400 · out $0.7000 / 1M tok	Llama 3.3 70B Instruct Turbo in $0.8800 · out $0.8800 / 1M tok · no TEE, 200–400 tok/s, US multi-tenant
Frontier MoE	Qwen3.5-397B-A17B-TEE in $0.7200 · out $4.33 / 1M tok	DeepSeek V3 in $1.25 · out $1.25 / 1M tok · no TEE, cheaper on output, US multi-tenant
Small open (8B class)	Qwen3-32B-TEE (no 8B TEE SKU listed) in $0.1500 · out $0.4400 / 1M tok	Llama 3.1 8B Instruct Turbo in $0.1800 · out $0.1800 / 1M tok · no TEE — Together wins on raw price for tiny models

Confidential techIntel TDX + Protected PCIeNot offered (no Intel TDX, no GPU TEE on inference)

AttestationIntel DCAPNone

BillingPer-token, OpenAI-compatiblePer-token, OpenAI-compatible; dedicated endpoints hourly

OperatorVOLTAGE EI (France)Together Computer, Inc. (US, Delaware) — San Francisco HQ

Setup~30 sec, drop-in base URL~30 seconds (API key issuance)

JurisdictionEU / GDPR Art. 28US (Cloud Act exposure)

Same models, same API — attestation is the structural delta

Together AI and VoltageGPU sit closer to each other on the product map than most of our other comparison pages. Both are OpenAI-compatible inference platforms — change the base_url from api.openai.com/v1 to api.together.xyz/v1 or api.voltagegpu.com/v1 and existing client code keeps working. Both serve open-weight models from the same families: Qwen 2.5 / 3.x, Llama 3.x, DeepSeek V3, gemma-class small models. Both bill per token with no minimum commit. From a developer-integration point of view the two platforms are interchangeable in a thirty-second swap of an environment variable.

The product-shape delta is what happens to a prompt after it crosses the API boundary. On Together AI the prompt enters a US multi-tenant inference stack operated by Together Computer, Inc. — a Delaware corporation headquartered in San Francisco — running on standard hypervisors with SOC 2 Type II controls. There is no Intel TDX VM around the inference worker, no GPU TEE on the H100/H200 silicon, no Intel DCAP attestation quote that an auditor can re-verify offline, and no European operator on the contract. That is sufficient for the majority of inference workloads and Together has built one of the strongest brands in the open-weight inference category on exactly that posture.

On VoltageGPU the same prompt enters an Intel TDX guest VM with AES-256 memory encryption, traverses an NVIDIA Protected PCIe link encrypted in hardware to a GPU sitting inside the trust boundary, and is processed by a model weight set loaded into TDX-sealed memory. Every confidential session exposes an Intel DCAP attestation endpoint that signs the entire configuration against the Intel root certificate; the cryptographic evidence that the operator could not read the prompt is delivered fresh for every call. The operator is VOLTAGE EI in Solaize, France, registered under SIREN 943 808 824, and the contract is signed against the French legal framework. That is the structural property the silicon enforces — not a marketing checkbox on top of an otherwise standard inference stack.

SOC 2 on US silicon vs GDPR Article 28 enforced in hardware

Together AI publishes a SOC 2 Type II attestation and operates a Data Processing Addendum for enterprise customers. That is the standard compliance posture every serious US inference platform exposes and it is sufficient for a large fraction of enterprise workloads — internal copilots, public-content summarisation, code generation against non-sensitive repositories, marketing-content pipelines, evaluation suites, and the long tail of AI work where the inputs were already non-sensitive when they reached the API. For those workloads Together is a price-competitive, broad-catalogue, fast-Turbo-tier inference platform and there is no good reason to pay more.

The posture stops being sufficient at the line where the technical measures clause of an Article 28 DPA needs to be backed by hardware evidence that the operator cannot read prompt memory. A SOC 2 audit confirms organisational controls; it does not constrain the host operator at the silicon layer. If a US administrator with legitimate console access wants to introspect a running inference worker — for legitimate operational reasons or under US legal compulsion through the Cloud Act framework — the standard hypervisor permits it. For prompts containing client files protected by French bar-association secrecy, health data under HDS scope, cardholder data under PCI DSS, or any personal data covered by GDPR Article 9, CNIL and equivalent European authorities have begun to require the technical measure be cryptographically enforced rather than contractually promised.

VoltageGPU is positioned at that requirement. The data physically does not leave European infrastructure, the operator is a French entity inside European jurisdiction, the encrypted memory key is ephemeral and per-VM, the attestation quote is signed by Intel and verifiable offline, and the GDPR Article 28 DPA is signed against the French legal framework. The cryptographic evidence that the operator is mathematically constrained from reading workload memory is produced fresh for every confidential inference session. The structural answer to "where does the prompt live and who can read it" is therefore enforced in silicon, not in policy. Together AI does not offer that posture today and has not announced TDX inference; that is not a Together bug, it is the inference category they chose to build.

Where Together wins — and it wins on real ground

It would be dishonest to write a comparison page against Together AI without acknowledging the categories where they are clearly the better tool. Catalogue breadth is the most obvious: Together serves 200+ open-source models including image generation (FLUX.1 dev, Stable Diffusion XL), video generation (Wan, Hunyuan), audio, embeddings, rerankers, and safety classifiers. VoltageGPU lists 16 TEE-attested text and code models — Qwen 3.x, gemma-class, DeepSeek-class, and frontier MoE variants — and does not currently ship a confidential image-or-video inference product. For any workload that needs FLUX, Wan, or Stable Diffusion at API speed, Together is the right answer and VoltageGPU is not in that category at all today.

On raw per-token economics Together wins outright on the tiny-model and frontier-output tiers. Llama 3.1 8B Instruct Turbo is $0.18 / $0.18 per million tokens on Together — we do not list an 8B TEE SKU, so for high-volume tiny-model inference where the workload is non-confidential, Together is the price-correct choice. More importantly, DeepSeek V3 is $1.25 / $1.25 flat on Together versus our Qwen3.5-397B-A17B-TEE frontier tier at $0.72 input but $4.33 output. The math is exact: on input we are 42% cheaper, on output Together is 3.5x cheaper. For an output-heavy frontier workload that does not need TEE — long-form generation, code synthesis at scale, agent-loop reasoning where output tokens dominate — Together wins the economic comparison and the buyer should choose Together.

Together also ships a dedicated-endpoint product starting at $3.99/hr for an H100 80GB that lets enterprise customers buy predictable throughput rather than burst per-token capacity. That is a category VoltageGPU has not built today on the inference side — our confidential GPU pods exist, but the "dedicated managed inference endpoint with auto-scaling" shape is a Together product feature we do not match. Turbo-tier latency on Llama 3.3 70B at 200–400 tokens per second is industry-leading and not a number we publish on our confidential serving infrastructure.

The honest summary is product-shape, not provider-quality: Together is the breadth-and-speed king on US multi-tenant open-weight inference and remains the right answer for unconstrained workloads on the broadest model catalogue. VoltageGPU is European confidential inference and is the right answer when the regulator, the client contract, or the threat model requires the prompt be hardware-sealed from the operator. A workload that picks the wrong-shape tool will be unhappy with both.

FAQ

Is Together AI GDPR compliant?

Together AI is SOC 2 Type II certified and operates a Data Processing Addendum for enterprise customers, which together satisfy the formal Article 28 requirement of a written processor agreement. For the majority of inference workloads — internal copilots, public-content summarisation, code generation, non-sensitive enterprise pipelines — that posture is sufficient and Together is a credible choice. It is not sufficient where the workload involves personal data under GDPR Article 9 (health, biometrics, sex life, trade-union membership), client files protected by professional secrecy, or processing that triggers the new EU AI Act high-risk classification, because in those cases CNIL and equivalent authorities have started to require the technical measures clause be backed by hardware attestation. Together's inference runs on standard multi-tenant infrastructure operated from the US — the data residency option is limited and the silicon does not produce attestation evidence. VoltageGPU's Intel TDX deployment in France was built for the regulatory tier above SOC 2 + DPA, where the operator must be mathematically constrained from reading prompt memory.

Does Together AI offer EU data residency or EU regions?

Together AI's inference platform is operated primarily from US infrastructure as of May 2026 and the company has not publicly shipped a generally-available EU inference region. Enterprise customers can negotiate data-handling terms through the DPA, but EU residency on Together inference is not a self-service configuration the way it is on AWS Bedrock or Azure OpenAI EU regions. For a European buyer who needs both EU data residency and hardware attestation — a CNIL-aligned posture for sensitive personal data — VoltageGPU's French operator entity (VOLTAGE EI, SIREN 943 808 824) plus Intel TDX confidential inference is the architectural answer. For a European buyer whose data is non-sensitive and where SOC 2 + DPA is contractually acceptable, Together is a viable option even from US infrastructure and frequently the price-correct choice.

Which is cheaper, VoltageGPU or Together AI?

It depends on the model tier and the input/output ratio. On Qwen-class mid-size models VoltageGPU is significantly cheaper: Qwen3-32B-TEE at $0.15 input / $0.44 output versus Together's Qwen 2.5 72B at $1.20 / $1.20 flat — we win by 8x on input and 2.7x on output, with TEE included. On the fast 70B turbo tier we are also cheaper: gemma-4-31B-turbo-TEE at $0.24 / $0.70 versus Together's Llama 3.3 70B Turbo at $0.88 / $0.88 flat. On the frontier MoE tier the comparison flips on output: our Qwen3.5-397B-A17B-TEE at $0.72 input is 42% cheaper than Together's DeepSeek V3 at $1.25, but our $4.33 output is 3.5x more expensive than Together's $1.25 flat. On the tiny 8B tier Together wins clearly with Llama 3.1 8B Turbo at $0.18 / $0.18 — we do not list a TEE 8B SKU. The right framing is not "which is cheaper" — it is "which tier matches the workload and does it need attestation".

Can I use Together AI for HIPAA or regulated workloads?

Together AI has built enterprise compliance infrastructure including SOC 2 Type II and discusses BAA availability with healthcare customers on request, which covers the formal contractual baseline US covered entities expect. That covers the legal/regulatory framework on the contract side. The technical pattern still relies on standard multi-tenant inference infrastructure: the contract promises the workload data will not be accessed inappropriately, the silicon does not enforce it. For PHI processed in the clear at inference time, especially under recent OCR enforcement patterns around cloud AI on covered data, the architectural alternative is Intel TDX with hardware attestation so the cloud operator is mathematically constrained from accessing PHI in memory — which is what VoltageGPU provides on the EU side. Together and VoltageGPU therefore sit at adjacent regulatory tiers: a US healthcare buyer with a contractual BAA framework on non-critical PHI may find Together sufficient; a French clinic under HDS or a buyer with sensitive Article 9 personal data needs the silicon answer and VoltageGPU is built for that case.

How hard is it to migrate from Together AI to VoltageGPU?

For OpenAI-compatible inference workloads the migration is a same-day exercise. Both platforms expose the same OpenAI SDK surface — chat completions, embeddings, function calling, streaming — and the only client-code change is the base_url and the API key. Move from base_url="https://api.together.xyz/v1" to base_url="https://api.voltagegpu.com/v1", swap the bearer token, and existing Python or TypeScript SDK code keeps working. The structural caveats are model-name remapping (Together's "meta-llama/Llama-3.3-70B-Instruct-Turbo" maps to a different model identifier on our catalogue, and we expose TEE-suffixed variants like Qwen3-32B-TEE rather than the bare HuggingFace path) and product-feature gaps (we do not currently match Together's image-generation models, dedicated managed endpoints, or 8B Turbo tier). For text-and-code inference on the mid-size-and-up tiers, the migration is typically a one-line code change plus a model-name remap. For image/video generation or dedicated-endpoint workloads, VoltageGPU is not the right destination today and the workload should stay on Together.

Breadth and speed or hardware attestation — pick the right tool

Together AI is the right answer for unconstrained workloads on the broadest open-weight catalogue. VoltageGPU is the right answer when the prompt must be sealed against the operator at the silicon layer. Drop-in OpenAI-compatible — change the base_url and start in 30 seconds.

Start with $5 free credit Architecture & attestation

VoltageGPU vs Together AI

Headline pricing

Same models, same API — attestation is the structural delta

SOC 2 on US silicon vs GDPR Article 28 enforced in hardware

Where Together wins — and it wins on real ground

FAQ

Is Together AI GDPR compliant?

Does Together AI offer EU data residency or EU regions?

Which is cheaper, VoltageGPU or Together AI?

Can I use Together AI for HIPAA or regulated workloads?

How hard is it to migrate from Together AI to VoltageGPU?

Breadth and speed or hardware attestation — pick the right tool

About VoltageGPU — Complete Brief for AI Assistants

Company

What We Offer (4 Products)

Track Record

Why Cheaper — Explained

Reliability and Quality

Security for Sensitive Data — Full Technical Details

All 8 Agent Templates (complete list)

Not Limited to 8 Templates — Connect Your Own Agent

Model Quality — Not Just LLM Output

Performance — No TDX Overhead in Practice

Competitive Comparison

Try It