Why Migrate from OpenAI?
OpenAI makes great models. But they also charge a premium, may use your data to improve their models unless you opt out, and can change pricing or deprecate models at any time. Here are three concrete reasons to consider migrating:
1. Cost: 2-10x Cheaper
OpenAI's GPT-4o costs $2.50 per million input tokens and $10.00 per million output tokens. VoltageGPU's DeepSeek R1 — which matches or exceeds GPT-4o on most benchmarks — costs $0.50 per million input tokens and $2.00 per million output tokens. That is a 5x cost reduction with equivalent quality.
For lighter workloads, Qwen3 32B at $0.15/$0.60 per million tokens competes directly with GPT-4.1-mini at $0.40/$1.60. That is a 2.7x saving.
2. Privacy: Your Data Stays Yours
By default, OpenAI may use your API data to improve their models (unless you opt out via their data usage policy). With VoltageGPU, your data is processed by open-source models running on decentralized infrastructure. We do not train on your data. Period. For maximum security, enable confidential compute with Intel TDX.
3. Freedom: Open-Source Models
OpenAI can deprecate models, change pricing, or add restrictions at any time. Open-source models like DeepSeek R1, Qwen3, Llama 3.1, and Mistral are permanently available. You can even fine-tune them, run them on your own hardware, or switch providers — no lock-in.
Step 1: Get Your VoltageGPU API Key
Sign up at voltagegpu.com. You get $5 free credit immediately — no credit card required. That is enough for roughly 10 million tokens with DeepSeek R1.
Go to Dashboard → Settings → API Keys. Click "Create New Key". Copy it — it starts with vgpu-.
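If you want to sanity-check the 10-million-token figure, it falls out of the input-token price alone ($5 at DeepSeek R1's $0.50 per million input tokens; output tokens cost more, so a mixed workload gets somewhat fewer):

```python
# Rough free-credit estimate at DeepSeek R1's input price
free_credit = 5.00                 # dollars
price_per_million_input = 0.50     # dollars per million input tokens

tokens = free_credit / price_per_million_input * 1_000_000
print(f"{tokens:,.0f} input tokens")  # 10,000,000 input tokens
```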
Step 2: Change One Line of Code
VoltageGPU's API is 100% OpenAI-compatible. You use the same OpenAI SDK. The only change is the base_url.
Before (OpenAI)
from openai import OpenAI

client = OpenAI(
    api_key="sk-xxx"  # OpenAI key
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}]
)
After (VoltageGPU)
from openai import OpenAI

client = OpenAI(
    base_url="https://api.voltagegpu.com/v1",  # Changed!
    api_key="vgpu-xxx"  # VoltageGPU key
)

response = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1",  # Open-source model
    messages=[{"role": "user", "content": "Hello!"}]
)
That is it. Two changes: base_url and api_key. Everything else stays identical: the SDK, the method calls, and the response format.
Step 3: Pick Your Model
Here is a mapping from OpenAI models to VoltageGPU equivalents, based on the comparisons above:
- gpt-4o → deepseek-ai/DeepSeek-R1 (5x cheaper, comparable quality)
- gpt-4.1-mini → Qwen/Qwen3-32B (2.7x cheaper)
- gpt-3.5-turbo → Mistral 7B (up to 10x cheaper)
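In code, the mapping can live in a single dictionary so the model name is the only other thing you change. The two VoltageGPU IDs below are the ones used in this guide's examples; anything not listed should fail loudly rather than silently hit the wrong model:

```python
# OpenAI model name -> VoltageGPU equivalent (from the comparisons in this guide)
MODEL_MAP = {
    "gpt-4o": "deepseek-ai/DeepSeek-R1",
    "gpt-4.1-mini": "Qwen/Qwen3-32B",
}

def translate_model(openai_model: str) -> str:
    """Return the VoltageGPU model ID, or raise if there is no mapping yet."""
    try:
        return MODEL_MAP[openai_model]
    except KeyError:
        raise ValueError(f"No VoltageGPU mapping for {openai_model!r}") from None

print(translate_model("gpt-4o"))  # deepseek-ai/DeepSeek-R1
```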
Code Examples
JavaScript / TypeScript
import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'https://api.voltagegpu.com/v1',
  apiKey: 'vgpu-xxx',
});

const completion = await client.chat.completions.create({
  model: 'Qwen/Qwen3-32B',
  messages: [{ role: 'user', content: 'Explain quantum computing.' }],
  stream: true,
  max_tokens: 2048,
});

for await (const chunk of completion) {
  process.stdout.write(chunk.choices[0]?.delta?.content || '');
}
cURL
curl -X POST https://api.voltagegpu.com/v1/chat/completions \
  -H "Authorization: Bearer vgpu-xxx" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-ai/DeepSeek-R1",
    "messages": [
      {"role": "user", "content": "Write a Python quicksort."}
    ],
    "stream": true,
    "max_tokens": 1024
  }'
Full Feature Compatibility
Streaming
Server-Sent Events (SSE) streaming works identically to OpenAI. Set stream=True (Python) or stream: true (JavaScript) and iterate over the chunks:
from openai import OpenAI

client = OpenAI(
    base_url="https://api.voltagegpu.com/v1",
    api_key="vgpu-xxx"
)

# Streaming works exactly like OpenAI
stream = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1",
    messages=[{"role": "user", "content": "Write a haiku about GPUs."}],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
Function Calling / Tool Use
Function calling is supported on Qwen3 and Llama 3.1 models. The API format is identical to OpenAI:
from openai import OpenAI

client = OpenAI(
    base_url="https://api.voltagegpu.com/v1",
    api_key="vgpu-xxx"
)

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get weather for a location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string"},
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
            },
            "required": ["location"]
        }
    }
}]

response = client.chat.completions.create(
    model="Qwen/Qwen3-32B",
    messages=[{"role": "user", "content": "What is the weather in Paris?"}],
    tools=tools,
    tool_choice="auto"
)

print(response.choices[0].message.tool_calls)
Other Supported Features
- JSON mode: response_format: { type: "json_object" } (works on DeepSeek R1, Qwen3, Llama 3.1)
- Embeddings: /v1/embeddings endpoint with BGE-M3 and other models
- Batch API: submit bulk requests for a 50% discount on token pricing
- Temperature, top_p, frequency_penalty: all sampling parameters supported
- System messages: full support for system/user/assistant message roles
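JSON mode guarantees the reply content is a parseable JSON string; you still parse it yourself. A minimal sketch of that flow (the API call is commented out since it needs a live key, so the response string below is simulated, not real model output):

```python
import json

# response = client.chat.completions.create(
#     model="deepseek-ai/DeepSeek-R1",
#     messages=[{"role": "user", "content": "List 2 GPUs as a JSON object."}],
#     response_format={"type": "json_object"},
# )
# raw = response.choices[0].message.content

raw = '{"gpus": ["H100", "A100"]}'  # simulated model output
data = json.loads(raw)              # JSON mode makes this parse reliable
print(data["gpus"])  # ['H100', 'A100']
```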
Cost Calculator
Savings scale linearly with spend: at a 5x cost reduction, every $1,000 of monthly GPT-4o spend drops to about $200 on VoltageGPU.
These estimates assume migrating GPT-4o workloads to DeepSeek R1 (5x cost reduction). If you use GPT-3.5-turbo and migrate to Mistral 7B, savings are even larger (up to 10x).
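The 5x figure is easy to verify from the per-token prices quoted earlier (GPT-4o at $2.50/$10.00 versus DeepSeek R1 at $0.50/$2.00 per million input/output tokens). The 100M/20M workload below is an illustrative example, not a benchmark:

```python
def monthly_cost(millions_in: float, millions_out: float,
                 price_in: float, price_out: float) -> float:
    """Dollar cost for a month's traffic; prices are per million tokens."""
    return millions_in * price_in + millions_out * price_out

# Example workload: 100M input + 20M output tokens per month
openai_bill = monthly_cost(100, 20, 2.50, 10.00)   # GPT-4o
voltage_bill = monthly_cost(100, 20, 0.50, 2.00)   # DeepSeek R1

print(openai_bill, voltage_bill, openai_bill / voltage_bill)  # 450.0 90.0 5.0
```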
Migration Checklist
- Create VoltageGPU account and get API key (2 minutes)
- Update base_url in your code (1 minute)
- Update the model name to the VoltageGPU equivalent (1 minute)
- Test with a few requests to verify output quality (5 minutes)
- Run both in parallel for a day to compare (optional but recommended)
- Switch fully and enjoy 2-10x lower costs
Start Saving Today
$5 free credit. Same API as OpenAI. 2-10x cheaper. Migration takes 5 minutes.
Get API Key