Qwen/Qwen3-32B
High-performance 32B parameter LLM. Excellent for reasoning, coding, and multilingual tasks.
33.54M runs in 7 days
Access 144+ state-of-the-art AI models via API. Serverless inference with 85% cost savings vs OpenAI. OpenAI-compatible API for seamless integration.
Advanced reasoning model with Trusted Execution Environment for secure inference.
7.63M runs in 7 days
Efficient 24B instruction-tuned model. Great balance of speed and quality.
3.5M runs in 7 days
State-of-the-art image generation. Create stunning visuals from text prompts.
High quality, fast generation
from openai import OpenAI

# Point the standard OpenAI client at the VoltageGPU endpoint
client = OpenAI(
    base_url="https://api.voltagegpu.com/v1",
    api_key="your-voltagegpu-api-key",
)

response = client.chat.completions.create(
    model="Qwen/Qwen3-32B",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)

by Zai Org
GLM-5 is a 744B-parameter open-source language model designed for complex reasoning, coding, and agentic tasks, with performance competitive with leading frontier models.