Language Model · Alibaba Cloud · Hot · Open Source · Multilingual

Qwen 2.5 72B API

Alibaba's flagship 72B model excelling at multilingual tasks, coding, and mathematics.

Parameters

72B

Context

131,072 tokens

Organization

Alibaba Cloud

Pricing

$0.40

per 1M input tokens


$0.40

per 1M output tokens


Quick Start

Start using Qwen 2.5 72B in minutes. VoltageGPU provides an OpenAI-compatible API, so existing OpenAI SDK code only needs a different base_url and API key.

Python (OpenAI SDK)
# Install the SDK first (run in a shell): pip install openai
from openai import OpenAI

client = OpenAI(
    base_url="https://api.voltagegpu.com/v1",
    api_key="YOUR_VOLTAGE_API_KEY"
)

response = client.chat.completions.create(
    model="Qwen/Qwen2.5-72B-Instruct",
    messages=[
        {"role": "system", "content": "You are a multilingual assistant. Respond in the same language as the user."},
        {"role": "user", "content": "Explain quantum computing in simple terms."}
    ],
    max_tokens=2048,
    temperature=0.7
)

print(response.choices[0].message.content)
cURL
curl -X POST https://api.voltagegpu.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_VOLTAGE_API_KEY" \
  -d '{
    "model": "Qwen/Qwen2.5-72B-Instruct",
    "messages": [
      {"role": "system", "content": "You are a multilingual assistant."},
      {"role": "user", "content": "Explain quantum computing in simple terms."}
    ],
    "max_tokens": 2048,
    "temperature": 0.7
  }'

Pricing

Component       Price   Unit
Input tokens    $0.40   per 1M tokens
Output tokens   $0.40   per 1M tokens

New accounts receive $5 free credit. No credit card required to start.
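The per-token rates above translate into a per-request cost with simple arithmetic. The sketch below is illustrative only: `estimate_cost` is a hypothetical helper, not part of any SDK, and it assumes billing is strictly proportional to token counts at the rates listed on this page.

```python
from decimal import Decimal

# Prices quoted on this page, in USD per 1M tokens.
INPUT_PRICE_PER_M = Decimal("0.40")
OUTPUT_PRICE_PER_M = Decimal("0.40")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        Decimal(input_tokens) * INPUT_PRICE_PER_M
        + Decimal(output_tokens) * OUTPUT_PRICE_PER_M
    )
    return total / TOKENS_PER_M


if __name__ == "__main__":
    # A 100,000-token prompt with a 2,048-token completion.
    print(estimate_cost(100_000, 2_048))
```

At these rates, the $5 credit corresponds to about 12.5 million tokens in total, since input and output are priced the same.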


Capabilities & Benchmarks

Qwen 2.5 72B achieves excellent benchmark scores: MMLU (86.1%), HumanEval (86.6%), MATH (83.1%), and GSM8K (91.6%). It supports 29+ languages, structured output (JSON/XML), tool use, and function calling. The model excels at bilingual English-Chinese tasks and offers strong performance in code generation, mathematical reasoning, and long-context processing up to 131K tokens.
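As a rough illustration of the function calling mentioned above, the sketch below builds a request body in the OpenAI `tools` format and decodes tool calls from an assistant message. It assumes the VoltageGPU endpoint accepts the standard `tools` and `tool_choice` fields, which this page does not confirm. The `get_weather` tool and both helper functions are hypothetical.

```python
import json

# Hypothetical tool definition in the OpenAI "tools" schema.
GET_WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}


def build_tool_request(user_message: str) -> dict:
    """Build a chat completion request body that offers one tool."""
    return {
        "model": "Qwen/Qwen2.5-72B-Instruct",
        "messages": [{"role": "user", "content": user_message}],
        "tools": [GET_WEATHER_TOOL],
        "tool_choice": "auto",  # assumption: the endpoint honors this field
        "max_tokens": 512,
    }


def parse_tool_calls(message: dict) -> list:
    """Return (name, arguments) pairs from an assistant message dict."""
    calls = []
    for call in message.get("tool_calls") or []:
        fn = call["function"]
        # In the OpenAI format, arguments arrive as a JSON-encoded string.
        calls.append((fn["name"], json.loads(fn["arguments"])))
    return calls
```

With the OpenAI SDK client from the Quick Start, the body could be passed as `client.chat.completions.create(**build_tool_request("Weather in Paris?"))`, and the resulting message converted with `response.choices[0].message.model_dump()` before parsing.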


About Qwen 2.5 72B

Qwen 2.5 72B is Alibaba Cloud's flagship open-weight language model, delivering exceptional performance across English, Chinese, and 27+ additional languages. With 72 billion parameters and a 131K context window, it achieves top-tier results on coding, mathematics, and general knowledge benchmarks. Qwen 2.5 features improved instruction following, structured output generation, and long-context understanding compared to its predecessors. It was trained on 18 trillion tokens of high-quality multilingual data.


Use Cases

🌏

Multilingual Applications

Build applications serving users in 29+ languages with strong bilingual English-Chinese capabilities.

💻

Code Generation

Generate code across multiple programming languages, backed by a HumanEval score of 86.6%.

🧮

Mathematical Reasoning

Solve complex math problems with step-by-step reasoning (MATH: 83.1%, GSM8K: 91.6%).

📋

Structured Data Extraction

Extract and generate structured JSON/XML output from unstructured text reliably.

📑

Long Document Analysis

Analyze documents up to 131K tokens for summarization, Q&A, and insight extraction.


API Reference

Endpoint

POST https://api.voltagegpu.com/v1/chat/completions

Headers

Authorization: Bearer YOUR_VOLTAGE_API_KEY (required)
Content-Type: application/json (required)

Model ID

Qwen/Qwen2.5-72B-Instruct

Use this value as the model parameter in your API requests.

Example Request

curl -X POST https://api.voltagegpu.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_VOLTAGE_API_KEY" \
  -d '{
    "model": "Qwen/Qwen2.5-72B-Instruct",
    "messages": [
      {"role": "system", "content": "You are a multilingual assistant."},
      {"role": "user", "content": "Explain quantum computing in simple terms."}
    ],
    "max_tokens": 2048,
    "temperature": 0.7
  }'
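For environments without the OpenAI SDK, the same request can be sketched with Python's standard library only. The endpoint, headers, and model ID come from this reference. The helper names `build_request` and `send` are hypothetical, and the response parsing assumes the usual OpenAI-style `choices[0].message.content` layout.

```python
import json
import urllib.request

API_URL = "https://api.voltagegpu.com/v1/chat/completions"


def build_request(api_key: str, user_message: str) -> urllib.request.Request:
    """Build (but do not send) the POST request described above."""
    payload = {
        "model": "Qwen/Qwen2.5-72B-Instruct",
        "messages": [
            {"role": "system", "content": "You are a multilingual assistant."},
            {"role": "user", "content": user_message},
        ],
        "max_tokens": 2048,
        "temperature": 0.7,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> str:
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(request, timeout=60) as resp:
        body = json.load(resp)
    # Assumption: the response follows the OpenAI chat completion shape.
    return body["choices"][0]["message"]["content"]
```

Calling `send(build_request("YOUR_VOLTAGE_API_KEY", "Explain quantum computing in simple terms."))` performs the network call. HTTP errors surface as `urllib.error.HTTPError`.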



Frequently Asked Questions

What languages does Qwen 2.5 72B support?

Qwen 2.5 72B supports 29+ languages including English, Chinese (Simplified & Traditional), Japanese, Korean, French, German, Spanish, Portuguese, Arabic, Russian, Thai, Vietnamese, Indonesian, and many more. It is particularly strong in English-Chinese bilingual tasks.

How does Qwen 2.5 72B compare to Llama 3.3 70B?

Qwen 2.5 72B is broadly competitive with Llama 3.3 70B. It scores higher on math (MATH: 83.1% vs 77.0%) and slightly lower on coding (HumanEval: 86.6% vs 88.4%). It also supports more languages and offers better Chinese language capabilities. At $0.40 per 1M tokens, it offers competitive pricing.

Does Qwen 2.5 72B support structured output?

Yes, Qwen 2.5 72B excels at generating structured output in JSON, XML, and other formats. You can use the response_format parameter to request JSON mode through the VoltageGPU API.
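A minimal sketch of JSON mode, assuming the API accepts the OpenAI-style value `{"type": "json_object"}` for `response_format` (this page names the parameter but not its exact shape). The helper names and the extraction prompt are hypothetical. Parsing stays defensive because a model reply is not guaranteed to be valid JSON.

```python
import json


def build_json_mode_request(text: str) -> dict:
    """Build a request body that asks for a JSON object reply."""
    return {
        "model": "Qwen/Qwen2.5-72B-Instruct",
        "messages": [
            {
                "role": "system",
                "content": (
                    "Extract the person's name and age from the text. "
                    'Reply with a JSON object with keys "name" and "age".'
                ),
            },
            {"role": "user", "content": text},
        ],
        # Assumption: OpenAI-style JSON mode.
        "response_format": {"type": "json_object"},
        "temperature": 0,
    }


def parse_json_reply(content: str) -> dict:
    """Parse the model's reply, raising ValueError if it is not a JSON object."""
    try:
        data = json.loads(content)
    except json.JSONDecodeError as exc:
        raise ValueError(f"reply is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("reply is valid JSON but not an object")
    return data
```

With the OpenAI SDK, the body could be sent as `client.chat.completions.create(**build_json_mode_request("Ada is 36."))` and the reply text passed to `parse_json_reply`.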

What is the context window of Qwen 2.5 72B?

Qwen 2.5 72B supports a 131,072 token context window, allowing it to process very long documents, codebases, and conversation histories in a single request.
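When sending long inputs, it can help to check the token budget before making a request. The sketch below assumes the prompt and the completion share the 131,072-token window, which is common for chat APIs but not stated on this page. `clamp_max_tokens` is a hypothetical helper, and the prompt token count must come from a tokenizer of your choice.

```python
CONTEXT_WINDOW = 131_072  # tokens, as listed on this page


def clamp_max_tokens(prompt_tokens: int, requested_max_tokens: int) -> int:
    """Return the largest max_tokens value that still fits in the window."""
    if prompt_tokens < 0 or requested_max_tokens <= 0:
        raise ValueError("token counts must be positive")
    remaining = CONTEXT_WINDOW - prompt_tokens
    if remaining <= 0:
        raise ValueError(
            f"prompt uses {prompt_tokens} tokens, "
            f"which leaves no room in a {CONTEXT_WINDOW}-token window"
        )
    return min(requested_max_tokens, remaining)
```

For example, a 130,000-token prompt leaves 1,072 tokens, so a requested `max_tokens` of 2,048 would be reduced to 1,072.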


Start using Qwen 2.5 72B today

Get $5 free credit when you sign up. No credit card required. Deploy in under 30 seconds with our OpenAI-compatible API.