Page Inspect
Internal Links: 13
External Links: 4
Images: 3
Headings: 59
Page Content
Title: Qwen-3: Alibaba Cloud's Next-Gen Open Source LLM | Apache 2.0 | MoE & Dense
Description: Discover and experience the Qwen-3 model series, covering MoE and Dense models from 0.6B to 235B parameters, featuring advanced capabilities like Hybrid Thinking, Multimodal Processing, and 119 language support. Find download, deployment, and benchmark information.
HTML Size: 138 KB
Markdown Size: 12 KB
Fetched At: October 13, 2025
Page Structure
h1 Qwen-3: Explore the Next-Generation Open Source Large Model
h2 One-Click Website Integration
h2 Free Online Chat - No Registration Required | Fast & Stable | Powered by Qwen-3
h2 Download Tongyi Qianwen APP
h3 iOS App Store
h3 Google Play Store
h3 Android Package (Official)
h2 Core Features
h3 Hybrid Thinking Mode
h3 Flagship & Efficient Performance
h3 Unified Multimodal Processing
h3 Broad Multilingual Support
h3 MCP Protocol & Agent Capabilities
h3 Efficient MoE & Diverse Dense Models
h3 Ultra-Long Context Processing
h3 Advanced Training Techniques
h3 Open Ecosystem & Compatibility
h2 DeepSeek V3 in Media Coverage
h3 Breakthrough Performance
h3 Massive Scale Architecture
h3 Efficient Development Cost
h2 Qwen-3 in Action
h3 Qwen-3: Leading the Way in Open Source AI
h2 Qwen-3 Performance on Authoritative Benchmarks
h3 General Ability & Language Understanding
h3 Coding Ability
h3 Mathematical Ability
h2 Technical Specifications
h3 Qwen-3 Architecture Details
h2 Qwen-3 Research
h3 Innovative Architecture
h3 Training Methodology
h3 Technical Blog & Report
h2 About the Qwen Team
h3 Development Background
h3 Technical Strength
h2 Qwen-3 Deployment Options
h3 Efficient Inference Frameworks (vLLM & SGLang)
h3 Convenient Local Deployment
h3 Cloud API Services
h3 Model Platforms & Quantization Formats
h2 How to Use Qwen-3
h3 Choose Your Method
h3 Access Platform or Download Model
h3 Start Interacting or Integrating
h2 Frequently Asked Questions
h3 What makes Qwen-3 unique?
h3 How can I access or use Qwen-3?
h3 What tasks does Qwen-3 excel at?
h3 What is Hybrid Thinking Mode?
Markdown Content
# Qwen-3: Explore the Next-Generation Open Source Large Model

Experience the flagship Qwen-3 model series developed by Alibaba Cloud, featuring hybrid thinking, multimodal processing, and powerful multilingual capabilities. Open sourced under Apache 2.0.

- 235B MoE Parameters
- 119+ Language Support
- Hybrid Thinking Mode

Model Download & Deployment | Try Online Now

## One-Click Website Integration

Own a website? Instantly add our chat interface with a simple iframe snippet - no registration required.

```html
<iframe src="https://qwen-3.com/embed" width="100%" height="600px" frameborder="0"></iframe>
```

## Free Online Chat - No Registration Required | Fast & Stable | Powered by Qwen-3

## Download Tongyi Qianwen APP

Experience Qwen on your mobile device.

### iOS App Store
For iPhone and iPad. Download

### Google Play Store
For Android devices (Play Store download link currently unavailable). Download

### Android Package (Official)
Direct APK download is not officially provided yet (direct APK download link currently unavailable). Download

## Core Features

Explore the powerful functions and innovative features of Qwen-3.

### Hybrid Thinking Mode
Automatically switches between deep thinking and quick response modes based on task complexity, balancing intelligence and efficiency, with flexible control.
- Thinking Mode (step-by-step reasoning)
- Non-Thinking Mode (quick response)
- API/prompt tag control
- Optimized thinking budget

### Flagship & Efficient Performance
The flagship MoE model rivals top closed-source models, while the small models also perform exceptionally, surpassing previous-generation large models.
- Leading in coding/math/general ability
- Excellent performance of Qwen3-235B-A22B
- Qwen3-4B matches Qwen2.5-72B
- MoE models activate fewer parameters for high efficiency

### Unified Multimodal Processing
Uses unified multimodal encoding to deeply integrate the processing of text, image, audio, video, and other inputs within a single architecture.
- Text understanding and generation
- Image recognition and analysis
- Audio processing and interaction
- Video content understanding

### Broad Multilingual Support
Supports up to 119 languages and dialects, significantly improving cross-lingual task performance and reducing language-switching issues.
- Coverage of 119 languages & dialects
- Pre-trained on 36T tokens
- Reduced language-switching errors
- Strong cross-lingual capabilities

### MCP Protocol & Agent Capabilities
Natively supports the MCP protocol, standardizing external tool calls for AI agents. Building agents with the Qwen-Agent framework is recommended.
- Standardized external tool calls
- Improved agent development compatibility
- Easy to build browser assistants and similar agents
- Qwen-Agent framework recommended

### Efficient MoE & Diverse Dense Models
Offers flagship MoE models and a range of dense models from 0.6B to 32B, meeting diverse scenario requirements.
- Qwen3-235B (MoE, 22B activated)
- Qwen3-30B (MoE, 3B activated)
- 0.6B to 32B dense models
- Open sourced under Apache 2.0

### Ultra-Long Context Processing
Dense models support up to 128K tokens of context, and MoE models also support long context, efficiently handling long documents and complex dialogues.
- Up to 128K context (8B-32B)
- 32K context (0.6B-4B)
- Optimized attention mechanisms
- Reduced memory usage for long sequences

### Advanced Training Techniques
Pre-trained in three stages on nearly 36 trillion tokens, followed by a four-stage post-training pipeline that develops hybrid thinking and general capabilities.
- 36T tokens of pre-training data
- Three-stage pre-training process
- Four-stage post-training pipeline
- High-quality synthetic data

### Open Ecosystem & Compatibility
Open sourced under the Apache 2.0 license and seamlessly integrated with mainstream tools such as Hugging Face, vLLM, Ollama, and SGLang.
- Fully open source (Apache 2.0)
- Supports frameworks like vLLM and SGLang
- Supports local tools like Ollama and LM Studio
- Available on Hugging Face/ModelScope/Kaggle

## DeepSeek V3 in Media Coverage

New breakthroughs in open-source AI development.

### Breakthrough Performance
DeepSeek V3 outperforms both open-source and closed-source AI models in programming competitions, particularly excelling in Codeforces contests and the Aider Polyglot test.

### Massive Scale Architecture
Has 671 billion parameters and was trained on 14.8 trillion tokens, 1.6 times the scale of Meta's Llama 3.1 405B.

### Efficient Development Cost
Training was completed in just two months on Nvidia H800 GPUs, at a development cost of only $5.5 million.

## Qwen-3 in Action

See how Qwen-3 elevates open-source AI capabilities.

### Qwen-3: Leading the Way in Open Source AI
A deep dive into the capabilities of Qwen-3 and its performance against other leading AI models.
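The API/prompt tag control mentioned under Core Features deserves a concrete illustration. Qwen-3's usage notes describe soft switches (`/think` and `/no_think`) that can be appended to a user turn to toggle Hybrid Thinking Mode per message. The helper below is a minimal sketch of that convention; the function name and message structure are illustrative, not an official API.

```python
def with_thinking_switch(user_message: str, think: bool) -> dict:
    """Build a chat message with Qwen-3's documented soft switch
    (/think or /no_think) appended, toggling Hybrid Thinking Mode
    for this turn. Helper name is hypothetical, not an official API."""
    tag = "/think" if think else "/no_think"
    return {"role": "user", "content": f"{user_message} {tag}"}

# Deep reasoning for a math question, quick response for small talk
math_msg = with_thinking_switch("Prove that the square root of 2 is irrational.", think=True)
chat_msg = with_thinking_switch("Good morning!", think=False)
```

The last tag in the conversation takes effect, so callers can mix modes turn by turn without restarting the session.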
## Qwen-3 Performance on Authoritative Benchmarks

### General Ability & Language Understanding
- MMLU: Leading
- GPQA: Leading
- Arena Hard: Excellent

### Coding Ability
- LiveCodeBench: SOTA
- HumanEval: Leading
- OpenCompass: Leading

### Mathematical Ability
- GSM8K: Excellent
- AIME: Excellent

## Technical Specifications

Explore the advanced technology, architecture, and capabilities driving Qwen-3.

### Qwen-3 Architecture Details
Advanced architecture integrating Mixture-of-Experts, diverse dense models, and innovative mechanisms:
- Mixture-of-Experts (MoE) models: Qwen3-235B (22B activated), Qwen3-30B (3B activated)
- Diverse dense models: 0.6B, 1.7B, 4B, 8B, 14B, 32B
- Architectural basis for Hybrid Thinking Mode
- Unified multimodal encoding technology
- Native MCP (Model Context Protocol) support
- Long-context support (up to 128K/32K tokens)
- Optimized Transformer variant design
- Efficient attention mechanisms and chunked prefilling

## Qwen-3 Research

Pushing the boundaries of language model capabilities.

### Innovative Architecture
Integrates hybrid thinking mode, unified multimodal encoding, and an efficient MoE architecture.

### Training Methodology
Multi-stage pre-training and post-training on nearly 36 trillion tokens, covering 119 languages.

### Technical Blog & Report
Read our blog post to understand the design philosophy and performance details of Qwen-3. A detailed technical report will be released soon. Read Blog Post

## About the Qwen Team

The team behind the Qwen-3 models.

### Development Background
The Qwen-3 model series is developed by Alibaba Cloud's Tongyi Qianwen team, which is dedicated to open-source research and application of large language models and continuously releases the leading Qwen model series.
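Since the series spans dense models from 0.6B to 32B and supports Int4/Int8 quantization, it helps to estimate whether a given model fits your hardware. The sketch below uses the common rule of thumb that weight memory is roughly parameter count times bytes per parameter; it deliberately ignores KV cache, activations, and runtime overhead, so treat the numbers as lower bounds, not official requirements.

```python
def weight_memory_gb(params_billion: float, bits_per_param: int) -> float:
    """Rough weight-only memory estimate: params x (bits / 8) bytes.
    Ignores KV cache, activations, and framework overhead."""
    bytes_total = params_billion * 1e9 * bits_per_param / 8
    return bytes_total / 1e9  # decimal GB

# An 8B dense model as an example:
fp16_gb = weight_memory_gb(8, 16)  # -> 16.0 GB at FP16
int4_gb = weight_memory_gb(8, 4)   # -> 4.0 GB at Int4
```

This is why the FAQ below notes that small models run on consumer hardware with Int4/Int8 quantization while 32B+ models need more powerful GPUs.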
### Technical Strength
Leveraging Alibaba Cloud's powerful cloud computing infrastructure and extensive experience in large-scale AI model training, the Qwen team efficiently develops and iterates advanced language models.

## Qwen-3 Deployment Options

### Efficient Inference Frameworks (vLLM & SGLang)
vLLM (>=0.8.4) or SGLang (>=0.4.6.post1) is recommended for high-performance deployment, supporting long context and Hybrid Thinking Mode.
- High throughput
- Low latency
- Supports Hybrid Thinking Mode
- Compatible with the OpenAI API

### Convenient Local Deployment
Easily run Qwen-3 models locally with tools such as Ollama, LM Studio, MLX, llama.cpp, and KTransformers.
- Quick start
- Cross-platform support (CPU/GPU)
- Active community
- Support for various quantization formats

### Cloud API Services
Call the Qwen-3 API directly via Alibaba Cloud Bailian, DashScope, or together.ai without self-hosting.
- Out of the box
- Pay as you go
- Global access
- Enterprise-level support

### Model Platforms & Quantization Formats
Model weights are available on Hugging Face, ModelScope, and Kaggle. Quantization formats such as GGUF, AWQ, and AutoGPTQ reduce resource requirements.
- Multi-platform access
- Apache 2.0 license
- Supports Int4/Int8 quantization
- Suitable for consumer hardware

## How to Use Qwen-3

Get started quickly with Qwen-3: try it online, call the API, or deploy locally.

Step 1

### Choose Your Method
Based on your needs, choose to try it online (Qwen Chat), call the API service, or download the model for local deployment.

Step 2

### Access Platform or Download Model
Visit the Qwen Chat website/app, consult the API documentation and providers (such as Alibaba Cloud Bailian), or download the required model files from Hugging Face/ModelScope/Kaggle.
Step 3

### Start Interacting or Integrating
Interact directly with Qwen Chat, integrate it into your application following the API documentation, or use tools like Ollama, vLLM, or SGLang to run and manage the model locally. Try Qwen Chat Online

## Frequently Asked Questions

Learn more about Qwen-3.

### What makes Qwen-3 unique?
Qwen-3 offers model sizes from 0.6B to 235B (MoE), open-sourced under Apache 2.0. Key innovations include Hybrid Thinking Mode (intelligently switching thought depth), unified multimodal processing, and broad support for 119 languages.

### How can I access or use Qwen-3?
You can download model weights from Hugging Face, ModelScope, or Kaggle for local deployment (tools such as vLLM, SGLang, and Ollama are recommended). You can also call API services via Alibaba Cloud Bailian, DashScope, together.ai, etc., or try it directly on the Qwen Chat website/app.

### What tasks does Qwen-3 excel at?
Qwen-3 shows leading performance on coding, mathematics, and general-capability benchmarks, surpassing models like Llama3.1-405B. Its multilingual abilities, long-context processing, and agent functionality (with the MCP protocol) are also very strong.

### What is Hybrid Thinking Mode?
An innovative feature of Qwen-3: the model can automatically or manually switch between a 'thinking mode' for deep reasoning and a 'non-thinking mode' for quick responses, based on task complexity, balancing effectiveness and efficiency.

### How many languages does Qwen-3 support?
Qwen-3 supports up to 119 languages and dialects, significantly enhancing cross-lingual understanding and generation through large-scale multilingual pre-training data (nearly 36T tokens).

### What are the hardware requirements for running Qwen-3?
Requirements depend on the model size. Smaller models (e.g., 0.6B, 1.7B) can run on consumer hardware, especially with Int4/Int8 quantization (like GGUF).
Larger models (e.g., 32B, 235B) require more powerful GPUs. Check the specific model's documentation and quantization options.

### Is Qwen-3 available for commercial use?
Yes. All models in the Qwen-3 series are released under the Apache 2.0 license, allowing both commercial and research use.

### What is the context window size of Qwen-3?
Depending on model size, Qwen-3 dense models support context lengths of 32K (0.6B-4B) or 128K (8B-32B) tokens. MoE models also support long context (check the model card for specific sizes).

### Which deployment frameworks/tools does Qwen-3 support?
vLLM (>=0.8.4) and SGLang (>=0.4.6.post1) are recommended for efficient deployment. For local execution, you can use Ollama, LM Studio, llama.cpp, MLX-LM, KTransformers, etc. It is also compatible with the Hugging Face Transformers library.

## Get Started with Qwen-3

### Try the Qwen-3 API Service
Access the Qwen-3 API through platforms such as Alibaba Cloud Bailian, DashScope, and together.ai. View API Docs

### Visit the GitHub Repository
Find Qwen-3 source code, documentation, examples, and community support in the official GitHub repository. Visit GitHub

### Experience Qwen Chat
Experience the Qwen-3 model directly through the official Qwen Chat website or mobile app. Visit Qwen Chat

About Us | Privacy Policy | Contact Us

© 2024 Qwen-3. All rights reserved. The Qwen-3 series models are developed by the Alibaba Cloud Tongyi Qianwen team and open sourced under the Apache 2.0 license. See the official documentation and GitHub repository for details.
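Because vLLM and SGLang expose OpenAI-compatible endpoints, the API call is the same whether you self-host or use a cloud provider. The sketch below builds a standard `/v1/chat/completions` request body; the model identifier `Qwen/Qwen3-8B` and the temperature value are illustrative choices, not prescriptions from this page.

```python
import json

def chat_completion_request(model: str, prompt: str) -> str:
    """Build an OpenAI-compatible /v1/chat/completions request body,
    as accepted by vLLM, SGLang, or cloud providers with compatible
    APIs. Model name and temperature here are illustrative."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,
    }
    return json.dumps(body)

payload = chat_completion_request("Qwen/Qwen3-8B", "Hello, Qwen!")
```

POST this payload to your endpoint's `/v1/chat/completions` path with an appropriate API key; any OpenAI-compatible client library can be used instead of constructing the JSON by hand.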