Page Inspect
Internal Links: 13
External Links: 4
Images: 3
Headings: 59
Page Content
Title: Qwen-3: Alibaba Cloud's Next-Gen Open Source LLM | Apache 2.0 | MoE & Dense
Description: Discover and experience the Qwen-3 model series, covering MoE and Dense models from 0.6B to 235B parameters, featuring advanced capabilities like Hybrid Thinking, Multimodal Processing, and 119 language support. Find download, deployment, and benchmark information.
HTML Size: 138 KB
Markdown Size: 12 KB
Fetched At: October 13, 2025
Page Structure
h1 Qwen-3: Explore the Next-Generation Open Source Large Model
h2 One-Click Website Integration
h2 Free Online Chat - No Registration Required | Fast & Stable | Powered by Qwen-3
h2 Download Tongyi Qianwen APP
h3 iOS App Store
h3 Google Play Store
h3 Android Package (Official)
h2 Core Features
h3 Hybrid Thinking Mode
h3 Flagship & Efficient Performance
h3 Unified Multimodal Processing
h3 Broad Multilingual Support
h3 MCP Protocol & Agent Capabilities
h3 Efficient MoE & Diverse Dense Models
h3 Ultra-Long Context Processing
h3 Advanced Training Techniques
h3 Open Ecosystem & Compatibility
h2 DeepSeek V3 in Media Coverage
h3 Breakthrough Performance
h3 Massive Scale Architecture
h3 Efficient Development Cost
h2 Qwen-3 in Action
h3 Qwen-3: Leading the Way in Open Source AI
h2 Qwen-3 Performance on Authoritative Benchmarks
h3 General Ability & Language Understanding
h3 Coding Ability
h3 Mathematical Ability
h2 Technical Specifications
h3 Qwen-3 Architecture Details
h2 Qwen-3 Research
h3 Innovative Architecture
h3 Training Methodology
h3 Technical Blog & Report
h2 About the Qwen Team
h3 Development Background
h3 Technical Strength
h2 Qwen-3 Deployment Options
h3 Efficient Inference Frameworks (vLLM & SGLang)
h3 Convenient Local Deployment
h3 Cloud API Services
h3 Model Platforms & Quantization Formats
h2 How to Use Qwen-3
h3 Choose Your Method
h3 Access Platform or Download Model
h3 Start Interacting or Integrating
h2 Frequently Asked Questions
h3 What makes Qwen-3 unique?
h3 How can I access or use Qwen-3?
h3 What tasks does Qwen-3 excel at?
h3 What is Hybrid Thinking Mode?
Markdown Content
# Qwen-3: Explore the Next-Generation Open Source Large Model

Experience the flagship Qwen-3 model series developed by Alibaba Cloud, featuring hybrid thinking, multimodal processing, and powerful multilingual capabilities. Open sourced under Apache 2.0.

- 235B MoE Parameters
- 119+ Language Support
- Hybrid Thinking Mode

Model Download & Deployment | Try Online Now

## One-Click Website Integration

Own a website? Instantly add our chat interface with a simple iframe snippet - no registration required.

```html
<iframe src="https://qwen-3.com/embed" width="100%" height="600px" frameborder="0"></iframe>
```

## Free Online Chat - No Registration Required | Fast & Stable | Powered by Qwen-3

## Download Tongyi Qianwen APP

Experience Qwen on your mobile device.

### iOS App Store
For iPhone and iPad. Download

### Google Play Store
For Android devices (Play Store download link currently unavailable). Download

### Android Package (Official)
Direct APK download is not officially provided yet (direct APK download link currently unavailable). Download

## Core Features

Explore the powerful functions and innovative features of Qwen-3.

### Hybrid Thinking Mode
Automatically switches between deep thinking and quick response modes based on task complexity, balancing intelligence and efficiency, with flexible control.
- Thinking Mode (step-by-step reasoning)
- Non-Thinking Mode (quick response)
- API/prompt tag control
- Optimized thinking budget

### Flagship & Efficient Performance
The flagship MoE model rivals top closed-source models, while the small models also perform exceptionally, surpassing previous-generation large models.
- Leading in coding/math/general ability
- Excellent performance of Qwen3-235B-A22B
- Qwen3-4B matches Qwen2.5-72B
- MoE models activate fewer parameters for high efficiency

### Unified Multimodal Processing
Uses unified multimodal encoding to deeply integrate the processing of text, image, audio, video, and other inputs within a single architecture.
- Text understanding and generation
- Image recognition and analysis
- Audio processing and interaction
- Video content understanding

### Broad Multilingual Support
Supports up to 119 languages and dialects, significantly improving cross-lingual task performance and reducing language-switching issues.
- Coverage of 119 languages & dialects
- Pre-trained on 36T tokens
- Reduced language-switching errors
- Strong cross-lingual capabilities

### MCP Protocol & Agent Capabilities
Natively supports the MCP protocol, standardizing external tool calls for AI agents. Building agents with the Qwen-Agent framework is recommended.
- Standardized external tool calls
- Improved agent development compatibility
- Easy to build browser assistants and similar agents
- Qwen-Agent framework recommended

### Efficient MoE & Diverse Dense Models
Offers flagship MoE models and a range of dense models from 0.6B to 32B, meeting diverse scenario requirements.
- Qwen3-235B (MoE, 22B activated)
- Qwen3-30B (MoE, 3B activated)
- 0.6B to 32B dense models
- Open sourced under Apache 2.0

### Ultra-Long Context Processing
Dense models support up to 128K tokens of context, and MoE models also support long context, efficiently handling long documents and complex dialogues.
- Up to 128K context (8B-32B)
- 32K context (0.6B-4B)
- Optimized attention mechanisms
- Reduced memory usage for long sequences

### Advanced Training Techniques
Pre-trained in three stages on nearly 36 trillion tokens, followed by a four-stage post-training pipeline that develops hybrid thinking and general capabilities.
- 36T tokens of pre-training data
- Three-stage pre-training process
- Four-stage post-training pipeline
- High-quality synthetic data

### Open Ecosystem & Compatibility
Open sourced under the Apache 2.0 license and seamlessly integrated with mainstream tools such as Hugging Face, vLLM, Ollama, and SGLang.
- Fully open source (Apache 2.0)
- Supports frameworks like vLLM and SGLang
- Supports local tools like Ollama and LM Studio
- Available on Hugging Face/ModelScope/Kaggle

## DeepSeek V3 in Media Coverage

New breakthroughs in open-source AI development.

### Breakthrough Performance
DeepSeek V3 outperforms both open-source and closed-source AI models in programming competitions, particularly excelling in Codeforces contests and the Aider Polyglot test.

### Massive Scale Architecture
Has 671 billion parameters and was trained on 14.8 trillion tokens, 1.6 times the scale of Meta's Llama 3.1 405B.

### Efficient Development Cost
Training was completed in just two months on Nvidia H800 GPUs, at a development cost of only $5.5 million.

## Qwen-3 in Action

See how Qwen-3 elevates open-source AI capabilities.

### Qwen-3: Leading the Way in Open Source AI
A deep dive into the capabilities of Qwen-3 and its performance against other leading AI models.
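The API/prompt tag control mentioned under Core Features deserves a concrete illustration. Qwen-3's usage notes describe soft switches (`/think` and `/no_think`) that can be appended to a user turn to toggle Hybrid Thinking Mode per message. The helper below is a minimal sketch of that convention; the function name and message structure are illustrative, not an official API.

```python
def with_thinking_switch(user_message: str, think: bool) -> dict:
    """Build a chat message with Qwen-3's documented soft switch
    (/think or /no_think) appended, toggling Hybrid Thinking Mode
    for this turn. Helper name is hypothetical, not an official API."""
    tag = "/think" if think else "/no_think"
    return {"role": "user", "content": f"{user_message} {tag}"}

# Deep reasoning for a math question, quick response for small talk
math_msg = with_thinking_switch("Prove that the square root of 2 is irrational.", think=True)
chat_msg = with_thinking_switch("Good morning!", think=False)
```

The last tag in the conversation takes effect, so callers can mix modes turn by turn without restarting the session.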
## Qwen-3 Performance on Authoritative Benchmarks

### General Ability & Language Understanding
- MMLU: Leading
- GPQA: Leading
- Arena Hard: Excellent

### Coding Ability
- LiveCodeBench: SOTA
- HumanEval: Leading
- OpenCompass: Leading

### Mathematical Ability
- GSM8K: Excellent
- AIME: Excellent

## Technical Specifications

Explore the advanced technology, architecture, and capabilities driving Qwen-3.

### Qwen-3 Architecture Details
Advanced architecture integrating Mixture-of-Experts, diverse dense models, and innovative mechanisms:
- Mixture-of-Experts (MoE) models: Qwen3-235B (22B activated), Qwen3-30B (3B activated)
- Diverse dense models: 0.6B, 1.7B, 4B, 8B, 14B, 32B
- Architectural basis for Hybrid Thinking Mode
- Unified multimodal encoding technology
- Native MCP (Model Context Protocol) support
- Long-context support (up to 128K/32K tokens)
- Optimized Transformer variant design
- Efficient attention mechanisms and chunked prefilling

## Qwen-3 Research

Pushing the boundaries of language model capabilities.

### Innovative Architecture
Integrates hybrid thinking mode, unified multimodal encoding, and an efficient MoE architecture.

### Training Methodology
Multi-stage pre-training and post-training on nearly 36 trillion tokens, covering 119 languages.

### Technical Blog & Report
Read our blog post to understand the design philosophy and performance details of Qwen-3. A detailed technical report will be released soon. Read Blog Post

## About the Qwen Team

The team behind the Qwen-3 models.

### Development Background
The Qwen-3 model series is developed by Alibaba Cloud's Tongyi Qianwen team, which is dedicated to open-source research and application of large language models and continuously releases the leading Qwen model series.
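Since the series spans dense models from 0.6B to 32B and supports Int4/Int8 quantization, it helps to estimate whether a given model fits your hardware. The sketch below uses the common rule of thumb that weight memory is roughly parameter count times bytes per parameter; it deliberately ignores KV cache, activations, and runtime overhead, so treat the numbers as lower bounds, not official requirements.

```python
def weight_memory_gb(params_billion: float, bits_per_param: int) -> float:
    """Rough weight-only memory estimate: params x (bits / 8) bytes.
    Ignores KV cache, activations, and framework overhead."""
    bytes_total = params_billion * 1e9 * bits_per_param / 8
    return bytes_total / 1e9  # decimal GB

# An 8B dense model as an example:
fp16_gb = weight_memory_gb(8, 16)  # -> 16.0 GB at FP16
int4_gb = weight_memory_gb(8, 4)   # -> 4.0 GB at Int4
```

This is why the FAQ below notes that small models run on consumer hardware with Int4/Int8 quantization while 32B+ models need more powerful GPUs.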
### Technical Strength
Leveraging Alibaba Cloud's powerful cloud computing infrastructure and extensive experience in large-scale AI model training, the Qwen team efficiently develops and iterates advanced language models.

## Qwen-3 Deployment Options

### Efficient Inference Frameworks (vLLM & SGLang)
vLLM (>=0.8.4) or SGLang (>=0.4.6.post1) is recommended for high-performance deployment, supporting long context and Hybrid Thinking Mode.
- High throughput
- Low latency
- Supports Hybrid Thinking Mode
- Compatible with the OpenAI API

### Convenient Local Deployment
Easily run Qwen-3 models locally with tools such as Ollama, LM Studio, MLX, llama.cpp, and KTransformers.
- Quick start
- Cross-platform support (CPU/GPU)
- Active community
- Support for various quantization formats

### Cloud API Services
Call the Qwen-3 API directly via Alibaba Cloud Bailian, DashScope, or together.ai without self-hosting.
- Out of the box
- Pay as you go
- Global access
- Enterprise-level support

### Model Platforms & Quantization Formats
Model weights are available on Hugging Face, ModelScope, and Kaggle. Quantization formats such as GGUF, AWQ, and AutoGPTQ reduce resource requirements.
- Multi-platform access
- Apache 2.0 license
- Supports Int4/Int8 quantization
- Suitable for consumer hardware

## How to Use Qwen-3

Get started quickly with Qwen-3: try it online, call the API, or deploy locally.

Step 1

### Choose Your Method
Based on your needs, choose to try it online (Qwen Chat), call the API service, or download the model for local deployment.

Step 2

### Access Platform or Download Model
Visit the Qwen Chat website/app, consult the API documentation and providers (such as Alibaba Cloud Bailian), or download the required model files from Hugging Face/ModelScope/Kaggle.
Step 3

### Start Interacting or Integrating
Interact directly with Qwen Chat, integrate it into your application following the API documentation, or use tools like Ollama, vLLM, or SGLang to run and manage the model locally. Try Qwen Chat Online

## Frequently Asked Questions

Learn more about Qwen-3.

### What makes Qwen-3 unique?
Qwen-3 offers model sizes from 0.6B to 235B (MoE), open-sourced under Apache 2.0. Key innovations include Hybrid Thinking Mode (intelligently switching thought depth), unified multimodal processing, and broad support for 119 languages.

### How can I access or use Qwen-3?
You can download model weights from Hugging Face, ModelScope, or Kaggle for local deployment (tools such as vLLM, SGLang, and Ollama are recommended). You can also call API services via Alibaba Cloud Bailian, DashScope, together.ai, etc., or try it directly on the Qwen Chat website/app.

### What tasks does Qwen-3 excel at?
Qwen-3 shows leading performance on coding, mathematics, and general-capability benchmarks, surpassing models like Llama3.1-405B. Its multilingual abilities, long-context processing, and agent functionality (with the MCP protocol) are also very strong.

### What is Hybrid Thinking Mode?
An innovative feature of Qwen-3: the model can automatically or manually switch between a 'thinking mode' for deep reasoning and a 'non-thinking mode' for quick responses, based on task complexity, balancing effectiveness and efficiency.

### How many languages does Qwen-3 support?
Qwen-3 supports up to 119 languages and dialects, significantly enhancing cross-lingual understanding and generation through large-scale multilingual pre-training data (nearly 36T tokens).

### What are the hardware requirements for running Qwen-3?
Requirements depend on the model size. Smaller models (e.g., 0.6B, 1.7B) can run on consumer hardware, especially with Int4/Int8 quantization (like GGUF).
Larger models (e.g., 32B, 235B) require more powerful GPUs. Check the specific model's documentation and quantization options.

### Is Qwen-3 available for commercial use?
Yes. All models in the Qwen-3 series are released under the Apache 2.0 license, allowing both commercial and research use.

### What is the context window size of Qwen-3?
Depending on model size, Qwen-3 dense models support context lengths of 32K (0.6B-4B) or 128K (8B-32B) tokens. MoE models also support long context (check the model card for specific sizes).

### Which deployment frameworks/tools does Qwen-3 support?
vLLM (>=0.8.4) and SGLang (>=0.4.6.post1) are recommended for efficient deployment. For local execution, you can use Ollama, LM Studio, llama.cpp, MLX-LM, KTransformers, etc. It is also compatible with the Hugging Face Transformers library.

## Get Started with Qwen-3

### Try the Qwen-3 API Service
Access the Qwen-3 API through platforms such as Alibaba Cloud Bailian, DashScope, and together.ai. View API Docs

### Visit the GitHub Repository
Find Qwen-3 source code, documentation, examples, and community support in the official GitHub repository. Visit GitHub

### Experience Qwen Chat
Experience the Qwen-3 model directly through the official Qwen Chat website or mobile app. Visit Qwen Chat

About Us | Privacy Policy | Contact Us

© 2024 Qwen-3. All rights reserved. The Qwen-3 series models are developed by the Alibaba Cloud Tongyi Qianwen team and open sourced under the Apache 2.0 license. See the official documentation and GitHub repository for details.
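Because vLLM and SGLang expose OpenAI-compatible endpoints, the API call is the same whether you self-host or use a cloud provider. The sketch below builds a standard `/v1/chat/completions` request body; the model identifier `Qwen/Qwen3-8B` and the temperature value are illustrative choices, not prescriptions from this page.

```python
import json

def chat_completion_request(model: str, prompt: str) -> str:
    """Build an OpenAI-compatible /v1/chat/completions request body,
    as accepted by vLLM, SGLang, or cloud providers with compatible
    APIs. Model name and temperature here are illustrative."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,
    }
    return json.dumps(body)

payload = chat_completion_request("Qwen/Qwen3-8B", "Hello, Qwen!")
```

POST this payload to your endpoint's `/v1/chat/completions` path with an appropriate API key; any OpenAI-compatible client library can be used instead of constructing the JSON by hand.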