Skip to content

Qwen3 - Model Report

Overview

AttributeDetails
DeveloperAlibaba Cloud (Qwen Team)
Release DateMay 2025 (original), July 2025 (Thinking update)
Model TypeLarge Language Model (Dense & MoE variants)
LicenseApache 2.0 (Open Source, Commercial Use Allowed)

Architecture

Model Family

ModelTotal ParamsActive ParamsArchitectureContext Window
Qwen3-235B-A22B235B22BMoE (128 experts, 8 active)32K / 131K (YaRN)
Qwen3-30B-A3B30B3BMoE (128 experts, 8 active)32K / 131K (YaRN)
Qwen3-32B32B32BDense Transformer32K / 131K (YaRN)
Qwen3-14B14B14BDense Transformer32K / 131K (YaRN)
Qwen3-8B8B8BDense Transformer32K / 131K (YaRN)
Qwen3-4B4B4BDense Transformer32K / 131K (YaRN)
Qwen3-1.7B1.7B1.7BDense Transformer32K / 131K (YaRN)
Qwen3-0.6B0.6B0.6BDense Transformer32K / 131K (YaRN)

Technical Specifications

SpecificationValue
TokenizerByte-level BPE (BBPE)
Vocabulary Size151,669 tokens
Languages Supported119 languages and dialects
MoE Routing128 experts, 8 selected per token

Training

AspectDetails
Training Data~36 Trillion tokens
Languages119 languages (up from 29 in Qwen2.5)
Pre-training Scale2x more tokens than Qwen2.5, 3x more languages
Post-trainingRLHF with thinking/non-thinking mode integration

Key Features

  • Unified Thinking Modes: Thinking mode (complex reasoning) and non-thinking mode (fast responses) in one model
  • Thinking Budget: Allocate computational resources adaptively based on task complexity
  • Dynamic Mode Switching: Automatic mode selection based on query type
  • Efficient MoE: 128 experts with only 8 active per token for efficiency
  • Extended Context: 32K native, 131K with YaRN extension
  • Multilingual: 119 languages and dialects support

Benchmarks

Flagship Model (Qwen3-235B-A22B)

BenchmarkScore
MMLU-Pro80.6%
LiveCodeBench69.5%
CodeForces ELOTop performer
BFCLTop performer

Smaller Models

ModelMMLUNotes
Qwen3-30B-A3B83%Outperforms QwQ-32B (10x more active params)
Qwen3-32B65.5% (Pro)Matches Qwen2.5-72B performance
Qwen3-4B-Rivals Qwen2.5-72B-Instruct

Efficiency Comparison

Qwen3 ModelEquivalent Qwen2.5 Performance
Qwen3-1.7BQwen2.5-3B
Qwen3-4BQwen2.5-7B
Qwen3-8BQwen2.5-14B
Qwen3-14BQwen2.5-32B
Qwen3-32BQwen2.5-72B

Pricing

PlatformDetails
Self-HostedFree (Apache 2.0)
Alibaba CloudPay-per-token API
Third-party APIsVarious providers (Together AI, Fireworks, etc.)

Open Source Availability

PlatformStatus
Hugging FaceQwen/Qwen3-235B-A22B
Hugging Face (32B)Qwen/Qwen3-32B
Hugging Face (8B)Qwen/Qwen3-8B
GGUF QuantsAvailable via community (llama.cpp compatible)
MLX FormatAvailable for Apple Silicon optimization
Weights DownloadAvailable
Self-HostingFully Supported

Minimum Hardware for Self-Hosting

By Model Size

ModelGGUF Q4_K_M SizeMin MemoryRecommended
Qwen3-0.6B~0.4GB4GB8GB
Qwen3-1.7B~1GB8GB8GB
Qwen3-4B~2.5GB8GB16GB
Qwen3-8B~5GB16GB16GB+
Qwen3-14B~8.5GB16GB24GB
Qwen3-32B~19.8GB32GB36-64GB
Qwen3-30B-A3B~18GB24GB32GB
Qwen3-235B-A22B~140GB192GB+256GB+

Apple Hardware Options

ModelMinimum Apple ProductMemoryApprox. CostSpeed
Qwen3-0.6BMacBook Air M18GB~$999100+ tok/s
Qwen3-4BMacBook Air M2/M3/M416GB~$1,19950+ tok/s
Qwen3-8BMacBook Air M2/M3/M416GB~$1,19930-40 tok/s
Qwen3-14BMacBook Pro M3/M424GB~$2,49920-30 tok/s
Qwen3-32BMacBook Pro M3/M4 Max36GB~$3,49915-25 tok/s
Qwen3-30B-A3BMacBook Pro M4 Max32GB~$3,19940-68 tok/s
Qwen3-235B-A22BMac Studio M3 Ultra256GB+~$8,000+5-10 tok/s
Use CaseModelApple ProductCost
Budget/MobileQwen3-8BMacBook Air M4 16GB~$1,199
Best BalanceQwen3-32BMacBook Pro M4 Max 48GB~$3,999
MoE EfficiencyQwen3-30B-A3BMacBook Pro M4 Max 48GB~$3,999
Maximum PowerQwen3-235B-A22BMac Studio M3 Ultra 512GB~$12,000

Performance on Apple Silicon

ChipModelQuantizationSpeed
M4 MaxQwen3-30B-A3B4-bit MLX~68 tok/s
M4 MaxQwen3-30B-A3BQ4_K_M GGUF~40 tok/s
M4 MaxQwen3-32BQ4_K_M GGUF~25 tok/s
M2 MaxQwen3-30B-A3B4-bit MLX~68 tok/s
M3 36GBQwen3-32BQ4_K_M~15-20 tok/s

Software Requirements

ComponentOptions
Inference EngineOllama, llama.cpp, MLX-LM, vLLM, LMStudio
Pythontransformers >= 4.51.0
Recommended for MacMLX-LM (optimized for Apple Silicon)

Summary

Minimum Apple Product to Run Qwen3:

  • Qwen3-8B (great general use): MacBook Air M4 16GB (~$1,199)
  • Qwen3-32B (flagship dense): MacBook Pro M4 Max 48GB (~$3,999)
  • Qwen3-235B (flagship MoE): Mac Studio M3 Ultra 256GB+ (~$8,000+)

Qwen3 offers exceptional flexibility—from tiny 0.6B models running on any Mac to the 235B flagship requiring enterprise hardware. The MoE models (30B-A3B, 235B-A22B) provide excellent performance-per-memory efficiency.

Sources

Technical research and documentation