Kimi K2.5 - Model Report

Overview

Attribute | Details
Developer | Moonshot AI (China)
Release Date | January 27, 2026
Model Type | Native Multimodal Agentic Model
License | Modified MIT License

Architecture

Specification | Value
Total Parameters | 1.04 Trillion
Active Parameters | 32 Billion
Architecture | Mixture of Experts (MoE)
Number of Experts | 384 (8 selected per token + 1 shared)
Vision Encoder | 400 Million parameters
Context Window | 256K tokens
Modalities | Text, Image, Video
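
The sparsity implied by these figures can be worked out directly. The sketch below is only arithmetic on the numbers in the table; the constant and function names are illustrative and do not come from any Moonshot AI tooling.

```python
# Figures copied from the Architecture table.
TOTAL_PARAMS = 1.04e12    # 1.04 trillion total parameters
ACTIVE_PARAMS = 32e9      # 32 billion parameters active per token
NUM_EXPERTS = 384         # routed experts
ROUTED_PER_TOKEN = 8      # routed experts selected per token
SHARED_EXPERTS = 1        # shared expert that is always used


def active_fraction(active: float, total: float) -> float:
    """Share of all parameters that take part in processing one token."""
    return active / total


def routed_expert_fraction(selected: int, num_experts: int) -> float:
    """Share of the routed experts that are selected for one token."""
    return selected / num_experts


if __name__ == "__main__":
    frac_params = active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
    frac_experts = routed_expert_fraction(ROUTED_PER_TOKEN, NUM_EXPERTS)
    print(f"Active parameters per token: {frac_params:.1%}")
    print(f"Routed experts used per token: {frac_experts:.1%}")
    print(f"Experts evaluated per token: {ROUTED_PER_TOKEN + SHARED_EXPERTS}")
```

Roughly 3% of the parameters are active for any given token, which is why per-token compute is far lower than the total parameter count suggests, while memory requirements still scale with the full 1.04 trillion.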

Training

Aspect | Details
Training Data | ~15 Trillion mixed visual and text tokens
Base Model | Kimi-K2-Base (continual pretraining)
Quantization | Native INT4 (QAT), Group size 32
Optimization | Hopper Architecture optimized
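
To make "INT4, group size 32" concrete, here is a minimal sketch of group-wise symmetric 4-bit quantization in plain Python. It illustrates the general technique only: the symmetric scheme, the signed range, and all function names are assumptions, and it rounds weights after the fact, whereas quantization-aware training (QAT) simulates this rounding during training. It is not Moonshot AI's actual recipe.

```python
from typing import List, Tuple

GROUP_SIZE = 32               # group size from the Training table
INT4_MIN, INT4_MAX = -8, 7    # signed 4-bit integer range (assumed scheme)


def quantize_group(values: List[float]) -> Tuple[List[int], float]:
    """Quantize one group of weights to INT4 codes with one shared scale."""
    max_abs = max(abs(v) for v in values)
    # Map the largest magnitude in the group onto the code 7.
    scale = max_abs / INT4_MAX if max_abs > 0 else 1.0
    codes = [max(INT4_MIN, min(INT4_MAX, round(v / scale))) for v in values]
    return codes, scale


def dequantize_group(codes: List[int], scale: float) -> List[float]:
    """Recover approximate weights from INT4 codes and the group scale."""
    return [c * scale for c in codes]


def quantize(weights: List[float], group_size: int = GROUP_SIZE):
    """Split a flat weight list into groups and quantize each independently."""
    return [
        quantize_group(weights[i:i + group_size])
        for i in range(0, len(weights), group_size)
    ]


if __name__ == "__main__":
    weights = [i / 10 - 3.2 for i in range(64)]   # toy data: two groups of 32
    for codes, scale in quantize(weights):
        print(f"scale={scale:.4f} codes[:4]={codes[:4]}")
```

Because each group of 32 weights carries its own scale, an outlier only degrades the precision of its own group rather than of the whole tensor.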

Key Features

  • Agent Swarm Orchestration: Can spawn and manage up to 100 agents per prompt
  • Multi-Agent Task Decomposition: Decomposes complex tasks into parallel sub-tasks
  • Thinking Mode: Returns reasoning traces in the reasoning_content field (temperature 1.0)
  • Instant Mode: Returns direct responses without reasoning traces (temperature 0.6)
  • Native Multimodal: Built-in text, image, and video understanding
  • Native INT4 Inference: ~2x generation speed improvement via QAT
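
The two response modes above differ in sampling temperature and in whether a reasoning trace is returned. The sketch below shows one way a client might handle that. The temperatures and the reasoning_content field name come from the list above; the helper names and the assumption that an assistant message arrives as a dictionary with a "content" key are hypothetical and should be checked against the provider's API documentation.

```python
from typing import Any, Dict, Optional, Tuple

# Temperatures from the Key Features list; the mode names are illustrative.
MODE_TEMPERATURE = {"thinking": 1.0, "instant": 0.6}


def sampling_params(mode: str) -> Dict[str, float]:
    """Return the temperature setting listed for the given mode."""
    if mode not in MODE_TEMPERATURE:
        raise ValueError(f"unknown mode: {mode!r}")
    return {"temperature": MODE_TEMPERATURE[mode]}


def split_message(message: Dict[str, Any]) -> Tuple[Optional[str], str]:
    """Split an assistant message into (reasoning trace, final answer).

    Assumes the message is a dict with a "content" key and, in thinking
    mode, an additional "reasoning_content" key.
    """
    return message.get("reasoning_content"), message.get("content", "")


if __name__ == "__main__":
    print(sampling_params("thinking"))
    # Hand-written example message, not real model output.
    trace, answer = split_message(
        {"reasoning_content": "2 + 2 is 4.", "content": "4"}
    )
    print("trace:", trace)
    print("answer:", answer)
```

Treating the trace as optional lets the same client code handle both modes, since an instant-mode message simply has no reasoning_content to return.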

Benchmarks

Benchmark | Score
Hallucinations | 100%
General Knowledge | 100%
Reasoning | 100%
Ethics | 100%
Mathematics | 96.8% (97th percentile)
Coding | 92.0% (76th percentile)

Pricing

Platform | Details
Self-Hosted | Free (open source)
NVIDIA NIM | Available via NIM catalog
Cloud APIs | Various providers

Open Source Availability

Platform | Status
Hugging Face | moonshotai/Kimi-K2.5
GGUF Quants | unsloth/Kimi-K2-Instruct-GGUF
Weights Download | Available
Self-Hosting | Possible (requires significant hardware)

Minimum Hardware for Self-Hosting

Memory Requirements

Quantization | Model Size | Min Memory (RAM+VRAM+Disk)
1.8-bit GGUF | ~247GB | 250GB
2-bit XL | ~300GB | 300GB+
Q8 (Full) | ~1.09TB | 8x H200 GPUs
FP8 | ~1TB | Enterprise GPU cluster
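
A rough lower bound on these sizes follows from the parameter count and the average bits stored per weight. The sketch below assumes a uniform bit width across all 1.04 trillion parameters and decimal gigabytes; the function name is illustrative.

```python
TOTAL_PARAMS = 1.04e12   # total parameters, from the Architecture table


def weight_size_gb(bits_per_weight: float, params: float = TOTAL_PARAMS) -> float:
    """Approximate weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    if bits_per_weight <= 0:
        raise ValueError("bits_per_weight must be positive")
    total_bytes = params * bits_per_weight / 8
    return total_bytes / 1e9


if __name__ == "__main__":
    for label, bits in [("1.8-bit", 1.8), ("2-bit", 2.0), ("8-bit", 8.0)]:
        print(f"{label}: about {weight_size_gb(bits):.0f} GB")
```

This gives about 234 GB at 1.8 bits, 260 GB at 2 bits, and 1,040 GB at 8 bits. The sizes listed in the table are somewhat larger, which would be consistent with some tensors being kept at higher precision plus file metadata, though that explanation is an assumption rather than something the table states. Runtime memory also needs headroom beyond the weights for the KV cache and activations.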

Apple Hardware Options

Requirement | Minimum | Recommended
Product | Mac Studio (M3 Ultra) | Mac Studio (M3 Ultra)
Unified Memory | 256GB | 512GB
Storage | 500GB+ NVMe | 1TB+ NVMe
Quantization | 1.8-bit GGUF | 2-bit or higher
Expected Speed | 1-2 tokens/sec | 5+ tokens/sec
Approx. Cost | ~$8,000 | ~$12,000

Why Mac Studio M3 Ultra?

Apple Product | Max Unified Memory | Sufficient?
MacBook Pro M4 Max | 128GB | No
Mac Studio M4 Max | 128GB | No
Mac Studio M3 Ultra | 512GB | Yes
Mac Pro M2 Ultra | 192GB | Borderline

Note: The Mac Studio with M3 Ultra (512GB unified memory) is currently the only consumer Apple product capable of running Kimi K2.5 locally. Lower memory configurations will experience severe performance degradation due to disk swapping.

Performance Expectations

Setup | VRAM | RAM | Speed
RTX 4090 + 256GB RAM | 24GB | 256GB | 1-2 tok/s
Mac Studio M3 Ultra 512GB | 512GB unified | - | 3-5 tok/s
2x A100 80GB | 160GB | 512GB | 15-20 tok/s
8x H200 | 1.1TB | - | ~45 tok/s
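
To put these throughput figures in practical terms, the sketch below converts a generation speed into wall-clock time for a response of a given length. It is plain arithmetic; the function name is illustrative, and the estimate covers token generation only, not prompt processing.

```python
def generation_seconds(num_tokens: int, tokens_per_second: float) -> float:
    """Time to generate num_tokens at a steady tokens_per_second rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second


if __name__ == "__main__":
    response_tokens = 500   # hypothetical response length
    # Speeds taken from the table: low end of the RTX 4090 range
    # and the 8x H200 figure.
    for setup, speed in [("RTX 4090 + 256GB RAM", 1.0), ("8x H200", 45.0)]:
        seconds = generation_seconds(response_tokens, speed)
        print(f"{setup}: about {seconds:.0f} s for {response_tokens} tokens")
```

At 1 to 2 tok/s, a 500-token answer takes roughly four to eight minutes, compared with about eleven seconds at 45 tok/s, so the low-end setups are better suited to batch or offline use than to interactive chat.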
