JetBrains released Mellum2, open-sourcing the weights under the Apache 2.0 license. The first version of Mellum was a completion-focused 4B dense model. Mellum2 is its successor: a general-purpose model specialized in software engineering. It covers code generation and editing, debugging, multi-step reasoning, tool use and function calling, agentic coding, and conversational programming assistance.

The JetBrains team positions Mellum2 as a “focal model”: a fast, specialized component inside larger AI systems, not a standalone replacement for frontier models.

Architecture

Mellum2 uses a Mixture-of-Experts (MoE) architecture with 12B total parameters and 2.5B active parameters per token. In MoE models, only a subset of parameters runs on each token. Here, the model has 64 experts and activates 8 per token. This keeps per-token compute equivalent to a 2.5B dense model, while the total parameter count provides higher capacity for specialization.
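To make the "8 of 64 experts" idea concrete, here is a minimal top-k routing sketch in plain Python. It is an illustration under stated assumptions, not Mellum2's actual router: the toy experts, the random router logits, and the choice to apply softmax over the selected experts only are all assumptions made for this example. Only the expert counts (64 total, 8 active) come from the article.

```python
import math
import random

NUM_EXPERTS = 64  # total experts, from the article
TOP_K = 8         # experts activated per token, from the article


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def route_token(router_logits, k=TOP_K):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:k]
    # Assumption: weights are normalized over the selected experts only.
    weights = softmax([router_logits[i] for i in top])
    return list(zip(top, weights))


def moe_layer(x, router_logits, experts, k=TOP_K):
    """Weighted sum of the outputs of the selected experts; the rest never run."""
    return sum(w * experts[i](x) for i, w in route_token(router_logits, k))


# Toy experts: each one just scales its input (a stand-in for a feed-forward block).
experts = [lambda x, s=i: x * (s + 1) for i in range(NUM_EXPERTS)]

rng = random.Random(0)
logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]  # hypothetical router scores
selected = route_token(logits)
output = moe_layer(1.0, logits, experts)
print(len(selected), "of", NUM_EXPERTS, "experts ran for this token")
```

The point of the sketch is that the other 56 experts are never called for this token, which is why per-token compute tracks the active parameter count rather than the total.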

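Key architectural details:

  • Layers: 28
  • Hidden size: 2304
  • Experts: 64 total, 8 active per token
  • Grouped-query attention (GQA): 32 query heads, 4 key-value heads
  • Sliding-window attention (SWA): 1,024-token window on three quarters of the layers
  • Vocabulary: 98,304 tokens
  • Precision: bfloat16
  • Multi-token prediction (MTP) head: enables speculative decoding without a separate draft model
  • Context window: 131,072 tokens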

The model handles natural language and code. It is not multimodal — there is no image or video input.

Pre-Training

Pre-training spans approximately 10.6 trillion tokens through a three-phase curriculum. The data mixture progressively shifts from diverse web content toward curated code and mathematical content across the three phases.

Training used the Muon optimizer under FP8 hybrid precision, with a Warmup-Hold-Decay learning rate schedule that decays linearly to zero.
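A Warmup-Hold-Decay schedule can be sketched as a piecewise function of the training step. The article only states that the decay is linear and ends at zero; the linear warmup shape, the function name `whd_lr`, and every step count and learning rate below are assumptions chosen for illustration, not JetBrains' settings.

```python
def whd_lr(step, total_steps, peak_lr, warmup_steps, decay_steps):
    """Learning rate at `step` for a Warmup-Hold-Decay schedule.

    Assumes a linear warmup from 0 to peak_lr, a constant hold at peak_lr,
    and a linear decay from peak_lr to 0 over the final `decay_steps` steps.
    """
    if warmup_steps <= 0 or decay_steps <= 0:
        raise ValueError("warmup_steps and decay_steps must be positive")
    if warmup_steps + decay_steps > total_steps:
        raise ValueError("warmup and decay phases do not fit in total_steps")
    if not 0 <= step <= total_steps:
        raise ValueError("step must be within [0, total_steps]")

    decay_start = total_steps - decay_steps
    if step < warmup_steps:
        return peak_lr * step / warmup_steps               # warmup phase
    if step < decay_start:
        return peak_lr                                     # hold phase
    return peak_lr * (total_steps - step) / decay_steps    # linear decay to zero


# Hypothetical settings: 1,000 steps, 100 warmup steps, 200 decay steps.
for s in (0, 50, 100, 500, 900, 1000):
    print(s, whd_lr(s, total_steps=1000, peak_lr=1.0, warmup_steps=100, decay_steps=200))
```

The hold phase keeps the rate at its peak for most of training, and the final decay phase brings it to exactly zero at the last step.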

After pre-training, the base model’s context window was extended to 128K tokens using a layer-selective YaRN method before post-training began.

The Model Family

The JetBrains team released six checkpoints covering the full training pipeline:

| Checkpoint | Description |
| --- | --- |
| Mellum2-12B-A2.5B-Base-Pretrain | Base checkpoint before long-context extension |
| Mellum2-12B-A2.5B-Base | Final base model after context extension |
| Mellum2-12B-A2.5B-Instruct-SFT | Supervised fine-tuned instruction checkpoint |
| Mellum2-12B-A2.5B-Thinking-SFT | Supervised thinking checkpoint |
| Mellum2-12B-A2.5B-Instruct | RL-tuned instruction model |
| Mellum2-12B-A2.5B-Thinking | RL-tuned thinking model |

Post-training follows two stages: supervised fine-tuning (SFT), then reinforcement learning with verifiable rewards (RLVR) on math, executable coding, tool use, instruction following, reasoning, and knowledge tasks.
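The "verifiable" part of RLVR means the reward comes from an automatic check rather than a learned preference model. The article does not describe JetBrains' reward implementation, so the following is only a toy sketch for the executable-coding case: the function `coding_reward`, the binary 0/1 reward, and the test cases are all hypothetical. It also runs candidate code with `exec` in-process, which a real pipeline would replace with a sandbox.

```python
def coding_reward(candidate_source, func_name, test_cases):
    """Return 1.0 if the candidate passes every test case, otherwise 0.0.

    `test_cases` is a list of (args_tuple, expected_result) pairs.
    """
    namespace = {}
    try:
        # NOTE: unsandboxed execution, acceptable only in a toy example.
        exec(candidate_source, namespace)
        func = namespace[func_name]
        for args, expected in test_cases:
            if func(*args) != expected:
                return 0.0
    except Exception:
        # Syntax errors, missing functions, and runtime errors all earn no reward.
        return 0.0
    return 1.0


# Hypothetical task: reverse a string.
tests = [(("abc",), "cba"), (("",), "")]
good = "def reverse(s):\n    return s[::-1]\n"
bad = "def reverse(s):\n    return s\n"
broken = "def reverse(s) return s"

print(coding_reward(good, "reverse", tests))
print(coding_reward(bad, "reverse", tests))
print(coding_reward(broken, "reverse", tests))
```

Because the check is mechanical, the same function can score any number of sampled completions without human labeling.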

The Instruct variant answers directly, without an externalized chain of thought. Use it for low-latency tasks: direct answers, tool use, and instruction following.

The Thinking variant emits an explicit reasoning trace before its final answer. Use it for complex debugging, multi-step planning, or agentic flows where step-by-step reasoning matters.

Benchmark Results

All numbers below are self-reported by JetBrains. The comparison set is open-weight models in the 4B–14B range.

Coding:

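| Benchmark | Mellum2 Instruct | Qwen3.5 (4B) | Qwen3.5 (9B) | Ministral 3 (14B) | OLMo-3 (7B) | Seed-Coder (8B) |
| --- | --- | --- | --- | --- | --- | --- |
| LiveCodeBench v6 | 37.2 | 51.0 | 63.7 | 42.4 | 28.2 | 28.1 |
| EvalPlus | 78.4 | 69.4 | 71.8 | 74.1 | 67.3 | 73.8 |
| MultiPL-E | 67.1 | 51.0 | 67.1 | 71.5 | 36.1 | 77.0 |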

Tool Use:

| Benchmark | Mellum2 Instruct | Qwen3.5 (4B) | Qwen3.5 (9B) | Ministral 3 (14B) | OLMo-3 (7B) |
| --- | --- | --- | --- | --- | --- |
| BFCL v3 | 66.3 | 64.1 | 70.5 | 52.7 | 41.9 |
| BFCL v4 | 44.2 | 52.0 | 60.6 | 38.8 | 19.8 |

Math:

| Benchmark | Mellum2 Instruct | Qwen3.5 (4B) | Qwen3.5 (9B) | Ministral 3 (14B) | OLMo-3 (7B) |
| --- | --- | --- | --- | --- | --- |
| AIME 2025+2026 | 41.7 | 38.3 | 58.3 | 33.3 | 40.0 |
| GSM-Plus | 80.5 | 85.2 | 87.9 | 86.6 | 85.8 |

Knowledge and Conversational:

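| Benchmark | Mellum2 Instruct | Qwen3.5 (4B) | Qwen3.5 (9B) | Ministral 3 (14B) | OLMo-3 (7B) |
| --- | --- | --- | --- | --- | --- |
| MMLU-Redux | 78.1 | 87.5 | 91.1 | 85.9 | 71.8 |
| GPQA Diamond | 40.9 | 76.8 | 79.8 | 58.6 | 40.9 |
| IFEval | 75.8 | 82.1 | 83.9 | 67.3 | 83.2 |
| MixEval | 62.2 | 65.9 | 71.1 | 71.2 | 59.4 |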

Benchmark notes:

https://blog.jetbrains.com/ai/2026/06/mellum2-goes-open-source-a-fast-model-for-ai-workflows/

Use Cases

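JetBrains identifies four production scenarios where Mellum2’s latency and efficiency profile is relevant:

  • Routing and orchestration: analyzing prompts and selecting the right model or tool per task
  • RAG pipelines: summarizing retrieved context at low latency before response generation
  • Sub-agents: handling repetitive steps in agent pipelines (context gathering, validation, planning)
  • Private deployment: Apache 2.0 permits full self-hosting with no external API calls required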

Strengths and Limitations

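Strengths:

  • 2.5B active parameters per token, so compute matches a 2.5B dense model
  • MTP head enables built-in speculative decoding
  • 131K-token context window
  • Strong EvalPlus (78.4) and BFCL v3 (66.3) results
  • Apache 2.0 license permits commercial use, fine-tuning, and self-hosting
  • vLLM support with tool calling

Limitations:

  • Text and code only, with no multimodal input
  • LiveCodeBench v6 (37.2) is well below Qwen3.5 9B (63.7)
  • GPQA Diamond (40.9) is below most of the comparison models
  • GSM-Plus (80.5) trails every model listed
  • Intended as a component inside larger systems, not a frontier-model replacement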

Marktechpost’s Visual Explainer


Overview

JetBrains Open-Sources Mellum2

A 12B Mixture-of-Experts model released under Apache 2.0 on June 2, 2026. Trained from scratch on ~10.6 trillion tokens for software engineering tasks.

  • Total parameters: 12B
  • Active parameters per token: 2.5B
  • License: Apache 2.0
  • Context: 131,072 tokens
  • Architecture: MoE
  • Pre-training data: ~10.6T tokens

Architecture

How Mellum2 Is Built

MoE activates 8 of 64 experts per token — per-token compute stays equivalent to a 2.5B dense model. An MTP head enables speculative decoding without a separate draft model.

  • Layers: 28
  • Hidden size: 2304
  • Experts (total / active): 64 / 8
  • GQA heads (Q / KV): 32 / 4
  • SWA window: 1,024 (¾ of layers)
  • Vocabulary: 98,304
  • Precision: bfloat16
  • Modality: Text + Code
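The combination of few key-value heads and sliding-window attention mainly pays off in KV-cache memory at long context. The back-of-the-envelope estimate below uses the layer count, KV head count, window size, and precision listed above. Two things are assumptions, not figures from the article: the head dimension (taken here as hidden size divided by query heads, 2304 / 32 = 72) and the idea that sliding-window layers cache only the most recent 1,024 tokens, which depends on the serving engine.

```python
def kv_cache_bytes(context_len, layers=28, swa_fraction=0.75, window=1024,
                   kv_heads=4, head_dim=72, bytes_per_value=2):
    """Rough KV-cache size in bytes for a single sequence.

    head_dim=72 is an assumption (hidden size 2304 / 32 query heads);
    bytes_per_value=2 corresponds to bfloat16.
    """
    swa_layers = int(layers * swa_fraction)   # layers with a sliding window
    full_layers = layers - swa_layers         # layers attending to the full context
    # Keys and values for one token in one layer.
    per_token_per_layer = 2 * kv_heads * head_dim * bytes_per_value
    # Assumption: windowed layers keep at most `window` tokens in cache.
    cached_tokens = full_layers * context_len + swa_layers * min(context_len, window)
    return per_token_per_layer * cached_tokens


GIB = 1024 ** 3
with_swa = kv_cache_bytes(131_072)
without_swa = kv_cache_bytes(131_072, swa_fraction=0.0)
print(f"with sliding window:    {with_swa / GIB:.2f} GiB")
print(f"full attention only:    {without_swa / GIB:.2f} GiB")
```

Under these assumptions, most of the saving at the full 131,072-token context comes from the windowed layers, since only a quarter of the layers grow their cache with sequence length.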

Pre-Training

Training Pipeline

Three-phase curriculum progressively shifts from diverse web data toward curated code and math. Context extended to 128K via layer-selective YaRN before post-training.

  • Data: ~10.6 trillion tokens across three curriculum phases
  • Optimizer: Muon under FP8 hybrid precision
  • LR Schedule: Warmup-Hold-Decay with linear decay to zero
  • Context Extension: Layer-selective YaRN to 128K tokens
  • Post-Training: SFT → RLVR on coding, math, tool use, reasoning, knowledge
  • Design Constraint: Inference efficiency on commodity GPUs validated by ablation

Model Family

Six Checkpoints Released

Full pipeline from base pretrain through RL-tuned variants. Use Instruct for direct low-latency answers. Use Thinking for explicit step-by-step reasoning traces.

| Stage | Checkpoint | Note |
| --- | --- | --- |
| BASE | Mellum2-12B-A2.5B-Base-Pretrain | Before context extension |
| BASE | Mellum2-12B-A2.5B-Base | After YaRN extension |
| SFT | Mellum2-12B-A2.5B-Instruct-SFT | Supervised instruction |
| SFT | Mellum2-12B-A2.5B-Thinking-SFT | Supervised thinking |
| RLVR | Mellum2-12B-A2.5B-Instruct | RL-tuned, no CoT |
| RLVR | Mellum2-12B-A2.5B-Thinking | RL-tuned, explicit CoT |

Benchmarks

Evaluation Results (Instruct Variant)

All numbers self-reported by JetBrains. Comparison set: open-weight models in the 4B–14B range.

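| Benchmark | Mellum2 | Qwen3.5 9B | Ministral 3 14B | OLMo-3 7B |
| --- | --- | --- | --- | --- |
| LiveCodeBench v6 | 37.2 | 63.7 | 42.4 | 28.2 |
| EvalPlus | 78.4 | 71.8 | 74.1 | 67.3 |
| MultiPL-E | 67.1 | 67.1 | 71.5 | 36.1 |
| BFCL v3 | 66.3 | 70.5 | 52.7 | 41.9 |
| AIME 2025+2026 | 41.7 | 58.3 | 33.3 | 40.0 |
| IFEval | 75.8 | 83.9 | 67.3 | 83.2 |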

Use Cases

Where Mellum2 Fits in Production

JetBrains positions Mellum2 as a “focal model” — handling high-frequency, latency-sensitive steps inside larger AI pipelines.

  • Routing & Orchestration — Analyze prompts and select the right model or tool per task
  • RAG Pipelines — Summarize retrieved context at low latency before response generation
  • Sub-Agents — Handle repetitive steps in agent pipelines (context gathering, validation, planning)
  • Private Deployment — Apache 2.0 permits full self-hosting with no external API calls required

Strengths & Limitations

What Works and What Doesn’t

Mellum2 is designed for efficiency in component roles, not frontier-level capability across all benchmarks.

✓ Strengths

  • 2.5B active params — compute of a dense 2.5B model
  • MTP head enables built-in speculative decoding
  • 131K token context window
  • Strong EvalPlus (78.4) and BFCL v3 (66.3)
  • Apache 2.0 — commercial use, fine-tuning, self-hosting
  • vLLM support with tool-calling

✗ Limitations

  • Text and code only — no multimodal input
  • LiveCodeBench v6 (37.2) below Qwen3.5 9B (63.7)
  • GPQA Diamond (40.9) below most comparisons
  • GSM-Plus (80.5) trails all models listed
  • Not a frontier replacement — component role only

Quick Start

Deploy with vLLM

Install vLLM and serve the Instruct variant. Enable tool-calling with the hermes parser for function-calling workflows.

pip install vllm

# Basic serve
vllm serve JetBrains/Mellum2-12B-A2.5B-Instruct \
  --max-model-len 131072

# With tool calling
vllm serve JetBrains/Mellum2-12B-A2.5B-Instruct \
  --max-model-len 131072 \
  --enable-auto-tool-choice \
  --tool-call-parser hermes

Model weights: huggingface.co/JetBrains/mellum-2 · Technical report: arXiv:2605.31268


Getting Started

Serve Mellum2 with vLLM:

pip install vllm
vllm serve JetBrains/Mellum2-12B-A2.5B-Instruct --max-model-len 131072

With tool calling enabled:

vllm serve JetBrains/Mellum2-12B-A2.5B-Instruct \
  --max-model-len 131072 \
  --enable-auto-tool-choice \
  --tool-call-parser hermes
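Once the server is running, it can be queried over vLLM's OpenAI-compatible HTTP API. The sketch below builds a chat request with one tool attached, using only the standard library. The base URL assumes vLLM's default host and port, and the `run_tests` tool is a hypothetical definition invented for this example; neither comes from the JetBrains release.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000/v1"  # assumption: vLLM's default host and port
MODEL = "JetBrains/Mellum2-12B-A2.5B-Instruct"

# Hypothetical tool, written in the OpenAI function-calling format.
RUN_TESTS_TOOL = {
    "type": "function",
    "function": {
        "name": "run_tests",
        "description": "Run the project's test suite and return the failures.",
        "parameters": {
            "type": "object",
            "properties": {"path": {"type": "string"}},
            "required": ["path"],
        },
    },
}


def build_chat_request(prompt, tools=None):
    """Build the JSON payload for a /chat/completions call."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
    }
    if tools:
        payload["tools"] = tools
        payload["tool_choice"] = "auto"  # let the model decide whether to call a tool
    return payload


def send_chat_request(payload, base_url=BASE_URL):
    """POST the payload to the server and return the decoded JSON response."""
    request = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


payload = build_chat_request("Run the tests under ./src and summarize failures.",
                             tools=[RUN_TESTS_TOOL])
print(json.dumps(payload, indent=2))
```

Calling `send_chat_request(payload)` against a live server returns the completion. When the model decides to call the tool, the call should appear under `choices[0].message.tool_calls` in the response, following the OpenAI response format.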

Using the Hugging Face Transformers library:

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "JetBrains/Mellum2-12B-A2.5B-Instruct"

# Load the tokenizer and the model weights from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Format the conversation with the model's chat template and tokenize it
messages = [{"role": "user", "content": "Write a Python function to reverse a string."}]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

# Generate a reply, then decode only the new tokens (everything after the prompt)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))



The post JetBrains Releases Mellum2: A 12B MoE Model for Fast, Specialized Tasks in Multi-Model AI Pipelines appeared first on MarkTechPost.


