MiniMax‑M1 Debuts as a Cost‑Efficient, High‑Performance RL‑Trained Model
MiniMax-AI has released MiniMax‑M1, a large open-weight AI model tuned for long-context reasoning and software engineering tasks. Built with a hybrid attention design and a novel reinforcement learning algorithm, it was trained in just three weeks for under $535K. Its public release gives developers a new long-context contender at unusually low cost.
Georg S. Kuklick • June 16, 2025
MiniMax‑M1 comes in two variants with “thinking budgets” of 40K and 80K tokens, targeting different task complexities. The model employs a hybrid attention mechanism built on Lightning Attention, along with a new RL fine-tuning strategy called CISPO. The result is a model whose performance is competitive with top-tier open-weight peers such as DeepSeek‑R1 and Qwen3‑235B. Training ran on 512 H800 GPUs and completed in just under three weeks at a compute cost of $534,700, putting MiniMax‑M1 among the most cost-efficient efforts in the 100B+ parameter class.
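The article names CISPO only in passing. As a rough illustration of the idea reported with the release — clipping the importance-sampling weight itself (rather than clipping the policy update, as PPO-style objectives do) so that every token still contributes a gradient — a minimal NumPy sketch might look like this. The function name, the `eps_high` value, and the toy inputs are illustrative assumptions, not details from the release:

```python
import numpy as np

def cispo_loss(logp_new, logp_old, advantages, eps_high=0.28):
    """Sketch of the CISPO objective (illustrative, not official code).

    The importance-sampling ratio is upper-clipped, then treated as a
    fixed weight on each token's log-probability. In a real autograd
    framework the clipped ratio would be stop-gradiented so gradients
    flow only through logp_new; here we just compute the scalar loss.
    """
    ratio = np.exp(logp_new - logp_old)          # per-token IS weight
    clipped = np.minimum(ratio, 1.0 + eps_high)  # clip the weight, not the update
    # Negative sign: maximizing the weighted objective = minimizing this loss.
    return -np.mean(clipped * advantages * logp_new)

# Toy usage: one token whose ratio explodes gets its weight capped,
# but it still participates in the loss instead of being masked out.
logp_old = np.array([-1.0, -1.0])
logp_new = np.array([1.0, -1.0])
adv = np.array([1.0, 1.0])
loss = cispo_loss(logp_new, logp_old, adv)
```

The contrast with PPO-style clipping is that a token with a large ratio is down-weighted rather than dropped from the gradient entirely, which the release credits with stabilizing long-chain-of-thought RL.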
The model particularly excels at tasks requiring long-context comprehension and complex reasoning over code, positioning it as a useful tool for AI engineers and researchers building on transformer backbones. Its public release via GitHub marks a deliberate open-access stance, contrasting with the more closed models from enterprise labs. For developers needing long-context handling, and for teams exploring new RL fine-tuning strategies, MiniMax‑M1 offers a compelling open-weight option with competitive performance and efficient scaling.