DeepSeek · Open Source
DeepSeek V3
Efficient MoE rivaling closed frontier models.
Params
671B MoE
Context
128K
License
MIT (code) / DeepSeek Model License (weights)
Released
December 2024
Overview
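DeepSeek V3 is a 671B-parameter Mixture-of-Experts (MoE) model that activates only ~37B parameters per token, so per-token compute is closer to that of a ~37B dense model than a 671B one. Its developers report performance comparable to leading closed models at a fraction of their training cost.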
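To make the "active per token" figure concrete, here is a minimal arithmetic sketch. The two constants come from this card; the function name `active_fraction` is purely illustrative and not part of any DeepSeek tooling.

```python
# Figures taken from the card above; ~37B is an approximation.
TOTAL_PARAMS_B = 671   # total parameters, in billions
ACTIVE_PARAMS_B = 37   # parameters used per token, in billions


def active_fraction(total_b: float, active_b: float) -> float:
    """Return the share of parameters that participate in one token's forward pass."""
    if total_b <= 0 or not 0 <= active_b <= total_b:
        raise ValueError("expected 0 <= active_b <= total_b and total_b > 0")
    return active_b / total_b


if __name__ == "__main__":
    frac = active_fraction(TOTAL_PARAMS_B, ACTIVE_PARAMS_B)
    print(f"Active share per token: {frac:.1%}")  # prints "Active share per token: 5.5%"
```

Roughly one parameter in eighteen is used for any given token, which is where the inference-cost advantage over a dense model of the same total size comes from.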
Features
- Mixture of Experts
- FP8 training
- Strong reasoning
- Open weights
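The Mixture of Experts feature can be illustrated with a generic top-k router. This is a simplified sketch of the general technique (softmax gating over scalar toy experts), not DeepSeek V3's actual routing scheme; `route_top_k`, `moe_forward`, and the toy experts are hypothetical names chosen for illustration.

```python
import math
from typing import Callable, Dict, List


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of router scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_scores: List[float], k: int) -> Dict[int, float]:
    """Select the k highest-scoring experts and renormalize their weights to sum to 1."""
    if not 1 <= k <= len(router_scores):
        raise ValueError("k must be between 1 and the number of experts")
    probs = softmax(router_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return {i: probs[i] / norm for i in top}


def moe_forward(x: float,
                experts: List[Callable[[float], float]],
                router_scores: List[float],
                k: int) -> float:
    """Run only the selected experts and mix their outputs by gate weight."""
    weights = route_top_k(router_scores, k)
    return sum(w * experts[i](x) for i, w in weights.items())


if __name__ == "__main__":
    # Four toy experts; only two are evaluated for this token.
    toy_experts = [lambda v: v, lambda v: 2 * v, lambda v: -v, lambda v: v * v]
    print(moe_forward(3.0, toy_experts, [0.1, 2.0, -1.0, 2.0], k=2))
```

The key property is that unselected experts are never called, so compute scales with `k` rather than with the total number of experts.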
Advantages
- Excellent cost/perf
- Open weights
- Competitive with GPT-4-class models on reported benchmarks
Limitations
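- MoE serving complexity (expert routing and multi-GPU sharding)
- Hardware demands (all 671B parameters must stay in memory, even though only ~37B are active per token)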
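A rough back-of-the-envelope sketch shows why hardware demands are a limitation. It counts weight storage only and ignores KV cache, activations, and framework overhead, so real deployments need more. The 80 GB per-GPU figure is an assumption for illustration, and the helper names are hypothetical.

```python
import math


def weight_memory_gb(params_billions: float, bytes_per_param: float) -> float:
    """Weight storage in GB (1 GB = 1e9 bytes): billions of params times bytes per param."""
    return params_billions * bytes_per_param


def min_gpus_for_weights(params_billions: float,
                         bytes_per_param: float,
                         gpu_memory_gb: float) -> int:
    """Lower bound on GPU count needed just to hold the weights."""
    return math.ceil(weight_memory_gb(params_billions, bytes_per_param) / gpu_memory_gb)


if __name__ == "__main__":
    TOTAL_PARAMS_B = 671      # from the card
    GPU_MEMORY_GB = 80        # assumed accelerator size, not from the card
    for label, nbytes in [("FP8", 1), ("BF16", 2)]:
        gb = weight_memory_gb(TOTAL_PARAMS_B, nbytes)
        gpus = min_gpus_for_weights(TOTAL_PARAMS_B, nbytes, GPU_MEMORY_GB)
        print(f"{label}: {gb:.0f} GB of weights, at least {gpus} x {GPU_MEMORY_GB} GB GPUs")
```

Under these assumptions the weights alone exceed a single 8-GPU node of 80 GB devices even at one byte per parameter, which is why serving typically involves multi-node sharding.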
Best for
- Research
- Cost-sensitive inference
- Reasoning tasks