DeepSeek·Open Source

DeepSeek V3

Efficient MoE rivaling closed frontier models.

Params
671B MoE
Context
128K
License
MIT (code) / DeepSeek Model License (weights)
Released
December 2024

Overview

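DeepSeek V3 is a 671B-parameter Mixture-of-Experts (MoE) model that activates only ~37B parameters per token. Because per-token compute scales with the active parameters rather than the total, it reaches frontier-level performance at a fraction of the training cost of a comparably sized dense model.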
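
The gap between total and active parameters can be made concrete with a little arithmetic. The sketch below uses only the two figures quoted above; the "2 FLOPs per active parameter per token" estimate is a common rule of thumb assumed here for illustration, not a number from DeepSeek.

```python
# Figures quoted in the overview above.
TOTAL_PARAMS = 671e9    # all experts plus shared layers
ACTIVE_PARAMS = 37e9    # parameters actually used for one token


def active_fraction(total_params: float, active_params: float) -> float:
    """Share of the model's parameters that participate in one token."""
    return active_params / total_params


def approx_forward_flops_per_token(active_params: float) -> float:
    """Rough forward-pass cost.

    ASSUMPTION: ~2 FLOPs (one multiply, one add) per active parameter
    per token. This ignores attention over the context and other overhead.
    """
    return 2 * active_params


frac = active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS)
flops = approx_forward_flops_per_token(ACTIVE_PARAMS)

print(f"Active fraction: {frac:.1%}")              # Active fraction: 5.5%
print(f"Approx. forward FLOPs/token: {flops:.2e}")
```

Under these assumptions, each token touches roughly one eighteenth of the weights, which is where the cost advantage over a dense model of the same total size comes from.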

Features

  • Mixture of Experts
  • FP8 training
  • Strong reasoning
  • Open weights
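
To illustrate the Mixture of Experts idea in the list above, here is a minimal sketch of generic top-k softmax routing in plain Python. It is not DeepSeek V3's actual router: the function names, the toy scalar "experts", and the choice of k are all hypothetical, picked only to show how a token is sent to a few experts and their outputs are blended.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_logits, k):
    """Pick the k highest-scoring experts for one token.

    Returns a list of (expert_index, weight) pairs whose weights are
    renormalized to sum to 1 over the selected experts.
    """
    probs = softmax(router_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


def moe_forward(x, router_logits, experts, k):
    """Run only the selected experts and combine their outputs."""
    out = 0.0
    for i, weight in route_top_k(router_logits, k):
        out += weight * experts[i](x)
    return out


# Toy setup (hypothetical): four scalar "experts" and fixed router scores.
experts = [lambda x: x + 1, lambda x: 2 * x, lambda x: -x, lambda x: x * x]
logits = [0.1, 2.0, -1.0, 1.5]

chosen = route_top_k(logits, k=2)
y = moe_forward(3.0, logits, experts, k=2)
print(chosen)  # experts 1 and 3 are selected; experts 0 and 2 never run
print(y)
```

Only the selected experts are evaluated, so compute per token depends on k rather than on the total number of experts.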

Advantages

  • Excellent cost/performance ratio
  • Open weights
  • Competitive with GPT-4-class models on public benchmarks

Limitations

  • MoE serving complexity
  • Hardware demands
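
The hardware demands follow from the total parameter count: every expert must be resident in memory even though only a few run per token. The sketch below estimates weight storage alone. The 80 GB accelerator size is an assumption for illustration, and the estimate ignores KV cache, activations, and runtime overhead, so real deployments need more.

```python
import math

TOTAL_PARAMS = 671e9  # total parameter count from the spec table

# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp8": 1, "bf16": 2}


def weight_gb(total_params: float, dtype: str) -> float:
    """Memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return total_params * BYTES_PER_PARAM[dtype] / 1e9


def min_gpus_for_weights(total_params: float, dtype: str,
                         gpu_mem_gb: float = 80) -> int:
    """Lower bound on device count just to hold the weights.

    ASSUMPTION: 80 GB per device by default, and no memory reserved
    for KV cache, activations, or framework overhead.
    """
    return math.ceil(weight_gb(total_params, dtype) / gpu_mem_gb)


for dtype in ("fp8", "bf16"):
    print(dtype, weight_gb(TOTAL_PARAMS, dtype),
          min_gpus_for_weights(TOTAL_PARAMS, dtype))
# fp8 671.0 9
# bf16 1342.0 17
```

Even at 8-bit precision the weights exceed a single 8 x 80 GB node under these assumptions, which is why serving typically spans multiple nodes or uses further quantization.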

Best for

  • Research
  • Cost-sensitive inference
  • Reasoning tasks