ML Hyperpolyglot / Open Source Models
a side-by-side reference sheet
Contributions welcome on GitHub.
| General | GPT-OSS 120B | DeepSeek V3.2 | Kimi K2 | Qwen3-235B | GLM-4.7 |
|---|---|---|---|---|---|
| Organization | OpenAI | DeepSeek | Moonshot AI | Alibaba | Zhipu AI |
| License | Apache 2.0 | DeepSeek License | Modified MIT | Apache 2.0 | MIT |
| Parameters (Total) | 117B | 685B | 1T | 235B | 400B |
| Parameters (Active) | 5.1B | 37B | 32B | 22B | 32B |
| Context Window | 128k | 128k | 256k | 128k-256k | 200k |
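A quick sketch of what the two parameter rows imply: in a sparse MoE model only a fraction of the total weights is activated per token. The figures below follow each model's published card (treat them as assumptions where a card and this table disagree; GLM-4.7's counts are left out since its card is not confirmed here).

```python
# Per-token activation ratio = active params / total params.
# Figures assumed from each model's published card, not verified here.
models = {
    "GPT-OSS 120B": (117e9, 5.1e9),
    "DeepSeek V3.2": (685e9, 37e9),
    "Kimi K2": (1000e9, 32e9),
    "Qwen3-235B": (235e9, 22e9),
}

for name, (total, active) in models.items():
    print(f"{name}: {active / total:.1%} of parameters active per token")
```

The spread is wide: GPT-OSS 120B activates under 5% of its weights per token, while Qwen3-235B activates closer to 10%.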
| Architecture | GPT-OSS 120B | DeepSeek V3.2 | Kimi K2 | Qwen3-235B | GLM-4.7 |
|---|---|---|---|---|---|
| Architecture Type | Sparse MoE | Sparse MoE | Sparse MoE | Sparse MoE | Sparse MoE |
| Attention | GQA | MLA (DSA) | MLA | GQA | Interleaved |
| Tokenizer | BPE (o200k) | BPE | BPE (160k) | BPE | BPE |
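The GQA entries in the attention row can be sketched concretely: grouped-query attention shares each key/value head across a group of query heads, shrinking the KV cache. A minimal NumPy illustration (head counts here are illustrative, not from any model card):

```python
import numpy as np

# Grouped-query attention sketch: 8 query heads share 2 KV heads,
# so only 2 K/V heads need to be cached instead of 8.
n_q_heads, n_kv_heads, d_head, seq = 8, 2, 4, 5
group = n_q_heads // n_kv_heads  # 4 query heads per shared KV head

rng = np.random.default_rng(0)
q = rng.standard_normal((n_q_heads, seq, d_head))
k = rng.standard_normal((n_kv_heads, seq, d_head))
v = rng.standard_normal((n_kv_heads, seq, d_head))

out = np.empty_like(q)
for h in range(n_q_heads):
    kv = h // group  # each query head maps to its group's KV head
    scores = q[h] @ k[kv].T / np.sqrt(d_head)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    out[h] = weights @ v[kv]

print(out.shape)  # (8, 5, 4): full query-head output from 2 cached KV heads
```

MLA (multi-head latent attention) takes the compression further by projecting K/V into a shared low-rank latent, which is why the MLA models above can afford very large total parameter counts.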