DeepSeek-R1-Distill-Qwen-32B is a distilled model based on Qwen2.5-32B. It was fine-tuned on 800,000 curated samples generated by DeepSeek-R1 and performs strongly in mathematics, programming, and reasoning. It achieved leading results on multiple benchmarks, including AIME 2024, MATH-500, and GPQA Diamond.
Provider: deepseek-ai
Context window: 128K
Pricing: $0.20 input / $0.20 output per 1M tokens
Capabilities: serverless, json_mode, function_calling, structured_output
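
To make the listed pricing concrete, here is a minimal Python sketch that estimates the cost of a single request from its token counts. The helper name `estimate_cost` and the constant names are hypothetical, chosen only for illustration. The sketch assumes billing is strictly linear in tokens at the listed rates, with no minimum charge, rounding, or caching discount, which the listing does not confirm.

```python
from decimal import Decimal

# Listed prices in USD per 1M tokens, taken from the pricing line above.
INPUT_PRICE_PER_M = Decimal("0.20")
OUTPUT_PRICE_PER_M = Decimal("0.20")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Hypothetical helper: estimated USD cost of one request.

    Assumes purely linear per-token billing (an assumption, not a
    documented billing rule).
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        Decimal(input_tokens) * INPUT_PRICE_PER_M
        + Decimal(output_tokens) * OUTPUT_PRICE_PER_M
    )
    return total / TOKENS_PER_M


if __name__ == "__main__":
    # 100,000 prompt tokens plus 20,000 generated tokens.
    print(f"${estimate_cost(100_000, 20_000)}")
```

Under these assumptions, a request with 100,000 input tokens and 20,000 output tokens comes to $0.024. `Decimal` is used instead of floats so that small per-token amounts do not pick up binary rounding error.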