
Awesome LLMs Reasoning Abilities Papers
Demystifying Long Chain-of-Thought Reasoning in LLMs This study systematically investigate the mechanics of long CoT reasoning, identifying the key factors that enable models to generate long CoT trajectories and providing practical guidance for optimizing training strategies to enhance long CoT reasoning in LLMs https://arxiv.org/pdf/2502.03373.pdf DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning This work introduces first-generation reasoning models, DeepSeek-R1-Zero and DeepSeek-R1, which incorporates multi-stage training and cold-start data before RL and achieves performance comparable to OpenAI-o1-1217 on reasoning tasks. ...