
visual rft (1)

Paper review: R1-Zero’s “Aha Moment” in Visual Reasoning on a 2B Non-SFT Model

R1-Zero’s “Aha Moment” in Visual Reasoning on a 2B Non-SFT Model
Date: March 20, 2025
https://arxiv.org/pdf/2503.05132
The authors say the work is still in progress.
I came across the Awesome MLLM Reasoning list: https://github.com/HJYao00/Awesome-Reasoning-MLLM?tab=readme-ov-file
This paper is one I found there.
There is also a repo called open_r1 that Hugging Face opened to reproduce DeepSeek R1: https://github.com/huggingface/open-r1

Abstract
DeepSeek-R1: reinforcement learning with simple rule-based rewards → complex reasoning.
Self-reflection and longer responses emerge during training..

Posted: 2025. 3. 27.
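To make "simple rule-based rewards" concrete, here is a minimal sketch of what such a reward could look like. It is not the code from the paper or from open-r1. It assumes responses are wrapped in `<think>…</think><answer>…</answer>` tags, scores the answer by exact string match, and adds the two scores with equal weight. The names `format_reward`, `accuracy_reward`, and `total_reward` are hypothetical.

```python
import re

# Assumed response layout: <think>reasoning</think><answer>final answer</answer>
THINK_ANSWER = re.compile(
    r"^<think>(.+?)</think>\s*<answer>(.+?)</answer>$", re.DOTALL
)


def format_reward(response: str) -> float:
    """Return 1.0 if the response follows the assumed tag layout, else 0.0."""
    return 1.0 if THINK_ANSWER.match(response.strip()) else 0.0


def accuracy_reward(response: str, ground_truth: str) -> float:
    """Return 1.0 if the extracted answer exactly matches the ground truth."""
    match = THINK_ANSWER.match(response.strip())
    if match is None:
        return 0.0  # no parsable answer, so no accuracy credit
    return 1.0 if match.group(2).strip() == ground_truth.strip() else 0.0


def total_reward(response: str, ground_truth: str) -> float:
    """Sum of both rule-based terms (equal weighting is an assumption)."""
    return format_reward(response) + accuracy_reward(response, ground_truth)


if __name__ == "__main__":
    sample = "<think>Count the red cubes one by one.</think><answer>3</answer>"
    print(total_reward(sample, "3"))
```

Because both terms are plain string checks, no learned reward model is needed. That is the sense in which the rewards are "rule-based".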

