(WIP) A Little Bit of Reinforcement Learning from Human Feedback
2025-02-17
This work introduces Reinforcement Learning from Human Feedback (RLHF), an important technique for deploying modern machine learning systems. It offers a gentle introduction to the core methods for readers with some quantitative background, beginning with the field's origins in several disciplines, including economics and philosophy. It then covers definitions, problem formulations, data collection, and the mathematics common in the RLHF literature, and highlights popular algorithms and directions for future research.