RLAIF (Reinforcement Learning from AI Feedback)
A variant of RLHF in which the feedback used to train the model comes from another AI system instead of from humans, which lets the alignment process scale.
Advanced rlhf ai_feedback alignment
Full definition
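RLAIF is a variant of RLHF (Reinforcement Learning from Human Feedback) in which the preference feedback used to train the model is produced by another AI system rather than by human annotators. Because AI-generated labels are cheaper and faster to collect than human ones, the alignment process becomes easier to scale.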
Example in a business context
Using GPT-4 to evaluate and score the responses of a smaller model during its training.
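The scoring step in the example above can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: `PreferencePair`, `toy_judge`, and `build_preference_pairs` are hypothetical names introduced here, and `toy_judge` is a simple stand-in heuristic for the call to a stronger judge model. No real model API is used.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class PreferencePair:
    """One AI-labeled preference: the judge scored `chosen` above `rejected`."""
    prompt: str
    chosen: str
    rejected: str


def toy_judge(prompt: str, response: str) -> float:
    """Hypothetical stand-in for the AI judge.

    A real setup would send the prompt and response to a stronger model and
    parse its score. Here a toy heuristic is used instead: count the words
    shared with the prompt and apply a small penalty for length.
    """
    prompt_words = set(prompt.lower().split())
    response_words = set(response.lower().split())
    return len(prompt_words & response_words) - 0.01 * len(response)


def build_preference_pairs(
    prompt: str,
    candidates: List[str],
    judge: Callable[[str, str], float],
) -> List[PreferencePair]:
    """Score every candidate with the judge and pair the best against the rest."""
    if len(candidates) < 2:
        return []  # a preference needs at least two responses to compare
    ranked = sorted(candidates, key=lambda r: judge(prompt, r), reverse=True)
    best = ranked[0]
    return [PreferencePair(prompt, best, other) for other in ranked[1:]]


if __name__ == "__main__":
    prompt = "What is the refund policy for annual plans?"
    candidates = [
        "Annual plans follow the refund policy described in the contract.",
        "I do not know.",
    ]
    for pair in build_preference_pairs(prompt, candidates, toy_judge):
        print("chosen:  ", pair.chosen)
        print("rejected:", pair.rejected)
```

The resulting pairs play the role that human preference labels play in RLHF: they can be used to train a reward model or fed to a preference-optimization step for the smaller model.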