RLHF

3 practitioners working with RLHF:

AI sycophancy: why your assistant agrees too much and how to fix it
understanding why AI models default to agreement and what to do about it

the approval problem
ChatGPT tells 5,000 people to breathe. heretic hits 1,000 stars. someone in Ukraine builds AI that survives power cuts. seven signals about what happens when you own your AI, or don't.

the sycophancy tax
ChatGPT says 'breathe.' 5,000 people are leaving. this is what happens when you can't configure your AI's personality: someone else does it for you.