- "Diffusion Model Alignment Using Direct Preference Optimization (DPO)", Wallace et al 2023 {Salesforce} https://arxiv.org/abs/2311.12908#salesforce