Aligning LLMs with Direct Preference Optimization
Published 2024-02-08