8-bit Optimizers via Block-wise Quantization
Published 2021-10-07