8-bit Optimizers via Block-wise Quantization
Published 2021-10-07