mirror of
https://github.com/p-e-w/heretic.git
synced 2026-06-02 05:03:33 +02:00
docs: update README
@@ -116,8 +116,9 @@ a configuration file.
 
 At the start of a program run, Heretic benchmarks the system to determine
 the optimal batch size to make the most of the available hardware.
-On an RTX 3090, with the default configuration, decensoring Llama-3.1-8B-Instruct
-takes about 45 minutes. Note that Heretic supports model quantization with
+On an RTX 3090, with the default configuration, decensoring
+[Qwen3-4B-Instruct-2507](https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507)
+takes about 20-30 minutes. Note that Heretic supports model quantization with
 bitsandbytes, which can drastically reduce the amount of VRAM required to process
 models. Set the `quantization` option to `bnb_4bit` to enable quantization.
 
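
The README text in this hunk says Heretic benchmarks the system at startup to pick a batch size. The sketch below illustrates one common way such a benchmark can work; it is not Heretic's actual code. The names `timed` and `find_best_batch_size`, the power-of-two search, the `max_batch_size` default, and the use of `MemoryError` as a stand-in for an out-of-memory failure are all assumptions made for illustration.

```python
import time
from typing import Callable


def timed(workload: Callable[[int], None]) -> Callable[[int], float]:
    """Wrap a workload so that calling it returns elapsed wall-clock seconds."""

    def measure(batch_size: int) -> float:
        start = time.perf_counter()
        workload(batch_size)
        return time.perf_counter() - start

    return measure


def find_best_batch_size(
    measure: Callable[[int], float],
    max_batch_size: int = 128,  # hypothetical upper bound for the search
) -> int:
    """Try batch sizes 1, 2, 4, ... and return the one with the best throughput.

    `measure(n)` must return the seconds needed to process one batch of size n.
    """
    best_size = 1
    best_throughput = 0.0
    batch_size = 1
    while batch_size <= max_batch_size:
        try:
            elapsed = measure(batch_size)
        except MemoryError:
            # Assumed stand-in for running out of memory: larger batches
            # will not fit either, so stop searching.
            break
        # Items processed per second; guard against a zero elapsed time.
        throughput = batch_size / max(elapsed, 1e-9)
        if throughput > best_throughput:
            best_size = batch_size
            best_throughput = throughput
        batch_size *= 2
    return best_size
```

In practice the measurement would wrap a real generation call, for example `find_best_batch_size(timed(run_one_batch))`, where `run_one_batch` is a hypothetical function that processes a batch of the given size. Passing the measurement in as a function keeps the search logic separate from timing, which also makes it easy to check with synthetic costs.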