162 Commits

Author SHA1 Message Date
red40maxxer 60bd531fde perf: optimize abliteration matrix op (#46)
* perf: optimize abliteration matrix op

* refactor: comments and var names correspond with arditi

* refactor: fix comments and improve var notation

* fix: accidental line change and improve comments

---------

Co-authored-by: mad-cat-lon <113548315+mad-cat-lon@users.noreply.github.com>
2025-12-02 08:13:43 +05:30
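Note on 60bd531fde: the commit message refers to the notation of Arditi et al., in which ablating a unit refusal direction from a weight matrix that writes to the residual stream is an orthogonal projection. The log does not say what the optimization actually was. The identity below is only an assumed illustration of how such an operation can be made cheaper, by regrouping the product so that the explicit d-by-d projector is never formed. The symbols are illustrative and not taken from the code.

```latex
W' \;=\; W - \left(\hat{r}\,\hat{r}^{\top}\right) W \;=\; W - \hat{r}\left(\hat{r}^{\top} W\right),
\qquad \hat{r} \in \mathbb{R}^{d},\ \lVert \hat{r} \rVert = 1,\ W \in \mathbb{R}^{d \times k}
```

The left grouping builds a d-by-d matrix and costs on the order of d²k multiplications. The right grouping is a vector-matrix product followed by a rank-1 update, on the order of dk.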
Spiky Moth 1f74ac2888 Guard against refusals in broken English (#45)
* Guard against refusals in broken English

* Normalize whitespace between words
2025-11-26 11:29:08 +05:30
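Note on 1f74ac2888: a minimal sketch of the whitespace normalization step named in the commit, assuming refusals are detected by substring matching against a marker list. The marker list and the function names are hypothetical. The project's actual markers and matching logic are not shown in this log.

```python
import re

# Hypothetical marker list, for illustration only.
REFUSAL_MARKERS = ["i cannot", "i can't", "i am not able"]


def normalize(text: str) -> str:
    """Lowercase the text and collapse every run of whitespace to one space."""
    return re.sub(r"\s+", " ", text.lower()).strip()


def looks_like_refusal(response: str) -> bool:
    """Return True if any marker occurs in the normalized response."""
    normalized = normalize(response)
    return any(marker in normalized for marker in REFUSAL_MARKERS)
```

Without the normalization, a response such as "I  cannot" (two spaces) or one with a line break between the words would slip past a plain substring check.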
_Vinayyyy_ 63fc0e7d5a feat: Add bfloat16 to default dtypes list (#44)
Co-authored-by: Vinay Umrethe <vinayumrethe99@gmail.com>
2025-11-25 12:22:52 +05:30
_Vinayyyy_ 1efc4ee9e1 Featuring Notebook (Colab/Kaggle) Compatibility (#42)
* feat: Add hybrid UI for notebook compatibility

* Restore notebook detection logic

* fix: Improve notebook detection with env vars

* chore: cleanup

* chore: cleanup

* correct ruff format

* refactor: Address code review feedback

- Move password handling to prompt_password
- Use only_directories=True for save path prompt
- Simplify prompt_text arguments

---------

Co-authored-by: Vinay Umrethe <vinayumrethe99@gmail.com>
2025-11-24 19:46:39 +05:30
Nikolai Kolodziej 452b35e7b7 Add trust_remote_code configuration option (#31)
* Add `trust_remote_code` configuration option and apply it when loading models and tokenizers

* Default `trust_remote_code` to `None` and set it to `True` if previously `None` so the user wouldn't be asked multiple times

* Consistently access `trust_remote_code` from `self.settings` instead of the global `settings` object.

* Introduce `trusted_models` dictionary to manage and confirm `trust_remote_code` settings during model loading

* Assign `trust_remote_code` to `evaluate_model` in `trusted_models` instead of `model`
2025-11-24 06:27:44 +05:30
Spiky Moth b79b8b1475 Improve support for loading local datasets (#33)
* Handle loading local datasets

* Reorder branches to avoid chain of negatives
2025-11-23 11:15:34 +05:30
Philipp Emanuel Weidmann 83cbf0612a Add option to print refusal geometry 2025-11-22 13:18:54 +05:30
Philipp Emanuel Weidmann c35f3031f8 Allow stopping the optimization process early with Ctrl+C 2025-11-21 10:11:00 +05:30
Nikolai Kolodziej 2e1bb4b655 Use PYTORCH_ALLOC_CONF instead of deprecated PYTORCH_CUDA_ALLOC_CONF (#32)
* Use `PYTORCH_ALLOC_CONF` instead of deprecated `PYTORCH_CUDA_ALLOC_CONF`

* style: reformat environment variable check
2025-11-21 07:27:28 +05:30
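Note on 2e1bb4b655: a minimal sketch of an environment variable check of this kind, assuming the goal is to set a default allocator configuration only when the user has not supplied one under either name. The function name and the default value `expandable_segments:True` are assumptions. The log does not show which value the project sets.

```python
import os


def configure_allocator(env, value="expandable_segments:True"):
    """Set PYTORCH_ALLOC_CONF unless the allocator is already configured.

    `env` is any mutable mapping, such as os.environ. The deprecated
    PYTORCH_CUDA_ALLOC_CONF is still checked so that an existing user
    setting under the old name is respected.
    """
    if "PYTORCH_ALLOC_CONF" not in env and "PYTORCH_CUDA_ALLOC_CONF" not in env:
        env["PYTORCH_ALLOC_CONF"] = value
    return env


if __name__ == "__main__":
    # Should run before the allocator is initialized,
    # in practice before importing torch.
    configure_allocator(os.environ)
```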
Anthony Eufemio af02bc6ece Fix support for MXFP4 quantized models with Triton tensors (#28)
When loading models with MXFP4 quantization (e.g., openai/gpt-oss-20b),
the transformers library uses Triton tensors to wrap the quantized weights.
These Triton tensors have a .data attribute containing the underlying
PyTorch tensor, but torch.is_tensor() returns False for them.

This caused a KeyError: 'mlp.down_proj' when trying to load such models,
as the try_add() function would fail the assertion check before adding
the down projection matrices.

The fix extracts the underlying PyTorch tensor via the .data attribute
when encountering Triton tensors, allowing heretic to work with MXFP4
quantized models while maintaining full compatibility with standard models.

Tested with openai/gpt-oss-20b on PyTorch 2.9.1+cu130, transformers 4.57.1,
triton 3.5.1, and kernels 0.11.0.
2025-11-20 13:43:06 +05:30
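Note on af02bc6ece: the unwrapping logic described in the commit message can be sketched without any GPU dependencies. Everything here is a stand-in: `PlainTensor` plays the role of a PyTorch tensor, `WrappedWeights` plays the role of a Triton wrapper exposing `.data`, and `is_plain_tensor` stands in for `torch.is_tensor()`. The `try_add` below is a hypothetical reimplementation, not the project's function.

```python
class PlainTensor:
    """Stand-in for a torch.Tensor."""


class WrappedWeights:
    """Stand-in for a wrapper type that holds the real tensor in .data."""

    def __init__(self, data):
        self.data = data


def is_plain_tensor(obj) -> bool:
    """Stand-in for torch.is_tensor(): False for wrapper objects."""
    return isinstance(obj, PlainTensor)


def unwrap(obj):
    """Return the underlying tensor, or None if there is none."""
    if is_plain_tensor(obj):
        return obj
    inner = getattr(obj, "data", None)
    return inner if is_plain_tensor(inner) else None


def try_add(components: dict, name: str, obj) -> bool:
    """Register a matrix under `name` if a tensor can be extracted from it."""
    tensor = unwrap(obj)
    if tensor is None:
        return False
    components.setdefault(name, []).append(tensor)
    return True
```

With this shape, a wrapped down projection is registered under `mlp.down_proj` instead of being skipped, so a later lookup of that key no longer fails.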
Philipp Emanuel Weidmann 22a4a5b5b5 Add citation information to README 2025-11-19 12:14:17 +05:30
Philipp Emanuel Weidmann 694edf18d3 Follow up after recent PRs 2025-11-19 11:19:47 +05:30
Philipp Emanuel Weidmann c9c022a143 Fix linting issues 2025-11-19 10:16:58 +05:30
Philipp Emanuel Weidmann 9905d9517f Fix formatting issues 2025-11-19 10:04:43 +05:30
Philipp Emanuel Weidmann f06e939791 Add Ruff as a dev dependency 2025-11-19 09:59:18 +05:30
Philipp Emanuel Weidmann f3b9826ca4 Add CI workflow 2025-11-19 09:45:54 +05:30
Richard Young, PhD 13bb7b24d6 Fix KeyError when HuggingFace user profile fields are missing (#20)
Handle optional fullname and email fields in user profile gracefully
using .get() method with fallback values to prevent KeyError when
uploading models to HuggingFace.

This fixes an issue where users without a public email or fullname
set in their HuggingFace profile would encounter an error during
the upload process.

Co-authored-by: ricyoung <riyoung@gmail.com>
2025-11-19 05:32:50 +05:30
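Note on 13bb7b24d6: a minimal sketch of the `.get()` pattern the commit message describes, assuming the user profile is a plain dictionary. The `fullname` and `email` keys come from the message. The `name` key, the fallback strings, and the function name are hypothetical.

```python
def describe_user(profile: dict) -> str:
    """Format a user for display without assuming optional fields exist."""
    # Fall back to the account name, then to a placeholder.
    fullname = profile.get("fullname", profile.get("name", "unknown user"))
    email = profile.get("email", "no public email")
    return f"{fullname} ({email})"
```

Indexing with `profile["email"]` raises `KeyError` when the field is absent, while `.get()` returns the fallback instead.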
Nikolai Kolodziej c8b6663b93 Fix multi-GPU support and memory management (#17)
* Ensure projector is on the same device as the matrix for multi-GPU support

* Optimize memory management for loaded model weights

* Refactor memory management by removing unnecessary gc.collect() calls

* Optimize memory usage (#1)

* Improve memory management by explicitly deleting model layers and optimizing projector usage

* Optimize memory management by explicitly deleting the model and forcing garbage collection

* Add back deleted `empty_cache` call

* Fix broken file

* Remove unnecessary deletions

* Remove unnecessary empty_cache() calls

* Remove unused import of gc

* Duplicate `gc.collect` call in `empty_cache()`

* Move additional `gc.collect` call in front of `torch.x.empty_cache`
2025-11-19 05:09:12 +05:30
Ooze 61fdf72b42 Add support for Granite MoE Hybrid in model.py by including down projections for shared MLP and MoE experts (#14) 2025-11-18 08:32:58 +05:30
red40maxxer 7bad84b4f1 perf: clear residuals after computing direction (#15)
Co-authored-by: mad-cat-lon <113548315+mad-cat-lon@users.noreply.github.com>
2025-11-17 22:18:22 +05:30
Matt Barnson 09730bad70 MPS support (#5)
* MPS support

* oops, added issue tracker.

* Delete .beads/issues.jsonl
2025-11-17 18:42:01 +05:30
Philipp Emanuel Weidmann b3545e4b1e Fix retrieving package version (tag: v1.0.1) 2025-11-16 17:35:13 +05:30
Philipp Emanuel Weidmann 3f346b6150 Change package name 2025-11-16 17:01:50 +05:30
Philipp Emanuel Weidmann 1a59d226c1 Fix spacing after images in README 2025-11-16 16:06:08 +05:30
Philipp Emanuel Weidmann 12ecf50033 Add README 2025-11-16 15:19:27 +05:30
Philipp Emanuel Weidmann ea699dce46 Improve appearance of selection menus 2025-11-16 11:32:58 +05:30
Philipp Emanuel Weidmann 8a1aceff11 Switch to multi-objective optimization 2025-11-14 18:04:23 +05:30
Philipp Emanuel Weidmann 0bae27f359 Fix some of the problems with Falcon-E-3B 2025-11-13 11:39:08 +05:30
Philipp Emanuel Weidmann e24080db64 Add metadata to pyproject.toml 2025-11-02 10:06:15 +05:30
Philipp Emanuel Weidmann fae39ffb89 Move default configuration to Python 2025-11-02 09:29:55 +05:30
Philipp Emanuel Weidmann 850c21b534 Make multivariate TPE work properly 2025-11-01 16:57:12 +05:30
Philipp Emanuel Weidmann a24e6eba96 Improve optimization 2025-10-31 16:04:28 +05:30
Philipp Emanuel Weidmann a9655c8d31 Perform calculations involving residual vectors in float32
Credit to Jim Lai for pointing out potential numerical problems in https://huggingface.co/blog/grimjim/projected-abliteration
2025-10-31 13:47:24 +05:30
Philipp Emanuel Weidmann 1496e0a04c Dynamically choose between global and per-layer refusal directions 2025-10-31 13:04:45 +05:30
Philipp Emanuel Weidmann c638d3d012 Adjust score parameters 2025-10-25 13:15:31 +05:30
Philipp Emanuel Weidmann 47e855d5d8 Guard against missing model card data 2025-10-25 13:12:18 +05:30
Philipp Emanuel Weidmann e2419de016 Add "abliterated" to model tags 2025-10-25 09:59:44 +05:30
Philipp Emanuel Weidmann ad8b04d371 Bump version to 1.0.0 2025-10-25 09:52:43 +05:30
Philipp Emanuel Weidmann 37c5ea06d1 Print elapsed and remaining time 2025-10-25 09:47:54 +05:30
Philipp Emanuel Weidmann cf57a0cfbe Add functionality to evaluate any model relative to the main model 2025-10-24 13:38:03 +05:30
Philipp Emanuel Weidmann e6aba71186 Improve refusal detection 2025-10-24 11:27:28 +05:30
Philipp Emanuel Weidmann f8f3f9a012 Fix chat responses being cut off 2025-10-22 12:30:28 +05:30
Philipp Emanuel Weidmann 6359aa44bb Separate abliteration parameters for different layer components 2025-10-22 12:05:28 +05:30
Philipp Emanuel Weidmann ed65d6902b Support gpt-oss MoE 2025-10-15 17:51:39 +05:30
Philipp Emanuel Weidmann 7ed0cb1ffb Support Phi-3.5-MoE 2025-10-14 11:23:53 +05:30
Philipp Emanuel Weidmann 8b827ee386 Support multimodal models 2025-10-14 10:32:34 +05:30
Philipp Emanuel Weidmann dd7abd3296 Add hf_transfer to dependencies
Required for repositories that don't use Xet
2025-10-14 07:56:43 +05:30
Philipp Emanuel Weidmann 3d5e645c13 Handle Ctrl+C gracefully 2025-10-12 12:53:40 +05:30
Philipp Emanuel Weidmann 74b55977f0 Pretty-print configuration errors 2025-10-12 10:39:59 +05:30
Philipp Emanuel Weidmann b4a0c0d3f2 Add program version to generated README intro 2025-10-11 17:31:11 +05:30