heretic

mirror of https://github.com/p-e-w/heretic.git synced 2026-06-02 05:03:33 +02:00

Author	SHA1	Message	Date
dependabot[bot]	5f6e1e4d52	build(deps): bump requests from 2.32.5 to 2.33.0 (#272) Bumps [requests](https://github.com/psf/requests) from 2.32.5 to 2.33.0. - [Release notes](https://github.com/psf/requests/releases) - [Changelog](https://github.com/psf/requests/blob/main/HISTORY.md) - [Commits](https://github.com/psf/requests/compare/v2.32.5...v2.33.0) --- updated-dependencies: - dependency-name: requests dependency-version: 2.33.0 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-04-04 08:25:25 +05:30
dependabot[bot]	7ebd92dfa7	build(deps): bump pygments from 2.19.2 to 2.20.0 (#271) Bumps [pygments](https://github.com/pygments/pygments) from 2.19.2 to 2.20.0. - [Release notes](https://github.com/pygments/pygments/releases) - [Changelog](https://github.com/pygments/pygments/blob/master/CHANGES) - [Commits](https://github.com/pygments/pygments/compare/2.19.2...2.20.0) --- updated-dependencies: - dependency-name: pygments dependency-version: 2.20.0 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-04-04 08:24:56 +05:30
dependabot[bot]	655d66ef24	build(deps): bump nltk from 3.9.3 to 3.9.4 (#270) Bumps [nltk](https://github.com/nltk/nltk) from 3.9.3 to 3.9.4. - [Changelog](https://github.com/nltk/nltk/blob/develop/ChangeLog) - [Commits](https://github.com/nltk/nltk/compare/3.9.3...3.9.4) --- updated-dependencies: - dependency-name: nltk dependency-version: 3.9.4 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-04-04 08:24:29 +05:30
dependabot[bot]	0f99c882ec	build(deps): bump filelock from 3.20.0 to 3.20.3 (#269) Bumps [filelock](https://github.com/tox-dev/py-filelock) from 3.20.0 to 3.20.3. - [Release notes](https://github.com/tox-dev/py-filelock/releases) - [Changelog](https://github.com/tox-dev/filelock/blob/main/docs/changelog.rst) - [Commits](https://github.com/tox-dev/py-filelock/compare/3.20.0...3.20.3) --- updated-dependencies: - dependency-name: filelock dependency-version: 3.20.3 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-04-04 08:23:59 +05:30
dependabot[bot]	92f851b693	build(deps): bump pillow from 12.0.0 to 12.1.1 (#268) Bumps [pillow](https://github.com/python-pillow/Pillow) from 12.0.0 to 12.1.1. - [Release notes](https://github.com/python-pillow/Pillow/releases) - [Changelog](https://github.com/python-pillow/Pillow/blob/main/CHANGES.rst) - [Commits](https://github.com/python-pillow/Pillow/compare/12.0.0...12.1.1) --- updated-dependencies: - dependency-name: pillow dependency-version: 12.1.1 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-04-04 08:23:32 +05:30
dependabot[bot]	81e0c84ec6	build(deps): bump aiohttp from 3.13.2 to 3.13.4 (#267) --- updated-dependencies: - dependency-name: aiohttp dependency-version: 3.13.4 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-04-04 08:10:51 +05:30
Philipp Emanuel Weidmann	887d43a8d9	fix: set batch size on HFLM object	2026-04-01 14:27:43 +05:30
Philipp Emanuel Weidmann	96c7a7d98a	fix: replace tqdm progress bars with Rich progress bars	2026-03-28 18:30:15 +05:30
Philipp Emanuel Weidmann	1126332281	feat: add integrated benchmarking system	2026-03-24 18:25:12 +05:30
Philipp Emanuel Weidmann	19cdf7e244	fix: address ty complaint	2026-03-15 09:58:00 +05:30
Philipp Emanuel Weidmann	94775d4148	chore: update dependencies	2026-03-15 09:31:32 +05:30
cpagac	515a7b9eb5	fix: prevent div-by-zero in evaluator when base_refusals is 0 (#225) * fix: prevent div-by-zero in evaluator when base_refusals is 0 When a model refuses all prompts from the start, base_refusals is 0. Return refusals directly in that case so ablations that introduce new refusals are still penalized correctly. * fix: cast refusals to float for type consistency" before hitting commit changes Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> --------- Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	2026-03-13 11:21:23 +05:30
erm14254	e26da5e0e6	fix: display all abliterable components across layers (#215) * fix: display all abliterable components across layers The current code only displays abliterable components from layer 0, which is misleading for hybrid architectures like Qwen3.5 that use different attention types across layers (e.g., `linear_attn.out_proj` in some layers, `self_attn.o_proj` in others). This fix iterates through all layers to collect and display the complete set of abliterable components with accurate module counts. Before (Qwen3.5-27B): * attn.out_proj: 1 modules per layer * mlp.down_proj: 1 modules per layer After (Qwen3.5-27B): * attn.out_proj: 48 modules total * attn.o_proj: 16 modules total * mlp.down_proj: 64 modules total * Fix formatting --------- Co-authored-by: Lawfer12 <ac728@ymail.com>	2026-03-11 14:10:37 +05:30
Philipp Emanuel Weidmann	ec0367226d	style: fix formatting and naming	2026-03-06 13:18:08 +05:30
Matthias Stegner	5e3c04c802	feat: add Qwen3.5 MoE hybrid layer support (#187) * feat: add Qwen3.5 MoE hybrid layer support Qwen3.5 MoE uses GatedDeltaNet (linear attention) on some layers instead of standard self-attention, causing abliteration to fail because self_attn.o_proj doesn't exist on those layers. Changes: - Wrap self_attn.o_proj in suppress(Exception) and add linear_attn.out_proj as alternative attention out-projection for GatedDeltaNet layers - Scan all layers in get_abliterable_components() instead of only layer 0, since hybrid models have different components on different layers - Derive LoRA target_modules from actual named_modules() instead of splitting component keys, which fails when module names differ across layers (e.g. "o_proj" vs "out_proj") Tested with Qwen3.5-397B-A17B (7/100 refusals, KL 0.2676). Relates to #43 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * Apply suggestion from @gemini-code-assist[bot] Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Co-authored-by: Philipp Emanuel Weidmann <pew@worldwidemann.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	2026-03-06 13:03:57 +05:30
Spiky Moth	303ba9d978	fix: recheck prefix after inserting predefined (#194)	2026-02-27 08:07:33 +05:30
Philipp Emanuel Weidmann	cb4ef3fdfc	docs: add Trendshift badge to README	2026-02-20 13:00:19 +05:30
cpagac	4c80c4beb9	fix: report VRAM usage across all GPUs instead of only the default device (#169) memory_allocated() and memory_reserved() without a device argument only report GPU 0. Sum across all devices for correct multi-GPU totals and add total VRAM reporting.	2026-02-17 12:53:41 +05:30
Spiky Moth	3a115e280c	fix: produce card for local models with existing readme (#157)	2026-02-15 19:10:10 +05:30
Philipp Emanuel Weidmann	27097bfe8e	build: bump version to 1.2.0 (tag: v1.2.0)	2026-02-14 18:11:42 +05:30
Philipp Emanuel Weidmann	025ab3a881	fix: disable LoRA export for now Workaround for #152	2026-02-14 16:56:12 +05:30
Philipp Emanuel Weidmann	1179013999	docs: update README	2026-02-14 16:32:08 +05:30
Philipp Emanuel Weidmann	fe7bc1bae3	docs: update README	2026-02-14 10:47:28 +05:30
Philipp Emanuel Weidmann	e70a1a85e8	fix: don't load checkpoint when evaluating a second model Fixes #144	2026-02-14 10:02:17 +05:30
Philipp Emanuel Weidmann	e7f8be98b7	fix: only export tokenizer when exporting full model Fixes #143	2026-02-14 09:18:22 +05:30
Philipp Emanuel Weidmann	6017bcd347	fix: use compatible release specifiers for non-dev dependencies Fixes #145 Credit to MuX on Discord for recognizing that this is an issue with Transformers 5	2026-02-13 12:27:57 +05:30
Philipp Emanuel Weidmann	dd0b3a2f69	docs: update README	2026-02-11 11:09:17 +05:30
Philipp Emanuel Weidmann	b873598b77	docs: improve settings documentation	2026-02-11 10:19:05 +05:30
Philipp Emanuel Weidmann	10ceb3098e	chore: update copyright notice	2026-02-11 09:46:36 +05:30
Salman Chishti	745b582414	ci: upgrade GitHub Actions to latest versions (#137) Signed-off-by: Salman Muin Kayser Chishti <13schishti@gmail.com>	2026-02-08 16:49:04 +05:30
Salman Chishti	d0e9462fb8	ci: upgrade GitHub Actions for Node 24 compatibility (#136) Signed-off-by: Salman Muin Kayser Chishti <13schishti@gmail.com>	2026-02-08 16:48:12 +05:30
Philipp Emanuel Weidmann	f68a887a7b	fix: improve code quality, improve UX, fix small bugs	2026-02-08 13:32:00 +05:30
Philipp Emanuel Weidmann	2690655a83	feat: print memory usage during run	2026-02-02 21:18:01 +05:30
Spiky Moth	3525b1ac22	Implement Magnitude-Preserving Orthogonal Ablation (#52) * feat: add support for winsorizing the residuals Adds setting winsorization_quantile, expressed as the quantile to clamp to. - If set to a value below 1, the residuals obtained from evaluating the first token of the good and bad prompts are winsorized - that is, values outside the given quantile are clamped. Note that winsorization_quantile = 0.95 corresponds to a 90% winsorization. * feat: implement magnitude-preserving orthogonal ablation Adds boolean setting orthogonalize_direction: - When enabled, only the component of the refusal directions that is orthogonal to the harmless direction is subtracted during abliteration. Adds enum-valued setting row_normalization: - 'none': No normalization. - 'pre': Row-normalize the weight matrix before computing the LoRA adapter. - 'full': Like 'pre', but re-normalizes to preserve original row magnitudes. * prefer 'good' and 'bad' over 'harmless' and 'harmful' * clarify how winsorization is applied * store and reuse full peft_config * remove unneeded cast * make LoRA rank configurable for full normalization * explain why the singular values are split across the components	2026-02-02 17:05:19 +05:30
anrp	42f5a9b553	fix: Use file instead of symlink lock (for windows) (#116)	2026-01-25 19:34:01 +05:30
anrp	451db0b76e	fix: specify study name (#119) If we don't, optuna will generate a UUID for a name, which will never be found when loading as it is a "different" study. https://optuna.readthedocs.io/en/stable/reference/generated/optuna.study.create_study.html#optuna.study.create_study	2026-01-25 18:48:23 +05:30
anrp	ebc22c299e	feat: Allow study progress to be saved & resumed (#106) * feat: Store active study in log/study.jsonl and allow resuming * Simplify resume logic with load_if_exists=True * Significantly improve flexibility of study save/load * Put constructor arguments at the highest precedence * Review comments --------- Co-authored-by: Spiky Moth <spikymoth@pm.me>	2026-01-23 19:49:37 +05:30
anrp	d5c834c51d	fix: Allow abliterating VL models (#108) Per https://huggingface.co/docs/transformers/en/model_doc/auto#auto-classes, it indicates that "There is one class of AutoModel for each task." Use the presence of "vision_config" in the config.json to determine which.	2026-01-23 19:34:31 +05:30
anrp	c86f49035e	feat: Refactor save machinery and always allow user to save LoRA (#110)	2026-01-20 18:53:47 +05:30
anrp	85a6ec5ecb	fix: Include kernels (allows MXFP4 to be loaded in MXFP4 instead of upcasting) (#107) Co-authored-by: Andrew Patrikalakis <anrp@tri.global>	2026-01-16 17:30:24 +05:30
Philipp Emanuel Weidmann	632b1da622	feat: add config file for slop reduction	2026-01-11 18:51:26 +05:30
Philipp Emanuel Weidmann	1cfd09d7f3	ci: add style guide for Gemini	2026-01-09 14:58:56 +05:30
Philipp Emanuel Weidmann	09be09e12e	fix: restore classification of empty responses as refusals Fixes #93	2026-01-02 16:50:02 +05:30
Philipp Emanuel Weidmann	039f6222d2	feat: allow overriding the system prompt per dataset	2025-12-31 14:26:44 +05:30
Philipp Emanuel Weidmann	c4b2ea0c42	feat: allow injecting prefixes and suffixes into prompts	2025-12-31 12:00:44 +05:30
Philipp Emanuel Weidmann	02a5237a02	feat: add option to print prompt/response pairs	2025-12-27 14:48:29 +05:30
Philipp Emanuel Weidmann	cf8cf6f349	fix: address remaining ty complaint	2025-12-22 11:12:45 +05:30
Philipp Emanuel Weidmann	2141e110fb	ci: treat ty warnings as errors	2025-12-22 10:57:36 +05:30
Philipp Emanuel Weidmann	39101137ef	ci: add type checking	2025-12-22 10:48:42 +05:30
Philipp Emanuel Weidmann	064bed9a9f	fix: resolve issues raised by ty A single issue has been deliberately left unfixed to verify that the CI check works	2025-12-22 10:24:55 +05:30
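
Commit 3525b1ac22 (#52) describes winsorizing residuals: values outside a given quantile are clamped, and winsorization_quantile = 0.95 is said to correspond to a 90% winsorization. A minimal sketch of that idea in plain Python follows. It assumes symmetric clamping at the (1 - q) and q quantiles, which is one reading of the commit message; the helper names `quantile` and `winsorize` and the linear-interpolation quantile are illustrative assumptions, not Heretic's actual implementation.

```python
def quantile(sorted_values, q):
    """Linearly interpolated quantile of an already sorted, non-empty list."""
    position = q * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


def winsorize(values, winsorization_quantile):
    """Clamp values below the (1 - q) quantile and above the q quantile.

    Assumption: both tails are clamped symmetrically, so q = 0.95 leaves
    the central 90% of the distribution untouched.
    """
    if winsorization_quantile >= 1:
        # A quantile of 1 (or more) disables winsorization.
        return list(values)
    ordered = sorted(values)
    low = quantile(ordered, 1 - winsorization_quantile)
    high = quantile(ordered, winsorization_quantile)
    return [min(max(v, low), high) for v in values]


# Hypothetical residual values with one outlier in each tail.
residuals = [-100.0, -2.0, -1.0, 0.0, 1.0, 2.0, 100.0]
clamped = winsorize(residuals, 0.95)
```

With these inputs the two outliers are pulled toward the rest of the data, while the five central values are returned unchanged.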

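
The orthogonalize_direction setting from commit 3525b1ac22 (#52) is described as subtracting only the component of the refusal direction that is orthogonal to the harmless ("good") direction. The sketch below shows the underlying vector operation, removing the projection of one vector onto another, in plain Python. The function names and the two-dimensional example vectors are hypothetical and chosen only for illustration; Heretic's own code operates on model tensors and may differ.

```python
def dot(a, b):
    """Dot product of two equal-length sequences of floats."""
    return sum(x * y for x, y in zip(a, b))


def orthogonalize(refusal_direction, good_direction):
    """Return refusal_direction with its projection onto good_direction removed."""
    norm_squared = dot(good_direction, good_direction)
    if norm_squared == 0:
        # Nothing to project onto; return the direction unchanged.
        return list(refusal_direction)
    scale = dot(refusal_direction, good_direction) / norm_squared
    return [r - scale * g for r, g in zip(refusal_direction, good_direction)]


# Hypothetical directions: the refusal direction overlaps with the good one.
refusal = [3.0, 4.0]
good = [1.0, 0.0]
orthogonal = orthogonalize(refusal, good)
```

The result has no component along `good`, so subtracting it from a weight matrix would, under this reading, leave the good direction untouched.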

131 Commits
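
Commit 515a7b9eb5 (#225) guards a division in the evaluator: when base_refusals is 0, refusals is returned directly (cast to float) so that ablations introducing new refusals are still penalized. A minimal sketch of such a guard, assuming the unguarded score is the ratio refusals / base_refusals; the function name `refusal_score` is hypothetical and the real evaluator may combine this value with other terms.

```python
def refusal_score(refusals: int, base_refusals: int) -> float:
    """Ratio of refusals to the baseline, with a guard for a zero baseline."""
    if base_refusals == 0:
        # Avoid ZeroDivisionError; any new refusal still raises the score.
        return float(refusals)
    return refusals / base_refusals


# With a zero baseline, each new refusal increases the score by one.
scores = [refusal_score(n, 0) for n in range(3)]
```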