Add LoRA fork weight loading (pre-transformers-v5 base)#654

Open
arcticfly wants to merge 1 commit into main from fix/fork-on-pre-v5
Conversation

@arcticfly
Collaborator

Summary

Adds the pieces needed for `backend._experimental_fork_checkpoint` to actually
load the forked LoRA weights into the trainer (rather than just copying the
checkpoint directory and letting `from_pretrained` initialize a fresh LoRA).

  • `UnslothState.load_lora_adapter(path)` — reads `adapter_model.safetensors` and applies it to the live peft model via `set_peft_model_state_dict`.
  • `UnslothService._forked_checkpoint_dir` — records the forked path so the first `_train_dedicated` / `_train_shared` call applies it.
  • `LocalBackend._experimental_fork_checkpoint` — invalidates the `_state` cache after `shutil.copytree` and records `_forked_checkpoint_dir` on the service.
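The three pieces above fit together roughly as follows. This is a minimal, dependency-free sketch reconstructed from the PR description, not the actual diff: the class shapes are assumptions, and the real `safetensors.torch.load_file` / `peft.set_peft_model_state_dict` calls are stubbed out as injected callables.

```python
import os


class UnslothState:
    """Sketch of the state object holding the live peft model."""

    def __init__(self, peft_model, load_file, set_peft_state_dict):
        # In the real code, load_file would be safetensors.torch.load_file
        # and set_peft_state_dict would be peft's set_peft_model_state_dict.
        self.peft_model = peft_model
        self._load_file = load_file
        self._set_peft_state_dict = set_peft_state_dict

    def load_lora_adapter(self, path):
        # Replace the freshly-initialized LoRA layers (from from_pretrained)
        # with the weights saved in the forked checkpoint.
        state_dict = self._load_file(
            os.path.join(path, "adapter_model.safetensors")
        )
        self._set_peft_state_dict(self.peft_model, state_dict)


class UnslothService:
    """Sketch of the deferred-apply flow for a forked checkpoint."""

    def __init__(self, state):
        self.state = state
        # Set by the backend after shutil.copytree; consumed exactly once.
        self._forked_checkpoint_dir = None

    def _train_dedicated(self):
        # The first training call applies the forked weights, then clears
        # the marker so later steps train from the live model as usual.
        if self._forked_checkpoint_dir is not None:
            self.state.load_lora_adapter(self._forked_checkpoint_dir)
            self._forked_checkpoint_dir = None
```

The key design point is that the backend only records the path; the weight load is deferred to the first training call, after the trainer has initialized the peft model it will be applied to.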

Why the unusual base

This branch is based on commit `621e82b2` (the last commit before the transformers-v5 upgrade in #629), not current main. On H200 with `load_in_4bit=True`, transformers v5 + Unsloth 2026.3.3 crashes with a `Half and BFloat16` dtype-mismatch error in Unsloth's fused LoRA kernels on the first forward pass, before any rollouts. The v4 base avoids that.

Not expected to merge as-is — posting as a reference for the fork-weight-loading mechanics. Maintainers would likely want to:

  1. Resolve the v5 dtype mismatch upstream (possibly via Unsloth), then
  2. Cherry-pick the three pieces above onto main.

Test plan

  • End-to-end 20-step training on a forked `kl-000-1` checkpoint: checkpoint reloaded correctly across every step, `val/reward` started at ~0.86 (source-checkpoint quality, not raw-base-model quality).
  • End-to-end training without forking: unchanged behavior.
  • Maintainer review of whether this approach is the right shape for a forward-port.

🤖 Generated with Claude Code

Adds three pieces needed for LocalBackend._experimental_fork_checkpoint
to actually load the forked LoRA weights into the trainer:

1. UnslothState.load_lora_adapter — loads adapter_model.safetensors
   into the live peft model via set_peft_model_state_dict, replacing
   the freshly-initialized LoRA layers from from_pretrained.

2. UnslothService._forked_checkpoint_dir — stores the forked path so
   the first _train_dedicated / _train_shared call can apply it.

3. backend._experimental_fork_checkpoint — invalidates the _state cache
   after copytree, then records _forked_checkpoint_dir on the service.

Built on 621e82b (pre-transformers-v5) because v5 introduces a bf16/fp16
mismatch in Unsloth's fused LoRA kernels that crashes every forward pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>