Commit Graph

6 Commits

Author SHA1 Message Date
Tobi Lütke
d6f3688d91
Remove grpo command from default train entrypoint 2026-02-22 15:29:09 -05:00
Tobi Lütke
189916d6fb
Move GRPO training out of default finetune pipeline 2026-02-22 15:26:23 -05:00
Tobi Lutke
1d7d167b29
finetune: strict Pydantic schema, one canonical data format
Replace ad-hoc JSON parsing with a strict Pydantic model
(TrainingExample with typed OutputPair). All data loading goes
through load_examples() which fails loudly on invalid data.

- Convert v3_structured.jsonl from "searches" to "output" format
- Rewrite all consumer scripts (prepare, validate, score, analyze)
  to load through the Pydantic schema
- Prepared train/val files are ephemeral build artifacts
- Restore LFM2 and GEPA experiments under experiments/
- Add pydantic>=2.0 to dependencies
2026-02-22 13:39:00 -04:00
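
Note on 1d7d167b29: the commit message names TrainingExample, OutputPair and load_examples() but does not show their fields. A minimal sketch of the approach, assuming Pydantic v2 and hypothetical field names (`query`, `output`, `kind`, `text`) that are not taken from the repository:

```python
from pydantic import BaseModel, ConfigDict, ValidationError


class OutputPair(BaseModel):
    # Hypothetical fields; the real schema is not shown in the commit message.
    model_config = ConfigDict(extra="forbid", strict=True)
    kind: str
    text: str


class TrainingExample(BaseModel):
    # extra="forbid" rejects legacy keys such as "searches" instead of ignoring them.
    model_config = ConfigDict(extra="forbid", strict=True)
    query: str
    output: list[OutputPair]


def load_examples(path):
    """Load a JSONL file, raising on the first invalid row."""
    examples = []
    with open(path, encoding="utf-8") as fh:
        for lineno, line in enumerate(fh, start=1):
            if not line.strip():
                continue  # skip blank lines
            try:
                examples.append(TrainingExample.model_validate_json(line))
            except ValidationError as exc:
                # Fail loudly, pointing at the offending file and line.
                raise ValueError(
                    f"{path}:{lineno}: invalid training example\n{exc}"
                ) from exc
    return examples
```

With a schema like this, a row still in the old "searches" format fails on two counts (an unexpected key and a missing `output`), so it cannot pass through to training unnoticed.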
Tobi Lutke
739038e1a7
docs: add explicit HuggingFace repo destinations
- List all HuggingFace repos in CLAUDE.md (model, gguf, sft, grpo, train)
- Update jobs scripts to use tobil/qmd-query-expansion-train (no -v2)
- Clarify rules: no versioned repos, update in place

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-31 12:26:02 -05:00
Tobi Lutke
38073799c0
chore: clean up finetune folder and fix training workflow
- Remove versioned files (sft_v4.yaml, prepare_v4_dataset.py, train_v2/)
- Update configs to use local data/train/ directory
- Add glob pattern support to prepare_data.py and train.py
- Update .gitignore to properly ignore outputs/ and data/train*/
- Document data preparation step in CLAUDE.md

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-31 12:21:09 -05:00
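
Note on 38073799c0: the commit adds glob pattern support to prepare_data.py and train.py without showing how. One way this is commonly done, as a sketch using only the standard library; the helper name `expand_inputs` is hypothetical and not taken from the repository:

```python
import glob


def expand_inputs(patterns):
    """Expand glob patterns into a de-duplicated, ordered list of file paths."""
    paths = []
    seen = set()
    for pattern in patterns:
        matches = sorted(glob.glob(pattern))
        if not matches:
            # A pattern that matches nothing is more likely a typo than intent.
            raise FileNotFoundError(f"no files match {pattern!r}")
        for match in matches:
            if match not in seen:
                seen.add(match)
                paths.append(match)
    return paths
```

A call such as `expand_inputs(["data/*.jsonl"])` would then return every matching file in sorted order, and a mistyped pattern stops the run instead of silently training on no data.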
Tobi Lutke
533f0eed37
docs: add finetune CLAUDE.md and update training workflow
- Add finetune/CLAUDE.md documenting the training pipeline
- Update configs to output to local outputs/ directory (gitignored)
- Document that all data/*.jsonl files are training data
- Document local CUDA training vs HuggingFace Jobs cloud training
- Enforce eval requirement before any model upload
- Single model repo (no -v1, -v2, -v4 versioning)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-31 12:15:56 -05:00