qmd/finetune/dataset
Tobi Lutke 32706a720f
Refactor finetune folder: train/rl scripts with YAML configs

Major changes:
- train.py: Generic SFT training script using YAML config
- rl.py: Generic GRPO training script using YAML config
- configs/: YAML configs per training run (sft_v4.yaml, grpo_v4.yaml)
- dataset/: Data preparation scripts moved here
- tui.py: Interactive model testing interface

Training results:
- SFT v4: 98.8% avg score (all Excellent)
- GRPO v4: 0% (failed - model drifted to verbose explanations)

Removed per-model scripts (train_0.6B.py, train_1.7B.py, etc.)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-24 20:26:46 -05:00
clean_data.py Refactor finetune folder: train/rl scripts with YAML configs 2026-01-24 20:26:46 -05:00
generate_data_offline.py Refactor finetune folder: train/rl scripts with YAML configs 2026-01-24 20:26:46 -05:00
generate_data.py Refactor finetune folder: train/rl scripts with YAML configs 2026-01-24 20:26:46 -05:00
prepare_data.py Refactor finetune folder: train/rl scripts with YAML configs 2026-01-24 20:26:46 -05:00