qmd/finetune/Justfile
Shreyas Karnik 2df95ac9ba
feat: add ONNX conversion script for Transformers.js deployment
Add convert_onnx.py that mirrors convert_gguf.py's structure:
- Loads base Qwen3 model, merges SFT + GRPO adapters
- Exports to ONNX via Optimum (text-generation-with-past task)
- Supports Q4 (MatMulNBits), Q8, FP16, and FP32 output
- Uploads to separate HF repo (e.g. tobil/qmd-query-expansion-1.7B-ONNX)
- Writes Transformers.js compatibility config
- Includes model card with usage example

Usage:
    uv run convert_onnx.py --size 1.7B
    uv run convert_onnx.py --size 1.7B --quantize q4 --no-upload

Also adds `just convert-onnx` and `just convert-gguf` tasks.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 11:50:03 -07:00


set shell := ["bash", "-uc"]

validate:
    uv run dataset/validate_schema.py
    uv run dataset/score_data.py
    for f in data/*.jsonl; do \
        uv run dataset/analyze_data.py --input "$f" --show-examples 0; \
    done

score:
    uv run dataset/score_data.py

schema:
    uv run dataset/validate_schema.py

analyze:
    for f in data/*.jsonl; do \
        uv run dataset/analyze_data.py --input "$f" --show-examples 0; \
    done

prepare:
    QMD_BASE_MODEL=Qwen/Qwen3-1.7B uv run dataset/prepare_data.py --seed 42

convert-onnx size="1.7B":
    uv run convert_onnx.py --size {{size}}

convert-gguf size="1.7B":
    uv run convert_gguf.py --size {{size}}
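
# Usage sketch for the two conversion recipes above (assumes `just` is invoked
# from this directory; a positional argument overrides the `size` default):
#   just convert-onnx         # runs: uv run convert_onnx.py --size 1.7B
#   just convert-gguf 1.7B    # same as the default, passed explicitly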

train-local:
    just prepare
    HF_TOKEN=${HF_TOKEN} uv run torchrun --standalone --nproc_per_node auto \
        train.py sft --config configs/sft_local.yaml |& tee /tmp/qmd-sft-train.log

# Experimental GRPO training is in finetune/experiments/grpo and not part of
# the default pipeline.
#
# grpo-local:
#     HF_TOKEN=${HF_TOKEN} uv run train.py grpo --config experiments/grpo/grpo.yaml |& tee /tmp/qmd-grpo-train.log