Add convert_onnx.py that mirrors convert_gguf.py's structure:
- Loads base Qwen3 model, merges SFT + GRPO adapters
- Exports to ONNX via Optimum (text-generation-with-past task)
- Supports Q4 (MatMulNBits), Q8, FP16, and FP32 output
- Uploads to separate HF repo (e.g. tobil/qmd-query-expansion-1.7B-ONNX)
- Writes Transformers.js compatibility config
- Includes model card with usage example
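
The merge and export steps listed above could look roughly like the sketch below. It is not the contents of convert_onnx.py: it assumes the adapters are PEFT/LoRA adapters applied in order (SFT, then GRPO), and the function names and the adapter repo ids passed to them are hypothetical. Only the task name, the base model family, and the example repo name come from this commit.

```python
from pathlib import Path


def onnx_repo_id(size: str, owner: str = "tobil") -> str:
    """Build the target repo name, following the example in the commit message."""
    return f"{owner}/qmd-query-expansion-{size}-ONNX"


def merge_adapters(base_id: str, adapter_ids: list[str], out_dir: Path) -> Path:
    """Fold each adapter into the base weights, in order, and save the result."""
    # Heavy imports stay local so this module loads without them installed.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model = AutoModelForCausalLM.from_pretrained(base_id)
    for adapter_id in adapter_ids:  # assumed order: SFT adapter, then GRPO
        model = PeftModel.from_pretrained(model, adapter_id).merge_and_unload()
    model.save_pretrained(out_dir)
    AutoTokenizer.from_pretrained(base_id).save_pretrained(out_dir)
    return out_dir


def export_onnx(merged_dir: Path, onnx_dir: Path) -> Path:
    """Export the merged checkpoint to ONNX with KV-cache inputs and outputs."""
    from optimum.exporters.onnx import main_export

    main_export(
        model_name_or_path=str(merged_dir),
        output=onnx_dir,
        task="text-generation-with-past",
    )
    return onnx_dir
```

A call would chain the two, e.g. `export_onnx(merge_adapters("Qwen/Qwen3-1.7B", [sft_id, grpo_id], Path("merged")), Path("onnx"))`, where `sft_id` and `grpo_id` stand in for adapter repos not named here. Quantization and upload are left out of the sketch.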

Usage:
  uv run convert_onnx.py --size 1.7B
  uv run convert_onnx.py --size 1.7B --quantize q4 --no-upload

Also adds `just convert-onnx` and `just convert-gguf` tasks.
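
The flags in the usage lines suggest a command-line interface along these lines. This is a minimal sketch, not the script's actual parser: the `--size` default (borrowed from the just recipe default), the lowercase choice names other than `q4`, and the behaviour when `--quantize` is omitted are all assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="convert_onnx.py")
    # Assumption: default mirrors the `size="1.7B"` default of the just recipe.
    parser.add_argument("--size", default="1.7B", help="model size, e.g. 1.7B")
    parser.add_argument(
        "--quantize",
        choices=["q4", "q8", "fp16", "fp32"],  # only "q4" is confirmed spelling
        default=None,  # assumption: the commit does not state the default
        help="output precision",
    )
    parser.add_argument(
        "--no-upload",
        action="store_true",
        help="skip the upload to the Hugging Face repo",
    )
    return parser


# Parse the second usage line from the commit message.
args = build_parser().parse_args(
    ["--size", "1.7B", "--quantize", "q4", "--no-upload"]
)
print(args.size, args.quantize, args.no_upload)  # prints: 1.7B q4 True
```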
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
41 lines
1.1 KiB
Makefile
set shell := ["bash", "-uc"]

validate:
    uv run dataset/validate_schema.py
    uv run dataset/score_data.py
    for f in data/*.jsonl; do \
        uv run dataset/analyze_data.py --input "$f" --show-examples 0; \
    done

score:
    uv run dataset/score_data.py

schema:
    uv run dataset/validate_schema.py

analyze:
    for f in data/*.jsonl; do \
        uv run dataset/analyze_data.py --input "$f" --show-examples 0; \
    done

prepare:
    QMD_BASE_MODEL=Qwen/Qwen3-1.7B uv run dataset/prepare_data.py --seed 42

convert-onnx size="1.7B":
    uv run convert_onnx.py --size {{size}}

convert-gguf size="1.7B":
    uv run convert_gguf.py --size {{size}}

train-local:
    just prepare
    HF_TOKEN=${HF_TOKEN} uv run torchrun --standalone --nproc_per_node auto \
        train.py sft --config configs/sft_local.yaml |& tee /tmp/qmd-sft-train.log

# Experimental GRPO training is in finetune/experiments/grpo and not part of
# the default pipeline.
#
# grpo-local:
#   HF_TOKEN=${HF_TOKEN} uv run train.py grpo --config experiments/grpo/grpo.yaml |& tee /tmp/qmd-grpo-train.log