sunnypilot/release/ci/model_generator.py
James Vecellio-Grant 6d41ce2032 ci: Refactor model building workflows (#1096)
* Tinygrad bump from sync-20250627

* bump tinygrad_repo

* Reformat metadata generator to match driving_models.json

* bump tinygrad

* Revert "bump tinygrad"

This reverts commit f479dfd502.

* revert me after SP model compiled

* Model recompiled successfully, initiate "revert me after SP model compiled"

This reverts commit 95706eb688.

* The "FillMe" placeholder caused an extra 10 seconds of work

* bump to 22Jul2025

* Update build-all-tinygrad-models.yaml

* Update build-all-tinygrad-models.yaml

* Update build-all-tinygrad-models.yaml

* Update build-all-tinygrad-models.yaml

* Update build-all-tinygrad-models.yaml

* Update build-all-tinygrad-models.yaml

* Update build-all-tinygrad-models.yaml

* Allow more dynamic short names

This should hopefully be future-proof for now. It's robust enough to return the correct word-digit format (see the examples below of the short name it generates from a given display name):

'Last Horizon V2 (November 22, 2024)' -> LHV2
'Alabama (November 25, 2024)' -> ALABAMA
'PlayStation (December 03, 2024)' -> PLAYSTAT
'Postal Service (December 09, 2024)' -> PS
'Null Pointer (December 13, 2024)' -> NP
'North America (December 16, 2024)' -> NA
'National Public Radio (December 18, 2024)' -> NPR
'Filet o Fish (March 7, 2025)' -> FOF
'Tomb Raider 2 (April 18, 2025)' -> TR2
'Tomb Raider 3 (April 22, 2025)' -> TR3
'Tomb Raider 4 (April 25, 2025)' -> TR4
'Tomb Raider 5 (April 25, 2025)' -> TR5
'Tomb Raider 6 (April 30, 2025)' -> TR6
'Tomb Raider 7 (May 07, 2025)' -> TR7
'Down to Ride (Revision: May 10, 2025)' -> DTR
'SP Vikander Model (May 16, 2025)' -> SPVM
'VFF Driving (May 15, 2025)' -> VFFD
'Secret Good Openpilot (May 16, 2025)' -> SGO
'Vegetarian Filet o Fish (May 29, 2025)' -> VFOF
'Down To Ride (Revision: May 30, 2025)' -> DTR
'Vegetarian Filet o Fish v2 (June 05, 2025)' -> VFOFV2
'Kerrygold Driving (June 08, 2025)' -> KD
'Tomb Raider 10 (June 16, 2025)' -> TR10
'Organic Kerrygold (June 17, 2025)' -> OK
'Liquid Crystal Driving (June 21, 2025)' -> LCD
'Vegetarian Filet o Fish v3 (June 21, 2025)' -> VFOFV3
'Vibe Model [Custom Model]' -> VMCM
'Tomb Raider 13 (June 27, 2025)' -> TR13
'Aggressive TR (June 28, 2025)' -> ATR
'Tomb Raider 14 (June 30, 2025)' -> TR14
'Cookiemonster Tomb Raider (July 02, 2025)' -> CTR
'Down to Ride (Revision: July 07, 2025)' -> DTR
'Simple Plan Driving (July 07, 2025)' -> SPD
'Down to Ride (Revision: July 08, 2025)' -> DTR
'Tomb Raider 15 (July 09, 2025)' -> TR15
'Tomb Raider 15 rev-2 (July 11, 2025)' -> TR15R2
'Le Tomb Raider 14 (July 14, 2025)' -> LTR14
'Le Tomb Raider 14h (July 17, 2025)' -> LTR14H
'Tomb Raider 16 (July 18, 2025)' -> TR16
'Tomb Raider 16v2 (July 21, 2025)' -> TR16V2
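
The mapping above can be sketched compactly as follows. This is a minimal sketch rather than the shipped code: `short_name_sketch` is a hypothetical name, and the rules are condensed from `create_short_name` in this file (drop the parenthesized part, keep version-like tokens and 2-3 letter uppercase abbreviations whole, otherwise take the first letter, cap at 8 characters).

```python
import re


def short_name_sketch(display_name: str) -> str:
    """Hypothetical condensed form of the short-name rules listed above."""
    # Drop the parenthesized date/revision, then keep only alphanumerics per word.
    cleaned = re.sub(r"\([^)]*\)", "", display_name)
    words = [w for w in (re.sub(r"[^a-zA-Z0-9]", "", raw) for raw in cleaned.split()) if w]

    # A single word is simply truncated to 8 characters.
    if len(words) == 1:
        return words[0][:8].upper()
    # "Name + version" pairs such as "Word A1" are joined before truncating.
    if len(words) == 2 and re.fullmatch(r"[A-Za-z]\d+", words[1]):
        return (words[0] + words[1])[:8].upper()

    parts = []
    for w in words:
        if re.fullmatch(r"\d+[a-zA-Z]+|\d+[vVbB]\d+|[vVbB]\d+|\d{4}", w):
            parts.append(w.upper())  # version-like tokens: 14h, 16v2, V2, 2025
        elif re.fullmatch(r"[A-Z]{2,3}", w):
            parts.append(w)  # short uppercase abbreviations: SP, TR, VFF
        elif re.fullmatch(r"[a-zA-Z]+[0-9]+", w):
            parts.append(w[0].upper() + "".join(re.findall(r"\d+", w)))  # rev2 -> R2
        elif w.isdigit():
            parts.append(w)  # plain numbers are kept whole
        else:
            parts.append(w[0].upper())  # ordinary words contribute their initial
    return "".join(parts)[:8]


print(short_name_sketch("Tomb Raider 15 rev-2 (July 11, 2025)"))  # TR15R2
```

Note that different revisions of the same name ("Down to Ride") all map to DTR, so the short name on its own does not identify a model uniquely.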

* Update build-all-tinygrad-models.yaml

* Update build-all-tinygrad-models.yaml

* No need to sleep 3 seconds, just send it

* try dynamic

* cleanup

* Update build-single-tinygrad-model.yaml

* bc devtekve said. also, this is repetitive af

* Revert "bc devtekve said. also, this is repetitive af"

This reverts commit 3a0c1562de.

* maybe we could use a script instead that both build all

That both build all and sunnypilot-build-model reference

* refactor: consolidate model building steps into a single workflow

* tweak

* tweakx2

* tweakx3

* tweakx4

* dunno dunno...

* output dir

* lots of changes

* Revert "lots of changes"

This reverts commit 4aadb0ee29.

* fail if all fail

* no inputs needed

* make it easier for us

* note failure and exit 0

* Update build-all-tinygrad-models.yaml

* not needed unless we really want it

* Update build-single-tinygrad-model.yaml

* Merge branch 'sync-20250627-tinygrad' of github.com:sunnypilot/sunnypilot into sync-20250627-tinygrad

* retry for failed ?

* always run this step because sometimes one build fails

which causes the matrix to fail, but most builds still have uploaded artifacts.

* strip

* no escape

* Update build-all-tinygrad-models.yaml

* Test case from terminal run

(openpilot) james@Mac sunnypilot % jq -c '[.bundles[] | select(.runner=="tinygrad") | {ref, display_name: (.display_name | gsub(" \\([^)]*\\)"; "")), is_20hz}]' \
  /Users/james/Documents/GitHub/sunnypilot-docs/docs/driving_models_v6.json > matrix.json

mkdir -p output
touch "output/model-Tomb Raider 16v2 (July 21, 2025)-544"
touch "output/model-Space Lab Model (July 24, 2025)-547"
touch "output/model-Space Lab Model v1 (July 24, 2025)-548"

built=(); while IFS= read -r line; do built+=("$line"); done < <(
  ls output | sed -E 's/^model-//' | sed -E 's/-[0-9]+$//' | sed -E 's/ \([^)]*\)//' | awk '{gsub(/^ +| +$/, ""); print}'
)

jq -c --argjson built "$(printf '%s\n' "${built[@]}" | jq -R . | jq -s .)" \
  'map(select(.display_name as $n | ($built | index($n | gsub("^ +| +$"; "")) | not)))' \
  matrix.json > retry_matrix.json

cat retry_matrix.json

[]
(openpilot) james@Mac sunnypilot %
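
For reference, the same "which models still need a retry" check can be sketched in Python. This is only an illustration of the terminal run above, not what the workflow executes: `built_names` and `retry_matrix` are hypothetical helpers, the `ref` and `is_20hz` values are placeholders, and the artifact naming (`model-<display name> (<date>)-<id>`) is assumed from the `touch` commands.

```python
import re


def built_names(artifact_names):
    """Reduce artifact names to bare display names, mirroring the sed/awk pipeline."""
    names = set()
    for name in artifact_names:
        name = re.sub(r"^model-", "", name)              # strip the "model-" prefix
        name = re.sub(r"-[0-9]+$", "", name)             # strip the trailing "-<id>"
        name = re.sub(r" \([^)]*\)", "", name, count=1)  # strip the first " (...)" part
        names.add(name.strip(" "))
    return names


def retry_matrix(matrix, artifact_names):
    """Keep only the matrix entries whose display name has no built artifact."""
    built = built_names(artifact_names)
    return [entry for entry in matrix if entry["display_name"].strip(" ") not in built]


# Placeholder matrix entries; display names already have the date suffix removed.
matrix = [
    {"ref": "placeholder-ref-1", "display_name": "Tomb Raider 16v2", "is_20hz": True},
    {"ref": "placeholder-ref-2", "display_name": "Space Lab Model", "is_20hz": True},
    {"ref": "placeholder-ref-3", "display_name": "Space Lab Model v1", "is_20hz": True},
]
artifacts = [
    "model-Tomb Raider 16v2 (July 21, 2025)-544",
    "model-Space Lab Model (July 24, 2025)-547",
    "model-Space Lab Model v1 (July 24, 2025)-548",
]
print(retry_matrix(matrix, artifacts))  # prints [] because every model was built
```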

* always

* great success

* add suffix to retry artifact so it doesn't conflict

* retry to get_model too

* and there haha

* unnecessary hyphen

* compare built to missing. include retries

* adjust copy of artifacts.

* Update build-all-tinygrad-models.yaml

* Update model selector versioning and add documentation

* Update retry condition for failed models in build-all-tinygrad-models.yaml

* Update retry condition for failed models in build-all-tinygrad-models.yaml

* Update build-single-tinygrad-model.yaml

* false

* default none because why not

* red diff? i think?

* meh ... not needed i guess

* error error error

* Nayan is watching... always watching mike wazowski

* string all the way

* lots of retries just in case because im scared

* more robust

* ONLY ONE!!!!!!

* delete.... a lot

* fix artifacts

* fix artifacts

* make sure each is unique :)

* skip files like artifact duhhhh

* artifact name dir

* concurrency

* copy here

* Update build-single-tinygrad-model.yaml

* Update build-single-tinygrad-model.yaml

* bump

* bump tinygrad

* max parallel? if not, i have the other remedy ready in build-all

* revert me!

* I resynced tinygrad woo hoo

* setup shouldnt fail

* pull

* big ole diff

* condition

* Update build-all-tinygrad-models.yaml

* not always() never always() never!!!

* not failure instead of great success

* Update build-all-tinygrad-models.yaml

* yay that worked. lets invoke build-single one last time

* these arent used and are just taking up 250MB space

* really frog?

* bump back to 3

* self-hosted, tici

* rename to trigger tests

* 2 and done

---------

Co-authored-by: DevTekVE <devtekve@gmail.com>
2025-07-31 09:15:06 -07:00

import os
import sys
import hashlib
import json
import re
from pathlib import Path
from datetime import datetime, UTC


def create_short_name(full_name):
  # Remove parentheses and extract alphanumeric words
  clean_name = re.sub(r'\([^)]*\)', '', full_name)
  words = [re.sub(r'[^a-zA-Z0-9]', '', word) for word in clean_name.split() if re.sub(r'[^a-zA-Z0-9]', '', word)]

  if len(words) == 1:
    return words[0][:8].upper()

  # Handle special case: Name + Version (e.g., "Word A1" -> "WordA1")
  if len(words) == 2 and re.match(r'^[A-Za-z]\d+$', words[1]):
    return (words[0] + words[1])[:8].upper()

  result = ""
  for word in words:
    # Version or number patterns
    if (re.match(r'^\d+[a-zA-Z]+$', word) or
        re.match(r'^\d+[vVbB]\d+$', word) or
        re.match(r'^[vVbB]\d+$', word) or
        re.match(r'^\d{4}$', word)):
      result += word.upper()
    # All uppercase abbreviations (2-3 letters)
    elif re.match(r'^[A-Z]{2,3}$', word):
      result += word
    # Letters+digits (for example tr15 rev2)
    elif re.match(r'^[a-zA-Z]+[0-9]+$', word):
      result += word[0].upper() + ''.join(re.findall(r'\d+', word))
    elif word.isalpha():
      result += word[0].upper()
    elif word.isdigit():
      result += word
    else:
      result += word[0].upper()

  return result[:8]


def generate_metadata(model_path: Path, output_dir: Path, short_name: str):
  model_path = model_path
  output_path = output_dir
  base = model_path.stem

  # Define output files for tinygrad and metadata
  tinygrad_file = output_path / f"{base}_tinygrad.pkl"
  metadata_file = output_path / f"{base}_metadata.pkl"

  if not tinygrad_file.exists() or not metadata_file.exists():
    print(f"Error: Missing files for model {base} ({tinygrad_file} or {metadata_file})", file=sys.stderr)
    return

  # Calculate the sha256 hashes
  with open(tinygrad_file, 'rb') as f:
    tinygrad_hash = hashlib.sha256(f.read()).hexdigest()

  with open(metadata_file, 'rb') as f:
    metadata_hash = hashlib.sha256(f.read()).hexdigest()

  # Rename the files if a custom file name is provided
  if short_name:
    tinygrad_file = tinygrad_file.rename(output_path / f"{base}_{short_name.lower()}_tinygrad.pkl")
    metadata_file = metadata_file.rename(output_path / f"{base}_{short_name.lower()}_metadata.pkl")

  # Build the metadata structure
  model_metadata = {
    "type": base.split("_")[-1] if "dmonitoring" not in base else "dmonitoring",
    "artifact": {
      "file_name": tinygrad_file.name,
      "download_uri": {
        "url": "https://gitlab.com/sunnypilot/public/docs.sunnypilot.ai/-/raw/main/",
        "sha256": tinygrad_hash
      }
    },
    "metadata": {
      "file_name": metadata_file.name,
      "download_uri": {
        "url": "https://gitlab.com/sunnypilot/public/docs.sunnypilot.ai/-/raw/main/",
        "sha256": metadata_hash
      }
    }
  }

  # Return model metadata
  return model_metadata


def create_metadata_json(models: list, output_dir: Path, custom_name=None, short_name=None, is_20hz=False, upstream_branch="unknown"):
  metadata_json = {
    "short_name": short_name,
    "display_name": custom_name or upstream_branch,
    "is_20hz": is_20hz,
    "ref": upstream_branch,
    "environment": "development",
    "runner": "tinygrad",
    "index": -1,
    "minimum_selector_version": "-1",
    "generation": "-1",
    "build_time": datetime.now(UTC).strftime("%Y-%m-%dT%H:%M:%SZ"),
    "overrides": {},
    "models": models,
  }

  # Write metadata to output_dir
  with open(output_dir / "metadata.json", "w") as f:
    json.dump(metadata_json, f, indent=2)

  print(f"Generated metadata.json with {len(models)} models.")


if __name__ == "__main__":
  import argparse
  import glob

  parser = argparse.ArgumentParser(description="Generate metadata for model files")
  parser.add_argument("--model-dir", default="./models", help="Directory containing ONNX model files")
  parser.add_argument("--output-dir", default="./output", help="Output directory for metadata")
  parser.add_argument("--custom-name", help="Custom display name for the model")
  parser.add_argument("--is-20hz", action="store_true", help="Whether this is a 20Hz model")
  parser.add_argument("--upstream-branch", default="unknown", help="Upstream branch name")
  args = parser.parse_args()

  # Find all ONNX files in the given directory
  model_paths = glob.glob(os.path.join(args.model_dir, "*.onnx"))
  if not model_paths:
    print(f"No ONNX files found in {args.model_dir}", file=sys.stderr)
    sys.exit(1)

  _output_dir = Path(args.output_dir)
  _output_dir.mkdir(exist_ok=True, parents=True)

  # --custom-name is optional: derive the short name once, and only when a name was given,
  # so create_short_name() is never called with None
  _short_name = create_short_name(args.custom_name) if args.custom_name else None

  _models = []
  for _model_path in model_paths:
    _model_metadata = generate_metadata(Path(_model_path), _output_dir, _short_name)
    if _model_metadata:
      _models.append(_model_metadata)

  if _models:
    create_metadata_json(_models, _output_dir, args.custom_name, _short_name, args.is_20hz, args.upstream_branch)
  else:
    print("No models processed.", file=sys.stderr)