sunnypilot/.github/workflows/build-all-tinygrad-models.yaml
James Vecellio-Grant 6d41ce2032 ci: Refactor model building workflows (#1096)
* Tinygrad bump from sync-20250627

* bump tinygrad_repo

* Reformat metadata generator to match driving_models.json

* bump tinygrad

* Revert "bump tinygrad"

This reverts commit f479dfd502.

* revert me after SP model compiled

* Model recompiled successfully, initiate "revert me after SP model compiled"

This reverts commit 95706eb688.

* The "FillMe" placeholder caused an extra 10 seconds of work

* bump to 22Jul2025

* Update build-all-tinygrad-models.yaml

* Update build-all-tinygrad-models.yaml

* Update build-all-tinygrad-models.yaml

* Update build-all-tinygrad-models.yaml

* Update build-all-tinygrad-models.yaml

* Update build-all-tinygrad-models.yaml

* Update build-all-tinygrad-models.yaml

* Allow more dynamic short names

This should hopefully be future-proof for now. It is robust enough to return the correct word-digit format; the examples below show the short name generated from each display name:

'Last Horizon V2 (November 22, 2024)' -> LHV2
'Alabama (November 25, 2024)' -> ALABAMA
'PlayStation (December 03, 2024)' -> PLAYSTAT
'Postal Service (December 09, 2024)' -> PS
'Null Pointer (December 13, 2024)' -> NP
'North America (December 16, 2024)' -> NA
'National Public Radio (December 18, 2024)' -> NPR
'Filet o Fish (March 7, 2025)' -> FOF
'Tomb Raider 2 (April 18, 2025)' -> TR2
'Tomb Raider 3 (April 22, 2025)' -> TR3
'Tomb Raider 4 (April 25, 2025)' -> TR4
'Tomb Raider 5 (April 25, 2025)' -> TR5
'Tomb Raider 6 (April 30, 2025)' -> TR6
'Tomb Raider 7 (May 07, 2025)' -> TR7
'Down to Ride (Revision: May 10, 2025)' -> DTR
'SP Vikander Model (May 16, 2025)' -> SPVM
'VFF Driving (May 15, 2025)' -> VFFD
'Secret Good Openpilot (May 16, 2025)' -> SGO
'Vegetarian Filet o Fish (May 29, 2025)' -> VFOF
'Down To Ride (Revision: May 30, 2025)' -> DTR
'Vegetarian Filet o Fish v2 (June 05, 2025)' -> VFOFV2
'Kerrygold Driving (June 08, 2025)' -> KD
'Tomb Raider 10 (June 16, 2025)' -> TR10
'Organic Kerrygold (June 17, 2025)' -> OK
'Liquid Crystal Driving (June 21, 2025)' -> LCD
'Vegetarian Filet o Fish v3 (June 21, 2025)' -> VFOFV3
'Vibe Model [Custom Model]' -> VMCM
'Tomb Raider 13 (June 27, 2025)' -> TR13
'Aggressive TR (June 28, 2025)' -> ATR
'Tomb Raider 14 (June 30, 2025)' -> TR14
'Cookiemonster Tomb Raider (July 02, 2025)' -> CTR
'Down to Ride (Revision: July 07, 2025)' -> DTR
'Simple Plan Driving (July 07, 2025)' -> SPD
'Down to Ride (Revision: July 08, 2025)' -> DTR
'Tomb Raider 15 (July 09, 2025)' -> TR15
'Tomb Raider 15 rev-2 (July 11, 2025)' -> TR15R2
'Le Tomb Raider 14 (July 14, 2025)' -> LTR14
'Le Tomb Raider 14h (July 17, 2025)' -> LTR14H
'Tomb Raider 16 (July 18, 2025)' -> TR16
'Tomb Raider 16v2 (July 21, 2025)' -> TR16V2
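
The mapping above can be approximated with a small function. This is only a sketch inferred from the listed examples, not the generator used in the repository: the name `short_name`, the `max_len` parameter, and the 8-character cap (guessed from 'PlayStation' -> PLAYSTAT) are all assumptions.

```python
import re


def short_name(display_name: str, max_len: int = 8) -> str:
    """Hypothetical short-name rule inferred from the examples in this commit."""
    # Drop any "(...)" group, e.g. the release date or "Revision: ..." note.
    base = re.sub(r"\([^)]*\)", "", display_name)
    # Strip punctuation such as "[", "]" and "-" from each word.
    words = [re.sub(r"[^A-Za-z0-9]", "", w) for w in base.split()]
    words = [w for w in words if w]

    # A single word is kept whole, upper-cased and truncated (assumed cap).
    if len(words) == 1:
        return words[0].upper()[:max_len]

    parts = []
    for w in words:
        first_digit = re.search(r"\d", w)
        if w[0].isdigit() or (w.isupper() and first_digit is None):
            # Numbers ("15", "14h", "16v2") and acronyms ("SP", "TR") stay whole.
            parts.append(w.upper())
        elif first_digit:
            # Version-like words keep the initial plus the digits: "rev2" -> "R2".
            parts.append(w[0].upper() + w[first_digit.start():].upper())
        else:
            # Ordinary words contribute only their initial.
            parts.append(w[0].upper())
    return "".join(parts)[:max_len]


if __name__ == "__main__":
    for name in ("Tomb Raider 15 rev-2 (July 11, 2025)", "Vibe Model [Custom Model]"):
        print(name, "->", short_name(name))
```

Only round parentheses are removed before splitting, which is why the bracketed words in 'Vibe Model [Custom Model]' still contribute their initials.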

* Update build-all-tinygrad-models.yaml

* Update build-all-tinygrad-models.yaml

* No need to sleep 3 seconds, just send it

* try dynamic

* cleanup

* Update build-single-tinygrad-model.yaml

* bc devtekve said. also, this is repetitive af

* Revert "bc devtekve said. also, this is repetitive af"

This reverts commit 3a0c1562de.

* maybe we could use a script instead

One that both build-all and sunnypilot-build-model reference.

* refactor: consolidate model building steps into a single workflow

* tweak

* tweakx2

* tweakx3

* tweakx4

* dunno dunno...

* output dir

* lots of changes

* Revert "lots of changes"

This reverts commit 4aadb0ee29.

* fail if all fail

* no inputs needed

* make it easier for us

* note failure and exit 0

* Update build-all-tinygrad-models.yaml

* not needed unless we really want it

* Update build-single-tinygrad-model.yaml

* Merge branch 'sync-20250627-tinygrad' of github.com:sunnypilot/sunnypilot into sync-20250627-tinygrad

* retry for failed ?

* always run this step because sometimes one build fails

which causes the matrix to fail, but most builds still have uploaded artifacts.

* strip

* no escape

* Update build-all-tinygrad-models.yaml

* Test case from terminal run

(openpilot) james@Mac sunnypilot % jq -c '[.bundles[] | select(.runner=="tinygrad") | {ref, display_name: (.display_name | gsub(" \\([^)]*\\)"; "")), is_20hz}]' \
  /Users/james/Documents/GitHub/sunnypilot-docs/docs/driving_models_v6.json > matrix.json

mkdir -p output
touch "output/model-Tomb Raider 16v2 (July 21, 2025)-544"
touch "output/model-Space Lab Model (July 24, 2025)-547"
touch "output/model-Space Lab Model v1 (July 24, 2025)-548"

built=(); while IFS= read -r line; do built+=("$line"); done < <(
  ls output | sed -E 's/^model-//' | sed -E 's/-[0-9]+$//' | sed -E 's/ \([^)]*\)//' | awk '{gsub(/^ +| +$/, ""); print}'
)

jq -c --argjson built "$(printf '%s\n' "${built[@]}" | jq -R . | jq -s .)" \
  'map(select(.display_name as $n | ($built | index($n | gsub("^ +| +$"; "")) | not)))' \
  matrix.json > retry_matrix.json

cat retry_matrix.json

[]
(openpilot) james@Mac sunnypilot %
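
The same comparison can be written in Python, which may be easier to unit-test than the sed/jq pipeline. This is a sketch that mirrors the terminal run above and is not part of the workflow; the function names `built_names` and `retry_matrix` and the sample `ref` values are hypothetical.

```python
import re


def built_names(artifact_names):
    """Reduce artifact names like 'model-<name> (<date>)-<id>' to '<name>'."""
    names = set()
    for artifact in artifact_names:
        name = re.sub(r"^model-", "", artifact)           # drop the "model-" prefix
        name = re.sub(r"-[0-9]+$", "", name)              # drop the trailing "-<digits>"
        name = re.sub(r" \([^)]*\)", "", name, count=1)   # drop the first " (...)" group
        names.add(name.strip(" "))
    return names


def retry_matrix(matrix, artifact_names):
    """Return the matrix entries that have no matching built artifact."""
    built = built_names(artifact_names)
    return [m for m in matrix if m["display_name"].strip(" ") not in built]


if __name__ == "__main__":
    # Sample data only; the ref values are made up for illustration.
    matrix = [
        {"ref": "ref-a", "display_name": "Tomb Raider 16v2", "is_20hz": True},
        {"ref": "ref-b", "display_name": "Space Lab Model", "is_20hz": True},
    ]
    artifacts = ["model-Tomb Raider 16v2 (July 21, 2025)-544"]
    print(retry_matrix(matrix, artifacts))
```

An empty list means every model in the matrix has an artifact, which corresponds to the `[]` printed by `cat retry_matrix.json` in the run above.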

* always

* great success

* add suffix to retry artifact so it doesn't conflict

* retry to get_model too

* and there haha

* unnecessary hyphen

* compare built to missing. include retries

* adjust copy of artifacts.

* Update build-all-tinygrad-models.yaml

* Update model selector versioning and add documentation

* Update retry condition for failed models in build-all-tinygrad-models.yaml

* Update retry condition for failed models in build-all-tinygrad-models.yaml

* Update build-single-tinygrad-model.yaml

* false

* default none because why not

* red diff? i think?

* meh ... not needed i guess

* error error error

* Nayan is watching... always watching mike wazowski

* string all the way

* lots of retries just in case because im scared

* more robust

* ONLY ONE!!!!!!

* delete.... a lot

* fix artifacts

* fix artifacts

* make sure each is unique :)

* skip files like artifact duhhhh

* artifact name dir

* concurrency

* copy here

* Update build-single-tinygrad-model.yaml

* Update build-single-tinygrad-model.yaml

* bump

* bump tinygrad

* max parallel? if not, i have the other remedy ready in build-all

* revert me!

* I resynced tinygrad woo hoo

* setup shouldnt fail

* pull

* big ole diff

* condition

* Update build-all-tinygrad-models.yaml

* not always() never always() never!!!

* not failure instead of great success

* Update build-all-tinygrad-models.yaml

* yay that worked. lets invoke build-single one last time

* these arent used and are just taking up 250MB space

* really frog?

* bump back to 3

* self-hosted, tici

* rename to trigger tests

* 2 and done

---------

Co-authored-by: DevTekVE <devtekve@gmail.com>
2025-07-31 09:15:06 -07:00

286 lines
11 KiB
YAML

name: Build and push all tinygrad models

on:
  workflow_dispatch:
    inputs:
      set_min_version:
        description: 'Minimum selector version required for the models (see helpers.py or readme.md)'
        required: true
        type: string

jobs:
  setup:
    runs-on: ubuntu-latest
    outputs:
      json_version: ${{ steps.get-json.outputs.json_version }}
      recompiled_dir: ${{ steps.create-recompiled-dir.outputs.recompiled_dir }}
      json_file: ${{ steps.get-json.outputs.json_file }}
      model_matrix: ${{ steps.set-matrix.outputs.model_matrix }}
    steps:
      - name: Checkout docs repo (sunnypilot-docs, gh-pages)
        uses: actions/checkout@v4
        with:
          repository: sunnypilot/sunnypilot-docs
          ref: gh-pages
          path: docs
          ssh-key: ${{ secrets.CI_SUNNYPILOT_DOCS_PRIVATE_KEY }}

      - name: Get next JSON version to use (from GitHub docs repo)
        id: get-json
        run: |
          cd docs/docs
          latest=$(ls driving_models_v*.json | sed -E 's/.*_v([0-9]+)\.json/\1/' | sort -n | tail -1)
          next=$((latest+1))
          json_file="driving_models_v${next}.json"
          cp "driving_models_v${latest}.json" "$json_file"
          echo "json_file=docs/docs/$json_file" >> $GITHUB_OUTPUT
          echo "json_version=$((next+0))" >> $GITHUB_OUTPUT
          echo "SRC_JSON_FILE=docs/docs/driving_models_v${latest}.json" >> $GITHUB_ENV

      - name: Extract tinygrad models
        id: set-matrix
        working-directory: docs/docs
        run: |
          jq -c '[.bundles[] | select(.runner=="tinygrad") | {ref, display_name: (.display_name | gsub(" \\([^)]*\\)"; "")), is_20hz}]' "$(basename "${SRC_JSON_FILE}")" > matrix.json
          echo "model_matrix=$(cat matrix.json)" >> $GITHUB_OUTPUT

      - name: Set up SSH
        uses: webfactory/ssh-agent@v0.9.0
        with:
          ssh-private-key: ${{ secrets.GITLAB_SSH_PRIVATE_KEY }}

      - run: |
          mkdir -p ~/.ssh
          ssh-keyscan -H gitlab.com >> ~/.ssh/known_hosts

      - name: Clone GitLab docs repo and create new recompiled dir
        id: create-recompiled-dir
        env:
          GIT_SSH_COMMAND: 'ssh -o UserKnownHostsFile=~/.ssh/known_hosts'
        run: |
          git clone --depth 1 --filter=tree:0 --sparse git@gitlab.com:sunnypilot/public/docs.sunnypilot.ai2.git gitlab_docs
          cd gitlab_docs
          git checkout main
          git sparse-checkout set --no-cone models/
          cd models
          latest_dir=$(ls -d recompiled* 2>/dev/null | sed -E 's/recompiled([0-9]+)/\1/' | sort -n | tail -1)
          if [[ -z "$latest_dir" ]]; then
            next_dir=1
          else
            next_dir=$((latest_dir+1))
          fi
          recompiled_dir="${next_dir}"
          mkdir -p "recompiled${recompiled_dir}"
          touch "recompiled${recompiled_dir}/.gitkeep"
          cd ../..
          echo "recompiled_dir=$recompiled_dir" >> $GITHUB_OUTPUT

      - name: Push empty recompiled dir to GitLab
        run: |
          cd gitlab_docs
          git add models/recompiled${{ steps.create-recompiled-dir.outputs.recompiled_dir }}
          git config --global user.name "GitHub Action"
          git config --global user.email "action@github.com"
          git commit -m "Add recompiled${{ steps.create-recompiled-dir.outputs.recompiled_dir }} for build-all" || echo "No changes to commit"
          git push origin main

      - name: Push new JSON to GitHub docs repo
        run: |
          cd docs
          git pull origin gh-pages
          git add docs/"$(basename ${{ steps.get-json.outputs.json_file }})"
          git config --global user.name "GitHub Action"
          git config --global user.email "action@github.com"
          git commit -m "Add new ${{ steps.get-json.outputs.json_file }} for build-all" || echo "No changes to commit"
          git push origin gh-pages

  get_and_build:
    needs: [setup]
    strategy:
      matrix:
        model: ${{ fromJson(needs.setup.outputs.model_matrix) }}
      fail-fast: false
    uses: ./.github/workflows/build-single-tinygrad-model.yaml
    with:
      upstream_branch: ${{ matrix.model.ref }}
      custom_name: ${{ matrix.model.display_name }}
      recompiled_dir: ${{ needs.setup.outputs.recompiled_dir }}
      json_version: ${{ needs.setup.outputs.json_version }}
    secrets: inherit

  retry_failed_models:
    needs: [setup, get_and_build]
    runs-on: ubuntu-latest
    if: ${{ needs.setup.result != 'failure' && (failure() && !cancelled()) }}
    outputs:
      retry_matrix: ${{ steps.set-retry-matrix.outputs.retry_matrix }}
    steps:
      - uses: actions/download-artifact@v4
        with:
          pattern: model-*
          path: output

      - id: set-retry-matrix
        run: |
          echo '${{ needs.setup.outputs.model_matrix }}' > matrix.json
          built=(); while IFS= read -r line; do built+=("$line"); done < <(
            ls output | sed -E 's/^model-//' | sed -E 's/-[0-9]+$//' | sed -E 's/ \([^)]*\)//' | awk '{gsub(/^ +| +$/, ""); print}'
          )
          jq -c --argjson built "$(printf '%s\n' "${built[@]}" | jq -R . | jq -s .)" \
            'map(select(.display_name as $n | ($built | index($n | gsub("^ +| +$"; "")) | not)))' matrix.json > retry_matrix.json
          echo "retry_matrix=$(cat retry_matrix.json)" >> $GITHUB_OUTPUT

  retry_get_and_build:
    needs: [setup, get_and_build, retry_failed_models]
    if: ${{ needs.get_and_build.result == 'failure' || (needs.retry_failed_models.outputs.retry_matrix != '[]' && needs.retry_failed_models.outputs.retry_matrix != '') }}
    strategy:
      matrix:
        model: ${{ fromJson(needs.retry_failed_models.outputs.retry_matrix) }}
      fail-fast: false
    uses: ./.github/workflows/build-single-tinygrad-model.yaml
    with:
      upstream_branch: ${{ matrix.model.ref }}
      custom_name: ${{ matrix.model.display_name }}
      recompiled_dir: ${{ needs.setup.outputs.recompiled_dir }}
      json_version: ${{ needs.setup.outputs.json_version }}
      artifact_suffix: -retry
    secrets: inherit

  publish_models:
    name: Publish models sequentially
    needs: [setup, get_and_build, retry_failed_models, retry_get_and_build]
    if: ${{ !cancelled() && (needs.get_and_build.result != 'failure' || needs.retry_get_and_build.result == 'success' || (needs.retry_failed_models.outputs.retry_matrix != '[]' && needs.retry_failed_models.outputs.retry_matrix != '')) }}
    runs-on: ubuntu-latest
    strategy:
      max-parallel: 1
      matrix:
        model: ${{ fromJson(needs.setup.outputs.model_matrix) }}
    env:
      RECOMPILED_DIR: recompiled${{ needs.setup.outputs.recompiled_dir }}
      JSON_FILE: ${{ needs.setup.outputs.json_file }}
      ARTIFACT_NAME_INPUT: ${{ matrix.model.display_name }}
    steps:
      - name: Set up SSH
        uses: webfactory/ssh-agent@v0.9.0
        with:
          ssh-private-key: ${{ secrets.GITLAB_SSH_PRIVATE_KEY }}

      - name: Add GitLab.com SSH key to known_hosts
        run: |
          mkdir -p ~/.ssh
          ssh-keyscan -H gitlab.com >> ~/.ssh/known_hosts

      - name: Clone GitLab docs repo
        env:
          GIT_SSH_COMMAND: 'ssh -o UserKnownHostsFile=~/.ssh/known_hosts'
        run: |
          echo "Cloning GitLab"
          git clone --depth 1 --filter=tree:0 --sparse git@gitlab.com:sunnypilot/public/docs.sunnypilot.ai2.git gitlab_docs
          cd gitlab_docs
          echo "checkout models/${RECOMPILED_DIR}"
          git sparse-checkout set --no-cone models/${RECOMPILED_DIR}
          git checkout main
          cd ..

      - name: Checkout docs repo
        uses: actions/checkout@v4
        with:
          repository: sunnypilot/sunnypilot-docs
          ref: gh-pages
          path: docs
          ssh-key: ${{ secrets.CI_SUNNYPILOT_DOCS_PRIVATE_KEY }}

      - name: Validate recompiled dir and JSON version
        run: |
          if [ ! -d "gitlab_docs/models/$RECOMPILED_DIR" ]; then
            echo "Recompiled dir $RECOMPILED_DIR does not exist in GitLab repo"
            exit 1
          fi
          if [ ! -f "$JSON_FILE" ]; then
            echo "JSON file $JSON_FILE does not exist!"
            exit 1
          fi

      - name: Download artifact name file
        uses: actions/download-artifact@v4
        with:
          name: artifact-name-${{ env.ARTIFACT_NAME_INPUT }}
          path: artifact_name

      - name: Read artifact name
        id: read-artifact-name
        run: |
          ARTIFACT_NAME=$(cat artifact_name/artifact_name.txt)
          echo "artifact_name=$ARTIFACT_NAME" >> $GITHUB_OUTPUT

      - name: Download model artifact
        uses: actions/download-artifact@v4
        with:
          name: ${{ steps.read-artifact-name.outputs.artifact_name }}
          path: output

      - name: Remove onnx files bc not needed for recompiled dir since they already exist from single build
        run: |
          find output -type f -name '*.onnx' -delete
          find output -type f -name 'big_*.pkl' -delete
          find output -type f -name 'dmonitoring_model_tinygrad.pkl' -delete

      - name: Copy model artifacts to gitlab
        env:
          ARTIFACT_NAME: ${{ steps.read-artifact-name.outputs.artifact_name }}
        run: |
          ARTIFACT_DIR="gitlab_docs/models/${RECOMPILED_DIR}/${ARTIFACT_NAME}"
          mkdir -p "$ARTIFACT_DIR"
          for path in output/*; do
            if [ "$(basename "$path")" = "artifact_name.txt" ]; then
              continue
            fi
            name="$(basename "$path")"
            if [ -d "$path" ]; then
              mkdir -p "$ARTIFACT_DIR/$name"
              cp -r "$path"/* "$ARTIFACT_DIR/$name/"
              echo "Copied dir $name -> $ARTIFACT_DIR/$name"
            else
              cp "$path" "$ARTIFACT_DIR/"
              echo "Copied file $name -> $ARTIFACT_DIR/"
            fi
          done

      - name: Push recompiled dir to GitLab
        env:
          GITLAB_SSH_PRIVATE_KEY: ${{ secrets.GITLAB_SSH_PRIVATE_KEY }}
        run: |
          cd gitlab_docs
          git checkout main
          git pull origin main
          for d in models/"$RECOMPILED_DIR"/*/; do
            git sparse-checkout add "$d"
          done
          git add models/"$RECOMPILED_DIR"
          git config --global user.name "GitHub Action"
          git config --global user.email "action@github.com"
          git commit -m "Update $RECOMPILED_DIR with model from build-all-tinygrad-models" || echo "No changes to commit"
          git push origin main

      - run: |
          cd docs
          git pull origin gh-pages

      - name: update json
        env:
          # Pass the workflow input through the environment instead of splicing it into the script.
          SET_MIN_VERSION: ${{ inputs.set_min_version }}
        run: |
          # Collect optional flags in an array so each value is passed verbatim, without eval.
          ARGS=()
          if [ -n "$SET_MIN_VERSION" ]; then
            ARGS+=(--set-min-version "$SET_MIN_VERSION")
          fi
          ARGS+=(--sort-by-date)
          python3 docs/json_parser.py \
            --json-path "$JSON_FILE" \
            --recompiled-dir "gitlab_docs/models/$RECOMPILED_DIR" \
            "${ARGS[@]}"

      - name: Push updated json to GitHub
        run: |
          cd docs
          git config --global user.name "GitHub Action"
          git config --global user.email "action@github.com"
          git checkout gh-pages
          git add docs/"$(basename "$JSON_FILE")"
          git commit -m "Update $(basename "$JSON_FILE") after recompiling model" || echo "No changes to commit"
          git push origin gh-pages