Commit Graph

6060 Commits

Author SHA1 Message Date
samm393 19c11792fd
Flux.1 (#6334)
* initial commit

* whitespace

* get rid of torch import

* indentation

* less hardcoding

* add flux.1-dev

* jit

* no double

* t5 tidy up

* validation image

* reuse sdxl autoencoder

* typing changes

* empty lines

* remove unneeded comments

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2024-09-24 10:08:04 +08:00
chenyu 31b9c74c77
tiny import cleanup and fix typo (#6692) 2024-09-23 21:48:23 -04:00
qazal 02c0c09fb9
VIZ syntax highlighting and new colors (#6686)
* VIZ syntax highlighting

* more work
2024-09-24 09:41:07 +08:00
ignaciosica 0ffbd75af8
Refactor TC [run_process_replay] (#6456)
* unify _apply_tc_opt

* refactor tc pt2

* hotfix: remove blank line

* refactor upcast_axes

* simplify check before using tensor_cores

* rename upcast_axes

* fix amx and remove counting hack

* AMX cleanup

* hotfix: bug

* skip hand-coded TC opts if AMX to also skip if emulating

* hotfix: AMX bug

* hotfix: AMX tests

* minor format change

* hotfix: minor var name change

* hotfix: minor refactor

* hotfix: hand-coded tc bug

* hotfix: simple change

* fix comment

* hotfix: refactor attempt to local N

* hotfix: AMD TC spacing

* refactor tensor core options in kernel.py to include opt order

* hotfix: add comments to TensorCore dataclass

* hotfix: improve comment on TC dataclass

* hotfix: refactor opt_seq loop

* hotfix: add comments in hand-coded TC opts

* hotfix: upcast_axes comment

* hotfix: remove unroll from opt_seq

* hotfix: bug + remove unroll from opt_seq

* hotfix: rename opt_seq into opts_seq

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2024-09-24 09:05:29 +08:00
George Hotz b9e6d42a1f
Revert "gated native math in OpenCL (#6683)" (#6691)
This reverts commit 2fe3eeed17.
2024-09-24 08:48:10 +08:00
Harald Schäfer 382938ab41
Add command to show default backend in README (#6688)
* Update README.md

* Update README.md

* Update README.md
2024-09-24 08:42:18 +08:00
George Hotz 46fab1f185 hotfix: curved edges in viz 2024-09-23 19:45:35 +08:00
qazal ee050d31d7
viz more touchups (#6685)
* dont print if we're running VIZ

* 242424
2024-09-23 19:44:28 +08:00
George Hotz 2fe3eeed17
gated native math in OpenCL (#6683)
* gated native math

* Update cstyle.py
2024-09-23 19:22:13 +08:00
George Hotz 84072166db
move mul consts like add consts (#6684) 2024-09-23 19:21:53 +08:00
George Hotz de259e3f09 hotfix: add compile3 to comma CI 2024-09-23 18:25:49 +08:00
George Hotz 7c38121280
load penalty (#6681)
* bias/bn loads after loops

* load penalty in fix_priority

* more generic test
2024-09-23 18:12:12 +08:00
George Hotz 431ffc4254 hotfix: delete float16 failing 2024-09-23 17:42:57 +08:00
qazal aad7c9c883
viz adjustable metadata (#6679)
* move from grid to flexbox

* viz adjustable metadata

* w-size
2024-09-23 17:31:51 +08:00
George Hotz 2f2f933e50
fix buffer shape regression from onnx (#6678) 2024-09-23 16:58:42 +08:00
qazal b438e3cc19
viz bugfix click in middle of UOps (#6676) 2024-09-23 16:44:19 +08:00
chenyu f55459c98e
failed validhack test for a 0.9.7 conv (#6677) 2024-09-23 04:43:47 -04:00
nimlgen 94cbb1cd32
qcom image copyout (#6667)
* qcom copyout

* copyin

* linter

* fix

* linter

* mypy
2024-09-23 16:11:43 +08:00
George Hotz 417a19a292
uop priority inversion (#6670)
* make checks simpler [run_process_replay]

* reorder uops

* fix inversion [run_process_replay]

* no need to move SPECIALs

* Update uopgraph.py
2024-09-23 15:53:53 +08:00
qazal 49bf92afa2
schedule UOps.ASSIGN (#6661) 2024-09-23 15:44:12 +08:00
George Hotz 9f1f445a5f
reorder uops (#6672) 2024-09-23 15:21:59 +08:00
qazal e2d6e10ddf
hotfix: reset benchmarks cache for process replay (#6671) 2024-09-23 15:13:02 +08:00
chenyu 0362dbbbe8
relax idx simplification given valid (#6669)
apply to kernels in op 0.9.7.
if a valid has a complicated expr, we cannot drop valid but it's possible to simplify idx given valid
2024-09-23 03:04:57 -04:00
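
The idea in #6669 (keep the valid, but simplify the index under the assumption that the valid holds) can be illustrated with a small brute-force check. This is only a sketch: `valid`, `idx_original`, and `idx_simplified` are hypothetical stand-ins chosen for illustration, not expressions or functions from tinygrad.

```python
# Hypothetical example: none of these names come from tinygrad.

def valid(x: int, y: int) -> bool:
    # A simple bound on x combined with a term that is hard to reason about,
    # so the valid as a whole cannot simply be dropped.
    return x < 4 and (x * y) % 3 != 1

def idx_original(x: int) -> int:
    # Index expression before simplification.
    return (x // 4) * 8 + x % 4

def idx_simplified(x: int) -> int:
    # Only correct when 0 <= x < 4, which the valid guarantees.
    return x

def agree_when_valid(limit: int = 16) -> bool:
    # Compare both index forms on every point where the valid holds.
    return all(idx_original(x) == idx_simplified(x)
               for x in range(limit) for y in range(limit) if valid(x, y))

print(agree_when_valid())                   # True
print(idx_original(4), idx_simplified(4))   # 8 4
```

The two index forms agree wherever the valid holds and differ outside it (for example at `x = 4`), which is why the simplified index is only safe while the valid is kept.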
qazal 7ca9ffa494
misc UOp st cleanups (#6668) 2024-09-23 14:16:42 +08:00
chenyu 26ebb7cab4
don't use div_folding in lt_folding (#6666)
* don't use div_folding in lt_folding

valids 35 -> 13

* fails the same as before
2024-09-23 01:50:18 -04:00
qazal e9248b9e27
viz highlight new nodes (#6665)
* p2

* ret adds and dels

* maybe that way

* add additions

* simpler test_viz
2024-09-23 13:46:18 +08:00
chenyu da5b741656
removed valid in openpilot conv (#6619)
35 valids left
2024-09-23 00:30:18 -04:00
George Hotz 52c2c4df9c
fix match of sz 0 + dedup kernel ast [run_process_replay] (#6663)
* fix match of sz 0 [run_process_replay]

* empty graph rewrite to dedup st
2024-09-23 11:56:53 +08:00
chenyu 2d4d594994
validhack is_irreducible helper (#6664)
[run_process_replay]
2024-09-22 23:42:47 -04:00
chenyu 1923932339
canonicalize simplex lt (#6658)
(X := a0*x0 + a1*x1 + ...) > 0 is equivalent to x0 + x1 + ... > 0 if xi >= 0 and ai > 0 for ints
2024-09-22 23:04:47 -04:00
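
The equivalence stated in #6658 holds because, with every `xi >= 0` and every `ai > 0`, both sums are positive exactly when at least one `xi` is positive. A minimal brute-force sketch over small integers, using hypothetical helper names that are not part of tinygrad:

```python
from itertools import product

def weighted_positive(coeffs, xs):
    # a0*x0 + a1*x1 + ... > 0
    return sum(a * x for a, x in zip(coeffs, xs)) > 0

def plain_positive(xs):
    # x0 + x1 + ... > 0
    return sum(xs) > 0

def check_equivalence(max_coeff=3, max_value=3, n_vars=3):
    # Enumerate all positive coefficients and non-negative values in range.
    for coeffs in product(range(1, max_coeff + 1), repeat=n_vars):
        for xs in product(range(0, max_value + 1), repeat=n_vars):
            if weighted_positive(coeffs, xs) != plain_positive(xs):
                return False
    return True

print(check_equivalence())  # True
```

The positivity condition on the coefficients matters: with `coeffs = (1, -1)` and `xs = (1, 1)` the weighted sum is 0 while the plain sum is 2, so the two comparisons disagree.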
wozeparrot 46e360fdc0
check bfloat16 range with threefry (#6660) 2024-09-23 10:48:44 +08:00
qazal d24e4b1042
viz more kernel view work (#6659) 2024-09-23 10:48:35 +08:00
qazal 6be1bf09f1
hotfix: bring COMPARE_SCHEDULE=0 back (#6657) 2024-09-23 10:39:43 +08:00
George Hotz e945fa9c5c
put local on the PtrDtype [run_process_replay] (#6656)
* put local on the PtrDtype [run_process_replay]

* those are local too
2024-09-23 10:29:17 +08:00
chenyu 90c1ccc402
simpler drop valid check in simplify_valid_image_load (#6653)
* simpler drop valid check in simplify_valid_image_load

* update tests
2024-09-22 21:46:39 -04:00
qazal 99ed9fb75e
simpler verify_ast [run_process_replay] (#6654) 2024-09-23 09:40:09 +08:00
nimlgen 8a9195d86e
qcom texs refactor (#6613)
* qcom texs refactor

* fix

* linter

* qcombuf

* linter
2024-09-23 09:03:17 +08:00
qazal d1bae42d35
viz lowerer and graph_rewrite dedup try 2 (#6652) 2024-09-22 21:09:46 +08:00
qazal 6b65d8c461
more process replay tracing work [run_process_replay] (#6650) 2024-09-22 16:16:58 +08:00
George Hotz 4fc5a34fe7
lowerer is just a graph rewrite, not a class [run_process_replay] (#6648) 2024-09-22 14:15:33 +08:00
George Hotz 0eb710de84
move WMMA out of lowerer [run_process_replay] (#6647) 2024-09-22 14:05:51 +08:00
George Hotz 84703d5b77
replace the lowerer with a contextual PatternMatcher [run_process_replay] (#6646)
* replace the lowerer with a contextual PatternMatcher [run_process_replay]

* todo

* it's REDUCE by the time it's in lowerer
2024-09-22 13:22:26 +08:00
qazal 4751159139
second iteration on viz/serve.py (#6643)
* small detail in checkStatus

* better abstractions for the api

* update test_viz

* ui updates
2024-09-22 08:49:44 +08:00
qazal 5bafed2f88
process replay traceback (#6642) 2024-09-21 16:53:34 +08:00
chenyu 9456a625bc
const_like type fix (#6641)
`Tuple[ConstType, ...]` instead of `Tuple[ConstType]`
2024-09-21 03:44:08 -04:00
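
The distinction behind #6641 is that `Tuple[T]` describes a tuple of exactly one element, while `Tuple[T, ...]` describes a homogeneous tuple of any length. A short sketch using the standard `typing` module; the `ConstType` alias here is an assumption for illustration, and `describe` is a hypothetical helper rather than tinygrad's `const_like`:

```python
from typing import Tuple, Union, get_args

# Assumed alias for illustration only.
ConstType = Union[float, int, bool]

OneElement = Tuple[ConstType]        # exactly one element
AnyLength = Tuple[ConstType, ...]    # zero or more elements of the same type

def describe(value: Union[ConstType, AnyLength]) -> str:
    # Accepts either a scalar constant or a tuple of constants of any length.
    if isinstance(value, tuple):
        return f"vector of {len(value)}"
    return "scalar"

print(len(get_args(OneElement)))       # 1
print(get_args(AnyLength)[-1] is ...)  # True
print(describe((1, 2, 3)))             # vector of 3
print(describe(1.5))                   # scalar
```

A static type checker would reject `(1, 2, 3)` for a parameter annotated `Tuple[ConstType]`, which is the kind of mismatch the variadic form avoids.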
qazal 8edce82124
viz show server status (#6640) 2024-09-21 15:08:13 +08:00
qazal 982086f54c
UOps.VALID try 2 (#6623)
* make UOps.VALID compile

* fixable tests

* bufs dedup

* cleanup the CONST spec

* regenerate dataset with graph_rewrite

```py
def rewrite_const(const:UOp, st_src:UOp) -> UOp:
  st: ShapeTracker = st_src.arg
  return UOp(UOps.VALID, dtypes.bool, (st.to_uop(),)).where(UOp.const(const.dtype, const.arg), UOp.const(const.dtype, 0))
pm = PatternMatcher([(UPat(UOps.CONST, name="const", src=(UPat(UOps.SHAPETRACKER, name="st_src"),)), rewrite_const)])
```

* rm arg

* remove arg

* revert arg removal

This reverts commit 2c35c75c950075d38c9fb8572f14640fe8235f74.

* red test_pickle_define_var
2024-09-21 14:19:25 +08:00
qazal dd05e27622
remove UOp from DEFINE_VAR arg [run_process_replay] (#6639)
* remove UOp from DEFINE_VAR arg [run_process_replay]

* that assert is in `spec`

* more .args to remove
2024-09-21 14:07:56 +08:00
qazal d2351af019
fixup non-void SINKs in tests [run_process_replay] (#6624) 2024-09-21 13:29:18 +08:00
qazal 391d14438e
DEFINE_VAR prereqs for VALID [run_process_replay] (#6637) 2024-09-21 13:28:39 +08:00