tinygrad

CaltropHungerton 38fb1e14a2	Intel XMX Tensor Core Support (#5622)	2024-08-16 09:19:21 -07:00

* fixed xmx demo
* i think i'm invoking the DPAS but it's slow
* compiler build arg to stop register spilling, indicated where to fix flop counter
* don't mind this
* do NOT mind me
* do not mind me
* do not view
* i will add bf16 later
* in process of figuring out tc fields
* we figured out the fields!!!
* added check for cl device vendor, added seperate IntelRenderer
* remove tc thread_local_aliases
* cleaning debris before draft pr
* edits for linter
* deduping and checking device extensions
* i will find more line reductions in other places
* before merge upstream
* double grf size in compiler to fix register spilling (bandaid), device checking changes
* tc python emulation
* fixed emulation
* tests for emulated intel tensor core
* TC=0, 1 working on upstream, fixed perf
* test
* debris
* check for specialized cl device when we canonicalize device
* bf16 support, tc=3 test added
* address tests
* revert half2 loads on intel tc, cleanup
* linter
* fold_expanded revert
* lint, whitespace fix
* cuda bf16 (only one with bf16) is skipped in test tensor cores, so i will skip for intel bf16 too
* make line shorter, no need for noqa E501
* removed device intel
* fix python emulation

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
benchmark.yml	load balance NV benchmark ci (#6107)	2024-08-16 10:08:08 -04:00
docs.yml	add strict mkdocs check (#5497)	2024-07-15 14:21:37 -07:00
python-publish.yml	update gh actions (#3033)	2024-01-09 17:52:22 -08:00
szdiff.yml	update gh actions (#3033)	2024-01-09 17:52:22 -08:00
test.yml	Intel XMX Tensor Core Support (#5622)	2024-08-16 09:19:21 -07:00