tinygrad/extra
Francis Lam bbb0ad4800
wmma: widen TC usage in search by using PADTO on TC axes when possible (#4216)
* wmma: widen TC usage in search by using PADTO on TC axes when possible

* test: start tests for the new padding TC behavior

* search: upgrade padded TC search to TC_OPT >= 2

* test: add behavior and correctness test for padded TC

added optional argument to apply_tensor_core to set TC_OPT level

* linearizer: add tests for the PADTO behavior and docs
2024-04-22 16:50:31 -04:00
accel move things, clean up extra (#2292) 2023-11-13 20:18:40 -08:00
assembly move dtypes to dtype.py (#2964) 2024-01-01 14:58:48 -08:00
backends JitItem -> ExecItem (#4146) 2024-04-11 08:24:57 -07:00
datasets scipy.signal.gaussian -> scipy.signal.windows.gaussian (#4205) 2024-04-17 19:15:37 -04:00
gemm wmma: widen TC usage in search by using PADTO on TC axes when possible (#4216) 2024-04-22 16:50:31 -04:00
hip_gpu_driver kfd free buffers (#4027) 2024-04-01 15:50:58 -07:00
hiprtc use comgr to compile (#3248) 2024-01-26 18:27:49 -08:00
junk coder.py can write and run code (#2439) 2023-11-25 12:27:54 -08:00
models monkey patching (#4214) 2024-04-18 19:20:52 -04:00
nv_gpu_driver nv driver (#4044) 2024-04-22 19:50:20 +04:00
optimization search: add a BEAM_COMPARE env to optionally not compare to hc/tc (#4107) 2024-04-08 18:54:01 -04:00
qcom_gpu_driver start Qualcomm GPU driver (#2804) 2023-12-16 23:10:50 -08:00
archprobe.py move dtypes to dtype.py (#2964) 2024-01-01 14:58:48 -08:00
augment.py [ready] Replacing os with pathlib (#1708) 2023-08-30 10:41:08 -07:00
autopad.py split to schedule.py (#3949) 2024-03-26 21:02:46 -07:00
disk_read_speed.py fast hip read (#3014) 2024-01-05 10:33:13 -08:00
dump_cache.py wow how did i think that was okay (#2339) 2023-11-16 21:21:11 -08:00
export_model.py create engine folder and move code (#3948) 2024-03-26 20:38:03 -07:00
gradcheck.py Fix: Jacobian tests [WIP] (#1126) 2023-07-05 15:36:22 -07:00
hip_events.py move autogen to runtime/autogen (#3254) 2024-01-26 12:44:19 -08:00
introspection.py move GlobalCounter to helpers (#4002) 2024-03-30 00:30:30 -04:00
lr_scheduler.py add lars to nn (#3750) 2024-03-24 11:43:12 -04:00
multitensor.py multitensor start (#2676) 2023-12-07 17:07:05 -08:00
onnx.py update onnx to 1.16.0 (#4127) 2024-04-10 11:19:13 -04:00
onnx_ops.py Replicate llm.c in tinygrad (#4179) 2024-04-16 15:40:48 +04:00
ring_copy.py ring copy example (#3185) 2024-01-19 23:34:30 -05:00
thneed.py new style device (#2530) 2023-11-30 17:07:16 -08:00
to_movement_ops.py fixup to_movement_ops and add back to CI (#3881) 2024-03-22 18:14:49 -04:00
training.py create engine folder and move code (#3948) 2024-03-26 20:38:03 -07:00
transfer_speed.py hotfix: copy size is in bytes 2024-01-17 16:44:15 +00:00