tinygrad/extra
qazal 12996d3a7d
green linearizer asserts for ops (#2800)
* these asserts should pass

* fix that assert

* ALU dtypes

* acc dtype for group_for_reduce

* cast image ALUs to the base dtype

* remove all casts from linearizer

* fix argmax

* fix multinomial

* fix __getitem__

* Revert "fix __getitem__"

This reverts commit 62ad719bfa5a2e1fcbfa931360f54897f8977602.

* fix MemBuffer outputs being wrong when there is an arange + ALU with a different dtype

eg. fancy slicing (int, float), bert embeddings (int, long)

this should be fixed in lazy instead of having to break the kernel

* cleanup argmax fix

* fix matmul in ints

cast in the end

* fix llama

* skip wrong hardcoded asts in the worlds dataset

* fix llama p2

* cleanup missing parts of the diff

---------

Co-authored-by: George Hotz <geohot@gmail.com>
2023-12-25 10:41:54 -05:00
accel               move things, clean up extra (#2292)  2023-11-13 20:18:40 -08:00
assembly            remove duplicated dtype in DEFINE_GLOBAL args (#2768)  2023-12-14 15:42:36 -05:00
datasets            bump version to 0.8.0, clean CI, remove requests (#2545)  2023-12-01 10:42:50 -08:00
dist                hip & cuda to gpuctypes (#2539)  2023-12-01 09:25:27 -08:00
gemm                get_lazyops() -> lazyops (#2884)  2023-12-20 18:04:49 -08:00
junk                coder.py can write and run code (#2439)  2023-11-25 12:27:54 -08:00
models              hotfix fix coder. RMSNorm cannot have float16 input (#2932)  2023-12-25 02:28:11 -05:00
optimization        green linearizer asserts for ops (#2800)  2023-12-25 10:41:54 -05:00
qcom_gpu_driver     start Qualcomm GPU driver (#2804)  2023-12-16 23:10:50 -08:00
triton              remove duplicated dtype in DEFINE_GLOBAL args (#2768)  2023-12-14 15:42:36 -05:00
archprobe.py        no werror in archprobe  2023-05-03 19:34:17 +00:00
augment.py          [ready] Replacing os with pathlib (#1708)  2023-08-30 10:41:08 -07:00
autopad.py          autopad shapetracker for BEAM (#2375)  2023-11-22 21:05:25 -05:00
dump_cache.py       wow how did i think that was okay (#2339)  2023-11-16 21:21:11 -08:00
export_model.py     s/lazydata.realized/lazydata.base.realized/g (#2914)  2023-12-22 14:45:13 -05:00
gradcheck.py        Fix: Jacobian tests [WIP] (#1126)  2023-07-05 15:36:22 -07:00
introspection.py    new lazy, benchmark (#2878)  2023-12-20 14:33:21 -08:00
lr_scheduler.py     ResNet training changes (update benchmark) (#2390)  2023-11-22 17:41:12 -08:00
multitensor.py      multitensor start (#2676)  2023-12-07 17:07:05 -08:00
onnx.py             onnx_ops formatting cleanup (#2904)  2023-12-21 20:06:06 -05:00
onnx_ops.py         add CMPEQ (#2931)  2023-12-25 00:15:55 -05:00
thneed.py           new style device (#2530)  2023-11-30 17:07:16 -08:00
to_movement_ops.py  slightly better extra/to_movement_ops dedups (#2695)  2023-12-10 11:05:44 -05:00
training.py         cleanups before interpreted jit (#2306)  2023-11-14 21:44:25 -08:00