mirror of https://github.com/commaai/tinygrad.git
aab9ee0fca
* Add support for one case of `UOps.CAST` for RDNA3 assembler * Adds support for casting from `bool` -> `float32`. Seems like a very common operation that is required in many places. * Fix bool register definition for vector operations * Use `vcc_lo` instead of `vcc` which seems to be required since it's configured to use wavefront_size=32 * Add vector support for some places that were scalar only in register definition and comparison ops * Fix some issues in what seems to be defunct `external_test_image.py` * Some tests still don't pass for other reasons, but it at least runs now and one broken test is now fixed * Refactor RDNA3 assembler register definition * Unify multi-register code between dtypes and combine with single-register allocation since they're all untyped registers at the end of the day |
||
---|---|---|
external_copy_benchmark.py | ||
external_hlb_cifar.py | ||
external_metal_uaf.py | ||
external_multi_gpu.py | ||
external_osx_profiling.py | ||
external_test_gpu_ast.py | ||
external_test_image.py | ||
external_test_llvm.py | ||
external_test_onnx_backend.py | ||
external_test_opt.py | ||
external_test_optim.py | ||
external_test_yolo.py | ||
external_test_yolov8.py | ||
fuzz_shapetracker.py | ||
fuzz_symbolic.py | ||
graph_batchnorm.py |