* mcts search
* mcts cleanups
* mcts cleanup
* random shuffle children order
* mcts in handcode_opt
* src and remove_node
* debug 3 to print ast
* print the type
* mcts in extra
* test/external/fuzz_linearizer: fix for new AST changes
also add beautiful_mnist failures
* add CLANG and LLVM to test_failure_35 failed_platforms
* fix test_linearizer_failure names
* added model impl
* minor cleanups
* extracted weights loading into from_pretrained
* reorganized model for better weight loading
* removed lru cache for state dict loading
* start writing openelm
* progress...hit bug
* repeat_interleave support
* gqa
* add rotary embedding
* spp
* i think it runs correctly
* broken
* output is good now
* cleanups
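
For context on the `repeat_interleave` and `gqa` entries above, here is a minimal sketch of how the two relate. In grouped-query attention, several query heads share one key/value head, so the KV heads are repeated until they line up with the query heads. It uses plain Python lists, not tensors, and is not the code from these commits; `expand_kv_heads` is a hypothetical helper named here for illustration only.

```python
from typing import List, TypeVar

T = TypeVar("T")

def repeat_interleave(xs: List[T], repeats: int) -> List[T]:
    """Repeat each element `repeats` times, keeping the copies adjacent."""
    return [x for x in xs for _ in range(repeats)]

def expand_kv_heads(kv_heads: List[T], n_query_heads: int) -> List[T]:
    """Hypothetical helper: give every query head its shared KV head."""
    if not kv_heads or n_query_heads % len(kv_heads) != 0:
        raise ValueError("query heads must be a positive multiple of KV heads")
    return repeat_interleave(kv_heads, n_query_heads // len(kv_heads))

if __name__ == "__main__":
    # Two KV heads serving four query heads: each KV head is used twice.
    print(expand_kv_heads(["kv0", "kv1"], 4))
```

Note that the copies are adjacent (`a a b b`), not tiled (`a b a b`), which keeps each group of query heads next to the KV head it shares.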
* no io_uring on android
* exp uring
* fixes and old version
* nv
* cleaner
* cmp vs aio
* fix
* no lib
* fix nv
* linter
* disk_speed_test now runs default
* fixes
* uring -> io_uring
* linter happy
* get_temp_buf comment added
* tiny nits
* put wait back
* test runs everywhere
* remove consts
* remove mmap consts
* do not require io_uring to run tests, they are generic
replaced all dtype.np with _to_np_dtype defined in tensor.py.
after this, the only numpy usages are (1) Tensor(np.ndarray), (2) construct .numpy() output, (3) numpy random buffer
* Create UnaryOps.RECIP and BinaryOps.IDIV and changing uses of BinaryOps.DIV
* Delete unused import
* Add cstyle renderer
* Fix formatting text
* Fix test error due to bad implementation of renderer
* Add PTX support
* Add RECIP to LLVMIR
* Remove BinaryOps.DIV from symbolic test
* Change some test and fix C floor division
* Change references to DIV for the RECIP or IDIV
* Add mimic idiv for symbolic test
* Restore floor
* Mimic idiv
* cast to int
* Fix some test and renderer
* Remove DIV for render nodes
* Resolve issue with div
* Add TestRenderer
* Fix test
* fix error
* Fix PAD test
* Fix div implementation
* Remove DIV
* Add upcast to rshift, due to use of MUL and RECIP on DIV
* Fix linter
* Completely remove BinaryOps.DIV
* Fix lint
* Fix some test
* Revert mul modification
* Fix tests
* Fix CLANG for uops
* Revert IDIV function
* Minor fix
* modify pattern matching rule to support nan
* Fix UNSAFE_PADS_OPS to add UnaryOps.RECIP
* Remove const folding for IDIV and fix PTX
* Completely remove IDIV from extra
* Remove test_div from TestFloatUOps due to test on recip
* Fix linearizer
* fix
* Fix test_22
* Fix llvm
* Apply trunc function for llvmlite
* use floor instead of trunc
* Use correct type
* Generate new fuzz db
* Fix rshift, do not cast to float to support idiv
* Return upcast=false to rshift
* Add to unsafepad BinaryOps.IDIV
* Remove RECIP override for CUDA
* add atol / rtol for the test
* Remove cast to int on IDIV
* Regenerate sops
* delete sops.gz
* regenerate
* regenerate
* regenerate
* Reduce margins
* pass atol and rtol as parameters for _test_metrics
* regenerated dataset
* Regenerate
* Remove duplicated
* Revert changes on extra
* Remove changes extra and NOQA for test
* Remove E501
* Remove and change line
* Remove E501
* Fix atan2
* Revert import and E501
* Remove E501
* Add hrcp to half ops
* Remove 1 of hrcp
* Remove last DIV and add type check on uops for IDIV
* Fix new tests
* Fix tests and custom function
* Regenerate dataset
* Regenerate dataset
* Revert dataset
* Change generate dataset script
* Remove line
* Change IDIV: type checker validates that x, y and z are int
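
The list above splits a single division op into a float reciprocal (`RECIP`) and an integer division (`IDIV`). The sketch below illustrates that split in plain Python. It is not the renderer or uop code from these commits: `recip`, `fdiv` and `idiv_trunc` are hypothetical names, and truncation toward zero is assumed here because that is what C's integer `/` does, whereas Python's `//` floors.

```python
import math

def recip(x: float) -> float:
    """Reciprocal 1/x, returning a signed infinity for a zero input."""
    if x == 0.0:
        return math.copysign(math.inf, x)
    return 1.0 / x

def fdiv(x: float, y: float) -> float:
    """Float division written as a multiply by a reciprocal."""
    return x * recip(y)

def idiv_trunc(x: int, y: int) -> int:
    """Integer division truncating toward zero (C semantics, assumed here)."""
    if y == 0:
        raise ZeroDivisionError("integer division by zero")
    q = abs(x) // abs(y)
    return q if (x >= 0) == (y >= 0) else -q

if __name__ == "__main__":
    print(fdiv(1.0, 4.0))               # multiply-by-reciprocal form
    print(fdiv(0.0, 0.0))               # 0 * inf propagates as nan
    print(idiv_trunc(-7, 2), -7 // 2)   # truncation vs Python floor
```

Because `x * (1/y)` rounds twice, it can differ from `x / y` in the last bit, so exact float comparisons against a true division are not reliable. Truncating and flooring integer division disagree only when the operands have opposite signs and the division is inexact.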
---------
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>