the correct condition is that PADTO cannot be applied to a reduce axis, rather than checking for Reduce.MAX in ops.
even for Reduce.SUM, the reduce axis may have had a div before it, so the padded 0 becomes inf and the sum over it is incorrect.
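
a minimal sketch of the failure mode above, in plain Python rather than tinygrad. the `recip` helper and the sample values are assumptions for illustration; `recip` emulates IEEE float division (1/0 gives inf), since Python's own `/` raises instead.

```python
import math

def recip(x: float) -> float:
    # Emulate IEEE-754 float division: 1/0 is inf (Python's `/` would raise).
    return math.inf if x == 0 else 1.0 / x

values = [1.0, 2.0, 4.0]   # the real elements along the reduce axis
padded = values + [0.0]    # the same axis padded with a zero

# The div happens before the reduce, so the padded zero is no longer neutral.
unpadded_sum = sum(recip(v) for v in values)
padded_sum = sum(recip(v) for v in padded)

print(unpadded_sum)  # 1.75
print(padded_sum)    # inf
```

the padded zero is only harmless for a sum if nothing earlier on that axis maps 0 to a non-zero value.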
* return bool
* add tests to the type spec
* fix multinomial
* fix tril
* fix round
* fix NegativeLogLikelihoodLoss
* rm debug
* webgpu
* more webgpu
* bitwise or for adding two bools
* onnx ops dont need to cast anymore
* Revert "bitwise or for adding two bools"
This reverts commit b413babffa4d93c5cc94a252cb7086b9a899a437.
* workaround for metal neg
* just the tests in the type spec
* test dtypes of return values of cumsum, argmax/min, multinomial
cumsum behaves like sum, and functions that return an index return dtypes.default_int
* because webgpu is different
* ww/Fixed Tensor.randint() to accept shape tuples ()
* ww/Wrote a test to cover this typo
* ww/Updated Tensor random objects to optionally take (,) or *() to be more consistent
* ww/no lint no worries
* ww/Made peace with linter
* ww/Added new line can't reduce line size without reducing readability
* ww/reverted to using .mul
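
a minimal sketch of accepting a shape either as a single tuple or as unpacked arguments, as the commits above describe. `normalize_shape` is a hypothetical helper for illustration, not the function used in the actual change.

```python
def normalize_shape(*shape):
    # Accept both f(2, 3) and f((2, 3)) / f([2, 3]); always return a tuple.
    if len(shape) == 1 and isinstance(shape[0], (tuple, list)):
        return tuple(shape[0])
    return tuple(shape)

print(normalize_shape(2, 3))    # (2, 3)
print(normalize_shape((2, 3)))  # (2, 3)
print(normalize_shape(()))      # ()
```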
* space removal in formula and a single test to cover it
* space in torch einsum as well
* replacing spaces in a var formula to support truncating all the spaces
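
a minimal sketch of stripping spaces from an einsum formula before parsing it, as described above. `split_formula` is a hypothetical helper, and the sketch assumes an explicit `->` output in the formula.

```python
def split_formula(formula: str):
    # Drop every space first so "i j, j k -> i k" parses like "ij,jk->ik".
    formula = formula.replace(" ", "")
    inputs, _, output = formula.partition("->")
    return inputs.split(","), output

print(split_formula("i j, j k -> i k"))  # (['ij', 'jk'], 'ik')
print(split_formula("ij,jk->ik"))        # (['ij', 'jk'], 'ik')
```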
* better support for platform dependent flags
* osx test support
* removed unused import and made line length <150
* changed osx ci shm
* lstrip in case SharedMemory._name is passed
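
a minimal sketch of the lstrip above. on POSIX, CPython stores `SharedMemory._name` with a leading "/", while the public `name` property omits it, so stripping makes either form usable. `shm_name` is a hypothetical helper and the example names are made up.

```python
def shm_name(name: str) -> str:
    # Normalize to the public form: no leading "/" regardless of which
    # attribute the caller passed in.
    return name.lstrip("/")

print(shm_name("/psm_example"))  # psm_example
print(shm_name("psm_example"))   # psm_example
```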
* lazy rewrite, try 2
* min fix tests
* pass contig test
* put broken pads back
* move that to realize
* no contig child fixes array packing
* so wrong
* now that's correct
* base children
* fix bind issues
* disable to_image_idx
* fix tests
* that failure shouldn't break other tests
* more fixes
* fix torch
* skip failing tests in CI
* 1e-7
* half is broken
* 1e-6 margin of error
* remove the all_int(shape) check in Tensor._loadop
we can support jittable symbolic shape random with custom rand now, and we can formalize it in the test after threefry is ready
* MOCKHIP false positive
* move everything to code_for_op to reason about it
* loop the loopable parts
* its not that unreadable
* these are loopable too
* nitpick
* tests p1 - replace these with the actual compiler running alu ops tests
* tests p2: compile test_dtype_alu in HIP!
+add to CI
* nobody liked test_renderer
* revert test_dtypes change
* isolated mockhip tests
* dont need the WHERE hack after #2782
+ruff
* bf16 is broken in HIP
job failed in: https://github.com/tinygrad/tinygrad/actions/runs/7232101987/job/19705951290?pr=2778#step:8:73
* picking this back up
* add compile tests for unary ops and binary ops
* MOD is only in ints
* CMPLT wont work after the dtypes pr is merged because it will always be bool
* test all combinations
* Update cstyle.py
* don't use vload
* no getenv
* set seed
---------
Co-authored-by: qazal <qazal.software@gmail.com>
Co-authored-by: qazal <77887910+Qazalin@users.noreply.github.com>
* invert (broken)
* decent invert
* shapetracker invert works
* plus is meh, invert is good
* support invert mask
* a few more invert tests
* shapetracker math invert test