Commit Graph

1208 Commits

Author SHA1 Message Date
George Hotz c81ce9643d
move globalcounters to ops (#2960)
* move globalcounters to ops

* missed a few

* sick of that failing
2024-01-01 14:21:02 -08:00
chenyu 8291986959
Variable.sum -> Node.sum, Variable.ands -> Node.ands (#2961) 2024-01-01 16:21:28 -05:00
chenyu 3d720b5761
move expand_idx, iter_idxs and expand_node from symbolic to linearizer (#2959) 2024-01-01 14:41:21 -05:00
George Hotz 56f44bd10e
move the compiler cache to be global (#2957)
* move the compiler cache to be global

* remove non robust test

* remove dead code
2024-01-01 10:59:56 -08:00
George Hotz 063f465604
simpler webgpu (#2956)
* simpler webgpu

* skip that test
2024-01-01 10:28:59 -08:00
chenyu 50f2e31d26
cleanup float4 grouping in global_load and global_store (#2942)
* cleanup float4 grouping in global_load and global_store

* fix test decorator
2023-12-27 14:10:04 -05:00
chenyu 54629b56d2
minor cleanup in kernel and linearizer (#2937)
* minor cleanup in kernel and linearizer

fewer long lines, consistent spacing, and colocated variables

* no deadline in hypothesis test
2023-12-26 12:05:32 -05:00
chenyu 820f2e054e
fix PADTO optimization (#2935)
the correct condition is that PADTO cannot be applied to the reduce axis, not that the reduce is Reduce.MAX.
even for Reduce.SUM it's possible that the reduce axis had a div before it, so a padded 0 becomes inf and summing over it is incorrect.
2023-12-25 22:52:49 -05:00
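
Note: a minimal numpy sketch (illustration only, not the tinygrad kernel code) of the hazard this commit describes; padding the reduce axis with zeros is only safe when the pad value stays neutral through every op applied before the reduce:

```python
import numpy as np

# Padding the reduce axis with zeros is only correct when the pad value stays
# neutral for the reduction. With a div applied after the pad, the padded
# zeros become inf and poison the sum.
x = np.array([2.0, 4.0])            # real data along the reduce axis
padded = np.pad(x, (0, 2))          # reduce axis padded to length 4 with zeros

with np.errstate(divide="ignore"):
    correct = (1.0 / x).sum()       # 0.5 + 0.25 = 0.75
    broken = (1.0 / padded).sum()   # 1/0 -> inf, so the whole sum is inf

print(correct, broken)              # 0.75 inf
```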
qazal dca5e4fe74
tensor == tensor should be bool (#2916)
* return bool

* add tests to the type spec

* fix multinomial

* fix tril

* fix round

* fix NegativeLogLikelihoodLoss

* rm debug

* webgpu

* more webgpu

* bitwise or for adding two bools

* onnx ops don't need to cast anymore

* Revert "bitwise or for adding two bools"

This reverts commit b413babffa4d93c5cc94a252cb7086b9a899a437.

* workaround for metal neg

* just the tests in the type spec
2023-12-25 12:38:47 -05:00
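
Note: a minimal sketch of the behavior this PR enforces; the tinygrad import path and the .numpy() call are assumptions based on current conventions and may differ by version:

```python
from tinygrad import Tensor, dtypes  # import path assumed; may differ by version

a, b = Tensor([1, 2, 3]), Tensor([1, 0, 3])
mask = (a == b)
# after this change a comparison returns a bool tensor, not the inputs' dtype
assert mask.dtype == dtypes.bool
print(mask.numpy())  # [ True False  True]
```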
chenyu 8a8aed23d2
test dtypes of return values of cumsum, argmax/min, multinomial (#2933)
* test dtypes of return values of cumsum, argmax/min, multinomial

cumsum behaves like sum, and functions that return an index return in dtypes.default_int

* because webgpu is different
2023-12-25 11:33:17 -05:00
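
Note: a sketch of the dtype rules this test pins down, taken from the commit message; the tinygrad import path and exact method signatures are assumptions and may differ by version:

```python
from tinygrad import Tensor, dtypes  # import path assumed; may differ by version

x = Tensor([1, 2, 3], dtype=dtypes.int8)
# cumsum promotes its output dtype the same way sum does
assert x.cumsum().dtype == x.sum().dtype
# index-returning functions (argmax/argmin) come back in the default int dtype
assert x.argmax().dtype == dtypes.default_int
```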
chenyu 1fb815e77e
hotfix fix coder. RMSNorm cannot have float16 input (#2932)
* hotfix fix coder. RMSNorm cannot have float16 input

* update real world test due to new kernels

* more type casts
2023-12-25 02:28:11 -05:00
Will 016aebcd84
Fixed Tensor.randint() not accepting tuple shapes (#2923)
* ww/Fixed Tensor.randint() to accept shape tuples ()

* ww/Wrote a test to cover this typo

* ww/Updated Tensor random objects to optionally take (,) or *() to be more consistent

* ww/no lint no worries

* ww/Made peace with linter

* ww/Added new line, can't reduce line size without reducing readability

* ww/reverted to using .mul
2023-12-24 20:32:26 -05:00
Isalia20 8de1fc2539
Einsum space fix (#2927)
* space removal in formula and a single test to cover it

* space in torch einsum as well

* replacing spaces in the formula variable to support stripping all the spaces
2023-12-24 01:23:27 -05:00
chenyu b55b55d56e
use at least int32 and uint32 for sum output (#2926)
* use at least int32 and uint32 for sum output

* use the correct type for acc

* fix opencl

* llvm mulacc
2023-12-24 01:14:54 -05:00
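
Note: a small numpy illustration (not tinygrad code) of why the sum accumulator needs at least 32-bit integers:

```python
import numpy as np

# 100 int8 values of 100 sum to 10000, which does not fit in int8.
x = np.full(100, 100, dtype=np.int8)
print(x.sum(dtype=np.int8))    # wraps around to 16 instead of 10000
print(x.sum(dtype=np.int32))   # 10000: the accumulator needs at least int32
```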
chenyu 089703a390
cleanup test_dtype_alu (#2919)
wrapped long lines and lowered atol for METAL.sin to 2, since the difference of any two sin values is bounded by 2
2023-12-22 17:29:31 -05:00
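
Note: the bound referenced in this commit, spelled out as a quick plain-Python check; sin stays in [-1, 1], so any two sin values differ by at most 2, which makes atol=2 the loosest tolerance that always holds:

```python
import math

# |sin(a) - sin(b)| <= |sin(a)| + |sin(b)| <= 2 for all real a, b
assert all(abs(math.sin(a) - math.sin(b)) <= 2.0
           for a in range(-50, 50) for b in range(-50, 50))
```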
chenyu 50927defad
s/lazydata.realized/lazydata.base.realized/g (#2914)
* s/lazydata.realized/lazydata.base.realized/g

* not that
2023-12-22 14:45:13 -05:00
chenyu 2783e1b50d
bugfix Tensor.item when it's unbased (#2913)
it's possible for a numel-1 tensor's lazydata to be unbased, in which case it should use lazydata.base.realized
2023-12-22 13:50:06 -05:00
Oleg Rybalko c3133adb8c
Disk shm refactor (#2912)
* better support for platform dependent flags

* osx test support

* removed unused import and made line length <150

* changed osx ci shm

* lstrip in case SharedMemory._name is passed
2023-12-22 09:23:37 -08:00
chenyu 3855432265
don't use numpy to create Tensor(None) (#2909)
* don't use numpy to create Tensor(None)

empty suffices

* parentheses
2023-12-22 01:07:44 -05:00
chenyu 50cfb1fb3a
update onnx model links (#2908)
updated in https://github.com/onnx/models/pull/644
2023-12-22 00:19:41 -05:00
chenyu 1bbeb3fe2f
remove the different rtol / atol for openpilot CUDA in benchmark (#2907)
not sure what the issue was but seems to be fixed on master
2023-12-21 22:23:39 -05:00
chenyu a543d8bea8
fuzz default dtypes for some test_dtype tests (#2906)
* fuzz default dtypes for some test_dtype tests

* ocd

* setUp and tearDown
2023-12-21 22:00:21 -05:00
George Hotz 5cac6338a4
apply the multitensor optimizations in lazy.py (#2901)
* apply the multitensor optimizations in lazy.py

* less lines

* hack for webgpu

* save a line
2023-12-21 13:55:49 -08:00
chenyu 5bf43c9634
reenable one onnx test failed due to dtype (#2902) 2023-12-21 15:50:02 -05:00
George Hotz 193109a88c hotfix: compare on ids 2023-12-20 23:47:50 -08:00
George Hotz f6c7833f9f
fast compare for lazyop (#2893) 2023-12-20 23:32:27 -08:00
George Hotz 41b2a25be6
Fix exponential behavior in lazyops (#2890)
* add cache to ast_parse and lazyop builder

* add caches
2023-12-20 22:06:50 -08:00
George Hotz 8c4a0f8e15
Fix int child count (#2882)
* pad ops broke coder

* that contiguous fixes it

* Update lazy.py

* recursive add

* fix all

* revert that

* todo test
2023-12-20 21:06:27 -08:00
George Hotz 7da2325dc7
get_lazyops() -> lazyops (#2884)
* get_lazyops() -> lazyops

* don't compare empty mem
2023-12-20 18:04:49 -08:00
George Hotz e1861ab65e
remove realize from optimizer (#2880)
* remove realize from optimizer

* one still needed

* opt realize
2023-12-20 16:42:41 -08:00
George Hotz 1765849937
new lazy, benchmark (#2878)
* lazy rewrite, try 2

* min fix tests

* pass contig test

* put broken pads back

* move that to realize

* no contig child fixes array packing

* so wrong

* now that's correct

* base children

* fix bind issues

* disable to_image_idx

* fix tests

* that failure shouldn't break other tests

* more fixes

* fix torch

* skip failing tests in CI

* 1e-7

* half is broken

* 1e-6 margin of error
2023-12-20 14:33:21 -08:00
Peter Cawley dae8976889
Fix reshape merging with masks (#2877) 2023-12-20 14:00:58 -08:00
George Hotz 8fe24038d8
Revert "mulacc fusion cleanup (#2871)" (#2876)
This reverts commit 863c5b26ed.
2023-12-20 13:26:25 -08:00
qazal 863c5b26ed
mulacc fusion cleanup (#2871)
* add mulacc fusion tests

* cleanup the implementation

* fix indent in the test utility

* less verbose
2023-12-20 15:39:54 -05:00
chenyu e13b4964d7
remove the all_int(shape) check in Tensor._loadop (#2874)
* remove the all_int(shape) check in Tensor._loadop

we can support jittable symbolic shape random with custom rand now, and we can formalize it in the test after threefry is ready

* MOCKHIP false positive
2023-12-20 15:04:50 -05:00
qazal 5f07ef455e
update dtypes (#2872) 2023-12-20 15:04:02 -05:00
George Hotz ca59054463
fix shapetracker math (#2861)
* proper test

* all st math good now

* fix real_strides bug
2023-12-19 22:17:34 -08:00
chenyu 5a739e8c20
update one skipped pad_reshape test that was fine (#2860)
* update one skipped pad_reshape test that was fine

had a typo

* this one passed
2023-12-19 23:25:52 -05:00
chenyu ad233d557f
disable reshape merging with masks (#2858)
the fuzzer found a bug, and the implementation is not complete
2023-12-19 19:06:16 -05:00
Oleg Rybalko 42a038c83f
More readable torch_load ext check (#2853)
* more readable extension check

* enable tarfile test

* detach tensor if requires grad in torch
2023-12-19 14:53:15 -05:00
chenyu 172a88e719
skip slow test_indexing on METAL (#2852)
LLVM still runs it and is a lot faster; would be curious to know why.
also reworded some error messages and removed the regex check
2023-12-19 12:00:54 -05:00
geohotstan fec8e9060c
Add simple fancy indexing exceptions (#2706)
* fancy indexing raise error

* updated error message

* improved error check

* oops

* fixed onnx

* oops typo

* merge

* add full_flatten

* try

* merged and updated some tests

* more cleaning

* done

* temp fix onnx

* try

* add todo in onnx_test

* reword

* gah
2023-12-19 11:23:51 -05:00
George Hotz 90fb09b55c remove unused _device_extra_args 2023-12-18 22:14:58 -08:00
George Hotz b2192b5400
minor improvements (#2845) 2023-12-18 22:09:08 -08:00
George Hotz d086325b1b hotfix: failing tests 2023-12-18 21:12:42 -08:00
George Hotz 07df14aa0e
HIP cleanups (#2843)
* move everything to code_for_op to reason about it

* loop the loopable parts

* its not that unreadable

* these are loopable too

* nitpick

* tests p1 - replace these with the actual compiler running alu ops tests

* tests p2: compile test_dtype_alu in HIP!

+add to CI

* nobody liked test_renderer

* revert test_dtypes change

* isolated mockhip tests

* dont need the WHERE hack after #2782

+ruff

* bf16 is broken in HIP

job failed in: https://github.com/tinygrad/tinygrad/actions/runs/7232101987/job/19705951290?pr=2778#step:8:73

* picking this back up

* add compile tests for unary ops and binary ops

* MOD is only in ints

* CMPLT won't work after the dtypes PR is merged because it will always be bool

* test all combinations

* Update cstyle.py

* don't use vload

* no getenv

* set seed

---------

Co-authored-by: qazal <qazal.software@gmail.com>
Co-authored-by: qazal <77887910+Qazalin@users.noreply.github.com>
2023-12-18 21:09:32 -08:00
George Hotz b6d71b131e hotfix: push broken tests 2023-12-18 21:08:42 -08:00
George Hotz 80f53245e8
shapetracker add and invert (#2828)
* invert (broken)

* decent invert

* shapetracker invert works

* plus is meh, invert is good

* support invert mask

* a few more invert tests

* shapetracker math invert test
2023-12-18 16:03:27 -08:00
chenyu 73cadfbb3c
Remove pytest markers (#2831)
* remove pytest marker

* fix some, skip some

* tweak

* fix

* skip slow

* skip more
2023-12-18 18:53:28 -05:00
chenyu 264fe9c93f
clean up test_dtype.py (#2827)
make is_dtype_supported a pure function and clean up long lines
2023-12-18 16:06:09 -05:00