tinygrad

Commit Graph

Author	SHA1	Message	Date
George Hotz	ded1b38b84	minor dtype cleanup [pr] (#7124) * minor dtype cleanup [pr] * use ptr() function	2024-10-17 17:41:23 +08:00
qazal	20d3c2d113	unify UOps.SHAPETRACKER and UOps.SWIZZLE with UOps.VIEW (#6955) * add UOps.VIEW * update hardcoded asts * update sops.gz	2024-10-09 02:00:17 +08:00
wozeparrot	c100f3d406	default threefry (#6116)	2024-09-25 17:45:13 +08:00
George Hotz	bdd0c06f29	add void type to uop (#6471) * unwrap_dtype maybe * uopgraph stuff that hardcoded None * test_ops passes * dtypes.py fixups * update test_linearizer and friends * more ast updates * test_beam and test_schedule too * add void type to uop [run_process_replay] * remove dumb casts * start making it green * more cast cleanups * more cls methods to fix * regenerate dataset * split UOp and NOp const * maybe that too * fix docs * update test_uop_symbolic * test_verify_ast * new sops with no diff * meh, type_ignore is alright * remove that assert --------- Co-authored-by: qazal <qazal.software@gmail.com>	2024-09-11 18:16:28 +08:00
qazal	442150a8df	more ast_const for hardcoding consts [run_process_replay] (#6418)	2024-09-09 11:35:08 +08:00
gswangg	1dc6040877	migrate test_search.py to UOp AST (#6245) * add imports and update test_kernel_count with UOp AST * test_filter_global_buffer * remove LazyOp * remove extra.ops and ReduceOps --------- Co-authored-by: qazal <77887910+Qazalin@users.noreply.github.com>	2024-08-24 16:13:53 +03:00
qazal	d1d41130cd	use membufs in ImageDType checks [run_process_replay] (#6136) * use membufs in ImageDType checks * set by key [run_process_replay]	2024-08-17 16:17:46 +03:00
qazal	28c75bf2a6	merge uops with ops (#6111) Co-authored-by: chenyu <chenyu@fastmail.com>	2024-08-16 18:17:57 -04:00
qazal	c23d44c779	AST is UOp (#6030) * most of the work from the uops2 branch * schedule * realize * kernel * lowerer * search * green * merge uops with ops * Revert "merge uops with ops" This reverts commit 1408a59f12c97e3466679884266b247cf9df46bc. * fix benchmark * remove extra dedup	2024-08-16 22:09:00 +03:00
qazal	4d38fec8c1	rename lazyops to parents [run_process_replay] (#6091)	2024-08-15 17:27:32 +03:00
George Hotz	b399ccd6ef	BEAM bugfix, kernels dedup now (#5617) * BEAM bugfix, kernels dedup now * getenv is default	2024-07-20 19:43:50 -07:00
George Hotz	fa7e734b49	MetaOps.KERNEL (#5543)	2024-07-17 19:41:23 -07:00
chenyu	28972418c4	s/get_linearizer/get_kernel [run_process_replay] (#5467)	2024-07-13 20:32:22 -04:00
George Hotz	03c2dc8bd7	lowerer is kernel [run_process_replay] (#5437)	2024-07-12 18:50:55 -07:00
George Hotz	870dc8c350	s/Linearizer/Lowerer [run_process_replay] (#5428)	2024-07-12 15:54:07 -07:00
George Hotz	6707c778d0	scheduleitem is not Tuple [run_process_replay] (#5425) * scheduleitem is not Tuple [run_process_replay] * fix tests * fix op + fuzzers * fix mop test	2024-07-12 15:13:19 -07:00
George Hotz	f6ef283e6a	s/loadops/metaops [run_process_replay] (#5421)	2024-07-12 13:26:50 -07:00
qazal	28bf8d86d8	test_linearizer with multi output ASTs (#5115) * ast is tuple * run test_phi_simplification * update reason * more tc * beam * a few more * use test_opt directly	2024-06-23 15:41:24 +03:00
George Hotz	9f875123b6	small changes from lowerer. [run_process_replay] [no_assert] (#5102)	2024-06-22 11:09:35 -07:00
nimlgen	fd071ba27e	amd mockgpu correct timer resolution (#4942) * amd mockgpu correct timer resolution * test it	2024-06-13 10:07:34 +03:00
qazal	8b5bcf309a	process replay in all of CI (#4884)	2024-06-10 14:49:29 -04:00
qazal	f64fa51a64	process replay for test/* (#4799) * add input to unit tests [run_process_replay] * add setup [run_process_replay] * run tests [run_process_replay] * add cuda and amd [run_process_replay] * run everything but BEAM=2 [run_process_replay] * skip export_model [run_process_replay] * fix amd CI * add concurrency back	2024-06-03 12:01:58 +03:00
chenyu	456aa0b656	update test_search kernel count (#4652) integration test that beaming 1 kernel increments kernel count by 1, and moved exiting test_kernel_count to TestTimeLinearizer	2024-05-19 13:54:52 -04:00
Léo	967e35f8b8	fix(beam): GlobalCounters kernel count increasing when clearing l2 (#4598) * fix(beam): GlobalCounters kernel count increasing when clearing l2 * fix: removed the NOSTATS var by adding do_update_stats to Tensor.realize() * test(search): regression test for _time_program, should not increment kernel_count * fix(test_search): unused var and now properly checking when l2 is cleared * fix(test_search): added assert message * fix(test_search): now testing public beam api for kcount * ruff fixes --------- Co-authored-by: Léo Paillé <leo.paille@enseirb-matmeca.fr>	2024-05-19 10:03:47 -07:00
nimlgen	daf57af3eb	move tc to renderers (#4631) * move tc to renderers * missed import * fix typo * fix * fix imports * remove from tests * fix 4607 * nv emulate timestamp * time is int * correct time	2024-05-18 00:36:29 +03:00
chenyu	c86adabe15	time with real global buffers in search (#4621) * filter fake buffers in search * test that * update test	2024-05-17 12:36:23 -04:00
nimlgen	eb9689336e	nv mockgpu (#4600) * mockgpu nv * works * comment that out * fix merge * setup gpuocelot * install packages * not run all of them * passes * fix ci * almost * should pass * linter * linter 2 * try this? * ugn, not supported * ci * remove ticket from description * better descs	2024-05-15 23:46:08 +03:00
George Hotz	ff64bcab69	move graph/search to engine (#4596)	2024-05-14 23:12:59 -07:00
George Hotz	1e843d495e	cleaning up search with Program (#4500) * cleaning up search * fix tests * test fix * minor compiler cleanup	2024-05-09 19:01:53 -07:00
George Hotz	c9e84ed0da	refactor to Program class (#4476) * refactor to Program class * switch to Program * fix tests * smaller diff * self.p * more tests * fix metal test * tests * fix openpilot * move that to linearizer * p.launchdims	2024-05-09 17:29:07 -07:00
Francis Lam	5c5b40880f	search: fix edge cases on screening potential ops (#4394) * search: fix edge cases on screening potential ops won't change correctness, but will save a little python time by properly deduplicating potential actions * check for de-duplication instead of exact valid actions * refactor long line	2024-05-02 14:53:05 -04:00
George Hotz	acf4ba5c9f	method cache respects beam option (#4261) * method cache respects beam option * cleanup get_runner	2024-04-23 09:00:41 +04:00
George Hotz	9eef44521b	ScheduleItem uses Buffer (#3995) * schedule Buffer * update * update tests * master * works * remove LoadOps.WAIT * fix compile2 * bad test * rename and note	2024-03-29 20:50:27 -07:00
George Hotz	42b9d999ea	Buffer isn't always allocated (#3974) * buffer alloc * allocate * missing allocates * last one	2024-03-28 13:33:47 -07:00
George Hotz	68ca4d4276	split to schedule.py (#3949) * split to schedule.py * split	2024-03-26 21:02:46 -07:00
George Hotz	150ea2eb76	create engine folder and move code (#3948) * retry * older tf * that	2024-03-26 20:38:03 -07:00
qazal	337cd53444	multioutput ScheduleItem (#3699) * refactor realize.py * update docs * update test_sched * update runners and devices * update openpilot and unit tests * cleanup runner lowering * update more tests	2024-03-13 08:59:38 -07:00
chenyu	906cc3a69b	cleanup tests Device[Device.DEFAULT] is always Compiled (#3645)	2024-03-07 11:15:42 -05:00
Jovan Sardinha	8978488565	add sanity tests for bufs_from_lin (#3586)	2024-03-02 14:17:43 -08:00
George Hotz	2e60012bcf	move create schedule and delete old API (#3377) * move create schedule and delete old API * fix test multitensor	2024-02-12 18:10:45 +01:00
George Hotz	655c6f61d3	St real size (#3046) * track the size in the lazybuffer * shapetracker real size * lint	2024-01-08 14:44:53 -08:00
George Hotz	c003be7309	Revert "track size in shapetracker" (#3043) * Revert "track size in shapetracker (#3026)" This reverts commit `a8ba1ac08f`. * st.size	2024-01-08 13:13:39 -08:00
George Hotz	a8ba1ac08f	track size in shapetracker (#3026) * track size in shapetracker * shapetracker adapter * size is an int * create Buffer with st.size * only compare the views for the jit * fix webgpu	2024-01-05 20:15:53 -08:00
George Hotz	2c363b5f0b	new style device (#2530) * cpu tests pass * torch works * works * metal works * fix ops_disk * metal jit works * fix openpilot * llvm and clang work * fix webgpu * docs are rly broken * LRU works on metal * delete comment * revert name to ._buf. LRU only on Compiled * changes * allocator * allocator, getting closer * lru alloc * LRUAllocator * all pass * metal * cuda * test examples * linearizer * test fixes * fix custom + clean realize * fix hip * skip tests * fix tests * fix size=0 * fix MOCKHIP * fix thneed * copy better * simple * old style metal copy * fix thneed * np reshape * give cuda a device	2023-11-30 17:07:16 -08:00
George Hotz	9e07824542	move device to device.py (#2466) * move device to device.py * pylint test --disable R,C,W,E --enable E0611 * fix tests	2023-11-27 11:34:37 -08:00
George Hotz	f17bc16f46	simple runtime args (#2211) * simple runtime args * fix some tests * fix abstractions and triton * fix search	2023-11-03 12:31:29 -07:00
George Hotz	7103b716c4	merge kernel and optimizer (#2200) * merge kernel and optimizer * linearize is reentrant * move global/local size * clean up linearizer copy * remove unneeded lin copies * stop linearizing twice * oops, that should be None	2023-11-01 15:20:01 -07:00
qazal	36d4001b4f	add test coverage for search (#2104) * add test coverage for search * only in compiled backends * dont use device.default in decorator * time_til is the other way around xd	2023-10-19 17:06:47 -07:00

48 Commits