George Hotz
ff64bcab69
move graph/search to engine (#4596)
2024-05-14 23:12:59 -07:00
George Hotz
347a3acb37
add renderer class (#4524)
* add renderer class
* tests pass
* fix pylint
* fix tensor cores
2024-05-10 21:40:02 -07:00
George Hotz
17faae091b
optimizer shouldn't be run without training (#4460)
* optimizer shouldn't be run without training
* set training in relevant tests
* fix multitensor
* that too
2024-05-06 15:34:12 -07:00
George Hotz
f4e49a7c1a
resnet 50 opt: correct loop + LARS (#4449)
* correct loop + LARS
* ops
2024-05-06 08:01:26 -07:00
George Hotz
fc995d4446
add backward to handcode_resnet50_opt
2024-05-06 06:42:26 -07:00
chenyu
aa093efa43
fix handcode_resnet50_opt flops count (#4184)
2024-04-15 22:13:45 -04:00
chenyu
b47f6cebb2
LinearizerOptions -> CompilerOptions (#3978)
2024-03-28 17:50:23 -04:00
George Hotz
68ca4d4276
split to schedule.py (#3949)
* split to schedule.py
* split
2024-03-26 21:02:46 -07:00
George Hotz
150ea2eb76
create engine folder and move code (#3948)
* retry
* older tf
* that
2024-03-26 20:38:03 -07:00
Anurag Lamsal
4e0819e40b
fixing the benchmark not printing in handcode resnet50 opt example (#3850)
2024-03-21 00:55:31 -04:00
qazal
337cd53444
multioutput ScheduleItem (#3699)
* refactor realize.py
* update docs
* update test_sched
* update runners and devices
* update openpilot and unit tests
* cleanup runner lowering
* update more tests
2024-03-13 08:59:38 -07:00
qazal
aec4c4f01b
linearizer ast as a tuple of lazyops (#3689)
* multi store op linearizer
* currently we do only one output per kernel
* named opts
2024-03-11 15:39:04 -07:00
George Hotz
2e60012bcf
move create schedule and delete old API (#3377)
* move create schedule and delete old API
* fix test multitensor
2024-02-12 18:10:45 +01:00
chenyu
77251336d5
fix handcode_resnet50_opt.py (#3289)
linearizer_opts has moved. also update the logging to print after total_tm update
2024-01-31 19:01:08 -05:00
chenyu
53afec2841
add HALF to handcode_resnet50_opt.py (#3202)
use this to study tensor cores on HIP
2024-01-21 23:03:59 -05:00
chenyu
58d3d5030b
vars_from_ast -> LazyOp.vars (#2965)
2024-01-01 18:12:38 -05:00
George Hotz
00d9eda961
FROM -> COPY, move vars_from_ast (#2675)
2023-12-07 16:32:30 -08:00
chenyu
05a5357dd9
fix handcode_resnet50_opt.py (#2558)
2023-12-01 20:51:21 -05:00
Akshay Kashyap
a031afb2f6
Update display_name in resnet50 example (#2454)
2023-11-26 16:07:36 -08:00
George Hotz
0cbf6c1811
move things, clean up extra (#2292)
* move things
* idk why pylint needs that now
* delete unused
2023-11-13 20:18:40 -08:00
George Hotz
7103b716c4
merge kernel and optimizer (#2200)
* merge kernel and optimizer
* linearize is reentrant
* move global/local size
* clean up linearizer copy
* remove unneeded lin copies
* stop linearizing twice
* oops, that should be None
2023-11-01 15:20:01 -07:00
George Hotz
e0201922e3
Q network for pruning BEAM / uops deduping / BEAM_ESTIMATE (#2142)
* stable diffusion < 324ms
* revert swap action
* fix tests due to more sum splitting
* REDUCEOP_SPLIT_THRESHOLD env var
* added from unaligned np test (#2134)
* align cpu buffer before copy into cl buffer (#2135)
* remove shelve from handcode_resnet50_opt.py (#2139)
* Add dictionary keys to reduce db size (#2131)
* work
* ignore beam cache
* dictionary keys are generic
* minor db cleanups
* fix baseline and extract dataset
* fix training
* log likelihood
* more lin to feats
* sts
* training policynet
* net sort of works
* dedup
* refactor, stupid new actions
* fix uops deduping
* BEAM_ESTIMATE
---------
Co-authored-by: chenyu <chenyu@fastmail.com>
Co-authored-by: imaolo <56898718+imaolo@users.noreply.github.com>
2023-10-27 10:53:06 -10:00
chenyu
d5e2fdea22
remove shelve from handcode_resnet50_opt.py (#2139)
2023-10-24 10:37:30 -04:00
George Hotz
c36d306606
KOPT is over, BEAM is upstream (#2071)
* create cache for q learning
* make linter happy
* global beam
* where it belongs
* bugfix
* ditch the kopt, use the beam
* faster lin and DEBUG=2 okay
* remove kopt, move search to features
2023-10-16 09:46:03 -07:00
George Hotz
49bcfec383
0s in the action space (#2070)
* 0s in the action space
* simpler
* skip duplicate actions
2023-10-14 11:22:48 -07:00
George Hotz
6f1810af2d
with unroll, the action space goes from 161 -> 127 (#2060)
* with unroll, the action space goes from 161 -> 127
* more reliable instrumentation
* beam search is so op
* beam bugfix
2023-10-12 20:52:23 -07:00
George Hotz
c5edb3c374
train value net, improve API, add BCE (#2047)
* api cleanups, BCE losses
* valuenet
* fixup examples
* learning okay
* add valuenet runner
* net improvements
* net improvements
* 40% win rate
2023-10-12 07:56:38 -07:00
George Hotz
41bfeb2c1e
start work on auto opt (#2034)
* start work on auto opt
* lin failure
* not beating hcopt
* greedy
* timing is fast
* codegen.search
* greedy search in handcode_opt
* track running gflops
* clean up those files
* no failure
2023-10-11 12:54:53 -07:00
George Hotz
44ed94ef5c
use the device abstraction in handcode_resnet50_opt
2023-10-07 13:22:20 -07:00
George Hotz
121f7aa8c5
Schedule item (#2012)
* ScheduleItem
* put var_vals in the schedule
* fix tests, wow that proliferated quickly
* not ready to be in the schedule
2023-10-07 08:59:25 -07:00
George Hotz
f54959e5cd
move print tree into graph (#2003)
* move print tree into graph
* add winograd profiling test
* change pre-commit to run ruff first
2023-10-07 04:39:21 -07:00
nimlgen
2ea1dd3e87
no process() in Linearizer (#1966)
* no process() in Linearizer
* more process() clean up
2023-10-04 07:18:42 -07:00
George Hotz
90326dbdc3
resnet50 hand coded optimization (#1945)
* resnet50 hand coded opt
* hand optimize one kernel
* opt in both places to fix test
2023-09-29 09:34:51 -07:00