tinygrad

Commit Graph

Author	SHA1	Message	Date
qazal	6dbe5585b0	batchnorm + conv backward in test_schedule (#4420 ) * test both optims * batchnorm_backward	2024-05-06 16:40:17 +03:00
Timmy	3f3c973022	Multiple Reduce Kernels - kernel properly orders reduceops (#4418 ) * enable kernel with multiple reduceops * copy self.reduceops * assert only one reduceop per kernel * kernel.py dfs order * linters --------- Co-authored-by: qazal <77887910+Qazalin@users.noreply.github.com>	2024-05-06 13:54:44 +03:00
wozeparrot	603d3a351b	feat: allow keeping multiple cookies (#4440 )	2024-05-05 19:26:48 -07:00
chenyu	afe020710d	disable PADTO on upcasted axis (#4444 ) fixed test_failure_31. PADTO upcasted is at best a no-op, and might fail at edge cases.	2024-05-05 21:52:03 -04:00
Francis Lam	709410071c	mlperf/resnet: updated BEAM params to increase performance (#4443 )	2024-05-05 21:49:46 -04:00
Francis Lam	c8595a9655	update sops.gz, fix tests and add new linearizer test (#4437 ) * update sops.gz, fix tests and add new linearizer test * remove METAL CI skip for test_failure_22 * re-add skip to METAL CI to test_failure_22	2024-05-05 17:31:25 -04:00
wozeparrot	9ad3d0520a	hotfix: npy is also ok (#4439 )	2024-05-05 13:48:54 -07:00
chenyu	d0eb1540d5	helpers.diskcache_clear (#4436 ) drop all tables in diskcache. added a unit test but disabled it by default because it will drop all cache...	2024-05-05 14:19:01 -04:00
George Hotz	595a6e3069	test_fold_conv_relu_backward test	2024-05-05 11:13:43 -07:00
George Hotz	cc16f644d0	hotfix: remove FAKE buffer from graph	2024-05-05 10:52:41 -07:00
qazal	760776c59d	merge EfficientNet to C with clang job (#4426 ) * merge ImageNet to C with linters * add to clang * delete from linter	2024-05-05 20:33:12 +03:00
chenyu	3b30756cbb	update mlperf submission system (#4435 ) more required fields.	2024-05-05 13:19:07 -04:00
George Hotz	f95658bc3e	hotfix: pickle jit works if you delete the function	2024-05-05 10:14:03 -07:00
George Hotz	12be536c06	Clang graph (#4424 ) * clang graph runner * render_dtype * name it ClangGraph * JIT=2 * JIT=2 goes there * JIT as context var	2024-05-05 09:54:12 -07:00
David Hou	544431c388	refactor: pass reduceop into global_load (#4417 ) * pass reduceop directly to global_load * typing * make mypy happy :/ * cede a line to mypy :( * fold in acc_const * add todo	2024-05-05 19:43:48 +03:00
geohotstan	874dfc556c	update setitem tests to test for currently supported cases (#4334 ) * tests, tests, tests * one more test * tests tests tests tests * t e s t * a few more	2024-05-05 11:59:13 -04:00
chenyu	fc9e58e482	Revert "refactor sparse_categorical_crossentropy (#4406 )" (#4429 ) This reverts commit `c7368515d2`.	2024-05-05 02:30:37 -04:00
David Hou	c0a048c044	batchnorm d(var)/d(mean) = 0 (#4430 ) * d(var)/d(mean) = 0 * drop the number in test_schedule!	2024-05-05 00:25:45 -04:00
George Hotz	e2eab9c2b3	hotfix: disk is okay in child process	2024-05-04 18:18:31 +00:00
George Hotz	cf33afa778	don't open devices from children (#4425 ) * don't open devices from children * correct way to do this * fix Device.DEFAULT and add back JITBEAM	2024-05-04 10:35:40 -07:00
qazal	fa17dcaf07	Fix llm.c/export.py (#4423 ) * fix headers * add CI * add stdio * merge clang tests * revert llm.c * revert ci * Revert "revert llm.c" This reverts commit 5fd17e3c8b38dc9549d0548e9515185b7b032573.	2024-05-04 19:37:10 +03:00
George Hotz	cb7289f9c9	remove clang program header (#4422 ) * remove clang program header * proper max * bools are numbers * fix compile enet	2024-05-04 08:38:01 -07:00
qazal	267bbb57f9	Revert "Add `insert_before` to Linearizer Functions (#4320 )" (#4421 ) This reverts commit `664b563c91`.	2024-05-04 17:50:21 +03:00
qazal	5f3bae378f	search children in fusion (#4322 ) * scheduler diff * tests diff * new changes * realizes * chores * assign * kind of r3 * forced_realize wont do it * with forced_realize * start with children * test search * r3 with parents * diff cleanup * add children * crossing assign * late fuse descendants * update kernel counts * assign diff doesnt belong here	2024-05-04 17:22:15 +03:00
qazal	249cadd106	fusing crossing diamond assign (#4403 ) * refactor scheduler parents search * assign target * unit test * can't chase this	2024-05-04 15:19:48 +03:00
George Hotz	9fc4465557	subbuffer support (#4397 ) * subbuffer support * diskbuffer offset * cuda subbuffer works * use subbuffer * more subbuffer tests * consecutive * cast * consec * offset * view is a better name * offset is in nbytes * fix view + memory planner * delete unused DiskRunner * reverse order * no subbuffers on unrealized consts * only enabled for disk * don't reverse memory * view supported devices * pickle buffer view * ring jit * support extra view inputs in jit * fix JIT=2 issue * test copy jit * p2p isn't an option anymore * fix dep tracking issue * fix mypy * fix pickle * from_nv is contents now	2024-05-03 18:05:57 -07:00
chenyu	c7368515d2	refactor sparse_categorical_crossentropy (#4406 ) factor out the -1 * and / loss_mask.sum() for both smoothing and non-smoothing terms	2024-05-03 14:28:36 -04:00
qazal	3401734e54	infra for scheduler process replay (#4405 ) * use getenv * capture ast * fix graph * replay schedules * exec	2024-05-03 20:29:13 +03:00
chenyu	473ecb978a	remove SPLIT_REDUCEOP=1 from resnet scripts (#4404 ) SPLIT_REDUCEOP=1 is default	2024-05-03 12:36:23 -04:00
David Hou	b767d59684	resnet trainer: keep old cookie around until next step has been queued (#4401 ) * keep old cookie around until next step has been queued (-10ms 6gpu) * also for eval * drop cookie before data_get? * Revert "drop cookie before data_get?" This reverts commit b01e6aa2b27f49aeab04b448f09e0ef9e689ea53. * Revert "Revert "drop cookie before data_get?"" This reverts commit 23464e73d445007c15537c69818fdee89adf0740.	2024-05-03 12:15:21 -04:00
qazal	cf3ccb809f	refactor scheduler parents search (#4402 )	2024-05-03 17:16:34 +03:00
George-the-1st	0627e26140	Added missing unittest execution code (#4400 ) same code as on every other test file, just missing from this one for some reason.	2024-05-02 22:34:30 -04:00
chenyu	d4062cb6fc	NV tensor_cores in kernel.py (#4399 )	2024-05-02 22:33:08 -04:00
qazal	0deaaf2bc8	partial fusion spec (#4398 )	2024-05-03 04:14:23 +03:00
chenyu	2c3b7f8e70	pad resnet training data with training data mean (#4369 ) update model_train resnet to pad training	2024-05-02 20:26:15 -04:00
Francis Lam	3cf8291f2f	mlperf/resnet: update beam params to increase time and quality (#4396 ) * mlperf/resnet: update beam params to increase time and quality * revert upcast 8 in search space and add rocm setup function * refactor to independent setup.sh script	2024-05-02 20:14:46 -04:00
nimlgen	ca6c8ae739	factor out resource access logic in multigraph base class (#4385 ) * factor out resource access logic in multigraph base class * hsa fixes * clean * linter * linter 2 * not need this	2024-05-03 00:38:22 +03:00
chenyu	ab01a9433d	resnet eval 4n+3 if epoch < 33 (#4391 ) the rule is as thoroughly as 4n+k and we can stop the clock as soon as eval hits target. this can save 24 evals or 12 minutes	2024-05-02 16:52:07 -04:00
Francis Lam	7c8401fc65	search: skip timing the unoptimized kernel (#4395 ) * search: skip timing the unoptimized kernel also ensure the return the unoptimized kernel if no opts are valid and refactor debugging to a single BEAM_DEBUG variable * stop early on fast kernels that can't improve enough	2024-05-02 16:48:49 -04:00
Francis Lam	5c5b40880f	search: fix edge cases on screening potential ops (#4394 ) * search: fix edge cases on screening potential ops won't change correctness, but will save a little python time by properly deduplicating potential actions * check for de-duplication instead of exact valid actions * refactor long line	2024-05-02 14:53:05 -04:00
George Hotz	89030b238a	add consecutive property to shapetracker	2024-05-02 10:41:28 -07:00
George Hotz	2786dff26d	new disk tensor tests (#4393 )	2024-05-02 08:54:44 -07:00
chenyu	7492e5d3e7	resnet correct log name for red (#4390 )	2024-05-02 10:58:55 -04:00
chenyu	bf31837e6d	resnet correct steps_in_val_epoch in logging (#4389 ) also added random seed from system in scripts	2024-05-02 10:51:36 -04:00
George Hotz	c8a2047377	testing for all reduce (#4387 )	2024-05-02 06:34:10 -07:00
ym555	3113785604	Llama 3 Models (#4339 ) * Full Impl * fix test * Fix inference loop --------- Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>	2024-05-02 06:06:07 -07:00
qazal	0b47818e0f	simpler reduceop children chasing (#4350 ) * simplest case * midreduce case * all tests * pending things * unify tests	2024-05-02 15:15:30 +03:00
chenyu	22376e53b7	resnet mlperf logging (#4361 ) * resnet mlperf logging * cropping too much?	2024-05-02 00:00:04 -04:00
George Hotz	f635c4d273	fix define global (#4383 ) * fix define global * remove name from DEFINE_GLOBAL * fix fuzzing * fix ptx * fix python	2024-05-01 22:32:56 -04:00
chenyu	ad116dc5c6	fill in mlperf system description (#4381 ) it did not ask too many details. will put software versions later with tinygrad commit. ``` python3 -m mlperf_logging.system_desc_checker examples/mlperf/training_submission_v4.0/tinycorp/systems/tinybox_red.json training 4.0.0 INFO - System description checker passed for tinybox red ``` ``` python3 -m mlperf_logging.system_desc_checker examples/mlperf/training_submission_v4.0/tinycorp/systems/tinybox_green.json training 4.0.0 INFO - System description checker passed for tinybox green ```	2024-05-01 16:47:45 -04:00

4422 Commits
