* Context and Timing can now be used as decorators
* Using Timing decorator in quickstart.md
The time formatting is better, and the decorator is a useful tool to learn.
Old: Time: 3.5260659999912605
New: Time: 3526.14 ms
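A minimal sketch of the decorator form, assuming a `contextlib.ContextDecorator`-based helper with millisecond formatting (names illustrative, not necessarily tinygrad's exact implementation):

```python
import time
from contextlib import ContextDecorator

class Timing(ContextDecorator):
    """Usable both as `with Timing(...):` and as `@Timing(...)`."""
    def __init__(self, prefix="Time: "):
        self.prefix = prefix
    def __enter__(self):
        self.st = time.perf_counter_ns()
        return self
    def __exit__(self, *exc):
        # print elapsed time in ms, matching the "3526.14 ms" style above
        print(f"{self.prefix}{(time.perf_counter_ns() - self.st) * 1e-6:.2f} ms")
        return False

@Timing()
def work():
    return sum(range(10**6))

work()  # decorator form prints e.g. "Time: 12.34 ms"
```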
* Updated env_vars documentation for Context
* Added test for Context decorator
* Put new import on same line as others
* need a better place for reshape and permute
* add permutation
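For context, here is what the two movement ops do, in numpy terms (`np.transpose` is numpy's permute); this is just an illustration, not the code being moved:

```python
import numpy as np

x = np.arange(24).reshape(2, 3, 4)  # reshape: same data, new shape
y = np.transpose(x, (2, 0, 1))      # permute: reorder the axes
assert y.shape == (4, 2, 3)
assert y[1, 0, 2] == x[0, 2, 1]     # element mapping under the permutation
```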
* cuda fixed
* clean up
* enable nvidia GPU with global max
* fix order
* fix CI
* add check for global dim limit but need refactor
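A hedged sketch of what such a check can look like; the limits below are the usual CUDA maximum grid dimensions, but real code should query the device for its actual limits:

```python
# Typical CUDA max grid dims for compute capability >= 3.0 (x, y, z);
# illustrative only -- query the device attributes for the real values.
MAX_GRID = (2**31 - 1, 65535, 65535)

def check_global_size(global_size):
    if len(global_size) > 3:
        raise ValueError("at most 3 global dimensions")
    for dim, limit in zip(global_size, MAX_GRID):
        if dim > limit:
            raise ValueError(f"global dim {dim} exceeds device limit {limit}")

check_global_size((1024, 1024, 64))  # passes
```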
* refactor
* fix ignore
* Improve Metal runtime command buffer handling
* Remove obsolete mtl_buffers_in_flight list from _METAL class
* remove unused import in ops_metal.py
* Refactor: Use `self.dispatch_group` over `METAL.dispatch_group`
Changes `libdispatch.dispatch_group_enter(METAL.dispatch_group)` to `libdispatch.dispatch_group_enter(self.dispatch_group)`
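An illustrative sketch of the pattern behind the change (a plain counter stands in for libdispatch's dispatch group; this is not the actual ops_metal.py code): state moves from the module-level `METAL` global onto the instance, so each command buffer tracks its own work.

```python
class DispatchGroup:
    """Stand-in for a libdispatch dispatch group (illustrative only)."""
    def __init__(self):
        self.pending = 0
    def enter(self):
        self.pending += 1
    def leave(self):
        self.pending -= 1

class MetalCommandBuffer:
    def __init__(self, dispatch_group):
        self.dispatch_group = dispatch_group  # held per-instance

    def submit(self):
        # was: libdispatch.dispatch_group_enter(METAL.dispatch_group)
        self.dispatch_group.enter()

    def complete(self):
        self.dispatch_group.leave()

buf = MetalCommandBuffer(DispatchGroup())
buf.submit(); buf.complete()
assert buf.dispatch_group.pending == 0
```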
* flake8: Ignore frequent violations, correct infrequent ones
* Ignore some rules in test
* Reorder test ignores
* Lint test + main
* EOF indent
* Include all E71,E72 errors
* Test the failing case in CI
* Revert "Test the failing case in CI"
This reverts commit 110add0a70f5a619d07631269104e84f908af6b9.
* Push to test!
This reverts commit f317532779a0e1ac8401e2474fd5c6c8695c08e9.
* ok back to passing
This reverts commit ba5052685f93f83e06152cdc696b9e26131d8ab7.
* Prove that CI fails when formatting is incorrect.
* Fix formatting
* Remove duplicate E117 rule
* Use flake8 config for precommit
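A hedged example of the kind of config this describes (rule codes are illustrative, not the repo's exact file); pre-commit's standard `pycqa/flake8` hook picks this file up automatically, so the same rules run locally and in CI:

```ini
# .flake8 -- illustrative, not the repo's actual config
[flake8]
# ignore the most frequent violations instead of rewriting every file
extend-ignore = E501
# E71x (comparison) and E72x (bare except / type comparison) checks
# stay enabled; the redundant E117 entry has been dropped
per-file-ignores =
    test/*: E402
```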
---------
Co-authored-by: waifairer <waifairer@gmail.com>
* Fix max nan
* Adds nan check option to max function
* Calls to max can pass in "ignore_nan=True" argument
* Added max nan CI tests
* Turned off due to the need for granularity
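A hedged sketch of the described option (illustrative, not tinygrad's implementation): with `ignore_nan=True`, NaNs are dropped before the reduction, mirroring `numpy.nanmax`.

```python
import math

def max_(values, ignore_nan=False):
    # with ignore_nan=True, drop NaNs before reducing (like numpy.nanmax)
    if ignore_nan:
        values = [v for v in values if not math.isnan(v)]
    return max(values)

assert max_([1.0, float("nan"), 3.0], ignore_nan=True) == 3.0
```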
* models matrix
* fix typo and install gpu deps
* install llvm deps if needed
* fix
* testops with cuda
* remove pip cache since it doesn't work
* cuda env
* install cuda deps
* maybe it will work now
* i can't read
* all tests in matrix
* trim down more
* opencl stuff in matrix
* opencl pip cache
* test split
* change cuda test exclusion
* test
* fix cuda maybe
* add models
* add more n=auto
* third thing
* fix bug
* cache pip more
* change name
* update tests
* try again cause why not
* balance
* try again...
* try apt cache for cuda
* try on gpu:
* try cuda again
* update packages step
* replace libz-dev with zlib1g-dev
* only cache cuda
* why error
* fix gpuocelot bug
* apt cache err
* apt cache too slow?
* opt and image in single runner
* add a couple n=autos
* remove test matrix
* try cuda apt cache again
* libz-dev -> zlib1g-dev
* remove -s since not supported by xdist
* the cache takes too long and doesn't work
* combine webgpu and metal tests
* combine imagenet to c and cpu tests
* torch tests with linters
* torch back by itself
* small windows clang test with torch tests
* fix a goofy windows bug
* i'm dumb
* bro
* clang with linters
* fix pylint error
* linter doesn't work on windows
* try with clang again
* clang and imagenet?
* install deps
* fix
* fix quote
* clang by itself (windows too slow)
* env vars for imagenet
* cache pip for metal and webgpu tests
* try torch with metal and webgpu
* doesn't work, too long
* remove -v
* try -n=logical
* don't use logical
* revert accidental thing
* remove some prints unless CI
* fix print unless CI
* ignore speed tests for slow tests
* clang windows in matrix (ubuntu being tested in imagenet->c test)
* try manual pip cache
* fix windows pip cache path
* all manual pip cache
* fix pip cache dir for macos
* print_ci function in helpers
* CI as variable, no print_ci
* missed one
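A hedged sketch of the helpers change (the exact definition in tinygrad may differ): a module-level `CI` flag replaces the `print_ci` wrapper, so call sites just branch on it.

```python
import os

# truthy only when running under CI (GitHub Actions sets CI=true)
CI = os.getenv("CI", "") != ""

if not CI:
    print("verbose output that only shows up locally")
```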
* cuda tests with docker image
* remove setup-python action for cuda
* python->python3?
* remove -s -v
* try fix pip cache
* maybe fix
* try to fix pip cache
* is this the path?
* maybe cache pip
* try again
* create wheels dir
* ?
* cuda pip deps in dockerfile
* disable pip cache for clang
* image from ghcr instead of docker hub
* why is clang like this
* fast deps
* try use different caches
* remove the fast thing
* try with lighter image
* remove setup python for cuda
* small docker and cuda fast deps
* ignore a few more tests
* cool docker thing (maybe)
* oops
* quotes
* fix docker command
* fix bug
* ignore train efficientnet test
* remove dockerfile (docker stuff takes too long)
* remove docker stuff and normal cuda
* oops
* ignore the tests for cuda
* does this work
* ignore test_train on slow backends
* add space
* llvm ignore same tests as cuda
* nvm
* ignore lr scheduler tests
* get some stats
* fix ignore bug
* remove extra '
* remove and
* ignore test for llvm
* change ignored tests and durations on all backends
* fix
* and -> or
* ignore some more cuda tests
* finally?
* does this fix it
* remove durations=0
* add some more tests to llvm
* make last pytest more readable
* fix
* don't train efficientnet on cpu
* try w/out pip cache
* pip cache seems to be generally better
* pytest file markers
* try apt fast for cuda
* use quick install for apt-fast
* apt-fast not worth it
* apt-get to apt
* fix typo
* suppress warnings
* register markers
* disable debug on fuzz tests
* change marker names
* apt update and apt install in one command
* update marker names in test.yml
* webgpu pytest marker
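A hedged sketch of marker registration using real pytest hooks (the marker name follows the `webgpu` one above): registering in `conftest.py` keeps `--strict-markers` and the "unknown mark" warnings quiet.

```python
# conftest.py
def pytest_configure(config):
    config.addinivalue_line("markers", "webgpu: tests that need a WebGPU device")

# test_webgpu.py
import pytest

@pytest.mark.webgpu
def test_compiles():
    assert True
```

Run `pytest -m webgpu` to select these tests, or `pytest -m "not webgpu"` to skip them.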
* demo for some reason doesn't work on my device and throws the error "Error: GPUPipelineError: [Invalid ShaderModule] is invalid" inside the setupNet func
* because of that, JS halts execution of the rest of the code and the screen shows "loading..." forever
* added a try/catch here to report the error in a proper way