Commit Graph

519 Commits

Author SHA1 Message Date
George Hotz 3d445039c2 hotfix: 8800 lines for AMX+intel tc 2024-08-06 17:50:26 -07:00
chenyu adba5efc64
enable llama 2 70B in tinybox green CI (#5905)
runnable with MAX_CONTEXT=256
2024-08-04 18:48:46 -04:00
George Hotz 7348c40d9d
sampling time sync (8700 lines) (#5843)
* sampling time sync

* jitter matrix

* comment

* pass mypy

* line count
2024-08-02 14:44:35 -07:00
wozeparrot acadccf344
comma benchmark (#5518) 2024-08-02 14:36:54 -07:00
chenyu f27f949a5d
Revert "revert some UOp IDIV bound (#5863)" (#5871)
This reverts commit 0c8d202348.
2024-08-01 21:38:31 -04:00
chenyu df138bc558
Revert "revert a mod pattern (#5864)" (#5870)
This reverts commit 5c8de2d044.
2024-08-01 20:44:26 -04:00
chenyu 1b0314d9ef
Revert "remove one more UOp mod pattern (#5865)" (#5868)
This reverts commit b03b8e18c2.
2024-08-01 20:28:35 -04:00
chenyu b03b8e18c2
remove one more UOp mod pattern (#5865)
fixed UOP_IS_SYMBOLIC=1 test_failure_40
2024-08-01 18:29:04 -04:00
chenyu 5c8de2d044
revert a mod pattern (#5864)
fixed UOP_IS_SYMBOLIC=1 linearizer failure 47
2024-08-01 17:24:26 -04:00
chenyu 0c8d202348
revert some UOp IDIV bound (#5863)
* revert some UOp IDIV bound

breaks conv with UOP_IS_SYMBOLIC; added some conv tests in CI

* those are correct

* skip slow ones
2024-08-01 15:09:06 -04:00
George Hotz 5eedd9e3ad raise the line ceiling to 8600. USE LINES CAREFULLY 2024-07-31 09:56:39 -07:00
wozeparrot eebb1b9922
feat: temperature 0 llama3 benchmark (#5806) 2024-07-30 12:05:36 -07:00
chenyu cb6718347f
`python -m mkdocs build --strict` in CI (#5800) 2024-07-29 16:46:30 -04:00
chenyu be3899d211
hotfix increase ci timeout to 20 minutes (#5799)
when the cache is cleared, it takes time to repopulate it
2024-07-29 16:25:27 -04:00
chenyu 471b188d79
fix mypy errors in latest mypy (#5794)
* fix mypy errors in latest mypy

mypy has stricter partial and api arg checks now

* PYTHONPATH="."
2024-07-29 14:53:30 -04:00
George Hotz 0392123e6e
TC=2 still sets tensor cores (and TC=3 support for locals) (#5780)
* TC=2 still sets tensor cores

* add TC=3 support for using locals

* bugfix

* lines + TC=3 tests

* CUDA can use threads, fix fuzz linearizer
2024-07-28 16:16:53 -07:00
qazal 3e49d86c01
process replay diffs 3 things now (#5731)
* github api infra

* process replay is 3 parts now

* parse benchmarks

* add gh_token

* complete diff

* move process replay tests

* last successful run

* add tempdir

* skip master
2024-07-27 12:52:20 +03:00
qazal 57b4a8e98d
assert process replay asserts (#5737)
* assert process replay asserts

* one ci job is fine

* test: Revert "separate process replay main loop (#5734)"

This reverts commit 94d578396f.

* mac sed needs that

* Revert "test: Revert "separate process replay main loop (#5734)""

This reverts commit e4ad7684d5472a64841a66b43bc1db7c9bbbf9e8.

* disable process replay capture

* save time

* amd is tiny

* send to /dev/null
2024-07-27 12:07:50 +03:00
George Hotz db1d093b29
reenable LLaMA-3 8B BEAM on NV (#5746) 2024-07-26 16:56:41 -07:00
chenyu eff7c5fd2c
halve kernel counts in metal Fuzz Test linearizer (#5716)
the test time has increased to 3 minutes
2024-07-25 14:35:11 -04:00
chenyu 7c8fe0fe47
skip interpolate tests for PYTHON=1 (#5664) 2024-07-23 18:47:15 -04:00
George Hotz e3f00ac77d
Fix cuda tc emu test (#5663)
* fix acc folding for NV tensor cores

* fix correctness of reduce_before_expand

* fix test emulated CUDA tensor cores

* test_gemm_fp16 on some devices
2024-07-23 15:04:25 -07:00
qazal fdfc0015a7
[run_process_replay] for opencl/openpilot (#5009)
* lil reset script

* find the prg

* use lower_schedule_item

* add process replay back

* cleanups
2024-07-18 19:42:33 +03:00
wozeparrot 6ccb2390c3
feat: update_benchmark_staging (#5529) 2024-07-17 20:40:57 -07:00
George Hotz d3b098299d
add failing regression test for image (#5540)
* add failing regression test for image

* tg type

* simpler test

* don't realize image-to-image casts (caused issue)

* simple pad
2024-07-17 17:27:18 -07:00
wozeparrot 218e157f00
benchmark on update_benchmark_staging (#5541) 2024-07-17 17:11:52 -07:00
Alessandro Benetti 13e200b437
add strict mkdocs check (#5497) 2024-07-15 14:21:37 -07:00
qazal 40ec9410f9
simpler process replay (#5452)
* remove check_process_replay

* that can go to the top

* add assert back

* [run_process_replay]

* checkout code [run_process_replay]

* temp [run_process_replay]

* revert temp [run_process_replay]

* ahh this is why [run_process_replay]

* revert temp [run_process_replay]
2024-07-13 19:55:06 +03:00
George Hotz 955e1179fb
move compile tests and merge (#5451)
* move compile tests and merge

* revert enet move, bump download cache

* oh, try setting clang
2024-07-13 08:04:46 -07:00
chenyu 9a187e6102
fix handcode_opt script (#5435)
* fix handcode_opt script

* run in ci

* real run in ci

* HALF=0
2024-07-12 20:52:28 -04:00
George Hotz b055ece550 hotfix: bump to cache gpuocelot 2024-07-12 13:54:14 -07:00
chenyu b17e4adb3a
add `-c advice.detachedHead=false` to process replay git checkout (#5419)
remove the noisy `Note: switching to 'origin/master'.
You are in 'detached HEAD' state. You can look around, make experimental
changes...` message from the log
2024-07-12 15:13:26 -04:00
qazal 31fcc516dc
more process replay tooling (#5407)
* replays

* what's in there

* can it be up there

* sha is enough

* insert sha as the key

* fix str

* update reset utils

* that nested try/except was terrible

* github_context can go
2024-07-12 13:11:34 +03:00
Roelof van Dijk 6ec7dbc287
ci: parallelize uops tests (#5405) 2024-07-12 11:22:41 +03:00
qazal b91a0ccdc3
make [run_process_replay] [no_assert] the default (#5390) 2024-07-11 22:36:59 +03:00
qazal 004366b193
context aware process replay [run_process_replay] (#5378)
* test tc as ctx var

* remove from opts

* process replay

* pop variable

* B -> Variable

* fix re-assign

* pop temp vars

* move TRANSCENDENTAL=2
2024-07-11 13:07:28 +03:00
chenyu 2396ab9b33
more transcend cleanup [run_process_replay] (#5369)
fix test name, fewer `# noqa: E501`, and removed the cast
2024-07-10 23:05:03 -04:00
chenyu 64986f949c
more transcend math tests in ci (#5368)
* more transcend math tests in ci

test large inputs to trig functions that hit a different reduction algorithm, and test TRANSCENDENTAL=2 for all backends

* no CUDACPU

* try that
2024-07-10 21:19:09 -04:00
chenyu 322c37e621
use helpers.JIT in llama and gpt2 examples (#5350)
* use helpers.JIT in llama and gpt2 examples

replaced getenv("JIT"), effectively making gpt2 use the JIT by default

* fix test_gpt2
2024-07-09 15:04:43 -04:00
Ian Paul d5a68ae6b3
Simple abstractions3.py fix (#5343)
* abstractions3.py fix

* Add abstractions3.py to CI tests
2024-07-09 13:48:42 +03:00
chenyu 631bc974a0
raise line count limit to 8500 (#5331) 2024-07-08 14:00:28 -04:00
SnakeOnex 8c03816ae9
fix README example (#5284)
* fixed README example

* README test

* changed py -> python markdown code flags in README
2024-07-04 11:15:07 -04:00
chenyu 191463a919
add timing to SDXL (#5273) 2024-07-02 23:29:54 -04:00
chenyu 5808c37302
hotfix disable flaky llama3 beam benchmark on green (#5249) 2024-07-01 15:00:47 -04:00
chenyu b9122ecdaf
revert stable diffusion validation with threefry (#5248)
* Revert "use threefry in stable diffusion benchmark (#4988)"

This reverts commit 44dfa37c70.

* sdxl and validation fix

* relax threshold
2024-07-01 14:43:47 -04:00
nimlgen 57e89645cd
hcq spec test (#5226)
* start hcq spec test

* more test

* fixes

* run on amd as well

* test amdgpu exec

* fix amd

* amd mockgpu support sdma timestamp
2024-07-01 17:36:37 +03:00
chenyu 88763eb9ff
fix stable_diffusion with fp16 (#5239) 2024-06-30 12:59:31 -04:00
nimlgen dd7eef7d71
libc defs to autogen (#5217)
* libc defs to autogen

* amd import libc

* linter

* better a bit

* remove comment, check this

* not hardcoded path
2024-06-29 14:37:33 +03:00
nimlgen 6b08cb5e38
ptx runs on nv in benchmarks (#5224) 2024-06-29 11:06:44 +03:00
nimlgen b4c49ae3fa
remove cudacpu in favour of mockgpu (#5225)
* remove cudacpu in favour of mockgpu

* remove unused import

* not used as well
2024-06-29 11:05:16 +03:00