tinygrad

Commit Graph

Author	SHA1	Message	Date
chenyu	8e22c0d95c	everything can jit now (#2338)	2023-11-16 23:54:57 -05:00
Friedrich Carl Eichenroth	a8875bd770	add types to lazy (#2327) Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>	2023-11-16 20:48:41 -08:00
George Hotz	1d5501594e	force rebuild of ocelot (#2334) * force rebuild of ocelot * SzymonOzog gpuocelot * delete that * downgrade that * non parallel * force rebuild * use llvm * nauto * less mem maybe * print test * helper_test_exception skip CUDACPU * helper_test_exception * shippable	2023-11-16 20:44:14 -08:00
imaolo	0d0c74bac9	Assert for memory allocation failures (#2337) * assert adequate memory has been freed * cleaned up runtime error message * improved metal buffer alloc error catching and reporting * decreased lines and altered messages * removed unnecessary _get_cur_free_space() call * improved assert message * added allocate massive buffer test * added test_lru_allocator_metal_max_buffer_length * split into two asserts and removed walrus assignment from assert expression * update assert message and use byte data type for clarity	2023-11-16 20:14:16 -08:00
chenyu	aa01a63b3f	cleanup of lines / unused / types (#2336)	2023-11-16 21:15:32 -05:00
chenyu	3971259832	fix test_real_world llama (#2335)	2023-11-16 19:50:08 -05:00
chenyu	3b9dd3330c	add device to beam search cache key (#2333)	2023-11-16 18:35:08 -05:00
Friedrich Carl Eichenroth	75676ab8e1	Profiling-helper (#2321) * change profiler * remove unused imports * remove unused imports * change lazybuffer references * remove unused line * remove unused import * remove unused stuff * add types * typing * typing * typing * trigger actions * -1 loc * fixup * trigger actions * revert lazy typing changes * WIP profiler helper * replace old start & stop profiler * fixup * linting * Update llama.py --------- Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>	2023-11-16 14:15:56 -08:00
mmmkkaaayy	8235da11dd	whisper: support batch inference, add librispeech WER test (#2074) * whisper: support batch inference, add librispeech WER test, add kv caching and JIT * remove JIT_SUPPORTED_DEVICE --------- Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>	2023-11-16 13:50:08 -08:00
George Hotz	3baaf298d6	two stage cumsum in tensor.py (#2331) * two stage cumsum in tensor.py * 2 more kernels for llama cumsum * gpt-2 and llama use fast multinomial	2023-11-16 12:09:53 -08:00
chenyu	163b2bc26a	wgpu.utils._device -> wgpu.utils.device (#2330) * wgpu.utils._device -> wgpu.utils.device * can i do this? * no need to specify metal	2023-11-16 12:52:13 -05:00
chenyu	27f4c26312	fix getitem slice when end < start (#2329)	2023-11-16 11:20:27 -05:00
chenyu	822d6e6f18	Simpler mops verify (#2325) * rewrite the to_movement_ops check using symbolic * tweak	2023-11-15 21:47:18 -05:00
George Hotz	ef67d7ff5d	shapetracker whitespace	2023-11-15 15:24:09 -08:00
chenyu	a98511561c	fuzz_linearizer same api for interpreted and compiled (#2320)	2023-11-15 17:40:22 -05:00
George Hotz	294e71de15	remove lines (unused code) (#2319) * remove lines * uhh, i'm tired * that function never worked * types for ast_parse	2023-11-15 14:36:11 -08:00
George Hotz	628365eab6	JIT cleanups (#2317) * cleanup cleanup * dedup update_stats	2023-11-15 13:34:52 -08:00
forcefieldsovereign	b64738e1d6	Remove AS_STRIDED from shapetracker (#2216) * very close * remove comment * negative strides working * almost everything passes * calculate offset with list comprehension * some cleanup * got disk load working * review suggestions * fix after merge * overlap working * did it * clean * fixed disk load * lint * mypy * removed as_strided * trying without simplify * added back simplify * make sure expanding to smaller shape * cleanup * removed comment * removed env file * trying whisper test again * onnx test sqlite issue * working on test * finished test * eliminate unnecessary shrink-then-pad * don't shrink buffer * added strides check * added to ci under linters * switch issue * allow symbolic stride * removed .env * isinstance * adjust strides for double expand * cleanup * needed to add type hint for mypy * set pythonpath	2023-11-15 15:50:17 -05:00
Marcello Fuschi	b8d460d203	Add Tensor.multinomial (#2295) * add Tensor.multinomial only with replacement * add support for 2D input in Tensor.multinomial * fix multinomial output shape * allow passing replacement=False to Tensor.multinomial when num_samples=1 * improve tests for Tensor.multinomial * fix edge case in Tensor.multinomial * Tensor.multinomial no more staticmethod	2023-11-15 11:38:39 -08:00
taher	cb6cfcc8f8	add icb support check for metal device (#2313)	2023-11-15 11:37:28 -08:00
George Hotz	70a65c201e	JIT support in Interpreted (#2314) * factor that out * jit is supported everywhere * fix some tests * there's no jit supported device, the jit is everywhere * fix test uops	2023-11-15 11:13:38 -08:00
chenyu	9a20bc08d6	Tensor(None) is Tensor([]) (#2316)	2023-11-15 13:49:18 -05:00
chenyu	f1f863c953	allow 0-dim array to broadcast into zero shape tensor (#2315) * allow 0-dim array to broadcast into zero shape tensor * not in	2023-11-15 13:12:21 -05:00
George Hotz	4da2ddea6e	Interpreted cleanups (#2312) * move the compiler out of ops * don't return realized * var_vals filter, fix custom * typing	2023-11-15 09:02:23 -08:00
chenyu	123a0b86b2	support zero in shape (#2303) * zero in shape start * no assert for that * if output size is 0, return without exec * tweak * strides * reduce over non-zero * shrink and expand * fix import * test_elementwise where * cannot reshape from size 0 to size 1 * compiled backend reduce over 0 * zeros for numpy * reduce over 0 and keepdim resulted in 1 * reduce empty set default values * compare with same input * pad test case * cat test case * torch does not support that?	2023-11-15 11:57:48 -05:00
qazal	f113a0b83b	dtype promotion priorities (#2311)	2023-11-15 07:19:52 -08:00
geohotstan	3c5a51fb3a	aaaaaaa finally (#2310)	2023-11-15 07:12:38 -08:00
kormann	cff8375aa2	make self referential AST fast too (#2278) * cleanup * linter * linter * linter * rm .buffers * linter * linter * huh? * cleanup * typo * min diff * property * rev * linter * no matel hack * minimal properties * line * checkout master * copy_to_device * idk * revert * type * type * faast * speed test * cleanup test * softer test * monotonic * harder test * clean code * cleanup	2023-11-15 07:12:07 -08:00
George Hotz	4f7b1ac0d2	cleanups before interpreted jit (#2306) * jit mnist * InterpretedFlopCounter doesn't rely on Interpreted * allocator for cpu and torch * types for exec_ast * fix type issues * fix onnx, remove print * always self.from_underlying	2023-11-14 21:44:25 -08:00
mmmkkaaayy	91546225f4	Add cache step for model weights in CI, re-enable whisper test (#2307)	2023-11-14 21:16:04 -08:00
chenyu	175cdbe815	fix pad None will value (#2308)	2023-11-14 23:57:05 -05:00
George Hotz	01f8781c26	fix CI (#2300) * might work * might work 2 * might work 3 * sneak that in to llama too * pin them all	2023-11-14 11:02:59 -08:00
nimlgen	4e0d47533e	beam works with var vals (#2296) * beam works with var vals * test passes now * better comment * linter happy	2023-11-14 13:03:19 -05:00
chenyu	fac8633ba8	explicit opts for test_linearizer_failures (#2299) * explicit opts for test_linearizer_failures * typo * update the invalid check	2023-11-14 11:52:38 -05:00
George Hotz	8916028ddd	move BatchExecutor (#2297) * move BatchExecutor * refactor to get_optimized_program * that changed	2023-11-14 08:08:51 -08:00
George Hotz	0cbf6c1811	move things, clean up extra (#2292) * move things * idk why pylint needs that now * delete unused	2023-11-13 20:18:40 -08:00
George Hotz	b1f7f29525	metal indirect command buffers (#2285) * metal indirect command buffers * sub 1ms gpt * metal batch exec is good * remove whitespace * input_replace * fix ci * useResources * very simple cacheallocator * update_stats * fix CI * minor * remove that from jit	2023-11-13 17:58:26 -08:00
chenyu	d86ea188dd	support symbolic shape in Interpreted (#2289) * support symbolic shape in Interpreted * simpler * no InterpretedFlopCounter * tragic NumNode * regex is hard	2023-11-13 20:13:18 -05:00
George Hotz	6960bcded0	back to 6.54GB for stable diffusion (#2288) * back to 6.54GB for stable diffusion * cleanups * only outputs, not inputs * err, restore hack for world	2023-11-13 16:50:04 -08:00
nimlgen	960535dfb8	get_linearizer_actions does not return illegal actions (#2287) * fix some linearizer failures * linter happy * no new test class	2023-11-13 11:48:54 -05:00
rodfer	53c5baa8b6	add dilation to avg_pool2d (#2270) * add dilation to avg_pool2d * avg_pool_fix * avg_pool_fix * woo * oops * force it correct --------- Co-authored-by: rodfer0x80 <rodfer0x80@proton.me> Co-authored-by: zibokapi <zibokapi@gmail.com> Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>	2023-11-13 08:47:56 -08:00
chenyu	a72b370066	llama take int and convert to Variable internally (#2284)	2023-11-12 17:11:37 -05:00
valar	123ea051e6	refactor/ci: delete many `# type: ignore` (#2281) * refactor/ci: delete many `# type: ignore` * replace `axis.__class__ is int` with `isinstance(axis, int)` to make mypy happy * add `--warn-unused-ignores` to mypy flag refs #2240 * ci: move `--warn-unused-ignores` flag to mypy config refs #2240	2023-11-12 11:04:20 -08:00
George Hotz	2e2154ae4f	bad hotfix for optimize_local_size, try again	2023-11-12 10:41:11 -08:00
George Hotz	270f747065	hotfix optimize_local_size (TODO: add regression test)	2023-11-12 10:29:00 -08:00
chenyu	f5a62a1b42	fix some tests related to JitItem (#2279)	2023-11-11 23:00:35 -05:00
chenyu	5ef8d682e3	clean up attentions in stable diffusion (#2275)	2023-11-11 14:25:36 -05:00
chenyu	453f48ce02	pad None means (0,0) (#2273)	2023-11-11 09:50:26 -08:00
jxdv	c5d70c1871	typo (#2271)	2023-11-11 07:18:04 -08:00
chenyu	880e693207	fix llama n_kv_heads in kvcache (#2267) * fix llama n_kv_heads in kvcache * trigger ci	2023-11-10 21:44:39 -05:00

3006 Commits

All Branches