Commit Graph

6388 Commits

qazal 7451812bbf
delete AST_REWRITE ctx var (#6995) 2024-10-11 11:33:16 +03:00
qazal 7988547df2
start changes from big graph (#6993)
* start changes from big graph [pr]

* space

* still capture ctx
2024-10-11 11:13:46 +03:00
George Hotz e7a0ffe46a
break out linearization [pr] (#6994) 2024-10-11 15:27:33 +08:00
George Hotz f319530191
don't track simplify [pr] (#6992) 2024-10-11 15:03:03 +08:00
George Hotz e441794c4b
remove custom op support, we waste time maintaining this (#6991)
* remove custom op support, we waste time maintaining this

* customop is over
2024-10-11 14:31:09 +08:00
George Hotz c08521e823
minor cleanups from toonygrad (#6990) 2024-10-11 14:19:10 +08:00
George Hotz f50d0e0ee0
cloud device [pr] (#6964)
* first try at cloud device [pr]

* real separation

* we're free

* clang works

* unhappy with timeout

* better timeouts and free

* unrelated

* use http verbs + add test

* lines + better test

* fix DELETE

* shorter cloud

* split key

* fix sending renderer

* PTXRenderer serialization

* add sessions

* http.client

* minor timeout bump

* fix keep-alive

* inc server timeout

* real fix timeout

* that one too
2024-10-11 12:24:06 +08:00
Bhavya Gada 23c09f4b4c
add support for padding='same' in nn.conv (#6975)
* add support for padding='same' in nn.conv

* express concisely

* simplify loop

* test same padding with dilation and conv1d

* fix bad indentation

* make loop one liner
2024-10-11 11:39:07 +08:00
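For 'same' padding, the convention is to pad so that at stride 1 the output spatial size equals the input size. A minimal sketch of the per-side padding arithmetic — a hypothetical helper for illustration, not tinygrad's actual implementation:

```python
def same_padding(kernel_size: int, dilation: int = 1) -> tuple[int, int]:
    # effective kernel extent once dilation is applied
    eff = dilation * (kernel_size - 1) + 1
    # total padding needed so output size == input size at stride 1
    total = eff - 1
    # split as evenly as possible; the extra unit goes on the right/bottom
    return total // 2, total - total // 2
```

For example, a 3-wide kernel needs (1, 1), and with dilation=2 its effective extent is 5, so it needs (2, 2); an even kernel size splits unevenly, e.g. 4 gives (1, 2).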
qazal 54dcea235d
viz auto recenter on out of view graph [pr] (#6986) 2024-10-11 02:40:06 +03:00
nimlgen 159ee04489
include qcom in view_supported_devices (#6985)
* include qcom in view_supported_devices

* ignore images
2024-10-11 01:10:51 +03:00
nimlgen f9d454aed5
correct kernargs alignment (#6984) 2024-10-11 00:06:28 +03:00
qazal 2b17279d4e
viz don't default open the browser [pr] (#6983)
* viz don't default open the browser [pr]

* move st

* scale down
2024-10-10 22:12:18 +03:00
qazal 4f60252210
reduce scheduler process replay overhead [pr] (#6981) 2024-10-10 20:03:38 +03:00
Friedrich Carl Eichenroth 859d6d0407
Fix mypy examples/beautiful_*.py (#6978)
* fix mypy examples/beautiful_*.py

* backwards

* add test

* Revert "add test"

This reverts commit 4d88845ba3f24d83621da0abf55096553abda7fa.

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2024-10-10 11:34:29 -04:00
qazal 4ef5310039
track viz context even if rewrite errors [pr] (#6976) 2024-10-10 18:33:15 +03:00
chenyu 592e5f1df2
skip test_viz test_no_dedup_different_opts (#6979) 2024-10-10 11:10:24 -04:00
chenyu e3dc10f8f6
improve fold_unrolled_divs (#6977)
addressed #6935
the first few terms in fold_unrolled_divs might have been folded already, so the check should first try adding those terms back. there is a case where all but one term is folded, which is no longer an add chain, so it is just added as a failed test case for now
2024-10-10 10:52:05 -04:00
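The rewrite this targets rests on Hermite's identity: an unrolled division expands into the add chain `(x+0)//n + (x+1)//n + ... + (x+n-1)//n`, which folds back to plain `x`. A quick numeric check of that identity (my reading of the fold, not the tinygrad rewrite code itself):

```python
def unrolled_div_sum(x: int, n: int) -> int:
    # the add chain an unrolled division produces
    return sum((x + i) // n for i in range(n))

# by Hermite's identity, this sum equals x for any integer x and n >= 1
```

If the first few terms were already simplified away, the chain no longer matches this shape directly, which is why the check needs to add those terms back first.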
qazal 3481468702
bring viz to core (#6970)
* move viz to core

* pathfix

* move test_viz to core

* cleanup test_viz diff

* use contextvars
2024-10-10 16:56:26 +03:00
nimlgen fad575ec76
qcom tiny cleanups (#6973) 2024-10-10 12:26:41 +03:00
qazal 3724a66716
move test_viz to test/, prereq for tinygrad/viz [pr] (#6972) 2024-10-10 11:40:46 +03:00
Kinvert 960c495755
added beautiful fashion mnist and example (#6961)
* added beautiful fashion mnist and example

* fixing whitespace

* refactor Fashion MNIST to fewer lines

* fix newline to reduce diff

* Update beautiful_mnist.py

* Update beautiful_mnist.py

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2024-10-10 12:01:07 +08:00
chenyu b5546912e2
10% more TRAIN_STEPS for bert (#6971)
got two very close runs; adding more steps as a buffer
2024-10-09 19:21:43 -04:00
nimlgen f90d8493cc
add HCQDEV_WAIT_TIMEOUT_MS (#6968) 2024-10-09 19:50:00 +03:00
chenyu 35cf48659b
limit beam param for bert on green (#6966)
seems to mitigate the crash
2024-10-09 11:48:18 -04:00
mesozoic-egg 0e8bcda07e
get readable error from wait_check (#6965)
Co-authored-by: Mesozoic Egg <mesozoic.egg@proton.me>
2024-10-09 17:28:58 +03:00
qazal 20d3c2d113
unify UOps.SHAPETRACKER and UOps.SWIZZLE with UOps.VIEW (#6955)
* add UOps.VIEW

* update hardcoded asts

* update sops.gz
2024-10-09 02:00:17 +08:00
nimlgen 137ad5519f
amd fix cwsr for gfx11 (#6950)
* amd cwsr

* ()
2024-10-08 17:44:29 +03:00
nimlgen 0d526e251e
nv sync on gpu before local update (#6954) 2024-10-08 17:43:58 +03:00
qazal 2800520dd5
even smaller process_replay.py [pr] (#6941)
* even smaller process_replay.py [pr]

* delete those tests

* dedup asts
2024-10-08 20:43:22 +08:00
qazal 851f39653a
rename to BUFFER_VIEW + MetaOps cleanup (#6953) 2024-10-08 20:09:22 +08:00
chenyu 1ff2c98f8a
fix logfile name for bert red (#6952) 2024-10-08 05:37:52 -04:00
czhu 08bfa8632b
embedding shape (#6930) 2024-10-08 14:42:20 +08:00
vladov 20a9683403
Make self.fd Optional. (#6855)
* Make self.fd Optional.

* Fix io_uring when missing fd.

* Compress io_uring fast path code.
2024-10-08 13:25:34 +08:00
chenyu a78c96273a
update bert epoch logging (#6940)
* update bert epoch logging

epoch for bert is simply the number of examples seen (which is used for the RCP check)

* update total steps too

* more changes
2024-10-08 00:34:06 -04:00
George Hotz 0498e846a5
break out metaops (#6948) 2024-10-08 12:08:54 +08:00
nimlgen 42609300ff
hcq no timeline signals in init (#6944) 2024-10-07 23:36:19 +03:00
qazal 0ecc417dd2
prep for viz move to core [pr] (#6938)
* prep for viz move to core [pr]

* polish
2024-10-07 23:24:04 +08:00
chenyu e4c0743188
failed example for logcumsumexp (#6936)
need cummax for numerical stability
2024-10-07 10:55:45 -04:00
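The instability here: naive `log(cumsum(exp(x)))` overflows as soon as any `x` is large. The standard fix tracks a running maximum so `exp` only ever sees non-positive arguments. A pure-Python sketch of that running log-add-exp (illustrative only — not the tinygrad kernel):

```python
import math

def logcumsumexp(xs: list[float]) -> list[float]:
    # out[i] = log(exp(x0) + ... + exp(xi)), computed without
    # exponentiating large values directly
    out, acc = [], -math.inf
    for x in xs:
        m = max(acc, x)  # running max keeps exp() arguments <= 0
        acc = m + math.log(math.exp(acc - m) + math.exp(x - m))
        out.append(acc)
    return out
```

With inputs around 1000 the naive form returns inf, while this stays finite: `logcumsumexp([1000.0, 1000.0])[1]` is `1000 + log(2)`.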
chenyu 102dfe5510
back to 2**10 for bert loss scaler (#6934)
got 2 NaNs with this; reverting back to 2**10
2024-10-07 10:17:21 -04:00
qazal 9250452da4
no codegen import in ops [pr] (#6888)
* no codegen import in ops [pr]

* @track_rewrites

* all functions need this

* polish
2024-10-07 20:54:21 +08:00
George Hotz f7f94cd62f
bitcast cleanup [pr] (#6933) 2024-10-07 19:16:16 +08:00
chenyu 0cf815a93a
bert use BS=66 and update hparams (#6932)
with the dropout memory improvement, we can fit BS=66 now. also reverts to the hparams in #5891
2024-10-07 05:08:27 -04:00
ignaciosica 32ac24c45c
Generic wmma rendering for cuda, ptx [run_process_replay] (#6838)
* generic wmma rendering for cuda, ptx

- also adds wmma generic shape ops_python support

* hotfix: fixed values in ops_python

* hotfix: more fixed values

* hotfix: revert changes in ops_python

* refactor wmma rendering

* hotfix: get n_args directly

* hotfix: use n_args[0] for a

* hotfix: simplify

* hotfix: add args_slices

* hotfix: rename args back to operands

* hotfix: fix spacing

* hotfix: rename upc to sz

* hotfix: rename args to operands in assembly

* hotfix: space

* hotfix: add comment for literal 4

* hotfix: rename some variables and change for clarity
2024-10-07 16:36:36 +08:00
qazal b82023c97e
process replay cleanup to generic _pmap [pr] (#6929)
* process replay cleanup to generic _pmap [pr]

* delete `COMPARE_SCHEDULE`
2024-10-07 13:57:05 +08:00
qazal 16312b4c59
rip out old scheduler process replay stuff, diff pure UOps [pr] (#6927) 2024-10-07 13:20:35 +08:00
chenyu 999e3780e9
dropout contiguous after >= p (#6892)
make it a bool buffer
2024-10-06 19:40:42 -04:00
wozeparrot 9eb6eef441
seed in tensor (#6869) 2024-10-06 14:46:58 -04:00
Tobias Fischer f9e32f2bb2
clip device fix (#6924) 2024-10-07 00:47:32 +08:00
chenyu 01a2d7316d
dtype=float in bert log_softmax for loss and accuracy (#6916) 2024-10-06 11:15:56 -04:00
jeffzh4ng 19a7e41113
implement logcumsumexp (#6921)
* implement logcumsumexp

* change axis=None to axis=0
2024-10-06 10:45:36 -04:00