Commit Graph

30 Commits

chenyu 507e0afba0
fix onehot and jit in examples/transformer (#3073)
trained to 0.999 in < 6 seconds on M1 Max consistently
2024-01-10 02:22:41 -05:00
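A hedged sketch of the two pieces this fix touches, one-hot targets and a TinyJit-compiled step; the imports assume the current top-level tinygrad layout, and the helper names are illustrative, not the commit's actual diff:

```python
import numpy as np
from tinygrad import Tensor, TinyJit

def one_hot(labels: np.ndarray, num_classes: int) -> Tensor:
  # np.eye row-indexing is a simple way to one-hot integer labels
  return Tensor(np.eye(num_classes, dtype=np.float32)[labels])

@TinyJit
def step(x: Tensor, y: Tensor) -> Tensor:
  # TinyJit caches the compiled kernels after warm-up calls;
  # input shapes must stay fixed across invocations
  return ((x - y) ** 2).mean().realize()
```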
George Hotz ae83733431 hotfix: examples/transformer.py 2024-01-09 19:28:09 -08:00
George Hotz 0cbf6c1811
move things, clean up extra (#2292)
* move things

* idk why pylint needs that now

* delete unused
2023-11-13 20:18:40 -08:00
George Hotz 718ced296c
move state to nn/state (#1619) 2023-08-22 07:36:24 -07:00
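After this move, state utilities import from tinygrad.nn.state; a minimal sketch assuming the post-#1619 layout:

```python
from tinygrad import Tensor
from tinygrad.nn.state import get_parameters, get_state_dict, safe_save

class TinyModel:
  def __init__(self): self.w = Tensor.ones(4, 4)

model = TinyModel()
print(get_parameters(model))                # flat list of Tensors for an optimizer
safe_save(get_state_dict(model), "/tmp/model.safetensors")
```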
nimlgen d363d25ee2
fix imports for examples/transformer.py (#1136) 2023-07-05 08:15:13 -07:00
Eli Frigo 10f1aeb144
fixed broken link (#1097) 2023-07-02 15:06:59 -07:00
Fernando Vidal 73bd0b217b
add int64 as supported dtype from numpy (#699)
* add int64 as supported dtype from numpy

Without this, examples/transformer.py didn't run. With this change it runs successfully.

* Update helpers.py

* Update transformer.py

* Update training.py
2023-03-18 17:15:04 -07:00
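A minimal sketch of what the int64 fix enables, assuming current tinygrad; int64 is numpy's default integer dtype on Linux and macOS, which is why the example script hit it:

```python
import numpy as np
from tinygrad.tensor import Tensor

idx = np.arange(10)  # dtype int64 by default on Linux/macOS
t = Tensor(idx)      # conversion that previously raised on int64
print(t.numpy())
```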
Jacky Lee 9fd41632c6
Import get_parameters from tinygrad.nn (#559)
* get_parameters is in optim

* Update all imports for get_parameters

* Clean up

* use optim.get_parameters
2023-02-17 15:22:26 -08:00
Jacky Lee f08187526f
Fix examples (#540)
* Fix examples

* Remove training in parameters

* Simplify a bit

* Remove extra import

* Fix linter errors

* factor out Device

* NumPy-like semantics for Tensor.__getitem__ (#506); see the indexing sketch after this entry

* Rewrote Tensor.__getitem__ to fix negative indices and add support for np.newaxis/None

* Fixed pad2d

* mypy doesn't know about mlops methods

* normal python behavior for out-of-bounds slicing

* type: ignore

* inlined idxfix

* added comment for __getitem__

* Better comments, better tests, and fixed bug in np.newaxis

* update cpu and torch to hold buffers (#542)

* update cpu and torch to hold buffers

* save lines, and probably faster

* Mypy fun (#541)

* mypy fun

* things are just faster

* running fast

* mypy is fast

* compile.sh

* no gpu hack

* refactor ops_cpu and ops_torch to not subclass

* make weak buffer work

* tensor works

* fix test failing

* cpu/torch cleanups

* no | (union) operator on dict in python 3.8

* that was junk

* fix warnings

* comment and touchup

* dyn add of math ops

* refactor ops_cpu and ops_torch to not share code

* nn/optim.py compiles now

* Reorder imports

* call mkdir only if directory doesn't exist

---------

Co-authored-by: George Hotz <geohot@gmail.com>
Co-authored-by: Mitchell Goff <mitchellgoffpc@gmail.com>
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2023-02-10 12:09:37 -06:00
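An illustration of the NumPy-like indexing folded into this entry (negative indices and None/np.newaxis); a sketch against current tinygrad, not the PR's own test code:

```python
from tinygrad.tensor import Tensor

t = Tensor.arange(6).reshape(2, 3)
print(t[-1].numpy())      # negative index selects the last row
print(t[None].shape)      # None inserts a leading axis -> (1, 2, 3)
print(t[:, -2:].numpy())  # negative slices behave like numpy
```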
Jacky Lee 799b3f185a
Refactor getenv into helpers (#508)
* Refactor getenv into helpers

* Remove unused os

* Fix default value

* Fix more defaults for CI

* Fix bracket

* Revert changes to openpilot/compile.py

* Use getenv from helpers when possible
2023-01-31 15:09:09 -08:00
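A sketch of the pattern this refactor standardizes, reading tunables through one helper; getenv int-casts the environment value and falls back to the default:

```python
from tinygrad.helpers import getenv

DEBUG = getenv("DEBUG")  # 0 when the variable is unset
BS = getenv("BS", 128)   # explicit default
```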
George Hotz b132de677d
tinygrad.nn (#367)
* tinygrad.nn

* flake8

* working on pylint

* more pylint

* more pylint

* pylint passes

* networkx

* mypy can't infer that type

* junk
2022-08-18 07:41:00 -07:00
George Hotz 99b6051467 add ff_dim to transformer 2021-11-29 12:40:52 -05:00
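ff_dim is the hidden width of the transformer block's feed-forward sublayer; a hedged sketch of the shape it controls, written with today's tinygrad.nn.Linear (which postdates this commit), names illustrative:

```python
from tinygrad import Tensor
from tinygrad.nn import Linear

class FeedForward:
  def __init__(self, embed_dim: int, ff_dim: int):
    self.up = Linear(embed_dim, ff_dim)    # widen to ff_dim
    self.down = Linear(ff_dim, embed_dim)  # project back to embed_dim
  def __call__(self, x: Tensor) -> Tensor:
    return self.down(self.up(x).relu())
```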
George Hotz d3f169b267 move good models to models, add a training step test 2021-06-19 11:24:15 -07:00
Marcel Bischoff 42b4761025
transformer >99.98% test accuracy in ~30s (#230)
* transformer

* BS might divide len(Y_test)

* output when accuracy is high

* more readable

* fixed loss in serious_mnist for new API
2021-01-02 07:45:09 -08:00
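The "BS might divide len(Y_test)" bullet points at a batched-evaluation detail: the batch size may not evenly divide the test set, so the last short slice still has to be scored. A sketch with illustrative names, not the PR's code:

```python
import numpy as np

def accuracy(predict, X_test: np.ndarray, Y_test: np.ndarray, bs: int = 128) -> float:
  correct = 0
  for i in range(0, len(Y_test), bs):  # the final slice may be shorter than bs
    preds = predict(X_test[i:i+bs]).argmax(axis=-1)
    correct += int((preds == Y_test[i:i+bs]).sum())
  return correct / len(Y_test)
```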
George Hotz f9170505b3 if you like your transformers twice as slow, use the GPU 2020-12-29 17:14:23 -05:00
George Hotz 3f8e137b6f extra/transformer 2020-12-29 14:14:00 -05:00
George Hotz bcb3ceeca3 set training in functions 2020-12-28 22:45:46 -05:00
George Hotz 51bf164b72 dropout, training 2020-12-28 22:12:23 -05:00
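A sketch of the training switch these two commits introduce, using the modern Tensor.training flag; dropout randomly zeroes activations and rescales while training, and is the identity at eval:

```python
from tinygrad import Tensor

Tensor.training = True
x = Tensor.randn(4, 8)
print(x.dropout(0.5).numpy())  # ~half zeroed, survivors scaled by 2x
Tensor.training = False        # dropout now passes x through unchanged
```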
George Hotz 7b8fee038d it works! forgot the sqrt 2020-12-28 16:23:52 -05:00
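The forgotten sqrt is almost certainly the attention scale, softmax(QK^T / sqrt(d_k))V: without it the logits grow with head width and the softmax saturates. A minimal sketch assuming current tinygrad ops:

```python
import math
from tinygrad import Tensor

def attention(q: Tensor, k: Tensor, v: Tensor) -> Tensor:
  d_k = q.shape[-1]
  scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)  # the forgotten sqrt
  return scores.softmax(-1) @ v
```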
George Hotz 1faf05ef67 ahh, it's better if i don't train the embedding 2020-12-28 16:07:02 -05:00
George Hotz c3832e1bde hmm, fix layernorm to not be batchnorm and it breaks 2020-12-28 13:06:21 -05:00
George Hotz 2e89e75dcb layernorm fixes transformer instability 2020-12-28 12:58:15 -05:00
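The distinction these two commits wrestle with: layernorm normalizes each sample over its feature axis, while batchnorm normalizes each feature over the batch axis. A from-scratch sketch, illustrative rather than the repo's implementation:

```python
from tinygrad import Tensor

def layernorm(x: Tensor, eps: float = 1e-5) -> Tensor:
  mu = x.mean(axis=-1, keepdim=True)                 # per-sample statistics
  var = ((x - mu) ** 2).mean(axis=-1, keepdim=True)
  return (x - mu) / (var + eps).sqrt()               # axis=0 here would make it batchnorm-like
```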
George Hotz 593233b668 log and exp are first class ops 2020-12-28 10:00:30 -05:00
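With log and exp as first-class ops, numerically stable pieces like log-sum-exp (the heart of logsoftmax) compose directly from primitives; a sketch, not the repo's definition:

```python
from tinygrad import Tensor

def logsumexp(x: Tensor) -> Tensor:
  m = x.max(axis=-1, keepdim=True)  # shift by the max for stability
  return m + (x - m).exp().sum(axis=-1, keepdim=True).log()

def logsoftmax(x: Tensor) -> Tensor:
  return x - logsumexp(x)
```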
Marcel Bischoff ffff98db78
Evaluation in Transformers (#218)
* 2serious

* load/save

* fixing GPU

* added DEBUG

* needs BatchNorm or doesn't learn anything

* old file not needed

* added conv biases

* added extra/training.py and checkpoint

* assert in test only

* save

* padding

* num_classes

* checkpoint

* checkpoints for padding

* training was broken

* merge

* rotation augmentation

* more aug

* needs testing

* streamline augment, augment is fast thus bicubic

* tidying up

* transformer eval
2020-12-28 09:24:51 -05:00
George Hotz 65b07d2f4f fix onehot embed 2020-12-27 18:50:38 -05:00
George Hotz d864e1c71a transformer is training 2020-12-27 18:46:32 -05:00
George Hotz a361ef6861 fixup training loop 2020-12-27 18:35:56 -05:00
George Hotz f15bec6dbc make multidot work on CPU 2020-12-27 17:25:37 -05:00
George Hotz 131e04c90c cpu only decorator 2020-12-27 17:18:55 -05:00
George Hotz 2f1b2c0a3b add transpose, start on transformer 2020-12-27 16:59:12 -05:00