Marcel Bischoff
42b4761025
transformer >99.98% test accuracy in ~30s ( #230 )
* transformer
* BS might divide len(Y_test)
* output when accuracy is high
* more readable
* fixed loss in serious_mnist for new API
2021-01-02 07:45:09 -08:00
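The "BS might divide len(Y_test)" bullet above is about evaluation batching. A minimal sketch of a batched accuracy loop that also copes with a trailing partial batch (names such as `predict` are illustrative, not the script's actual API):

```python
import numpy as np

def accuracy(predict, X_test, Y_test, BS=128):
    # predict: callable mapping a (batch, ...) array to (batch, classes) logits.
    # The last slice may be shorter than BS when BS does not divide len(Y_test).
    correct = 0
    for i in range(0, len(Y_test), BS):
        logits = predict(X_test[i:i+BS])
        correct += int((np.argmax(logits, axis=-1) == Y_test[i:i+BS]).sum())
    return correct / len(Y_test)
```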
Liam
ebd72ff437
Test split ( #231 )
* Split tests
Split tests into "Test CPU" and "Test GPU".
Add test flag "TEST_DEVICES" which is a comma-separated list of devices:
CPU,GPU,ANE
* Run tests based on the provided TEST_DEVICES flag
By default, all of "CPU,GPU,ANE" will run.
* fix bad quote
* Revert changes and use GPU=1
This is done by setting the default Tensor Device to Device.CPU if GPU=1 is set.
Run GPU tests: GPU=1 pytest -s -v
2021-01-01 09:19:03 -05:00
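One plausible shape for the GPU=1 switch described above (an illustrative sketch, not the repo's actual test harness): the CPU tests always run, and the GPU variants are skipped unless the GPU=1 environment variable is set.

```python
import os
import unittest
import numpy as np

GPU = os.getenv("GPU", None) is not None  # enabled by: GPU=1 pytest -s -v

class TestOpsCPU(unittest.TestCase):
    gpu = False
    def test_add(self):
        np.testing.assert_allclose(np.ones(4) + np.ones(4), np.full(4, 2.0))

@unittest.skipUnless(GPU, "GPU=1 not set")
class TestOpsGPU(TestOpsCPU):
    gpu = True  # in a real suite this flag would route tensors to the GPU device

if __name__ == "__main__":
    unittest.main()
```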
George Hotz
4a7cf2e420
more reordering
2020-12-31 09:58:02 -05:00
George Hotz
92abe43683
reduce before binary because of unbroadcasting
2020-12-31 09:49:52 -05:00
George Hotz
4291002881
reorder GPU ops
2020-12-31 09:46:39 -05:00
George Hotz
de7fe085de
no read out of bounds
2020-12-31 09:41:36 -05:00
George Hotz
1fb5fcafce
GPU slice should fix tests
2020-12-31 09:37:03 -05:00
Liam
e972a45456
Dynamically register ops to Tensor ( #232 )
* Dynamically register ops to Tensor
This saves lines and reduces redundant repetition.
* ffs spacing
you don't pay me enough!
2020-12-31 09:10:19 -05:00
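A minimal sketch of the idea behind "Dynamically register ops to Tensor" (assumed from the message, not the PR's exact code): each op is a small class with a `forward`, and a loop attaches every such class in the module to `Tensor` under its lowercased name instead of hand-writing one method per op.

```python
import inspect
import sys
from functools import partialmethod

class Tensor:
    def __init__(self, data):
        self.data = data

class Add:
    @staticmethod
    def forward(x, y): return x.data + y.data

class Mul:
    @staticmethod
    def forward(x, y): return x.data * y.data

def register(name, op):
    # tensor.add(other) -> Tensor(Add.forward(tensor, other)), and so on for every op
    setattr(Tensor, name, partialmethod(lambda self, op, *args: Tensor(op.forward(self, *args)), op))

# register every op class found in this module under its lowercased name
for name, op in inspect.getmembers(sys.modules[__name__], inspect.isclass):
    if hasattr(op, "forward"):
        register(name.lower(), op)

print(Tensor(2).add(Tensor(3)).data)  # 5
print(Tensor(2).mul(Tensor(3)).data)  # 6
```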
Marcel Bischoff
e2f833f58f
max to behave on ties like torch ( #229 )
* checkpoint
* fixing pow
* undo pow
* backward max on GPU and CPU rewrite
* indentation
* changing seed for curiosity
* max replaced equality
* undo seed
* rebase
* fixed tests
* merge error
2020-12-30 18:52:50 -05:00
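The change above concerns how the backward pass of a max reduction handles ties. A numpy sketch of the torch-style behavior as I read the message (not the PR's actual implementation): every element equal to the maximum gets an equal share of the incoming gradient, via an equality mask rather than a single argmax.

```python
import numpy as np

def max_backward(x, grad_output=1.0):
    mask = (x == x.max()).astype(np.float32)  # 1.0 at every tied maximum
    return mask * grad_output / mask.sum()    # split the gradient evenly across ties

print(max_backward(np.array([1.0, 3.0, 3.0, 2.0])))  # [0.  0.5 0.5 0. ]
```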
George Hotz
30f8132646
reorder ops in ops cpu
2020-12-30 11:00:01 -05:00
George Hotz
e5b2803b5d
ops in readme
2020-12-30 10:48:55 -05:00
George Hotz
2d44bf7f1a
Dot -> Matmul
2020-12-30 10:41:51 -05:00
George Hotz
10fc3ff5b9
cleaner syntax
2020-12-30 10:35:37 -05:00
George Hotz
fcfe3dae01
write slice for CPU
2020-12-30 10:32:53 -05:00
George Hotz
47504004fd
ane ops
2020-12-29 18:00:53 -05:00
George Hotz
1f5c9618ef
refactor in readme and issue #225
2020-12-29 17:30:04 -05:00
George Hotz
f9170505b3
if you like your transformers twice as slow, use the GPU
2020-12-29 17:14:23 -05:00
George Hotz
6a6a82e999
support multidot on GPU
2020-12-29 16:56:30 -05:00
George Hotz
27208d729b
add GPU max thanks to marcelbischoff
2020-12-29 16:44:14 -05:00
George Hotz
4bbad11afe
link to papers
2020-12-29 14:15:46 -05:00
George Hotz
3f8e137b6f
extra/transformer
2020-12-29 14:14:00 -05:00
George Hotz
c4e7a1ae59
accessors are dumb
2020-12-29 14:10:26 -05:00
George Hotz
fb6aaefb9b
save 2 lines
2020-12-29 14:02:50 -05:00
George Hotz
ea341c84fe
logsoftmax good, div bad
2020-12-29 13:59:39 -05:00
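"logsoftmax good, div bad" above points at the usual numerical-stability argument: compute log-softmax with the log-sum-exp trick and recover softmax as its exp, instead of dividing exponentials directly. A quick numpy illustration (not the repo's code):

```python
import numpy as np

def logsoftmax(x):
    # subtract the max so exp() cannot overflow, then subtract log-sum-exp
    x = x - x.max(axis=-1, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=-1, keepdims=True))

x = np.array([[1000.0, 1001.0, 1002.0]])
print(np.exp(logsoftmax(x)))  # stable; the naive exp(x) / exp(x).sum() overflows here
```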
George Hotz
f18801c7db
simple pool. swimming is very easy now
2020-12-29 13:48:50 -05:00
George Hotz
8f9232d59b
readme
2020-12-29 13:40:34 -05:00
George Hotz
837aaacfbf
Unpad2D on GPU:
2020-12-29 13:16:14 -05:00
George Hotz
02655c07d5
break maxpool2d on GPU
2020-12-29 13:05:57 -05:00
George Hotz
061e37de39
touchups
2020-12-29 12:41:21 -05:00
George Hotz
a2e6562330
fix max op, less lines
2020-12-29 10:47:04 -05:00
Marcel Bischoff
dc8fa7999c
Transpose on GPU ( #221 )
* 2serious
* load/save
* fixing GPU
* added DEBUG
* needs BatchNorm or doesn't learn anything
* old file not needed
* added conv biases
* added extra/training.py and checkpoint
* assert in test only
* save
* padding
* num_classes
* checkpoint
* checkpoints for padding
* training was broken
* merge
* rotation augmentation
* more aug
* needs testing
* streamline augment, augment is fast thus bicubic
* tidying up
* transformer eval
* axis=-1
* transpose
* test for permutation using torch.movedims
* another test
* line
2020-12-29 10:40:11 -05:00
George Hotz
36579f66bf
max op
2020-12-28 23:54:52 -05:00
George Hotz
bcb3ceeca3
set training in functions
2020-12-28 22:45:46 -05:00
George Hotz
51bf164b72
dropout, training
2020-12-28 22:12:23 -05:00
George Hotz
7b8fee038d
it works! forgot the sqrt
2020-12-28 16:23:52 -05:00
George Hotz
1faf05ef67
ahh, it's better if i don't train the embedding
2020-12-28 16:07:02 -05:00
George Hotz
c3832e1bde
hmm, fix layernorm to not be batchnorm and it breaks
2020-12-28 13:06:21 -05:00
George Hotz
2e89e75dcb
layernorm fixes transformer instability
2020-12-28 12:58:15 -05:00
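The two layernorm entries above hinge on which axis gets normalized. An illustrative numpy sketch of the distinction (not the repo's implementation): layernorm normalizes each sample over its feature axis, whereas batchnorm normalizes each feature over the batch axis.

```python
import numpy as np

def layernorm(x, eps=1e-5):
    # per-sample normalization over the last (feature) axis
    mu  = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

x = np.random.randn(8, 16)                  # (batch, features)
print(layernorm(x).mean(axis=-1).round(6))  # ~0 for every sample, independent of batch statistics
```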
George Hotz
628d21f899
doc touchup
2020-12-28 10:45:26 -05:00
George Hotz
fafece9db7
avgpool2d is a second class op
2020-12-28 10:41:59 -05:00
George Hotz
593233b668
log and exp are first class ops
2020-12-28 10:00:30 -05:00
Marcel Bischoff
ffff98db78
Evaluation in Transformers ( #218 )
* 2serious
* load/save
* fixing GPU
* added DEBUG
* needs BatchNorm or doesn't learn anything
* old file not needed
* added conv biases
* added extra/training.py and checkpoint
* assert in test only
* save
* padding
* num_classes
* checkpoint
* checkpoints for padding
* training was broken
* merge
* rotation augmentation
* more aug
* needs testing
* streamline augment, augment is fast thus bicubic
* tidying up
* transformer eval
2020-12-28 09:24:51 -05:00
George Hotz
65b07d2f4f
fix onehot embed
2020-12-27 18:50:38 -05:00
George Hotz
d864e1c71a
transformer is training
2020-12-27 18:46:32 -05:00
George Hotz
a361ef6861
fixup training loop
2020-12-27 18:35:56 -05:00
George Hotz
f15bec6dbc
make multidot work on CPU
2020-12-27 17:25:37 -05:00
George Hotz
131e04c90c
cpu only decorator
2020-12-27 17:18:55 -05:00
George Hotz
2f1b2c0a3b
add transpose, start on transformer
2020-12-27 16:59:12 -05:00
gamwe6
d379502c04
Cleaning ( #211 )
* Cleaned
* Brought the lines into line
Co-authored-by: gamwe6 <gamwe6@users.noreply.github.com>
2020-12-27 09:58:51 -05:00
George Hotz
8a335f03ad
clock speed 32x32
2020-12-22 18:18:52 -05:00