Yassine Yousfi
2372b262ee
new desire shape
2022-10-17 15:20:38 -07:00
Yassine Yousfi
870ea766ee
don't need to up this threshold
2022-10-10 20:18:47 -07:00
Yassine Yousfi
678fe9ad7c
new models in tests
2022-10-10 20:12:34 -07:00
Yassine Yousfi
a2b77cc399
workflow dispatch
2022-10-10 19:52:39 -07:00
Yassine Yousfi
44159a5ec4
fix typo in compile script
2022-10-10 19:28:43 -07:00
YassineYousfi
46f6db7522
Merge pull request #2 from commaai/faster_ngrl
faster rocket launcher
2022-10-07 16:43:30 -07:00
Yassine Yousfi
c86edb947f
fngrl
2022-10-06 20:03:59 -07:00
HaraldSchafer
2993dfe921
Merge pull request #1 from commaai/ngrl
ngrl stuff
2022-10-05 23:01:45 -07:00
Yassine Yousfi
82ca9c6666
ngrl stuff
2022-10-04 09:35:37 -07:00
George Hotz
392e57aea7
ugh, why did that fail
2022-10-01 13:38:43 -04:00
George Hotz
8382c51c12
always MATMUL, test the ops in OPENCL
2022-10-01 13:31:29 -04:00
George Hotz
7a61dc7ee9
test_sd_big_conv
2022-10-01 13:26:05 -04:00
George Hotz
178ba50c03
some args for stable diffusion
2022-09-29 01:52:04 -04:00
Ollin Boer Bohan
3b1767e013
Fix OpenCL Metal texture issues (#378)
* Fix OpenCL Metal texture issues
Tile CL images when needed, to fit into the 16384 max Metal image size;
gets me to ~4.8s/iteration for SD on M1 Pro with OPENCL=1 FLOAT16=1.
* Minor cleanup
* Fix mish in CI, or no-op?
* Is mish being framed?
* It would help if any of this reproduced locally
* ???
* OPT is reverted; use original mish
* Cleanup post-review
* Fix some shape usage
* Tiler tests, shouldn't oom or overflow either
* Can't CL if there's no CL?
* Run tiler tests even if GPU=1
* relu6 segfault binary chop; revert test
* relu6 segfault binary chop; revert accel
* relu6 segfault binary chop; revert . (???)
* end relu6 segfault binary chop; repo's haunted
2022-09-29 01:21:54 -04:00
George Hotz
e737513c52
external_test_opt
2022-09-28 23:29:41 -04:00
George Hotz
650c011646
notrain test
2022-09-28 23:27:20 -04:00
George Hotz
af87d692e4
should this be 10?
2022-09-28 23:25:52 -04:00
George Hotz
0fd459b24e
ugh, global state
2022-09-28 23:10:49 -04:00
George Hotz
fa4eff9cc1
Device.GPU isn't defined
2022-09-28 23:00:15 -04:00
George Hotz
0b6537a572
fix tests
2022-09-28 22:57:58 -04:00
George Hotz
726cca78cd
fix bn folding issue, add new test
2022-09-28 22:52:18 -04:00
George Hotz
a0d169eb59
fix efficientnet
2022-09-28 14:23:01 -07:00
George Hotz
dec5334da9
revert layernorm to have axis param
2022-09-26 10:11:38 -04:00
George Hotz
dc80bf6f85
layernorm is all axis but the first
2022-09-25 17:55:48 -04:00
George Hotz
60df954377
Fix weight init: this work? (#391)
* this work?
* glorot uniform
* requires_grad broke
* propagate the None correctly
* so this weight init works
* ahh, i think it's this
* can't beat this
* glorot is best for ae
* remove comments
2022-09-25 16:46:33 -04:00
George Hotz
ff11c4316b
move get_parameters to optim.py
2022-09-25 13:16:58 -04:00
George Hotz
a0c0239ff1
fix mnist load from other dirs
2022-09-25 12:50:28 -04:00
Jacky Lee
2c01a66265
Reshape dataset from fetch_mnist (#390)
2022-09-24 21:16:29 -04:00
George Hotz
acae9a20c1
clipnorm support
2022-09-24 13:26:38 -04:00
George Hotz
271446e3eb
set requires_grad to None (#387)
* set requires_grad to None
* some things need gradients
* hmm, why was get_parameters filtering
2022-09-21 11:16:02 -04:00
George Hotz
29ae21bb0d
import tests from CL metal texture fix
2022-09-19 20:01:47 -04:00
George Hotz
a8aa1f9589
that's simpler
2022-09-18 20:40:46 -04:00
George Hotz
57e804a9bf
add min support
2022-09-18 20:39:41 -04:00
YassineYousfi
2f0f91ba3d
support float16 onnx weights (#384)
2022-09-15 09:12:18 -04:00
Comma Device
75f937227a
add barrier
2022-09-13 11:39:48 -04:00
George Hotz
3c3534736e
fix matmul kernel and tests
2022-09-13 08:31:04 -07:00
Comma Device
62e9419206
fix test failure on MATMUL=1 backward pass
2022-09-13 11:18:52 -04:00
Comma Device
3b82afc6a0
simple on device failing test
2022-09-13 10:59:15 -04:00
George Hotz
4efde1ba0a
test_matmul
2022-09-13 07:51:33 -07:00
George Hotz
894a7cee79
forgot a few
2022-09-12 09:21:46 -07:00
George Hotz
801ecd4a07
cleanup clip tokenizer
2022-09-12 09:20:12 -07:00
Fernand Pajot
ff0da4c802
Added standalone CLIP tokenizer (#382)
* Added standalone CLIP tokenizer.
* Fixed empty phrase.
* Truncating long prompts.
* Keeping two slots for the start and end token.
* Fixed empty phrase.
* Using tokenizer for empty phrase.
* Typo.
2022-09-12 09:12:55 -07:00
David Redmon
a1810c8617
update serious_mnist.py (#380)
2022-09-11 13:37:40 -07:00
George Hotz
ce348f0c92
Revert "change default opt to 2"
This reverts commit 726f4e98e9.
2022-09-11 13:35:42 -07:00
George Hotz
726f4e98e9
change default opt to 2
2022-09-09 07:50:25 -07:00
YassineYousfi
1a7bdc51f8
support more onnx ops (#376)
* broadcast from right to left
* add another broadcasted add test
* more onnx ops
* use float32 range in clip
2022-09-07 15:15:24 -07:00
George Hotz
0b8c2221b5
relax mnist test a tiny bit
2022-09-07 07:52:05 -07:00
George Hotz
ecc1a0470d
add Linear to tinygrad.nn
2022-09-07 07:40:48 -07:00
George Hotz
d26bd73c1e
have to ignore that type
2022-09-07 07:24:27 -07:00
George Hotz
b7783565af
cpu line savings and cleaner
2022-09-06 21:24:22 -07:00