George Hotz
1a039306d2
good changes from llama branch ( #671 )
* good changes from llama
* transpose behavior changed
2023-03-09 20:51:22 -08:00
George Hotz
b14d31d6db
ConvNeXt + extras ( #657 )
* simple convnext implementation
* shorter function names
* need to realize the random functions now
* creating an optimizer realizes all params
* assign contiguous
* fix lazy lazy
* why was i doing that...add convnext to tests
* LazyNumpyArray
* enable assert + comment
* no two tiny
2023-03-06 22:10:56 -08:00
George Hotz
2e56a4793e
rename log_softmax, support dim, fix onnx Softmax
2023-02-24 10:11:24 -08:00
Jacky Lee
cb679cd051
Fix weight initialization ( #566 )
* Fix weight initialization
* Use scaled_uniform in serious_mnist
2023-02-19 11:25:29 -08:00
Kirill
7944cfdadc
Remove Tensor.data ( #565 )
2023-02-18 16:36:12 -08:00
Jacky Lee
e172f0087a
BatchNorm2D -> BatchNorm2d ( #558 )
* BatchNorm2D -> BatchNorm2d
* Fix typo
2023-02-16 12:31:49 -08:00
Lucas Keller
56a06280c5
Testing/utils ( #548 )
* New unittest for utils.py
Unit test fetch in basic ways. Would have tested more fetches, but
downloading stuff for tests is annoying and mocking is more
dependencies.
* Remove unused imports
2023-02-10 12:08:20 -06:00
George Hotz
a0d169eb59
fix efficientnet
2022-09-28 14:23:01 -07:00
Comma Device
a734df98fa
TEST_ENET for openpilot compiler
2022-08-31 13:23:36 -04:00
George Hotz
368c0ce2f6
NUM=-2 for ants
2022-07-02 15:47:10 -07:00
George Hotz
0cb99d72e9
NUM=-1 is a small efficientnet for small people
2022-07-02 15:11:51 -07:00
George Hotz
8cf1aed0f4
don't track_running_stats, parameters must require_grad
2022-07-02 14:38:45 -07:00
George Hotz
67ff6b52fd
move padding to convs in enet
2022-06-26 23:14:31 -07:00
George Hotz
892ac661e1
enet readability
2022-06-07 10:23:05 -07:00
George Hotz
0ee21ba115
add ViT test and car
2022-06-05 17:12:43 -07:00
George Hotz
c8b569a8c7
cleaner comments
2022-05-14 21:28:39 -07:00
cjg91
7025c9bbeb
Transfer learning for ResNet ( #295 )
* Transfer learning for ResNet
* moved ResNet depth specifics into the class
2022-01-15 23:22:10 -05:00
George Hotz
55d792b065
Revert "fixup resnet"
This reverts commit 4eabe677ed.
2022-01-15 20:22:01 -08:00
George Hotz
4eabe677ed
fixup resnet
2022-01-15 20:21:02 -08:00
George Hotz
c0c2c0b041
support larger ViT models
2021-12-12 10:45:10 -08:00
George Hotz
e28cdfb0cf
clean up resnet
2021-11-30 16:14:54 -05:00
George Hotz
8f5779eeaa
very minor change
2021-11-30 15:54:03 -05:00
George Hotz
d31ef0ae48
make vit names match pytorch
2021-11-30 11:34:14 -05:00
George Hotz
4b7c31b5b7
break vit into its own file
2021-11-30 11:19:22 -05:00
George Hotz
46bbbcf7f0
model touchups
2021-11-30 11:13:34 -05:00
George Hotz
835869974c
clean up vit code
2021-11-30 10:58:03 -05:00
George Hotz
c39824bc62
oops, forgot some stars
2021-11-30 00:46:14 -05:00
George Hotz
908db3bdea
support bias in conv like linear
2021-11-30 00:44:59 -05:00
George Hotz
bd21304e3c
linear takes in weight and bias
2021-11-30 00:38:47 -05:00
George Hotz
535f02cc64
use sequential
2021-11-30 00:25:39 -05:00
George Hotz
de938c2d9d
vit is now tested
2021-11-30 00:23:06 -05:00
George Hotz
aff810e722
unify transformer block
2021-11-29 18:58:15 -05:00
George Hotz
58ed46963e
fix broadcastdot
2021-11-29 18:54:57 -05:00
George Hotz
033b04494a
resnet pretrained is broken
2021-11-29 18:13:52 -05:00
George Hotz
dca076dbf1
remove dumb nn ops
2021-11-29 18:05:31 -05:00
George Hotz
8097b8f7d6
vit works
2021-11-29 16:28:14 -05:00
George Hotz
f909ab194f
gelu with broken test
2021-11-29 15:00:50 -05:00
George Hotz
1eafa5580e
layernorm with learnable parameters
2021-11-29 13:03:57 -05:00
George Hotz
c7f795ca1e
added dot affine
2021-11-29 12:55:56 -05:00
George Hotz
30eb3afbe1
add bias term to transformer
2021-11-29 12:45:27 -05:00
George Hotz
99b6051467
add ff_dim to transformer
2021-11-29 12:40:52 -05:00
George Hotz
641b1dbb40
remove ane, start supporting ops_torch
2021-10-30 17:47:00 -07:00
George Hotz
7d12482d80
refactor efficientnet loading
2021-10-30 17:02:17 -07:00
Sebastian Kreft
3358770182
chore(efficientnet): don't use eval when loading weights ( #286 )
Because the weights are loaded from a third-party internet address, using eval on them is unsafe. The change also makes the code a bit clearer: it is now obvious which keys are being transformed.
Co-authored-by: Seba Kreft <sebastian.kreft@houm.com>
2021-10-22 15:10:04 -07:00
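The commit above replaces eval-based key handling with explicit transformations. A minimal sketch of that idea, with hypothetical key names (not tinygrad's actual checkpoint keys): remap downloaded weight-dict keys using plain, auditable string operations instead of evaluating any downloaded text.

```python
# Illustrative only: the key patterns below are made up for the example.
# The point is that each transformation is an explicit, reviewable string
# operation, so untrusted checkpoint contents never reach eval().

def remap_key(key: str) -> str:
    # Strip a leading-underscore naming convention from module keys.
    key = key.replace("_blocks.", "blocks.")
    key = key.replace("_conv_head", "conv_head")
    return key

# A stand-in for a downloaded state dict.
weights = {"_blocks.0.weight": [1.0, 2.0], "_conv_head.bias": [0.0]}

# Build the remapped dict; which keys change is clear at a glance.
remapped = {remap_key(k): v for k, v in weights.items()}
```

The same pattern generalizes: any renaming rule a checkpoint needs can be one visible line in `remap_key`, rather than a string that gets evaluated.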
Guglielmo Camporese
2b7589db64
Added ResNet-{18, 34, 50, 101, 152} ( #271 )
* added resnets
* fix minor
* fix minor
* resnet in models
* added resnet test
* added resnet train test
* added linear, conv2d nn tests
* fix minor in extra/training
* resnet in models
* fix minor
* fix tolerance for linear in nn test
* fix eval, this causes cpu and gpu UT failing
* revert transformer test
* fix minor for CPU test
* improved model get_params for sequential layer
* fix minor for params counting
* commented broken ops tests
* improved train for resnet
2021-06-21 09:37:24 -07:00
George Hotz
d3f169b267
move good models to models, add a training step test
2021-06-19 11:24:15 -07:00