George Hotz
58e703d099
fix tests
2020-11-10 09:49:19 -08:00
George Hotz
866b759d3b
match torch api for pad2d
2020-11-09 17:48:56 -08:00
Ryan Neph
16d564a53c
finish removing strided pool support, add global avg pool test ( #92 )
2020-11-09 17:31:22 -08:00
George Hotz
870b84a893
test pad2d backward on GPU
2020-11-09 15:50:43 -08:00
George Hotz
e46d122f65
not supporting stride
2020-11-09 15:06:58 -08:00
Ryan Neph
c21c2a0b62
revert b0c0c5d: Strided Pool funcs ( #74 ) ( #87 )
...
Strided CPU Pooling was introduced assuming a small kernel size
(<=(10,10)), but efficientnet.py feeds kernel_size=(112,112).
This causes a huge array buffer allocation in stack_for_pool() that
hangs inference for a long time, or until the system OOMs.
Revert CPU Pooling for now, and re-introduce #74 later with a new
global-average-pooling op that can be used instead of avgpool2d with
a large kernel size for efficientnet inference.
Co-authored-by: Ryan Neph <ryanneph@google.com>
2020-11-09 14:58:18 -08:00
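The fix the revert message points at is cheap: when the kernel covers the whole spatial extent, average pooling is just a mean over the last two axes, with no shifted copies to stack. A minimal numpy sketch of that idea (function name hypothetical, not tinygrad's API):

```python
import numpy as np

def global_avg_pool2d(x):
    # x: (batch, channels, height, width)
    # equivalent to avgpool2d with kernel_size == (height, width),
    # but without materializing height*width shifted copies of x
    return x.mean(axis=(2, 3), keepdims=True)

x = np.random.randn(1, 32, 112, 112).astype(np.float32)
assert global_avg_pool2d(x).shape == (1, 32, 1, 1)
```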
Ryan Neph
7e515308a5
label op subtests by params ( #83 )
2020-11-09 06:25:06 -08:00
Ryan Neph
5bedf566d1
tests should use rtol unless special case ( #82 )
2020-11-08 17:25:11 -08:00
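The reasoning behind #82: a relative tolerance scales with the magnitude of whatever is being compared, so one setting covers both large and small outputs. A quick self-contained illustration (values invented for the example):

```python
import numpy as np

a = np.array([1e6, 1e-3])
b = a * (1 + 1e-7)  # the same small relative error at both magnitudes
np.testing.assert_allclose(a, b, rtol=1e-6)  # passes for both entries
# an absolute tolerance loose enough for the 0.1 error on the 1e6 entry
# would silently accept garbage on the 1e-3 entry
```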
Ryan Neph
04b9312a34
Fix GPU Pooling bug at boundary + better Pooling test coverage ( #81 )
...
* fixed Pooling bug
* Clarify Pooling tests
2020-11-08 17:25:01 -08:00
Ryan Neph
b0c0c5d0d6
strided Pool funcs ( #74 )
...
* *Pool2D GPU forward supports stride
* kernel_size from ctx instead of saved_tensors
* *Pool2D CPU forward supports stride
* update ctx.stride properly
2020-11-08 11:45:55 -08:00
ziofil
db3eccc16b
implemented backward for Pad2D & test ( #73 )
2020-11-07 21:58:42 -08:00
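Pad2D's backward is the forward pad run in reverse: crop the padded border off the incoming gradient. A minimal numpy sketch (function name hypothetical; padding order follows torch's (left, right, top, bottom) convention):

```python
import numpy as np

def pad2d_backward(grad_output, padding):
    pl, pr, pt, pb = padding  # (left, right, top, bottom)
    h, w = grad_output.shape[-2:]
    # each input pixel maps to exactly one padded-output pixel,
    # so the input gradient is the output gradient minus the border
    return grad_output[..., pt:h - pb, pl:w - pr]

g = np.ones((1, 1, 6, 8))
assert pad2d_backward(g, (2, 2, 1, 1)).shape == (1, 1, 4, 4)
```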
Ryan Neph
5265f6c578
add AvgPool2D backward pass on GPU ( #68 )
2020-11-07 12:27:29 -08:00
George Hotz
30442a086a
some broadcasting, pool test is failing
2020-11-07 11:29:42 -08:00
George Hotz
94d44c97bf
add pad2d on GPU
2020-11-07 10:46:36 -08:00
George Hotz
fbff6ab2e5
fix strided convs, GPU env var for enet
2020-11-07 10:26:37 -08:00
George Hotz
ec03eb44bd
tinygrad does forward pass convs on GPU
2020-11-07 10:15:56 -08:00
George Hotz
bc7758cc5b
getting convs to work on gpu
2020-11-07 09:17:57 -08:00
George Hotz
3302286e68
yayay test_sgd_gpu passes
2020-11-07 08:48:17 -08:00
George Hotz
38e112cccd
logsoftmax test
2020-11-07 07:26:53 -08:00
Rene Delgado
cd54697fd8
fix gpu sum forward ( #61 )
...
* ignore venv
* add sum test
* fix sum forward
2020-11-05 21:59:16 -08:00
NeuralLink
cc605da36d
Stable Sigmoid op ( #59 )
...
* 🔨 Added stable sigmoid
* ✅ added sigmoid test
* 🔧 suppressed overflow warning
* 🔧 clean up
2020-11-05 21:57:50 -08:00
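The standard trick behind a stable sigmoid is to never exponentiate a positive number; whether the commit does exactly this isn't visible from the log, but a numpy sketch of the technique:

```python
import numpy as np

def stable_sigmoid(x):
    # exp only ever sees -|x| <= 0, so it cannot overflow;
    # both branches equal 1 / (1 + exp(-x)) algebraically
    z = np.exp(-np.abs(x))
    return np.where(x >= 0, 1.0 / (1.0 + z), z / (1.0 + z))

x = np.array([-1000.0, 0.0, 1000.0])
print(stable_sigmoid(x))  # [0.  0.5 1. ], no overflow warning
```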
George Hotz
f178d23ff3
gpu relu is good
2020-11-02 08:25:32 -08:00
George Hotz
231c1134bd
cute trick for GPU test
2020-11-02 08:17:17 -08:00
George Hotz
5201a8e89f
matmul on GPU
2020-11-01 08:54:20 -08:00
George Hotz
41e7d59aed
test dot
2020-11-01 07:51:35 -08:00
George Hotz
1f544d6ece
test mnist on GPU
2020-11-01 07:46:17 -08:00
George Hotz
9ac1ad40d6
Add GPU Support! (do not merge yet) ( #41 )
...
* copy tensors to and from gpu
* add on GPU
* adding works
* we stick shapes in
* works on cpu and gpu
* test changes, not passing yet
* something else
* op tests pass
* add, mean, and sum have working forward/backward
* mul ops test
* no gpu support, no problem
* test pass, clean up later
* gpu cleanup
* cleanup test ops, don't let div fail
* revert more
* simpler dispatcher
* clean up grad
* GPU and
* grad is a Tensor now
* gate test on GPU
* cleanups
* late loading gpu
* GPU as input option
* last cleanups
2020-11-01 07:00:49 -08:00
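The GPU backend at this point was OpenCL via pyopencl. A sketch of the "copy tensors to and from gpu" step from the list above (helper names hypothetical, not the actual tinygrad API):

```python
import numpy as np
import pyopencl as cl

ctx = cl.create_some_context()
queue = cl.CommandQueue(ctx)
mf = cl.mem_flags

def to_gpu(x):
    # allocate a device buffer and copy the host array into it
    return cl.Buffer(ctx, mf.READ_WRITE | mf.COPY_HOST_PTR, hostbuf=x)

def from_gpu(buf, shape, dtype=np.float32):
    # copy a device buffer back into a fresh host array
    out = np.empty(shape, dtype=dtype)
    cl.enqueue_copy(queue, out, buf)
    return out

x = np.random.randn(4, 4).astype(np.float32)
np.testing.assert_allclose(from_gpu(to_gpu(x), x.shape), x)
```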
George Hotz
2c7e75d733
group conv: forward pass works ( #34 )
...
* forward pass works
* got the backward pass
* okay, it's now a coho
2020-10-30 09:19:20 -07:00
George Hotz
339a35b081
div needs help
2020-10-30 08:32:16 -07:00
George Hotz
c14473f87d
unit test for batchnorm2d
2020-10-30 08:19:58 -07:00
George Hotz
5e7e359706
fix tests
2020-10-29 08:19:07 -07:00
George Hotz
9ae3e9daf3
shape has to be a kwarg now, idk why this didn't break before
2020-10-29 08:13:05 -07:00
George Hotz
f84f6c1edd
write sqrt and div using pow
2020-10-29 07:57:25 -07:00
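Deriving sqrt and div from pow keeps the primitive op set small: only Pow's backward has to be maintained. The identities, sketched with plain Python floats (the real change applies them at the tensor level):

```python
def sqrt(x):
    # sqrt(x) == x ** 0.5, so Pow's gradient covers it for free
    return x ** 0.5

def div(a, b):
    # a / b == a * b ** -1, so Mul plus Pow cover division
    return a * b ** -1.0

assert sqrt(16.0) == 4.0
assert div(1.0, 8.0) == 0.125
```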
Göktuğ Karakaşlı
4b163ee270
efficient version of adam ( #20 )
...
* counteracted bias initialization
* test new adam
* add optimizer tests
* rename helper function names to fix the test
* remove redundant import
2020-10-27 15:54:40 -07:00
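"Counteracted bias initialization" likely refers to the standard reformulation from the Adam paper: fold both bias corrections into a single scalar step size instead of dividing m and v by (1 - beta**t) separately each step. A self-contained numpy sketch (names hypothetical):

```python
import numpy as np

def adam_step(params, grads, m, v, t, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    # single bias-corrected step size; t counts from 1
    a = lr * np.sqrt(1.0 - b2 ** t) / (1.0 - b1 ** t)
    for p, g, mi, vi in zip(params, grads, m, v):
        mi[:] = b1 * mi + (1 - b1) * g      # first-moment estimate
        vi[:] = b2 * vi + (1 - b2) * g * g  # second-moment estimate
        p -= a * mi / (np.sqrt(vi) + eps)
```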
George Hotz
f9788eba14
parameters, and start on efficientnet
2020-10-27 08:53:35 -07:00
George Hotz
1654008c1f
conv stride support
2020-10-26 08:54:43 -07:00
George Hotz
2a55d7402b
clean up ops, refactor pool backward. add stride test
2020-10-26 08:47:11 -07:00
George Hotz
93dceb4bee
fix kernel_size bug, name like torch, add test
2020-10-26 08:38:53 -07:00
Timothy Mc Alister
15e5988323
make default parameters work for functions
2020-10-26 12:43:36 +01:00
George Hotz
2d37fd686b
test ops
2020-10-25 19:03:49 -07:00
George Hotz
2eebbd32c6
ops test speed
2020-10-25 19:01:02 -07:00
George Hotz
b27bcbe4b4
avgpool and test refactor
2020-10-25 18:40:01 -07:00
George Hotz
4c42676cb6
400 -> 200
2020-10-25 17:19:59 -07:00
George Hotz
567707a5f6
rename max_pool2d to match torch, remove more fast conv crap
2020-10-25 17:16:47 -07:00
George Hotz
ea41f5e1c1
seems more generic
2020-10-25 16:40:37 -07:00
George Hotz
2333c4dea7
no tqdm in actions
2020-10-25 16:40:08 -07:00
George Hotz
ad48061927
better sort in torch profiler
2020-10-25 16:07:49 -07:00
George Hotz
82f8e10813
no hacks in that test
2020-10-25 15:52:05 -07:00
George Hotz
4baa4c041f
it's crazy how much faster pytorch is than numpy
2020-10-25 15:42:33 -07:00
George Hotz
5ddbd7f04b
2 to 3x slower than torch
2020-10-25 15:27:33 -07:00