Commit Graph

97 Commits

Author SHA1 Message Date
George Hotz 58e703d099 fix tests 2020-11-10 09:49:19 -08:00
George Hotz 866b759d3b match torch api for pad2d 2020-11-09 17:48:56 -08:00
Ryan Neph 16d564a53c finish unsupporting strided pool, add global avg pool test (#92) 2020-11-09 17:31:22 -08:00
George Hotz 870b84a893 test pad2d backward on GPU 2020-11-09 15:50:43 -08:00
George Hotz e46d122f65 not supporting stride 2020-11-09 15:06:58 -08:00
Ryan Neph c21c2a0b62
revert b0c0c5d: Strided Pool funcs (#74) (#87)
Strided CPU Pooling was introduced assuming a small kernel size
(<=(10,10)), but efficientnet.py feeds kernel_size=(112,112).

This causes a huge array buffer allocation in stack_for_pool() that
hangs inference for a long time, or until the system OOMs.

Revert CPU Pooling for now, and re-introduce #74 later with a new
global-average-pooling op that can be used instead of avgpool2d with
large kernel size for efficientnet inference.

Co-authored-by: Ryan Neph <ryanneph@google.com>
2020-11-09 14:58:18 -08:00
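The failure mode described above looks like the classic pooling-by-stacking blow-up: if stack_for_pool() materializes one shifted copy of the input per kernel tap, a (112,112) kernel implies ~12.5k copies. The proposed global-average-pooling op sidesteps that entirely, since averaging over a kernel spanning the whole feature map is just a mean over the spatial axes. A minimal numpy sketch, assuming an (N, C, H, W) layout; global_avg_pool2d is an illustrative name, not tinygrad's API:

```python
import numpy as np

# Minimal sketch of the proposed workaround (illustrative, not tinygrad's
# actual op): when the pooling kernel covers the entire feature map,
# avgpool2d degenerates to a mean over the spatial axes, so no stacked
# buffer proportional to the kernel area is ever allocated.
def global_avg_pool2d(x: np.ndarray) -> np.ndarray:
  # x: (N, C, H, W) -> (N, C, 1, 1)
  return x.mean(axis=(2, 3), keepdims=True)

x = np.random.randn(1, 32, 112, 112).astype(np.float32)
assert global_avg_pool2d(x).shape == (1, 32, 1, 1)
```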
Ryan Neph 7e515308a5 label op subtests by params (#83) 2020-11-09 06:25:06 -08:00
Ryan Neph 5bedf566d1 tests should use rtol unless special case (#82) 2020-11-08 17:25:11 -08:00
Ryan Neph 04b9312a34
Fix GPU Pooling bug at boundary + better Pooling test coverage (#81)
* fixed Pooling bug

* Clarify Pooling tests
2020-11-08 17:25:01 -08:00
Ryan Neph b0c0c5d0d6
strided Pool funcs (#74)
* *Pool2D GPU forward supports stride

* kernel_size from ctx instead of saved_tensors

* *Pool2D CPU forward supports stride

* update ctx.stride properly
2020-11-08 11:45:55 -08:00
ziofil db3eccc16b implemented backward for Pad2D & test (#73) 2020-11-07 21:58:42 -08:00
Ryan Neph 5265f6c578 add AvgPool2D backward pass on GPU (#68) 2020-11-07 12:27:29 -08:00
George Hotz 30442a086a some broadcasting, pool test is failing 2020-11-07 11:29:42 -08:00
George Hotz 94d44c97bf add pad2d on GPU 2020-11-07 10:46:36 -08:00
George Hotz fbff6ab2e5 fix strided convs, GPU env var for enet 2020-11-07 10:26:37 -08:00
George Hotz ec03eb44bd tinygrad does forward pass convs on GPU 2020-11-07 10:15:56 -08:00
George Hotz bc7758cc5b getting convs to work on gpu 2020-11-07 09:17:57 -08:00
George Hotz 3302286e68 yayay test_sgd_gpu passes 2020-11-07 08:48:17 -08:00
George Hotz 38e112cccd logsoftmax test 2020-11-07 07:26:53 -08:00
Rene Delgado cd54697fd8
fix gpu sum forward (#61)
* ignore venv

* add sum test

* fix sum forward
2020-11-05 21:59:16 -08:00
NeuralLink cc605da36d
Stable Sigmoid op (#59)
* 🔨 Added stable sigmoid

* added sigmoid test

* 🔧 suppressed overflow warning

* 🔧 clean up
2020-11-05 21:57:50 -08:00
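"Stable sigmoid" here is presumably the standard trick for the overflow warning mentioned in the bullets: naive 1/(1+exp(-x)) overflows exp() for large negative x. A minimal numpy sketch of that trick (illustrative; not necessarily the exact formulation in #59):

```python
import numpy as np

# Branch on the sign so the argument to exp() is always non-positive,
# which keeps exp() from overflowing on either side.
def stable_sigmoid(x: np.ndarray) -> np.ndarray:
  out = np.empty_like(x)
  pos = x >= 0
  out[pos] = 1.0 / (1.0 + np.exp(-x[pos]))  # exp arg <= 0: safe
  ex = np.exp(x[~pos])                      # x < 0, exp arg < 0: safe
  out[~pos] = ex / (1.0 + ex)               # algebraically 1/(1+exp(-x))
  return out
```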
George Hotz f178d23ff3 gpu relu is good 2020-11-02 08:25:32 -08:00
George Hotz 231c1134bd cute trick for GPU test 2020-11-02 08:17:17 -08:00
George Hotz 5201a8e89f matmul on GPU 2020-11-01 08:54:20 -08:00
George Hotz 41e7d59aed test dot 2020-11-01 07:51:35 -08:00
George Hotz 1f544d6ece test mnist on GPU 2020-11-01 07:46:17 -08:00
George Hotz 9ac1ad40d6
Add GPU Support! (do not merge yet) (#41)
* copy tensors to and from gpu

* add on GPU

* adding works

* we stick shapes in

* works on cpu and gpu

* test changes, not passing yet

* something else

* op tests pass

* add, mean, and sum have working forward/backward

* mul ops test

* no gpu support, no problem

* test pass, clean up later

* gpu cleanup

* cleanup test ops, don't let div fail

* revert more

* simpler dispatcher

* clean up grad

* GPU and

* grad is a Tensor now

* gate test on GPU

* cleanups

* late loading gpu

* GPU as input option

* last cleanups
2020-11-01 07:00:49 -08:00
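The GPU backend introduced in this PR is OpenCL via pyopencl, so the "copy tensors to and from gpu" bullet boils down to host/device buffer round-trips. A minimal pyopencl sketch of that round-trip, independent of tinygrad's actual Tensor API:

```python
import numpy as np
import pyopencl as cl

# Host -> device -> host round-trip: the core of moving a tensor's data
# onto the GPU and reading results back.
ctx = cl.create_some_context()
queue = cl.CommandQueue(ctx)
mf = cl.mem_flags

host = np.arange(16, dtype=np.float32)
gpu_buf = cl.Buffer(ctx, mf.READ_WRITE | mf.COPY_HOST_PTR, hostbuf=host)

out = np.empty_like(host)
cl.enqueue_copy(queue, out, gpu_buf)
queue.finish()
assert np.allclose(host, out)
```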
George Hotz 2c7e75d733
group conv: forward pass works (#34)
* forward pass works

* got the backward pass

* okay, it's now a coho
2020-10-30 09:19:20 -07:00
George Hotz 339a35b081 div needs help 2020-10-30 08:32:16 -07:00
George Hotz c14473f87d unit test for batchnorm2d 2020-10-30 08:19:58 -07:00
George Hotz 5e7e359706 fix tests 2020-10-29 08:19:07 -07:00
George Hotz 9ae3e9daf3 shape has to be a kwarg now, idk why this didn't break before 2020-10-29 08:13:05 -07:00
George Hotz f84f6c1edd write sqrt and div using pow 2020-10-29 07:57:25 -07:00
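The trick in this commit: once pow has forward and backward passes, sqrt and div come for free as compositions (sqrt(x) = x^0.5, a/b = a * b^-1), so neither needs its own gradient rule. The identities, checked in numpy (the commit applies the same composition at the op level):

```python
import numpy as np

x = np.array([1.0, 4.0, 9.0])
a, b = np.array([2.0, 3.0]), np.array([4.0, 8.0])
assert np.allclose(np.sqrt(x), x ** 0.5)      # sqrt via pow
assert np.allclose(a / b, a * b ** -1.0)      # div via mul + pow
```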
Göktuğ Karakaşlı 4b163ee270
efficient version of adam (#20)
* counteracted bias initialization

* test new adam

* add optimizer tests

* rename helper function names to fix the test

* remove redundant import
2020-10-27 15:54:40 -07:00
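"Efficient" Adam and "counteracted bias initialization" most likely refer to the standard reformulation from the Adam paper: rather than materializing bias-corrected copies m_hat and v_hat every step, fold both corrections into a single per-step scalar on the learning rate. A minimal numpy sketch (an assumption about #20's change, based on its description; adam_step is a hypothetical helper, not the repo's optimizer class):

```python
import numpy as np

# One Adam update, in place. t is the 1-indexed step count.
def adam_step(param, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
  m[:] = b1 * m + (1 - b1) * grad
  v[:] = b2 * v + (1 - b2) * grad * grad
  # fold both bias corrections into one scalar instead of forming
  # m_hat = m/(1-b1**t) and v_hat = v/(1-b2**t) as full arrays
  lr_t = lr * np.sqrt(1 - b2 ** t) / (1 - b1 ** t)
  param -= lr_t * m / (np.sqrt(v) + eps)  # eps placement varies by formulation
```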
George Hotz f9788eba14 parameters, and start on efficientnet 2020-10-27 08:53:35 -07:00
George Hotz 1654008c1f conv stride support 2020-10-26 08:54:43 -07:00
George Hotz 2a55d7402b clean up ops, refactor pool backward. add stride test 2020-10-26 08:47:11 -07:00
George Hotz 93dceb4bee fix kernel_size bug, name like torch, add test 2020-10-26 08:38:53 -07:00
Timothy Mc Alister 15e5988323 make default parameters work for functions 2020-10-26 12:43:36 +01:00
George Hotz 2d37fd686b test ops 2020-10-25 19:03:49 -07:00
George Hotz 2eebbd32c6 ops test speed 2020-10-25 19:01:02 -07:00
George Hotz b27bcbe4b4 avgpool and test refactor 2020-10-25 18:40:01 -07:00
George Hotz 4c42676cb6 400 -> 200 2020-10-25 17:19:59 -07:00
George Hotz 567707a5f6 rename max_pool2d to match torch, remove more fast conv crap 2020-10-25 17:16:47 -07:00
George Hotz ea41f5e1c1 seems more generic 2020-10-25 16:40:37 -07:00
George Hotz 2333c4dea7 no tqdm in actions 2020-10-25 16:40:08 -07:00
George Hotz ad48061927 better sort in torch profiler 2020-10-25 16:07:49 -07:00
George Hotz 82f8e10813 no hacks in that test 2020-10-25 15:52:05 -07:00
George Hotz 4baa4c041f it's crazy how much faster pytorch is than numpy 2020-10-25 15:42:33 -07:00
George Hotz 5ddbd7f04b 2 to 3x slower than torch 2020-10-25 15:27:33 -07:00