George Hotz
df64658a2c
weee, opencl tests in CI
2020-11-10 10:04:45 -08:00
George Hotz
d47a128812
pocl
2020-11-10 10:02:13 -08:00
George Hotz
c05401a9ca
sudo maybe
2020-11-10 09:53:49 -08:00
George Hotz
09bc8eddfe
clinfo
2020-11-10 09:51:38 -08:00
George Hotz
58e703d099
fix tests
2020-11-10 09:49:19 -08:00
George Hotz
23405cec43
intel opencl
2020-11-10 09:41:40 -08:00
George Hotz
33090c4b0d
install more
2020-11-10 09:34:56 -08:00
George Hotz
a52590e76c
cpu opencl maybe
2020-11-10 09:32:54 -08:00
George Hotz
f513302955
refactor profiler
2020-11-10 07:31:16 -08:00
adamritter
f27628b21c
No separate pad2d kernel needed (#99)
Co-authored-by: holonomicjl <58403584+holonomicjl@users.noreply.github.com>
2020-11-10 06:47:53 -08:00
George Hotz
2d4a5d5950
readme
2020-11-10 01:27:04 -08:00
George Hotz
6e6bcbe5f2
shapes on backward
2020-11-10 01:23:31 -08:00
Ryan Neph
56f71ae8e5
Cleanup (#96)
* init GPU supsample retbuf to 0
* reduce GPU kernel source lines
ref: #94
2020-11-10 01:20:04 -08:00
George Hotz
55012d21bb
debug in backward pass too
2020-11-10 01:19:52 -08:00
George Hotz
5d1985312c
miniprofiler is real
2020-11-10 01:05:29 -08:00
George Hotz
c76a20b4be
4s and 7s work
2020-11-10 00:54:17 -08:00
George Hotz
ae0cd17c2d
debug is env var, and simpler faster pad2d
2020-11-10 00:42:23 -08:00
George Hotz
f7d10d5639
DEBUG flag
2020-11-10 00:36:59 -08:00
George Hotz
6a56d5d030
remove pyopencl
2020-11-09 23:58:35 -08:00
George Hotz
943ff6490d
logic error okay too
2020-11-09 23:55:21 -08:00
George Hotz
29b22d117f
will gpu tests work?
2020-11-09 23:51:29 -08:00
George Hotz
abbf0d1328
cleanup logsoftmax
2020-11-09 23:47:04 -08:00
George Hotz
aeb90226a8
reduce lines with reduce_op
2020-11-09 23:36:44 -08:00
George Hotz
55c914912d
minor cleanup
2020-11-09 23:14:59 -08:00
George Hotz
d41ad2bf37
uint2 saves lines
2020-11-09 23:11:20 -08:00
George Hotz
8b23033fa9
support all the enet sizes
2020-11-09 18:04:16 -08:00
George Hotz
9db95ab942
fix enet padding
2020-11-09 17:56:57 -08:00
George Hotz
866b759d3b
match torch api for pad2d
2020-11-09 17:48:56 -08:00
George Hotz
daf073535f
new -> zeros
2020-11-09 17:31:52 -08:00
Ryan Neph
16d564a53c
finish unsupporting strided pool, add global avg pool test (#92)
2020-11-09 17:31:22 -08:00
George Hotz
7ac1b163a5
add backward to enet train
2020-11-09 16:05:52 -08:00
George Hotz
8ca9c0205f
train_efficientnet is broken still
2020-11-09 16:01:16 -08:00
George Hotz
870b84a893
test pad2d backward on GPU
2020-11-09 15:50:43 -08:00
adamritter
b541c05d88
Pad2d backward pass on GPU (#89)
* Pad2d backward pass on GPU
* Faster Pad2D GPU backward pass (no zeroing needed)
* Fix out of bounds error
* Don't save prg
Co-authored-by: holonomicjl <58403584+holonomicjl@users.noreply.github.com>
2020-11-09 15:49:37 -08:00
George Hotz
e46d122f65
not supporting stride
2020-11-09 15:06:58 -08:00
Ryan Neph
c21c2a0b62
revert b0c0c5d: Strided Pool funcs (#74) (#87)
Strided CPU Pooling was introduced but assumes small kernel size
(<=(10,10)), but efficientnet.py feeds kernel_size=(112,112).
This causes a huge array buffer allocation in stack_for_pool() that
hangs inference for a long time or until system OOM.
Revert CPU Pooling for now, and re-introduce #74 later with a new
global-average-pooling op that can be used instead of avgpool2d with
large kernel size for efficientnet inference.
Co-authored-by: Ryan Neph <ryanneph@google.com>
2020-11-09 14:58:18 -08:00
George Hotz
53157fb876
add back scale
2020-11-09 10:20:56 -08:00
George Hotz
3ffbd47335
Revert "Revert "pygame is fine, cv2 can also do the trick (#79)" (#85)"
This reverts commit 6b982621f8.
2020-11-09 10:18:48 -08:00
George Hotz
6b982621f8
Revert "pygame is fine, cv2 can also do the trick (#79)" (#85)
This reverts commit e7f2f43331.
2020-11-09 10:03:38 -08:00
dustcollector12
e7f2f43331
pygame is fine, cv2 can also do the trick (#79)
* pygame is fine, cv2 can also do the trick
* retimg and copy constructor not needed
* shape is missing without copy constructor
* retimg put back
* addressing capture buffering
2020-11-09 10:02:06 -08:00
Ryan Neph
7e515308a5
label op subtests by params (#83)
2020-11-09 06:25:06 -08:00
Ryan Neph
5bedf566d1
tests should use rtol unless special case (#82)
2020-11-08 17:25:11 -08:00
Ryan Neph
04b9312a34
Fix GPU Pooling bug at boundary + better Pooling test coverage (#81)
* fixed Pooling bug
* Clarify Pooling tests
2020-11-08 17:25:01 -08:00
niclaswue
c57b1b9e7d
deleted unnecessary import in utils (#78)
2020-11-08 15:55:16 -08:00
Ryan Neph
b0c0c5d0d6
strided Pool funcs (#74)
* *Pool2D GPU forward supports stride
* kernel_size from ctx instead of saved_tensors
* *Pool2D CPU forward supports stride
* update ctx.stride properly
2020-11-08 11:45:55 -08:00
George Hotz
06504a5824
bump version
2020-11-08 09:34:07 -08:00
ziofil
db3eccc16b
implemented backward for Pad2D & test (#73)
2020-11-07 21:58:42 -08:00
George Hotz
75d69e956f
readme more
2020-11-07 21:58:20 -08:00
Dimitar Vagalinski
35a5c82a2a
done as he said (#71)
2020-11-07 18:28:39 -08:00
Ryan Neph
5265f6c578
add AvgPool2D backward pass on GPU (#68)
2020-11-07 12:27:29 -08:00