Marcel Bischoff
42b4761025
transformer >99.98% test accuracy in ~30s ( #230 )
* transformer
* BS might divide len(Y_test)
* output when accuracy is high
* more readable
* fixed loss in serious_mnist for new API
2021-01-02 07:45:09 -08:00
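The "BS might divide len(Y_test)" bullet above is about evaluation batching. A minimal sketch of a batched accuracy loop that also copes with a trailing partial batch (names such as `predict` are illustrative, not the script's actual API):

```python
import numpy as np

def accuracy(predict, X_test, Y_test, BS=128):
    # predict: callable mapping a (batch, ...) array to (batch, classes) logits.
    # The last slice may be shorter than BS when BS does not divide len(Y_test).
    correct = 0
    for i in range(0, len(Y_test), BS):
        logits = predict(X_test[i:i+BS])
        correct += int((np.argmax(logits, axis=-1) == Y_test[i:i+BS]).sum())
    return correct / len(Y_test)
```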
Liam
ebd72ff437
Test split ( #231 )
* Split tests
Split tests into "Test CPU" and "Test GPU".
Add test flag "TEST_DEVICES" which is a comma-separated list of devices:
CPU,GPU,ANE
* Run tests based on the provided TEST_DEVICES flag
By default, all of "CPU,GPU,ANE" will run.
* fix bad quote
* Revert changes and use GPU=1
This is done by setting the default Tensor Device to Device.CPU if GPU=1 is set.
Run GPU tests: GPU=1 pytest -s -v
2021-01-01 09:19:03 -05:00
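One plausible shape for the GPU=1 switch described above (an illustrative sketch, not the repo's actual test harness): the CPU tests always run, and the GPU variants are skipped unless the GPU=1 environment variable is set.

```python
import os
import unittest
import numpy as np

GPU = os.getenv("GPU", None) is not None  # enabled by: GPU=1 pytest -s -v

class TestOpsCPU(unittest.TestCase):
    gpu = False
    def test_add(self):
        np.testing.assert_allclose(np.ones(4) + np.ones(4), np.full(4, 2.0))

@unittest.skipUnless(GPU, "GPU=1 not set")
class TestOpsGPU(TestOpsCPU):
    gpu = True  # in a real suite this flag would route tensors to the GPU device

if __name__ == "__main__":
    unittest.main()
```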
George Hotz
4a7cf2e420
more reordering
2020-12-31 09:58:02 -05:00
George Hotz
92abe43683
reduce before binary because of unbroadcasting
2020-12-31 09:49:52 -05:00
George Hotz
4291002881
reorder GPU ops
2020-12-31 09:46:39 -05:00
George Hotz
de7fe085de
no read out of bounds
2020-12-31 09:41:36 -05:00
George Hotz
1fb5fcafce
GPU slice should fix tests
2020-12-31 09:37:03 -05:00
Liam
e972a45456
Dynamically register ops to Tensor ( #232 )
* Dynamically register ops to Tensor
This saves lines and reduces redundant repetition.
* ffs spacing
you don't pay me enough!
2020-12-31 09:10:19 -05:00
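A minimal sketch of the idea behind "Dynamically register ops to Tensor" (assumed from the message, not the PR's exact code): each op is a small class with a `forward`, and a loop attaches every such class in the module to `Tensor` under its lowercased name instead of hand-writing one method per op.

```python
import inspect
import sys
from functools import partialmethod

class Tensor:
    def __init__(self, data):
        self.data = data

class Add:
    @staticmethod
    def forward(x, y): return x.data + y.data

class Mul:
    @staticmethod
    def forward(x, y): return x.data * y.data

def register(name, op):
    # tensor.add(other) -> Tensor(Add.forward(tensor, other)), and so on for every op
    setattr(Tensor, name, partialmethod(lambda self, op, *args: Tensor(op.forward(self, *args)), op))

# register every op class found in this module under its lowercased name
for name, op in inspect.getmembers(sys.modules[__name__], inspect.isclass):
    if hasattr(op, "forward"):
        register(name.lower(), op)

print(Tensor(2).add(Tensor(3)).data)  # 5
print(Tensor(2).mul(Tensor(3)).data)  # 6
```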
Marcel Bischoff
e2f833f58f
max to behave on ties like torch ( #229 )
* checkpoint
* fixing pow
* undo pow
* backward max on GPU and CPU rewrite
* indentation
* changing seed for curiosity
* max replaced equality
* undo seed
* rebase
* fixed tests
* merge error
2020-12-30 18:52:50 -05:00
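The change above concerns how the backward pass of a max reduction handles ties. A numpy sketch of the torch-style behavior as I read the message (not the PR's actual implementation): every element equal to the maximum gets an equal share of the incoming gradient, via an equality mask rather than a single argmax.

```python
import numpy as np

def max_backward(x, grad_output=1.0):
    mask = (x == x.max()).astype(np.float32)  # 1.0 at every tied maximum
    return mask * grad_output / mask.sum()    # split the gradient evenly across ties

print(max_backward(np.array([1.0, 3.0, 3.0, 2.0])))  # [0.  0.5 0.5 0. ]
```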
George Hotz
30f8132646
reorder ops in ops cpu
2020-12-30 11:00:01 -05:00
George Hotz
e5b2803b5d
ops in readme
2020-12-30 10:48:55 -05:00
George Hotz
2d44bf7f1a
Dot -> Matmul
2020-12-30 10:41:51 -05:00
George Hotz
10fc3ff5b9
cleaner syntax
2020-12-30 10:35:37 -05:00
George Hotz
fcfe3dae01
write slice for CPU
2020-12-30 10:32:53 -05:00
George Hotz
47504004fd
ane ops
2020-12-29 18:00:53 -05:00
George Hotz
1f5c9618ef
refactor in readme and issue #225
2020-12-29 17:30:04 -05:00
George Hotz
f9170505b3
if you like your transformers twice as slow, use the GPU
2020-12-29 17:14:23 -05:00
George Hotz
6a6a82e999
support multidot on GPU
2020-12-29 16:56:30 -05:00
George Hotz
27208d729b
add GPU max thanks to marcelbischoff
2020-12-29 16:44:14 -05:00
George Hotz
4bbad11afe
link to papers
2020-12-29 14:15:46 -05:00
George Hotz
3f8e137b6f
extra/transformer
2020-12-29 14:14:00 -05:00
George Hotz
c4e7a1ae59
accessors are dumb
2020-12-29 14:10:26 -05:00
George Hotz
fb6aaefb9b
save 2 lines
2020-12-29 14:02:50 -05:00
George Hotz
ea341c84fe
logsoftmax good, div bad
2020-12-29 13:59:39 -05:00
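"logsoftmax good, div bad" above points at the usual numerical-stability argument: compute log-softmax with the log-sum-exp trick and recover softmax as its exp, instead of dividing exponentials directly. A quick numpy illustration (not the repo's code):

```python
import numpy as np

def logsoftmax(x):
    # subtract the max so exp() cannot overflow, then subtract log-sum-exp
    x = x - x.max(axis=-1, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=-1, keepdims=True))

x = np.array([[1000.0, 1001.0, 1002.0]])
print(np.exp(logsoftmax(x)))  # stable; the naive exp(x) / exp(x).sum() overflows here
```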
George Hotz
f18801c7db
simple pool. swimming is very easy now
2020-12-29 13:48:50 -05:00
George Hotz
8f9232d59b
readme
2020-12-29 13:40:34 -05:00
George Hotz
837aaacfbf
Unpad2D on GPU:
2020-12-29 13:16:14 -05:00
George Hotz
02655c07d5
break maxpool2d on GPU
2020-12-29 13:05:57 -05:00
George Hotz
061e37de39
touchups
2020-12-29 12:41:21 -05:00
George Hotz
a2e6562330
fix max op, less lines
2020-12-29 10:47:04 -05:00
Marcel Bischoff
dc8fa7999c
Transpose on GPU ( #221 )
* 2serious
* load/save
* fixing GPU
* added DEBUG
* needs BatchNorm or doesn't learn anything
* old file not needed
* added conv biases
* added extra/training.py and checkpoint
* assert in test only
* save
* padding
* num_classes
* checkpoint
* checkpoints for padding
* training was broken
* merge
* rotation augmentation
* more aug
* needs testing
* streamline augment, augment is fast thus bicubic
* tidying up
* transformer eval
* axis=-1
* transpose
* test for permutation using torch.movedims
* another test
* line
2020-12-29 10:40:11 -05:00
George Hotz
36579f66bf
max op
2020-12-28 23:54:52 -05:00
George Hotz
bcb3ceeca3
set training in functions
2020-12-28 22:45:46 -05:00
George Hotz
51bf164b72
dropout, training
2020-12-28 22:12:23 -05:00
George Hotz
7b8fee038d
it works! forgot the sqrt
2020-12-28 16:23:52 -05:00
George Hotz
1faf05ef67
ahh, it's better if i don't train the embedding
2020-12-28 16:07:02 -05:00
George Hotz
c3832e1bde
hmm, fix layernorm to not be batchnorm and it breaks
2020-12-28 13:06:21 -05:00
George Hotz
2e89e75dcb
layernorm fixes transformer instability
2020-12-28 12:58:15 -05:00
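The two layernorm entries above hinge on which axis gets normalized. An illustrative numpy sketch of the distinction (not the repo's implementation): layernorm normalizes each sample over its feature axis, whereas batchnorm normalizes each feature over the batch axis.

```python
import numpy as np

def layernorm(x, eps=1e-5):
    # per-sample normalization over the last (feature) axis
    mu  = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

x = np.random.randn(8, 16)                  # (batch, features)
print(layernorm(x).mean(axis=-1).round(6))  # ~0 for every sample, independent of batch statistics
```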
George Hotz
628d21f899
doc touchup
2020-12-28 10:45:26 -05:00
George Hotz
fafece9db7
avgpool2d is a second class op
2020-12-28 10:41:59 -05:00
George Hotz
593233b668
log and exp are first class ops
2020-12-28 10:00:30 -05:00
Marcel Bischoff
ffff98db78
Evaluation in Transformers ( #218 )
* 2serious
* load/save
* fixing GPU
* added DEBUG
* needs BatchNorm or doesn't learn anything
* old file not needed
* added conv biases
* added extra/training.py and checkpoint
* assert in test only
* save
* padding
* num_classes
* checkpoint
* checkpoints for padding
* training was broken
* merge
* rotation augmentation
* more aug
* needs testing
* streamline augment, augment is fast thus bicubic
* tidying up
* transformer eval
2020-12-28 09:24:51 -05:00
George Hotz
65b07d2f4f
fix onehot embed
2020-12-27 18:50:38 -05:00
George Hotz
d864e1c71a
transformer is training
2020-12-27 18:46:32 -05:00
George Hotz
a361ef6861
fixup training loop
2020-12-27 18:35:56 -05:00
George Hotz
f15bec6dbc
make multidot work on CPU
2020-12-27 17:25:37 -05:00
George Hotz
131e04c90c
cpu only decorator
2020-12-27 17:18:55 -05:00
George Hotz
2f1b2c0a3b
add transpose, start on transformer
2020-12-27 16:59:12 -05:00
gamwe6
d379502c04
Cleaning ( #211 )
* Cleaned
* Brought the lines into line
Co-authored-by: gamwe6 <gamwe6@users.noreply.github.com>
2020-12-27 09:58:51 -05:00
George Hotz
8a335f03ad
clock speed 32x32
2020-12-22 18:18:52 -05:00