* quick math: 0 + x = x.
* gradient w.r.t. x using cherry for conv
* gradient w.r.t. w for conv on cherry but doing vector dot products
* small optimization
* [cherry] optimize conv backpass for large channel count
* get rid of numpy einsum
* added resnets
* fix minor
* fix minor
* resnet in models
* added resnet test
* added resnet train test
* added linear, conv2d nn tests
* fix minor in extra/training
* resnet in models
* fix minor
* fix tolerance for linear in nn test
* fix eval, this causes cpu and gpu UT failing
* revert transformer test
* fix minor for CPU test
* improved model get_params for sequential layer
* fix minor for params counting
* commented broken ops tests
* improved train for resnet
* ops_risk
* risk sim
* guessing is for winners
* minor
* better
* matmal with risk
* conv doesn't work
* closer
* conv2d works
* ops_risk
* opt2 works
* opt1 may not be possible
* opt1 is a mulacc
* arty
* attosoc example building on mac
* minor
* riscv assembler
* gucci gang
* we got C code
* not a scam
* hello
* make risk mergeable into master
* unop support
* use isinstance, some optimizations & whitespace removal
* revert whitespace changes
* revert more whitespace
* some more cleanup
* revert fstring (not a fan of the {{}})
* fix typo
* fix typo
* vgg7 implementation - not the best, but it works
* VGG7 implementation: Spread nansbane to deter NaNs, maybe improved training experience
* VGG7 implementation: Fix training, for real this time
Results actually attempt to approximate the input
* VGG7 implementation: Sample probability management
* Split tests
Split tests into "Test CPU" and "Test GPU".
Add test flag "TEST_DEVICES" which is a comma separated list of devices:
CPU,GPU,ANE
* Run tests based on provided TEST_DEVICES flag
By default will run all "CPU,GPU,ANE"
* fix bad quote
* Revert changes and use GPU=1
This is done through setting the default Tensor Device to Device.CPU of
GPU=1 is set.
Run GPU tests: GPU=1 pytest -s -v
* 2serious
* load/save
* fixing GPU
* added DEBUG
* needs BatchNorm or doesn't learn anything
* old file not needed
* added conv biases
* added extra/training.py and checkpoint
* assert in test only
* save
* padding
* num_classes
* checkpoint
* checkpoints for padding
* training was broken
* merge
* rotation augmentation
* more aug
* needs testing
* streamline augment, augment is fast thus bicubic
* tidying up
* transformer eval
* axis=-1
* transpose
* test for permutation using torch.movedims
* another test
* line
* 2serious
* load/save
* fixing GPU
* added DEBUG
* needs BatchNorm or doesn't learn anything
* old file not needed
* added conv biases
* added extra/training.py and checkpoint
* assert in test only
* save
* padding
* num_classes
* checkpoint
* checkpoints for padding
* training was broken
* merge
* rotation augmentation
* more aug
* needs testing
* streamline augment, augment is fast thus bicubic
* tidying up
* transformer eval