Commit Graph

488 Commits

Author SHA1 Message Date
George Hotz a18c1f3178 zero out the inputs 2022-10-20 13:46:52 -07:00
George Hotz ace8db29f8 ReduceSum 2022-10-20 12:48:14 -07:00
George Hotz c400ee0beb
refactoring thneed (#400)
* refactoring thneed

* continue

* minor update

* looks like it's working

* big refactor

* confirm thneed got the right output

* code is there but it's broken

* works now

* always OPTWG, input -> dat

* fix type issue
2022-10-20 12:35:59 -07:00
YassineYousfi ae0f9b17df
openpilot: new models and onnx ops (#401)
* ngrl stuff

* fngrl

* fix typo in compile script

* workflow dispatch

* new models in tests

* don't need to up this threshold

Co-authored-by: HaraldSchafer <harald.the.engineer@gmail.com>
2022-10-20 11:49:19 -07:00
George Hotz ff11c4316b move get_parameters to optim.py 2022-09-25 13:16:58 -04:00
Jacky Lee 2c01a66265
Reshape dataset from fetch_mnist (#390) 2022-09-24 21:16:29 -04:00
George Hotz 271446e3eb
set requires_grad to None (#387)
* set requires_grad to None

* some things need gradients

* hmm, why was get_parameters filtering
2022-09-21 11:16:02 -04:00
YassineYousfi 2f0f91ba3d
support float16 onnx weights (#384) 2022-09-15 09:12:18 -04:00
YassineYousfi 1a7bdc51f8
support more onnx ops (#376)
* broadcast from right to left

* add another broadcasted add test

* more onnx ops

* use float32 range in clip
2022-09-07 15:15:24 -07:00
George Hotz 0516359af8 fix stupid OPENCL=1 OOM 2022-09-06 14:29:23 -07:00
George Hotz 4dadd95e3c fix tests hopefully, more stable diffusion 2022-09-03 10:38:31 -07:00
George Hotz c01a8c5c2d stable diffusion start 2022-09-03 10:08:42 -07:00
George Hotz a3fc64a585 fix batchnorm folding in openpilot compile 2022-08-31 13:04:49 -07:00
George Hotz dc7af8c3ac thneed run float32 2022-08-28 11:03:35 -07:00
George Hotz b132de677d
tinygrad.nn (#367)
* tinygrad.nn

* flake8

* working on pylint

* more pylint

* more pylint

* pylint passes

* networkx

* mypy can't infer that type

* junk
2022-08-18 07:41:00 -07:00
George Hotz f76d41812b prune graph 2022-07-17 15:38:43 -07:00
George Hotz eda6f071b2 default opt level 2 2022-07-17 14:54:40 -07:00
George Hotz 73b0471b25 join expands 2022-07-17 13:42:05 -07:00
George Hotz d04b274cd2 noop removal can replace with reshape 2022-07-16 08:32:42 -07:00
George Hotz 2720ef49ca extra and test and tuple 2022-07-07 10:01:33 -07:00
George Hotz 81b73f97a3
Optimization (#355)
* constant folding into kernels

* that opt worth it?

* fix mypy

* ast one kernel

* save 2 lines in conv kernel

* debug print kernel count

* cl debugging

* early realize inputs

* refactor Device
2022-07-04 08:58:57 -07:00
George Hotz 7276f8d6bf improve constant folding, detach before moving tensor 2022-07-02 15:29:40 -07:00
George Hotz 8cf1aed0f4 don't track_running_stats, parameters must require_grad 2022-07-02 14:38:45 -07:00
George Hotz 49c954b389 comments 2022-06-26 17:20:25 -07:00
George Hotz 83d50e2687 move to extra.onnx 2022-06-21 19:43:44 -07:00
George Hotz 9b27ba650b load new torch files 2022-06-07 10:06:48 -07:00
George Hotz 233c71a7ba support requires_grad 2022-06-06 07:47:31 -07:00
George Hotz d8d19ed468 wikimedia wasn't returning 200 2022-01-15 19:09:29 -08:00
George Hotz e28cdfb0cf clean up resnet 2021-11-30 16:14:54 -05:00
George Hotz 58ed46963e fix broadcastdot 2021-11-29 18:54:57 -05:00
George Hotz dca076dbf1 remove dumb nn ops 2021-11-29 18:05:31 -05:00
George Hotz 30eb3afbe1 add bias term to transformer 2021-11-29 12:45:27 -05:00
George Hotz e2a8961a18 less lines, fix bug 2021-11-17 12:52:17 -08:00
George Hotz ba28761894 move yolo into examples/yolo 2021-10-30 19:46:00 -07:00
George Hotz 63f50cff45 move back again 2021-10-30 16:13:29 -07:00
Evan Mays 285621aeda
Cherry backprop for conv2d (#281)
* quick math: 0 + x = x.

* gradient w.r.t. x using cherry for conv

* gradient w.r.t. w for conv on cherry but doing vector dot products

* small optimization

* [cherry] optimize conv backpass for large channel count

* get rid of numpy einsum
2021-10-30 16:12:19 -07:00
George Hotz 3d646272d6 move back 2021-10-30 16:12:12 -07:00
George Hotz ac8afd24fa refactor accel 2021-10-30 16:10:59 -07:00
Guglielmo Camporese 2b7589db64
Added ResNet-{18, 34, 50, 101, 152} (#271)
* added resnets

* fix minor

* fix minor

* resnet in models

* added resnet test

* added resnet train test

* added linear, conv2d nn tests

* fix minor in extra/training

* resnet in models

* fix minor

* fix tolerance for linear in nn test

* fix eval, this was causing CPU and GPU UT failures

* revert transformer test

* fix minor for CPU test

* improved model get_params for sequential layer

* fix minor for params counting

* commented broken ops tests

* improved train for resnet
2021-06-21 09:37:24 -07:00
George Hotz 89798d2f43 some flags 2021-06-19 11:46:31 -07:00
George Hotz d81eae8288 debug cherry crash 2021-06-19 11:41:20 -07:00
George Hotz d3f169b267 move good models to models, add a training step test 2021-06-19 11:24:15 -07:00
George Hotz b48d4bad2e clean up print spam 2021-06-19 10:31:04 -07:00
George Hotz 027535d0b5 microcoded matmul 2021-06-17 21:03:08 -07:00
George Hotz 026e2ae6a7 three registers and a zero command 2021-06-17 17:09:18 -07:00
George Hotz 2e71ae33f6 max op works 2021-06-17 17:01:21 -07:00
George Hotz 9e12c1bbba cherry binop 2021-06-17 16:50:40 -07:00
George Hotz fcdabea880 training mnist with cherry ops 2021-06-17 16:45:35 -07:00
George Hotz 2affd226b3 speed up sum 2021-06-17 16:38:34 -07:00
George Hotz e8eb7d1b7e max op 2021-06-17 16:20:56 -07:00
George Hotz c1d469d440 sum op 2021-06-17 16:19:35 -07:00
George Hotz b1000d866e readme, plus reduce ops 2021-06-16 11:21:06 -07:00
George Hotz ff3fdc58e5 risk -> cherry 2021-06-16 09:59:48 -07:00
George Hotz 2f91c012eb build note 2021-06-15 22:41:41 -07:00
George Hotz 4850d6eb43 update todo 2021-06-15 10:22:39 -07:00
George Hotz 4e1edb3692 have tinygrad log the loads 2021-06-14 18:35:14 -07:00
George Hotz 93f2e9769d little note 2021-06-14 15:49:41 -07:00
George Hotz a89d12d735 wow, way faster 2021-06-10 17:11:39 -07:00
George Hotz 10b1306525 binops 2021-06-10 16:52:37 -07:00
George Hotz 4535d39baa comments and pow 2021-06-10 09:03:40 -07:00
George Hotz 2075fdeb4f
FPGA Based Accelerator for Tinygrad (#258)
* ops_risk

* risk sim

* guessing is for winners

* minor

* better

* matmul with risk

* conv doesn't work

* closer

* conv2d works

* ops_risk

* opt2 works

* opt1 may not be possible

* opt1 is a mulacc

* arty

* attosoc example building on mac

* minor

* riscv assembler

* gucci gang

* we got C code

* not a scam

* hello

* make risk mergeable into master

* unop support
2021-06-07 17:45:09 -07:00
Josh Smith ad756f6112
minor optimizations & cleaning (#257)
* use isinstance, some optimizations & whitespace removal

* revert whitespace changes

* revert more whitespace

* some more cleanup

* revert fstring (not a fan of the {{}})

* fix typo

* fix typo
2021-06-02 09:57:15 -07:00
George Hotz b80cacb416 fix GPU efficientnet example 2021-05-26 17:29:35 -07:00
20kdc 2653d33292
vgg7 (image upscaling) implementation - not the best, but it works (#255)
* vgg7 implementation - not the best, but it works

* VGG7 implementation: Spread nansbane to deter NaNs, maybe improved training experience

* VGG7 implementation: Fix training, for real this time

Results actually attempt to approximate the input

* VGG7 implementation: Sample probability management
2021-05-12 23:48:51 -07:00
George Hotz ac229ea750 remove print 2021-01-02 12:53:30 -08:00
George Hotz 895d142503 start trying to load yolo v5 2021-01-02 12:51:55 -08:00
Marcel Bischoff 42b4761025
transformer >99.98% test accuracy in ~30s (#230)
* transformer

* BS might divide len(Y_test)

* output when accuracy is high

* more readable

* fixed loss in serious_mnist for new API
2021-01-02 07:45:09 -08:00
Liam ebd72ff437
Test split (#231)
* Split tests

Split tests into "Test CPU" and "Test GPU".

Add test flag "TEST_DEVICES" which is a comma-separated list of devices:
CPU,GPU,ANE

* Run tests based on provided TEST_DEVICES flag

By default it will run all of "CPU,GPU,ANE"

* fix bad quote

* Revert changes and use GPU=1

This is done through setting the default Tensor Device to Device.CPU unless
GPU=1 is set.

Run GPU tests: GPU=1 pytest -s -v
2021-01-01 09:19:03 -05:00
George Hotz f9170505b3 if you like your transformers twice as slow, use the GPU 2020-12-29 17:14:23 -05:00
George Hotz 3f8e137b6f extra/transformer 2020-12-29 14:14:00 -05:00
Marcel Bischoff dc8fa7999c
Transpose on GPU (#221)
* 2serious

* load/save

* fixing GPU

* added DEBUG

* needs BatchNorm or doesn't learn anything

* old file not needed

* added conv biases

* added extra/training.py and checkpoint

* assert in test only

* save

* padding

* num_classes

* checkpoint

* checkpoints for padding

* training was broken

* merge

* rotation augmentation

* more aug

* needs testing

* streamline augment, augment is fast thus bicubic

* tidying up

* transformer eval

* axis=-1

* transpose

* test for permutation using torch.movedims

* another test

* line
2020-12-29 10:40:11 -05:00
George Hotz bcb3ceeca3 set training in functions 2020-12-28 22:45:46 -05:00
Marcel Bischoff ffff98db78
Evaluation in Transformers (#218)
* 2serious

* load/save

* fixing GPU

* added DEBUG

* needs BatchNorm or doesn't learn anything

* old file not needed

* added conv biases

* added extra/training.py and checkpoint

* assert in test only

* save

* padding

* num_classes

* checkpoint

* checkpoints for padding

* training was broken

* merge

* rotation augmentation

* more aug

* needs testing

* streamline augment, augment is fast thus bicubic

* tidying up

* transformer eval
2020-12-28 09:24:51 -05:00
George Hotz d864e1c71a transformer is training 2020-12-27 18:46:32 -05:00
George Hotz a361ef6861 fixup training loop 2020-12-27 18:35:56 -05:00
Nicklas Boman 06f359baa3
issue-193 - Move torch loader out of efficientnet code (#213) 2020-12-22 00:19:16 -05:00
iainwo 56d44637f3
fixed pylint, formatted python files with cblack on localhost (#204)
* fixed pylint, formatted python files with cblack on localhost

* Revert "fixed pylint, formatted python files iwth cblack on localhost"

This reverts commit 07e2b88466fa53399ad78d962ffb2ad55bc45344.

* dedented 4-spaces, added linter

Co-authored-by: Iain Wong <iainwong@outlook.com>
2020-12-17 14:37:31 -08:00
Liam bcf1518309
All devices are equal! (#196)
* Update all devices to be tested

ANE, CPU and OCL all now support all tests.

However tests are not currently passing on GPU and I cannot test on CPU.

Failing GPU tests are not an issue caused by this update. Tests have not
been passing due to the required "six" package being missing.

OpenCL Tests have not been run since commit: 1a1c63a08b

Devices have 3 types and are handled by a new DeviceTypes enum. (The goal
is to revert to Tensor.<type>, but this current setup allows for keyword
argument defaults: `device=DeviceType.CPU`)

All references to Tensor.GPU/CPU/ANE have been converted to the
corresponding `DeviceTypes` enum.

Refactor of the conversion code to allow for any device to any device
conversion.

* Add six dependency in requirements.txt

* Resolve failure to run tests

Move six into gpu required installs. Remove six from standard
installation.

* Remove repeated data conversion

* Refactor method names

Also reduce code with .to and .to_

* Dynamic device handlers

* Refactor DeviceTypes -> Device

* Add mem copy profiling back

* test_backward_pass_diamond_model passing

* Resolve Sum issue on GPU

* Revert batchnorm2d tests

* Update README with updated API

* ANE testing with

* Last minute line gains
2020-12-15 23:44:08 -08:00
Marcel Bischoff da72a0eed4
Big MNIST model with PIL augmentation and load/save (#160)
* 2serious

* load/save

* fixing GPU

* added DEBUG

* needs BatchNorm or doesn't learn anything

* old file not needed

* added conv biases

* added extra/training.py and checkpoint

* assert in test only

* save

* padding

* num_classes

* checkpoint

* checkpoints for padding

* training was broken

* merge

* rotation augmentation

* more aug

* needs testing

* streamline augment, augment is fast thus bicubic

* tidying up
2020-12-13 20:45:55 -08:00
George Hotz 07ece2105e actually move it 2020-12-12 15:26:58 -08:00
George Hotz 1d10559d1d tinygrad.utils -> extra.utils 2020-12-12 15:26:07 -08:00
George Hotz 00312b8ad1 batchnorm work 2020-12-06 14:40:07 -08:00
George Hotz da514c2918 fix enet init 2020-12-06 13:52:07 -08:00
George Hotz 521098cc2f se optional, track time better 2020-12-06 12:29:42 -08:00
George Hotz 609d11e699 trainer works with CIFAR 2020-12-06 12:20:14 -08:00
George Hotz 03994e0011 load torch files without torch 2020-11-21 13:43:53 -08:00
George Hotz 2ffb8de1ea move efficientnet to extra 2020-11-16 08:08:07 -08:00
George Hotz 13d34373d1 move gradcheck to extra, clean up unbroadcast 2020-11-16 08:03:31 -08:00