Commit Graph

30 Commits

chenyu 507e0afba0
fix onehot and jit in examples/transformer (#3073)
trained to 0.999 in < 6 seconds on M1 Max consistently
2024-01-10 02:22:41 -05:00
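A hedged sketch of the two pieces this fix touches, one-hot targets and a TinyJit-compiled step; the imports assume the current top-level tinygrad layout, and the helper names are illustrative, not the commit's actual diff:

```python
import numpy as np
from tinygrad import Tensor, TinyJit

def one_hot(labels: np.ndarray, num_classes: int) -> Tensor:
  # np.eye row-indexing is a simple way to one-hot integer labels
  return Tensor(np.eye(num_classes, dtype=np.float32)[labels])

@TinyJit
def step(x: Tensor, y: Tensor) -> Tensor:
  # TinyJit caches the compiled kernels after warm-up calls;
  # input shapes must stay fixed across invocations
  return ((x - y) ** 2).mean().realize()
```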
George Hotz ae83733431 hotfix: examples/transformer.py 2024-01-09 19:28:09 -08:00
George Hotz 0cbf6c1811
move things, clean up extra (#2292)
* move things

* idk why pylint needs that now

* delete unused
2023-11-13 20:18:40 -08:00
George Hotz 718ced296c
move state to nn/state (#1619) 2023-08-22 07:36:24 -07:00
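After this move, state utilities import from tinygrad.nn.state; a minimal sketch assuming the post-#1619 layout:

```python
from tinygrad import Tensor
from tinygrad.nn.state import get_parameters, get_state_dict, safe_save

class TinyModel:
  def __init__(self): self.w = Tensor.ones(4, 4)

model = TinyModel()
print(get_parameters(model))                # flat list of Tensors for an optimizer
safe_save(get_state_dict(model), "/tmp/model.safetensors")
```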
nimlgen d363d25ee2
fix imports for examples/transformer.py (#1136) 2023-07-05 08:15:13 -07:00
Eli Frigo 10f1aeb144
fixed broken link (#1097) 2023-07-02 15:06:59 -07:00
Fernando Vidal 73bd0b217b
add int64 as supported dtype from numpy (#699)
* add int64 as supported dtype from numpy

Without this, examples/transformer.py didn't run. With this change it runs successfully.

* Update helpers.py

* Update transformer.py

* Update training.py
2023-03-18 17:15:04 -07:00
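A minimal sketch of what the int64 fix enables, assuming current tinygrad; int64 is numpy's default integer dtype on Linux and macOS, which is why the example script hit it:

```python
import numpy as np
from tinygrad.tensor import Tensor

idx = np.arange(10)  # dtype int64 by default on Linux/macOS
t = Tensor(idx)      # conversion that previously raised on int64
print(t.numpy())
```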
Jacky Lee 9fd41632c6
Import get_parameters from tinygrad.nn (#559)
* get_parameters is in optim

* Update all imports for get_parameters

* Clean up

* use optim.get_parameters
2023-02-17 15:22:26 -08:00
Jacky Lee f08187526f
Fix examples (#540)
* Fix examples

* Remove training in parameters

* Simplify a bit

* Remove extra import

* Fix linter errors

* factor out Device

* NumPy-like semantics for Tensor.__getitem__ (#506); see the indexing sketch after this entry

* Rewrote Tensor.__getitem__ to fix negative indices and add support for np.newaxis/None

* Fixed pad2d

* mypy doesn't know about mlops methods

* normal python behavior for out-of-bounds slicing

* type: ignore

* inlined idxfix

* added comment for __getitem__

* Better comments, better tests, and fixed bug in np.newaxis

* update cpu and torch to hold buffers (#542)

* update cpu and torch to hold buffers

* save lines, and probably faster

* Mypy fun (#541)

* mypy fun

* things are just faster

* running fast

* mypy is fast

* compile.sh

* no gpu hack

* refactor ops_cpu and ops_torch to not subclass

* make weak buffer work

* tensor works

* fix test failing

* cpu/torch cleanups

* no | (union) operator on dict in python 3.8

* that was junk

* fix warnings

* comment and touchup

* dyn add of math ops

* refactor ops_cpu and ops_torch to not share code

* nn/optim.py compiles now

* Reorder imports

* call mkdir only if directory doesn't exist

---------

Co-authored-by: George Hotz <geohot@gmail.com>
Co-authored-by: Mitchell Goff <mitchellgoffpc@gmail.com>
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2023-02-10 12:09:37 -06:00
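An illustration of the NumPy-like indexing folded into this entry (negative indices and None/np.newaxis); a sketch against current tinygrad, not the PR's own test code:

```python
from tinygrad.tensor import Tensor

t = Tensor.arange(6).reshape(2, 3)
print(t[-1].numpy())      # negative index selects the last row
print(t[None].shape)      # None inserts a leading axis -> (1, 2, 3)
print(t[:, -2:].numpy())  # negative slices behave like numpy
```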
Jacky Lee 799b3f185a
Refactor getenv into helpers (#508)
* Refactor getenv into helpers

* Remove unused os

* Fix default value

* Fix more defaults for CI

* Fix bracket

* Revert changes to openpilot/compile.py

* Use getenv from helpers when possible
2023-01-31 15:09:09 -08:00
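A sketch of the pattern this refactor standardizes, reading tunables through one helper; getenv int-casts the environment value and falls back to the default:

```python
from tinygrad.helpers import getenv

DEBUG = getenv("DEBUG")  # 0 when the variable is unset
BS = getenv("BS", 128)   # explicit default
```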
George Hotz b132de677d
tinygrad.nn (#367)
* tinygrad.nn

* flake8

* working on pylint

* more pylint

* more pylint

* pylint passes

* networkx

* mypy can't infer that type

* junk
2022-08-18 07:41:00 -07:00
George Hotz 99b6051467 add ff_dim to transformer 2021-11-29 12:40:52 -05:00
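ff_dim is the hidden width of the transformer block's feed-forward sublayer; a hedged sketch of the shape it controls, written with today's tinygrad.nn.Linear (which postdates this commit), names illustrative:

```python
from tinygrad import Tensor
from tinygrad.nn import Linear

class FeedForward:
  def __init__(self, embed_dim: int, ff_dim: int):
    self.up = Linear(embed_dim, ff_dim)    # widen to ff_dim
    self.down = Linear(ff_dim, embed_dim)  # project back to embed_dim
  def __call__(self, x: Tensor) -> Tensor:
    return self.down(self.up(x).relu())
```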
George Hotz d3f169b267 move good models to models, add a training step test 2021-06-19 11:24:15 -07:00
Marcel Bischoff 42b4761025
transformer >99.98% test accuracy in ~30s (#230)
* transformer

* BS might divide len(Y_test)

* output when accuracy is high

* more readable

* fixed loss in serious_mnist for new API
2021-01-02 07:45:09 -08:00
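The "BS might divide len(Y_test)" bullet points at a batched-evaluation detail: the batch size may not evenly divide the test set, so the last short slice still has to be scored. A sketch with illustrative names, not the PR's code:

```python
import numpy as np

def accuracy(predict, X_test: np.ndarray, Y_test: np.ndarray, bs: int = 128) -> float:
  correct = 0
  for i in range(0, len(Y_test), bs):  # the final slice may be shorter than bs
    preds = predict(X_test[i:i+bs]).argmax(axis=-1)
    correct += int((preds == Y_test[i:i+bs]).sum())
  return correct / len(Y_test)
```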
George Hotz f9170505b3 if you like your transformers twice as slow, use the GPU 2020-12-29 17:14:23 -05:00
George Hotz 3f8e137b6f extra/transformer 2020-12-29 14:14:00 -05:00
George Hotz bcb3ceeca3 set training in functions 2020-12-28 22:45:46 -05:00
George Hotz 51bf164b72 dropout, training 2020-12-28 22:12:23 -05:00
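A sketch of the training switch these two commits introduce, using the modern Tensor.training flag; dropout randomly zeroes activations and rescales while training, and is the identity at eval:

```python
from tinygrad import Tensor

Tensor.training = True
x = Tensor.randn(4, 8)
print(x.dropout(0.5).numpy())  # ~half zeroed, survivors scaled by 2x
Tensor.training = False        # dropout now passes x through unchanged
```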
George Hotz 7b8fee038d it works! forgot the sqrt 2020-12-28 16:23:52 -05:00
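The forgotten sqrt is almost certainly the attention scale, softmax(QK^T / sqrt(d_k))V: without it the logits grow with head width and the softmax saturates. A minimal sketch assuming current tinygrad ops:

```python
import math
from tinygrad import Tensor

def attention(q: Tensor, k: Tensor, v: Tensor) -> Tensor:
  d_k = q.shape[-1]
  scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)  # the forgotten sqrt
  return scores.softmax(-1) @ v
```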
George Hotz 1faf05ef67 ahh, it's better if i don't train the embedding 2020-12-28 16:07:02 -05:00
George Hotz c3832e1bde hmm, fix layernorm to not be batchnorm and it breaks 2020-12-28 13:06:21 -05:00
George Hotz 2e89e75dcb layernorm fixes transformer instability 2020-12-28 12:58:15 -05:00
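The distinction these two commits wrestle with: layernorm normalizes each sample over its feature axis, while batchnorm normalizes each feature over the batch axis. A from-scratch sketch, illustrative rather than the repo's implementation:

```python
from tinygrad import Tensor

def layernorm(x: Tensor, eps: float = 1e-5) -> Tensor:
  mu = x.mean(axis=-1, keepdim=True)                 # per-sample statistics
  var = ((x - mu) ** 2).mean(axis=-1, keepdim=True)
  return (x - mu) / (var + eps).sqrt()               # axis=0 here would make it batchnorm-like
```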
George Hotz 593233b668 log and exp are first class ops 2020-12-28 10:00:30 -05:00
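With log and exp as first-class ops, numerically stable pieces like log-sum-exp (the heart of logsoftmax) compose directly from primitives; a sketch, not the repo's definition:

```python
from tinygrad import Tensor

def logsumexp(x: Tensor) -> Tensor:
  m = x.max(axis=-1, keepdim=True)  # shift by the max for stability
  return m + (x - m).exp().sum(axis=-1, keepdim=True).log()

def logsoftmax(x: Tensor) -> Tensor:
  return x - logsumexp(x)
```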
Marcel Bischoff ffff98db78
Evaluation in Transformers (#218)
* 2serious

* load/save

* fixing GPU

* added DEBUG

* needs BatchNorm or doesn't learn anything

* old file not needed

* added conv biases

* added extra/training.py and checkpoint

* assert in test only

* save

* padding

* num_classes

* checkpoint

* checkpoints for padding

* training was broken

* merge

* rotation augmentation

* more aug

* needs testing

* streamline augment, augment is fast thus bicubic

* tidying up

* transformer eval
2020-12-28 09:24:51 -05:00
George Hotz 65b07d2f4f fix onehot embed 2020-12-27 18:50:38 -05:00
George Hotz d864e1c71a transformer is training 2020-12-27 18:46:32 -05:00
George Hotz a361ef6861 fixup training loop 2020-12-27 18:35:56 -05:00
George Hotz f15bec6dbc make multidot work on CPU 2020-12-27 17:25:37 -05:00
George Hotz 131e04c90c cpu only decorator 2020-12-27 17:18:55 -05:00
George Hotz 2f1b2c0a3b add transpose, start on transformer 2020-12-27 16:59:12 -05:00