Commit Graph

31 Commits

Author SHA1 Message Date
nimlgen 1c0449e190
add cache collector (#1595)
* init cache collector

* add test_cache_collector.py

* switch GlobalCounters.cache to CacheCollector

* init jit models test

* jitted SD

* add debug msg to print loaded bufs count

* moved cache collector to jit

* clearer SD

* no double device import
2023-08-28 19:59:55 -07:00
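
The commit above moved kernel-cache collection into the JIT machinery behind `TinyJit`. A minimal sketch of how a jitted function is used, assuming the 2023-era import path `tinygrad.jit` (newer releases re-export `TinyJit` from the top-level package):

```python
from tinygrad.tensor import Tensor
from tinygrad.jit import TinyJit  # import path assumed from the 2023-era commits above

@TinyJit
def step(x: Tensor) -> Tensor:
    return (x * 2).realize()  # jitted functions must return realized tensors

# Early calls trace and capture the kernels; later calls with same-shaped
# inputs replay the captured cache instead of re-scheduling the graph.
for _ in range(5):
    out = step(Tensor.randn(4, 4))
```
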
George Hotz 718ced296c
move state to nn/state (#1619) 2023-08-22 07:36:24 -07:00
David Heidelberg 13659ac6fa
examples: numpy() array returns only one value, not an array (#1534)
Fixes issue:
```
    loss_cpu = loss.detach().numpy()[0]
               ~~~~~~~~~~~~~~~~~~~~~^^^
IndexError: too many indices for array: array is 0-dimensional, but 1 were indexed
```

Signed-off-by: David Heidelberg <david@ixit.cz>
2023-08-13 14:33:05 -07:00
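
For context on the fix above: a 0-dimensional NumPy array holds a single scalar and cannot be indexed with `[0]`. A minimal self-contained sketch of the failure and the corrected read (the repair is inferred from the commit message, not copied from the patch):

```python
import numpy as np

loss = np.array(0.5)        # 0-dimensional, like a scalar loss tensor's .numpy()
try:
    loss_cpu = loss[0]      # IndexError: too many indices for array
except IndexError:
    loss_cpu = float(loss)  # a 0-d array converts directly to a Python scalar
print(loss_cpu)             # 0.5
```
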
Reza Rezvan 8ae9a054ae
Refactor nn.optim (#1091)
* Refactor: nn.optim.py

* Refactor: nn.optim.py; Fix all tests

* Refactor: Replace all optim.get_parameters()

* Refactor: Revert list comp.

* Refactor: Replace optim.get_state_dict

* Refactor: Change quickstart.md
2023-07-02 15:07:30 -07:00
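
This refactor, together with the later "move state to nn/state" commit above it, tracks one API migration: parameter collection moved out of `optim` and into the state module. A minimal sketch of the post-refactor training setup, assuming the current `tinygrad.nn.state.get_parameters` location (module paths inferred from these commit messages):

```python
from tinygrad.tensor import Tensor
import tinygrad.nn as nn
from tinygrad.nn.optim import SGD
from tinygrad.nn.state import get_parameters  # formerly optim.get_parameters

Tensor.training = True                     # gradients are tracked in training mode
model = nn.Linear(4, 2)
opt = SGD(get_parameters(model), lr=0.01)  # collects the weight and bias tensors

x = Tensor.randn(8, 4)
loss = model(x).mean()
opt.zero_grad()
loss.backward()
opt.step()
```
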
George Hotz 2e56a4793e rename log_softmax, support dim, fix onnx Softmax 2023-02-24 10:11:24 -08:00
George Hotz 7d33f2d659 CL.CACHE is over, GlobalCounters.cache is it 2023-02-11 12:00:14 -08:00
George Hotz fed95119dc CL.mem_used -> GlobalCounters.mem_used 2023-02-10 23:13:29 -06:00
Jacky Lee f08187526f
Fix examples (#540)
* Fix examples

* Remove training in parameters

* Simplify a bit

* Remove extra import

* Fix linter errors

* factor out Device

* NumPy-like semantics for Tensor.__getitem__ (#506)

* Rewrote Tensor.__getitem__ to fix negative indices and add support for np.newaxis/None

* Fixed pad2d

* mypy doesn't know about mlops methods

* normal python behavior for out-of-bounds slicing

* type: ignore

* inlined idxfix

* added comment for __getitem__

* Better comments, better tests, and fixed bug in np.newaxis

* update cpu and torch to hold buffers (#542)

* update cpu and torch to hold buffers

* save lines, and probably faster

* Mypy fun (#541)

* mypy fun

* things are just faster

* running fast

* mypy is fast

* compile.sh

* no gpu hack

* refactor ops_cpu and ops_torch to not subclass

* make weak buffer work

* tensor works

* fix test failing

* cpu/torch cleanups

* no or operator on dict in python 3.8

* that was junk

* fix warnings

* comment and touchup

* dyn add of math ops

* refactor ops_cpu and ops_torch to not share code

* nn/optim.py compiles now

* Reorder imports

* call mkdir only if directory doesn't exist

---------

Co-authored-by: George Hotz <geohot@gmail.com>
Co-authored-by: Mitchell Goff <mitchellgoffpc@gmail.com>
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2023-02-10 12:09:37 -06:00
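
The `Tensor.__getitem__` rewrite folded into the merge above is easiest to see by example. A minimal sketch of the NumPy-like semantics its bullets describe (negative indices, `None`/`np.newaxis`, and Python-style out-of-bounds slicing), assuming a current tinygrad install:

```python
from tinygrad.tensor import Tensor

t = Tensor.ones(3, 4)
print(t[-1].shape)       # (4,)       negative indices count from the end
print(t[None].shape)     # (1, 3, 4)  None inserts a new axis, as in NumPy
print(t[:, 1:10].shape)  # (3, 3)     out-of-bounds slices clamp, normal Python behavior
```
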
George Hotz a5a55ac19e GlobalCounters cache + assign in optim 2023-02-08 17:10:55 -06:00
George Hotz 3d63934995
refactor to keep cl in the runtime (#545)
* refactor to keep cl in the runtime

* fix thneed, rename cl to _cl

* bugfix + _cuda

* fix tests

* thneed more correct
2023-02-08 16:46:09 -06:00
George Hotz 2844482a60
Mypy fun (#541)
* mypy fun

* things are just faster

* running fast

* mypy is fast

* compile.sh

* no gpu hack

* refactor ops_cpu and ops_torch to not subclass

* make weak buffer work

* tensor works

* fix test failing

* cpu/torch cleanups

* no or operator on dict in python 3.8

* that was junk

* fix warnings

* comment and touchup
2023-02-08 09:56:51 -06:00
James Roberts db0a9b0a2d
Refactor CL.time_sum into GlobalCounters (#519) 2023-02-01 20:13:56 -08:00
Jacky Lee 799b3f185a
Refactor getenv into helpers (#508)
* Refactor getenv into helpers

* Remove unused os

* Fix default value

* Fix more defaults for CI

* Fix bracket

* Revert changes to openpilot/compile.py

* Use getenv from helpers when possible
2023-01-31 15:09:09 -08:00
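
A minimal sketch of what a centralized `getenv` helper of this kind might look like; the type-coercing signature is an assumption for illustration, not a copy of tinygrad's `helpers.py`:

```python
import os

def getenv(key: str, default=0):
    # Read an environment variable, coerced to the type of the default,
    # so DEBUG=2 arrives as the int 2 when the default is an int.
    return type(default)(os.getenv(key, default))

DEBUG = getenv("DEBUG")       # 0 unless the DEBUG env var is set
CACHE = getenv("CLCACHE", 1)  # flag named in the log below; default on
```
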
George Hotz 2db272c7f7
Kernel Optimizer (#489)
* kernel optimizer

* 10x faster, but wrong. not a good deal

* move test -> extra

* print x speedup

* clcache

* fix clcache + DEBUG

* GFLOPS estimate

* i==3
2023-01-29 17:15:00 -08:00
George Hotz 66da3bc3c0 reset the benchmark timer 2023-01-25 09:20:34 -08:00
George Hotz a0d169eb59 fix efficientnet 2022-09-28 14:23:01 -07:00
George Hotz b132de677d
tinygrad.nn (#367)
* tinygrad.nn

* flake8

* working on pylint

* more pylint

* more pylint

* pylint passes

* networkx

* mypy can't infer that type

* junk
2022-08-18 07:41:00 -07:00
George Hotz acbeaf0ba9 adam in benchmark_train_efficientnet 2022-07-19 09:33:07 -07:00
George Hotz d985217fa4 skip reduce noops 2022-07-16 07:47:43 -07:00
George Hotz 5e46561f7e no_grad = NOT backward 2022-07-10 20:54:57 -07:00
George Hotz d5d9cffe7c training param for batchnorm 2022-07-04 13:28:03 -07:00
George Hotz 34f43ea10e LAZY and CLCACHE are defaults 2022-07-04 13:09:15 -07:00
George Hotz b7afd83267 track cl mem used 2022-07-04 12:19:00 -07:00
George Hotz d5de8452c6 dashed loadops 2022-07-04 09:50:56 -07:00
George Hotz 7276f8d6bf improve constant folding, detach before moving tensor 2022-07-02 15:29:40 -07:00
George Hotz 0cb99d72e9 NUM=-1 is a small efficientnet for small people 2022-07-02 15:11:51 -07:00
George Hotz 8cf1aed0f4 don't track_running_stats, parameters must require_grad 2022-07-02 14:38:45 -07:00
George Hotz f607f18006 fix backward 2022-06-25 00:00:53 -07:00
George Hotz ec30f0402f improve benchmark_train_efficientnet 2022-06-24 23:46:38 -07:00
George Hotz d748353ce5 err, okay, a bit more off 2022-06-24 22:44:57 -07:00
George Hotz bdde95f16e CACHE_LAZYBUFFERS options + benchmark. only a couple x from torch 2022-06-24 22:33:53 -07:00