Commit Graph

83 Commits

George Hotz 4da2ddea6e
Interpreted cleanups (#2312)
* move the compiler out of ops

* don't return realized

* var_vals filter, fix custom

* typing
2023-11-15 09:02:23 -08:00
chenyu a753c8e071
examples of new GPT2 and JIT change (#2261)
* var_vals are global

* working with global ish

* better

* fix export model

* fix tests

* better kv cache

* does it run?

* use where for kvmask

* fix excessive var_vals

* fix import

* how does multigpu use this?

* llama kinda work

* faster and simpler

* cleanup

* fix conversation mode

* test cleanups

* fix one more test

* test cleanup

---------

Co-authored-by: George Hotz <geohot@gmail.com>
2023-11-10 15:07:02 -05:00
George Hotz 0c9b4ab885
no to_underlying (#2222)
* no to_underlying

* context is no longer used

* no more optimizing

* update docs
2023-11-05 21:34:20 -08:00
George Hotz f17bc16f46
simple runtime args (#2211)
* simple runtime args

* fix some tests

* fix abstractions and triton

* fix search
2023-11-03 12:31:29 -07:00
George Hotz 03cf0afa4f
move all to compile api (#2203)
* move metal+clang to compile api

* all to the new style

* remove binary arg

* fix triton

* fixup tests

* fix clang

* diskcache is generic

* __wrapped__

* compile_gpu

* fix thneed

* keep the src in the ASTRunner

* lib

* move compile_gpu

* compile_gpu in device

* put compiler in astrunner

* test reverts

* triton compiler

* ugh, that too
2023-11-01 23:01:32 -07:00
chenyu 5d5921d2c8
small doc env update (#2112) 2023-10-18 14:49:25 -07:00
George Hotz c36d306606
KOPT is over, BEAM is upstream (#2071)
* create cache for q learning

* make linter happy

* global beam

* where it belongs

* bugfix

* ditch the kopt, use the beam

* faster lin and DEBUG=2 okay

* remove kopt, move search to features
2023-10-16 09:46:03 -07:00
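
Beam search replaces the old KOPT kernel optimizer and is driven by the BEAM environment variable. A minimal sketch, assuming BEAM acts as a search-width knob read at kernel-compile time:

```python
import os
os.environ["BEAM"] = "2"  # assumed: beam width; set before any kernel compiles

from tinygrad.tensor import Tensor

# kernels realized from here on are tuned by beam search instead of KOPT
out = (Tensor.randn(512, 512) @ Tensor.randn(512, 512)).realize()
```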
George Hotz 121f7aa8c5
Schedule item (#2012)
* ScheduleItem

* put var_vals in the schedule

* fix tests, wow that proliferated quickly

* not ready to be in the schedule
2023-10-07 08:59:25 -07:00
Roelof van Dijk 972d9ea215
fix: PRUNEGRAPH is unused (#1985) 2023-10-05 14:28:43 -07:00
George Hotz de5d603ec1
corealize + remove realize from lazybuffer (#1968)
* corealize + remove realize from lazybuffer

* fix multigpu

* fix graph
2023-10-04 10:59:31 -07:00
nimlgen 2ea1dd3e87
no process() in Linearizer (#1966)
* no process() in Linearizer

* more process() clean up
2023-10-04 07:18:42 -07:00
George Hotz 0945848b5f
schedule the loadops like everything else (#1964)
* schedule the loadops like everything else

* unify loadops with other things we schedule

* delete all the ops

* fix symbolic jit
2023-10-04 02:36:04 -07:00
Yixiang Gao 094d3d71be
with Tensor.train() (#1935)
* add with.train

* remove the rest of the TODOs

* fix pyflake

* fix pyflake error

* fix mypy
2023-09-28 18:02:31 -07:00
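
A minimal sketch of the context manager this PR adds, assuming (from the title) that it toggles Tensor.training for the duration of the block:

```python
from tinygrad.tensor import Tensor

x = Tensor.randn(2, 8)
with Tensor.train():
    assert Tensor.training  # flipped on inside the block
    y = x.dropout(0.5)      # dropout only drops while training
assert not Tensor.training  # restored on exit
```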
George Hotz adab724caa
schedule2, keep the tests working with small changes (#1932)
* lazy cleanups

* ast functions take in LazyOps

* op instead of self.op

* _base for mops

* fix contiguous

* start schedule

* test_schedule

* fix openpilot

* more tests

* bugfix and test skip

* work

* make sure things get freed

* fix zerosized tensors

* fix failing test

* fix ceil and friends

* fix openpilot

* disable training

* disable test collectives
2023-09-28 09:14:43 -07:00
George Hotz c907efbf4a
reorder a few things (#1915)
* reorder a few things

* huh, that has to be there

* move apply shapetracker

* BufferOps

* only for type checking
2023-09-25 10:17:21 +08:00
George Hotz 20059dc55b
Make ShapeTracker Immutable (#1909)
* ugh

* ops test pass

* fix shapetracker tests

* sym shapetracker

* shapetracker is a tuple of views now

* from_shape

* fix has variable shape

* key isn't needed

* post init assert
2023-09-24 21:09:03 +08:00
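
A hedged sketch of what "a tuple of views" means downstream; from_shape is named in the bullets above, while the permute call is assumed from tinygrad's movement-op API:

```python
from tinygrad.shape.shapetracker import ShapeTracker

st = ShapeTracker.from_shape((4, 8))  # construct, don't mutate
st2 = st.permute((1, 0))              # assumed: movement ops return a new tracker
assert isinstance(st.views, tuple)    # the tracker is a tuple of views now
assert st2.shape == (8, 4)
```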
George Hotz 7ff7aacdb4
LazyOp out of Linearizer (#1908)
* loadop buffer on cpu

* works for GPU

* sort of working

* has bugs

* gpu tests pass

* fix some tests

* fix tensor cores

* fix test linearizer

* fix symbolic

* fix has_variable_shape

* non symbolic size

* disable weird test

* simple cache fix

* fix custom function

* fix kopt

* cleanups

* a bit broken on the assign

* contig check

* only buffer

* need that order

* idx

* dedup buffers

* hmm, bugfix

* fix tensor cores

* opts device
2023-09-24 14:30:53 +08:00
George Hotz 97dc813329
Revert "All LazyOps in the Linearizer (#1905)" (#1907)
This reverts commit a5820390db.
2023-09-24 11:51:22 +08:00
George Hotz a5820390db
All LazyOps in the Linearizer (#1905)
* loadop buffer on cpu

* works for GPU

* sort of working

* has bugs

* gpu tests pass

* fix some tests

* fix tensor cores

* fix test linearizer

* fix symbolic

* fix has_variable_shape

* non symbolic size

* disable weird test

* simple cache fix

* fix custom function

* fix kopt

* cleanups

* a bit broken on the assign

* contig check

* only buffer

* need that order

* idx
2023-09-24 11:50:00 +08:00
George Hotz 9cf13bd055
rename reduce_op (#1900)
* rename reduce_op

* more design v2
2023-09-23 11:27:36 +08:00
chenyu b8fde6bb0f
Test KOPT in CI (#1744)
* test kopt in ci

* getenv takes dtype from default
2023-09-03 14:37:20 -07:00
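
The second bullet refers to tinygrad.helpers.getenv casting the environment string to the type of the default; a small sketch of that behavior:

```python
from tinygrad.helpers import getenv

kopt = getenv("KOPT", 0)       # int default, so the env string is cast to int
name = getenv("NAME", "")      # str default, so the value stays a string
print(type(kopt), type(name))  # <class 'int'> <class 'str'>
```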
crankygrumpster c8025c319c
Remove Token from abstractions.py (#1741)
* Remove Token from abstractions.py, update output string

* add dtype
2023-09-02 21:56:11 -07:00
George Hotz 453e437598
move stuff in the linearizer (#1726)
* move stuff in linearizer

* move stuff in linearizer

* minor

* fix opts import
2023-08-31 14:42:09 -07:00
nimlgen 1c0449e190
add cache collector (#1595)
* init cache collector

* add test_cache_collector.py

* switch GlobalCounters.cache to CacheCollector

* init jit models test

* jitted SD

* add debug msg to print loaded bufs count

* moved cache collector to jit

* clearer SD

* no double device import
2023-08-28 19:59:55 -07:00
wozeparrot f61d0657d1
document new envvars (#1676)
* feat: document some new envvars

* feat: actually put values

* feat: no more cifar torch

* feat: no fakedata
2023-08-26 20:17:02 -04:00
DavidFarago 1ba8f0dca3
Quickstart: Upgrade section "Training" to new code (#1663)
Co-authored-by: Dave Farago <dfarago@innoopract.com>
2023-08-24 17:12:16 -04:00
DavidFarago 29adae84eb
Quickstart: Use tensors to compute train accuracy (#1662)
Co-authored-by: Dave Farago <dfarago@innoopract.com>
2023-08-24 17:09:12 -04:00
George Hotz a6d842af7a
move device to ops (#1646)
* move device to ops

* mlops types

* 2 lines
2023-08-23 08:30:17 -07:00
Niklas D a7752ad65d
Fix link to state.py in quickstart (#1632) 2023-08-22 17:39:30 -04:00
George Hotz 718ced296c
move state to nn/state (#1619) 2023-08-22 07:36:24 -07:00
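
After this move the serialization helpers import from tinygrad.nn.state; a minimal sketch, with helper names assumed from the current layout:

```python
from tinygrad import nn
from tinygrad.nn.state import get_state_dict, safe_save, safe_load, load_state_dict

model = nn.Linear(4, 2)
safe_save(get_state_dict(model), "model.safetensors")   # dump weights
load_state_dict(model, safe_load("model.safetensors"))  # restore in place
```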
Umut Zengin f720682beb
np.argmax to Tensor.argmax (#1608)
* to tensor argmax

* removed keepdim

* training update
2023-08-21 15:22:29 -07:00
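
The point of the change is computing argmax with Tensor ops on-device rather than round-tripping through numpy; a short sketch (keepdim was removed, per the bullets):

```python
from tinygrad.tensor import Tensor

t = Tensor([[1.0, 5.0, 2.0],
            [7.0, 0.0, 3.0]])
idx = t.argmax(axis=1)  # stays a Tensor; no np.argmax round-trip
print(idx.numpy())      # indices of the per-row maxima: 1 and 0
```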
Yixiang Gao 4d54afb6df
sparse cat cross entropy (#1597)
* add sparse cat cross entropy

* minor fix

* add log_softmax into loss function

* add test

* update docs

* fix training loss

* add device
2023-08-21 14:14:54 -07:00
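
A hedged usage sketch; the method name follows the PR title, and per the bullets it folds log_softmax into the loss:

```python
from tinygrad.tensor import Tensor

logits = Tensor.randn(4, 10)   # raw scores: 4 samples, 10 classes
labels = Tensor([3, 1, 0, 7])  # integer class ids, no one-hot needed
loss = logits.sparse_categorical_crossentropy(labels)
print(loss.numpy())            # scalar mean loss
```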
George Hotz 2e60920317
Revert "sparse cat cross entropy (#1591)" (#1596)
This reverts commit f0ee850e98.
2023-08-21 10:04:26 -07:00
Yixiang Gao f0ee850e98
sparse cat cross entropy (#1591)
* add sparse cat cross entropy

* minor fix

* add log_softmax into loss function

* add test

* update docs
2023-08-21 09:56:41 -07:00
chenyu ae39cf84ab
Symbolic Shape JIT main PR (#1353)
* Symbolic Shape JIT

update tests

2 variables symbolic ops, adding more tests

test passing

cleanup

* more test cases

* single flag

* review update

* jit attention one piece

* realize

* symbolic_jit test for cuda

* old artifact

* works with cuda gpu but failed ci

* CUDACPU
2023-08-18 14:39:55 -07:00
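
A rough sketch of the idea as this PR's tests exercise it (import paths as of this point in the log; the binding API has shifted across versions, so take the reshape-to-Variable pattern as illustrative): one jitted kernel serves every concrete size in the Variable's range.

```python
from tinygrad.tensor import Tensor
from tinygrad.jit import TinyJit
from tinygrad.shape.symbolic import Variable

@TinyJit
def add(a: Tensor, b: Tensor) -> Tensor:
    return (a + b).realize()

vi = Variable("i", 1, 10)  # a dimension known only by its range
for i in range(1, 5):
    a, b = Tensor.rand(3, i), Tensor.rand(3, i)
    out = add(a.reshape(3, vi), b.reshape(3, vi)).reshape(3, i)
    # same compiled kernel each iteration; i travels through var_vals
```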
George Hotz d24f936501
just cmplt (#1493)
* just cmplt

* fix maximum

* don't save, there's no backward

* ugh, no slot either

* eq is a scam
2023-08-08 13:58:10 -07:00
George Hotz 7b8d06c9f1
test uops (#1444)
* test uops

* tests should pass

* improve uops

* precision
2023-08-05 12:35:56 -07:00
George Hotz 84c430355e
fix backends for new style (#1443)
* fix backends for new style

* fix method cache

* fix fakeless

* llvm blacklist

* fix kernel optimizer
2023-08-05 11:07:04 -07:00
Alex Telon b66361843a
Timing and Context can now be used as decorators (#1385)
* Context and Timing can now be used as decorators

* Using Timing decorator in quickstart.md

The time formatting is better and the decorator is a useful tool to learn.

Old: Time: 3.5260659999912605
New: Time: 3526.14 ms

* Updated env_vars documentation for Context

* Added test for Context decorator

* Put new import on same line as others
2023-08-01 17:16:10 -07:00
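
A sketch of the two helpers used as decorators, per this PR; Context still works as a with-block, and DEBUG here is just an example ContextVar:

```python
from tinygrad.helpers import Timing, Context

@Timing("step: ")        # prints e.g. "step: 3526.14 ms" when the call returns
def step(n: int) -> int:
    return sum(range(n))

with Context(DEBUG=2):   # the pre-existing context-manager form
    step(10_000_000)
```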
chenyu 940b6fd21a
Revert "Fix constant folding for Tensor([3]) (#1227)" (#1274)
This reverts commit ab645317c9.
2023-07-19 10:51:06 -07:00
David Hou 56ee97b37f
dedup kernel args v2 (#1272)
* new version

* fix abstractions

* try remove test

* Revert "try remove test"

This reverts commit 2fc18a9f8ed180540baf73d32b568262709822f1.

* assert_allclose

* minimize the test

* minimize the test

* minimize the test

* minimize the test

* Revert "minimize the test"

This reverts commit e0c092959636109f745d1c8a73f2db90c75fe3c1.

* Revert "minimize the test"

This reverts commit 88240551b13403b21a81765043d5736103a49293.

* Revert "minimize the test"

This reverts commit 78328a7ce27328c8bf9a325ae017cc2a4d98f65b.

* Revert "minimize the test"

This reverts commit 989523fded4319b13db047e45ad8c35c861a36aa.

* skip test inside body

* oops

* oops
2023-07-18 20:03:42 -07:00
Adrian Kretz 5a8ad57163
Add WHERE ternary (or trinary?) op (#1196)
* Rename FusedOps to TernaryOps

* Support ternary broadcast

* Add where llop and mlop

* Make where op work in cstyle codegen

* Don't skip test_inf_where

* Add backward path to where op

* Use bool in cstyle codegen

* Add LLVM where op

* Add numpy where op

* Add torch where op

* Simplify where mlop

* Update documentation

* Forgot a rename

* Merged relevant changes from PR #1195 onto PR #1196

* Add test to cover changes to linearizer.ast_parse for WHERE op

Without this, METAL will try to use a ternary op on float4 and fail

* Make where op work in wgsl backend

* Allow ternary ops to be merged

* Make mypy happy

---------

Co-authored-by: Francis Lam <flam@alum.mit.edu>
2023-07-16 00:31:55 -07:00
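
A minimal sketch of the new ternary at the Tensor level, assuming the condition.where(if_true, if_false) calling convention with broadcasting (second bullet):

```python
from tinygrad.tensor import Tensor

cond = Tensor([[1.0, 0.0],
               [0.0, 1.0]])
out = cond.where(Tensor(2.0), Tensor(7.0))  # scalars broadcast against cond
print(out.numpy())                          # [[2. 7.], [7. 2.]]
```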
chenyu ab645317c9
Fix constant folding for Tensor([3]) (#1227)
* Fix constant folding for Tensor([3])

* Remove duplicated prod import

* load in the same device

* better numpy

* add constant fold shape test cases

* improve tests
2023-07-11 14:01:32 -07:00
George Hotz 2952b8e7a8
Fix up abstractions.py to include the Linearizer (#1177)
* fix up docs

* remove pow, add sqrt
2023-07-07 18:33:51 -07:00
terafo aa60feda48
Fix naming conflict with huggingface datasets (#1161)
* Rename in files

* Move files

* Moved to extra/datasets as suggested

* Changes to files

* Fixed stupid mistake

---------

Co-authored-by: terafo <terafo@protonmail.com>
2023-07-07 10:43:44 -07:00
Eli Frigo 801564f31b
Remove POW llop and add SQRT llop (#1104)
* fixed division by zero for fast operations

* made et closer to 0

* replace POW llop with SQRT

* updated mlops to swap SQRT and POW llops

* updated hlops to swap POW and SQRT

* added sqrt llop to cpu runtime

* added sqrt llop to cstyle codegen

* added POW llop to llvm ir codegen

* added SQRT llop to torch runtime

* moved pow from mlops to hlops

* found a better way to do reverse pow

* fixed indentation

* added SQRT llop to triton

* update docs to match new llops

* removed POW operator from assembly codegen

* added sqrt and rsqrt to pow hlop

* rewrote pow function in tensor.py

* Adjust tolerance

* Adjust for adamw

* Reduce for Adam too

* removed accidental leftover code

* removed all of accidental code

* added rsqrt test

* removed pow from mlops again

it was added back when resolving merge conflicts

---------

Co-authored-by: Jacky Lee <jla524@sfu.ca>
2023-07-05 18:07:58 -07:00
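
Net effect: SQRT becomes a primitive low-level op while pow turns into a composite high-level op built on top of it; user-facing behavior is unchanged. A hedged sketch:

```python
from tinygrad.tensor import Tensor

t = Tensor([1.0, 4.0, 9.0])
print(t.sqrt().numpy())    # [1. 2. 3.] -- lowers to the new SQRT llop
print((t ** 0.5).numpy())  # same result; pow is now composed in hlops
print(t.rsqrt().numpy())   # reciprocal sqrt, covered by the new rsqrt test
```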
tricky-labyrinth fd98f6cffa
Small fix to abstractions.py so it runs on Windows without throwing an AttributeError (#1109)
Co-authored-by: Tricky Labyrinth <trickylabyrinth@gmail.com>
2023-07-03 13:44:49 -07:00
Reza Rezvan 8ae9a054ae
Refactor nn.optim (#1091)
* Refactor: nn.optim.py

* Refactor: nn.optim.py; Fix all tests

* Refactor: Replace all optim.get_parameters()

* Refactor: Revert list comp.

* Refactor: Replace optim.get_state_dict

* Refactor: Change quickstart.md
2023-07-02 15:07:30 -07:00
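
Call sites after the refactor no longer reach parameter collection through optim; a minimal sketch (import path shown as of the later nn/state move, #1619 above):

```python
from tinygrad import nn
from tinygrad.nn.state import get_parameters  # was optim.get_parameters
from tinygrad.nn.optim import SGD

model = nn.Linear(10, 2)
opt = SGD(get_parameters(model), lr=0.01)     # optimizer just takes the list
```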
foreign-sub 574cbda979
Quickstart (#1015)
* fix quickstart md

* add quickstart to ci
2023-06-29 13:26:58 -07:00
ernie 4d703be6d7
fix typo (#1065) 2023-06-27 10:56:54 -07:00