wozeparrot
8b354b3f73
feat: version bump! ( #1687 )
2023-08-27 12:38:58 -04:00
Roelof van Dijk
abaa605f71
[ready] perf: start enumerate at 1 instead of checking all i ( #1691 )
Co-authored-by: Roelof van Dijk <roelof.van.dijk@vitestro.com>
2023-08-27 12:00:32 -04:00
Roelof van Dijk
2730ed657f
perf: faster lazyop eq ( #1693 )
Co-authored-by: Roelof van Dijk <roelof.van.dijk@vitestro.com>
2023-08-27 11:17:02 -04:00
Roelof van Dijk
6ca509a485
perf: constant in while in for in busy func ( #1688 )
Co-authored-by: Roelof van Dijk <roelof.van.dijk@vitestro.com>
2023-08-27 11:13:16 -04:00
Roelof van Dijk
b89d81330f
fix: restore old behaviour ( #1689 )
Co-authored-by: Roelof van Dijk <roelof.van.dijk@vitestro.com>
2023-08-27 10:45:53 -04:00
chenyu
66fbf4800b
fix symbolic_ops tests with Tensor.training=True ( #1686 )
2023-08-26 23:19:56 -04:00
Roelof van Dijk
6c5dc9c153
[ready] perf: faster lazyop init ( #1673 )
Co-authored-by: Roelof van Dijk <roelof.van.dijk@vitestro.com>
2023-08-26 22:59:10 -04:00
wozeparrot
f61d0657d1
document new envvars ( #1676 )
* feat: document some new envvars
* feat: actually put values
* feat: no more cifar torch
* feat: no fakedata
2023-08-26 20:17:02 -04:00
Yixiang Gao
9d93a82354
remove FAKEDATA ( #1685 )
2023-08-26 20:15:54 -04:00
chenyu
b5d700adae
update openpilot supercombo.onnx to 0.9.4 ( #1681 )
* update openpilot supercombo.onnx to 0.9.4
* update tests for the new model
* comment out comma models from external_model_benchmark
2023-08-26 19:16:08 -04:00
Roelof van Dijk
89b529c07f
[ready] ci: add py38 to linters ( #1674 )
* ci: add py38 to linters
* fix: run linters only on py38
---------
Co-authored-by: Roelof van Dijk <roelof.van.dijk@vitestro.com>
2023-08-26 09:34:15 -04:00
Jordan Wright
25be7f745d
Tensor.uniform with dtype=int bug fix ( #1593 )
2023-08-26 01:59:53 -04:00
Roelof van Dijk
f702a8f497
[ready] avoid in-function graph imports in lazy.py ( #1666 )
Co-authored-by: Roelof van Dijk <roelof.van.dijk@vitestro.com>
2023-08-25 13:56:28 -04:00
Roelof van Dijk
02e64da678
refactor: tuples can be concatenated with + ( #1671 )
Co-authored-by: Roelof van Dijk <roelof.van.dijk@vitestro.com>
2023-08-25 12:37:13 -04:00
Yixiang Gao
173850f599
fix CIFAR jit ( #1657 )
* update mask function
* kept 94 with the new fetcher
clean up batch fetcher
* 94.04% without cutmix
* 94.04% with cutmix
* move batch fetcher to avoid fetching additional batch last STEP
2023-08-24 16:14:40 -07:00
chenyu
f00325e77d
ops_metal newCommandQueueWithMaxCommandBufferCount_(1024) ( #1664 )
2023-08-24 15:42:00 -07:00
DavidFarago
1ba8f0dca3
Quickstart: Upgrade section "Training" to new code ( #1663 )
Co-authored-by: Dave Farago <dfarago@innoopract.com>
2023-08-24 17:12:16 -04:00
DavidFarago
29adae84eb
Quickstart: Use tensors to compute train accuracy ( #1662 )
Co-authored-by: Dave Farago <dfarago@innoopract.com>
2023-08-24 17:09:12 -04:00
George Hotz
d37d092c14
split linearizer into 3 files ( #1654 )
2023-08-23 14:58:47 -07:00
George Hotz
1b8c40234f
Uast start ( #1650 )
* work
* more tests
* more tests 2
* don't break it
2023-08-23 12:00:06 -07:00
geohotstan
484708da87
#1615 fix ( #1616 )
2023-08-23 14:51:05 -04:00
Pavol Rusnak
b57c374164
add accelerator links to readme ( #1649 )
2023-08-23 14:47:55 -04:00
George Hotz
82623697a8
Move asm renderer ( #1648 )
* teeny changes
* teeny updates
* move to renderer
2023-08-23 10:06:43 -07:00
George Hotz
a89363574d
teeny changes ( #1647 )
* teeny changes
* teeny updates
2023-08-23 09:53:39 -07:00
George Hotz
a6d842af7a
move device to ops ( #1646 )
* move device to ops
* mlops types
* 2 lines
2023-08-23 08:30:17 -07:00
nimlgen
a65ae1198b
do replace div->mul for non-floats ( #1644 )
2023-08-23 07:34:31 -07:00
George Hotz
da694d4241
move that image import
2023-08-22 21:30:55 -07:00
George Hotz
41e83be3dd
simple where broadcast ( #1643 )
2023-08-22 21:24:49 -07:00
George Hotz
c831218139
Optional: Reduce line count and simplify the LazyBuffer interface ( #1642 )
* less lines in lazybuffer, def e
* custom function
* cast
* reorder functions
* lb type
2023-08-22 21:01:10 -07:00
George Hotz
d25046e66a
matvec tests ( #1634 )
* matvec tests
* f16
* f16 is broken
2023-08-22 17:33:58 -07:00
George Hotz
643cbdfd50
make embedding and GPT-2 fast ( #1631 )
* make embedding fast
* jit more, variable shape support
* print mem bw
2023-08-22 15:14:38 -07:00
Niklas D
a7752ad65d
Fix link to state.py in quickstart ( #1632 )
2023-08-22 17:39:30 -04:00
c143
c9c40bb16f
Import whole math module in tensor.py ( #1628 )
2023-08-22 17:07:46 -04:00
Roelof van Dijk
6fcfa50b35
[ready] perf: no noop cast just to make mypy happy ( #1626 )
Co-authored-by: Roelof van Dijk <roelof.van.dijk@vitestro.com>
2023-08-22 17:07:22 -04:00
Roelof van Dijk
f04a6d7882
perf: faster partition ( #1625 )
Co-authored-by: Roelof van Dijk <roelof.van.dijk@vitestro.com>
2023-08-22 11:56:41 -07:00
George Hotz
d3c401ba3c
llama quantize: scale uses mul, not div
2023-08-22 11:48:56 -07:00
George Hotz
696e4d20a1
fix KOPT=2 with variable shape
2023-08-22 11:34:34 -07:00
George Hotz
de1fcc418f
no more toCPU path ( #1624 )
2023-08-22 11:07:26 -07:00
George Hotz
463dece63e
auto arg dtypes ( #1623 )
2023-08-22 10:22:40 -07:00
George Hotz
db8344ab83
add noalias to llvm ( #1622 )
2023-08-22 09:26:01 -07:00
chenyu
89e13f2f04
support symbols in shrink ( #1611 )
2023-08-22 09:08:21 -07:00
George Hotz
718ced296c
move state to nn/state ( #1619 )
2023-08-22 07:36:24 -07:00
Umut Zengin
1e93fd5449
Readability for unreadable functions ( #1610 )
* cleaned
* typing
* typing
* if format
* if format
* mypy
* update argmax
* argmax more readable
* More stable def pad
* lint
2023-08-22 07:09:08 -07:00
George Hotz
86a32ffb1a
lt sum ( #1617 )
2023-08-21 21:19:16 -07:00
George Hotz
c64c47a6ae
test arange simple
2023-08-21 20:16:17 -07:00
George Hotz
4f459841bc
Symbolic JIT for GPT2 ( #1613 )
* not fast yet
* simpler
* symbolic jit
* fp16 GOPS and GB
2023-08-21 19:44:57 -07:00
Yixiang Gao
4f02491cd4
add cpu if torch tensor ( #1609 )
2023-08-21 16:57:59 -07:00
Umut Zengin
f720682beb
np.argmax to Tensor.argmax ( #1608 )
* to tensor argmax
* removed keepdim
* training update
2023-08-21 15:22:29 -07:00
George Hotz
4ea00bad38
track down llama bug
2023-08-21 15:14:21 -07:00
Roelof van Dijk
b02f77b354
perf: faster broadcasted ( #1601 )
Co-authored-by: Roelof van Dijk <roelof.van.dijk@vitestro.com>
2023-08-21 14:21:46 -07:00