George Hotz
bfcec234a2
Refactor ASTs (#622)
* ugh worst branch name
* compiler refactor continues
* scc -> cloc
* buf -> _buf
* finish _buf, and program -> runtime
* gpu is still working, clang isn't
* clang in new style
* ops_metal
* something broke it
* improve metal
* clean up tons of cl crap
* hack fix sync
* cleaner gpu
* gpu metal clang
* cleanups
* minor refactor
* GPUCodegen
* fix up LLVM
* blind CUDA refactor
* codegen / runtime
* keep ops naming
* linter passes
* woah, llvm was allocing 4x what it needed to
* bugfixes
* fix openpilot compiler
* fix compile_efficientnet
* method cache should fix tests
* deal with duped functions
2023-03-01 18:57:29 -08:00
voidz
94bec40110
moved extras/jit.py -> tinygrad/jit.py (#599)
* moved extras/jit.py to tinygrad/jit.py
* fixed indent
* removed tinygrad.helpers.DEBUG from jit.py
2023-02-25 08:32:33 -08:00
George Hotz
d3029c91c5
no rng for op test
2023-02-24 00:23:20 -08:00
George Hotz
661812ffef
don't ignore type
2023-02-23 19:38:52 -08:00
George Hotz
8b0082540b
openpilot compile cleanups
2023-02-20 09:16:03 -08:00
George Hotz
de71c13934
test speed v torch uses jit
2023-02-12 07:43:17 -08:00
George Hotz
031edd01e6
switch openpilot compile to TinyJit
2023-02-11 09:51:44 -08:00
George Hotz
3d63934995
refactor to keep cl in the runtime (#545)
* refactor to keep cl in the runtime
* fix thneed, rename cl to _cl
* bugfix + _cuda
* fix tests
* thneed more correct
2023-02-08 16:46:09 -06:00
Jacky Lee
799b3f185a
Refactor getenv into helpers (#508)
* Refactor getenv into helpers
* Remove unused os
* Fix default value
* Fix more defaults for CI
* Fix bracket
* Revert changes to openpilot/compile.py
* Use getenv from helpers when possible
2023-01-31 15:09:09 -08:00
George Hotz
92001a06e1
openpilot/go.sh
2023-01-28 13:57:43 -08:00
George Hotz
6d7658db12
delete opencl <celebration>
2023-01-24 14:18:35 -08:00
George Hotz
e313c8af20
update openpilot tests from OPENCL to GPU
2023-01-24 14:05:20 -08:00
George Hotz
281b0db773
three from image
2023-01-12 12:26:58 -08:00
George Hotz
4885fce56e
shapetracker from newgpu (#456)
* shapetracker from newgpu
* touchup ops
* test
* testst
* thneed deletes unused inputs
* test
* bugfix
2023-01-09 12:40:01 -08:00
George Hotz
e6b65f8e01
fix graph in openpilot/compile.py
2022-10-28 08:55:34 -07:00
George Hotz
ef62db3186
cleanups, remove E701
2022-10-28 08:28:56 -07:00
George Hotz
b65b70812a
Exec AST (#404)
* working exec ast
* exec_ast is staticmethod
* GenericExecAST
* fold that sometimes
* ExplicitExecAST
* exec_ast for GPU
* gpu working
* get_lazyop_shape
* now gpubuffer is ExplicitExecAST
* dedup
* add a type
* RESHAPE in opencl code
* fix linter
* that too for linter
* cleanups
* remove dead code
* GenericShape is less lines
* add ALLOWED_KERNEL_COUNT to tests
* fix mypy
* that's gotta be recursive
* fix opencl shape processing
* remove unneeded lambda
2022-10-28 08:27:03 -07:00
George Hotz
6a8fb53304
move ops.py into lazy.py (#402)
* move ops.py into lazy.py
* fix graph and linter
* ugh, didn't add
2022-10-25 13:58:03 -07:00
George Hotz
3b9b7eda48
remove run_thneed dead code
2022-10-20 17:24:18 -07:00
George Hotz
1bec4651b3
fix nonstatic weights
2022-10-20 17:04:14 -07:00
George Hotz
50c95c7d9a
add assert to catch issue in attention
2022-10-20 15:13:00 -07:00
George Hotz
26c78ccf7d
remove useless buffer
2022-10-20 14:07:28 -07:00
George Hotz
a18c1f3178
zero out the inputs
2022-10-20 13:46:52 -07:00
George Hotz
61ee428e4c
rerun
2022-10-20 13:29:14 -07:00
George Hotz
5dae64b7b0
read input shapes and break down the layers
2022-10-20 13:11:24 -07:00
George Hotz
e00601faea
fix thneed self test
2022-10-20 12:55:02 -07:00
George Hotz
ace8db29f8
ReduceSum
2022-10-20 12:48:14 -07:00
George Hotz
c400ee0beb
refactoring thneed (#400)
* refactoring thneed
* continue
* minor update
* looks like it's working
* big refactor
* confirm thneed got the right output
* code is there but it's broken
* works now
* always OPTWG, input -> dat
* fix type issue
2022-10-20 12:35:59 -07:00
YassineYousfi
ae0f9b17df
openpilot: new models and onnx ops (#401)
* ngrl stuff
* fngrl
* fix typo in compile script
* workflow dispatch
* new models in tests
* dont need to up this threshold
Co-authored-by: HaraldSchafer <harald.the.engineer@gmail.com>
2022-10-20 11:49:19 -07:00
George Hotz
d6f499fd69
improve opencl, why is it OOMing
2022-09-05 20:14:31 -07:00
George Hotz
2e9b7637b3
don't save input buffers
2022-08-31 15:37:38 -07:00
George Hotz
a3fc64a585
fix batchnorm folding in openpilot compile
2022-08-31 13:04:49 -07:00
Comma Device
a734df98fa
TEST_ENET for openpilot compiler
2022-08-31 13:23:36 -04:00
George Hotz
d919ac32af
fix wrong size input
2022-08-31 09:07:34 -07:00
George Hotz
040640a580
fix cl import error
2022-08-31 08:43:44 -07:00
George Hotz
33ac355bcd
still broken
2022-08-29 19:08:07 -07:00
George Hotz
5efab7cf1d
add reciprocal
2022-08-29 18:00:24 -07:00
George Hotz
880707f2d2
no torch test if no torch
2022-08-29 15:29:19 -07:00
George Hotz
5eba228844
print inputs
2022-08-29 08:56:04 -07:00
George Hotz
dd587d26e3
oops, compare with abs
2022-08-28 11:23:21 -07:00
George Hotz
dc7af8c3ac
thneed run float32
2022-08-28 11:03:35 -07:00
Comma Device
f0d11f29c7
float32 in image desc
2022-08-28 08:47:43 -07:00
George Hotz
11626053b0
run_thneed with test
2022-08-22 09:45:46 -07:00
George Hotz
e7a4cd91ba
fix cpu thneed running
2022-08-21 12:11:07 -07:00
George Hotz
a8734df030
add openpilot tests to tinygrad
2022-08-21 12:03:37 -07:00
Comma Device
85453288d7
run_onnx_torch
2022-08-18 08:30:12 -07:00
Comma Device
1f23517d92
fixup run thneed
2022-08-18 08:22:53 -07:00
Comma Device
6da956b9fa
that should be right
2022-07-19 19:47:37 -07:00
Comma Device
f4ed837f2f
float16 fixups
2022-07-19 19:44:40 -07:00
Comma Device
aa00a3948e
needs_load in image correct
2022-07-19 19:25:47 -07:00
Comma Device
314d70ff17
zero out the buffer
2022-07-19 19:17:47 -07:00
Comma Device
b8a67905e5
save weights
2022-07-19 19:14:14 -07:00
Comma Device
2d402d1135
buffer_id is 8 bytes
2022-07-18 20:27:45 -07:00
Comma Device
577c23731e
outputs with size
2022-07-18 20:21:33 -07:00
Comma Device
29581b5c85
inputs and outputs
2022-07-18 20:17:26 -07:00
Comma Device
ae30641b0d
fix row pitch
2022-07-18 19:48:19 -07:00
Comma Device
02f23e526c
output file to disk
2022-07-18 19:23:22 -07:00
George Hotz
bcf422dfdd
Device2 (#358)
* option for matmul
* fixups
* fast like a nascar
* running
* thneed runner
* no buffer id makes no backing buffer
* move constant folding to the top
* runs on mac
* folded biases
* was v slow
* maybe just that
* elu touchup
* speed and float32
Co-authored-by: Comma Device <device@comma.ai>
2022-07-16 07:26:19 -07:00
George Hotz
d651caa864
fixup openpilot/compile.py
2022-07-11 13:59:09 -07:00
George Hotz
d8e7f1f8bc
opencl type ignore
2022-07-08 10:33:55 -07:00
George Hotz
df7976248b
be lazy with the gpubuffer copies for host for constant folding
2022-07-03 23:04:14 -07:00
George Hotz
18d74c01b1
float4 opt
2022-06-21 21:27:51 -07:00
George Hotz
ff3d5fe962
debugging while we compile
2022-06-21 21:12:04 -07:00
George Hotz
b12985b013
openpilot compiler
2022-06-21 20:31:18 -07:00