Commit Graph

38 Commits

Author SHA1 Message Date
qazal a4a23c40a0
test masked assign views (#4599)
* possible masked

* not contiguous mask
2024-05-15 15:06:48 +03:00
qazal 77aa8659f5
use assign_targets in LazyOp creation (#4568)
* start

* correct error

* this is possible

* document it
2024-05-13 10:24:35 +03:00
qazal b0fa97e176
assert error detail in test_assign (#4567)
* use regex assert

* that shouldnt raise
2024-05-13 09:56:05 +03:00
qazal 4e1135a0bc
assign buffer read/write tests (#4565)
* simple tests

* more tests
2024-05-13 09:43:36 +03:00
qazal 249cadd106
fusing crossing diamond assign (#4403)
* refactor scheduler parents search

* assign target

* unit test

* can't chase this
2024-05-04 15:19:48 +03:00
qazal 9a47ed0705
test crossing diamond assigns (#4298) 2024-04-25 21:52:05 +03:00
George Hotz 967638f0d5
update docs, remove corealize (#4264)
* update docs, remove corealize

* handle 0 line count

* tensor schedule
2024-04-23 12:05:29 +04:00
qazal a9bc7c1c49
unify assign tests (#4247) 2024-04-22 11:01:15 +03:00
chenyu bdbcac67f1
assign jit test case with other tensor as input (#4098)
hmm it works
2024-04-06 14:41:14 -04:00
chenyu c71627fee6
move GlobalCounter to helpers (#4002)
break circular import between ops and buffer
2024-03-30 00:30:30 -04:00
George Hotz 60639cccac hotfix: RuntimeError for assign 2024-03-27 11:18:48 -07:00
qazal 9fb573d73c
DAG cycle asserts (#3955)
* assert cycles

* these are cycle errors

* flip to positive
2024-03-27 11:09:59 -07:00
qazal d8fafca13a
assign regression (#3907)
* infra

* track mutations

* assign levels

* add seen back

* add test

* infra 2.0

* add assign targets

* dont need levels

* delete

* Update test_assign.py

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2024-03-24 15:12:31 -07:00
George Hotz 54dc48aa47
fix assign (#3878)
* fix assign

* remove terrible optimizer hack

* oops, not realized assigns
2024-03-22 11:48:48 -07:00
George Hotz 86ee36e697
preschedule all (#3875) 2024-03-22 11:20:06 -07:00
George Hotz 4c4d3cb3e3
restrict assignment to base (#3809)
* restrict assignment to base

* add some restrictions there

* more restrictions
2024-03-18 15:33:06 -07:00
George Hotz d8296d4a3f
simple assign tests (#3807) 2024-03-18 13:57:01 -07:00
George Hotz 0183a05f0a
test assign (#3798)
* Reapply "add failing assign test (#3796)" (#3797)

This reverts commit 1e1beb888c.

* no realized check
2024-03-18 08:58:04 -07:00
George Hotz 1e1beb888c
Revert "add failing assign test (#3796)" (#3797)
This reverts commit 2dea12832c.
2024-03-18 08:55:36 -07:00
George Hotz 2dea12832c
add failing assign test (#3796)
* that was a hack

* tests to reveal the issue

* add assign for realized assign
2024-03-18 08:47:30 -07:00
George Hotz 641f347232
simple LoadOps.ASSIGN (#3745)
* simple LoadOps.ASSIGN

* skip that test

* don't assign in onnx ops gemm

* track cache usage

* recreate the lazybuffer to avoid the cache

* fix contigs

* skip that test

* lol

* better letters
2024-03-14 20:44:34 -07:00
George Hotz d52d0b0efb test_assign_kv_cache 2024-03-14 16:17:20 -07:00
George Hotz 3527c5a9d2
add Tensor.replace (#3738)
* add Tensor.replace

* fix dtypes in that test

* should be replace

* and mixtral
2024-03-14 13:34:14 -07:00
George Hotz 56b914fc8c hotfix: test_assign_contiguous 2024-03-13 17:49:54 -07:00
George Hotz 838afbc351
assign tests (#3728) 2024-03-13 17:04:55 -07:00
xarkes 28a8b72024
Remove Interpreted device & remaining CPU/TORCH ref (#3423)
* Remove Interpreted device & remaining CPU/TORCH ref

* Oops

* supports_device was useful

* Fix doc wording

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2024-02-16 00:30:21 -05:00
George Hotz a280cfe169
move dtypes to dtype.py (#2964)
* move dtypes to dtype.py

* fix urllib
2024-01-01 14:58:48 -08:00
George Hotz 1765849937
new lazy, benchmark (#2878)
* lazy rewrite, try 2

* min fix tests

* pass contig test

* put broken pads back

* move that to realize

* no contig child fixes array packing

* so wrong

* now that's correct

* base children

* fix bind issues

* disable to_image_idx

* fix tests

* that failure shouldn't break other tests

* more fixes

* fix torch

* skip failing tests in CI

* 1e-7

* half is broken

* 1e-6 margin of error
2023-12-20 14:33:21 -08:00
Christopher Mauri Milan 7f01dd04f0
Apply ruff linting rules to tests (#2473)
* everything except F821

* enable F821 with noqa

* dumb fix

* fix remaining imports and (former) lambdas

* replace _ with noqa to avoid gc
2023-11-27 21:24:06 -08:00
George Hotz 9e07824542
move device to device.py (#2466)
* move device to device.py

* pylint test --disable R,C,W,E --enable E0611

* fix tests
2023-11-27 11:34:37 -08:00
George Hotz 2f7aab3d13
move optimize_local_size (#2221)
* move optimize_local_size

* interpret_ast
2023-11-05 21:00:52 -08:00
George Hotz de5d603ec1
corealize + remove realize from lazybuffer (#1968)
* corealize + remove realize from lazybuffer

* fix multigpu

* fix graph
2023-10-04 10:59:31 -07:00
George Hotz adab724caa
schedule2, keep the tests working with small changes (#1932)
* lazy cleanups

* ast functions take in LazyOps

* op instead of self.op

* _base for mops

* fix contiguous

* start schedule

* test_schedule

* fix openpilot

* more tests

* bugfix and test skip

* work

* make sure things get freed

* fix zerosized tensors

* fix failing test

* fix ceil and friends

* fix openpilot

* disable training

* disable test collectives
2023-09-28 09:14:43 -07:00
segf00lt 9e8c1dbf34
patch to remove hack from stable_diffusion.py (#1814)
* patch to remove hack from stable_diffusion.py

* sorry linter

* realize after assign?

* float16 broken in llvmlite use float64 for now

* int32

* idiot forgot to change test array dtype
2023-09-08 09:26:50 -07:00
Diogo ba5e3818a0
Limit dims based on max size (#1390)
* working

* whitespace

* changed defaults to None

* linter

* last linter error
2023-07-31 19:18:19 -07:00
cheeetoo a0965ee198
CI < 5 minutes (#1252)
* models matrix

* fix typo and install gpu deps

* install llvm deps if needed

* fix

* testops with cuda

* remove pip cache since not work

* cuda env

* install cuda deps

* maybe it will work now

* i can't read

* all tests in matrix

* trim down more

* opencl stuff in matrix

* opencl pip cache

* test split

* change cuda test exclusion

* test

* fix cuda maybe

* add models

* add more n=auto

* third thing

* fix bug

* cache pip more

* change name

* update tests

* try again cause why not

* balance

* try again...

* try apt cache for cuda

* try on gpu:

* try cuda again

* update packages step

* replace libz-dev with zlib1g-dev

* only cache cuda

* why error

* fix gpuocelot bug

* apt cache err

* apt cache to slow?

* opt and image in single runner

* add a couple n=autos

* remove test matrix

* try cuda apt cache again

* libz-dev -> zlib1g-dev

* remove -s since not supported by xdist

* the cache takes too long and doesn't work

* combine webgpu and metal tests

* combine imagenet to c and cpu tests

* torch tests with linters

* torch back by itself

* small windows clang test with torch tests

* fix a goofy windows bug

* im dumb

* bro

* clang with linters

* fix pylint error

* linter not work on windows

* try with clang again

* clang and imagenet?

* install deps

* fix

* fix quote

* clang by itself (windows too slow)

* env vars for imagenet

* cache pip for metal and webgpu tests

* try torch with metal and webgpu

* doesn't work, too long

* remove -v

* try -n=logical

* don't use logical

* revert accidental thing

* remove some prints unless CI

* fix print unless CI

* ignore speed tests for slow tests

* clang windows in matrix (ubuntu being tested in imagenet->c test)

* try manual pip cache

* fix windows pip cache path

* all manual pip cache

* fix pip cache dir for macos

* print_ci function in helpers

* CI as variable, no print_ci

* missed one

* cuda tests with docker image

* remove setup-python action for cuda

* python->python3?

* remove -s -v

* try fix pip cache

* maybe fix

* try to fix pip cache

* is this the path?

* maybe cache pip

* try again

* create wheels dir

* ?

* cuda pip deps in dockerfile

* disable pip cache for clang

* image from ghcr instead of docker hub

* why is clang like this

* fast deps

* try use different caches

* remove the fast thing

* try with lighter image

* remove setup python for cuda

* small docker and cuda fast deps

* ignore a few more tests

* cool docker thing (maybe)

* oops

* quotes

* fix docker command

* fix bug

* ignore train efficientnet test

* remove dockerfile (docker stuff takes too long)

* remove docker stuff and normal cuda

* oops

* ignore the tests for cuda

* does this work

* ignore test_train on slow backends

* add space

* llvm ignore same tests as cuda

* nvm

* ignore lr scheduler tests

* get some stats

* fix ignore bug

* remove extra '

* remove and

* ignore test for llvm

* change ignored tests and durationon all backends

* fix

* and -> or

* ignore some more cuda tests

* finally?

* does this fix it

* remove durations=0

* add some more tests to llvm

* make last pytest more readable

* fix

* don't train efficientnet on cpu

* try w/out pip cache

* pip cache seems to be generally better

* pytest file markers

* try apt fast for cuda

* use quick install for apt-fast

* apt-fast not worth

* apt-get to apt

* fix typo

* suppress warnings

* register markers

* disable debug on fuzz tests

* change marker names

* apt update and apt install in one command

* update marker names in test.yml

* webgpu pytest marker
2023-07-23 13:00:56 -07:00
George Hotz 791530045d
Refactor LoadOps (#910)
* test

* work

* upd test

* loadops

* cleanups

* real ones

* remove LazyNumpyArray

* fix assign test

* remove range

* np.require

* llama uses arange kernels

* no caching consts

* fix enet

* torch load support

* tests cleanup

* fix shufflenet

* fix image

* fix torch_load test
2023-06-03 09:40:43 -07:00
George Hotz 5de850f6d5
assign buffer reuse (#547)
* assign buffer reuse works

* fix assign for torch and cpu

* allow assign from numpy

* fix llvm output_buffer

* add some assign tests

* fix assignment test

* test should fail without lazy

* env var to disable assign
2023-02-09 11:53:02 -06:00