tinygrad

Commit Graph

Author	SHA1	Message	Date
wozeparrot	9eb6eef441	seed in tensor (#6869)	2024-10-06 14:46:58 -04:00
wozeparrot	c100f3d406	default threefry (#6116)	2024-09-25 17:45:13 +08:00
wozeparrot	2be0b26a1f	rand only supports single device (#6682)	2024-09-24 16:07:44 +08:00
wozeparrot	46e360fdc0	check bfloat16 range with threefry (#6660)	2024-09-23 10:48:44 +08:00
chenyu	2de174677a	threefry touchup [run_process_replay] (#6169) also why is test_gc testing _rng_counter is allocated??	2024-08-18 23:01:24 -04:00
wozeparrot	0c5189de25	threefry half (#6154)	2024-08-18 15:23:12 -07:00
chenyu	dc942bf1f6	jit sampling functionn in test_randomness.test_multinomial (#5034) * jit sampling functionn in test_randomness.test_multinomial `THREEFRY=1 python3 -m pytest test/test_randomness.py::TestRandomness::test_multinomial --durations 1` 7 sec -> 1.2 sec * skip that	2024-06-18 14:21:05 -04:00
qazal	637f482588	configure derandomizing CI tests (#4793)	2024-05-31 17:06:58 +03:00
chenyu	53b9081aab	check arg types of Tensor.randint (#4751) raise TypeError if low, high, dtype are not ints	2024-05-27 20:24:10 -04:00
nimlgen	9b02aef45a	remove rhip (#4579) * remove rhip * remove hip runner	2024-05-14 17:58:19 +03:00
nimlgen	2131556c2c	amd mockgpu (#4535) * start mock amd gpu * virt files * cleaner * init ci * small fixes * linter * better? * ugh * linter * fix * diable some * run shorter * fixes * add hcq test * fix * fix cmd revert	2024-05-14 14:28:04 +03:00
qazal	23445db2b9	no skipped tests in RHIP (#4337) * delete skip * delete split skip * remu dev * compiler fails here * Revert "remu dev" This reverts commit 28b933d4eb54c9a3fb4c39f584122f501c791d27.	2024-04-28 12:23:05 -04:00
chenyu	b43e470f80	always use f32 for rand source of randn (#3998) * always use f32 for source of randn fixed bfloat16 randn to not have inf. don't really care about float64. threefry is float32 based too * HSA is broken	2024-03-29 17:04:34 -04:00
chenyu	6b6461122e	test case Tensor.randn should be finite (#3994) * test case Tensor.randn should be finite there's a hack to fix float16, need a generic solution that works with bf16 and threefry * skip not supported * bfloat16 local is wrong * skip RHIP	2024-03-29 14:51:02 -04:00
wozeparrot	a0ab755317	threefry again (#3785) * feat: initial xor * feat: initial threefly * feat: remove custom random * fix: really need to install precommit * feat: lmao forgot that this is rotate not a shift * clean: put that there * feat: numpy xor * feat: quick test for xor * feat: llvm xor * feat: slightly working xor in torch * feat: rand works in jit * clean: save a line * feat: match jax * feat: maybe test against jax * feat: requires_grad * fix: fix test_symbolic_ops * feat: lower alpha * feat: just pad * fix: maybe fix training tests? * fix: fix some llvm stuff * feat: cursed realize on the way out * feat: testing jax * fix: why is the jax install process not simple * fix: maybe passing test * fix: symbolic workarounds * clean: still need that precommit * fix: aaaa * fix: more test fixes * fix: quick fix for wgsl * feat: need to set requires_grad on the final tensor * feat: one more tensor * feat: don't take forever * feat: seeing y ci is brok * feat: can't allocate 64GiB lmao * fix: fix this * feat: hope this doesn't break smth before i go to bed * feat: don't destroy ram * feat: int * feat: remove jax * feat: properish workaround? * feat: skip slow webgpu tests * feat: no longer fails * feat: use dtypes * feat: real number * fix: torch * fix: don't test against reference for torch * feat: to device * feat: fix advanced indexing * feat: correct casting * feat: even rng_counter * feat: match master * feat: this was actually bad * fix: maybe?
* feat: store * feat: remove realizes * feat: somehow this is important * feat: somehow this is also important * feat: save a line * fix: don't need that anymore * feat: restore this * fix: linter * feat: remove realizes * fix: realized is in base now * fix: add back cast * fix: bump deadline * fix: bump deadline * fix: bump deadline * fix: bump deadline * fix: bump deadline * fix: :( * fix: :( * fix: not being dumb * feat: try changing less tests * feat: shouldn't have to change that * feat: contiguous bumps it by one * fix: hmm * fix: numpy memory moment * fix: cl_khr_fp16 * fix: torch has different tensor count * fix: missing contiguous * hmm: hmm * fix: some fixes * fix: typing * feat: dont do that * feat: typing fixes * feat: why is this realize required? * feat: ngl kinda odd typing * feat: oh * feat: remove realizes * feat: why is this realize required? * fix: hacky patch for cudacpu * fix: without this realize pytest crashes????? * fix: shorter line * fix: cudacpu fixes * fix: cudacpu fixes * feat: real buffer * feat: don't search when searching lmao * fix: can't use contiguous things * fix: no more 100GB arrays * fix: revert * fix: skip 7 and 10 * feat: working ish beam * feat: minimize changes * feat: seed 0 stable diffusion example changed * fix: different on ci * fix: no beam * feat: make threefry optional * fix: check value * fix: unused import * feat: threefry default * fix: 5d * feat: allow non upcast div * fix: 5d better * fix: 5d better * fix: save all dtype * feat: proper error * feat: lazyop key * fix: check float * feat: try removing this realize now * feat: disable threefry for uops hip tensor cores * feat: don't need that * feat: only check upcast * fix: disable threefry for some metal tests * feat: disable for metal tensor uops as well * feat: disable for most uops * fix: disable threefry for new uops tests * feat: multitensor * fix: typing * feat: threefry default off * feat: skip threefry half rand * feat: restore old * fix: bad git * 
clean: ruff * feat: bfloat16 fix * fix: :| * feat: restore old --------- Co-authored-by: chenyu <chenyu@fastmail.com>	2024-03-18 16:47:07 -04:00
George Hotz	311cf2b7d3	Revert "threefry_2x32 (#2601)" (#3784) This reverts commit `db3de54bc4`.	2024-03-17 10:27:20 -07:00
wozeparrot	db3de54bc4	threefry_2x32 (#2601) * feat: initial xor * feat: initial threefly * feat: remove custom random * fix: really need to install precommit * feat: lmao forgot that this is rotate not a shift * clean: put that there * feat: numpy xor * feat: quick test for xor * feat: llvm xor * feat: slightly working xor in torch * feat: rand works in jit * clean: save a line * feat: match jax * feat: maybe test against jax * feat: requires_grad * fix: fix test_symbolic_ops * feat: lower alpha * feat: just pad * fix: maybe fix training tests? * fix: fix some llvm stuff * feat: cursed realize on the way out * feat: testing jax * fix: why is the jax install process not simple * fix: maybe passing test * fix: symbolic workarounds * clean: still need that precommit * fix: aaaa * fix: more test fixes * fix: quick fix for wgsl * feat: need to set requires_grad on the final tensor * feat: one more tensor * feat: don't take forever * feat: seeing y ci is brok * feat: can't allocate 64GiB lmao * fix: fix this * feat: hope this doesn't break smth before i go to bed * feat: don't destroy ram * feat: int * feat: remove jax * feat: properish workaround? * feat: skip slow webgpu tests * feat: no longer fails * feat: use dtypes * feat: real number * fix: torch * fix: don't test against reference for torch * feat: to device * feat: fix advanced indexing * feat: correct casting * feat: even rng_counter * feat: match master * feat: this was actually bad * fix: maybe?
* feat: store * feat: remove realizes * feat: somehow this is important * feat: somehow this is also important * feat: save a line * fix: don't need that anymore * feat: restore this * fix: linter * feat: remove realizes * fix: realized is in base now * fix: add back cast * fix: bump deadline * fix: bump deadline * fix: bump deadline * fix: bump deadline * fix: bump deadline * fix: :( * fix: :( * fix: not being dumb * feat: try changing less tests * feat: shouldn't have to change that * feat: contiguous bumps it by one * fix: hmm * fix: numpy memory moment * fix: cl_khr_fp16 * fix: torch has different tensor count * fix: missing contiguous * hmm: hmm * fix: some fixes * fix: typing * feat: dont do that * feat: typing fixes * feat: why is this realize required? * feat: ngl kinda odd typing * feat: oh * feat: remove realizes * feat: why is this realize required? * fix: hacky patch for cudacpu * fix: without this realize pytest crashes????? * fix: shorter line * fix: cudacpu fixes * fix: cudacpu fixes * feat: real buffer * feat: don't search when searching lmao * fix: can't use contiguous things * fix: no more 100GB arrays * fix: revert * fix: skip 7 and 10 * feat: working ish beam * feat: minimize changes * feat: seed 0 stable diffusion example changed * fix: different on ci * fix: no beam * feat: make threefry optional * fix: check value * fix: unused import * feat: threefry default * fix: 5d * feat: allow non upcast div * fix: 5d better * fix: 5d better * fix: save all dtype * feat: proper error * feat: lazyop key * fix: check float * feat: try removing this realize now * feat: disable threefry for uops hip tensor cores * feat: don't need that * feat: only check upcast * fix: disable threefry for some metal tests * feat: disable for metal tensor uops as well * feat: disable for most uops * fix: disable threefry for new uops tests * feat: multitensor * fix: typing * feat: threefry default off * feat: skip threefry half rand * feat: restore old * fix: bad git * 
clean: ruff * feat: bfloat16 fix * fix: :| --------- Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>	2024-03-17 10:19:33 -07:00
chenyu	8ea53951c1	bfloat16 Tensor.rand (#3764) * Tensor.rand for bfloat16 for numpy based random, generate one for float then cast for bfloat16. close #3653 * remove realize	2024-03-15 15:05:13 -04:00
Skosh	e8c350fdac	fix: make Tensor.rand produce correct values for float16 (#3654) * fix: make Tensor.rand produce correct values for float16 Due to precision loss when casting to float16, the data distribution created by custom_random isnt correctly in the interval ]0, 1[, but instead in the interval ]0, 1], which causes the Tensor.randn to incorrectly generate values of infinity. The solution uses a scaling value to make sure the values stay under 1, when using half precision. Closes #3611 * update implementation to truncate to closest f16 value to 1 * chore: fix whitespace * test larger distribution --------- Co-authored-by: chenyu <chenyu@fastmail.com>	2024-03-10 18:48:00 -04:00
andresgit	28ba1c5406	fix Tensor.randint ignoring kwargs (#3350) * fix Tensor.randint ignoring kwargs * randint kwargs fix	2024-02-09 17:12:16 +01:00
chenyu	ae112c9dbe	fix some long lines in tests (#3006) * fix some long lines in tests * better	2024-01-03 23:53:33 -05:00
George Hotz	a280cfe169	move dtypes to dtype.py (#2964) * move dtypes to dtype.py * fix urllib	2024-01-01 14:58:48 -08:00
Will	016aebcd84	Fixed Tensor.randint() not accepting tuple shapes (#2923) * ww/Fixed Tensor.randint() to accept shape tuples () * ww/Wrote a test to cover this typo * ww/Updated Tensor random objects to optionally take (,) or () to be more consistent ww/no lint no worries * ww/Made peace with linter * ww/Added new line can't reduce line size without reducing readablitity * ww/reverted to using .mul	2023-12-24 20:32:26 -05:00
chenyu	9c32474a1f	Revert "Revert "Tensor.randint is Tensor.uniform with dtypes.int32 (#2801)" (#2802)" (#2814) This reverts commit `fa84998244`.	2023-12-17 12:14:17 -05:00
chenyu	fa84998244	Revert "Tensor.randint is Tensor.uniform with dtypes.int32 (#2801)" (#2802) This reverts commit `86c2f267d4`.	2023-12-16 15:53:28 -05:00
chenyu	86c2f267d4	Tensor.randint is Tensor.uniform with dtypes.int32 (#2801)	2023-12-16 15:14:50 -05:00
George Hotz	6d6eb9302d	ruff checks the max line length is 150 (#2734) * ruff checks the max line length is 150 * fix tensor.py * a lot more * done	2023-12-12 17:34:47 -08:00
chenyu	1ac958a058	update pytest marks and CI test filters (#2587) * remove pytest marks * test more stuff * fine revert some * add that mark back * skip that * hmm LLVM does not work on ubuntu * too slow on CUDA CI * dup test	2023-12-03 15:20:44 -05:00
chenyu	03968622a2	Pretty multinomial (#2365) * pretty multinomial p, cdf_normalized -> weight, cdf symmetric unsqueeze / squeeze check num_sample > 0 TODO: how do we want to handle 0/0 in general? * no 0-dim input * single sum	2023-11-19 15:10:10 -05:00
Marcello Fuschi	b8d460d203	Add Tensor.multinomial (#2295) * add Tensor.multinomial only with replacement * add support for 2D input in Tensor.multinomial * fix multinomial output shape * allow passing replacement=False to Tensor.multinomial when num_samples=1 * improve tests for Tensor.multinomial * fix edge case in Tensor.multinomial * Tensor.multinomial no more staticmethod	2023-11-15 11:38:39 -08:00
chenyu	9215bccb41	Tensor.uniform set default to standard uniform (#2158) * Tensor.uniform set default to standard uniform * clean up test to reuse function	2023-10-27 16:15:30 -04:00
Jordan Wright	25be7f745d	Tensor.uniform with dtype=int bug fix (#1593)	2023-08-26 01:59:53 -04:00
Yixiang Gao	8d6662a741	.cpu().numpy() -> .numpy() (#1594) * .cpu().numpy() -> .numpy() * restore ops_torch * restore test_speed_v_torch	2023-08-21 09:53:29 -07:00
JaSpa99	5ab12059da	rng hlops: add normal and kaiming_normal (#1378) * add normal and kaiming_normal * make sure its float * add tests	2023-07-31 10:37:02 -07:00
cheeetoo	a0965ee198	CI < 5 minutes (#1252) * models matrix * fix typo and install gpu deps * install llvm deps if needed * fix * testops with cuda * remove pip cache since not work * cuda env * install cuda deps * maybe it will work now * i can't read * all tests in matrix * trim down more * opencl stuff in matrix * opencl pip cache * test split * change cuda test exclusion * test * fix cuda maybe * add models * add more n=auto * third thing * fix bug * cache pip more * change name * update tests * try again cause why not * balance * try again... * try apt cache for cuda * try on gpu: * try cuda again * update packages step * replace libz-dev with zlib1g-dev * only cache cuda * why error * fix gpuocelot bug * apt cache err * apt cache to slow? * opt and image in single runner * add a couple n=autos * remove test matrix * try cuda apt cache again * libz-dev -> zlib1g-dev * remove -s since not supported by xdist * the cache takes too long and doesn't work * combine webgpu and metal tests * combine imagenet to c and cpu tests * torch tests with linters * torch back by itself * small windows clang test with torch tests * fix a goofy windows bug * im dumb * bro * clang with linters * fix pylint error * linter not work on windows * try with clang again * clang and imagenet? * install deps * fix * fix quote * clang by itself (windows too slow) * env vars for imagenet * cache pip for metal and webgpu tests * try torch with metal and webgpu * doesn't work, too long * remove -v * try -n=logical * don't use logical * revert accidental thing * remove some prints unless CI * fix print unless CI * ignore speed tests for slow tests * clang windows in matrix (ubuntu being tested in imagenet->c test) * try manual pip cache * fix windows pip cache path * all manual pip cache * fix pip cache dir for macos * print_ci function in helpers * CI as variable, no print_ci * missed one * cuda tests with docker image * remove setup-python action for cuda * python->python3?
* remove -s -v * try fix pip cache * maybe fix * try to fix pip cache * is this the path? * maybe cache pip * try again * create wheels dir * ? * cuda pip deps in dockerfile * disable pip cache for clang * image from ghcr instead of docker hub * why is clang like this * fast deps * try use different caches * remove the fast thing * try with lighter image * remove setup python for cuda * small docker and cuda fast deps * ignore a few more tests * cool docker thing (maybe) * oops * quotes * fix docker command * fix bug * ignore train efficientnet test * remove dockerfile (docker stuff takes too long) * remove docker stuff and normal cuda * oops * ignore the tests for cuda * does this work * ignore test_train on slow backends * add space * llvm ignore same tests as cuda * nvm * ignore lr scheduler tests * get some stats * fix ignore bug * remove extra ' * remove and * ignore test for llvm * change ignored tests and durationon all backends * fix * and -> or * ignore some more cuda tests * finally? * does this fix it * remove durations=0 * add some more tests to llvm * make last pytest more readable * fix * don't train efficientnet on cpu * try w/out pip cache * pip cache seems to be generally better * pytest file markers * try apt fast for cuda * use quick install for apt-fast * apt-fast not worth * apt-get to apt * fix typo * suppress warnings * register markers * disable debug on fuzz tests * change marker names * apt update and apt install in one command * update marker names in test.yml * webgpu pytest marker	2023-07-23 13:00:56 -07:00
George Hotz	791530045d	Refactor LoadOps (#910) * test * work * upd test * loadops * cleanups * real ones * remove LazyNumpyArray * fix assign test * remove range * np.require * llama uses arange kernels * no caching consts * fix enet * torch load support * tests cleanup * fix shufflenet * fix image * fix torch_load test	2023-06-03 09:40:43 -07:00
George Hotz	4d28d55683	add nn layer tests	2023-06-01 21:34:24 -07:00
George Hotz	8a928ed2f3	nn init matches torch (#901)	2023-06-01 21:24:11 -07:00
Rabia Eda Yılmaz	3075988468	Added kaiming_uniform initialization for Conv2d and Linear layers (#756) * added kaiming_uniform init for conv2d and linear layers * fix: set getattr * up * fix: set getattr * fix comments * better does not mean it is good * more nonlinearities * added test checks the distribution of default relu option * prettier * fix kernel size * edit distribution of returned tensor * complete tests and fix fan_mode * added higher dim test * prettier test * fix silly blank * just leaky_relu mode * default fan in and leaky relu * update params * fix test * shorter * generalize Tensor.uniform and adjust kaiming init - added low and high parameters to Tensor.uniform function, so it can have a specific range (default is 0 to 1) - adjusted return line of kaiming_uniform * range from -1 to 1 * delete comment * adjusted test_uniform * fixed * delete comment	2023-05-29 15:09:55 -07:00
Jacky Lee	b80cf9220c	Statistics test: check if distributions match torch (#769) * Check if tensor values match torch * Clean up randomness tests and remove dependency * Remove kaiming uniform test	2023-05-07 21:43:23 -07:00
Jacky Lee	5e41d5857c	Add tests for randomness (#621) * Add tests for random creation functions * It worked on my machine! * Rename to helper_same_distribution * Remove extra line * Add tests for equal distribution * Test without scipy * Do a different test for randn	2023-03-01 15:39:20 -08:00

41 Commits