Commit Graph

107 Commits

chenyu 68e59eb3f5
update mlperf-logging to 4.1.0-rc3 (#6796) 2024-09-28 21:45:37 -04:00
wozeparrot abd484a9f7
fix: need numpy for docs and testing (#6766) 2024-09-26 16:44:59 +08:00
wozeparrot 2b899164c6
no numpy (#6751) 2024-09-26 16:40:18 +08:00
mesozoic-egg 992cde05d7
Metal with CDLL instead of py-objc (#6545)
* Add CDLL interface for metal

* remove two unused functions

* Cover most of the API methods

* switch to cdll

* directly call objc message in ops_metal

* keep only obj interface

* Use direct message sending for graph

* may have found a solution to the memoryview on ctypes pointer

* buf indexing bug fixed

* fix c_int

* fix c int to bytes

* fix gpu time bug

* line savings for cdll metal core

* wip

* c int bug

* fix buf casting

* dedup for c_void_p

* dedup for c_void_p

* linter fix

* remove unused stuff

* mypy fix

* more mypy error fix

* line savings

* line savings

* rename send_message to msg; add __hash__ and __eq__ for dedup

* wip

* refactor

* refactor

* remove named import from ctypes

* forgot to change variable name

* file reorg, put support.py to ops_metal

* refactor

* hash error

* remove to_ns_array

* test oom exception, fix exception change

* typevar for msg

* add back dedup

* test for compile error

* move constant to graph

* move header constant around

* get label for icb buffer

* check icb label using "in"

* wip fixing mypy reported error

* fixed mypy error

* code formatting

* all_resources dedup match previous

* code formatting

* code formatting; buffer set to objc_id

* revert changes on buf for the manual release, seems like _free is not always called

* skip unless on metal, for test_metal

* fix premature mem release causing seg fault

* test_metal check for device before importing

* Buffer should only be released under _free explicitly

* mypy fixes

* change object ownership

* test compile success

* lint fixes

* remove load_library

* wrap sel_register in cache

* simplify to_struct

* swap lines

* fix type error in to_struct

* bump line to 9800

* remove pyobjc from setup.py

* command buffer should be objc_instance and get released

* stringWithUTF8String: returns objc_instance

* Use constant for MTLPipelineOptionNone

* better explanation for [MTLBuffer contents:] return

* Use dyld_find in case the path differs

* trailing whitespace

* handle exception for methods that take error:

* load /System/Library instead of /Library

* Init c_void_p with None instead of zero for error objects

---------

Co-authored-by: Mesozoic Egg <mesozoic.egg@proton.me>
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2024-09-25 17:43:01 +08:00
George Hotz c6e117c899
add a single py.typed (#6083) 2024-08-14 17:31:46 -07:00
wozeparrot 518c022c29
feat: tag 0.9.2 (#6067) 2024-08-13 16:15:36 -07:00
chenyu 2cadf21684
include "mkdocs" in setup docs (#5798) 2024-07-29 15:54:52 -04:00
uuuvn 3cb94a0a15
Rename tinygrad/runtime/driver to support (#5413) 2024-07-12 11:06:42 -07:00
wozeparrot dfbee4f0f5
feat: add blobfile to testing (#5254) 2024-07-01 19:33:58 -07:00
wozeparrot 7bcb74ab23
feat: tag 0.9.1 (#5220) 2024-06-28 20:16:14 -07:00
wozeparrot 8209cd3c55
easier llama3 + fetch subdir (#4938) 2024-06-14 13:47:27 -07:00
SnakeOnex b1db2d0094
tqdm replacement (#4846)
* tqdm replacement almost

* formatting

* formatting

* imports

* line len

* fix

* removed set description :(

* removed set description :(

* fix

* fix

* green check?

* rewrote as class, fixed several bugs

* types spacing

* removed imports

* fix

* iterable

* typing

* mypy disagreement

* imports

* more e2e tests vs tqdm

* removed seed setting

* robustness against time.sleep() flakiness

* flaky fix

* automatic bar closing when count==total

* cleanup

* clang error with tqdm

* tqdm back

* use os lib, print to stderr (fixes the clang bug, where the bar was leaking into the generated c program)

* back to shutil

* unit_scale + unit_scale test

* custom unit to tests

* pretty

* clean

* removed flaky test

* less test iters

* empty line

* remove disable
2024-06-09 23:46:03 +02:00
wozeparrot 6fcf220b21
feat: tag 0.9.0 (#4762) 2024-05-28 18:44:45 +00:00
George Hotz 5ba611787d
move image into tensor.py. delete features (#4603)
* move image into tensor.py

* change setup.py

* openpilot tests need pythonpath now
2024-05-15 10:50:25 -07:00
Francis Lata bb849a57d1
[MLPerf] UNet3D dataloader (#4343)
* add support for train/val datasets for kits19

* split dataset into train and val sets

* add tests for kits19 dataloader

* add MLPerf dataset tests to CI

* update unet3d model_eval script

* fix linting

* add nibabel

* fix how mock dataset gets created

* update ref implementation with permalink and no edits

* clean up test and update rand_flip implementation

* cleanups
2024-04-28 22:34:18 -04:00
chenyu d31e220cbf
add mlperf-logging to setup.py mlperf (#4289) 2024-04-24 23:34:34 -04:00
wozeparrot 4c99d49c4d
some docstrings (#4201)
* feat: create and data access docstrings

* fix: linter

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2024-04-21 16:34:08 +04:00
George Hotz 8f749ae0eb
New docs are in mkdocs (#4178)
* start mkdocs

* simple docs for tensor

* more docs

* move those back

* more docs

* copy markdown extensions

* docs legacy

* docs building workflow

* fix showcase links

* only that?

* install tinygrad

* add docs to setup.py

* Delete examples/llm.c/data
2024-04-16 10:59:51 +04:00
geohotstan fe88591890
update onnx to 1.16.0 (#4127)
* update

* pass tests and skip tests
2024-04-10 11:19:13 -04:00
George Hotz 150ea2eb76
create engine folder and move code (#3948)
* retry

* older tf

* that
2024-03-26 20:38:03 -07:00
David Hou 0afaf70d57
lars optimizer + tests (#3631)
* lars optimizer + tests

* fix skip list!

* use id to compare in skip list

* go back to using set

* Tensor(bool) * Tensor(bool) is and

* don't lint external/mlperf_resnet

* whitespace

* add external_test_optim to opencl tests

* give mlperf task a name

* mlperf under onnx

* remove track_gnorm

* contiguous instead of realize

* assert momentum and weight decay positive

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2024-03-06 18:11:01 -05:00
George Hotz fe97a85014
the compiler is a driver (#3427) 2024-02-16 10:18:09 +01:00
George Hotz 473935125a
use comgr to compile (#3248)
* use comgr to compile

* fast

* bfloat16

* move comgr to its own file

* cleaner style

* comgr in new place

* comgr free + dtype cleanup
2024-01-26 18:27:49 -08:00
George Hotz 03a6bc59c1
move autogen to runtime/autogen (#3254) 2024-01-26 12:44:19 -08:00
George Hotz a3869ffd46
move gpuctypes in tree (#3253)
* move gpuctypes in tree

* fix mypy

* regex exclude

* autogen sh

* mypy exclude

* does that fix it

* fix mypy

* add hip confirm

* verify all autogens

* build clang2py

* opencl headers

* gpu on 22.04
2024-01-26 12:25:03 -08:00
George Hotz 8cbcd1b342
Remove webgpu, back to 5k lines (#3040)
* remove webgpu

* max 5000 lines
2024-01-08 09:10:07 -08:00
George Hotz 6617dcf095
move graph to runtime, check line count with sz.py (#2842)
* move graph to runtime, check line count with sz.py

* oops, didn't save

* dtype aliases

* restore comment, REALCOUNT
2023-12-18 20:30:06 -08:00
George Hotz 6d6eb9302d
ruff checks the max line length is 150 (#2734)
* ruff checks the max line length is 150

* fix tensor.py

* a lot more

* done
2023-12-12 17:34:47 -08:00
qazal 29f2653d8d
add graph (#2670) 2023-12-07 10:53:31 -08:00
qazal c704a77ca0
green dtypes ALU tests (#2617)
* dtypes alu test

* those types don't exist in torch

* floats

* more tests

* disable those

* a couple unary tests

* skip float16 tests in CI for GPU

* fix LLVM bool add True+True=1+1=2 which truncates to False in native LLVM

* remove hardcoded float for LLVM ALU fns

* less sensitive atol for fp32; 1e-10 is flaky and sometimes failed even if you revert the merge commit for non-fp32 math, since nothing has changed in our kernels for fp32.

* return on overflows

* fix CUDA exp2

* compute results of op regardless of bounds in a python backend

* skip fp16 in GPU and CUDACPU

* fuzz a smaller range in the float_midcast_int32 test

I sampled this and we overflow ~70% of the time.
Because numpy behaves differently on different devices for overflows and Metal seems to do the same, I'm opting to eliminate the non-determinism here.

* remove CUDA exp2 overload it's already there now

---------

Co-authored-by: George Hotz <geohot@gmail.com>
2023-12-06 08:15:46 -08:00
George Hotz 232ed2af3f
more test cleanups (#2631)
* more test cleanups

* move test example back
2023-12-05 16:17:57 -08:00
George Hotz 27481b9206
Switch ops_gpu -> gpuctypes (#2532)
* ops_gpu is go

* fix size 0

* fix image, and add more tests

* nerf openpilot test, doesn't test thneed

* run the schedule

* better

* oops, new inputs

* delete pyopencl

* Update ops_gpu.py
2023-12-01 22:30:21 -08:00
George Hotz 4c984bba7e
bump version to 0.8.0, clean CI, remove requests (#2545)
* bump version to 0.8.0, clean CI, remove requests

* why was that even there
2023-12-01 10:42:50 -08:00
George Hotz 8fd8399437
remove flake8 (#2544) 2023-12-01 09:48:41 -08:00
nimlgen badc97f824
hip & cuda to gpuctypes (#2539)
* cuda with gpuctypes

* hip gpuctypes

* graphs

* rename + linter happy

* use cpu_time_execution

* no ji in build_kernel_node_params

* remove hip_wrapper

* hip fix

* no arc

* small changes

* no clean module in cudacpu
2023-12-01 09:25:27 -08:00
George Hotz 5bb720a777
Cocoa is no longer used 2023-11-23 14:31:21 -08:00
George Hotz 095e2ced61
add name support to fetch (#2407)
* add name support

* use fetch in gpt2

* remove requests from main lib, networkx also optional

* umm, keep that assert

* updates to fetch

* i love the walrus so much

* stop bundling mnist with tinygrad

* err, https

* download cache names

* add DOWNLOAD_CACHE_VERSION

* need env.

* ugh, wrong path

* replace get_child
2023-11-23 14:16:17 -08:00
George Hotz a0890f4e6c
move fetch to helpers (#2363)
* switch datasets to new fetch

* add test_helpers

* fix convnext and delete old torch load
2023-11-19 12:29:51 -08:00
chenyu 163b2bc26a
wgpu.utils._device -> wgpu.utils.device (#2330)
* wgpu.utils._device -> wgpu.utils.device

* can i do this?

* no need to specify metal
2023-11-16 12:52:13 -05:00
geohotstan b853e9bb8c
Onnx 1.15.0 gogogo (#2217)
* lol

* lol

* add GELULULULUL

* onnx 1.50

* fuk torch bool neg

* exclude regex tests

* exclude dequantizelinear for now

* is sunny in philly

* damn it affinegrid

* fixed auto_pad VALID

* skip 0 shape tests

* add temporary cast in Reduces

* tests should pass now

* added comments and cleanup

* try moving dequantizelinear to onnx.py

* fixed dequantizedlinear?

* cleanup

* try?

* float16 segfaults LLVM CI..???

* cleanup comments

* pin to 1.50.0

* remove use of -np.inf cuz numpy is kill

* 1.50? lol I'm actually retarded

* thx for review, muhbad

* moved Gelu higher up
2023-11-10 15:36:48 -08:00
George Hotz a48ccdb359
cleanup deps, no pyyaml, pillow to testing (#2231) 2023-11-07 10:32:23 -08:00
George Hotz 51fd993f1f
pin onnx to 1.14.1 2023-11-02 18:03:21 -07:00
George Hotz 6621d2eb98
Revert "Modernize setup.py (#2187)"
This reverts commit 7e8c5f1a0f.
2023-11-03 01:01:15 +00:00
Elias Wahl 7e8c5f1a0f
Modernize setup.py (#2187)
* Added pyproject.toml

* Pin onnx
2023-10-31 13:55:45 -07:00
Szymon Ożóg e0b2bf46b4
Improve triton generated code quality (#2119) 2023-10-19 22:06:19 -07:00
George Hotz 8940c89d13
tests: remove 2 runners, make cache reliable (#2106)
* remove 2 runners

* device.DEFAULT printing

* explain rebuild

* disable ocelot rebuild

* try again to fix workflow

* this? fix cache hash

* force no rebuild

* fix pylint
2023-10-18 11:10:41 -07:00
mmmkkaaayy 91168a28c4
whisper: make file transcription work, add basic CI test (#2042) 2023-10-13 17:13:35 -07:00
wozeparrot c4e8ea73bd
feat: add tinygrad.features to setup.py (#2016) 2023-10-07 21:55:50 -07:00
George Hotz e43d8977f8
Revert "chore: add `py.typed` marker. (#1991)" (#1994)
This reverts commit 6d581e8911.
2023-10-06 01:44:34 -07:00
Vidhan Bhatt 6d581e8911
chore: add `py.typed` marker. (#1991)
* chore: add `py.typed` marker.

* fix: add comma
2023-10-05 16:27:33 -07:00