tinygrad/setup.py

#!/usr/bin/env python3

import os
from setuptools import setup

directory = os.path.abspath(os.path.dirname(__file__))
with open(os.path.join(directory, 'README.md'), encoding='utf-8') as f:
  long_description = f.read()

setup(name='tinygrad',
      version='0.4.0',
      description='You like pytorch? You like micrograd? You love tinygrad! ❤️',
      author='George Hotz',
      license='MIT',
      long_description=long_description,
      long_description_content_type='text/markdown',
      packages=['tinygrad', 'tinygrad.llops', 'tinygrad.nn', 'tinygrad.runtime', 'tinygrad.shape'],
      classifiers=[
        "Programming Language :: Python :: 3",
        "License :: OSI Approved :: MIT License"
      ],
      install_requires=['numpy', 'requests', 'pillow', 'tqdm', 'networkx'],
      python_requires='>=3.8',
      extras_require={
        'gpu': ["pyopencl", "six"],
        'llvm': ["llvmlite"],
        'cuda': ["pycuda"],
        'triton': ["triton>=2.0.0.dev20221202"],
        'linting': [
            "flake8",
            "pylint",
            "mypy",
            "pre-commit",
        ],
        'testing': [
            "torch~=1.13.0",
            "protobuf~=3.19.0",
            "pytest",
            "pytest-xdist",
            "onnx~=1.12.0",
            "onnx2torch",
        ],
      },
      include_package_data=True)