diff --git a/.tokeignore b/.tokeignore new file mode 100644 index 00000000..cb7645b2 --- /dev/null +++ b/.tokeignore @@ -0,0 +1,4 @@ +* +!*/ + +!tinygrad/** diff --git a/README.md b/README.md index 1620b777..6fccefc9 100644 --- a/README.md +++ b/README.md @@ -1,91 +1,57 @@ -
-
-"a horse sized cat eating a bagel"
-
-### tinygrad supports LLaMA
-
-After putting the weights in weights/LLaMA, you can have a chat with Stacy. She lives inside tinygrad.
-
-```bash
-python3 examples/llama.py
-```
-
-### tinygrad supports GANs
-
-See `examples/mnist_gan.py`
-
-### tinygrad supports yolo
-
-See `examples/yolov3.py`
-
- -### Drawing Execution Graph - -```bash -GRAPH=1 python3 test/models/test_mnist.py TestMNIST.test_sgd_onestep -# requires dot, outputs /tmp/net.svg -``` +Additional guidelines can be found in [CONTRIBUTING.md](/CONTRIBUTING.md). ### Running tests For more examples on how to run the full test suite please refer to the [CI workflow](.github/workflows/test.yml). -```bash +Some examples: +```sh python3 -m pip install -e '.[testing]' python3 -m pytest python3 -m pytest -v -k TestTrain python3 ./test/models/test_train.py TestTrain.test_efficientnet ``` - diff --git a/docs/README.md b/docs/README.md index 052347e0..6a8de588 100644 --- a/docs/README.md +++ b/docs/README.md @@ -1,125 +1,37 @@ -### Welcome to the tinygrad documentation +# Welcome to the tinygrad documentation! -General instructions you will find in [README.md](https://github.com/geohot/tinygrad/blob/master/README.md) +Here you will find documentation for tinygrad, as well as some examples and tutorials. -[abstraction.py](https://github.com/geohot/tinygrad/blob/master/docs/abstractions.py) is a well documented showcase of the abstraction stack. +## Getting Started -There are plenty of [tests](https://github.com/geohot/tinygrad/tree/master/test) you can read through -[Examples](https://github.com/geohot/tinygrad/tree/master/examples) contains tinygrad implementations of popular models (vision and language) and neural networks. LLama, Stable diffusion, GANs and Yolo to name a few +Read the quick start guide [here](/docs/quickstart.md). -### Environment variables -Here is a list of environment variables you can use with tinygrad. -Most of these are self-explanatory, and used to enable an option at runtime. -Example : `GPU=1 DEBUG=4 python3 -m pytest` +Or if you want to jump right in to how tinygrad works, you can read the [abstraction stack](/docs/abstractions.py) documentation. -The columns are: Variable, Value and Description -They are also grouped into either general tinygrad or specific files +Or if you want to see some examples, you can look at the examples in the [examples](/examples) directory. -##### General tinygrad -DEBUG: [1-4], enable debugging output, with 4 you get operations, timings, speed, generated code and more -GPU: [1], enable the GPU backend -CPU: [1], enable CPU backend -MPS: [1], emable MPS device (for Mac M1 and after) -METAL: [1], enable Metal backend (for Mac M1 and after) -METAL_XCODE: [1], enable Metal using MacOS Xcode sdk -TORCH: [1], enable Torch backend -CLANG: [1], enable Clang backend -LLVM: [1], enable LLVM backend -LLVMOPT: [1], enable LLVM optimization -LAZY: [1], enable lazy operations -OPT: [1-4], enable optimization -OPTLOCAL: [1], enable local optimization -JIT: [1], enable Jit -GRAPH: [1], Create a graph of all operations -GRAPHPATH: [/path/to], what path to generate the graph image -PRUNEGRAPH, [1], prune movementops and loadops from the graph -PRINT_PRG: [1], print program -FLOAT16: [1], use float16 instead of float32 -ENABLE_METHOD_CACHE: [1], enable method cache -EARLY_STOPPING: [1], stop early -DISALLOW_ASSIGN: [1], enable not assigning the realized lazydata to the lazy output buffer +Or if you just want to see some of the things tinygrad can do, check out the [showcase](/docs/showcase.md). -##### tinygrad/codegen/cstyle.py -NATIVE_EXPLOG: [1], enable using native explog +## API -##### accel/ane/2_compile/hwx_parse.py -PRINTALL: [1], print all ane registers +This is currently a big work in progress. 
-##### extra/onnx.py -ONNXLIMIT: [ ], set a limit for Onnx -DEBUGONNX: [1], enable Onnx debugging +## Resources -##### extra/thneed.py -DEBUGCL: [1-4], enable Debugging for OpenCL -PRINT_KERNEL: [1], Print OpenCL Kernels +### Environment Variables -##### extra/kernel_search.py -OP: [1-3], different operations -NOTEST: [1], enable not testing ast -DUMP: [1], enable dumping of intervention cache -REDUCE: [1], enable reduce operations -SIMPLE_REDUCE: [1], enable simpler reduce operations -BC: [1], enable big conv operations -CONVW: [1], enable convw operations -FASTCONV: [1], enable faster conv operations -GEMM: [1], enable general matrix multiply operations -BROKEN: [1], enable a kind of operation -BROKEN3: [1], enable a kind of operation +[env_vars.md](/docs/env_vars.md) -##### examples/vit.py -LARGE: [1], enable larger dimension model +### Adding New Accelerators -##### examples/llama.py -WEIGHTS: [1], enable using weights +[adding_new_accelerators.md](/docs/adding_new_accelerators.md) -##### examples/mlperf -MODEL: [resnet,retinanet,unet3d,rnnt,bert,maskrcnn], what models to use +### Community -##### examples/benchmark_train_efficientnet.py -CNT: [10], the amount of times to loop the benchmark -BACKWARD: [1], enable backward call -TRAINING: [1], set Tensor.training -CLCACHE: [1], enable Cache for OpenCL +[![tinygrad discord](https://discordapp.com/api/guilds/1068976834382925865/widget.png?style=banner2)](https://discord.gg/ZjZadyC7PK) -##### examples/hlb_cifar10.py -TORCHWEIGHTS: [1], use torch to initialize weights -DISABLE_BACKWARD: [1], dont use backward operations +## Contributing -##### examples/benchmark_train_efficientnet.py & examples/hlb_cifar10.py -ADAM: [1], enable Adam optimization - -##### examples/hlb_cifar10.py & xamples/hlb_cifar10_torch.py -STEPS: [0-10], number of steps -FAKEDATA: [1], enable to use random data - -##### examples/train_efficientnet.py -STEPS: [1024 dividable], number of steps -TINY: [1], use a tiny convolution network -IMAGENET: [1], use imagenet for training - -##### examples/train_efficientnet.py & examples/train_resnet.py -TRANSFER: [1], enable to use pretrained data - -##### examples & test/external/external_test_opt.py -NUM: [18, 2], what ResNet[18] / EfficientNet[2] to train - -##### test/test_ops.py -PRINT_TENSORS: [1], print tensors -FORWARD_ONLY: [1], use forward operations only - -##### test/test_speed_v_torch.py -TORCHCUDA: [1], enable the torch cuda backend - -##### test/external/external_test_gpu_ast.py -KOPT: [1], enable kernel optimization -KCACHE: [1], enable kernel cache - -##### test/external/external_test_opt.py -ENET_NUM: [-2,-1], what EfficientNet to use - -##### test/test_dtype.py & test/extra/test_utils.py & extra/training.py -CI: [1], enable to avoid some tests to run in CI - -##### examples & extra & test -BS: [8, 16, 32, 64, 128], bytesize +The documentation mainly follows the core contributing guidelines in the [README.md](/README.md#contributing). +Additionally, we always welcome documentation contributions, especially for features that are currently under documented. diff --git a/docs/adding_new_accelerators.md b/docs/adding_new_accelerators.md new file mode 100644 index 00000000..cd8a280f --- /dev/null +++ b/docs/adding_new_accelerators.md @@ -0,0 +1,32 @@ +# Adding a new accelerator to tinygrad + +It's pretty easy to add a new accelerator to tinygrad. All you need to do is implement a total of 20 (optionally 21) low level ops. Then tinygrad takes care of the rest, handling derivatives and syntactic sugar. 
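+
+To make that a bit more concrete, below is a purely illustrative, numpy-backed sketch of the kind of methods a backend ends up providing: one example from each llop family listed in the next section. The class name and method signatures here are hypothetical, not tinygrad's actual backend interface; the existing CPU and Torch backends in the tinygrad source are the real reference.
+```py
+import numpy as np
+
+class MyBuffer:  # hypothetical Buffer class for a new accelerator
+  def __init__(self, data): self.data = np.asarray(data, dtype=np.float32)
+
+  # unary_op: A -> A (same shape), e.g. EXP
+  def exp(self): return MyBuffer(np.exp(self.data))
+
+  # binary_op: A + A -> A (all the same shape), e.g. ADD
+  def add(self, other): return MyBuffer(self.data + other.data)
+
+  # reduce_op: A -> B (reduced axes become 1 in the output shape), e.g. SUM
+  def sum(self, axis): return MyBuffer(self.data.sum(axis=axis, keepdims=True))
+
+  # movement_op: A -> B (different view of the same data), e.g. PERMUTE
+  def permute(self, order): return MyBuffer(self.data.transpose(order))
+```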
+ +## llops + +These are the ops that you must implement for your accelerator of choice. +``` +Buffer # class of memory on this device +unary_op (NOOP, EXP, LOG, CAST, SIN) # A -> A +reduce_op (SUM, MAX) # A -> B (smaller size, B has 1 in shape) +binary_op (ADD, SUB, MUL, DIV, POW, CMPEQ, MAX) # A + A -> A (all the same size) +movement_op (EXPAND, RESHAPE, PERMUTE, PAD, SHRINK, STRIDE) # A -> B (different size) +fused_op [[optional]] (MULACC) # A * A -> B +``` + +## mlops + +These are the mid level ops that handle the derivatives. +``` +Relu, Log, Exp, Sin # unary ops +Sum, Max # reduce ops (with axis argument) +Maximum, Add, Sub, Mul, Pow, Div, Equal # binary ops (no broadcasting, use expand) +Expand, Reshape, Permute, Pad, Shrink, Flip # movement ops +``` +These are implemented in [mlops.py](/tinygrad/mlops.py). + +## hlops + +These are the syntax sugar. They are built on top of the mlops and support most of the things that you could expect from a tensor library. + +These are implemented in [tensor.py](/tinygrad/tensor.py). diff --git a/docs/env_vars.md b/docs/env_vars.md new file mode 100644 index 00000000..e1cdbd27 --- /dev/null +++ b/docs/env_vars.md @@ -0,0 +1,186 @@ +# List of environment variables that control tinygrad behavior. + +This is a list of environment variable that control the runtime behavior of tinygrad and its examples. +Most of these are self-explanatory, and are usually used to set an option at runtime. + +Example: `GPU=1 DEBUG=4 python3 -m pytest` + +The columns are: Variable, Possible Value(s) and Description. + +- A `#` means that the variable can take any integer value. + +## Global Variables + +These control the behavior of core tinygrad even when used as a library. + +Variable | Possible Value(s) | Description +---|---|--- +DEBUG | [1-4] | enable debugging output, with 4 you get operations, timings, speed, generated code and more +GPU | [1] | enable the GPU backend +CPU | [1] | enable CPU backend +MPS | [1] | enable MPS device (for Mac M1 and after) +METAL | [1] | enable Metal backend (for Mac M1 and after) +METAL_XCODE | [1] | enable Metal using macOS Xcode SDK +TORCH | [1] | enable PyTorch backend +CLANG | [1] | enable Clang backend +LLVM | [1] | enable LLVM backend +LLVMOPT | [1] | enable slightly more expensive LLVM optimizations +LAZY | [1] | enable lazy operations (this is the default) +OPT | [1-4] | optimization level +OPTLOCAL | [1-2] | enable local optimization +GRAPH | [1] | create a graph of all operations (requires graphviz) +GRAPHPATH | [/path/to] | where to put the generated graph +PRUNEGRAPH | [1] | prune MovementOps and LoadOps from the graph +PRINT_PRG | [1] | print program code +IMAGE | [1] | enable 2d specific optimizations +FLOAT16 | [1] | use float16 for images instead of float32 +ENABLE_METHOD_CACHE | [1] | enable method cache (this is the default) +EARLY_STOPPING | [# > 0] | stop after this many kernels +DISALLOW_ASSIGN | [1] | disallow assignment of tensors +NATIVE_EXPLOG | [1] | enable using native exp and log + +## File Specific Variables + +These are variables that control the behavior of a specific file, these usually don't affect the library itself. +Most of the time these will never be used, but they are here for completeness. 
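+
+Whichever group a variable belongs to, the mechanism is the same: the value is read from the process environment at runtime, so it has to be set before launching Python, as in the `GPU=1 DEBUG=4 python3 -m pytest` example above. A rough sketch of the pattern, shown here with plain `os.getenv` (tinygrad itself uses a small `getenv` helper in `tinygrad/helpers.py` for this):
+```py
+import os
+
+# integer-valued flag, defaulting to 0 (disabled) when the variable is unset
+DEBUG = int(os.getenv("DEBUG", "0"))
+
+if DEBUG >= 4:
+  print("verbose debugging output is enabled")
+```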
+
+### accel/ane/2_compile/hwx_parse.py
+
+Variable | Possible Value(s) | Description
+---|---|---
+PRINTALL | [1] | print all ANE registers
+
+### extra/onnx.py
+
+Variable | Possible Value(s) | Description
+---|---|---
+ONNXLIMIT | [#] | set a limit for ONNX
+DEBUGONNX | [1] | enable ONNX debugging
+
+### extra/thneed.py
+
+Variable | Possible Value(s) | Description
+---|---|---
+DEBUGCL | [1-4] | enable debugging for OpenCL
+PRINT_KERNEL | [1] | print OpenCL kernels
+
+### extra/kernel_search.py
+
+Variable | Possible Value(s) | Description
+---|---|---
+OP | [1-3] | different operations
+NOTEST | [1] | skip testing the AST
+DUMP | [1] | enable dumping of intervention cache
+REDUCE | [1] | enable reduce operations
+SIMPLE_REDUCE | [1] | enable simpler reduce operations
+BC | [1] | enable big conv operations
+CONVW | [1] | enable convw operations
+FASTCONV | [1] | enable faster conv operations
+GEMM | [1] | enable general matrix multiply operations
+BROKEN | [1] | enable a kind of operation
+BROKEN3 | [1] | enable a kind of operation
+
+### examples/vit.py
+
+Variable | Possible Value(s) | Description
+---|---|---
+LARGE | [1] | enable larger dimension model
+
+### examples/llama.py
+
+Variable | Possible Value(s) | Description
+---|---|---
+WEIGHTS | [1] | enable loading weights
+
+### examples/mlperf
+
+Variable | Possible Value(s) | Description
+---|---|---
+MODEL | [resnet,retinanet,unet3d,rnnt,bert,maskrcnn] | which model to use
+
+### examples/benchmark_train_efficientnet.py
+
+Variable | Possible Value(s) | Description
+---|---|---
+CNT | [10] | number of times to loop the benchmark
+BACKWARD | [1] | enable backward pass
+TRAINING | [1] | set Tensor.training
+CLCACHE | [1] | enable cache for OpenCL
+
+### examples/hlb_cifar10.py
+
+Variable | Possible Value(s) | Description
+---|---|---
+TORCHWEIGHTS | [1] | use torch to initialize weights
+DISABLE_BACKWARD | [1] | don't do backward pass
+
+### examples/benchmark_train_efficientnet.py & examples/hlb_cifar10.py
+
+Variable | Possible Value(s) | Description
+---|---|---
+ADAM | [1] | use the Adam optimizer
+
+### examples/hlb_cifar10.py & examples/hlb_cifar10_torch.py
+
+Variable | Possible Value(s) | Description
+---|---|---
+STEPS | [0-10] | number of steps
+FAKEDATA | [1] | use random data instead of real data
+
+### examples/train_efficientnet.py
+
+Variable | Possible Value(s) | Description
+---|---|---
+STEPS | [# divisible by 1024] | number of steps
+TINY | [1] | use a tiny convolution network
+IMAGENET | [1] | use ImageNet for training
+
+### examples/train_efficientnet.py & examples/train_resnet.py
+
+Variable | Possible Value(s) | Description
+---|---|---
+TRANSFER | [1] | use pretrained data
+
+### examples & test/external/external_test_opt.py
+
+Variable | Possible Value(s) | Description
+---|---|---
+NUM | [18, 2] | which ResNet[18] / EfficientNet[2] to train
+
+### test/test_ops.py
+
+Variable | Possible Value(s) | Description
+---|---|---
+PRINT_TENSORS | [1] | print tensors
+FORWARD_ONLY | [1] | use forward operations only
+
+### test/test_speed_v_torch.py
+
+Variable | Possible Value(s) | Description
+---|---|---
+TORCHCUDA | [1] | enable the torch CUDA backend
+
+### test/external/external_test_gpu_ast.py
+
+Variable | Possible Value(s) | Description
+---|---|---
+KOPT | [1] | enable kernel optimization
+KCACHE | [1] | enable kernel cache
+
+### test/external/external_test_opt.py
+
+Variable | Possible Value(s) | Description
+---|---|---
+ENET_NUM | [-2,-1] | which EfficientNet to use
+
+### test/test_dtype.py &
test/extra/test_utils.py & extra/training.py + +Variable | Possible Value(s) | Description +---|---|--- +CI | [1] | disables some tests for CI + +### examples & extra & test + +Variable | Possible Value(s) | Description +---|---|--- +BS | [8, 16, 32, 64, 128] | batch size to use diff --git a/docs/quickstart.md b/docs/quickstart.md new file mode 100644 index 00000000..0b405f48 --- /dev/null +++ b/docs/quickstart.md @@ -0,0 +1,300 @@ +# tinygrad Quick Start Guide + +This guide assumes no prior knowledge of pytorch or any other deep learning framework, but does assume some basic knowledge of neural networks. +It is intended to be a very quick overview of the high level API that tinygrad provides. + +This guide is also structured as a tutorial which at the end of it you will have a working model that can classify handwritten digits. + +We need some imports to get started: +```py +import numpy as np +import time +``` + +## Tensors + +Tensors are the base data structure in tinygrad. They can be thought of as a multidimensional array of a specific data type. +All high level operations in tinygrad operate on these tensors. + +The tensor class can be imported like so: +```py +from tinygrad.tensor import Tensor +``` + +Tensors can be created from an existing data structure like a python list or numpy ndarray: +```py +t1 = Tensor([1, 2, 3, 4, 5]) +na = np.array([1, 2, 3, 4, 5]) +t2 = Tensor(na) +``` + +Tensors can also be created using one of the many factory methods: +```py +full = Tensor.full(shape=(2, 3), fill_value=5) # create a tensor of shape (2, 3) filled with 5 +zeros = Tensor.zeros(2, 3) # create a tensor of shape (2, 3) filled with 0 +ones = Tensor.ones(2, 3) # create a tensor of shape (2, 3) filled with 1 + +full_like = Tensor.full_like(full, fill_value=2) # create a tensor of the same shape as `full` filled with 2 +zeros_like = Tensor.zeros_like(full) # create a tensor of the same shape as `full` filled with 0 +ones_like = Tensor.ones_like(full) # create a tensor of the same shape as `full` filled with 1 + +eye = Tensor.eye(3) # create a 3x3 identity matrix +arange = Tensor.arange(start=0, stop=10, step=1) # create a tensor of shape (10,) filled with values from 0 to 9 + +rand = Tensor.rand(2, 3) # create a tensor of shape (2, 3) filled with random values from a uniform distribution +randn = Tensor.randn(2, 3) # create a tensor of shape (2, 3) filled with random values from a normal distribution +uniform = Tensor.uniform(2, 3, low=0, high=10) # create a tensor of shape (2, 3) filled with random values from a uniform distribution between 0 and 10 +``` +There are even more of these factory methods, you can find them in the [tensor.py](/tinygrad/tensor.py) file. + +All the tensors creation methods can take a `dtype` argument to specify the data type of the tensor. +```py +from tinygrad.helpers import dtypes + +t3 = Tensor([1, 2, 3, 4, 5], dtype=dtypes.int32) +``` + +Tensors allow you to perform operations on them like so: +```py +t4 = Tensor([1, 2, 3, 4, 5]) +t5 = (t4 + 1) * 2 +t6 = (t5 * t4).relu().log_softmax() +``` + +All of these operations are lazy and are only executed when you realize the tensor using `.realize()` or `.numpy()`. +```py +print(t6.numpy()) +# [-56. -48. -36. -20. 0.] +``` + +There are a lot more operations that can be performed on tensors, you can find them in the [tensor.py](/tinygrad/tensor.py) file. +Additionally reading through [abstractions.py](/docs/abstractions.py) will help you understand how operations on these tensors make their way down to your hardware. 
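+
+One more tensor feature worth seeing before we build a model: tensors can also track gradients, which is what the training section below relies on. A tiny illustrative example (the gradient of `sum(x*x)` with respect to `x` is `2*x`):
+```py
+x = Tensor([2.0, 3.0], requires_grad=True)
+y = (x * x).sum()
+y.backward()
+print(x.grad.numpy())  # [4. 6.]
+```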
+ +## Models + +Neural networks in tinygrad are really just represented by the operations performed on tensors. +These operations are commonly grouped into the `__call__` method of a class which allows modularization and reuse of these groups of operations. +These classes do not need to inherit from any base class, in fact if they don't need any trainable parameters they don't even need to be a class! + +An example of this would be the `nn.Linear` class which represents a linear layer in a neural network. +```py +# from tinygrad.nn import Linear +class Linear: + def __init__(self, in_features, out_features, bias=True, initialization: str='kaiming_uniform'): + self.weight = getattr(Tensor, initialization)(out_features, in_features) + self.bias = Tensor.zeros(out_features) if bias else None + + def __call__(self, x): + return x.linear(self.weight.transpose(), self.bias) +``` +There are more neural network modules already implemented in [nn](/tinygrad/nn/__init__.py), and you can also implement your own. + +We will be implementing a simple neural network that can classify handwritten digits from the MNIST dataset. +Our classifier will be a simple 2 layer neural network with a Leaky ReLU activation function. +It will use a hidden layer size of 128 and an output layer size of 10 (one for each digit) with no bias on either Linear layer. +```py +from tinygrad.nn import Linear + +class TinyNet: + def __init__(self): + self.l1 = Linear(784, 128, bias=False) + self.l2 = Linear(128, 10, bias=False) + + def __call__(self, x): + x = self.l1(x) + x = x.leakyrelu() + x = self.l2(x) + return x.log_softmax() + +net = TinyNet() +``` +We can see that the forward pass of our neural network is just the sequence of operations performed on the input tensor `x`. +We can also see that functional operations like `leakyrelu` and `log_softmax` are not defined as classes and instead are just methods we can just call. +Finally, we just initialize an instance of our neural network, and we are ready to start training it. + +## Training + +Now that we have our neural network defined we can start training it. +Training neural networks in tinygrad is super simple. +All we need to do is define our neural network, define our loss function, and then call `.backward()` on the loss function to compute the gradients. +They can then be used to update the parameters of our neural network using one of the many optimizers in [optim.py](/tinygrad/nn/optim.py). + +First we need to set the training flag in `Tensor`: +```py +Tensor.training = True +``` + +For our loss function we will be using cross entropy loss. +```py +# from extra.training import sparse_categorical_crossentropy +def cross_entropy(out, Y): + num_classes = out.shape[-1] + YY = Y.flatten().astype(np.int32) + y = np.zeros((YY.shape[0], num_classes), np.float32) + y[range(y.shape[0]),YY] = -1.0*num_classes + y = y.reshape(list(Y.shape)+[num_classes]) + y = Tensor(y) + return out.mul(y).mean() +``` +As we can see in this implementation of cross entropy loss, there are certain operations that tinygrad does not support. +Namely, operations that are load/store like indexing a tensor with another tensor or assigning a value to a tensor at a certain index. +Load/store ops are not supported in tinygrad because they add complexity when trying to port to different backends and 90% of the models out there don't use/need them. + +For our optimizer we will be using the traditional stochastic gradient descent optimizer with a learning rate of 3e-4. 
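+
+Conceptually, a single SGD step just moves every parameter a small amount against its gradient. A plain-number sketch of that update rule (illustrative only, not tinygrad's actual optimizer code):
+```py
+lr = 3e-4
+weight = 0.5   # stand-in for one parameter value
+grad = 0.1     # its gradient, as filled in by loss.backward()
+weight = weight - lr * grad  # step downhill; opt.step() does this for every parameter tensor
+```
+The `SGD` optimizer in tinygrad takes care of this update for us: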
+```py +from tinygrad.nn.optim import SGD + +opt = SGD([net.l1.weight, net.l2.weight], lr=3e-4) +``` +We can see that we are passing in the parameters of our neural network to the optimizer. +This is due to the fact that the optimizer needs to know which parameters to update. +There is a simpler way to do this just by using `get_parameters(net)` from `tinygrad.nn.optim` which will return a list of all the parameters in the neural network. +The parameters are just listed out explicitly here for clarity. + +Now that we have our network, loss function, and optimizer defined all we are missing is the data to train on! +There are a couple of dataset loaders in tinygrad located in [/datasets](/datasets). +We will be using the MNIST dataset loader. +```py +from datasets import fetch_mnist +``` + +Now we have everything we need to start training our neural network. +We will be training for 1000 steps with a batch size of 64. +```py +X_train, Y_train, X_test, Y_test = fetch_mnist() + +for step in range(1000): + # random sample a batch + samp = np.random.randint(0, X_train.shape[0], size=(64)) + batch = Tensor(X_train[samp], requires_grad=False) + # get the corresponding labels + labels = Y_train[samp] + + # forward pass + out = net(batch) + + # compute loss + loss = cross_entropy(out, labels) + + # zero gradients + opt.zero_grad() + + # backward pass + loss.backward() + + # update parameters + opt.step() + + # calculate accuracy + pred = np.argmax(out.numpy(), axis=-1) + acc = (pred == labels).mean() + + if step % 100 == 0: + print(f"Step {step+1} | Loss: {loss.numpy()} | Accuracy: {acc}") +``` + +## Evaluation + +Now that we have trained our neural network we can evaluate it on the test set. +We will be using the same batch size of 64 and will be evaluating for 1000 of those batches. +```py +# set training flag to false +Tensor.training = False + +st = time.perf_counter() +avg_acc = 0 +for step in range(1000): + # random sample a batch + samp = np.random.randint(0, X_test.shape[0], size=(64)) + batch = Tensor(X_test[samp], requires_grad=False) + # get the corresponding labels + labels = Y_test[samp] + + # forward pass + out = net(batch) + + # calculate accuracy + pred = np.argmax(out.numpy(), axis=-1) + avg_acc += (pred == labels).mean() +print(f"Test Accuracy: {avg_acc / 1000}") +print(f"Time: {time.perf_counter() - st}") +``` + +## And that's it! + +Highly recommend you check out the [examples/](/examples) folder for more examples of using tinygrad. +Reading the source code of tinygrad is also a great way to learn how it works. +Specifically the tests in [tests/](/tests) are a great place to see how to use and the semantics of the different operations. +There are also a bunch of models implemented in [models/](/models) that you can use as a reference. + +Additionally, feel free to ask questions in the `#learn-tinygrad` channel on the [discord](https://discord.gg/beYbxwxVdx). Don't ask to ask, just ask! + +## Extras + +### JIT + +Additionally, it is possible to speed up the computation of certain neural networks by using the JIT. +Currently, this does not support models with varying input sizes and non tinygrad operations. + +To use the JIT we just need to add a function decorator to the forward pass of our neural network and ensure that the input and output are realized tensors. +Or in this case we will create a wrapper function and decorate the wrapper function to speed up the evaluation of our neural network. 
+```py +from tinygrad.jit import TinyJit + +@TinyJit +def jit(x): + return net(x).realize() + +st = time.perf_counter() +avg_acc = 0 +for step in range(1000): + # random sample a batch + samp = np.random.randint(0, X_test.shape[0], size=(64)) + batch = Tensor(X_test[samp], requires_grad=False) + # get the corresponding labels + labels = Y_test[samp] + + # forward pass with jit + out = jit(batch) + + # calculate accuracy + pred = np.argmax(out.numpy(), axis=-1) + avg_acc += (pred == labels).mean() +print(f"Test Accuracy: {avg_acc / 1000}") +print(f"Time: {time.perf_counter() - st}") +``` +You will find that the evaluation time is much faster than before and that your accelerator utilization is much higher. + +### Saving and Loading Models + +The standard weight format for tinygrad is [safetensors](https://github.com/huggingface/safetensors). This means that you can load the weights of any model also using safetensors into tinygrad. +There are functions in [state.py](/tinygrad/state.py) to save and load models to and from this format. +```py +from tinygrad.state import safe_save, safe_load, get_state_dict, load_state_dict + +# first we need the state dict of our model +state_dict = get_state_dict(net) + +# then we can just save it to a file +safe_save(state_dict, "model.safetensors") + +# and load it back in +state_dict = safe_load("model.safetensors") +load_state_dict(net, state_dict) +``` + +Many of the models in the [models/](/models) folder have a `load_from_pretrained` method that will download and load the weights for you. These usually are pytorch weights meaning that you would need pytorch installed to load them. + +### Environment Variables + +There exist a bunch of environment variables that control the runtime behavior of tinygrad. +Some of the commons ones are `DEBUG` and the different backend enablement variables. + +You can find a full list and their descriptions in [env_vars.md](/docs/env_vars.md). + +### Visualizing the Computation Graph + +It is possible to visualize the computation graph of a neural network using [graphviz](https://graphviz.org/). + +This is easily done by running a single pass (forward or backward!) of the neural network with the environment variable `GRAPH` set to `1`. +The graph will be saved to `/tmp/net.svg` by default. diff --git a/docs/showcase.md b/docs/showcase.md new file mode 100644 index 00000000..56d907bc --- /dev/null +++ b/docs/showcase.md @@ -0,0 +1,61 @@ +# tinygrad Showcase + +Despite being a tiny library, tinygrad is capable of doing a lot of things. From state-of-the-art [vision](https://arxiv.org/abs/1905.11946) to state-of-the-art [language](https://arxiv.org/abs/1706.03762) models. + +## Vision + +### EfficientNet + +You can either pass in the URL of a picture to discover what it is: +```sh +python3 examples/efficientnet.py https://media.istockphoto.com/photos/hen-picture-id831791190 +``` +Or, if you have a camera and OpenCV installed, you can detect what is in front of you: +```sh +python3 examples/efficientnet.py webcam +``` + +### YOLOv3 + +Take a look at [yolov3.py](/examples/yolov3.py). + +![yolo by tinygrad](/docs/showcase/yolo_by_tinygrad.jpg) + +## Audio + +### Whisper + +Take a look at [whisper.py](/examples/whisper.py). You need pyaudio and torchaudio installed. + +```sh +SMALL=1 python3 examples/whisper.py +``` + +## Generative + +### Generative Adversarial Networks + +Take a look at [mnist_gan.py](/examples/mnist_gan.py). 
+
+![mnist gan by tinygrad](/docs/showcase/mnist_by_tinygrad.jpg)
+
+### Stable Diffusion
+
+You will need to download the [weights](https://huggingface.co/CompVis/stable-diffusion-v-1-4-original/resolve/main/sd-v1-4.ckpt) of Stable Diffusion and put them into the [weights/](/weights) directory.
+
+```sh
+python3 examples/stable_diffusion.py
+```
+
+![a horse sized cat eating a bagel](/docs/showcase/stable_diffusion_by_tinygrad.jpg)
+
+*"a horse sized cat eating a bagel"*
+
+### LLaMA
+
+You will need to download the weights and put them into the [weights/LLaMA](/weights/LLaMA) directory, which may need to be created.
+
+Then you can have a chat with Stacy:
+```sh
+python3 examples/llama.py
+```
diff --git a/docs/mnist_by_tinygrad.jpg b/docs/showcase/mnist_by_tinygrad.jpg
similarity index 100%
rename from docs/mnist_by_tinygrad.jpg
rename to docs/showcase/mnist_by_tinygrad.jpg
diff --git a/docs/stable_diffusion_by_tinygrad.jpg b/docs/showcase/stable_diffusion_by_tinygrad.jpg
similarity index 100%
rename from docs/stable_diffusion_by_tinygrad.jpg
rename to docs/showcase/stable_diffusion_by_tinygrad.jpg
diff --git a/docs/yolo_by_tinygrad.jpg b/docs/showcase/yolo_by_tinygrad.jpg
similarity index 100%
rename from docs/yolo_by_tinygrad.jpg
rename to docs/showcase/yolo_by_tinygrad.jpg