# List of environment variables that control tinygrad behavior.

This is a list of environment variables that control the runtime behavior of tinygrad and its examples.
Most of these are self-explanatory and are usually used to set an option at runtime.

Example: `GPU=1 DEBUG=4 python3 -m pytest`

The columns are: Variable, Possible Value(s) and Description.

- A `#` means that the variable can take any integer value.

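To make the value conventions concrete, here is a minimal, illustrative sketch of how an integer-valued flag such as `DEBUG` (or any `#` variable) is typically read; the `env_int` helper is hypothetical and not part of tinygrad's API.

```python
# Illustrative sketch only, not tinygrad's actual implementation:
# reading an integer-valued flag such as DEBUG ([1-4]) or any `#` variable.
import os

def env_int(name: str, default: int = 0) -> int:
    """Return an environment variable as an int, or a default if unset."""
    value = os.getenv(name)
    return int(value) if value is not None else default

DEBUG = env_int("DEBUG")  # 0 when unset; 1-4 enable increasingly verbose output
if DEBUG >= 4:
    print("DEBUG=4 would also show generated code and timings")
```
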
## Global Variables

These control the behavior of core tinygrad even when used as a library.

Variable | Possible Value(s) | Description
---|---|---
DEBUG | [1-4] | enable debugging output; with 4 you get operations, timings, speed, generated code and more
GPU | [1] | enable the GPU backend
CUDA | [1] | enable the CUDA backend
CPU | [1] | enable the CPU backend
MPS | [1] | enable the MPS device (for Mac M1 and later)
METAL | [1] | enable the Metal backend (for Mac M1 and later)
METAL_XCODE | [1] | enable Metal using the macOS Xcode SDK
TORCH | [1] | enable the PyTorch backend
CLANG | [1] | enable the Clang backend
LLVM | [1] | enable the LLVM backend
LLVMOPT | [1] | enable slightly more expensive LLVM optimizations
LAZY | [1] | enable lazy operations (this is the default)
OPT | [1-4] | optimization level
GRAPH | [1] | create a graph of all operations (requires graphviz)
GRAPHPATH | [/path/to] | where to put the generated graph
PRUNEGRAPH | [1] | prune MovementOps and LoadOps from the graph
PRINT_PRG | [1] | print program code
IMAGE | [1] | enable 2D-specific optimizations
FLOAT16 | [1] | use float16 for images instead of float32
ENABLE_METHOD_CACHE | [1] | enable the method cache (this is the default)
EARLY_STOPPING | [# > 0] | stop after this many kernels
DISALLOW_ASSIGN | [1] | disallow assignment of tensors
CL_EXCLUDE | [name0,name1] | comma-separated list of device names to exclude when using the OpenCL GPU backend (e.g. `CL_EXCLUDE=gfx1036`)
CL_PLATFORM | [# >= 0] | index of the OpenCL [platform](https://documen.tician.de/pyopencl/runtime_platform.html#pyopencl.Platform) to run on. Defaults to 0.
RDNA | [1] | enable the specialized [RDNA 3](https://en.wikipedia.org/wiki/RDNA_3) assembler for AMD 7000-series GPUs. If not set, defaults to the generic OpenCL codegen backend.
PTX | [1] | enable the specialized [PTX](https://docs.nvidia.com/cuda/parallel-thread-execution/) assembler for Nvidia GPUs. If not set, defaults to the generic CUDA codegen backend.

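Because these flags also apply when tinygrad is embedded as a library, they can be set from Python as well. The sketch below is a minimal example, assuming the flags are read when tinygrad is first imported, so they are set before the import.

```python
# Minimal sketch: selecting a backend and debug level from Python rather than
# the shell. Assumes the flags are read when tinygrad is first imported,
# so they are set beforehand.
import os

os.environ["GPU"] = "1"    # enable the GPU backend
os.environ["DEBUG"] = "2"  # show timing/debug output

from tinygrad.tensor import Tensor  # import after setting the flags

x = Tensor.randn(4, 4)
print((x + x).numpy())
```
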
## File Specific Variables

These are variables that control the behavior of a specific file; they usually don't affect the library itself.
Most of the time these will never be used, but they are listed here for completeness.

### accel/ane/2_compile/hwx_parse.py

Variable | Possible Value(s) | Description
---|---|---
PRINTALL | [1] | print all ANE registers

### extra/onnx.py

Variable | Possible Value(s) | Description
---|---|---
ONNXLIMIT | [#] | set a limit for ONNX
DEBUGONNX | [1] | enable ONNX debugging

### extra/thneed.py

Variable | Possible Value(s) | Description
---|---|---
DEBUGCL | [1-4] | enable debugging for OpenCL
PRINT_KERNEL | [1] | print OpenCL kernels

### extra/kernel_search.py

Variable | Possible Value(s) | Description
---|---|---
OP | [1-3] | different operations
NOTEST | [1] | skip testing the AST
DUMP | [1] | enable dumping of the intervention cache
REDUCE | [1] | enable reduce operations
SIMPLE_REDUCE | [1] | enable simpler reduce operations
BC | [1] | enable big conv operations
CONVW | [1] | enable convw operations
FASTCONV | [1] | enable faster conv operations
GEMM | [1] | enable general matrix multiply operations
BROKEN | [1] | enable a kind of operation
BROKEN3 | [1] | enable a kind of operation

### examples/vit.py

Variable | Possible Value(s) | Description
---|---|---
LARGE | [1] | enable the larger model variant

### examples/llama.py

Variable | Possible Value(s) | Description
---|---|---
WEIGHTS | [1] | enable loading weights

### examples/mlperf

Variable | Possible Value(s) | Description
---|---|---
MODEL | [resnet,retinanet,unet3d,rnnt,bert,maskrcnn] | which model to use

### examples/benchmark_train_efficientnet.py

Variable | Possible Value(s) | Description
---|---|---
CNT | [10] | the number of times to loop the benchmark
BACKWARD | [1] | enable backward pass
TRAINING | [1] | set Tensor.training
CLCACHE | [1] | enable cache for OpenCL

### examples/hlb_cifar10.py

Variable | Possible Value(s) | Description
---|---|---
TORCHWEIGHTS | [1] | use torch to initialize weights
DISABLE_BACKWARD | [1] | don't do backward pass

### examples/benchmark_train_efficientnet.py & examples/hlb_cifar10.py

Variable | Possible Value(s) | Description
---|---|---
ADAM | [1] | use the Adam optimizer

### examples/hlb_cifar10.py & examples/hlb_cifar10_torch.py

Variable | Possible Value(s) | Description
---|---|---
STEPS | [0-10] | number of steps
FAKEDATA | [1] | use random data instead of the real dataset

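For example, `STEPS=5 FAKEDATA=1 python3 examples/hlb_cifar10.py` should run a short training loop on random data.
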
### examples/train_efficientnet.py

Variable | Possible Value(s) | Description
---|---|---
STEPS | [# % 1024] | number of steps
TINY | [1] | use a tiny convolution network
IMAGENET | [1] | use ImageNet for training

### examples/train_efficientnet.py & examples/train_resnet.py

Variable | Possible Value(s) | Description
---|---|---
TRANSFER | [1] | use pretrained weights (transfer learning)

### examples & test/external/external_test_opt.py

Variable | Possible Value(s) | Description
---|---|---
NUM | [18, 2] | which ResNet [18] / EfficientNet [2] to train

### test/test_ops.py

Variable | Possible Value(s) | Description
---|---|---
PRINT_TENSORS | [1] | print tensors
FORWARD_ONLY | [1] | use forward operations only

### test/test_speed_v_torch.py

Variable | Possible Value(s) | Description
---|---|---
TORCHCUDA | [1] | enable the torch CUDA backend

### test/external/external_test_gpu_ast.py

Variable | Possible Value(s) | Description
---|---|---
KOPT | [1] | enable kernel optimization
KCACHE | [1] | enable kernel cache

### test/external/external_test_opt.py

Variable | Possible Value(s) | Description
---|---|---
ENET_NUM | [-2,-1] | which EfficientNet to use

### test/test_dtype.py & test/extra/test_utils.py & extra/training.py

Variable | Possible Value(s) | Description
---|---|---
CI | [1] | disables some tests for CI

### examples & extra & test

Variable | Possible Value(s) | Description
---|---|---
BS | [8, 16, 32, 64, 128] | batch size to use

### datasets/imagenet_download.py

Variable | Possible Value(s) | Description
---|---|---
IMGNET_TRAIN | [1] | also download the ImageNet training data