# List of environment variables that control tinygrad behavior.
This is a list of environment variables that control the runtime behavior of tinygrad and its examples.
Most of these are self-explanatory, and are usually used to set an option at runtime.
Example: `GPU=1 DEBUG=4 python3 -m pytest`
The columns are: Variable, Possible Value(s) and Description.
- A `#` means that the variable can take any integer value.
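
As a rough illustration of what "can take any integer value" means in practice, the sketch below reads an integer-valued variable with a fallback when it is unset. The helper name `read_int_flag` is hypothetical and this is not tinygrad's own implementation; it only assumes the values are parsed with `int()`.

```python
import os

def read_int_flag(name: str, default: int = 0) -> int:
    """Return the integer value of an environment variable, or `default` if unset.

    Hypothetical helper for illustration; not part of tinygrad.
    """
    return int(os.environ.get(name, default))

# Simulate launching with `DEBUG=4` and without `GPU` set.
os.environ["DEBUG"] = "4"
os.environ.pop("GPU", None)

debug_level = read_int_flag("DEBUG")  # variable is set, so its value is parsed
gpu_enabled = read_int_flag("GPU")    # variable is unset, so the default is used
print(debug_level, gpu_enabled)
```

With a reader like this, an unset variable and a variable set to `0` behave the same way, which is why flags are typically enabled with `1`.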
## Global Variables
These control the behavior of core tinygrad even when used as a library.
Variable | Possible Value(s) | Description
---|---|---
DEBUG | [1-4] | enable debugging output; at 4 you get operations, timings, speed, generated code and more
GPU | [1] | enable the GPU backend
CUDA | [1] | enable CUDA backend
CPU | [1] | enable CPU backend
MPS | [1] | enable MPS device (for Mac M1 and after)
METAL | [1] | enable Metal backend (for Mac M1 and after)
METAL_XCODE | [1] | enable Metal using macOS Xcode SDK
TORCH | [1] | enable PyTorch backend
CLANG | [1] | enable Clang backend
LLVM | [1] | enable LLVM backend
LLVMOPT | [1] | enable slightly more expensive LLVM optimizations
LAZY | [1] | enable lazy operations (this is the default)
OPT | [1-4] | optimization level
GRAPH | [1] | create a graph of all operations (requires graphviz)
GRAPHPATH | [/path/to] | where to put the generated graph
PRUNEGRAPH | [1] | prune MovementOps and LoadOps from the graph
PRINT_PRG | [1] | print program code
IMAGE | [1] | enable 2d specific optimizations
FLOAT16 | [1] | use float16 for images instead of float32
ENABLE_METHOD_CACHE | [1] | enable method cache (this is the default)
EARLY_STOPPING | [# > 0] | stop after this many kernels
DISALLOW_ASSIGN | [1] | disallow assignment of tensors
CL_EXCLUDE | [name0,name1] | comma-separated list of device names to exclude when using OpenCL GPU backend (like `CL_EXCLUDE=gfx1036`)
CL_PLATFORM | [# >= 0] | index of the OpenCL [platform](https://documen.tician.de/pyopencl/runtime_platform.html#pyopencl.Platform) to run on. Defaults to 0.
RDNA | [1] | enable the specialized [RDNA 3](https://en.wikipedia.org/wiki/RDNA_3) assembler for AMD 7000-series GPUs. If not set, defaults to generic OpenCL codegen backend.
PTX | [1] | enable the specialized [PTX](https://docs.nvidia.com/cuda/parallel-thread-execution/) assembler for Nvidia GPUs. If not set, defaults to generic CUDA codegen backend.
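
The variables in the table above are normally set on the command line, but they can also be passed to a child process from Python. The sketch below is one possible way to do that with the standard library; `run_with_env` is a hypothetical helper, and the child command here only echoes the variables back rather than running tinygrad.

```python
import os
import subprocess
import sys

def run_with_env(args, **overrides):
    """Run `args` with extra environment variables, leaving the parent's environment untouched.

    Hypothetical helper for illustration; not part of tinygrad.
    """
    env = {**os.environ, **{k: str(v) for k, v in overrides.items()}}
    return subprocess.run(args, env=env, capture_output=True, text=True)

# Stand-in for a real script: print the variables the child process sees.
child = [sys.executable, "-c", "import os; print(os.environ['DEBUG'], os.environ['OPT'])"]

# Roughly equivalent to `DEBUG=2 OPT=1 python3 -c ...` in a shell.
result = run_with_env(child, DEBUG=2, OPT=1)
print(result.stdout.strip())
```

Passing a copy of the environment keeps the settings scoped to that one run, so different runs can use different values without affecting each other.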
## File Specific Variables
These variables control the behavior of a specific file and usually don't affect the library itself.
Most of the time these will never be used, but they are here for completeness.
### accel/ane/2_compile/hwx_parse.py
Variable | Possible Value(s) | Description
---|---|---
PRINTALL | [1] | print all ANE registers
### extra/onnx.py
Variable | Possible Value(s) | Description
---|---|---
ONNXLIMIT | [#] | set a limit for ONNX
DEBUGONNX | [1] | enable ONNX debugging
### extra/thneed.py
Variable | Possible Value(s) | Description
---|---|---
DEBUGCL | [1-4] | enable debugging for OpenCL
PRINT_KERNEL | [1] | print OpenCL kernels
### extra/kernel_search.py
Variable | Possible Value(s) | Description
---|---|---
OP | [1-3] | different operations
NOTEST | [1] | enable not testing AST
DUMP | [1] | enable dumping of intervention cache
REDUCE | [1] | enable reduce operations
SIMPLE_REDUCE | [1] | enable simpler reduce operations
BC | [1] | enable big conv operations
CONVW | [1] | enable convw operations
FASTCONV | [1] | enable faster conv operations
GEMM | [1] | enable general matrix multiply operations
BROKEN | [1] | enable a kind of operation
BROKEN3 | [1] | enable a kind of operation
### examples/vit.py
Variable | Possible Value(s) | Description
---|---|---
LARGE | [1] | enable larger dimension model
### examples/llama.py
Variable | Possible Value(s) | Description
---|---|---
WEIGHTS | [1] | enable loading weights
### examples/mlperf
Variable | Possible Value(s) | Description
---|---|---
MODEL | [resnet,retinanet,unet3d,rnnt,bert,maskrcnn] | what models to use
### examples/benchmark_train_efficientnet.py
Variable | Possible Value(s) | Description
---|---|---
CNT | [10] | the number of times to loop the benchmark
BACKWARD | [1] | enable backward pass
TRAINING | [1] | set Tensor.training
CLCACHE | [1] | enable cache for OpenCL
### examples/hlb_cifar10.py
Variable | Possible Value(s) | Description
---|---|---
TORCHWEIGHTS | [1] | use torch to initialize weights
DISABLE_BACKWARD | [1] | don't do backward pass
### examples/benchmark_train_efficientnet.py & examples/hlb_cifar10.py
Variable | Possible Value(s) | Description
---|---|---
ADAM | [1] | use the Adam optimizer
### examples/hlb_cifar10.py & examples/hlb_cifar10_torch.py
Variable | Possible Value(s) | Description
---|---|---
STEPS | [0-10] | number of steps
FAKEDATA | [1] | use random data
### examples/train_efficientnet.py
Variable | Possible Value(s) | Description
---|---|---
STEPS | [# % 1024] | number of steps
TINY | [1] | use a tiny convolution network
IMAGENET | [1] | use imagenet for training
### examples/train_efficientnet.py & examples/train_resnet.py
Variable | Possible Value(s) | Description
---|---|---
TRANSFER | [1] | use pretrained data
### examples & test/external/external_test_opt.py
Variable | Possible Value(s) | Description
---|---|---
NUM | [18, 2] | what ResNet[18] / EfficientNet[2] to train
### test/test_ops.py
Variable | Possible Value(s) | Description
---|---|---
PRINT_TENSORS | [1] | print tensors
FORWARD_ONLY | [1] | use forward operations only
### test/test_speed_v_torch.py
Variable | Possible Value(s) | Description
---|---|---
TORCHCUDA | [1] | enable the torch cuda backend
### test/external/external_test_gpu_ast.py
Variable | Possible Value(s) | Description
---|---|---
KOPT | [1] | enable kernel optimization
KCACHE | [1] | enable kernel cache
### test/external/external_test_opt.py
Variable | Possible Value(s) | Description
---|---|---
ENET_NUM | [-2,-1] | what EfficientNet to use
### test/test_dtype.py & test/extra/test_utils.py & extra/training.py
Variable | Possible Value(s) | Description
---|---|---
CI | [1] | disables some tests for CI
### examples & extra & test
Variable | Possible Value(s) | Description
---|---|---
BS | [8, 16, 32, 64, 128] | batch size to use
### datasets/imagenet_download.py
Variable | Possible Value(s) | Description
---|---|---
IMGNET_TRAIN | [1] | also download the ImageNet training data