mirror of https://github.com/commaai/tinygrad.git
docs cleanup and move (#4593)
* cleanup and move
* docs-legacy is gone
* don't update setup.py
parent fd02ab1e8b
commit 9425973bc7

@@ -91,7 +91,7 @@ jobs:
      run: python -m mypy --strict-equality
    - name: Test Docs
      run: |
        python docs-legacy/abstractions2.py
        python docs/abstractions2.py
    - name: Test Quickstart
      run: awk '/```python/{flag=1;next}/```/{flag=0}flag' docs/quickstart.md > quickstart.py && PYTHONPATH=. python quickstart.py
    - name: Fuzz Test symbolic

@@ -21,7 +21,7 @@ repos:
    pass_filenames: false
  - id: docs2
    name: docs2
    entry: python3 docs-legacy/abstractions2.py
    entry: python3 docs/abstractions2.py
    language: system
    always_run: true
    pass_filenames: false

@@ -1,4 +0,0 @@
*
!*/

!tinygrad/**

@@ -1,8 +1,8 @@
<div align="center">

<picture>
  <source media="(prefers-color-scheme: light)" srcset="/docs-legacy/logo_tiny_light.svg">
  <img alt="tiny corp logo" src="/docs-legacy/logo_tiny_dark.svg" width="50%" height="50%">
  <source media="(prefers-color-scheme: light)" srcset="/docs/logo_tiny_light.svg">
  <img alt="tiny corp logo" src="/docs/logo_tiny_dark.svg" width="50%" height="50%">
</picture>

tinygrad: For something between [PyTorch](https://github.com/pytorch/pytorch) and [karpathy/micrograd](https://github.com/karpathy/micrograd). Maintained by [tiny corp](https://tinygrad.org).

@@ -87,7 +87,7 @@ tinygrad already supports numerous accelerators, including:
- [x] [HSA](tinygrad/runtime/ops_hsa.py)

And it is easy to add more! Your accelerator of choice only needs to support a total of ~25 low level ops.
More information can be found in the [documentation for adding new accelerators](/docs-legacy/adding_new_accelerators.md).
More information can be found in the [documentation for adding new accelerators](/docs/adding_new_accelerators.md).

## Installation

@@ -1,17 +0,0 @@
tinygrad is a bit bloated now, and there are several places where concerns should be separated and they aren't.

tensor.py and mlops.py are great code. The interface going backward here is (a short Tensor-level sketch follows the list):

LazyBuffer.const (this creates a matching size buffer)
LazyBuffer.contiguous (this is not exactly elementwise)
LazyBuffer.e (elementwise)
LazyBuffer.r (reduce)
reshape/permute/expand/stride/shrink/pad (movement)
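A rough Tensor-level sketch of how user code bottoms out in that interface; the comments map each op onto the LazyBuffer method named above, but the mapping is illustrative and the internal call signatures are not shown.

```python
from tinygrad import Tensor

# Rough mapping of user-level Tensor ops onto the LazyBuffer interface above
# (e = elementwise, r = reduce, plus the movement ops). The comments describe
# the intended lowering, not the exact internal calls.
x = Tensor.ones(2, 3)
y = x * 2                          # elementwise -> LazyBuffer.e
z = y.sum(axis=1)                  # reduce      -> LazyBuffer.r
w = z.reshape(2, 1).expand(2, 3)   # movement ops on the underlying view
print(w.numpy())
```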

The lazy.py reordering engine has a lot of junk to deal with movementops that should be removed.

view.py is mostly great code, except it shouldn't have the rendering logic, and the int type should be parameterized to not import from symbolic.

LazyOp shouldn't have LazyBuffers as sources, just LazyOp LoadOps with a tuple of Views. Then the LazyOp uniquely determines the kernel and we don't have to do any replacement.

ShapeTracker probably shouldn't exist and just be a part of LazyBuffer. Most of the stuff in ShapeTracker should move to symbolic_view, which combines view and symbolic.

@@ -1,25 +0,0 @@
tinygrad has four pieces (a minimal end-to-end example follows the list below)

* frontend (Tensor -> LazyBuffer)
  * See tensor.py, function.py, multi.py, and lazy.py
  * The user interacts with the Tensor class
  * This outputs LazyBuffers, which form the simple compute graph
* scheduler (LazyBuffer -> ScheduleItem)
  * See engine/schedule.py
  * When a Tensor is realized, the scheduler is run to get its LazyBuffers to be computed
  * This takes in LazyBuffers and groups them as appropriate into kernels.
  * It returns a list of ScheduleItems + all the Variables used in the graph
* lowering (TODO: lots of work to clean this up still)
  * See codegen/ (ScheduleItem.ast -> UOps)
  * ScheduleItems have an ast that's compiled into actual GPU code
  * Many optimization choices can be made here; this contains a beam search.
* renderer/compiler (UOps -> machine code)
  * UOps are tinygrad's IR, similar to LLVM IR
  * Here we either convert them to a high level language or machine code directly
* engine/realize.py (ScheduleItem -> ExecItem)
* runtime
  * See runtime/
  * Runtime actually interacts with the GPUs
  * It manages Buffers, Programs, and Queues
  * Sadly, METAL and GPU (OpenCL) don't have a compiler that can be pulled out from the device itself
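A minimal end-to-end example of the pipeline above; everything below the Tensor calls happens implicitly, and the stage names in the comments are the ones from this list, not extra API the snippet exercises.

```python
from tinygrad import Tensor

# frontend: building the expression only creates LazyBuffers, nothing runs yet
a = Tensor.rand(4, 4)
out = (a @ a + 1).sum()

# realize() drives the rest: scheduler -> lowering -> renderer/compiler -> runtime
out.realize()
print(out.numpy())   # the buffer now holds the computed result
```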

@@ -1,31 +0,0 @@
# Welcome to the tinygrad documentation!

Here you will find documentation for tinygrad, as well as some examples and tutorials.

## Getting Started

Read the quick start guide [here](/docs/quickstart.md).

Or if you want to jump right into how tinygrad works, you can read the [abstraction stack](/docs-legacy/abstractions2.py) documentation.

Or if you want to see some examples, you can look at the examples in the [examples](/examples) directory.

Or if you just want to see some of the things tinygrad can do, check out the [showcase](/docs/showcase.md).

## API

This is currently a big work in progress.

## Resources

### Environment Variables

[env_vars.md](/docs-legacy/env_vars.md)

### Adding New Accelerators

[adding_new_accelerators.md](/docs-legacy/adding_new_accelerators.md)

### Community

[![tinygrad discord](https://discordapp.com/api/guilds/1068976834382925865/widget.png?style=banner2)](https://discord.gg/ZjZadyC7PK)

@@ -1,33 +0,0 @@
# Adding a new accelerator to tinygrad

It's pretty easy to add a new accelerator to tinygrad. All you need to do is implement a total of 20 (optionally 21) low level ops. Then tinygrad takes care of the rest, handling derivatives and syntactic sugar.

## llops

These are the ops that you must implement for your accelerator of choice.
```
Buffer # class of memory on this device
unary_op (NOOP, CAST, EXP2, LOG2, SIN, SQRT) # A -> A
reduce_op (SUM, MAX) # A -> B (smaller size, B has 1 in shape)
binary_op (ADD, SUB, MUL, DIV, CMPEQ, CMPLT, MAX) # A + A -> A (all the same size)
load_op (EMPTY, CONST, FROM, CONTIGUOUS, CUSTOM) # -> A (initialize data on device)
ternary_op (WHERE) # A, A, A -> A
```
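As a toy illustration of how small this op surface is, here is a numpy-backed sketch; the op names mirror the table above, but the dict-based dispatch is an assumption made for the example, not the tinygrad backend API.

```python
import numpy as np

# Toy numpy "device": each llop family becomes a dict of callables.
# This dispatch scheme is illustrative only; a real backend plugs into
# tinygrad's runtime classes instead.
unary_op  = {"EXP2": np.exp2, "LOG2": np.log2, "SIN": np.sin, "SQRT": np.sqrt}
binary_op = {"ADD": np.add, "SUB": np.subtract, "MUL": np.multiply,
             "DIV": np.divide, "MAX": np.maximum, "CMPLT": np.less}
reduce_op = {"SUM": np.sum, "MAX": np.max}

a, b = np.arange(4.0), np.ones(4)
print(binary_op["ADD"](a, b))                       # elementwise: A + A -> A
print(reduce_op["SUM"](a, axis=0, keepdims=True))   # reduce: A -> B with a 1 in the shape
```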

## mlops

These are the mid level ops that handle the derivatives.
```
Relu, Log, Exp, Sin # unary ops
Sum, Max # reduce ops (with axis argument)
Add, Sub, Mul, Div, Eq # binary ops (no broadcasting, use expand)
Expand, Reshape, Permute, Pad, Shrink, Flip # movement ops
Where # ternary ops
```
These are implemented in [function.py](/tinygrad/function.py).
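The forward/backward contract these classes follow can be sketched in isolation; the `Relu` below is a numpy stand-in for the pattern in function.py, not the actual tinygrad `Function` subclass (which operates on LazyBuffers).

```python
import numpy as np

# Illustration of the forward/backward pattern mlops implement: forward saves
# whatever backward needs, backward turns an upstream gradient into an input
# gradient. numpy stands in for LazyBuffers here.
class Relu:
  def forward(self, x: np.ndarray) -> np.ndarray:
    self.ret = np.maximum(x, 0)
    return self.ret
  def backward(self, grad_output: np.ndarray) -> np.ndarray:
    return (self.ret > 0).astype(grad_output.dtype) * grad_output

f = Relu()
y = f.forward(np.array([-1.0, 2.0]))
print(f.backward(np.ones_like(y)))   # -> [0. 1.]
```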

## hlops

These are the syntax sugar. They are built on top of the mlops and support most of the things that you could expect from a tensor library.

These are implemented in [tensor.py](/tinygrad/tensor.py).
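For a feel of what that sugar buys, here is a small example that spells out softmax from simpler ops and compares it with the built-in hlop; both ultimately decompose into the mlops listed above.

```python
from tinygrad import Tensor

# softmax spelled out from simpler ops vs. the built-in hlop
x = Tensor([[1.0, 2.0, 3.0]])
manual = x.exp() / x.exp().sum(axis=-1, keepdim=True)
print(manual.numpy())
print(x.softmax(axis=-1).numpy())   # same result, one call
```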

@@ -1,27 +0,0 @@
At base, the Linearizer is a function that takes an AST + opts -> uops
It should be rewritten like this. The AST can't be a LazyOp, because it should be able to have multiple outputs

We need a generic class to represent DAGs.
This refactor is probably a prereq for the new linearizer, and can be used on existing uops also.
Can this class also represent the large graph? The op graph is a subset of the large graph.

Currently the Linearizer is merging many concerns:

1. LocalBuffers are added. These should be added to the upper DAG, for both grouping and tensor cores. Some opts are used here. NOTE: currently reduce splitting is done in lazy.py and it shouldn't be
2. The ShapeTrackers at the edges are collected and modified according to the other opts.
3. The Ops are toposorted.
4. The Ops are lowered to UOps. This requires expansion and loop assignment, potentially to global dimensions
5. The indexes into the Tensor are computed from the shapetrackers

More generically, the whole network is a DAG. Ignore the forward/backward stuff, I'm fine with starting at the LazyBuffer level.

1. Is it possible to put an entire network in a single kernel? I think the answer has to be yes, but you may end up doing an absolutely crazy amount of recomputation. This should still be doable to check correctness.
2. You can use intermediate buffers, be they local or global, to do less compute.

This is a rewrite of a lot of tinygrad. I don't think continuing to support Interpreted backends is worth it; we have to deal with disk in a smart way.

We keep the features and nn stuff = 793 lines
We keep the frontend (Tensor -> LazyBuffer): tensor.py + mlops.py + lazy.py + dtype.py = 1032 lines
We keep the shapetracker/symbolic (part of the frontend): shapetracker.py + view.py + symbolic.py = 603 lines
Codegen is all rewritten. realize.py is simpler with the new codegen
We keep the backend (uops renderer/runtime): cstyle.py/llvmir.py + device.py + ops_*.py = 1216 lines (less when we remove interpreted)

@@ -1,70 +0,0 @@
## ["View.reshape without symbolic"](https://github.com/tinygrad/tinygrad/pull/2218)

This section contains the sketch proof of "Complete, Fast and Correct View.reshapes without using Symbolic". The goal is to reduce multi-views, which are costly at runtime.

1. **old_shape = (s<sub>1</sub>,s<sub>2</sub>,...,s<sub>i</sub>,s<sub>(i+1)</sub>,...,s<sub>n</sub>)**
2. **old_stride = (st<sub>1</sub>, st<sub>2</sub>, ... ,st<sub>i</sub>, st<sub>(i+1)</sub>, ..., st<sub>n</sub>)**
3. **merge_old_shape = (p<sub>1</sub>, p<sub>2</sub>), where p<sub>1</sub> = s<sub>1</sub> * ... * s<sub>i</sub> & p<sub>2</sub> = s<sub>(i+1)</sub> * ... * s<sub>n</sub>**
4. **new_shape = (k<sub>1</sub>, ..., k<sub>p</sub>, k<sub>(p+1)</sub>, ..., k<sub>l</sub>)**
5. **prod(new_shape) = p<sub>1</sub> * p<sub>2</sub>** (trivial)
6. **mask** and **new_mask** represent valid indexes before & after reshape respectively.

### Assumption

**p<sub>1</sub>** & **p<sub>2</sub>** individually are mergeable (we will discuss later on this) & we cannot merge **p<sub>1</sub>** & **p<sub>2</sub>**.

### Claim

If **prod([k<sub>1</sub> ... k<sub>p</sub>]) < p<sub>1</sub>** and **prod([k<sub>1</sub> ... k<sub>(p+1)</sub>]) > p<sub>1</sub>**, reshape is not possible.

**Proof**

**k<sub>(p+1)</sub>** will require some dimensions from **p<sub>1</sub>** & some from **p<sub>2</sub>**, which means **p<sub>1</sub>** & **p<sub>2</sub>** should be mergeable, but they are not.

**Conclusion**

Hence, reshape is only possible **if ∃ a p, where prod([k<sub>1</sub> .. k<sub>p</sub>]) = p<sub>1</sub>**.
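A small check of that conclusion; the helper name `split_point` and the shapes are made up for illustration, but the test it performs is exactly the ∃ p condition above.

```python
from math import prod

# Look for a prefix of new_shape whose product is exactly p1; if none exists,
# the claim above says the reshape cannot be expressed as a single view.
def split_point(new_shape, p1):
  for p in range(len(new_shape) + 1):
    if prod(new_shape[:p]) == p1:
      return p
  return None

print(split_point((2, 2, 3), 4))  # 2 -> (2, 2) accounts for p1, reshape can proceed
print(split_point((3, 4), 4))     # None -> no prefix hits p1, reshape not possible
```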

### Conditions for mergeability

**Case 1 - All non-zero strides**

They will merge **if st<sub>x</sub> = st<sub>(x+1)</sub> * s<sub>(x+1)</sub>, where x ∈ [1, ..., i-1, i+1, ..., n-1]**.

**Proof**

Let's consider merging of **(s<sub>1</sub> ... s<sub>i</sub>) -> p<sub>1</sub>**: here we have to get a single new stride corresponding to **p<sub>1</sub>**, which requires the block to be contiguous.
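The Case 1 condition translates directly into a one-line check; `can_merge` below is an illustrative helper, not tinygrad's View code.

```python
# Case 1 in code: adjacent dims merge when the outer stride equals
# inner stride * inner size, i.e. the block is laid out contiguously.
def can_merge(shape, strides):
  return all(st == strides[i + 1] * shape[i + 1] for i, st in enumerate(strides[:-1]))

print(can_merge((4, 3), (3, 1)))  # True: row-major (4, 3) collapses to (12,) with stride 1
print(can_merge((4, 3), (1, 4)))  # False: column-major layout has no single merged stride
```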

**Case 2 - Some stride is zero**

Let **st<sub>j</sub> = 0 & st<sub>(j+1)</sub> != 0 & s<sub>(j+1)</sub> > 1, where 1 < j < i**.

If **s<sub>j</sub> = 1**, reshape is trivial.

If **s<sub>j</sub> > 1**,
- If **mask<sub>j</sub>** has range > 1,
  reshape is not possible, because **s<sub>(j+1)</sub>** will need to be repeated at least once and a single stride can't capture repetition.
- If **mask<sub>j</sub>** has range = 1, reshape is possible, since it is virtually shape = 1, with some offset.

### Conditions for reshaping mask

**Case 1 - Splitting Dimension** - Mask shouldn't be cut for a successful reshape.

- **Example** -
  [1,2,3,4,5,6,7,8] -> [[1,2,3,4], [5,6,7,8]] ; **mask** = ((2,6)) ; **new_mask[0]** = (0,2) (trivial split).

- **new_mask[1]** = not possible. It is only possible if **mask spans [1-8] or lies within a single dimension [1-4] or [5-8]**.

**Case 2 - Combining Dimension** - Mask should unfold continuously.

- **Example** - **[[1,2],[3,4],[5,6]] -> [1,2,3,4,5,6]**; **mask** = ((0,2),(0,2)).

- **new_mask** = (0,4); only possible because **mask<sub>1</sub>** spans the whole dimension.

- If **mask<sub>1</sub>** did not span the whole dimension, the only way combining would be possible is if **mask<sub>0</sub>** had range 1 as shown below.
  - **[[1,2,3],[4,5,6]] -> [1,2,3,4,5,6]**; **mask** = ((1,2),(0,2)); **new_mask** = ((3,5))
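That last combining example can be checked mechanically with numpy; the boolean-array encoding of the mask is just for the demonstration.

```python
import numpy as np

# mask ((1,2),(0,2)) on a (2, 3) view: only row 1, columns 0-1 are valid.
a = np.arange(1, 7).reshape(2, 3)          # [[1,2,3],[4,5,6]]
valid = np.zeros_like(a, dtype=bool)
valid[1:2, 0:2] = True
print(np.flatnonzero(valid.reshape(6)))    # [3 4] -> contiguous span, new_mask = ((3,5))
```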

(Two image files changed paths with sizes unchanged: 538 B and 526 B.)

@@ -76,7 +76,7 @@ print(t6.numpy())
```

There are a lot more operations that can be performed on tensors; you can find them in the [Tensor](tensor.md) file.
Additionally, reading through [abstractions2.py](https://github.com/tinygrad/tinygrad/blob/master/docs-legacy/abstractions2.py) will help you understand how operations on these tensors make their way down to your hardware.
Additionally, reading through [abstractions2.py](https://github.com/tinygrad/tinygrad/blob/master/docs/abstractions2.py) will help you understand how operations on these tensors make their way down to your hardware.
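A few of those extra operations in action (the specific ops picked here are arbitrary examples, not a canonical list):

```python
from tinygrad import Tensor

t = Tensor([[1.0, -2.0], [3.0, 4.0]])
print(t.relu().numpy())          # elementwise activation
print(t.sum(axis=1).numpy())     # reduction along an axis
print(t.T.matmul(t).numpy())     # transpose and matrix multiply
```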

## Models

@@ -299,7 +299,7 @@ Many of the models in the [models/](https://github.com/tinygrad/tinygrad/tree/ma
There exist a bunch of environment variables that control the runtime behavior of tinygrad.
Some of the common ones are `DEBUG` and the different backend enablement variables.

You can find a full list and their descriptions in [env_vars.md](https://github.com/tinygrad/tinygrad/blob/master/docs-legacy/env_vars.md).
You can find a full list and their descriptions in [env_vars.md](https://github.com/tinygrad/tinygrad/blob/master/docs/env_vars.md).
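As a quick illustration of one such variable, `DEBUG` has to be set before tinygrad is first imported, e.g. from the shell (`DEBUG=2 python ...`) or via `os.environ`; the value 2 and the tiny computation below are arbitrary.

```python
import os
os.environ["DEBUG"] = "2"        # must be set before the first tinygrad import

from tinygrad import Tensor

# With DEBUG>=2, tinygrad prints per-kernel information while this realizes.
(Tensor.rand(64, 64) @ Tensor.rand(64, 64)).realize()
```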

### Visualizing the Computation Graph

@@ -11,6 +11,7 @@ nav:
  - Showcase: showcase.md
  - Developer: developer.md
  - Function: function.md
  - Environment: env_vars.md
  #- tinygrad: reference/

#extra_css: