docs cleanup and move (#4593)

* cleanup and move

* docs-legacy is gone

* don't update setup.py
George Hotz 2024-05-14 20:44:59 -07:00 committed by GitHub
parent fd02ab1e8b
commit 9425973bc7
18 changed files with 8 additions and 215 deletions

View File

@@ -91,7 +91,7 @@ jobs:
run: python -m mypy --strict-equality
- name: Test Docs
run: |
python docs-legacy/abstractions2.py
python docs/abstractions2.py
- name: Test Quickstart
run: awk '/```python/{flag=1;next}/```/{flag=0}flag' docs/quickstart.md > quickstart.py && PYTHONPATH=. python quickstart.py
- name: Fuzz Test symbolic

View File

@@ -21,7 +21,7 @@ repos:
pass_filenames: false
- id: docs2
name: docs2
entry: python3 docs-legacy/abstractions2.py
entry: python3 docs/abstractions2.py
language: system
always_run: true
pass_filenames: false

View File

@@ -1,4 +0,0 @@
*
!*/
!tinygrad/**

View File

@@ -1,8 +1,8 @@
<div align="center">
<picture>
<source media="(prefers-color-scheme: light)" srcset="/docs-legacy/logo_tiny_light.svg">
<img alt="tiny corp logo" src="/docs-legacy/logo_tiny_dark.svg" width="50%" height="50%">
<source media="(prefers-color-scheme: light)" srcset="/docs/logo_tiny_light.svg">
<img alt="tiny corp logo" src="/docs/logo_tiny_dark.svg" width="50%" height="50%">
</picture>
tinygrad: For something between [PyTorch](https://github.com/pytorch/pytorch) and [karpathy/micrograd](https://github.com/karpathy/micrograd). Maintained by [tiny corp](https://tinygrad.org).
@@ -87,7 +87,7 @@ tinygrad already supports numerous accelerators, including:
- [x] [HSA](tinygrad/runtime/ops_hsa.py)
And it is easy to add more! Your accelerator of choice only needs to support a total of ~25 low level ops.
More information can be found in the [documentation for adding new accelerators](/docs-legacy/adding_new_accelerators.md).
More information can be found in the [documentation for adding new accelerators](/docs/adding_new_accelerators.md).
## Installation

View File

@@ -1,17 +0,0 @@
tinygrad is a bit bloated now, and there are several places where concerns should be separated but aren't.
tensor.py and mlops.py are great code. The interface going backward here is (sketched just after this list):
LazyBuffer.const (this creates a matching size buffer)
LazyBuffer.contiguous (this is not exactly elementwise)
LazyBuffer.e (elementwise)
LazyBuffer.r (reduce)
reshape/permute/expand/stride/shrink/pad (movement)
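A rough sketch of that interface shape, purely for illustration: the method names follow the list above, but the signatures are made up here and are not the real ones in tinygrad/lazy.py.

```python
# Illustrative stub only: mirrors the interface listed above, not tinygrad's real LazyBuffer.
from __future__ import annotations
from typing import Tuple

class LazyBufferSketch:
  def __init__(self, shape: Tuple[int, ...]): self.shape = shape
  def const(self, val: float) -> LazyBufferSketch: ...        # matching-size constant buffer
  def contiguous(self) -> LazyBufferSketch: ...                # force a contiguous copy (not purely elementwise)
  def e(self, op: str, *srcs: LazyBufferSketch) -> LazyBufferSketch: ...  # elementwise
  def r(self, op: str, axis: Tuple[int, ...]) -> LazyBufferSketch: ...    # reduce
  # movement ops: reshape / permute / expand / stride / shrink / pad
  def reshape(self, new_shape: Tuple[int, ...]) -> LazyBufferSketch: ...
  def permute(self, order: Tuple[int, ...]) -> LazyBufferSketch: ...
```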
The lazy.py reordering engine has a lot of junk to deal with movementops that should be removed.
view.py is mostly great code, except it shouldn't have the rendering logic, and the int type should be parameterized to not import from symbolic.
LazyOp shouldn't have LazyBuffers as sources, just LazyOp LoadOps with a tuple of Views. Then the LazyOp uniquely determines the kernel and we don't have to do any replacement.
ShapeTracker probably shouldn't exist and just be a part of LazyBuffer. Most of the stuff in ShapeTracker should move to symbolic_view, which combines view and symbolic.

View File

@@ -1,25 +0,0 @@
tinygrad has four pieces (a rough sketch of the data flow between them follows this list)
* frontend (Tensor -> LazyBuffer)
  * See tensor.py, function.py, multi.py, and lazy.py
  * The user interacts with the Tensor class
  * This outputs LazyBuffers, which form the simple compute graph
* scheduler (LazyBuffer -> ScheduleItem)
  * See engine/schedule.py
  * When a Tensor is realized, the scheduler is run to get its LazyBuffers computed
  * This takes in LazyBuffers and groups them as appropriate into kernels.
  * It returns a list of ScheduleItems + all the Variables used in the graph
* lowering (TODO: lots of work to clean this up still)
  * See codegen/ (ScheduleItem.ast -> UOps)
  * ScheduleItems have an ast that's compiled into actual GPU code
  * Many optimization choices can be made here; this contains a beam search.
  * renderer/compiler (UOps -> machine code)
    * UOps are tinygrad's IR, similar to LLVM IR
    * Here we either convert them to a high level language or machine code directly
  * engine/realize.py (ScheduleItem -> ExecItem)
* runtime
  * See runtime/
  * Runtime actually interacts with the GPUs
  * It manages Buffers, Programs, and Queues
  * Sadly, METAL and GPU (OpenCL) don't have a compiler that can be pulled out from the device itself
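To make the data flow concrete, here is a type-level sketch of those stages. The class and function names are stand-ins chosen for illustration; only the LazyBuffer -> ScheduleItem -> ExecItem flow comes from the list above.

```python
from typing import List

# Stand-in types tracing the pipeline described above (not tinygrad's real classes).
class LazyBuffer: ...    # frontend output: a node in the simple compute graph
class ScheduleItem: ...  # scheduler output: one kernel's worth of work, with its ast
class ExecItem: ...      # lowering/realize output: a compiled program plus its buffers

def schedule(outputs: List[LazyBuffer]) -> List[ScheduleItem]: ...  # engine/schedule.py groups LazyBuffers into kernels
def lower(si: ScheduleItem) -> ExecItem: ...   # codegen/: ast -> UOps, then render/compile to machine code
def run(ei: ExecItem) -> None: ...             # runtime/: hand the program and its Buffers to the device
```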

View File

@@ -1,31 +0,0 @@
# Welcome to the tinygrad documentation!
Here you will find documentation for tinygrad, as well as some examples and tutorials.
## Getting Started
Read the quick start guide [here](/docs/quickstart.md).
Or if you want to jump right in to how tinygrad works, you can read the [abstraction stack](/docs-legacy/abstractions2.py) documentation.
Or if you want to see some examples, you can look at the examples in the [examples](/examples) directory.
Or if you just want to see some of the things tinygrad can do, check out the [showcase](/docs/showcase.md).
## API
This is currently a big work in progress.
## Resources
### Environment Variables
[env_vars.md](/docs-legacy/env_vars.md)
### Adding New Accelerators
[adding_new_accelerators.md](/docs-legacy/adding_new_accelerators.md)
### Community
[![tinygrad discord](https://discordapp.com/api/guilds/1068976834382925865/widget.png?style=banner2)](https://discord.gg/ZjZadyC7PK)

View File

@@ -1,33 +0,0 @@
# Adding a new accelerator to tinygrad
It's pretty easy to add a new accelerator to tinygrad. All you need to do is implement a total of 20 (optionally 21) low level ops. Then tinygrad takes care of the rest, handling derivatives and syntactic sugar.
## llops
These are the ops that you must implement for your accelerator of choice.
```
Buffer # class of memory on this device
unary_op (NOOP, CAST, EXP2, LOG2, SIN, SQRT) # A -> A
reduce_op (SUM, MAX) # A -> B (smaller size, B has 1 in shape)
binary_op (ADD, SUB, MUL, DIV, CMPEQ, CMPLT, MAX) # A + A -> A (all the same size)
load_op (EMPTY, CONST, FROM, CONTIGUOUS, CUSTOM) # -> A (initialize data on device)
ternary_op (WHERE) # A, A, A -> A
```
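For a feel of how small this contract is, here is a toy interpreted "device" in NumPy covering a few ops from each group. This is only an illustration of the op set above; tinygrad's real backends do not look like this.

```python
import numpy as np

# Toy illustration of the llop groups above, using NumPy arrays as the "Buffer".
unary_ops  = {"EXP2": np.exp2, "LOG2": np.log2, "SIN": np.sin, "SQRT": np.sqrt}
binary_ops = {"ADD": np.add, "SUB": np.subtract, "MUL": np.multiply, "DIV": np.divide,
              "CMPLT": np.less, "MAX": np.maximum}
reduce_ops = {"SUM": np.sum, "MAX": np.max}

def unary_op(op, a): return unary_ops[op](a)
def binary_op(op, a, b): return binary_ops[op](a, b)                              # same-shape inputs only
def reduce_op(op, a, axis): return reduce_ops[op](a, axis=axis, keepdims=True)    # reduced dims become 1
def ternary_where(cond, a, b): return np.where(cond, a, b)

x = np.array([[1.0, 2.0], [3.0, 4.0]])
print(binary_op("MUL", x, x))        # elementwise square
print(reduce_op("SUM", x, axis=1))   # [[3.], [7.]]
```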
## mlops
These are the mid level ops that handle the derivatives.
```
Relu, Log, Exp, Sin # unary ops
Sum, Max # reduce ops (with axis argument)
Add, Sub, Mul, Div, Eq # binary ops (no broadcasting, use expand)
Expand, Reshape, Permute, Pad, Shrink, Flip # movement ops
Where # ternary ops
```
These are implemented in [function.py](/tinygrad/function.py).
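As a sketch of the pattern (a NumPy toy, not the real code in function.py): each mid-level op pairs a forward with the backward that computes its derivative.

```python
import numpy as np

# Toy version of the mlops idea: each op knows its forward and its backward.
# The real implementations in tinygrad/function.py build on LazyBuffer llops instead of NumPy.
class ToyRelu:
  def forward(self, x: np.ndarray) -> np.ndarray:
    self.ret = np.maximum(x, 0)            # save the output for the backward pass
    return self.ret
  def backward(self, grad_output: np.ndarray) -> np.ndarray:
    return grad_output * (self.ret > 0)    # gradient flows only where the input was positive

op = ToyRelu()
y = op.forward(np.array([-1.0, 2.0, -3.0, 4.0]))
print(y, op.backward(np.ones(4)))          # [0. 2. 0. 4.] [0. 1. 0. 1.]
```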
## hlops
These are the syntactic sugar. They are built on top of the mlops and support most of the things you would expect from a tensor library.
These are implemented in [tensor.py](/tinygrad/tensor.py).
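For example, assuming a working tinygrad install, an hlop like `relu` is just a Tensor method; the mlops and llops underneath are invisible to the user:

```python
from tinygrad import Tensor

# hlops are the user-facing Tensor methods; tinygrad handles derivatives and lowering.
t = Tensor([[-1.0, 2.0], [3.0, -4.0]])
print(t.relu().numpy())  # [[0. 2.] [3. 0.]]
```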

View File

@@ -1,27 +0,0 @@
At base, the Linearizer is a function that takes an AST + opts -> UOps
It should be rewritten like this. The AST can't be a LazyOp, because it should be able to have multiple outputs
We need a generic class to represent DAGs.
This refactor is probably a prereq for the new linearizer, and can be used on existing uops also.
Can this class also represent the large graph? The op graph is a subset of the large graph.
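A minimal sketch of such a generic DAG node with a toposort, purely as an illustration of the idea (the names here are made up):

```python
from typing import List, Set, Tuple

# Minimal generic DAG node plus toposort; illustrative only.
class Node:
  def __init__(self, op: str, srcs: Tuple["Node", ...] = ()):
    self.op, self.srcs = op, srcs

def toposort(root: Node) -> List[Node]:
  out: List[Node] = []
  seen: Set[int] = set()
  def visit(n: Node) -> None:
    if id(n) in seen: return
    seen.add(id(n))
    for s in n.srcs: visit(s)   # visit sources first so every node follows its inputs
    out.append(n)
  visit(root)
  return out

a, b = Node("LOAD"), Node("LOAD")
store = Node("STORE", (Node("ADD", (a, b)),))
print([n.op for n in toposort(store)])  # ['LOAD', 'LOAD', 'ADD', 'STORE']
```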
Currently the Linearizer is merging many concerns:
1. LocalBuffers are added. These should be added to the upper DAG, for both grouping and tensor cores. Some opts are used here. NOTE: currently reduce splitting is done in lazy.py and it shouldn't be
2. The ShapeTrackers at the edges are collected and modified according to the other opts.
3. The Ops are toposorted.
4. The Ops are lowered to UOps. This requires expansion and loop assignment, potentially to global dimensions
5. The indexes into the Tensor are computed from the shapetrackers
More generically, the whole network is a DAG. Ignore the forward/backward stuff; I'm fine with starting at the LazyBuffer level.
1. Is it possible to put an entire network in a single kernel? I think the answer has to be yes, but you may end up doing an absolutely crazy amount of recomputation. This should still be doable to check correctness.
2. You can use intermediate buffers, be they local or global, to do less compute.
This is a rewrite of a lot of tinygrad. I don't think continuing to support Interpreted backends is worth it; we'll have to deal with disk in a smart way.
We keep the features and nn stuff = 793 lines
We keep the frontend (Tensor -> LazyBuffer): tensor.py + mlops.py + lazy.py + dtype.py = 1032 lines
We keep the shapetracker/symbolic (part of the frontend): shapetracker.py + view.py + symbolic.py = 603 lines
Codegen is all rewritten. realize.py is simpler with the new codegen
We keep the backend (uops renderer/runtime): cstyle.py/llvmir.py + device.py + ops_*.py = 1216 lines (less when we remove interpreted)

View File

@@ -1,70 +0,0 @@
## ["View.reshape without symbolic"](https://github.com/tinygrad/tinygrad/pull/2218)
This section contains the sketch proof of "Complete, Fast and Correct View.reshapes without using Symbolic". The goal is to reduce multi-views, which add runtime cost.
1. **old_shape = (s<sub>1</sub>,s<sub>2</sub>,...,s<sub>i</sub>,s<sub>(i+1)</sub>,...,s<sub>n</sub>)**
2. **old_stride = (st<sub>1</sub>, st<sub>2</sub>, ... ,st<sub>i</sub>, st<sub>(i+1)</sub>, ..., st<sub>n</sub>)**
3. **merge_old_shape = (p<sub>1</sub>, p<sub>2</sub>), where p<sub>1</sub> = s<sub>1</sub> * ... * s<sub>i</sub> & p<sub>2</sub> = s<sub>(i+1)</sub> * ... * s<sub>n</sub>**,
4. **new_shape = (k<sub>1</sub>, ..., k<sub>p</sub>, k<sub>(p+1)</sub>, ..., k<sub>l</sub>)**
5. **prod(new_shape) = p<sub>1</sub> * p<sub>2</sub>** (trivial)
6. **mask** and **new_mask** represent valid indexes before & after reshape respectively.
### Assumption
**p<sub>1</sub>** & **p<sub>2</sub>** individually are mergeable (we will discuss later on this) & we cannot merge **p<sub>1</sub>** & **p<sub>2</sub>**.
### Claim
If **prod([k<sub>1</sub> ... k<sub>p</sub>]) < p<sub>1</sub>** and **prod([k<sub>1</sub> ... k<sub>(p+1)</sub>]) > p<sub>1</sub>**, reshape is not possible.
**Proof**
**k<sub>(p+1)</sub>** will require some dimensions from **p<sub>1</sub>** & some from **p<sub>2</sub>**, which means **p<sub>1</sub>** & **p<sub>2</sub>** should be mergeable, but they are not.
**Conclusion**
Hence, reshape is only possible **if ∃ a p, where prod([k<sub>1</sub> .. k<sub>p</sub>]) = p<sub>1</sub>**.
### Conditions for mergeability
**Case 1 - All non-zero strides**
They will merge **if st<sub>x</sub> = st<sub>(x+1)</sub> * s<sub>(x+1)</sub>, where x ∈ [1, ..., i-1, i+1, ..., n-1]**.
**Proof**
Let's consider merging **(s<sub>1</sub> ... s<sub>i</sub>) -> p<sub>1</sub>**; here we have to get a single new stride corresponding to **p<sub>1</sub>**, for which it has to be contiguous.
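In code, the Case 1 condition reads as follows (a hypothetical helper, not a tinygrad function):

```python
# Adjacent dims x and x+1 (non-zero strides) merge iff strides[x] == strides[x+1] * shape[x+1].
def can_merge(shape, strides) -> bool:
  return all(strides[x] == strides[x + 1] * shape[x + 1] for x in range(len(shape) - 1))

print(can_merge((4, 3), (3, 1)))  # True: row-major contiguous, so (4, 3) merges into (12,)
print(can_merge((4, 3), (5, 1)))  # False: padded rows, no single stride can describe the merge
```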
**Case 2 - Some stride is zero**
Let **st<sub>j</sub> = 0 & st<sub>(j+1)</sub> != 0 & s<sub>(j+1)</sub> > 1, where 1 < j < i**.
If **s<sub>j</sub> = 1** , reshape is trivial.
If **s<sub>j</sub> > 1**,
- If **mask<sub>j</sub>** has range > 1,
reshape is not possible, because **s<sub>(j+1)</sub>** will need to be repeated at least once and a single stride can't capture repetition.
- If **mask<sub>j</sub>** has range = 1, reshape is possible, since it is virtually shape = 1, with some offset.
### Conditions for reshaping mask
**Case 1 - Splitting Dimension** - Mask shouldn't be cut for successful reshape.
- **Example** -
[1,2,3,4,5,6,7,8] -> [[1,2,3,4], [5,6,7,8]] ; **mask** = ((2,6)) ; **new_mask[0]** = (0,2) (trivial split).
- **new_mask[1]** = not possible. It is only possible if **mask spans [1-8] or lies within a single dimension [1-4] or [5-8]**.
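A sketch of that check (a hypothetical helper): when splitting a dimension of size a*b into (a, b), a mask (lo, hi) on the old dimension survives only if it spans the whole dimension or fits inside one aligned chunk of length b.

```python
# Hypothetical check for Case 1: does mask (lo, hi) survive splitting a dim of size a*b into (a, b)?
def mask_splits_cleanly(lo: int, hi: int, a: int, b: int) -> bool:
  return (lo, hi) == (0, a * b) or (hi - lo <= b and lo // b == (hi - 1) // b)

print(mask_splits_cleanly(2, 6, 2, 4))  # False: the mask (2,6) above cuts across both chunks of 4
print(mask_splits_cleanly(4, 8, 2, 4))  # True: it lies entirely within the second chunk [5-8]
```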
**Case 2 - Combining Dimension** - Mask should unfold continuously.
- **Example** - **[[1,2],[3,4],[5,6]] -> [1,2,3,4,5,6]**; **mask** = ((0,2),(0,2)).
- **new_mask** = (0,4); only possible because **mask<sub>1</sub>** spans the whole dimension.
- If **mask<sub>1</sub>** did not span the whole dimension, the only way combining would be possible is if **mask<sub>0</sub>** had range 1 as shown below.
- **[[1,2,3],[4,5,6]] -> [1,2,3,4,5,6]**; **mask** = ((1,2),(0,2)); **new_mask** = ((3,5))

View File

(Image file diff: 538 B before and after.)

View File

(Image file diff: 526 B before and after.)

View File

@@ -76,7 +76,7 @@ print(t6.numpy())
```
There are a lot more operations that can be performed on tensors; you can find them in the [Tensor](tensor.md) file.
Additionally reading through [abstractions2.py](https://github.com/tinygrad/tinygrad/blob/master/docs-legacy/abstractions2.py) will help you understand how operations on these tensors make their way down to your hardware.
Additionally reading through [abstractions2.py](https://github.com/tinygrad/tinygrad/blob/master/docs/abstractions2.py) will help you understand how operations on these tensors make their way down to your hardware.
## Models
@@ -299,7 +299,7 @@ Many of the models in the [models/](https://github.com/tinygrad/tinygrad/tree/ma
There exist a bunch of environment variables that control the runtime behavior of tinygrad.
Some of the common ones are `DEBUG` and the different backend enablement variables.
You can find a full list and their descriptions in [env_vars.md](https://github.com/tinygrad/tinygrad/blob/master/docs-legacy/env_vars.md).
You can find a full list and their descriptions in [env_vars.md](https://github.com/tinygrad/tinygrad/blob/master/docs/env_vars.md).
### Visualizing the Computation Graph

View File

@@ -11,6 +11,7 @@ nav:
- Showcase: showcase.md
- Developer: developer.md
- Function: function.md
- Environment: env_vars.md
#- tinygrad: reference/
#extra_css:

View File

@@ -27,7 +27,6 @@ line-length = 150
exclude = [
"docs/",
"docs-legacy/",
"examples/",
"extra/",
"tinygrad/runtime/autogen",