Commit Graph

85 Commits

Author SHA1 Message Date
George Hotz 095e2ced61
add name support to fetch (#2407)
* add name support

* use fetch in gpt2

* remove requests from main lib, networkx also optional

* umm, keep that assert

* updates to fetch

* i love the walrus so much

* stop bundling mnist with tinygrad

* err, https

* download cache names

* add DOWNLOAD_CACHE_VERSION

* need env.

* ugh, wrong path

* replace get_child
2023-11-23 14:16:17 -08:00
chenyu 5ef8d682e3
clean up attentions in stable diffusion (#2275) 2023-11-11 14:25:36 -05:00
Ahmed Harmouche 265304e7fd
Stable diffusion WebGPU port (#1370)
* WIP: Stable diffusion WebGPU port

* Load whole model: split safetensor to avoid Chrome allocation limit

* Gitignore .DS_Store, remove debug print

* Clip tokenizer in JS

* WIP: Compile model in parts (text model, diffusor, get_x_prev_and_pred_x0, decoder), and recreate forward logic in JS

* e2e stable diffusion flow

* Create initial random latent tensor in JS

* SD working e2e

* Log if some weights were not loaded properly

* Remove latent_tensor.npy used for debugging

* Cleanup, remove useless logs

* Improve UI

* Add progress bar

* Remove .npy files used for debugging

* Add clip tokenizer as external dependency

* Remove alphas_cumprod.js and load it from safetensors

* Refactor

* Simplify a lot

* Dedup base when limiting elementwise merge (webgpu)

* Add return type to safe_load_metadata

* Do not allow run when webgpu is not supported

* Add progress bar, refactor, fix special names

* Add option to chose from local vs huggingface weights

* lowercase tinygrad :)

* fp16 model dl, decompression client side

* Cache f16 model in browser, better progress

* Cache miss recovery

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2023-11-03 18:29:16 -07:00
George Hotz 6dc8eb5bfd
universal disk cache (#2130)
* caching infra for tinygrad

* nons tr key

* fix linter

* no shelve in beam search

* beam search caching

* check tensor cores with beam too

* pretty print

* LATEBEAM in stable diffusion
2023-10-22 10:56:57 -07:00
Ahmed Harmouche 0d3410d93f
Stable diffusion: Make guidance modifiable (#2077) 2023-10-15 14:36:43 -07:00
Ahmed Harmouche e27fedfc7b
Fix stable diffusion output error on WebGPU (#2032)
* Fix stable diffusion on WebGPU

* Remove hack, numpy cast only on webgpu

* No-copy numpy cast
2023-10-10 06:40:51 -07:00
George Hotz adab724caa
schedule2, keep the tests working with small changes (#1932)
* lazy cleanups

* ast functions take in LazyOps

* op instead of self.op

* _base for mops

* fix contiguous

* start schedule

* test_schedule

* fix openpilot

* more tests

* bugfix and test skip

* work

* make sure things get freed

* fix zerosized tensors

* fix failing test

* fix ceil and friends

* fix openpilot

* disable training

* disable test collectives
2023-09-28 09:14:43 -07:00
Dat D. Nguyen ae9529e678
chore: remove redundant noise in stable diffusion example (#1910) 2023-09-24 21:33:45 +08:00
segf00lt 9e8c1dbf34
patch to remove hack from stable_diffusion.py (#1814)
* patch to remove hack from stable_diffusion.py

* sorry linter

* realize after assign?

* float16 broken in llvmlite use float64 for now

* int32

* idiot forgot to change test array dtype
2023-09-08 09:26:50 -07:00
George Hotz 722823dee1 stable diffusion: force fp16 free 2023-09-06 15:11:05 -07:00
Francis Lam 0379b64ac4
add seed option to stable_diffusion (#1784)
useful for testing correctness of model runs
2023-09-05 19:45:15 -07:00
Karan Handa a8aa13dc91
[ready] Replacing os with pathlib (#1708)
* replace os.path with pathlib

* safe convert dirnames to pathlib

* replace all os.path.join

* fix cuda error

* change main chunk

* Reviewer fixes

* fix vgg

* Fixed everything

* Final fixes

* ensure consistency

* Change all parent.parent... to parents
2023-08-30 10:41:08 -07:00
Umut Zengin 1682e9a38a
Fix: Stable Diffusion index (#1713) 2023-08-30 00:21:10 -04:00
George Hotz aa7c98722b
sd timing (#1706) 2023-08-28 20:22:57 -07:00
nimlgen 1c0449e190
add cache collector (#1595)
* init cache collector

* add test_cache_collector.py

* switch GlobalCounters.cache to CacheCollector

* init jit models test

* jitted SD

* add debug msg to print loaded bufs count

* moved cache collctor to jit

* clearer SD

* no double device import
2023-08-28 19:59:55 -07:00
George Hotz 718ced296c
move state to nn/state (#1619) 2023-08-22 07:36:24 -07:00
George Hotz b9feb1b743 fp16 support in stable diffusion 2023-08-20 05:37:21 +00:00
George Hotz 47f18f4d60
[New] SD: Refactor AttnBlock, CrossAttention, CLIPAttention to share code (#1516) (#1518)
* Refactor AttnBlock, CrossAttention, CLIPAttention to share code

* Reshape and transpose in loop

* Bugfix on attention mask

Co-authored-by: Jacky Lee <39754370+jla524@users.noreply.github.com>
2023-08-10 15:04:18 -07:00
George Hotz c82bd59b85
Revert "SD: Refactor AttnBlock, CrossAttention, CLIPAttention to share code (#1513)" (#1515)
This reverts commit 85e02311a2.
2023-08-10 09:08:51 -07:00
Jacky Lee 85e02311a2
SD: Refactor AttnBlock, CrossAttention, CLIPAttention to share code (#1513)
* Refactor AttnBlock, CrossAttention, CLIPAttention to share code

* Reshape and transpose in loop
2023-08-10 08:52:33 -07:00
George Hotz d78fb8f4ed
add stable diffusion and llama (#1471)
* add stable diffusion and llama

* pretty in CI

* was CI not true

* that

* CI=true, wtf

* pythonpath

* debug=1

* oops, wrong place

* uops test broken for wgpu

* wgpu tests flaky
2023-08-06 21:31:51 -07:00
Felix 97a6029cf7
Corrected a few misspelled words (#1435) 2023-08-04 16:51:08 -07:00
George Hotz f27df835a6
delete dead stuff (#1382)
* delete bpe from repo

* remove yolo examples

* Revert "remove yolo examples"

This reverts commit cd1f49d4662a5565726ae1fa7bf3f6a3e3985965.

* no windows
2023-07-31 11:17:49 -07:00
George Hotz bfbb8d3d0f
fix ones, BS=2 stable diffusion, caching optimizer (#1312)
* fix ones, BS=2 stable diffusion

* caching optimizer

* print search time

* minor bug fix
2023-07-21 09:55:49 -07:00
George Hotz f45013f0a3 stable diffusion: remove realizes we don't need 2023-07-20 19:53:07 -07:00
George Hotz b58dd015e3 stable diffusion: remove import numpy as np 2023-07-20 19:35:44 -07:00
George Hotz 35bc46289c stable diffusion: use new tinygrad primitives 2023-07-20 19:25:49 -07:00
AN Long f75de602df
fix typo in stable diffusion example (#1219) 2023-07-11 15:26:40 -07:00
Diogo 2d4370b487
Adds tril & triu support (#936)
* triu & tril support

* lint and kernel count error

* switched shape indicies

* larger shape tests

* reverted numpy removal until #942 is resolved
2023-06-09 22:13:20 -07:00
Diogo 3bb38c3518
limit split to 1 due to windows path containing : (#944) 2023-06-06 10:27:54 -07:00
George Hotz ed1963b899
Fast DiskTensor to other Tensor (#916)
* make disktensors fast

* loading

* loader for sd and llama
2023-06-03 12:25:41 -07:00
George Hotz 46d419060b start on mlperf models 2023-05-10 16:30:49 -07:00
Kirill 0fe5014b1f
Use pathlib (#711)
* Use pathlib in llama

* Use pathlib in stablediffusion
2023-03-18 13:49:21 -07:00
Kirill af7745073f
Add comments to SD (#686)
* Add explanation for empty lambdas

* Fix my_unpickle if pytorch_lightning is installed

* oops
2023-03-12 10:56:49 -07:00
George Hotz b1206bcb18
third try at torch loading (#677)
* third try at torch loading

* numpy fixed

* fix enet compile

* load_single_weight supports empty weights

* oops, CPU wasn't the default

* so many bugs
2023-03-10 19:11:29 -08:00
George Hotz 8bf75a7fdd fix stable diffusion and CI 2023-03-10 17:48:12 -08:00
Pankaj Doharey 9d97d97b26
Opens image in default viewer after saving. (#612) 2023-03-03 17:28:49 -08:00
Jacky Lee c35fcc6964
Replace phrase for prompt (#555) 2023-02-12 09:04:44 -08:00
Kirill 27154db99a
Downloads weights in examples/stable_diffusion.py (#537)
* Downloads weights in examples/stable_diffusion.py

* use download_file_if_not_exists in fetch

* make consistent with previous NOCACHE behavior
2023-02-10 14:37:04 -06:00
Jacky Lee f08187526f
Fix examples (#540)
* Fix examples

* Remove training in parameters

* Simplify a bit

* Remove extra import

* Fix linter errors

* factor out Device

* NumPy-like semantics for Tensor.__getitem__ (#506)

* Rewrote Tensor.__getitem__ to fix negative indices and add support for np.newaxis/None

* Fixed pad2d

* mypy doesn't know about mlops methods

* normal python behavior for out-of-bounds slicing

* type: ignore

* inlined idxfix

* added comment for __getitem__

* Better comments, better tests, and fixed bug in np.newaxis

* update cpu and torch to hold buffers (#542)

* update cpu and torch to hold buffers

* save lines, and probably faster

* Mypy fun (#541)

* mypy fun

* things are just faster

* running fast

* mypy is fast

* compile.sh

* no gpu hack

* refactor ops_cpu and ops_torch to not subclass

* make weak buffer work

* tensor works

* fix test failing

* cpu/torch cleanups

* no or operator on dict in python 3.8

* that was junk

* fix warnings

* comment and touchup

* dyn add of math ops

* refactor ops_cpu and ops_torch to not share code

* nn/optim.py compiles now

* Reorder imports

* call mkdir only if directory doesn't exist

---------

Co-authored-by: George Hotz <geohot@gmail.com>
Co-authored-by: Mitchell Goff <mitchellgoffpc@gmail.com>
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2023-02-10 12:09:37 -06:00
George Hotz 5e37f084db stable diffusion: clean up constant folding 2023-02-01 12:53:16 -08:00
Jacky Lee 486f023e81
Rename Normalize and move to nn (#513)
* Rename Normalize and move to nn

* Match PyTorch for dim>1
2023-02-01 11:55:03 -08:00
George Hotz 487685919b
Revert "Rename Normalize and move to nn (#415)" (#474)
This reverts commit d768acb6a9.
2023-01-25 07:50:04 -08:00
Jacky Lee d768acb6a9
Rename Normalize and move to nn (#415)
* Rename Normalize and move to nn

* Fix comparison to None error

* Add test for GroupNorm

* Rename test case

* Flip parameters to match PyTorch

* Increase error tolerance

* Fix elementwise_affine on channels

* Match arguments with PyTorch

* Initialize weight and bias only when affine is true

* Is this it?

* A bit cleaner

* Handle case where weight or bias is None
2023-01-25 07:47:59 -08:00
George Hotz 6d7658db12 delete opencl <celebration> 2023-01-24 14:18:35 -08:00
nogira 2e744ef2f2
confirmed (#449)
w/ a bunch of print statements in the official model here: ce05de2819/ldm/modules/diffusionmodules/openaimodel.py (L413)
2023-01-07 08:41:06 -08:00
Drew Hintz 165fb4d631
remove redundant list comprehension from inside all. (#397)
remove explicit inherit from object.
2022-10-13 09:58:35 -07:00
George Hotz 178ba50c03 some args for stable diffusion 2022-09-29 01:52:04 -04:00
George Hotz 60df954377
Fix weight init: this work? (#391)
* this work?

* glorot uniform

* requies_grad broke

* propagate the None correctly

* so this weight init works

* ahh, i think it's this

* can't beat this

* glorot is best for ae

* remove comments
2022-09-25 16:46:33 -04:00
George Hotz 894a7cee79 forgot a few 2022-09-12 09:21:46 -07:00