693 Commits

Author SHA1 Message Date
George Hotz d6517a8a7c ins 2021-06-16 19:31:13 -07:00
George Hotz 29a08ba352 pytorch earlier 2021-06-16 12:24:21 -07:00
George Hotz 4a07b71731 update business model 2021-06-16 12:01:50 -07:00
George Hotz d29b16e5b4 more business notes 2021-06-16 11:47:57 -07:00
George Hotz b1000d866e readme, plus reduce ops 2021-06-16 11:21:06 -07:00
George Hotz ff3fdc58e5 risk -> cherry 2021-06-16 09:59:48 -07:00
George Hotz 2f91c012eb build note 2021-06-15 22:41:41 -07:00
George Hotz 0c02b66259 more 2021-06-15 15:02:32 -07:00
George Hotz 1e62e45d67 better todo 2021-06-15 10:30:16 -07:00
George Hotz 9ca4388695 debug 2021-06-15 10:24:21 -07:00
George Hotz 3d44aab52c more 2021-06-15 10:23:57 -07:00
George Hotz 4850d6eb43 update todo 2021-06-15 10:22:39 -07:00
George Hotz 4e1edb3692 have tinygrad log the loads 2021-06-14 18:35:14 -07:00
George Hotz 93f2e9769d little note 2021-06-14 15:49:41 -07:00
Jacky Lee 611d81dcb4 Add asserts for non-zero indices (#264) 2021-06-13 21:14:46 -07:00
George Hotz 508ced114c readme 2021-06-13 17:17:44 -07:00
Dinesh Kumar Gnanasekaran 2146860307
fixed OpenCL installation while running tests (#262)
Co-authored-by: dinesh <dinesh-GDK>
2021-06-12 11:14:21 -07:00
George Hotz a89d12d735 wow, way faster 2021-06-10 17:11:39 -07:00
George Hotz 10b1306525 binops 2021-06-10 16:52:37 -07:00
George Hotz 4535d39baa comments and pow 2021-06-10 09:03:40 -07:00
George Hotz 2075fdeb4f
FPGA Based Accelerator for Tinygrad (#258)
* ops_risk

* risk sim

* guessing is for winners

* minor

* better

* matmal with risk

* conv doesn't work

* closer

* conv2d works

* ops_risk

* opt2 works

* opt1 may not be possible

* opt1 is a mulacc

* arty

* attosoc example building on mac

* minor

* riscv assembler

* gucci gang

* we got C code

* not a scam

* hello

* make risk mergeable into master

* unop support
2021-06-07 17:45:09 -07:00
George Hotz 77ba198b57
Revert "Update README.md (#259)" (#260)
This reverts commit 5a69c5db6d.
2021-06-04 14:41:41 -07:00
Gabriel Rojas 5a69c5db6d Update README.md (#259) 2021-06-04 14:41:07 -07:00
Josh Smith ad756f6112
minor optimizations & cleaning (#257)
* use isinstance, some optimizations & whitespace removal

* revert whitespace changes

* revert more whitespace

* some more cleanup

* revert fstring (not a fan of the {{}})

* fix typo

* fix typo
2021-06-02 09:57:15 -07:00
George Hotz 74e874cc0d comment 2021-05-26 18:06:55 -07:00
George Hotz 343c5f13c7 add output shape to DEBUG 2021-05-26 17:42:38 -07:00
George Hotz b80cacb416 fix GPU efficientnet example 2021-05-26 17:29:35 -07:00
George Hotz 1ae0e88627 nvidia notes 2021-05-26 14:27:00 -07:00
20kdc 2653d33292
vgg7 (image upscaling) implementation - not the best, but it works (#255)
* vgg7 implementation - not the best, but it works

* VGG7 implementation: Spread nansbane to deter NaNs, maybe improved training experience

* VGG7 implementation: Fix training, for real this time

Results actually attempt to approximate the input

* VGG7 implementation: Sample probability management
2021-05-12 23:48:51 -07:00
Skosh 81bf933a91
Improved __getitem__ (#254)
* Some progress on yolov3

* Removed some debugging comments… Also, the forward pass eats all RAM for some reason

* forward pass almost runs

* forward pass runs almost

* forward pass runs, now we gotta load the weights

* loading weights works

* fetches config and weights

* everything kind of works, postprocessing of output still needs to be implemented, temp_process_results kind of works, but its kind of terrible, and not how things should be done

* some changes

* fixed some bugs in the forward pass and load_weights function, now outputs more correct values, however some values are still loaded incorrectly

* Something is wrong with the forward pass, Conv2d tests added

* forward pass almost outputs correct values, gotta fix one more thign

* yolo works

* some final changes

* reverting changes

* removed dataloader

* fixed some indentation

* comment out failing test, somehow it fails CI even though it passes on my computer…

* fixed wrong probabilities

* added webcam option to YOLO, now just need to add bounding boxes and speed it up

* some progress towards adding bounding boxes

* trying to speed up yolo layer on GPU, still faster on CPU but with 30GB ram usage

* Faster inference times, bounding boxes added correctly, webcam works, but is slow, and there is a memory leak when running on CPU... Also added tinygrads output on the classic dog image

* removed some debugging print statements

* updated result image

* something weird is going on, mean op on GPU tensor randomly faults, copying a tensor from GPU->CPU takes 10+ seconds…

* Improved __getitem__

* Updated

* Updated __getitem__

* Linebreaks

* Maybe this works?

* Added MNIST locally, tests run now
2021-05-05 22:15:22 -07:00
Skosh 78aa147b39
[WIP] YOLO working on tinygrad! (#245)
* Some progress on yolov3

* Removed some debugging comments… Also, the forward pass eats all RAM for some reason

* forward pass almost runs

* forward pass runs almost

* forward pass runs, now we gotta load the weights

* loading weights works

* fetches config and weights

* everything kind of works, postprocessing of output still needs to be implemented, temp_process_results kind of works, but its kind of terrible, and not how things should be done

* some changes

* fixed some bugs in the forward pass and load_weights function, now outputs more correct values, however some values are still loaded incorrectly

* Something is wrong with the forward pass, Conv2d tests added

* forward pass almost outputs correct values, gotta fix one more thign

* yolo works

* some final changes

* reverting changes

* removed dataloader

* fixed some indentation

* comment out failing test, somehow it fails CI even though it passes on my computer…

* fixed wrong probabilities

* added webcam option to YOLO, now just need to add bounding boxes and speed it up

* some progress towards adding bounding boxes

* trying to speed up yolo layer on GPU, still faster on CPU but with 30GB ram usage

* Faster inference times, bounding boxes added correctly, webcam works, but is slow, and there is a memory leak when running on CPU... Also added tinygrads output on the classic dog image

* removed some debugging print statements

* updated result image

* something weird is going on, mean op on GPU tensor randomly faults, copying a tensor from GPU->CPU takes 10+ seconds…
2021-04-25 18:06:52 -07:00
ziofil 155ec1f18e
saving 50 LOC with automatic @staticmethod for forward and backward (#252)
* automatic @staticmethod for forward and backward

* triggering unit tests
2021-04-25 18:04:16 -07:00
"freedom" Koan-Sin Tan f0cc2b66f8
add an aneccompile example in Objective-C (#240)
* add an aneccompile example in Objective-C

add a compile.m corresponding to compile.mm

build with
```clang compile.m -F /System/Library/PrivateFrameworks/ -framework ANECompiler -framework Foundation```

CoreFoundation framework is a C library.
Foundation is an Objective-C framework.

CF data structures in CoreFoundation usually have corresponding NS data structures in Foundation, e.g.,
NSDictionary is "toll-free bridged" with its Core Foundation counterpart, CFDictionary.
See [1].

[1] https://developer.apple.com/library/archive/documentation/General/Conceptual/CocoaEncyclopedia/Toll-FreeBridgin/Toll-FreeBridgin.html

* figure out how to use param_3 of ANECCompile

add a simple param_3 blocks callback, which dumps the status
dictionary when status != 0
2021-01-31 08:31:16 -08:00
Göktuğ Karakaşlı eabe0b9017 remove deepwalk args (#243) 2021-01-31 08:30:17 -08:00
George Hotz ce77dda805 yolov5 v4 2021-01-05 07:56:17 -08:00
George Hotz 62e3a8558c fix tolerance maybe 2021-01-05 07:45:47 -08:00
Asim 1c148f2fe4 fixed example broken after gpu refactor (#238) 2021-01-05 07:41:54 -08:00
George Hotz 8a38e0d207 only mish failed 2021-01-03 09:47:11 -08:00
George Hotz a337f7780e smarter way to write sign 2021-01-03 09:46:00 -08:00
George Hotz 1a4487965a remove negative from things w/o negative 2021-01-03 09:43:34 -08:00
George Hotz 0531b848eb second class sign 2021-01-03 09:33:12 -08:00
George Hotz 0702e0c763 nah, no sign, it's not what you want. use relu 2021-01-03 09:30:33 -08:00
George Hotz 29655609d5 fix GPU sign...these tests aren't very good 2021-01-03 09:00:49 -08:00
George Hotz ea9c9af5d7 faster sign 2021-01-03 08:54:21 -08:00
George Hotz c2eeb6950b add support for sign. technically relu can be second class now 2021-01-03 08:29:57 -08:00
George Hotz 6842ad9ec8 minor cleanups, yolo work 2021-01-03 08:14:16 -08:00
NeuralLink 0825cf7f79
Added softplus and mish non stable (#220)
* Added softplus and mish CPU

* 🔨 refactor

* 🔨 second class softplus and mish

* 🔨 test fix

* no need of device in testing
2021-01-03 08:08:41 -08:00
George Hotz ac229ea750 remove print 2021-01-02 12:53:30 -08:00
George Hotz 895d142503 start trying to load yolo v5 2021-01-02 12:51:55 -08:00
NeuralLink ece07a3d12
🔨 refactor register ops (#233)
* 🔨 refactor register ops

* 🔨 reorder and register for ANE

* 🔨 refactor

* 🔨 conflicts

* 🔨 minor fix

* ane fix

* extra reshape weird
2021-01-02 07:47:16 -08:00