chenyu | e356807696 | 2024-06-22 14:45:06 -04:00
tinytqdm.set_description and tinytrange (#5101)

chenyu | dccefab23f | 2024-03-17 23:33:17 -04:00
remove mixtral weight to clang first (#3792)
seems fine without it now

George Hotz | 3527c5a9d2 | 2024-03-14 13:34:14 -07:00
add Tensor.replace (#3738)
* add Tensor.replace
* fix dtypes in that test
* should be replace
* and mixtral

George Hotz | 3415b0ee54 | 2024-03-11 01:28:03 +00:00
hotfix: mixtral copies norms together for 2% speed

chenyu | bad6adaf8c | 2024-03-10 18:25:31 -04:00
add mixtral and 6 gpus cifar to tinybox ci (#3676)
* add mixtral and 6 gpus cifar to tinybox ci
* print total ram used at the end of loading

chenyu | c3c35f9142 | 2024-01-12 18:54:27 -05:00
flag to profile mixtral - 1.7 tok/s now (#3104)

George Hotz | f432ec9c33 | 2024-01-05 14:51:25 -08:00
Bitcast hip fix + fix mixtral (#3022)
* fix bitcast in hip
* wrong dtype for precast, double COPY

chenyu | f88506e630 | 2024-01-04 17:01:50 -05:00
move gpt2/llama sampling inside the model call (#3013)
* move gpt2/llama sampling inside the model call
* argmax uses one more kernel

Ivan Vnučec | 8d206f6bfd | 2023-12-10 22:04:35 -08:00
fix help message (#2705)
llama -> mixtral

George Hotz | 59ab3675a3 | 2023-12-10 19:04:58 -08:00
faster mixtral + green for new kernels (#2701)
* green for new kernels
* track ram

George Hotz | b01e3907a1 | 2023-12-10 17:21:49 -08:00
mixtral touch up: two lines

George Hotz | b3982187d1 | 2023-12-10 17:18:31 -08:00
Mixtral Example (#2691)
* mixtral
* simpler
* global counters
* simpler
* weights arg