21 Commits

Author SHA1 Message Date
wozeparrot d269bc95fa
faster tinychat (#5993) 2024-08-08 19:16:26 -07:00
wozeparrot eebb1b9922
feat: temperature 0 llama3 benchmark (#5806) 2024-07-30 12:05:36 -07:00
wozeparrot 639af3f823
llama3 temperature flag (#5803) 2024-07-29 16:33:51 -07:00
wozeparrot fa873df9c1
bring tinychat more inline with tinyos' version (#5358) 2024-07-10 13:13:52 -07:00
nimlgen 21b225ac45
llama3 download works (#5160) 2024-06-26 22:45:13 +03:00
wozeparrot c91b3c4079
shard llama3 on 0 sometimes (#5157) 2024-06-26 11:50:57 -07:00
chenyu dade7677cf
validate llama3 output only with model "LLaMA-3/8B-SF-DPO" (#5138) 2024-06-24 20:58:25 -04:00
chenyu 8080298739
s/tinytqdm/tqdm (#5103)
except in unit test where tqdm is imported
2024-06-22 14:18:26 -04:00
chenyu e468601226
update llama attention casting (#5096)
* update llama attention casting

updated scaled_dot_product_attention middle cast and removed hard-coded half in llama attention.

* fix that
2024-06-22 10:57:17 -04:00
wozeparrot acb715c64c
fix: llama3 special tokens (#5045) 2024-06-18 17:08:44 -07:00
chenyu a3ed4176c8
use tinytqdm in active tests and examples (#5038)
* use tinytqdm in active tests and examples

stress test this before 0.9.1

* no set_description
2024-06-18 16:01:19 -04:00
wozeparrot ce1ed374c9
more tinychat fixes (#4971) 2024-06-15 16:29:39 -07:00
wozeparrot 8209cd3c55
easier llama3 + fetch subdir (#4938) 2024-06-14 13:47:27 -07:00
wozeparrot 3d13c23bfa
llama3 `--download_model` (#4922) 2024-06-11 22:59:59 -07:00
wozeparrot 6c24eda522
feat: tinychat (#4869) 2024-06-08 12:05:45 -07:00
wozeparrot ed0a740fe4
greater chat api endpoint compat (#4792) 2024-05-30 22:47:31 -07:00
chenyu 7624ad3ddd
add --timing and --profile to llama3 example (#4767) 2024-05-28 16:24:44 -04:00
chenyu 31358cbea5
change Tensor.stack to method (#4719) 2024-05-24 17:04:19 -04:00
chenyu 5e3fbbb33e
llama3 example add manual seed and log seed (#4667) 2024-05-20 19:09:57 -04:00
chenyu ae861325ce
update llama sample for mac 32 input buffer limit (#4662)
set default sampling params to function call to 0, and top k in llama3 to 25.
2024-05-20 17:23:39 -04:00
wozeparrot b144d4b460
new llama3 example (#4576) 2024-05-19 22:42:23 -07:00