31 commits

Author SHA1 Message Date
Andrei
7f51b6071f feat(low-level-api): Improve API static type-safety and performance (#1205) 2024-02-21 16:25:38 -05:00
kddubey
5a8944672f Fix logits_to_logprobs for 2-D and 3-D logits (#1002)
* Fix logits_to_logprobs for 2-D and 3-D logits

* Set dtype to single

* Test size
2023-12-16 18:59:26 -05:00
Andrei Betlen
9515467439 tests: add mock_kv_cache placeholder functions 2023-11-22 06:02:21 -05:00
Andrei Betlen
0ea244499e tests: avoid constantly reallocating logits 2023-11-22 04:31:05 -05:00
Andrei Betlen
0a7e05bc10 tests: don't mock sampling functions 2023-11-22 04:12:32 -05:00
Andrei Betlen
d7388f1ffb Use mock_llama for all tests 2023-11-21 18:13:19 -05:00
Andrei Betlen
3dc21b2557 tests: Improve llama.cpp mock 2023-11-20 23:23:18 -05:00
Andrei Betlen
2c2afa320f Update llama.cpp 2023-11-20 14:11:33 -05:00
Andrei Betlen
e32ecb0516 Fix tests 2023-11-10 05:39:42 -05:00
Andrei Betlen
e214a58422 Refactor Llama class internals 2023-11-06 09:16:36 -05:00
Andrei
ab028cb878 Migrate inference to llama_batch and llama_decode api (#795)
* Add low-level batching notebook

* fix: tokenization of special characters: (#850)

It should behave like llama.cpp, where most out of the box usages
treat special characters accordingly

* Update CHANGELOG

* Cleanup

* Fix runner label

* Update notebook

* Use llama_decode and batch api

* Support logits_all parameter

---------

Co-authored-by: Antoine Lizee <antoine.lizee@gmail.com>
2023-11-02 20:13:57 -04:00
Antoine Lizee
4d4e0f11e2 fix: tokenization of special characters: (#850)
It should behave like llama.cpp, where most out of the box usages
treat special characters accordingly
2023-11-02 14:28:14 -04:00
Andrei Betlen
ef03d77b59 Enable finish reason tests 2023-10-19 02:56:45 -04:00
Andrei Betlen
cbeef36510 Re-enable tests completion function 2023-10-19 02:55:29 -04:00
Andrei Betlen
1a1c3dc418 Update llama.cpp 2023-09-28 22:42:03 -04:00
janvdp
f49b6d7c67 add test to see if llama_cpp.__version__ exists 2023-09-05 21:10:05 +02:00
Andrei Betlen
4887973c22 Update llama.cpp 2023-08-27 12:59:20 -04:00
Andrei Betlen
3a29d65f45 Update llama.cpp 2023-08-26 23:36:24 -04:00
Andrei Betlen
8ac59465b9 Strip leading space when de-tokenizing. 2023-08-25 04:56:48 -04:00
Andrei Betlen
3674e5ed4e Update model path 2023-08-24 01:01:20 -04:00
Andrei Betlen
01a010be52 Fix llama_cpp and Llama type signatures. Closes #221 2023-05-19 11:59:33 -04:00
Andrei Betlen
46e3c4b84a Fix 2023-05-01 22:41:54 -04:00
Andrei Betlen
9eafc4c49a Refactor server to use factory 2023-05-01 22:38:46 -04:00
Andrei Betlen
c088a2b3a7 Un-skip tests 2023-05-01 15:46:03 -04:00
Andrei Betlen
2f8a3adaa4 Temporarily skip sampling tests. 2023-05-01 15:01:49 -04:00
Lucas Doyle
efe8e6f879 llama_cpp server: slight refactor to init_llama function
Define an init_llama function that starts llama with supplied settings instead of just doing it in the global context of app.py

This allows the test to be less brittle by not needing to mess with os.environ, then importing the app
2023-04-29 11:42:23 -07:00
Lucas Doyle
6d8db9d017 tests: simple test for server module 2023-04-29 11:42:20 -07:00
Mug
18a0c10032 Remove excessive errors="ignore" and add utf8 test 2023-04-29 12:19:22 +02:00
Mug
5f81400fcb Also ignore errors on input prompts 2023-04-26 14:45:51 +02:00
Andrei Betlen
e96a5c5722 Make Llama instance pickleable. Closes #27 2023-04-05 06:52:17 -04:00
Andrei Betlen
c3972b61ae Add basic tests. Closes #24 2023-04-05 03:23:15 -04:00