Andrei
|
f37456133a
|
Merge pull request #108 from eiery/main
Update n_batch default to 512 to match upstream llama.cpp
|
2023-04-24 13:48:09 -04:00 |
|
Andrei Betlen
|
02cf881317
|
Update llama.cpp
|
2023-04-24 09:30:10 -04:00 |
|
eiery
|
aa12d8a81f
|
Update llama.py
Update n_batch default to 512 to match upstream llama.cpp
|
2023-04-23 20:56:40 -04:00 |
|
Andrei Betlen
|
7230599593
|
Disable mmap when applying lora weights. Closes #107
|
2023-04-23 14:53:17 -04:00 |
|
Andrei Betlen
|
e99caedbbd
|
Update llama.cpp
|
2023-04-22 19:50:28 -04:00 |
|
Andrei Betlen
|
1eb130a6b2
|
Update llama.cpp
|
2023-04-21 17:40:27 -04:00 |
|
Andrei Betlen
|
e4647c75ec
|
Add use_mmap flag to server
|
2023-04-19 15:57:46 -04:00 |
|
Andrei Betlen
|
0df4d69c20
|
If lora base is not set avoid re-loading the model by passing NULL
|
2023-04-18 23:45:25 -04:00 |
|
Andrei Betlen
|
95c0dc134e
|
Update type signature to allow for null pointer to be passed.
|
2023-04-18 23:44:46 -04:00 |
|
Andrei Betlen
|
453e517fd5
|
Add separate lora_base path for applying LoRA to quantized models using original unquantized model weights.
|
2023-04-18 10:20:46 -04:00 |
|
Andrei Betlen
|
eb7f278cc6
|
Add lora_path parameter to Llama model
|
2023-04-18 01:43:44 -04:00 |
|
Andrei Betlen
|
35abf89552
|
Add bindings for LoRA adapters. Closes #88
|
2023-04-18 01:30:04 -04:00 |
|
Andrei Betlen
|
89856ef00d
|
Bugfix: only eval new tokens
|
2023-04-15 17:32:53 -04:00 |
|
Andrei Betlen
|
92c077136d
|
Add experimental cache
|
2023-04-15 12:03:09 -04:00 |
|
Andrei Betlen
|
a6372a7ae5
|
Update stop sequences for chat
|
2023-04-15 12:02:48 -04:00 |
|
Andrei Betlen
|
83b2be6dc4
|
Update chat parameters
|
2023-04-15 11:58:43 -04:00 |
|
Andrei Betlen
|
62087514c6
|
Update chat prompt
|
2023-04-15 11:58:19 -04:00 |
|
Andrei Betlen
|
02f9fb82fb
|
Bugfix
|
2023-04-15 11:39:52 -04:00 |
|
Andrei Betlen
|
3cd67c7bd7
|
Add type annotations
|
2023-04-15 11:39:21 -04:00 |
|
Andrei Betlen
|
d7de0e8014
|
Bugfix
|
2023-04-15 00:08:04 -04:00 |
|
Andrei Betlen
|
e90e122f2a
|
Use clear
|
2023-04-14 23:33:18 -04:00 |
|
Andrei Betlen
|
ac7068a469
|
Track generated tokens internally
|
2023-04-14 23:33:00 -04:00 |
|
Andrei Betlen
|
6e298d8fca
|
Set kv cache size to f16 by default
|
2023-04-14 22:21:19 -04:00 |
|
Andrei Betlen
|
6c7cec0c65
|
Fix completion request
|
2023-04-14 10:01:15 -04:00 |
|
Andrei Betlen
|
6153baab2d
|
Clean up logprobs implementation
|
2023-04-14 09:59:33 -04:00 |
|
Andrei Betlen
|
26cc4ee029
|
Fix signature for stop parameter
|
2023-04-14 09:59:08 -04:00 |
|
Andrei Betlen
|
6595ad84bf
|
Add field to disable resetting between generations
|
2023-04-13 00:28:00 -04:00 |
|
Andrei Betlen
|
22fa5a621f
|
Revert "Deprecate generate method"
This reverts commit 6cf5876538.
|
2023-04-13 00:19:55 -04:00 |
|
Andrei Betlen
|
4f5f99ef2a
|
Formatting
|
2023-04-12 22:40:12 -04:00 |
|
Andrei Betlen
|
0daf16defc
|
Enable logprobs on completion endpoint
|
2023-04-12 19:08:11 -04:00 |
|
Andrei Betlen
|
19598ac4e8
|
Fix threading bug. Closes #62
|
2023-04-12 19:07:53 -04:00 |
|
Andrei Betlen
|
005c78d26c
|
Update llama.cpp
|
2023-04-12 14:29:00 -04:00 |
|
Andrei Betlen
|
c854c2564b
|
Don't serialize stateful parameters
|
2023-04-12 14:07:14 -04:00 |
|
Andrei Betlen
|
2f9b649005
|
Style fix
|
2023-04-12 14:06:22 -04:00 |
|
Andrei Betlen
|
6cf5876538
|
Deprecate generate method
|
2023-04-12 14:06:04 -04:00 |
|
Andrei Betlen
|
b3805bb9cc
|
Implement logprobs parameter for text completion. Closes #2
|
2023-04-12 14:05:11 -04:00 |
|
Andrei Betlen
|
9f1e565594
|
Update llama.cpp
|
2023-04-11 11:59:03 -04:00 |
|
Andrei Betlen
|
213cc5c340
|
Remove async from function signature to avoid blocking the server
|
2023-04-11 11:54:31 -04:00 |
|
Mug
|
2559e5af9b
|
Changed the environment variable name to "LLAMA_CPP_LIB"
|
2023-04-10 17:27:17 +02:00 |
|
Mug
|
ee71ce8ab7
|
Make Windows users happy (hopefully)
|
2023-04-10 17:12:25 +02:00 |
|
Mug
|
cf339c9b3c
|
Better custom library debugging
|
2023-04-10 17:06:58 +02:00 |
|
Mug
|
4132293d2d
|
Merge branch 'main' of https://github.com/abetlen/llama-cpp-python into local-lib
|
2023-04-10 17:00:42 +02:00 |
|
Mug
|
76131d5bb8
|
Use environment variable for library override
|
2023-04-10 17:00:35 +02:00 |
|
Andrei Betlen
|
1f67ad2a0b
|
Add use_mmap option
|
2023-04-10 02:11:35 -04:00 |
|
Andrei Betlen
|
c3c2623e8b
|
Update llama.cpp
|
2023-04-09 22:01:33 -04:00 |
|
Andrei Betlen
|
314ce7d1cc
|
Fix cpu count default
|
2023-04-08 19:54:04 -04:00 |
|
Andrei Betlen
|
3fbc06361f
|
Formatting
|
2023-04-08 16:01:45 -04:00 |
|
Andrei Betlen
|
0067c1a588
|
Formatting
|
2023-04-08 16:01:18 -04:00 |
|
Andrei Betlen
|
38f442deb0
|
Bugfix: Wrong size of embeddings. Closes #47
|
2023-04-08 15:05:33 -04:00 |
|
Andrei Betlen
|
ae3e9c3d6f
|
Update shared library extension for macOS
|
2023-04-08 02:45:21 -04:00 |
|