Sigbjørn Skjæret
027f7bc678
fix: Avoid duplicate special tokens in chat formats (#1439)
* Templates sometimes have BOS in them; remove the duplicate
* tokenize chat format prompts before completion
This is to ensure that we don't duplicate any special tokens.
Hopefully I amended the existing formats correctly?
* updated comment
* corrected a few
* add some missing internals
* proper bos/eos detection
* just let tokenizer do the job
* typo--
* align test with new response
* changed to a warning
* move to another PR
* Use python warnings module
---------
Co-authored-by: Andrei Betlen <abetlen@gmail.com>
2024-06-04 10:15:41 -04:00
Andrei Betlen
af3ed503e9
fix: Use numpy recarray for candidates data, fixes bug with temp < 0
2024-06-01 18:09:24 -04:00
Noam Gat
e0d7674e62
fix: detokenization case where first token does not start with a leading space (#1375)
* Fix tokenization edge case where llama output does not start with a space
See this notebook:
https://colab.research.google.com/drive/1Ooz11nFPk19zyJdMDx42CeesU8aWZMdI#scrollTo=oKpHw5PZ30uC
* Update _internals.py
Fix comparison to use b' ' instead of (str) ' '
---------
Co-authored-by: Andrei <abetlen@gmail.com>
2024-05-04 10:14:59 -04:00
Andrei Betlen
f116175a5a
fix: Suppress all logs when verbose=False, use hardcoded filenos to work in Colab notebooks. Closes #796 Closes #729
2024-04-30 15:45:34 -04:00
Douglas Hanley
f6ed21f9a2
feat: Allow for possibly non-pooled embeddings (#1380)
* allow for possibly non-pooled embeddings
* add more to embeddings section in README.md
---------
Co-authored-by: Andrei <abetlen@gmail.com>
2024-04-25 21:32:44 -04:00
Andrei Betlen
159cc4e5d9
feat: Update llama.cpp
2024-04-21 20:46:40 -04:00
Yuri Mikhailov
62aad610e1
fix: pass last tokens to sample_repetition_penalties function (#1295)
Co-authored-by: ymikhaylov <ymikhaylov@x5.ru>
Co-authored-by: Andrei <abetlen@gmail.com>
2024-04-01 15:25:43 -04:00
Andrei Betlen
8c71725d53
fix: Remove deprecated cfg sampling functions
2024-02-28 14:37:07 -05:00
Andrei Betlen
cbbcd888af
feat: Update llama.cpp
2024-02-25 20:52:14 -05:00
Andrei Betlen
b9aca612af
misc: use typesafe byref for internal classes
2024-02-23 03:40:07 -05:00
Andrei Betlen
dd22010e85
fix: Raise exceptions when llama model or context fails to load
2024-02-22 00:09:45 -05:00
Andrei
7f51b6071f
feat(low-level-api): Improve API static type-safety and performance (#1205)
2024-02-21 16:25:38 -05:00
Douglas Hanley
d7a67917ba
feat: Support batch embeddings (#1186)
* handle batched embeddings
* fix normalization issue
* fix type hints, ensure no breaking changes to embed
* Clear kv cache / reset internal state after embedding complete
---------
Co-authored-by: Andrei <abetlen@gmail.com>
2024-02-14 04:26:09 -05:00
Andrei Betlen
6943bab6d8
fix: destructor exception where internal classes have uninitialized attributes
2024-02-14 03:38:41 -05:00
Andrei Betlen
59760c85ed
fix: Use llama_log_callback to avoid suppress_stdout_stderr
2024-02-05 21:52:12 -05:00
Andrei
da003d8768
Automatically set chat format from gguf (#1110)
* Use jinja formatter to load chat format from gguf
* Fix off-by-one error in metadata loader
* Implement chat format auto-detection
2024-01-29 14:22:23 -05:00
Andrei Betlen
5a34c57e54
feat: Expose gguf model metadata in metadata property
2024-01-19 10:46:03 -05:00
Andrei Betlen
cc4630e66f
Move helper classes to _internals submodule
2024-01-17 09:14:00 -05:00