llama.cpp

Sigbjørn Skjæret 027f7bc678 fix: Avoid duplicate special tokens in chat formats (#1439)	2024-06-04 10:15:41 -04:00

* Templates sometimes have BOS in them, remove duplicate
* tokenize chat format prompts before completion
  This is to ensure that we don't duplicate any special tokens. Hopefully I amended the existing formats correctly?
* updated comment
* corrected a few
* add some missing internals
* proper bos/eos detection
* just let tokenizer do the job
* typo--
* align test with new response
* changed to a warning
* move to another PR
* Use python warnings module

---------

Co-authored-by: Andrei Betlen <abetlen@gmail.com>

test_llama.py	feat: Update llama.cpp	2024-04-29 23:34:55 -04:00
test_llama_chat_format.py	fix: Avoid duplicate special tokens in chat formats (#1439)	2024-06-04 10:15:41 -04:00
test_llama_grammar.py	misc: rename grammar test	2024-02-08 01:07:44 -05:00
test_llama_speculative.py	Add speculative decoding (#1120)	2024-01-31 14:08:14 -05:00