baalajimaestro/llama.cpp

Author	SHA1	Message	Date
Joe	4ff8def4d0	#717 : Add support for Huggingface Autotokenizer (#790) Co-authored-by: Andrei <abetlen@gmail.com>	2023-11-05 18:06:36 -05:00
earonesty	3580e2c5df	Update llama_chat_format.py (#869) * Update llama_chat_format.py properly formal llama2 with first-message prompt embedded * Update llama_chat_format.py	2023-11-05 17:00:13 -05:00
Andrei Betlen	f0b30ef7dc	Update llama.cpp	2023-11-05 16:57:10 -05:00
Andrei Betlen	dccbac82eb	Update llama.cpp	2023-11-03 18:12:22 -04:00
Andrei Betlen	2ec043af76	Clean up stdout / stderr suppression	2023-11-03 13:02:15 -04:00
Andrei Betlen	4ea7027c41	Rename internal only module utils to _utils	2023-11-03 12:55:55 -04:00
Andrei Betlen	df9362eeea	Update llama.cpp	2023-11-03 11:34:50 -04:00
Andrei	3af7b21ff1	Add functionary support (#784) * Add common grammars and json-schema-to-grammar utility function from llama.cpp * Pass functions to format function * Add basic functionary formatting * Add LlamaChatHandler for more complex chat use cases * Add function calling example notebook * Add support for regular chat completions alongside function calling	2023-11-03 02:12:14 -04:00
Andrei Betlen	df31303a12	Update CHANGELOG	2023-11-02 20:16:32 -04:00
Andrei	ab028cb878	Migrate inference to llama_batch and llama_decode api (#795) * Add low-level batching notebook * fix: tokenization of special characters: (#850) It should behave like llama.cpp, where most out of the box usages treat special characters accordingly * Update CHANGELOG * Cleanup * Fix runner label * Update notebook * Use llama_decode and batch api * Support logits_all parameter --------- Co-authored-by: Antoine Lizee <antoine.lizee@gmail.com>	2023-11-02 20:13:57 -04:00
Andrei Betlen	f436e0c872	Update llama.cpp	2023-11-02 17:34:01 -04:00
Andrei Betlen	8350de9a18	Bump version	2023-11-02 15:53:01 -04:00
Andrei Betlen	9ffe62d665	Update llama.cpp	2023-11-02 15:45:27 -04:00
Andrei Betlen	011b95d7f3	Fix name 'open' is not defined exception. Closes #860	2023-11-02 15:30:55 -04:00
Andrei Betlen	fa83cc5f9c	Update llama.cpp Fix build examples Exclude examples directory Revert cmake changes Try actions/checkout@v4 Try to update submodules Revert	2023-11-02 14:28:15 -04:00
Andrei Betlen	ddbd10c442	Fix clblast test	2023-11-02 14:28:15 -04:00
Andrei Betlen	735522272b	Fix runner label	2023-11-02 14:28:15 -04:00
Andrei Betlen	0feffb9c20	Cleanup	2023-11-02 14:28:15 -04:00
Andrei Betlen	7fe0bd3a31	Update CHANGELOG	2023-11-02 14:28:15 -04:00
Antoine Lizee	4d4e0f11e2	fix: tokenization of special characters: (#850) It should behave like llama.cpp, where most out of the box usages treat special characters accordingly	2023-11-02 14:28:14 -04:00
Andrei Betlen	952e4cc3ce	Fix: use linux image for opencl test	2023-11-01 21:31:02 -04:00
Andrei Betlen	8bf7fa6e5f	Add opencl test	2023-11-01 21:18:36 -04:00
Andrei Betlen	446d5f5649	Add metal ci test	2023-11-01 21:15:01 -04:00
Andrei Betlen	c89eadafbf	Update CHANGELOG	2023-11-01 19:40:04 -04:00
Andrei Betlen	6b3aa7fc8f	Bump version	2023-11-01 19:25:03 -04:00
NickAlgra	3fbcded7cd	Add missing n_seq_id to llama_batch (#842)	2023-11-01 18:56:29 -04:00
Sujeendran Menon	7b136bb5b1	Fix for shared library not found and compile issues in Windows (#848) * fix windows library dll name issue * Updated README.md Windows instructions * Update llama_cpp.py to handle different windows dll file versions	2023-11-01 18:55:57 -04:00
cebtenzzre	eefd76fe81	llama: fix exception in Llama.__del__ (#846)	2023-11-01 18:53:57 -04:00
David Ponce	3fc9147218	Iterate over tokens that should be biased rather than the entire vocabulary. (#851)	2023-11-01 18:53:47 -04:00
Marko Tasic	9c8f4dca5f	fixed Llama._create_completion suffix check, it can be either None or str instance (#854)	2023-11-01 18:52:50 -04:00
Daniel Thuerck	5f8f369d1b	Pass-Through grammar parameter in web server. (#855) Closes #778	2023-11-01 18:51:12 -04:00
Adam Katora	25cb710281	Update llama_types.py (#849) Minor typo fix, funcion -> function	2023-11-01 18:50:11 -04:00
Andrei Betlen	bdf5254658	Update llama.cpp	2023-11-01 14:15:56 -04:00
Andrei Betlen	d808fd436c	Update llama.cpp	2023-10-31 21:29:35 -04:00
Andrei Betlen	53861c9e53	Update llama.cpp	2023-10-24 03:13:32 -04:00
Andrei Betlen	acf50f179a	Update llama.cpp	2023-10-20 11:17:31 -04:00
Andrei Betlen	5a045fcbbc	Update llama.cpp	2023-10-19 17:37:07 -04:00
Andrei Betlen	ef03d77b59	Enable finish reason tests	2023-10-19 02:56:45 -04:00
gmcgoldr	09a8406c83	Fix streaming doesn't return finish reason (#798) When streaming the yield that contains the finish can be skipped. This change ensures that yield isn't skipped.	2023-10-19 02:55:56 -04:00
Andrei Betlen	28c2b884e2	Merge branch 'main' of github.com:abetlen/llama_cpp_python into main	2023-10-19 02:55:31 -04:00
Andrei Betlen	cbeef36510	Re-enable tests completion function	2023-10-19 02:55:29 -04:00
Andrei Betlen	ff580031d2	Update llama.cpp	2023-10-19 02:55:08 -04:00
Xiaoyu Kevin Hu	a315128d66	update value check for n_gpu_layers field (#826)	2023-10-18 18:25:25 -04:00
Andrei Betlen	d989ac86e6	Update llama.cpp	2023-10-15 15:12:57 -04:00
Pierre Alexandre SCHEMBRI	10304d75fc	Make use of suppress_stdout_stderr when freeing model (#803)	2023-10-15 13:52:43 -04:00
Ma, Guokai	a1ac199980	Fix repeat greeting (#808) * fix repeated greeting * remove seperator between role and message	2023-10-15 13:52:21 -04:00
Eric Liu	b50166500e	Add validation for tensor_split size exceeding LLAMA_MAX_DEVICES (#820) * Add validation for tensor_split size exceeding LLAMA_MAX_DEVICES * reword	2023-10-15 13:51:51 -04:00
Andrei Betlen	f30aa20126	Update llama.cpp	2023-10-12 02:24:50 -04:00
Andrei Betlen	622bff19b2	Update llama.cpp	2023-10-10 19:23:35 -04:00
Andrei Betlen	d6a130a052	Print traceback on server error	2023-10-10 15:56:04 -04:00

1200 commits