baalajimaestro/llama.cpp

Author	SHA1	Message	Date
Andrei Betlen	e02d52df29	Try to clean before calling cibuildwheel	2023-11-10 06:01:58 -05:00
Andrei Betlen	ed5a9260f6	Force LD_LIBRARY_PATH	2023-11-10 05:54:23 -05:00
Andrei Betlen	2f070afd61	Don't install in editable mode for release	2023-11-10 05:45:44 -05:00
Andrei Betlen	e32ecb0516	Fix tests	2023-11-10 05:39:42 -05:00
Andrei Betlen	6f0b0b1b84	Fix sampling bug when logits_all=False	2023-11-10 05:15:41 -05:00
Andrei Betlen	d9b38e3e3a	Potential bugfix for eval	2023-11-10 04:41:19 -05:00
Andrei Betlen	52350cc9d7	Merge branch 'main' of https://github.com/abetlen/llama-cpp-python into main	2023-11-10 04:24:51 -05:00
Andrei Betlen	b84d76a844	Fix: add default stop sequence to chatml chat format	2023-11-10 04:24:48 -05:00
Andrei Betlen	841f6167cc	Add Code Completion section to docs	2023-11-10 04:06:14 -05:00
Andrei Betlen	1b376c62b7	Update functionary for new OpenAI API	2023-11-10 02:51:58 -05:00
Andrei Betlen	17da8fb446	Add missing tool_calls finish_reason	2023-11-10 02:51:06 -05:00
Andrei Betlen	770df34436	Add $ref and $defs support to json schema converter	2023-11-10 02:50:46 -05:00
Andrei Betlen	faeae181b1	Fix: json_schema_to_gbnf should take string dump of json schema as input	2023-11-10 02:50:17 -05:00
Andrei Betlen	e7962d2c73	Fix: default max_tokens matches openai api (16 for completion, max length for chat completion)	2023-11-10 02:49:27 -05:00
Andrei Betlen	82072802ea	Add link to bakllava gguf model	2023-11-09 03:05:18 -05:00
Andrei Betlen	baeb7b34b3	Merge branch 'main' of github.com:abetlen/llama_cpp_python into main	2023-11-09 00:55:25 -05:00
Andrei Betlen	b62c449839	Bugfix: missing response_format for functionary and llava chat handlers	2023-11-09 00:55:23 -05:00
Kevin Jung	fb1f956a27	Fix server doc arguments (#892)	2023-11-08 23:53:00 -05:00
Andrei Betlen	80f4162bf4	Update llama.cpp	2023-11-08 11:18:15 -05:00
Andrei Betlen	fd41ed3a90	Add set_seed to Llama class	2023-11-08 11:09:41 -05:00
Andrei Betlen	ca4cb88351	Fix destructor NoneType is not callable error	2023-11-08 11:05:45 -05:00
Andrei Betlen	01cb3a0381	Bump version	2023-11-08 00:54:54 -05:00
Andrei Betlen	9ae9c86be0	Update server docs	2023-11-08 00:52:13 -05:00
Andrei Betlen	598780fde8	Update Multimodal notebook	2023-11-08 00:48:25 -05:00
Andrei Betlen	b30b9c338b	Add JSON mode support. Closes #881	2023-11-08 00:07:16 -05:00
Andrei Betlen	4852a6a39c	Fix built in GBNF grammar rules	2023-11-08 00:06:22 -05:00
Andrei Betlen	64f5153c35	Add seed parameter to chat handlers	2023-11-07 23:41:29 -05:00
Andrei Betlen	86aeb9f3a1	Add seed parameter support for completion and chat_completion requests. Closes #884	2023-11-07 23:37:28 -05:00
Andrei Betlen	da1b80285a	Update changelog	2023-11-07 23:15:26 -05:00
Andrei Betlen	9a8e64d29d	Update llama.cpp	2023-11-07 23:14:19 -05:00
Andrei Betlen	3660230faa	Fix docs multi-modal docs	2023-11-07 22:52:08 -05:00
Damian Stewart	aab74f0b2b	Multimodal Support (Llava 1.5) (#821) * llava v1.5 integration * Point llama.cpp to fork * Add llava shared library target * Fix type * Update llama.cpp * Add llava api * Revert changes to llama and llama_cpp * Update llava example * Add types for new gpt-4-vision-preview api * Fix typo * Update llama.cpp * Update llama_types to match OpenAI v1 API * Update ChatCompletionFunction type * Reorder request parameters * More API type fixes * Even More Type Updates * Add parameter for custom chat_handler to Llama class * Fix circular import * Convert to absolute imports * Fix * Fix pydantic Jsontype bug * Accept list of prompt tokens in create_completion * Add llava1.5 chat handler * Add Multimodal notebook * Clean up examples * Add server docs --------- Co-authored-by: Andrei Betlen <abetlen@gmail.com>	2023-11-07 22:48:51 -05:00
Andrei Betlen	56171cf7bf	Bump version	2023-11-06 09:37:55 -05:00
Andrei Betlen	52320c348c	Add python 3.12 classifier	2023-11-06 09:34:07 -05:00
Andrei Betlen	4286830f16	Add python3.12 tests	2023-11-06 09:32:20 -05:00
Andrei Betlen	be0add1b2d	Fix type bug	2023-11-06 09:30:38 -05:00
Andrei Betlen	e214a58422	Refactor Llama class internals	2023-11-06 09:16:36 -05:00
Andrei Betlen	bbffdaebaa	Refactor autotokenizer format to reusable function	2023-11-06 09:07:27 -05:00
Andrei Betlen	b0e597e46e	Pin python version in release	2023-11-06 08:56:41 -05:00
Joe	4ff8def4d0	#717: Add support for Huggingface Autotokenizer (#790) Co-authored-by: Andrei <abetlen@gmail.com>	2023-11-05 18:06:36 -05:00
earonesty	3580e2c5df	Update llama_chat_format.py (#869) * Update llama_chat_format.py properly formal llama2 with first-message prompt embedded * Update llama_chat_format.py	2023-11-05 17:00:13 -05:00
Andrei Betlen	f0b30ef7dc	Update llama.cpp	2023-11-05 16:57:10 -05:00
Andrei Betlen	dccbac82eb	Update llama.cpp	2023-11-03 18:12:22 -04:00
Andrei Betlen	2ec043af76	Clean up stdout / stderr suppression	2023-11-03 13:02:15 -04:00
Andrei Betlen	4ea7027c41	Rename internal only module utils to _utils	2023-11-03 12:55:55 -04:00
Andrei Betlen	df9362eeea	Update llama.cpp	2023-11-03 11:34:50 -04:00
Andrei	3af7b21ff1	Add functionary support (#784) * Add common grammars and json-schema-to-grammar utility function from llama.cpp * Pass functions to format function * Add basic functionary formatting * Add LlamaChatHandler for more complex chat use cases * Add function calling example notebook * Add support for regular chat completions alongside function calling	2023-11-03 02:12:14 -04:00
Andrei Betlen	df31303a12	Update CHANGELOG	2023-11-02 20:16:32 -04:00
Andrei	ab028cb878	Migrate inference to llama_batch and llama_decode api (#795) * Add low-level batching notebook * fix: tokenization of special characters (#850): It should behave like llama.cpp, where most out of the box usages treat special characters accordingly * Update CHANGELOG * Cleanup * Fix runner label * Update notebook * Use llama_decode and batch api * Support logits_all parameter --------- Co-authored-by: Antoine Lizee <antoine.lizee@gmail.com>	2023-11-02 20:13:57 -04:00
Andrei Betlen	f436e0c872	Update llama.cpp	2023-11-02 17:34:01 -04:00


1189 commits