baalajimaestro/llama.cpp

Author	SHA1	Message	Date
Phil H	fe5d6ea648	fix: GGUF metadata KV overrides, re #1011 (#1116) * kv overrides another attempt * add sentinel element, simplify array population * ensure sentinel element is zeroed	2024-01-23 22:00:38 -05:00
Andrei Betlen	7e63928bc9	Update llama.cpp	2024-01-23 18:42:39 -05:00
Andrei Betlen	fcdf337d84	Update llama.cpp	2024-01-22 11:25:11 -05:00
Andrei Betlen	5b982d0f8c	fix: use both eos and bos tokens as stop sequences for hf-tokenizer-config chat format.	2024-01-22 08:32:48 -05:00
baalajimaestro	8806f19ef9	Merge https://github.com/abetlen/llama-cpp-python	2024-01-22 18:40:46 +05:30
Andrei Betlen	2ce0b8aa2c	Bump version	2024-01-21 20:30:24 -05:00
Andrei Betlen	d3f5528ca8	fix: from_json_schema oneof/anyof bug. Closes #1097	2024-01-21 19:06:53 -05:00
Andrei Betlen	8eefdbca03	Update llama.cpp	2024-01-21 19:01:27 -05:00
Andrei Betlen	88fbccaaa3	docs: Add macosx wrong arch fix to README	2024-01-21 18:38:44 -05:00
Andrei Betlen	24f39454e9	fix: pass chat handler not chat formatter for huggingface autotokenizer and tokenizer_config formats.	2024-01-21 18:38:04 -05:00
Andrei Betlen	7f3209b1eb	feat: Add add_generation_prompt option for jinja2chatformatter.	2024-01-21 18:37:24 -05:00
Andrei Betlen	ac2e96d4b4	Update llama.cpp	2024-01-19 15:33:43 -05:00
Andrei Betlen	be09318c26	feat: Add Jinja2ChatFormatter	2024-01-19 15:04:42 -05:00
Andrei Betlen	5a34c57e54	feat: Expose gguf model metadata in metadata property	2024-01-19 10:46:03 -05:00
Andrei Betlen	833a7f1a86	Bump version	2024-01-19 09:03:35 -05:00
Andrei Betlen	e21c3c7a91	Update makefile	2024-01-19 08:47:56 -05:00
Andrei Betlen	0f54948482	Update llama.cpp	2024-01-19 08:41:52 -05:00
Andrei Betlen	3babe3512c	Fix mirostat sampling	2024-01-19 08:31:59 -05:00
Andrei Betlen	141293a75b	Fix python3.8 support	2024-01-19 08:17:49 -05:00
Andrei Betlen	656f3d8968	Bump version	2024-01-18 21:30:36 -05:00
Andrei Betlen	03ed547bfd	Remove templates doc	2024-01-18 21:23:26 -05:00
Andrei Betlen	3ca86ab390	Update llama.cpp	2024-01-18 21:22:45 -05:00
Andrei Betlen	be23404ed4	Cleanup pyproject	2024-01-18 21:22:19 -05:00
Andrei Betlen	89cce50f8c	Update llama.cpp	2024-01-18 21:21:49 -05:00
Andrei Betlen	b8fc1c7d83	feat: Add ability to load chat format from huggingface autotokenizer or tokenizer_config.json files.	2024-01-18 21:21:37 -05:00
Andrei Betlen	48c3b77e6f	Offload KQV by default	2024-01-18 11:08:57 -05:00
Austin	6bfe98bd80	Integration of Jinja2 Templating (#875) * feat: Add support for jinja templating Signed-off-by: teleprint-me <77757836+teleprint-me@users.noreply.github.com> * fix: Refactor chat formatter and update interface for jinja templates - Simplify the `llama2_template` in `llama_jinja_format.py` by removing unnecessary line breaks for readability without affecting functionality. - Update `ChatFormatterInterface` constructor to accept a more generic `Optional[object]` type for the template parameter, enhancing flexibility. - Introduce a `template` property to `ChatFormatterInterface` for standardized access to the template string. - Replace `MetaSingleton` metaclass with `Singleton` for the `ChatFormatterFactory` to streamline the singleton implementation. These changes enhance code readability, maintain usability, and ensure consistency in the chat formatter's design pattern usage. * Add outline for Jinja2 templating integration documentation Signed-off-by: teleprint-me <77757836+teleprint-me@users.noreply.github.com> * Add jinja2 as a dependency with version range for Hugging Face transformers compatibility Signed-off-by: teleprint-me <77757836+teleprint-me@users.noreply.github.com> * Update jinja2 version constraint for mkdocs-material compatibility Signed-off-by: teleprint-me <77757836+teleprint-me@users.noreply.github.com> * Fix attribute name in AutoChatFormatter - Changed attribute name from `self._renderer` to `self._environment` --------- Signed-off-by: teleprint-me <77757836+teleprint-me@users.noreply.github.com>	2024-01-17 09:47:52 -05:00
Andrei Betlen	52adc23115	Update llama.cpp	2024-01-17 09:27:40 -05:00
Andrei Betlen	7b46bb5a78	Re-order classes in llama.py	2024-01-17 09:16:13 -05:00
Andrei Betlen	cc4630e66f	Move helper classes to _internals submodule	2024-01-17 09:14:00 -05:00
Andrei Betlen	3b92419132	Move cache classes to llama_cache submodule.	2024-01-17 09:09:12 -05:00
baalajimaestro	833126bbd3	Merge https://github.com/abetlen/llama-cpp-python	2024-01-17 12:10:00 +05:30
Andrei Betlen	6981597835	Merge branch 'main' of https://github.com/abetlen/llama-cpp-python into main	2024-01-16 19:35:59 -05:00
Andrei Betlen	d5dbb3f8de	Update llama.cpp	2024-01-16 19:35:57 -05:00
Jerry Liu	84380fe9a6	Add llamaindex integration to readme (#1092)	2024-01-16 19:10:50 -05:00
Kyle Mistele	9c36688b33	fix(cli): allow passing n_ctx=0 to openAI API server args to use model n_ctx_train field per #1015 (#1093)	2024-01-16 18:54:06 -05:00
anil	cfb7da98ed	Support Accept text/event-stream in chat and completion endpoints, resolves #1083 (#1088) Co-authored-by: Anil Pathak <anil@heyday.com> Co-authored-by: Andrei Betlen <abetlen@gmail.com>	2024-01-16 12:52:52 -05:00
Andrei Betlen	e39778f8eb	Update llama.cpp	2024-01-16 11:56:44 -05:00
Andrei Betlen	4b11fa83c0	Bump version	2024-01-15 12:54:51 -05:00
Andrei Betlen	84615adbc6	Add split_mode option. Closes #1085	2024-01-15 12:49:20 -05:00
Phil H	76aafa6149	Implement GGUF metadata KV overrides (#1011) * Implement GGUF metadata overrides * whitespace fix * Fix kv overrides. * Fix pointer and pickle * Match llama.cpp kv_overrides cli argument --------- Co-authored-by: Andrei <abetlen@gmail.com>	2024-01-15 12:29:29 -05:00
yieldthought	7eff42c239	Avoid "LookupError: unknown encoding: ascii" when open() called in a destructor (#1012) The existing code often causes "LookupError: unknown encoding: ascii" when open() called in a destructor. Saving open in self.open is not enough to avoid this. Instead, we can avoid reopening /dev/null every time by doing it once when the module is loaded.	2024-01-15 10:52:10 -05:00
anil	1eaace8ea3	Fix low_level_api_chat_cpp example to match current API (#1086) * Fix low_level_api_chat_cpp to match current API * Fix low_level_api_chat_cpp to match current API * Using None instead of empty string so that default prompt template can be used if no prompt provided --------- Co-authored-by: Anil Pathak <anil@heyday.com>	2024-01-15 10:46:35 -05:00
Mark Neumann	c689ccc728	Fix Pydantic model parsing (#1087)	2024-01-15 10:45:57 -05:00
Andrei Betlen	5502ac8876	Update llama.cpp	2024-01-15 10:12:10 -05:00
Andrei Betlen	359ae73643	Update llama.cpp	2024-01-14 08:17:22 -05:00
baalajimaestro	966f8cb64f	Merge https://github.com/abetlen/llama-cpp-python	2024-01-14 14:56:35 +05:30
Andrei Betlen	7c898d5684	Update llama.cpp	2024-01-13 22:37:49 -05:00
Andrei Betlen	bb610b9428	Update llama.cpp	2024-01-11 22:51:12 -05:00
Andrei Betlen	f0159663d9	Bump version	2024-01-10 02:51:17 -05:00

1449 commits