baalajimaestro/llama.cpp

Author	SHA1	Message	Date
Andrei Betlen	3921e10770	feat: support minItems/maxItems in JSON grammar converter (by @nopperl)	2024-02-22 00:17:06 -05:00
Andrei Betlen	e6d6260a91	fix: Update from_pretrained defaults to match hf_hub_download	2024-02-22 00:10:23 -05:00
Andrei Betlen	dd22010e85	fix: Raise exceptions when llama model or context fails to load	2024-02-22 00:09:45 -05:00
Andrei Betlen	3632241e98	chore: Bump version	2024-02-21 23:09:13 -05:00
Andrei Betlen	0653e15c20	feat: Update llama.cpp	2024-02-21 23:04:52 -05:00
Andrei Betlen	7981e9ce1e	chore: Bump version	2024-02-21 16:30:59 -05:00
Andrei Betlen	7f3962e11c	feat: Update llama.cpp	2024-02-21 16:27:56 -05:00
Andrei Betlen	14191e9036	docs: Add create_chat_completion_openai_v1 to api reference	2024-02-21 16:26:49 -05:00
Andrei Betlen	fe5626cd40	misc: add .local pattern to gitignore	2024-02-21 16:26:30 -05:00
Andrei	7f51b6071f	feat(low-level-api): Improve API static type-safety and performance (#1205)	2024-02-21 16:25:38 -05:00
Andrei	0f8aa4ab5c	feat: Pull models directly from huggingface (#1206) * Add from_pretrained method to Llama class * Update docs * Merge filename and pattern	2024-02-21 16:25:10 -05:00
Andrei Betlen	e42f62c247	chore: Bump version	2024-02-21 11:09:40 -05:00
Andrei Betlen	4edde21b3d	feat: Update llama.cpp	2024-02-21 11:05:58 -05:00
Andrei Betlen	f57b01ac9b	ci: add debug build to dev makefile	2024-02-21 11:04:30 -05:00
Andrei Betlen	04fe33b999	feat: Update llama.cpp	2024-02-20 02:59:02 -05:00
Andrei Betlen	d122bd7858	feat: Update llama.cpp	2024-02-19 22:10:16 -05:00
Andrei Betlen	6225f027e5	feat: Update llama.cpp	2024-02-19 04:11:34 -05:00
Andrei Betlen	748c0ce057	feat: Update llama.cpp	2024-02-18 21:30:36 -05:00
Andrei Betlen	53f6f5f415	fix: self.numa missing	2024-02-17 01:02:33 -05:00
Andrei Betlen	fdce078cb9	feat: Update llama.cpp	2024-02-17 00:37:51 -05:00
Andrei Betlen	c2a234a086	docs: Add embeddings section	2024-02-15 23:15:50 -05:00
Andrei Betlen	f736827b9b	chore: Bump version	2024-02-15 23:10:50 -05:00
Andrei Betlen	0ce66bc080	fix: create_embedding broken response for input type str	2024-02-15 16:09:48 -05:00
khimaros	ea1f88dd29	fix: Use '\n' separator for EventSourceResponse (#1188) this fixes compatibility with some OpenAI clients, including BetterChatGPT (https://github.com/ztjhz/BetterChatGPT/issues/537). Co-authored-by: Andrei <abetlen@gmail.com>	2024-02-15 15:20:13 -05:00
Andrei Betlen	a5cfeb7763	feat: Update llama.cpp	2024-02-15 15:17:30 -05:00
Douglas Hanley	7bb91f025f	fix: Incorporate embedding pooling layer fixes (#1194) * remove division by token count * truncate to n_batch, not n_ctx	2024-02-15 15:16:30 -05:00
Andrei Betlen	ae71ad1a14	Bump version	2024-02-14 04:31:42 -05:00
Andrei Betlen	f300d4310a	Merge branch 'main' of https://github.com/abetlen/llama-cpp-python into main	2024-02-14 04:27:33 -05:00
Andrei Betlen	c336f78269	Update llama.cpp	2024-02-14 04:27:30 -05:00
Douglas Hanley	d7a67917ba	feat: Support batch embeddings (#1186) * handle batched embeddings * fix normalization issue * fix type hints, ensure no breaking changes to embed * Clear kv cache / reset internal state after embedding complete --------- Co-authored-by: Andrei <abetlen@gmail.com>	2024-02-14 04:26:09 -05:00
Andrei Betlen	36b843228f	misc: fix makefile build commands	2024-02-14 03:47:40 -05:00
Andrei Betlen	7b9960d1cb	Update llama.cpp	2024-02-14 03:47:21 -05:00
Andrei Betlen	6943bab6d8	fix: destructor exception where internal classes are missing some uninitialized attributes	2024-02-14 03:38:41 -05:00
Andrei Betlen	07a783779a	fix: Update openbuddy prompt format. Closes #1155	2024-02-13 23:57:10 -05:00
Andrei Betlen	7a79e5ac49	Update llama.cpp	2024-02-13 23:54:05 -05:00
Andrei Betlen	7dbbfdecad	fix: submodule kompute is not included in sdist. Closes #1165	2024-02-13 23:53:56 -05:00
Andrei Betlen	345215a76c	fix: more chatml-function-calling fixes	2024-02-13 23:02:50 -05:00
Andrei Betlen	b1637c2319	Bump version	2024-02-13 12:35:04 -05:00
Andrew Lapp	d6be5333e1	fix: sample idx off-by-one error for logit_processors (#1179) * fix sample_idx off-by-one error * self._scores is indexed differently, only modify the index within self._input_ids --------- Co-authored-by: Andrew Lapp <andrew@rew.la> Co-authored-by: Andrei <abetlen@gmail.com>	2024-02-13 12:26:07 -05:00
Andrei Betlen	f7cdf78788	Update llama.cpp	2024-02-13 12:24:00 -05:00
Andrei Betlen	68fb71b6a2	fix: missing generation_prompt in chatml-function-calling	2024-02-13 03:24:41 -05:00
Andrei Betlen	4b0e3320bd	fix: minor formatting bugs for chatml-function-calling	2024-02-13 03:11:35 -05:00
Andrei Betlen	6fe8b427e1	Bump version	2024-02-13 02:46:52 -05:00
Andrei Betlen	d1822fed6b	fix: Don't change order of json schema object properties unless prop_order is passed, Closes #1180	2024-02-13 02:44:00 -05:00
Andrei Betlen	5efc45bdfd	Update llama.cpp	2024-02-13 02:43:07 -05:00
Andrei Betlen	4348a6cdf0	docs: Fix typo	2024-02-13 02:04:54 -05:00
Andrei Betlen	d605875772	Bump version	2024-02-12 16:28:30 -05:00
Andrei Betlen	b82b0e1014	docs: Temporarily revert function calling docs	2024-02-12 16:27:43 -05:00
Andrei Betlen	cb791716b4	fix: Always set logits_all = True when using speculative decoding	2024-02-12 16:19:05 -05:00
Andrei	153a0049d9	feat: Generic chatml Function Calling (#957) * Add demo notebook * Add initial chat handler * Update OpenAI types * Add generic chatml function calling (wip) * Update chatml generic function calling. * Progress on auto-tool calls * fix streaming functions * Remove print statements * fix: Suppress output from llama.cpp init and grammar creation * Add OpenAI v1 python api compatible chat completion function * Support non-streaming multi-tool calls * Format * Include function_call in response.	2024-02-12 15:56:07 -05:00

1505 commits