Author | Commit | Message | Date
Andrei Betlen | 5a045fcbbc | Update llama.cpp | 2023-10-19 17:37:07 -04:00
Andrei Betlen | ef03d77b59 | Enable finish reason tests | 2023-10-19 02:56:45 -04:00
gmcgoldr | 09a8406c83 | Fix streaming not returning finish reason (#798) | 2023-10-19 02:55:56 -04:00
    When streaming, the yield that contains the finish reason can be skipped. This change ensures that yield isn't skipped.
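A minimal sketch of the streaming pattern #798 restores (illustrative, not the library's actual generator): the loop must still emit a final chunk carrying finish_reason, even when that chunk contains no new text.

```python
from typing import Iterator, Optional


def stream_chunks(tokens: list, max_tokens: int) -> Iterator[dict]:
    """Yield OpenAI-style streaming chunks for a list of decoded tokens."""
    finish_reason: Optional[str] = None
    for i, token in enumerate(tokens):
        if i + 1 >= max_tokens:
            finish_reason = "length"
        yield {"choices": [{"text": token, "finish_reason": None}]}
        if finish_reason:
            break
    if finish_reason is None:
        finish_reason = "stop"
    # The buggy version could return before this point, so clients never
    # saw a chunk with a non-null finish_reason.
    yield {"choices": [{"text": "", "finish_reason": finish_reason}]}


for chunk in stream_chunks(["Hello", ",", " world"], max_tokens=8):
    print(chunk["choices"][0])
```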
Andrei Betlen | 28c2b884e2 | Merge branch 'main' of github.com:abetlen/llama_cpp_python into main | 2023-10-19 02:55:31 -04:00
Andrei Betlen | cbeef36510 | Re-enable completion function tests | 2023-10-19 02:55:29 -04:00
Andrei Betlen | ff580031d2 | Update llama.cpp | 2023-10-19 02:55:08 -04:00
Xiaoyu Kevin Hu | a315128d66 | Update value check for n_gpu_layers field (#826) | 2023-10-18 18:25:25 -04:00
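A hedged sketch of the kind of bound change #826 describes, assuming the server validates n_gpu_layers with a Pydantic field constraint and that -1 conventionally means "offload all layers" (the class name below is illustrative):

```python
from pydantic import BaseModel, Field


class ModelSettings(BaseModel):
    # ge=-1 rather than ge=0: -1 is the conventional "use all layers" value.
    n_gpu_layers: int = Field(
        default=0,
        ge=-1,
        description="Number of layers to offload to the GPU (-1 for all).",
    )


print(ModelSettings(n_gpu_layers=-1))  # valid once the check is relaxed
```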
Andrei Betlen | d989ac86e6 | Update llama.cpp | 2023-10-15 15:12:57 -04:00
Pierre Alexandre SCHEMBRI | 10304d75fc | Make use of suppress_stdout_stderr when freeing model (#803) | 2023-10-15 13:52:43 -04:00
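suppress_stdout_stderr silences native llama.cpp output during model teardown. A sketch of such a context manager, assuming the usual file-descriptor duplication approach (the library's own helper may differ in detail):

```python
import os
import sys
from contextlib import contextmanager


@contextmanager
def suppress_stdout_stderr():
    sys.stdout.flush()
    sys.stderr.flush()
    devnull = os.open(os.devnull, os.O_WRONLY)
    saved_out, saved_err = os.dup(1), os.dup(2)
    try:
        os.dup2(devnull, 1)  # redirect the process-level fds so output
        os.dup2(devnull, 2)  # from native C/C++ code is silenced too
        yield
    finally:
        sys.stdout.flush()
        sys.stderr.flush()
        os.dup2(saved_out, 1)
        os.dup2(saved_err, 2)
        for fd in (devnull, saved_out, saved_err):
            os.close(fd)


with suppress_stdout_stderr():
    print("hidden")  # swallowed by /dev/null
print("visible")
```

Redirecting the raw descriptors (rather than reassigning sys.stdout) matters here because the noise comes from C code that writes to fd 1 and 2 directly.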
Ma, Guokai | a1ac199980 | Fix repeat greeting (#808) | 2023-10-15 13:52:21 -04:00
    * Fix repeated greeting
    * Remove separator between role and message
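An illustrative reconstruction of the class of bug #808 addresses, not the actual patch: a prompt builder that re-emits the system greeting on every turn makes the model see it repeated; the fix emits it once and joins each role to its message without a stray separator.

```python
def build_prompt(system: str, turns: list) -> str:
    parts = [system]  # greeting emitted exactly once, up front
    for role, message in turns:
        parts.append(f"{role.upper()}: {message}")  # no extra separator token
    parts.append("ASSISTANT:")
    return "\n".join(parts)


print(build_prompt(
    "You are a helpful assistant.",
    [("user", "Hi"), ("assistant", "Hello!"), ("user", "Can you help?")],
))
```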
Eric Liu | b50166500e | Add validation for tensor_split size exceeding LLAMA_MAX_DEVICES (#820) | 2023-10-15 13:51:51 -04:00
    * Add validation for tensor_split size exceeding LLAMA_MAX_DEVICES
    * Reword
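The guard from #820, sketched under the assumption that LLAMA_MAX_DEVICES is the compile-time device limit the bindings expose (the constant's value below is a placeholder):

```python
LLAMA_MAX_DEVICES = 16  # placeholder; the real value comes from llama.cpp


def validate_tensor_split(tensor_split):
    """Reject a tensor_split list longer than the backend supports."""
    if tensor_split is not None and len(tensor_split) > LLAMA_MAX_DEVICES:
        raise ValueError(
            "Attempt to split tensors across more devices than supported. "
            f"LLAMA_MAX_DEVICES={LLAMA_MAX_DEVICES}"
        )


validate_tensor_split([0.5, 0.5])    # fine
# validate_tensor_split([0.1] * 32)  # would raise ValueError
```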
Andrei Betlen | f30aa20126 | Update llama.cpp | 2023-10-12 02:24:50 -04:00
Andrei Betlen | 622bff19b2 | Update llama.cpp | 2023-10-10 19:23:35 -04:00
Andrei Betlen | d6a130a052 | Print traceback on server error | 2023-10-10 15:56:04 -04:00
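"Print traceback on server error" amounts to logging the full stack trace before the 500 response goes out. A sketch in FastAPI terms, since the server is FastAPI-based (the handler name and response body are illustrative):

```python
import traceback

from fastapi import FastAPI, Request
from starlette.responses import JSONResponse

app = FastAPI()


@app.exception_handler(Exception)
async def log_unhandled(request: Request, exc: Exception) -> JSONResponse:
    # Print the full stack trace to the server's stdout/stderr
    # before returning an opaque 500 to the client.
    traceback.print_exception(type(exc), exc, exc.__traceback__)
    return JSONResponse(status_code=500, content={"error": "internal server error"})
```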
Andrei Betlen | 43dfe1e2ab | Update llama.cpp | 2023-10-05 16:07:49 -04:00
Andrei Betlen | 2c0456acf0 | Update llama.cpp | 2023-10-04 20:19:31 -04:00
Andrei Betlen | c305be6db6 | Merge branch 'main' of github.com:abetlen/llama_cpp_python into main | 2023-10-03 15:23:37 -04:00
Andrei Betlen | a7d17b8ac9 | Update llama.cpp | 2023-10-03 15:23:35 -04:00
ccshen | b76724cddc | Update instructions to download GGUF model (#783) | 2023-10-02 11:46:47 -04:00
    Co-authored-by: john.shen <john.shen@bioclinica.com>
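A hedged example of fetching a GGUF model file with huggingface_hub (the repo and filename below are common examples, not taken from the commit):

```python
from huggingface_hub import hf_hub_download

model_path = hf_hub_download(
    repo_id="TheBloke/Llama-2-7B-Chat-GGUF",  # example repo
    filename="llama-2-7b-chat.Q4_K_M.gguf",   # example quantization
)
print(model_path)  # local path, usable as Llama(model_path=...)
```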
Andrei Betlen | 305482bd41 | Add chatml chat format | 2023-09-30 21:01:34 -04:00
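For reference, the ChatML layout this commit adds support for, sketched as a plain string builder (the library's real formatter is structured differently):

```python
def format_chatml(messages: list) -> str:
    prompt = ""
    for m in messages:
        prompt += f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
    prompt += "<|im_start|>assistant\n"  # generation continues from here
    return prompt


print(format_chatml([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
]))
```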
Andrei Betlen | 5ef5280ef9 | Log server exceptions to stdout | 2023-09-30 19:13:36 -04:00
Andrei Betlen | f0af1c7201 | Update llama.cpp | 2023-09-30 19:09:50 -04:00
Andrei Betlen | fab4bccc35 | Bump version | 2023-09-30 16:04:46 -04:00
Andrei Betlen | d696251fbe | Fix logits_all bug | 2023-09-30 16:02:35 -04:00
Andrei Betlen | 6ee413d79e | Bump version | 2023-09-30 13:23:09 -04:00
Andrei Betlen | 42bb721d64 | Fix bug in embedding | 2023-09-30 13:20:22 -04:00
Andrei Betlen | bca965325d | Update CHANGELOG | 2023-09-30 00:08:45 -04:00
Andrei Betlen | 5d62d55a82 | Bump version | 2023-09-30 00:07:06 -04:00
Andrei Betlen | ac853e01e1 | Include git directories | 2023-09-30 00:01:14 -04:00
Andrei Betlen | 9e76613629 | Remove git repo exclude | 2023-09-29 23:28:59 -04:00
Andrei Betlen | b4939c2d99 | Revert BUILD_NUMBER fix | 2023-09-29 23:28:45 -04:00
Andrei Betlen | 541aaff45e | Quote fix attempt #2 | 2023-09-29 23:05:26 -04:00
Andrei Betlen | 39e5feb138 | Fix quote issue | 2023-09-29 23:01:38 -04:00
Andrei Betlen | 3c6e98f945 | Use dev versioning for test PyPI | 2023-09-29 22:57:49 -04:00
Andrei Betlen | 1cca20304b | Revert update to publish test PyPI | 2023-09-29 22:48:17 -04:00
Andrei Betlen | 85e4d08a2e | Update publish to test PyPI workflow | 2023-09-29 22:32:31 -04:00
Andrei Betlen | 43f8fc371a | Potential fix for pip install bug | 2023-09-29 22:24:22 -04:00
Andrei Betlen | 386c88b68e | Bump version | 2023-09-29 20:07:31 -04:00
Andrei Betlen | d9bce17794 | Update server params | 2023-09-29 19:59:12 -04:00
Andrei Betlen | 3720c739d4 | Update llama.cpp | 2023-09-29 19:58:21 -04:00
Andrei | 3bca7708fb | Configurable Chat Formats (#711) | 2023-09-29 19:52:04 -04:00
    * Add configurable default chat completion format
    * Remove chat_template file to avoid circular import
    * Update llama_types
    * Add chat format
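"Configurable" in #711 means formats are looked up by name at completion time rather than hard-coded. A registry/dispatch sketch of that idea; the names and shapes below are illustrative, not the library's exact API:

```python
from typing import Callable, Dict, List

CHAT_FORMATS: Dict[str, Callable[[List[dict]], str]] = {}


def register_chat_format(name: str):
    """Decorator that registers a message-list-to-prompt formatter by name."""
    def decorator(fn: Callable[[List[dict]], str]):
        CHAT_FORMATS[name] = fn
        return fn
    return decorator


@register_chat_format("simple")
def format_simple(messages: List[dict]) -> str:
    turns = "".join(f"### {m['role']}:\n{m['content']}\n" for m in messages)
    return turns + "### assistant:\n"


def create_chat_completion(messages: List[dict], chat_format: str = "simple") -> str:
    prompt = CHAT_FORMATS[chat_format](messages)  # configurable per call
    return prompt  # a real path would run generation on this prompt


print(create_chat_completion([{"role": "user", "content": "Hi"}]))
```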
Josh XT | a945404b4a | Fix rope scaling defaults (#767) | 2023-09-29 16:03:57 -04:00
    * Fix rope scale with backwards compatibility
    * Fix defaults
    * Fix op
    * Remove backwards compatibility
    * Check single val
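A hedged sketch of the defaulting rule #767 is concerned with: treat 0.0 as "not set" and fall back to the model's own rope values instead of silently overriding them with hard-coded constants. Names and fallback values below are illustrative:

```python
MODEL_DEFAULT_FREQ_BASE = 10000.0  # would come from the model's metadata
MODEL_DEFAULT_FREQ_SCALE = 1.0


def resolve_rope_params(rope_freq_base: float = 0.0,
                        rope_freq_scale: float = 0.0) -> tuple:
    # 0.0 means "caller didn't choose a value": keep the model's default.
    base = rope_freq_base if rope_freq_base != 0.0 else MODEL_DEFAULT_FREQ_BASE
    scale = rope_freq_scale if rope_freq_scale != 0.0 else MODEL_DEFAULT_FREQ_SCALE
    return base, scale


print(resolve_rope_params())                          # model defaults
print(resolve_rope_params(rope_freq_base=1000000.0))  # explicit override wins
```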
Andrei Betlen | a72efc77de | Update llama.cpp | 2023-09-28 23:25:14 -04:00
Andrei Betlen | 1a1c3dc418 | Update llama.cpp | 2023-09-28 22:42:03 -04:00
Andrei Betlen | 4177ae6d34 | Bump version | 2023-09-25 14:38:38 -04:00
Andrei Betlen | 1ed0f3ebe1 | Bump scikit-build-core version to one that includes fix for Windows CMake | 2023-09-25 14:20:09 -04:00
Andrei Betlen | f7b785a00f | Update CHANGELOG | 2023-09-25 13:58:23 -04:00
Andrei Betlen | cf8ae5a69c | Merge branch 'main' of github.com:abetlen/llama_cpp_python into main | 2023-09-25 13:57:00 -04:00
Andrei Betlen | 5da57734bc | Update llama.cpp | 2023-09-25 13:56:52 -04:00
Viacheslav/Slava Tradunsky | 3d5e5b1c04 | Adds openai-processing-ms response header (#748) | 2023-09-25 13:55:58 -04:00
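A sketch of how an openai-processing-ms response header can be attached in a FastAPI app via middleware, mirroring what #748 adds (the app and middleware function here are illustrative):

```python
import time

from fastapi import FastAPI, Request

app = FastAPI()


@app.middleware("http")
async def add_processing_ms(request: Request, call_next):
    start = time.perf_counter()
    response = await call_next(request)
    elapsed_ms = int((time.perf_counter() - start) * 1000)
    # Same header the OpenAI API returns, so OpenAI-compatible clients
    # can read request latency from either backend.
    response.headers["openai-processing-ms"] = str(elapsed_ms)
    return response
```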