baalajimaestro/llama.cpp

Author	SHA1	Message	Date
Andrei Betlen	d808fd436c	Update llama.cpp	2023-10-31 21:29:35 -04:00
Andrei Betlen	53861c9e53	Update llama.cpp	2023-10-24 03:13:32 -04:00
gmcgoldr	09a8406c83	Fix streaming doesn't return finish reason (#798) When streaming, the yield that contains the finish reason can be skipped. This change ensures that yield isn't skipped.	2023-10-19 02:55:56 -04:00
Andrei Betlen	28c2b884e2	Merge branch 'main' of github.com:abetlen/llama_cpp_python into main	2023-10-19 02:55:31 -04:00
Andrei Betlen	ff580031d2	Update llama.cpp	2023-10-19 02:55:08 -04:00
Xiaoyu Kevin Hu	a315128d66	update value check for n_gpu_layers field (#826)	2023-10-18 18:25:25 -04:00
Pierre Alexandre SCHEMBRI	10304d75fc	Make use of suppress_stdout_stderr when freeing model (#803)	2023-10-15 13:52:43 -04:00
Ma, Guokai	a1ac199980	Fix repeat greeting (#808) * fix repeated greeting * remove separator between role and message	2023-10-15 13:52:21 -04:00
Eric Liu	b50166500e	Add validation for tensor_split size exceeding LLAMA_MAX_DEVICES (#820) * Add validation for tensor_split size exceeding LLAMA_MAX_DEVICES * reword	2023-10-15 13:51:51 -04:00
Andrei Betlen	d6a130a052	Print traceback on server error	2023-10-10 15:56:04 -04:00
Andrei Betlen	43dfe1e2ab	Update llama.cpp	2023-10-05 16:07:49 -04:00
Andrei Betlen	a7d17b8ac9	Update llama.cpp	2023-10-03 15:23:35 -04:00
Andrei Betlen	305482bd41	Add chatml chat format	2023-09-30 21:01:34 -04:00
Andrei Betlen	5ef5280ef9	Log server exceptions to stdout	2023-09-30 19:13:36 -04:00
Andrei Betlen	fab4bccc35	Bump version	2023-09-30 16:04:46 -04:00
Andrei Betlen	d696251fbe	Fix logits_all bug	2023-09-30 16:02:35 -04:00
Andrei Betlen	6ee413d79e	Bump version	2023-09-30 13:23:09 -04:00
Andrei Betlen	42bb721d64	Fix bug in embedding	2023-09-30 13:20:22 -04:00
Andrei Betlen	5d62d55a82	Bump version	2023-09-30 00:07:06 -04:00
Andrei Betlen	386c88b68e	Bump version	2023-09-29 20:07:31 -04:00
Andrei Betlen	d9bce17794	Update server params	2023-09-29 19:59:12 -04:00
Andrei Betlen	3720c739d4	Update llama.cpp	2023-09-29 19:58:21 -04:00
Andrei	3bca7708fb	Configurable Chat Formats (#711) * Add configurable default chat completion format. * Remove chat_template file to avoid circular import * Update llama_types * Add chat format	2023-09-29 19:52:04 -04:00
Josh XT	a945404b4a	Fix rope scaling defaults (#767) * Fix rope scale with backwards compatibility * Fix defaults * Fix op * Remove backwards compatibility * Check single val	2023-09-29 16:03:57 -04:00
Andrei Betlen	1a1c3dc418	Update llama.cpp	2023-09-28 22:42:03 -04:00
Andrei Betlen	4177ae6d34	Bump version	2023-09-25 14:38:38 -04:00
Viacheslav/Slava Tradunsky	3d5e5b1c04	Adds openai-processing-ms response header (#748)	2023-09-25 13:55:58 -04:00
Andrei Betlen	dbca136fea	Update llama_types and names to match openai api	2023-09-20 15:38:26 -04:00
Andrei Betlen	38e34c97f0	Update llama.cpp	2023-09-18 16:11:27 -04:00
Andrei Betlen	8d75016549	Install required runtime dlls to package directory on windows	2023-09-16 14:57:49 -04:00
Andrei Betlen	acf18fcdf0	Bump version	2023-09-15 14:22:21 -04:00
Andrei Betlen	b047b3034e	Remove confusing helpstring from server cli args. Closes #719	2023-09-15 14:09:43 -04:00
Andrei Betlen	24fec0b242	Bump version	2023-09-14 18:33:08 -04:00
Andrei Betlen	8474665625	Update base_path to fix issue resolving dll in windows isolation container.	2023-09-14 14:51:43 -04:00
Andrei Betlen	507bcc7171	Bump version	2023-09-13 23:15:23 -04:00
Andrei Betlen	0449d29b9f	Fix boolean env vars and cli arguments	2023-09-13 23:09:57 -04:00
earonesty	58a6e42cc0	Update app.py (#705)	2023-09-13 23:01:34 -04:00
Andrei Betlen	f4090a0bb2	Add numa support, low level api users must now explicitly call llama_backend_init at the start of their programs.	2023-09-13 23:00:43 -04:00
Andrei Betlen	c999325e8e	Fix boolean cli flags	2023-09-13 22:56:10 -04:00
Andrei Betlen	4daf77e546	Format	2023-09-13 21:23:23 -04:00
Andrei Betlen	2920c4bf7e	Update server params. Added lora_base, lora_path, low_vram, and main_gpu. Removed rms_norm_eps and n_gqa (deprecated in llama.cpp)	2023-09-13 21:23:13 -04:00
Andrei Betlen	6a20293fc2	Reorder init params to match llama.cpp order	2023-09-13 21:20:26 -04:00
Andrei Betlen	c8f9b8a734	Explicitly make all init params other than model_path into keyword only params	2023-09-13 21:19:47 -04:00
Andrei Betlen	a68f9e2791	Add kwargs to init to catch extra params	2023-09-13 21:19:02 -04:00
Andrei Betlen	9e345a47a2	remove print	2023-09-13 21:12:27 -04:00
Andrei Betlen	517f9ed80b	Convert missed llama.cpp constants into standard python types	2023-09-13 21:11:52 -04:00
Andrei Betlen	c4c440ba2d	Fix tensor_split cli option	2023-09-13 20:00:42 -04:00
Andrei Betlen	203ede4ba2	Bump version	2023-09-13 18:07:08 -04:00
Andrei Betlen	759405c84b	Fix issue with Literal and Optional cli arguments not working. Closes #702	2023-09-13 18:06:12 -04:00
Devrim	da9df78db0	Add X-Request-ID request header for mirroring custom IDs. (#703)	2023-09-13 16:18:31 -04:00

441 commits