baalajimaestro/llama.cpp

Author	SHA1	Message	Date
earonesty	58a6e42cc0	Update app.py (#705)	2023-09-13 23:01:34 -04:00
Andrei Betlen	f4090a0bb2	Add numa support, low level api users must now explicitly call llama_backend_init at the start of their programs.	2023-09-13 23:00:43 -04:00
Andrei Betlen	4daf77e546	Format	2023-09-13 21:23:23 -04:00
Andrei Betlen	2920c4bf7e	Update server params. Added lora_base, lora_path, low_vram, and main_gpu. Removed rms_norm_eps and n_gqa (deprecated in llama.cpp)	2023-09-13 21:23:13 -04:00
Devrim	da9df78db0	Add X-Request-ID request header for mirroring custom IDs. (#703)	2023-09-13 16:18:31 -04:00
Andrei Betlen	1910793f56	Merge branch 'main' into v0.2-wip	2023-09-12 16:43:32 -04:00
Andrei Betlen	5de8009706	Add copilot-codex completions endpoint for drop-in copilot usage	2023-08-25 17:49:14 -04:00
Andrei Betlen	cf405f6764	Merge branch 'main' into v0.2-wip	2023-08-24 00:30:51 -04:00
Andrei Betlen	d015bdb4f8	Add mul_mat_q option	2023-08-08 14:35:06 -04:00
Andrei Betlen	343480364f	Merge branch 'main' into v0.2-wip	2023-07-24 15:26:08 -04:00
Andrei Betlen	11dd2bf382	Add temporary rms_norm_eps parameter	2023-07-24 14:09:24 -04:00
Andrei Betlen	0538ba1dab	Merge branch 'main' into v0.2-wip	2023-07-20 19:06:26 -04:00
Andrei Betlen	28a111704b	Fix compatibility with older python versions	2023-07-20 18:52:10 -04:00
Andrei	365d9a4367	Merge pull request #481 from c0sogi/main Added `RouteErrorHandler` for server	2023-07-20 17:41:42 -04:00
Andrei Betlen	0b121a7456	Format	2023-07-19 03:48:27 -04:00
Andrei Betlen	b43917c144	Add functions parameters	2023-07-19 03:48:20 -04:00
Andrei Betlen	19ba9d3845	Use numpy arrays for logits_processors and stopping_criteria. Closes #491	2023-07-18 19:27:41 -04:00
shutup	5ed8bf132f	expose RoPE param to server start	2023-07-18 16:34:36 +08:00
c0sogi	1551ba10bd	Added `RouteErrorHandler` for server	2023-07-16 14:57:39 +09:00
Andrei Betlen	118b7f6d5c	fix: tensor_split should be optional list	2023-07-14 16:52:48 -04:00
Shouyi Wang	579f526246	Resolve merge conflicts	2023-07-14 14:37:01 +10:00
Andrei Betlen	de4cc5a233	bugfix: pydantic v2 fields	2023-07-13 23:25:12 -04:00
Shouyi Wang	9f21f548a5	Add tensor split	2023-07-09 23:00:59 +10:00
Andrei Betlen	52753b77f5	Upgrade fastapi to 0.100.0 and pydantic v2	2023-07-07 21:38:46 -04:00
Andrei Betlen	57d8ec3899	Add setting to control request interruption	2023-07-07 03:37:23 -04:00
Andrei Betlen	4c7cdcca00	Add interruptible streaming requests for llama-cpp-python server. Closes #183	2023-07-07 03:04:17 -04:00
Alexey	282698b6d3	server: pass seed param from command line to llama	2023-06-23 00:19:24 +04:00
Andrei Betlen	1e20be6d0c	Add low_vram to server settings	2023-06-14 22:13:42 -04:00
Andrei Betlen	f7c5cfaf50	Format server options	2023-06-14 22:08:28 -04:00
Andrei Betlen	9c41a3e990	Merge branch 'main' of github.com:abetlen/llama_cpp_python into main	2023-06-14 21:50:43 -04:00
Andrei	f568baeef1	Merge pull request #351 from player1537-forks/th/add-logits-bias-parameter Add support for `logit_bias` and `logit_bias_type` parameters	2023-06-14 21:49:56 -04:00
Andrei Betlen	f27393ab7e	Add additional verbose logs for cache	2023-06-14 21:46:48 -04:00
Gabor	3ea31930e5	fixes abetlen/llama-cpp-python #358	2023-06-11 00:58:08 +01:00
Tanner Hobson	eb7645b3ba	Add support for logit_bias and logit_bias_type parameters	2023-06-09 13:13:08 -04:00
Andrei Betlen	0c42168508	Fix cache implementation breaking changes	2023-06-08 13:19:23 -04:00
Eric B	9b1c9e902c	Added mirostat support for completions, chat completions API	2023-06-05 22:37:11 -04:00
Andrei Betlen	80066f0b80	Use async routes	2023-05-27 09:12:58 -04:00
Andrei Betlen	c2b59a5f59	Import unnused import	2023-05-26 22:59:29 -04:00
Simon Chabot	e783f1c191	feat: make embedding support list of string as input makes the /v1/embedding route similar to OpenAI api.	2023-05-20 01:23:32 +02:00
Andrei Betlen	a8cd169251	Bugfix: Stop sequences can be strings	2023-05-19 03:15:08 -04:00
Andrei Betlen	dc39cc0fa4	Use server sent events function for streaming completion	2023-05-19 02:04:30 -04:00
Andrei Betlen	a3352923c7	Add model_alias option to override model_path in completions. Closes #39	2023-05-16 17:22:00 -04:00
Andrei Betlen	cdf59768f5	Update llama.cpp	2023-05-14 00:04:22 -04:00
Andrei Betlen	8740ddc58e	Only support generating one prompt at a time.	2023-05-12 07:21:46 -04:00
Andrei Betlen	8895b9002a	Revert "llama_cpp server: prompt is a string". Closes #187 This reverts commit `b9098b0ef7`.	2023-05-12 07:16:57 -04:00
Lucas Doyle	02e8a018ae	llama_cpp server: document presence_penalty and frequency_penalty, mark as supported	2023-05-09 16:25:00 -07:00
Andrei Betlen	82d138fe54	Fix: default repeat_penalty	2023-05-08 18:49:11 -04:00
Andrei Betlen	0d751a69a7	Set repeat_penalty to 0 by default	2023-05-08 01:50:43 -04:00
Andrei Betlen	65d9cc050c	Add openai frequency and presence penalty parameters. Closes #169	2023-05-08 01:30:18 -04:00
Andrei Betlen	a0b61ea2a7	Bugfix for models endpoint	2023-05-07 20:17:52 -04:00

72 commits