Lucas Doyle
02e8a018ae
llama_cpp server: document presence_penalty and frequency_penalty, mark as supported
2023-05-09 16:25:00 -07:00
Andrei Betlen
d957422bf4
Implement sampling as in llama.cpp main example
2023-05-08 21:21:25 -04:00
Andrei Betlen
93a9019bb1
Merge branch 'main' of github.com:abetlen/llama_cpp_python into Maximilian-Winter/main
2023-05-08 19:57:09 -04:00
Andrei Betlen
82d138fe54
Fix: default repeat_penalty
2023-05-08 18:49:11 -04:00
Andrei Betlen
29f094bbcf
Bugfix: not falling back to environment variables when a default value is set.
2023-05-08 14:46:25 -04:00
Andrei Betlen
0d6c60097a
Show default value when --help is called
2023-05-08 14:21:15 -04:00
Andrei Betlen
022e9ebcb8
Use environment variable if parsed cli arg is None
2023-05-08 14:20:53 -04:00
Andrei Betlen
0d751a69a7
Set repeat_penalty to 0 by default
2023-05-08 01:50:43 -04:00
Andrei Betlen
65d9cc050c
Add OpenAI frequency and presence penalty parameters. Closes #169
2023-05-08 01:30:18 -04:00
Andrei Betlen
a0b61ea2a7
Bugfix for models endpoint
2023-05-07 20:17:52 -04:00
Andrei Betlen
e72f58614b
Change pointer to lower-overhead byref
2023-05-07 20:01:34 -04:00
Andrei Betlen
14da46f16e
Added cache size to settings object.
2023-05-07 19:33:17 -04:00
Andrei Betlen
0e94a70de1
Add in-memory longest prefix cache. Closes #158
2023-05-07 19:31:26 -04:00
Andrei Betlen
8dfde63255
Fix return type
2023-05-07 19:30:14 -04:00
Andrei Betlen
2753b85321
Format
2023-05-07 13:19:56 -04:00
Andrei Betlen
627811ea83
Add verbose flag to server
2023-05-07 05:09:10 -04:00
Andrei Betlen
3fbda71790
Fix mlock_supported and mmap_supported return type
2023-05-07 03:04:22 -04:00
Andrei Betlen
5a3413eee3
Update cpu_count
2023-05-07 03:03:57 -04:00
Andrei Betlen
1a00e452ea
Update settings fields and defaults
2023-05-07 02:52:20 -04:00
Andrei Betlen
86753976c4
Revert "llama_cpp server: delete some ignored / unused parameters"
This reverts commit b47b9549d5.
2023-05-07 02:02:34 -04:00
Andrei Betlen
c382d8f86a
Revert "llama_cpp server: mark model as required"
This reverts commit e40fcb0575.
2023-05-07 02:00:22 -04:00
Andrei Betlen
d8fddcce73
Merge branch 'main' of github.com:abetlen/llama_cpp_python into better-server-params-and-fields
2023-05-07 01:54:00 -04:00
Andrei Betlen
7c3743fe5f
Update llama.cpp
2023-05-07 00:12:47 -04:00
Andrei Betlen
bc853e3742
Fix type for eval_logits in LlamaState object
2023-05-06 21:32:50 -04:00
Maximilian Winter
515d9bde7e
Fixed some things and activated cuBLAS
2023-05-06 23:40:19 +02:00
Maximilian Winter
aa203a0d65
Added mirostat sampling to the high-level API.
2023-05-06 22:47:47 +02:00
Andrei Betlen
98bbd1c6a8
Fix eval logits type
2023-05-05 14:23:14 -04:00
Andrei Betlen
b5f3e74627
Add return type annotations for embeddings and logits
2023-05-05 14:22:55 -04:00
Andrei Betlen
3e28e0e50c
Fix: runtime type errors
2023-05-05 14:12:26 -04:00
Andrei Betlen
e24c3d7447
Prefer explicit imports
2023-05-05 14:05:31 -04:00
Andrei Betlen
40501435c1
Fix: types
2023-05-05 14:04:12 -04:00
Andrei Betlen
66e28eb548
Fix temperature bug
2023-05-05 14:00:41 -04:00
Andrei Betlen
6702d2abfd
Fix candidates type
2023-05-05 14:00:30 -04:00
Andrei Betlen
5e7ddfc3d6
Fix llama_cpp types
2023-05-05 13:54:22 -04:00
Andrei Betlen
b6a9a0b6ba
Add types for all low-level api functions
2023-05-05 12:22:27 -04:00
Andrei Betlen
5be0efa5f8
Cache should raise KeyError when key is missing
2023-05-05 12:21:49 -04:00
Andrei Betlen
24fc38754b
Add cli options to server. Closes #37
2023-05-05 12:08:28 -04:00
Andrei Betlen
853dc711cc
Format
2023-05-04 21:58:36 -04:00
Andrei Betlen
97c6372350
Rewind model to longest prefix.
2023-05-04 21:58:27 -04:00
Andrei Betlen
329297fafb
Bugfix: Missing logits_to_logprobs
2023-05-04 12:18:40 -04:00
Lucas Doyle
3008a954c1
Merge branch 'main' of github.com:abetlen/llama-cpp-python into better-server-params-and-fields
2023-05-03 13:10:03 -07:00
Andrei Betlen
9e5b6d675a
Improve logging messages
2023-05-03 10:28:10 -04:00
Andrei Betlen
43f2907e3a
Support smaller state sizes
2023-05-03 09:33:50 -04:00
Andrei Betlen
1d47cce222
Update llama.cpp
2023-05-03 09:33:30 -04:00
Lucas Doyle
b9098b0ef7
llama_cpp server: prompt is a string
Not sure why this union type was here, but looking at llama.py, prompt is only ever processed as a string for completion.
This was breaking types when generating an OpenAPI client; see the sketch after this entry.
2023-05-02 14:47:07 -07:00
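A minimal illustration of the point above (a hypothetical sketch, not the repo's actual request model; field names and defaults are illustrative): with `prompt` typed as a plain `str`, the generated OpenAPI schema is a simple string field rather than the `anyOf` union that some client generators fail to handle.

```python
# Hypothetical sketch (Pydantic v1 style, as used by the server around this time);
# illustrative fields only, not the full completion request model.
from pydantic import BaseModel, Field


class CompletionRequest(BaseModel):
    # Typing prompt as a plain string keeps the generated schema a simple
    # {"type": "string"} instead of an anyOf over string and array-of-string.
    prompt: str = Field(default="", description="The prompt to generate text from.")
    max_tokens: int = 16


# Inspect the schema a client generator would consume.
print(CompletionRequest.schema_json(indent=2))
```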
Matt Hoffner
f97ff3c5bb
Update llama_cpp.py
2023-05-01 20:40:06 -07:00
Andrei
7ab08b8d10
Merge branch 'main' into better-server-params-and-fields
2023-05-01 22:45:57 -04:00
Andrei Betlen
9eafc4c49a
Refactor server to use factory
2023-05-01 22:38:46 -04:00
Andrei Betlen
dd9ad1c759
Formatting
2023-05-01 21:51:16 -04:00
Lucas Doyle
dbbfc4ba2f
llama_cpp server: fix to ChatCompletionRequestMessage
When I generate a client, it breaks because it fails to process the schema of ChatCompletionRequestMessage.
These changes fix that (see the sketch after this entry):
- `Union[Literal["user"], Literal["channel"], ...]` is equivalent to `Literal["user", "channel", ...]`
- The default value `Literal["user"]` isn't JSON serializable, so replace it with the plain string "user"
2023-05-01 15:38:19 -07:00
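A small sketch of both fixes (hypothetical code, not the repo's actual model; the role values shown are illustrative): a single multi-value `Literal` stands in for the union of single-value `Literal`s, and the default is the plain string `"user"` rather than a `Literal` type object, so the exported JSON schema stays serializable.

```python
# Hypothetical sketch (Pydantic v1 style); role values are illustrative.
from typing import Literal

from pydantic import BaseModel


class ChatMessage(BaseModel):
    # One multi-value Literal allows the same set of values as
    # Union[Literal["user"], Literal["assistant"], ...] but renders as a single
    # enum in the JSON schema, which client generators handle cleanly.
    # The default is the plain string "user"; a Literal[...] object used as the
    # default would not be JSON serializable when the schema is exported.
    role: Literal["user", "assistant", "system"] = "user"
    content: str = ""


# Inspect the schema a client generator would consume.
print(ChatMessage.schema_json(indent=2))
```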