baalajimaestro/llama.cpp

Author	SHA1	Message	Date
Andrei Betlen	dc39cc0fa4	Use server sent events function for streaming completion	2023-05-19 02:04:30 -04:00
Andrei	69f9d50090	Merge pull request #235 from Pipboyguy/main Decrement CUDA version and bump Ubuntu	2023-05-18 13:42:04 -04:00
Andrei Betlen	f0ec6e615e	Stream tokens instead of text chunks	2023-05-18 11:35:59 -04:00
Andrei Betlen	21d8f5fa9f	Remove unused union	2023-05-18 11:35:15 -04:00
Marcel Coetzee	6ece8a225a	Set CUDA_VERSION as build ARG Signed-off-by: Marcel Coetzee <marcel@mooncoon.com>	2023-05-18 16:59:42 +02:00
Marcel Coetzee	6c57d38552	Decrement CUDA version and bump Ubuntu Signed-off-by: Marcel Coetzee <marcel@mooncoon.com>	2023-05-18 16:02:42 +02:00
Andrei Betlen	50e136252a	Update llama.cpp	2023-05-17 16:14:12 -04:00
Andrei Betlen	db10e0078b	Update docs	2023-05-17 16:14:01 -04:00
Andrei Betlen	61d58e7b35	Check for CUDA_PATH before adding	2023-05-17 15:26:38 -04:00
Andrei Betlen	7c95895626	Merge branch 'main' of github.com:abetlen/llama_cpp_python into main	2023-05-17 15:19:32 -04:00
Andrei	47921a312c	Merge pull request #225 from aneeshjoy/main Fixed CUBLAS DLL load issues on Windows	2023-05-17 15:17:37 -04:00
Aneesh Joy	e9794f91f2	Fixed CUBLAS DLL load issue on Windows	2023-05-17 18:04:58 +01:00
Andrei Betlen	70695c430b	Move docs link up	2023-05-17 11:40:12 -04:00
Andrei Betlen	4f342795e5	Update token checks	2023-05-17 03:35:13 -04:00
Andrei Betlen	626003c884	Merge branch 'main' of github.com:abetlen/llama_cpp_python into main	2023-05-17 02:00:48 -04:00
Andrei Betlen	f5c2f998ab	Format	2023-05-17 02:00:39 -04:00
Andrei Betlen	d28b753ed2	Implement penalize_nl	2023-05-17 01:53:26 -04:00
Andrei Betlen	f11e2a781c	Fix last_n_tokens_size	2023-05-17 01:42:51 -04:00
Andrei Betlen	7e55244540	Fix top_k value. Closes #220	2023-05-17 01:41:42 -04:00
Andrei Betlen	e37a808bc0	Update llama.cpp	2023-05-16 23:33:53 -04:00
Andrei Betlen	a7c9e38287	Update variable name	2023-05-16 18:07:25 -04:00
Andrei Betlen	a3352923c7	Add model_alias option to override model_path in completions. Closes #39	2023-05-16 17:22:00 -04:00
Andrei Betlen	214589e462	Update llama.cpp	2023-05-16 17:20:45 -04:00
Andrei Betlen	a65125c0bd	Add sampling defaults for generate	2023-05-16 09:35:50 -04:00
Andrei Betlen	341c50b5b0	Fix CMakeLists.txt	2023-05-16 09:07:14 -04:00
Andrei	1a13d76c48	Merge pull request #215 from zxybazh/main Update README.md	2023-05-15 17:57:58 -04:00
Xiyou Zhou	408dd14e5b	Update README.md Fix typo.	2023-05-15 14:52:25 -07:00
Andrei	e0cca841bf	Merge pull request #214 from abetlen/dependabot/pip/mkdocs-material-9.1.12 Bump mkdocs-material from 9.1.11 to 9.1.12	2023-05-15 17:24:06 -04:00
dependabot[bot]	7526b3f6f9	Bump mkdocs-material from 9.1.11 to 9.1.12 Bumps [mkdocs-material](https://github.com/squidfunk/mkdocs-material) from 9.1.11 to 9.1.12. - [Release notes](https://github.com/squidfunk/mkdocs-material/releases) - [Changelog](https://github.com/squidfunk/mkdocs-material/blob/master/CHANGELOG) - [Commits](https://github.com/squidfunk/mkdocs-material/compare/9.1.11...9.1.12) --- updated-dependencies: - dependency-name: mkdocs-material dependency-type: direct:development update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com>	2023-05-15 21:05:54 +00:00
Andrei	cda9cecd5f	Merge pull request #212 from mzbac/patch-1 chore: add note for Mac m1 installation	2023-05-15 16:19:00 -04:00
Andrei Betlen	cbac19bf24	Add winmode arg only on windows if python version supports it	2023-05-15 09:15:01 -04:00
Anchen	3718799b37	chore: add note for Mac m1 installation	2023-05-15 20:46:59 +10:00
Andrei Betlen	c804efe3f0	Fix obscure Windows DLL issue. Closes #208	2023-05-14 22:08:11 -04:00
Andrei Betlen	ceec21f1e9	Update llama.cpp	2023-05-14 22:07:35 -04:00
Andrei Betlen	d90c9df326	Bump version	2023-05-14 00:04:49 -04:00
Andrei Betlen	cdf59768f5	Update llama.cpp	2023-05-14 00:04:22 -04:00
Andrei Betlen	7a536e86c2	Allow model to tokenize strings longer than context length and set add_bos. Closes #92	2023-05-12 14:28:22 -04:00
Andrei Betlen	8740ddc58e	Only support generating one prompt at a time.	2023-05-12 07:21:46 -04:00
Andrei Betlen	8895b9002a	Revert "llama_cpp server: prompt is a string". Closes #187 This reverts commit `b9098b0ef7`.	2023-05-12 07:16:57 -04:00
Andrei Betlen	684d7c8c17	Fix docker command	2023-05-11 22:12:35 -04:00
Andrei Betlen	fa1fc4ec42	Merge branch 'main' of github.com:abetlen/llama_cpp_python into main	2023-05-11 21:56:54 -04:00
Andrei Betlen	e3d3c31da2	Bump version	2023-05-11 21:56:43 -04:00
Andrei Betlen	7be584fe82	Add missing tfs_z parameter	2023-05-11 21:56:19 -04:00
Andrei Betlen	28ee2adec2	Update llama.cpp	2023-05-11 21:15:12 -04:00
Andrei Betlen	35229f5eab	Update llama.cpp	2023-05-11 10:05:34 -04:00
Andrei Betlen	cdeaded251	Bugfix: Ensure logs are printed when streaming	2023-05-10 16:12:17 -04:00
Andrei	c3ed1330d7	Merge pull request #177 from joelkurian/main Updated installation instructions for BLAS backends	2023-05-10 05:27:12 -04:00
Andrei	3c96b43cf4	Merge pull request #178 from Stonelinks/document-presence-frequency-penalty Document presence frequency penalty	2023-05-09 23:55:52 -04:00
Lucas Doyle	02e8a018ae	llama_cpp server: document presence_penalty and frequency_penalty, mark as supported	2023-05-09 16:25:00 -07:00
Lucas Doyle	bebe7712f7	README: better setup instructions for developers for pip and poetry Give folks options + explicit instructions for installing with poetry or pip.	2023-05-09 16:04:15 -07:00

1145 commits