baalajimaestro/llama.cpp

Author	SHA1	Message	Date
Andrei	bcc4e631cb	Merge pull request #163 from abetlen/dependabot/pip/black-23.3.0 Bump black from 23.1.0 to 23.3.0	2023-05-06 17:10:30 -04:00
Maximilian Winter	aa203a0d65	Added mirostat sampling to the high level API.	2023-05-06 22:47:47 +02:00
Mug	fd80ddf703	Fix a bug with wrong type	2023-05-06 22:22:28 +02:00
Mug	996f63e9e1	Add utf8 to chat example	2023-05-06 15:16:58 +02:00
Mug	3ceb47b597	Fix mirostat requiring c_float	2023-05-06 13:35:50 +02:00
Mug	9797394c81	Wrong logit_bias parsed type	2023-05-06 13:27:52 +02:00
Mug	1895c11033	Rename postfix to suffix to match upstream	2023-05-06 13:18:25 +02:00
dependabot[bot]	c9bb602b26	Bump black from 23.1.0 to 23.3.0 Bumps [black](https://github.com/psf/black) from 23.1.0 to 23.3.0. - [Release notes](https://github.com/psf/black/releases) - [Changelog](https://github.com/psf/black/blob/main/CHANGES.md) - [Commits](https://github.com/psf/black/compare/23.1.0...23.3.0) --- updated-dependencies: - dependency-name: black dependency-type: direct:development update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com>	2023-05-05 23:25:53 +00:00
Andrei	2f2ea00a3d	Merge pull request #160 from th-neu/main Create dependabot.yml	2023-05-05 19:24:53 -04:00
Thomas Neu	79d50a29f4	Create dependabot.yml	2023-05-06 01:02:59 +02:00
Andrei Betlen	980903df93	Merge branch 'main' of github.com:abetlen/llama_cpp_python into main	2023-05-05 15:07:26 -04:00
Andrei Betlen	98bbd1c6a8	Fix eval logits type	2023-05-05 14:23:14 -04:00
Andrei Betlen	b5f3e74627	Add return type annotations for embeddings and logits	2023-05-05 14:22:55 -04:00
Andrei Betlen	3e28e0e50c	Fix: runtime type errors	2023-05-05 14:12:26 -04:00
Andrei Betlen	e24c3d7447	Prefer explicit imports	2023-05-05 14:05:31 -04:00
Andrei Betlen	40501435c1	Fix: types	2023-05-05 14:04:12 -04:00
Andrei Betlen	66e28eb548	Fix temperature bug	2023-05-05 14:00:41 -04:00
Andrei Betlen	6702d2abfd	Fix candidates type	2023-05-05 14:00:30 -04:00
Andrei Betlen	5e7ddfc3d6	Fix llama_cpp types	2023-05-05 13:54:22 -04:00
Andrei	f712a04f4e	Merge pull request #157 from th-neu/th-neu-readme-windows readme windows	2023-05-05 12:40:45 -04:00
Thomas Neu	22c3056b2a	Update README.md added MacOS	2023-05-05 18:40:00 +02:00
Andrei Betlen	b6a9a0b6ba	Add types for all low-level api functions	2023-05-05 12:22:27 -04:00
Andrei Betlen	5be0efa5f8	Cache should raise KeyError when key is missing	2023-05-05 12:21:49 -04:00
Andrei Betlen	24fc38754b	Add cli options to server. Closes #37	2023-05-05 12:08:28 -04:00
Thomas Neu	eb54e30f34	Update README.md	2023-05-05 14:22:41 +02:00
Thomas Neu	952ba9ecaf	Update README.md add windows server command	2023-05-05 14:21:57 +02:00
Andrei Betlen	5f583b0179	Merge branch 'main' of github.com:abetlen/llama_cpp_python into main	2023-05-04 21:59:40 -04:00
Andrei Betlen	5c165a85da	Bump version	2023-05-04 21:59:37 -04:00
Andrei Betlen	853dc711cc	Format	2023-05-04 21:58:36 -04:00
Andrei Betlen	97c6372350	Rewind model to longest prefix.	2023-05-04 21:58:27 -04:00
Andrei	38b8eeea58	Merge pull request #154 from th-neu/th-neu-dockerfile-slim Slim-Bullseye based docker image	2023-05-04 19:59:23 -04:00
Thomas Neu	5672ed7fea	Merge branch 'abetlen:main' into th-neu-dockerfile-slim	2023-05-04 21:41:13 +02:00
Thomas Neu	501321875f	Slim-Bullseye based docker image ends up at ~669MB	2023-05-04 21:03:19 +02:00
Mug	0e9f227afd	Update low level examples	2023-05-04 18:33:08 +02:00
Andrei Betlen	cabd8b8ed1	Bump version	2023-05-04 12:21:20 -04:00
Andrei Betlen	d78cec67df	Update llama.cpp	2023-05-04 12:20:25 -04:00
Andrei Betlen	329297fafb	Bugfix: Missing logits_to_logprobs	2023-05-04 12:18:40 -04:00
Andrei Betlen	d594892fd4	Remove Docker CUDA build job	2023-05-04 00:02:46 -04:00
Andrei Betlen	0607f6578e	Use network installer for cuda	2023-05-03 23:22:16 -04:00
Andrei Betlen	6d3c20e39d	Add CUDA docker image build to github actions	2023-05-03 22:20:53 -04:00
Lucas Doyle	3008a954c1	Merge branch 'main' of github.com:abetlen/llama-cpp-python into better-server-params-and-fields	2023-05-03 13:10:03 -07:00
Andrei Betlen	a02aa121da	Remove cuda build job	2023-05-03 10:50:48 -04:00
Andrei Betlen	07a56dd9c2	Update job name	2023-05-03 10:39:39 -04:00
Andrei Betlen	7839eb14d3	Add docker cuda image. Closes #143	2023-05-03 10:29:05 -04:00
Andrei Betlen	9e5b6d675a	Improve logging messages	2023-05-03 10:28:10 -04:00
Andrei Betlen	43f2907e3a	Support smaller state sizes	2023-05-03 09:33:50 -04:00
Andrei Betlen	1d47cce222	Update llama.cpp	2023-05-03 09:33:30 -04:00
Lucas Doyle	b9098b0ef7	llama_cpp server: prompt is a string. Not sure why this union type was here, but taking a look at llama.py, prompt is only ever processed as a string for completion. This was breaking types when generating an openapi client.	2023-05-02 14:47:07 -07:00
Lucas Doyle	0fcc25cdac	examples fastapi_server: deprecate. This commit "deprecates" the example fastapi server by remaining runnable but pointing folks at the module if they want to learn more. Rationale: currently there exist two server implementations in this repo: (1) `llama_cpp/server/__main__.py`, the module that's runnable by consumers of the library with `python3 -m llama_cpp.server`; (2) `examples/high_level_api/fastapi_server.py`, which is probably a copy-pasted example by folks hacking around. IMO this is confusing. As a new user of the library I see they've both been updated relatively recently, but looking side-by-side there's a diff. The one in the module seems better: it supports logits_all; it supports use_mmap; it has experimental cache support (with some mutex thing going on); and some stuff with streaming support was moved around more recently than fastapi_server.py.	2023-05-01 22:34:23 -07:00
Andrei Betlen	c2e31eecee	Update permissions	2023-05-02 01:23:17 -04:00


440 commits