Commit graph

1041 commits

Author SHA1 Message Date
Andrei
bcc4e631cb
Merge pull request #163 from abetlen/dependabot/pip/black-23.3.0
Bump black from 23.1.0 to 23.3.0
2023-05-06 17:10:30 -04:00
Maximilian Winter
aa203a0d65 Added mirostat sampling to the high level API. 2023-05-06 22:47:47 +02:00
Mug
fd80ddf703 Fix a bug with wrong type 2023-05-06 22:22:28 +02:00
Mug
996f63e9e1 Add utf8 to chat example 2023-05-06 15:16:58 +02:00
Mug
3ceb47b597 Fix mirostat requiring c_float 2023-05-06 13:35:50 +02:00
Mug
9797394c81 Fix wrongly parsed logit_bias type 2023-05-06 13:27:52 +02:00
Mug
1895c11033 Rename postfix to suffix to match upstream 2023-05-06 13:18:25 +02:00
dependabot[bot]
c9bb602b26
Bump black from 23.1.0 to 23.3.0
Bumps [black](https://github.com/psf/black) from 23.1.0 to 23.3.0.
- [Release notes](https://github.com/psf/black/releases)
- [Changelog](https://github.com/psf/black/blob/main/CHANGES.md)
- [Commits](https://github.com/psf/black/compare/23.1.0...23.3.0)

---
updated-dependencies:
- dependency-name: black
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-05-05 23:25:53 +00:00
Andrei
2f2ea00a3d
Merge pull request #160 from th-neu/main
Create dependabot.yml
2023-05-05 19:24:53 -04:00
Thomas Neu
79d50a29f4
Create dependabot.yml 2023-05-06 01:02:59 +02:00
Andrei Betlen
980903df93 Merge branch 'main' of github.com:abetlen/llama_cpp_python into main 2023-05-05 15:07:26 -04:00
Andrei Betlen
98bbd1c6a8 Fix eval logits type 2023-05-05 14:23:14 -04:00
Andrei Betlen
b5f3e74627 Add return type annotations for embeddings and logits 2023-05-05 14:22:55 -04:00
Andrei Betlen
3e28e0e50c Fix: runtime type errors 2023-05-05 14:12:26 -04:00
Andrei Betlen
e24c3d7447 Prefer explicit imports 2023-05-05 14:05:31 -04:00
Andrei Betlen
40501435c1 Fix: types 2023-05-05 14:04:12 -04:00
Andrei Betlen
66e28eb548 Fix temperature bug 2023-05-05 14:00:41 -04:00
Andrei Betlen
6702d2abfd Fix candidates type 2023-05-05 14:00:30 -04:00
Andrei Betlen
5e7ddfc3d6 Fix llama_cpp types 2023-05-05 13:54:22 -04:00
Andrei
f712a04f4e
Merge pull request #157 from th-neu/th-neu-readme-windows
readme windows
2023-05-05 12:40:45 -04:00
Thomas Neu
22c3056b2a
Update README.md
Added macOS
2023-05-05 18:40:00 +02:00
Andrei Betlen
b6a9a0b6ba Add types for all low-level api functions 2023-05-05 12:22:27 -04:00
Andrei Betlen
5be0efa5f8 Cache should raise KeyError when key is missing 2023-05-05 12:21:49 -04:00
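A minimal sketch of the dict-style behavior that commit describes, assuming a dict-backed cache class (the class and attribute names here are illustrative, not the library's actual API):

```python
# Illustrative only: a dict-backed cache whose __getitem__ raises KeyError
# on a miss, matching standard Python mapping semantics.
class SimpleCache:
    def __init__(self):
        self._store = {}

    def __getitem__(self, key):
        if key not in self._store:
            raise KeyError(key)  # cache miss: raise rather than return None
        return self._store[key]

    def __setitem__(self, key, value):
        self._store[key] = value

    def __contains__(self, key):
        return key in self._store
```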
Andrei Betlen
24fc38754b Add cli options to server. Closes #37 2023-05-05 12:08:28 -04:00
Thomas Neu
eb54e30f34
Update README.md 2023-05-05 14:22:41 +02:00
Thomas Neu
952ba9ecaf
Update README.md
Add Windows server command
2023-05-05 14:21:57 +02:00
Andrei Betlen
5f583b0179 Merge branch 'main' of github.com:abetlen/llama_cpp_python into main 2023-05-04 21:59:40 -04:00
Andrei Betlen
5c165a85da Bump version 2023-05-04 21:59:37 -04:00
Andrei Betlen
853dc711cc Format 2023-05-04 21:58:36 -04:00
Andrei Betlen
97c6372350 Rewind model to longest prefix. 2023-05-04 21:58:27 -04:00
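The idea behind "Rewind model to longest prefix" is to keep the already-evaluated tokens that the new prompt shares with the previous one and only re-evaluate the rest; a hedged sketch of that prefix computation (helper name and token values are illustrative):

```python
def longest_common_prefix_len(cached_tokens, new_tokens):
    """Count how many leading tokens the two sequences share."""
    count = 0
    for a, b in zip(cached_tokens, new_tokens):
        if a != b:
            break
        count += 1
    return count

# Example: only the tokens after the shared prefix need a fresh eval pass.
cached_tokens = [1, 15043, 29892, 3186]    # previously evaluated prompt
new_tokens = [1, 15043, 29892, 590, 1024]  # new prompt
keep = longest_common_prefix_len(cached_tokens, new_tokens)  # -> 3
tokens_to_eval = new_tokens[keep:]
```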
Andrei
38b8eeea58
Merge pull request #154 from th-neu/th-neu-dockerfile-slim
Slim-Bullseye based docker image
2023-05-04 19:59:23 -04:00
Thomas Neu
5672ed7fea
Merge branch 'abetlen:main' into th-neu-dockerfile-slim 2023-05-04 21:41:13 +02:00
Thomas Neu
501321875f
Slim-Bullseye based docker image
ends up at ~669MB
2023-05-04 21:03:19 +02:00
Mug
0e9f227afd Update low level examples 2023-05-04 18:33:08 +02:00
Andrei Betlen
cabd8b8ed1 Bump version 2023-05-04 12:21:20 -04:00
Andrei Betlen
d78cec67df Update llama.cpp 2023-05-04 12:20:25 -04:00
Andrei Betlen
329297fafb Bugfix: Missing logits_to_logprobs 2023-05-04 12:18:40 -04:00
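For context, turning logits into log-probabilities is a log-softmax; a minimal NumPy sketch of that transformation (not necessarily the library's exact implementation):

```python
import numpy as np

def logits_to_logprobs(logits):
    # Numerically stable log-softmax: shift by the max before exponentiating.
    logits = np.asarray(logits, dtype=np.float64)
    shifted = logits - logits.max()
    log_sum_exp = np.log(np.exp(shifted).sum())
    return shifted - log_sum_exp

print(logits_to_logprobs([2.0, 1.0, 0.1]))  # exponentiating these sums to 1
```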
Andrei Betlen
d594892fd4 Remove Docker CUDA build job 2023-05-04 00:02:46 -04:00
Andrei Betlen
0607f6578e Use network installer for cuda 2023-05-03 23:22:16 -04:00
Andrei Betlen
6d3c20e39d Add CUDA docker image build to github actions 2023-05-03 22:20:53 -04:00
Lucas Doyle
3008a954c1 Merge branch 'main' of github.com:abetlen/llama-cpp-python into better-server-params-and-fields 2023-05-03 13:10:03 -07:00
Andrei Betlen
a02aa121da Remove cuda build job 2023-05-03 10:50:48 -04:00
Andrei Betlen
07a56dd9c2 Update job name 2023-05-03 10:39:39 -04:00
Andrei Betlen
7839eb14d3 Add docker cuda image. Closes #143 2023-05-03 10:29:05 -04:00
Andrei Betlen
9e5b6d675a Improve logging messages 2023-05-03 10:28:10 -04:00
Andrei Betlen
43f2907e3a Support smaller state sizes 2023-05-03 09:33:50 -04:00
Andrei Betlen
1d47cce222 Update llama.cpp 2023-05-03 09:33:30 -04:00
Lucas Doyle
b9098b0ef7 llama_cpp server: prompt is a string
Not sure why this union type was here, but taking a look at llama.py, prompt is only ever processed as a string for completion.

This was breaking types when generating an OpenAPI client.
2023-05-02 14:47:07 -07:00
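The change described above amounts to narrowing the request schema's prompt field from a union type to a plain string; a hedged Pydantic sketch (class and field names are illustrative, not the server's exact models):

```python
from pydantic import BaseModel

# Before (illustrative): prompt: Union[str, List[str]] made the generated
# OpenAPI client ambiguous about which shape to send.
# After (illustrative): prompt is always a single string, matching how
# llama.py actually consumes it for completion.
class CreateCompletionRequest(BaseModel):
    prompt: str = ""
    max_tokens: int = 16  # other fields omitted for brevity
```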
Lucas Doyle
0fcc25cdac examples fastapi_server: deprecate
This commit "deprecates" the example fastapi server by remaining runnable but pointing folks at the module if they want to learn more.

Rationale:

Currently there exist two server implementations in this repo:

- `llama_cpp/server/__main__.py`, the module that's runnable by consumers of the library with `python3 -m llama_cpp.server`
- `examples/high_level_api/fastapi_server.py`, which is probably a copy-pasted example by folks hacking around

IMO this is confusing. As a new user of the library, I see they've both been updated relatively recently, but looking at them side by side there's a diff.

The one in the module seems better:
- supports logits_all
- supports use_mmap
- has experimental cache support (with some mutex thing going on)
- some streaming-support code was moved around there more recently than fastapi_server.py was last touched
2023-05-01 22:34:23 -07:00
Andrei Betlen
c2e31eecee Update permissions 2023-05-02 01:23:17 -04:00