llama.cpp

Lucas Doyle 0fcc25cdac examples fastapi_server: deprecate	2023-05-01 22:34:23 -07:00

This commit "deprecates" the example fastapi server by remaining runnable but pointing folks at the module if they want to learn more.

Rationale:

Currently there exist two server implementations in this repo:

- `llama_cpp/server/__main__.py`, the module that's runnable by consumers of the library with `python3 -m llama_cpp.server`
- `examples/high_level_api/fastapi_server.py`, which is probably a copy-pasted example by folks hacking around

IMO this is confusing. As a new user of the library I see they've both been updated relatively recently but looking side-by-side there's a diff.

The one in the module seems better:

- supports logits_all
- supports use_mmap
- has experimental cache support (with some mutex thing going on)
- some stuff with streaming support was moved around more recently than fastapi_server.py

high_level_api	examples fastapi_server: deprecate	2023-05-01 22:34:23 -07:00
low_level_api	Detect multi-byte responses and wait	2023-04-28 12:50:30 +02:00
notebooks	Add clients example. Closes #46	2023-04-08 09:35:32 -04:00