llama.cpp

Lucas Doyle 0fcc25cdac examples fastapi_server: deprecate	2023-05-01 22:34:23 -07:00

This commit "deprecates" the example fastapi server by remaining runnable but pointing folks at the module if they want to learn more.

Rationale:

Currently there exist two server implementations in this repo:

- `llama_cpp/server/__main__.py`, the module that's runnable by consumers of the library with `python3 -m llama_cpp.server`
- `examples/high_level_api/fastapi_server.py`, which is probably a copy-pasted example by folks hacking around

IMO this is confusing. As a new user of the library I see they've both been updated relatively recently but looking side-by-side there's a diff.

The one in the module seems better:

- supports logits_all
- supports use_mmap
- has experimental cache support (with some mutex thing going on)
- some stuff with streaming support was moved around more recently than fastapi_server.py
fastapi_server.py	examples fastapi_server: deprecate	2023-05-01 22:34:23 -07:00
high_level_api_embedding.py	Update model paths to be more clear they should point to file	2023-04-09 22:45:55 -04:00
high_level_api_inference.py	Update model paths to be more clear they should point to file	2023-04-09 22:45:55 -04:00
high_level_api_streaming.py	Update model paths to be more clear they should point to file	2023-04-09 22:45:55 -04:00
langchain_custom_llm.py	Update model paths to be more clear they should point to file	2023-04-09 22:45:55 -04:00