llama.cpp/examples/high_level_api/fastapi_server.py

"""Example FastAPI server for llama.cpp.
To run this example:
```bash
pip install fastapi uvicorn sse-starlette
export MODEL=../models/7B/...
```
Then run:
```
uvicorn --factory llama_cpp.server.app:create_app --reload
```
or
```
python3 -m llama_cpp.server
```
Then visit http://localhost:8000/docs to see the interactive API docs.
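Once the server is up, it can also be queried from a script. The following is
a minimal sketch, not part of the server itself: it assumes the server exposes
the OpenAI-compatible /v1/completions route (check the interactive docs above
to confirm), and build_completion_request is a hypothetical helper name used
only for illustration.

```python
import json
import os
import urllib.request


def build_completion_request(prompt, max_tokens=16):
    # HOST and PORT mirror the defaults used by this example's entry point.
    host = os.getenv("HOST", "localhost")
    port = int(os.getenv("PORT", "8000"))
    # Assumption: the completions route lives at /v1/completions.
    url = f"http://{host}:{port}/v1/completions"
    body = json.dumps({"prompt": prompt, "max_tokens": max_tokens}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    # Needs a running server; returns the decoded JSON response.
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, send(build_completion_request("Hello")) returns the
parsed JSON reply. Building the request separately from sending it keeps the
URL and payload easy to inspect without a live server.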
The server itself is implemented in llama_cpp/server/app.py.
"""
import os
import uvicorn
from llama_cpp.server.app import create_app
if __name__ == "__main__":
    app = create_app()

    uvicorn.run(
        app, host=os.getenv("HOST", "localhost"), port=int(os.getenv("PORT", 8000))
    )