diff --git a/README.md b/README.md index 6dbd79d..ccd66ea 100644 --- a/README.md +++ b/README.md @@ -505,14 +505,14 @@ This allows you to use llama.cpp compatible models with any OpenAI compatible cl To install the server package and get started: ```bash -pip install llama-cpp-python[server] +pip install 'llama-cpp-python[server]' python3 -m llama_cpp.server --model models/7B/llama-model.gguf ``` Similar to Hardware Acceleration section above, you can also install with GPU (cuBLAS) support like this: ```bash -CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python[server] +CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install 'llama-cpp-python[server]' python3 -m llama_cpp.server --model models/7B/llama-model.gguf --n_gpu_layers 35 ```