docs: Add embeddings section
parent f736827b9b
commit c2a234a086
1 changed file with 16 additions and 0 deletions

README.md (+16)
@@ -398,6 +398,22 @@ llama = Llama(
)
```
### Embeddings

`llama-cpp-python` supports generating embeddings from the text.
```python
import llama_cpp

llm = llama_cpp.Llama(model_path="path/to/model.gguf", embedding=True)

embeddings = llm.create_embedding("Hello, world!")

# or batched
embeddings = llm.create_embedding(["Hello, world!", "Goodbye, world!"])
```
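`create_embedding` returns an OpenAI-style response dict rather than a bare list of floats. A minimal sketch, assuming that response shape, of pulling the vectors out and comparing two of them with cosine similarity (the tiny 3-dimensional vectors below are stand-ins; a real model returns hundreds of dimensions):

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity between two equal-length float vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def extract_vectors(response):
    # Pull the raw vectors out of an OpenAI-style embedding response,
    # ordered by each entry's "index" field.
    data = sorted(response["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in data]

# Stand-in response (shape assumed from the OpenAI embeddings format).
fake_response = {
    "object": "list",
    "data": [
        {"object": "embedding", "embedding": [1.0, 0.0, 0.0], "index": 0},
        {"object": "embedding", "embedding": [0.0, 1.0, 0.0], "index": 1},
    ],
}

vecs = extract_vectors(fake_response)
print(cosine_similarity(vecs[0], vecs[0]))  # 1.0 (identical vectors)
print(cosine_similarity(vecs[0], vecs[1]))  # 0.0 (orthogonal vectors)
```

With a real model, replace `fake_response` with the dict returned by `llm.create_embedding(...)`.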
### Adjusting the Context Window

The context window of the Llama models determines the maximum number of tokens that can be processed at once. By default, this is set to 512 tokens, but can be adjusted based on your requirements.
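A minimal sketch of what that limit means in practice: if a tokenized prompt is longer than the window, something has to be dropped before the model can process it. The sliding-window truncation policy below is illustrative, not the library's internal behavior, and the 512-token default mirrors the text above:

```python
def fit_to_context(tokens, n_ctx=512):
    # Keep only the most recent n_ctx tokens so the prompt fits in
    # the model's context window (a simple truncation policy).
    if len(tokens) <= n_ctx:
        return tokens
    return tokens[-n_ctx:]

tokens = list(range(600))        # pretend token ids
window = fit_to_context(tokens)  # default 512-token window
print(len(window))               # 512
print(window[0])                 # 88 (the oldest 88 tokens were dropped)
```

In `llama-cpp-python`, the window itself is set with the `n_ctx` parameter when constructing the model, e.g. `Llama(model_path="path/to/model.gguf", n_ctx=2048)`.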