# Development

- Install cmake (and, optionally, the tools required for GPU support)
- run `go generate ./...`
- run `go build .`

Install the required tools:
```
brew install go cmake gcc
```

Get the required libraries:
```
go generate ./...
```

Then build ollama:
```
go build .
```

Now you can run `ollama`:
```
./ollama
```