Daniel Hiltgen
6459377ae0
Add ROCm support to linux install script ( #2966 )
2024-03-14 18:00:16 -07:00
Jeffrey Morgan
5ce997a7b9
Update README.md
2024-03-13 21:12:17 -07:00
Patrick Devine
ba7cf7fb66
add more docs for the modelfile message command ( #3087 )
2024-03-12 16:41:41 -07:00
Daniel Hiltgen
b53229a2ed
Add docs explaining GPU selection env vars
2024-03-12 11:33:06 -07:00
Jeffrey Morgan
6d3adfbea2
Update troubleshooting.md
2024-03-11 13:22:28 -07:00
Daniel Hiltgen
0fdebb34a9
Doc how to set up ROCm builds on windows
2024-03-09 11:29:45 -08:00
Daniel Hiltgen
4a5c9b8035
Finish unwinding idempotent payload logic
...
The recent ROCm change partially removed idempotent
payloads, but the ggml-metal.metal file for mac was still
idempotent. This finishes switching to always extract
the payloads, and now that idempotency is gone, the
version directory is no longer useful.
2024-03-09 08:34:39 -08:00
Jeffrey Morgan
6c0af2599e
Update docs README.md and table of contents
2024-03-08 22:45:11 -08:00
Daniel Hiltgen
280da44522
Merge pull request #2988 from dhiltgen/rocm_docs
...
Refined ROCm troubleshooting docs
2024-03-08 13:33:30 -08:00
Jeffrey Morgan
b886bec3f9
Update api.md
2024-03-07 23:27:51 -08:00
Daniel Hiltgen
69f0227813
Refined ROCm troubleshooting docs
2024-03-07 11:22:37 -08:00
Daniel Hiltgen
6c5ccb11f9
Revamp ROCm support
...
This refines where we extract the LLM libraries to by adding a new
OLLAMA_HOME env var that defaults to `~/.ollama`. The logic was already
idempotent, so this should speed up startups after the first time a
new release is deployed. It also cleans up after itself.
We now build only a single ROCm version (latest major) on both windows
and linux. Given the large size of ROCm's tensor files, we split the
dependency out. It's bundled into the installer on windows, and a
separate download on linux. The linux install script is now smart and
detects the presence of AMD GPUs and looks to see if rocm v6 is already
present, and if not, then downloads our dependency tar file.
For Linux discovery, we now use sysfs and check each GPU against what
ROCm supports so we can degrade to CPU gracefully instead of having
llama.cpp+rocm assert/crash on us. For Windows, we now use go's windows
dynamic library loading logic to access the amdhip64.dll APIs to query
the GPU information.
2024-03-07 10:36:50 -08:00
Jeffrey Morgan
d481fb3cc8
update go to 1.22 in other places ( #2975 )
2024-03-07 07:39:49 -08:00
John
23ebe8fe11
fix some typos ( #2973 )
...
Signed-off-by: hishope <csqiye@126.com>
2024-03-06 22:50:11 -08:00
Jeffrey Morgan
ce9f7c4674
Update api.md
2024-03-05 13:13:23 -08:00
Jeffrey Morgan
3b4bab3dc5
Fix embeddings load model behavior ( #2848 )
2024-02-29 17:40:56 -08:00
elthommy
1f087c4d26
Update langchain python tutorial ( #2737 )
...
Remove unused GPT4all
Use nomic-embed-text as embedded model
Fix a deprecation warning (__call__)
2024-02-25 00:31:36 -05:00
Jeffrey Morgan
bdc0ea1ba5
Update import.md
2024-02-22 02:08:03 -05:00
Jeffrey Morgan
7fab7918cc
Update import.md
2024-02-22 02:06:24 -05:00
Jeffrey Morgan
f0425d3de9
Update faq.md
2024-02-20 20:44:45 -05:00
Jeffrey Morgan
8125ce4cb6
Update import.md
...
Add instructions to get public key on windows
2024-02-19 22:48:24 -05:00
Jeffrey Morgan
df56f1ee5e
Update faq.md
2024-02-19 22:16:42 -05:00
Jeffrey Morgan
41aca5c2d0
Update faq.md
2024-02-19 21:11:01 -05:00
Jeffrey Morgan
753724d867
Update api.md to include examples for reproducible outputs
2024-02-19 20:36:16 -05:00
Patrick Devine
9a7a4b9533
add faqs for memory pre-loading and the keep_alive setting ( #2601 )
2024-02-19 14:45:25 -08:00
Daniel Hiltgen
b338c0635f
Document setting server vars for windows
2024-02-19 13:30:46 -08:00
Tristan Rhodes
9774663013
Update faq.md with the location of models on Windows ( #2545 )
2024-02-16 11:04:19 -08:00
Daniel Hiltgen
1ba734de67
typo
2024-02-15 14:56:55 -08:00
Daniel Hiltgen
29e90cc13b
Implement new Go based Desktop app
...
This focuses on Windows first, but could be used for Mac
and possibly linux in the future.
2024-02-15 05:56:45 +00:00
Jeffrey Morgan
48a273f80b
Fix issues with templating prompt in chat mode ( #2460 )
2024-02-12 15:06:57 -08:00
Jeffrey Morgan
1c8435ffa9
Update domain name references in docs and install script ( #2435 )
2024-02-09 15:19:30 -08:00
Jeffrey Morgan
42b797ed9c
Update openai.md
2024-02-08 15:03:23 -05:00
Jeffrey Morgan
336aa43f3c
Update openai.md
2024-02-08 12:48:28 -05:00
Jeffrey Morgan
ab0d37fde4
Update openai.md
2024-02-07 17:25:33 -05:00
Jeffrey Morgan
14e71350c8
Update openai.md
2024-02-07 17:25:24 -05:00
Jeffrey Morgan
453f572f83
Initial OpenAI /v1/chat/completions API compatibility ( #2376 )
2024-02-07 17:24:29 -05:00
Bruce MacDonald
128fce5495
docs: keep_alive ( #2258 )
2024-02-06 11:00:05 -05:00
Jeffrey Morgan
b9f91a0b36
Update import instructions to use convert and quantize tooling from llama.cpp submodule ( #2247 )
2024-02-05 00:50:44 -05:00
Jeffrey Morgan
f0e9496c85
Update api.md
2024-02-02 12:17:24 -08:00
Daniel Hiltgen
e7dbb00331
Add container hints for troubleshooting
...
Some users are new to containers and unsure where the server logs go
2024-01-29 08:53:41 -08:00
Daniel Hiltgen
e02ecfb6c8
Merge pull request #2116 from dhiltgen/cc_50_80
...
Add support for CUDA 5.0 cards
2024-01-27 10:28:38 -08:00
Jeffrey Morgan
5be9bdd444
Update modelfile.md
2024-01-25 16:29:48 -08:00
Jeffrey Morgan
b706794905
Update modelfile.md to include MESSAGE
2024-01-25 16:29:32 -08:00
Michael Yang
93a756266c
faq: update to use launchctl setenv
2024-01-22 13:10:13 -08:00
Daniel Hiltgen
df54c723ae
Make CPU builds parallel and AMD GPU targets customizable
...
The linux build now supports parallel CPU builds to speed things up.
This also exposes AMD GPU targets as an optional setting for advanced
users who want to alter our default set.
2024-01-21 15:12:21 -08:00
Daniel Hiltgen
a447a083f2
Add compute capability 5.0, 7.5, and 8.0
2024-01-20 14:24:05 -08:00
Daniel Hiltgen
abec7f06e5
Merge pull request #2056 from dhiltgen/slog
...
Mechanical switch from log to slog
2024-01-18 14:27:24 -08:00
Daniel Hiltgen
ecbfc0182f
Go bump to v1.21 to pick up slog
2024-01-18 14:12:57 -08:00
Daniel Hiltgen
fedd705aea
Mechanical switch from log to slog
...
A few obvious levels were adjusted, but generally everything mapped to "info" level.
2024-01-18 14:12:57 -08:00
Daniel Hiltgen
9cd20b0ec8
Refine the linux cuda/rocm developer docs
2024-01-18 09:44:44 -08:00