ollama/llm
Daniel Hiltgen b05c9e83d9
Introduce GPU Overhead env var (#5922)
Provide a mechanism for users to set aside an amount of VRAM on each GPU
to make room for other applications they want to start after Ollama, or to work
around memory prediction bugs (see the sketch below)
2024-09-05 13:46:35 -07:00
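The idea in #5922 is that the requested overhead is read from an environment variable and subtracted from each GPU's driver-reported free VRAM before the scheduler decides how many model layers fit. Below is a minimal Go sketch of that flow, assuming the variable is named OLLAMA_GPU_OVERHEAD and takes a byte count; gpuOverhead, GpuInfo, and usableVRAM are illustrative stand-ins, not the actual code in memory.go:

```go
// Sketch: apply a user-requested per-GPU VRAM overhead during memory
// estimation. Only the OLLAMA_GPU_OVERHEAD name is taken from the PR;
// everything else here is a hypothetical simplification.
package main

import (
	"fmt"
	"log"
	"os"
	"strconv"
)

// gpuOverhead reads OLLAMA_GPU_OVERHEAD (bytes to reserve on each GPU).
// Unset or unparsable values fall back to zero, reserving nothing.
func gpuOverhead() uint64 {
	raw := os.Getenv("OLLAMA_GPU_OVERHEAD")
	if raw == "" {
		return 0
	}
	n, err := strconv.ParseUint(raw, 10, 64)
	if err != nil {
		log.Printf("invalid OLLAMA_GPU_OVERHEAD %q, ignoring: %v", raw, err)
		return 0
	}
	return n
}

// GpuInfo is a stand-in for the scheduler's per-GPU record.
type GpuInfo struct {
	ID       string
	FreeVRAM uint64 // bytes reported free by the driver
}

// usableVRAM is the budget layer placement would actually work with:
// driver-reported free memory minus the user-requested overhead.
func usableVRAM(g GpuInfo) uint64 {
	overhead := gpuOverhead()
	if overhead >= g.FreeVRAM {
		return 0 // overhead swallows the whole GPU; nothing fits
	}
	return g.FreeVRAM - overhead
}

func main() {
	g := GpuInfo{ID: "GPU-0", FreeVRAM: 8 << 30} // 8 GiB free
	fmt.Printf("%s: %d bytes usable for model layers\n", g.ID, usableVRAM(g))
}
```

Starting the server with, for example, OLLAMA_GPU_OVERHEAD=1073741824 would then keep roughly 1 GiB of VRAM per GPU free for other applications.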
ext_server llm: use json.hpp from common (#6642) 2024-09-04 19:34:42 -04:00
generate llm: update llama.cpp commit to 8962422 (#6618) 2024-09-03 21:12:39 -04:00
llama.cpp@8962422b1c llm: update llama.cpp commit to 8962422 (#6618) 2024-09-03 21:12:39 -04:00
patches llm: update llama.cpp commit to 8962422 (#6618) 2024-09-03 21:12:39 -04:00
filetype.go Add support for IQ1_S, IQ3_S, IQ2_S, IQ4_XS, IQ4_NL (#4322) 2024-05-23 13:21:49 -07:00
ggla.go update convert test to check result data 2024-07-31 10:59:38 -07:00
ggml.go Merge pull request #6260 from ollama/mxyng/mem 2024-09-05 13:22:08 -07:00
ggml_test.go llm: speed up gguf decoding by a lot (#5246) 2024-06-24 21:47:52 -07:00
gguf.go add conversion for microsoft phi 3 mini/medium 4k, 128 2024-08-12 15:13:29 -07:00
llm.go lint 2024-08-01 17:06:06 -07:00
llm_darwin_amd64.go Enable windows error dialog for subprocess startup 2024-07-22 14:07:27 -07:00
llm_darwin_arm64.go Enable windows error dialog for subprocess startup 2024-07-22 14:07:27 -07:00
llm_linux.go Enable windows error dialog for subprocess startup 2024-07-22 14:07:27 -07:00
llm_windows.go Enable windows error dialog for subprocess startup 2024-07-22 14:07:27 -07:00
memory.go Introduce GPU Overhead env var (#5922) 2024-09-05 13:46:35 -07:00
memory_test.go llama3.1 2024-08-21 11:49:31 -07:00
payload.go Add Jetson cuda variants for arm 2024-08-19 09:38:53 -07:00
server.go Log system memory at info (#6617) 2024-09-03 14:55:20 -07:00
status.go Catch one more error log 2024-08-05 09:28:07 -07:00