ollama/llm
Daniel Hiltgen f457d63400 Implement linux NUMA detection
If the system has multiple NUMA nodes, enable NUMA support in llama.cpp.
If we detect numactl in the PATH, use that; otherwise use the basic "distribute" mode.
2024-08-05 12:56:20 -07:00
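
The decision described in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the code from `server.go`: the helper names `countNUMANodes` and `numaArgs` are hypothetical, the sysfs path is the usual Linux location for NUMA node entries, and the `--numa numactl` / `--numa distribute` arguments assume the llama.cpp server's `--numa` option.

```go
package main

import (
	"fmt"
	"os/exec"
	"path/filepath"
)

// countNUMANodes counts filesystem entries matching pattern.
// On Linux, NUMA nodes normally appear as
// /sys/devices/system/node/node0, node1, ...
// (hypothetical helper, not taken from the ollama source).
func countNUMANodes(pattern string) int {
	matches, err := filepath.Glob(pattern)
	if err != nil {
		// Glob only fails on a malformed pattern; treat that as "no nodes".
		return 0
	}
	return len(matches)
}

// numaArgs returns extra arguments for the llama.cpp server
// (hypothetical helper). With a single node (or none detected)
// it returns nil, so NUMA support stays off.
func numaArgs(nodes int, haveNumactl bool) []string {
	if nodes <= 1 {
		return nil
	}
	if haveNumactl {
		// numactl is available: let it control placement.
		return []string{"--numa", "numactl"}
	}
	// Fall back to the basic "distribute" mode.
	return []string{"--numa", "distribute"}
}

func main() {
	nodes := countNUMANodes("/sys/devices/system/node/node[0-9]*")
	// LookPath searches the directories listed in PATH.
	_, err := exec.LookPath("numactl")
	fmt.Println(numaArgs(nodes, err == nil))
}
```

Keeping the decision in a pure function (`numaArgs`) separate from the probing of sysfs and PATH makes the policy easy to test on machines that have no NUMA hardware.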
ext_server line feed 2024-08-04 17:25:41 -07:00
generate Adjust windows ROCm discovery 2024-07-20 15:17:50 -07:00
llama.cpp@6eeaeba126 update llama.cpp submodule to 6eeaeba1 (#6039) 2024-07-29 13:20:26 -07:00
patches patches: phi3 default sliding window attention 2024-07-31 14:58:34 -07:00
filetype.go Add support for IQ1_S, IQ3_S, IQ2_S, IQ4_XS. IQ4_NL (#4322) 2024-05-23 13:21:49 -07:00
ggla.go update convert test to check result data 2024-07-31 10:59:38 -07:00
ggml.go update convert test to check result data 2024-07-31 10:59:38 -07:00
ggml_test.go llm: speed up gguf decoding by a lot (#5246) 2024-06-24 21:47:52 -07:00
gguf.go comments 2024-07-31 15:58:55 -07:00
llm.go lint 2024-08-01 17:06:06 -07:00
llm_darwin_amd64.go Enable windows error dialog for subprocess startup 2024-07-22 14:07:27 -07:00
llm_darwin_arm64.go Enable windows error dialog for subprocess startup 2024-07-22 14:07:27 -07:00
llm_linux.go Enable windows error dialog for subprocess startup 2024-07-22 14:07:27 -07:00
llm_windows.go Enable windows error dialog for subprocess startup 2024-07-22 14:07:27 -07:00
memory.go handle asymmetric embedding KVs 2024-06-20 09:57:27 -07:00
memory_test.go lint 2024-08-01 17:06:06 -07:00
payload.go Fix corner cases on tmp cleaner on mac 2024-07-03 13:10:14 -07:00
server.go Implement linux NUMA detection 2024-08-05 12:56:20 -07:00
status.go fix error detection by limiting model loading error parsing (#5472) 2024-07-03 20:04:30 -04:00