ollama

Daniel Hiltgen 283948c83b Adjust windows ROCm discovery. The v5 hip library returns unsupported GPUs, which won't enumerate at inference time in the runner, so this makes sure we align discovery. The gfx906 cards are no longer supported, so we shouldn't compile with that GPU type, as it won't enumerate at runtime.	2024-07-20 15:17:50 -07:00
ext_server	Introduce `/api/embed` endpoint supporting batch embedding (#5127)	2024-07-15 12:14:24 -07:00
generate	Adjust windows ROCm discovery	2024-07-20 15:17:50 -07:00
llama.cpp@a8db2a9ce6	Update llama.cpp submodule to `a8db2a9c` (#5530)	2024-07-07 13:03:09 -04:00
patches	add patch for tekken (#5807)	2024-07-20 13:41:21 -04:00
filetype.go	Add support for IQ1_S, IQ3_S, IQ2_S, IQ4_XS, IQ4_NL (#4322)	2024-05-23 13:21:49 -07:00
ggla.go	llm: speed up gguf decoding by a lot (#5246)	2024-06-24 21:47:52 -07:00
ggml.go	chatglm graph	2024-07-10 13:43:47 -07:00
ggml_test.go	llm: speed up gguf decoding by a lot (#5246)	2024-06-24 21:47:52 -07:00
gguf.go	add chat and generate tests with mock runner	2024-07-16 09:39:31 -07:00
llm.go	fix: quant err message (#5616)	2024-07-11 17:24:29 -07:00
llm_darwin_amd64.go	Switch back to subprocessing for llama.cpp	2024-04-01 16:48:18 -07:00
llm_darwin_arm64.go	Switch back to subprocessing for llama.cpp	2024-04-01 16:48:18 -07:00
llm_linux.go	Switch back to subprocessing for llama.cpp	2024-04-01 16:48:18 -07:00
llm_windows.go	Move nested payloads to installer and zip file on windows	2024-04-23 16:14:47 -07:00
memory.go	handle asymmetric embedding KVs	2024-06-20 09:57:27 -07:00
memory_test.go	llm: speed up gguf decoding by a lot (#5246)	2024-06-24 21:47:52 -07:00
payload.go	Fix corner cases on tmp cleaner on mac	2024-07-03 13:10:14 -07:00
server.go	Adjust windows ROCm discovery	2024-07-20 15:17:50 -07:00
status.go	fix error detection by limiting model loading error parsing (#5472)	2024-07-03 20:04:30 -04:00