ollama/llm

Latest commit: 1f50356e8e by Daniel Hiltgen, 2024-07-10 11:01:22 -07:00

    Bump ROCm on windows to 6.1.2

    This also adjusts our algorithm to favor our bundled ROCm.
    I've confirmed VRAM reporting still doesn't work properly so we
    can't yet enable concurrency by default.
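The note about favoring the bundled ROCm describes a search-order preference when locating runtime libraries: check the copy shipped next to the binary before any system-wide install. Below is a minimal, hypothetical Go sketch of that kind of lookup; the `rocmLibDir` helper and the `rocm` subdirectory name are illustrative assumptions, not ollama's actual implementation (only `HIP_PATH` is a real ROCm environment variable on Windows).

    package main

    import (
    	"fmt"
    	"os"
    	"path/filepath"
    )

    // rocmLibDir returns the directory to load ROCm libraries from,
    // preferring a copy bundled alongside the executable over a
    // system-wide install. Hypothetical sketch only; ollama's real
    // lookup logic differs in detail.
    func rocmLibDir() (string, error) {
    	exe, err := os.Executable()
    	if err != nil {
    		return "", err
    	}
    	// A bundled ROCm shipped next to the binary wins if present.
    	bundled := filepath.Join(filepath.Dir(exe), "rocm")
    	if _, err := os.Stat(bundled); err == nil {
    		return bundled, nil
    	}
    	// Otherwise fall back to a system install, if one is configured.
    	if sys := os.Getenv("HIP_PATH"); sys != "" {
    		return filepath.Join(sys, "bin"), nil
    	}
    	return "", fmt.Errorf("no ROCm installation found")
    }

    func main() {
    	dir, err := rocmLibDir()
    	if err != nil {
    		fmt.Println("ROCm unavailable:", err)
    		return
    	}
    	fmt.Println("loading ROCm libraries from", dir)
    }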
Name                  Last commit message                                                   Date
ext_server/           llm: allow gemma 2 to context shift (#5534)                           2024-07-07 13:41:51 -04:00
generate/             Bump ROCm on windows to 6.1.2                                         2024-07-10 11:01:22 -07:00
llama.cpp@a8db2a9ce6  Update llama.cpp submodule to a8db2a9c (#5530)                        2024-07-07 13:03:09 -04:00
patches/              Update llama.cpp submodule to a8db2a9c (#5530)                        2024-07-07 13:03:09 -04:00
filetype.go           Add support for IQ1_S, IQ3_S, IQ2_S, IQ4_XS, IQ4_NL (#4322)           2024-05-23 13:21:49 -07:00
ggla.go               llm: speed up gguf decoding by a lot (#5246)                          2024-06-24 21:47:52 -07:00
ggml.go               gemma2 graph                                                          2024-06-27 13:34:52 -07:00
ggml_test.go          llm: speed up gguf decoding by a lot (#5246)                          2024-06-24 21:47:52 -07:00
gguf.go               llm: speed up gguf decoding by a lot (#5246)                          2024-06-24 21:47:52 -07:00
llm.go                Statically link c++ and thread lib                                    2024-07-09 11:34:30 -07:00
llm_darwin_amd64.go   Switch back to subprocessing for llama.cpp                            2024-04-01 16:48:18 -07:00
llm_darwin_arm64.go   Switch back to subprocessing for llama.cpp                            2024-04-01 16:48:18 -07:00
llm_linux.go          Switch back to subprocessing for llama.cpp                            2024-04-01 16:48:18 -07:00
llm_windows.go        Move nested payloads to installer and zip file on windows             2024-04-23 16:14:47 -07:00
memory.go             handle asymmetric embedding KVs                                       2024-06-20 09:57:27 -07:00
memory_test.go        llm: speed up gguf decoding by a lot (#5246)                          2024-06-24 21:47:52 -07:00
payload.go            Fix corner cases on tmp cleaner on mac                                2024-07-03 13:10:14 -07:00
server.go             Merge pull request #5126 from ollama/mxyng/messages                   2024-07-09 09:20:44 -07:00
status.go             fix error detection by limiting model loading error parsing (#5472)   2024-07-03 20:04:30 -04:00