ollama/llm
Latest commit: 380378cc80 "Use our libraries first" by Daniel Hiltgen, 2024-05-06 14:23:29 -07:00
Trying to live off the land for CUDA libraries was not the right strategy. We need to use the version we compiled against to ensure things work properly.
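The fix this commit describes amounts to making the dynamic loader resolve Ollama's bundled CUDA libraries before any system-installed copies when the llama.cpp runner subprocess is launched. Below is a minimal sketch of that idea in Go; the startRunner helper, its signature, and the binary and directory paths are illustrative assumptions, not the actual server.go code.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"path/filepath"
	"strings"
)

// startRunner launches a llama.cpp runner subprocess with the bundled
// library directory placed ahead of any system paths, so the dynamic
// loader resolves the CUDA libraries the runner was compiled against
// before any system-installed copies.
// Hypothetical sketch: the helper name, paths, and flags are illustrative.
func startRunner(runnerBin, bundledLibDir string, args ...string) (*exec.Cmd, error) {
	const pathVar = "LD_LIBRARY_PATH" // macOS would use DYLD_LIBRARY_PATH

	// Put our directory first, then whatever was already on the path.
	libPaths := []string{bundledLibDir}
	if existing := os.Getenv(pathVar); existing != "" {
		libPaths = append(libPaths, existing)
	}

	// Rebuild the environment without any prior LD_LIBRARY_PATH entry so
	// the child sees exactly one definition, with our directory first.
	var env []string
	for _, kv := range os.Environ() {
		if !strings.HasPrefix(kv, pathVar+"=") {
			env = append(env, kv)
		}
	}
	env = append(env, fmt.Sprintf("%s=%s",
		pathVar, strings.Join(libPaths, string(filepath.ListSeparator))))

	cmd := exec.Command(runnerBin, args...)
	cmd.Env = env
	return cmd, cmd.Start()
}

func main() {
	// Illustrative paths only; the real runner layout is managed elsewhere.
	cmd, err := startRunner(
		"/opt/ollama/runners/cuda_v11/ollama_llama_server",
		"/opt/ollama/runners/cuda_v11",
		"--port", "50051",
	)
	if err != nil {
		fmt.Fprintln(os.Stderr, "failed to start runner:", err)
		os.Exit(1)
	}
	fmt.Println("runner started, pid", cmd.Process.Pid)
}
```

Shipping and preferring a known CUDA version trades some disk space for predictability: the runner always links against the libraries it was built and tested with, regardless of what happens to be installed on the host.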
Name                    Last commit message                                         Last commit date
ext_server              omit prompt and generate settings from final response      2024-05-03 17:00:02 -07:00
generate                Do not build AVX runners on ARM64                           2024-04-26 23:55:32 -06:00
llama.cpp@952d03dbea    update llama.cpp commit to 952d03d                          2024-04-30 17:31:20 -04:00
patches                 Fix llava models not working after first request (#4164)   2024-05-05 20:50:31 -07:00
ggla.go                 refactor tensor query                                       2024-04-10 11:37:20 -07:00
ggml.go                 fix: mixtral graph                                          2024-04-22 17:19:44 -07:00
gguf.go                 fixes for gguf (#3863)                                      2024-04-23 20:57:20 -07:00
llm.go                  Add import declaration for windows,arm64 to llm.go          2024-04-26 23:23:53 -06:00
llm_darwin_amd64.go     Switch back to subprocessing for llama.cpp                  2024-04-01 16:48:18 -07:00
llm_darwin_arm64.go     Switch back to subprocessing for llama.cpp                  2024-04-01 16:48:18 -07:00
llm_linux.go            Switch back to subprocessing for llama.cpp                  2024-04-01 16:48:18 -07:00
llm_windows.go          Move nested payloads to installer and zip file on windows   2024-04-23 16:14:47 -07:00
memory.go               Centralize server config handling                           2024-05-05 16:49:50 -07:00
payload.go              Move nested payloads to installer and zip file on windows   2024-04-23 16:14:47 -07:00
server.go               Use our libraries first                                     2024-05-06 14:23:29 -07:00
status.go               Switch back to subprocessing for llama.cpp                  2024-04-01 16:48:18 -07:00