ollama/llm
Daniel Hiltgen db2a9ad1fe Explicitly disable AVX2 on GPU builds
Even though we weren't setting it to on, somewhere in the cmake config
it was getting toggled on. By explicitly setting it to off, we get `/arch:AVX`
as intended.
2024-02-15 14:50:11 -08:00
ext_server set shutting_down to false once shutdown is complete (#2484) 2024-02-13 17:48:41 -08:00
generate Explicitly disable AVX2 on GPU builds 2024-02-15 14:50:11 -08:00
llama.cpp@6c00a06692 Revert "Revert "bump submodule to 6c00a06 (#2479)"" (#2485) 2024-02-13 18:18:41 -08:00
patches patch: always add token to cache_tokens (#2459) 2024-02-12 08:10:16 -08:00
dyn_ext_server.c Switch to local dlopen symbols 2024-01-19 11:37:02 -08:00
dyn_ext_server.go Shutdown faster 2024-02-08 22:22:50 -08:00
dyn_ext_server.h Always dynamically load the llm server library 2024-01-11 08:42:47 -08:00
ggml.go add max context length check 2024-01-12 14:54:07 -08:00
gguf.go refactor tensor read 2024-01-24 10:48:31 -08:00
llama.go use llm.ImageData 2024-01-31 19:13:48 -08:00
llm.go Ensure the libraries are present 2024-02-07 17:27:49 -08:00
payload_common.go Detect AMD GPU info via sysfs and block old cards 2024-02-12 08:19:41 -08:00
payload_darwin_amd64.go Add multiple CPU variants for Intel Mac 2024-01-17 15:08:54 -08:00
payload_darwin_arm64.go Add multiple CPU variants for Intel Mac 2024-01-17 15:08:54 -08:00
payload_linux.go Add multiple CPU variants for Intel Mac 2024-01-17 15:08:54 -08:00
payload_test.go Fix up the CPU fallback selection 2024-01-11 15:27:06 -08:00
payload_windows.go Add multiple CPU variants for Intel Mac 2024-01-17 15:08:54 -08:00
utils.go partial decode ggml bin for more info 2023-08-10 09:23:10 -07:00