Daniel Hiltgen
e1f50377f4
Harden generate patching model
...
Only apply patches if we have any, and make sure to clean up
every file we patched at the end to leave the tree clean
2024-02-01 19:34:36 -08:00
Daniel Hiltgen
e02ecfb6c8
Merge pull request #2116 from dhiltgen/cc_50_80
...
Add support for CUDA 5.0 cards
2024-01-27 10:28:38 -08:00
Jeffrey Morgan
a64570dcae
Fix clearing kv cache between requests with the same prompt (#2186)
...
* Fix clearing kv cache between requests with the same prompt
* fix powershell script
2024-01-25 13:46:20 -08:00
Daniel Hiltgen
a447a083f2
Add compute capability 5.0, 7.5, and 8.0
2024-01-20 14:24:05 -08:00
Jeffrey Morgan
4c54f0ddeb
sign dylibs on macOS (#2101)
2024-01-19 19:24:11 -05:00
Jeffrey Morgan
dc88cc3981
use gzip for runner embedding (#2067)
2024-01-19 13:23:03 -05:00
Daniel Hiltgen
1b249748ab
Add multiple CPU variants for Intel Mac
...
This also refines the build process for the ext_server build.
2024-01-17 15:08:54 -08:00
Jeffrey Morgan
288ef8ff95
add gcc -lstdc++ flag for linux cpu (#1974)
2024-01-13 03:53:00 -05:00
Jeffrey Morgan
4cf17990f7
use g++ to build libext_server.so on linux (#1972)
2024-01-13 03:12:42 -05:00
Daniel Hiltgen
d88c527be3
Build multiple CPU variants and pick the best
...
This reduces the built-in linux version to not use any vector extensions,
which enables the resulting builds to run under Rosetta on macOS in
Docker. Then at runtime it checks for the actual CPU vector
extensions and loads the best CPU library available.
2024-01-11 08:42:47 -08:00
Bruce MacDonald
3367b5f3df
remove unused generate patches (#1810)
2024-01-05 11:25:45 -05:00
Daniel Hiltgen
9983fa5f4e
Clean up stale submodule
...
If the tree has a stale submodule, make sure we clean it up first
2024-01-04 13:40:16 -08:00
Daniel Hiltgen
77d96da94b
Code shuffle to clean up the llm dir
2024-01-04 12:12:05 -08:00