Daniel Hiltgen
841adda157
Fix windows lint CI flakiness
2024-04-02 12:22:16 -07:00
Daniel Hiltgen
0035e31af8
Bump to b2581
2024-04-02 11:53:07 -07:00
Daniel Hiltgen
c863c6a96d
Merge pull request #3218 from dhiltgen/subprocess
...
Switch back to subprocessing for llama.cpp
2024-04-02 10:49:44 -07:00
Daniel Hiltgen
1f11b52511
Refined min memory from testing
2024-04-01 16:48:33 -07:00
Daniel Hiltgen
526d4eb204
Release gpu discovery library after use
...
Leaving the cudart library loaded kept ~30 MB of memory
pinned in the GPU in the main process. This change ensures
we don't hold GPU resources when idle.
2024-04-01 16:48:33 -07:00
Daniel Hiltgen
0a74cb31d5
Safeguard for noexec
...
We may have users that run into problems with our current
payload model, so this gives us an escape valve.
2024-04-01 16:48:33 -07:00
Daniel Hiltgen
10ed1b6292
Detect too-old cuda driver
...
"cudart init failure: 35" isn't particularly helpful in the logs.
2024-04-01 16:48:33 -07:00
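As a rough illustration of the commit above: in the CUDA runtime API, error code 35 is cudaErrorInsufficientDriver, meaning the installed driver is older than the runtime library. The sketch below shows one way to turn that code into a readable log line. The function name and message wording are hypothetical and are not taken from the ollama source.

```python
# cudaErrorInsufficientDriver in the CUDA runtime API.
CUDA_ERROR_INSUFFICIENT_DRIVER = 35


def describe_cudart_init_error(code):
    """Return a log message for a cudart init error code (hypothetical helper)."""
    if code == CUDA_ERROR_INSUFFICIENT_DRIVER:
        # Give the user something actionable instead of a bare number.
        return ("cudart init failure: 35 (driver too old for this CUDA "
                "runtime; upgrade the NVIDIA driver)")
    # Fall back to the terse form for every other code.
    return f"cudart init failure: {code}"


if __name__ == "__main__":
    print(describe_cudart_init_error(35))
```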
Daniel Hiltgen
4fec5816d6
Integration test improvements
...
Cleaner shutdown logic, a bit of response hardening
2024-04-01 16:48:18 -07:00
Daniel Hiltgen
0a0e9f3e0f
Apply 01-cache.diff
2024-04-01 16:48:18 -07:00
Daniel Hiltgen
58d95cc9bd
Switch back to subprocessing for llama.cpp
...
This should resolve a number of memory leak and stability defects by allowing
us to isolate llama.cpp in a separate process, shut it down when idle, and
gracefully restart it if it has problems. This also serves as a first step
toward running multiple copies to support multiple models concurrently.
2024-04-01 16:48:18 -07:00
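The pattern described in the commit above (run the backend in its own process, stop it when idle, restart it if it dies) can be sketched roughly as follows. This is a minimal illustration in Python, not the project's implementation; the Runner class, its method names, and the idle timeout are all assumptions made for the example.

```python
import subprocess
import sys
import time


class Runner:
    """Hypothetical supervisor for a single helper process."""

    def __init__(self, cmd, idle_timeout=300.0):
        self.cmd = cmd
        self.idle_timeout = idle_timeout
        self.proc = None
        self.last_used = 0.0

    def is_running(self):
        # poll() returns None while the child is still alive.
        return self.proc is not None and self.proc.poll() is None

    def ensure_running(self):
        # Start the child on first use, or restart it if it has exited.
        if not self.is_running():
            self.proc = subprocess.Popen(self.cmd)
        self.last_used = time.monotonic()
        return self.proc

    def stop(self):
        # Ask politely first, then force the child down if it ignores us.
        if self.is_running():
            self.proc.terminate()
            try:
                self.proc.wait(timeout=5)
            except subprocess.TimeoutExpired:
                self.proc.kill()
                self.proc.wait()
        self.proc = None

    def reap_if_idle(self, now=None):
        # Stop the child once it has been unused for idle_timeout seconds.
        now = time.monotonic() if now is None else now
        if self.is_running() and now - self.last_used >= self.idle_timeout:
            self.stop()
            return True
        return False


if __name__ == "__main__":
    # A sleeping Python child stands in for the real backend here.
    runner = Runner([sys.executable, "-c", "import time; time.sleep(60)"],
                    idle_timeout=0.1)
    runner.ensure_running()
    time.sleep(0.2)
    print("stopped:", runner.reap_if_idle())
```

Because all state lives in the child, any memory it leaks is returned to the system when the process exits, which is the isolation benefit the commit message points to.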
Patrick Devine
3b6a9154dd
Simplify model conversion ( #3422 )
2024-04-01 16:14:53 -07:00
Michael Yang
d6dd2ff839
Merge pull request #3241 from ollama/mxyng/mem
...
update memory estimations for gpu offloading
2024-04-01 13:59:14 -07:00
Michael Yang
e57a6ba89f
Merge pull request #2926 from ollama/mxyng/decode-ggml-v2
...
refactor model parsing
2024-04-01 13:58:13 -07:00
Michael Yang
12ec2346ef
Merge pull request #3442 from ollama/mxyng/generate-output
...
fix generate output
2024-04-01 13:56:09 -07:00
Michael Yang
1ec0df1069
fix generate output
2024-04-01 13:47:34 -07:00
Michael Yang
91b3e4d282
update memory calculations
...
count each layer independently when deciding gpu offloading
2024-04-01 13:16:32 -07:00
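One way to read "count each layer independently" is to add layer sizes one at a time and stop at the first layer that no longer fits in GPU memory. The sketch below illustrates that idea only; the function name, the overhead parameter, and the byte figures are hypothetical, and the project's real estimation logic may differ.

```python
def layers_that_fit(layer_sizes, available, overhead=0):
    """Return how many leading layers fit in `available` bytes.

    `overhead` is a hypothetical fixed cost reserved before any layer
    is placed on the GPU.
    """
    used = overhead
    count = 0
    for size in layer_sizes:
        # Stop at the first layer that would exceed the budget.
        if used + size > available:
            break
        used += size
        count += 1
    return count


if __name__ == "__main__":
    print(layers_that_fit([100, 200, 300], available=350))
```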
Michael Yang
d338d70492
refactor model parsing
2024-04-01 13:16:15 -07:00
Philipp Gillé
011bb67351
Add chromem-go to community integrations ( #3437 )
2024-04-01 11:17:37 -04:00
Saifeddine ALOUI
d124627202
Update README.md ( #3436 )
2024-04-01 11:16:31 -04:00
Jesse Zhang
b0a8246a69
Community Integration: CRAG Ollama Chat ( #3423 )
...
Corrective Retrieval Augmented Generation Demo, powered by Langgraph and Streamlit 🤗
Support:
- Ollama
- OpenAI APIs
2024-04-01 11:16:14 -04:00
Yaroslav
e6fb39c182
Update README.md ( #3378 )
...
Plugins list updated
2024-03-31 13:10:05 -04:00
sugarforever
e1f1c374ea
Community Integration: ChatOllama ( #3400 )
...
* Community Integration: ChatOllama
* fixed typo
2024-03-30 22:46:50 -04:00
Jeffrey Morgan
06a1508bfe
Update 90_bug_report.yml
2024-03-29 10:11:17 -04:00
Patrick Devine
5a5efee46b
Add gemma safetensors conversion ( #3250 )
...
Co-authored-by: Michael Yang <mxyng@pm.me>
2024-03-28 18:54:01 -07:00
Daniel Hiltgen
97ae517fbf
Merge pull request #3398 from dhiltgen/release_latest
...
CI automation for tagging latest images
2024-03-28 16:25:54 -07:00
Daniel Hiltgen
44b813e459
Merge pull request #3377 from dhiltgen/rocm_v6_bump
...
Bump ROCm to 6.0.2 patch release
2024-03-28 16:07:54 -07:00
Daniel Hiltgen
539043f5e0
CI automation for tagging latest images
2024-03-28 16:07:37 -07:00
Daniel Hiltgen
dbcace6847
Merge pull request #3392 from dhiltgen/ci_build_win_cuda
...
CI windows gpu builds
2024-03-28 16:03:52 -07:00
Daniel Hiltgen
c91a4ebcff
Bump ROCm to 6.0.2 patch release
2024-03-28 15:58:57 -07:00
Daniel Hiltgen
b79c7e4528
CI windows gpu builds
...
If we're doing generate, test windows cuda and rocm as well
2024-03-28 14:39:10 -07:00
Michael Yang
035b274b70
Merge pull request #3379 from ollama/mxyng/origins
...
fix: trim quotes on OLLAMA_ORIGINS
2024-03-28 14:14:18 -07:00
Michael Yang
9c6a254945
Merge pull request #3391 from ollama/mxyng-patch-1
2024-03-28 13:15:56 -07:00
Michael Yang
f31f2bedf4
Update troubleshooting link
2024-03-28 12:05:26 -07:00
Michael Yang
756c257553
Merge pull request #3380 from ollama/mxyng/conditional-generate
...
fix: workflows
2024-03-28 00:35:27 +01:00
Michael Yang
5255d0af8a
fix: workflows
2024-03-27 16:30:01 -07:00
Michael Yang
af8a8a6b59
fix: trim quotes on OLLAMA_ORIGINS
2024-03-27 15:24:29 -07:00
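The fix above concerns values such as OLLAMA_ORIGINS="http://a.example" where the quote characters end up inside the variable. A minimal sketch of trimming them before use follows. The parse_origins helper is hypothetical, and treating the value as a comma-separated list is an assumption of this example.

```python
import os


def parse_origins(raw):
    """Split a raw origins string into a list (hypothetical helper)."""
    # Drop surrounding whitespace, then any quote characters at either end.
    raw = raw.strip().strip("\"'")
    # Assumed format: comma-separated origins; skip empty entries.
    return [part.strip() for part in raw.split(",") if part.strip()]


if __name__ == "__main__":
    print(parse_origins(os.environ.get("OLLAMA_ORIGINS", "")))
```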
Michael Yang
461ad25015
Merge pull request #3376 from ollama/mxyng/conditional-generate
...
only generate on changes to llm subdirectory
2024-03-27 22:12:53 +01:00
Michael Yang
8838ae787d
stub stub
2024-03-27 13:59:12 -07:00
Michael Yang
db75402ade
mangle arch
2024-03-27 13:44:50 -07:00
Michael Yang
1e85a140a3
only generate on changes to llm subdirectory
2024-03-27 12:45:35 -07:00
Michael Yang
c363282fdc
Merge pull request #3375 from ollama/mxyng/conditional-generate
...
only generate cuda/rocm when changes to llm detected
2024-03-27 20:40:55 +01:00
Michael Yang
5b0c48d29e
only generate cuda/rocm when changes to llm detected
2024-03-27 12:23:09 -07:00
Jeffrey Morgan
913306f4fd
Detect arrow keys on windows ( #3363 )
...
* detect arrow keys on windows
* add some helpful comments
2024-03-26 18:21:56 -04:00
Jeffrey Morgan
f5ca7f8c8e
add license in file header for vendored llama.cpp code ( #3351 )
2024-03-26 16:23:23 -04:00
Jeffrey Morgan
856b8ec131
remove need for $VSINSTALLDIR since build will fail if ninja cannot be found ( #3350 )
2024-03-26 16:23:16 -04:00
Patrick Devine
1b272d5bcd
change github.com/jmorganca/ollama to github.com/ollama/ollama ( #3347 )
2024-03-26 13:04:17 -07:00
Christophe Dervieux
29715dbca7
malformed markdown link ( #3358 )
2024-03-26 10:46:36 -04:00
Daniel Hiltgen
54a028d07f
Merge pull request #3356 from dhiltgen/fix_arm_linux
...
Switch runner for final release job
2024-03-25 20:54:46 -07:00
Daniel Hiltgen
f83e4db365
Switch runner for final release job
...
The manifest and tagging step use a lot of disk space
2024-03-25 20:51:40 -07:00
Daniel Hiltgen
3b5866a233
Merge pull request #3353 from dhiltgen/fix_arm_linux
...
Use Rocky Linux Vault to get GCC 10.2 installed
2024-03-25 19:38:56 -07:00