Daniel Hiltgen
50ee8b5f56
Merge pull request #6186 from dhiltgen/numa
...
Implement linux NUMA detection
2024-08-05 15:20:06 -07:00
Michael Yang
03bdac0595
Merge pull request #6146 from ollama/mxyng/testing
...
use testing tempdirs
2024-08-05 13:00:05 -07:00
Daniel Hiltgen
f457d63400
Implement linux NUMA detection
...
If the system has multiple NUMA nodes, enable NUMA support in llama.cpp.
If we detect numactl in the path, use that; otherwise use the basic "distribute" mode.
2024-08-05 12:56:20 -07:00
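
The decision described in this commit message can be sketched roughly as follows. This is an illustrative sketch only, not the code from the commit: the helper names `countNUMANodes` and `chooseNUMAMode` are hypothetical, and counting `node*` directories under `/sys/devices/system/node` is an assumed detection method that may differ from what the commit actually does. The mode strings `numactl` and `distribute` are values that llama.cpp accepts for its `--numa` option.

```go
package main

import (
	"fmt"
	"os/exec"
	"path/filepath"
)

// countNUMANodes counts NUMA node directories exposed by Linux sysfs.
// Assumption: this is one plausible way to detect nodes; the real commit
// may inspect different files.
func countNUMANodes() int {
	matches, err := filepath.Glob("/sys/devices/system/node/node[0-9]*")
	if err != nil {
		return 0
	}
	return len(matches)
}

// chooseNUMAMode (hypothetical helper) returns the NUMA mode to request,
// or "" when NUMA support should stay off.
func chooseNUMAMode(nodes int, hasNumactl bool) string {
	if nodes <= 1 {
		return "" // single node (or detection failed): nothing to enable
	}
	if hasNumactl {
		return "numactl" // prefer numactl when it is on the PATH
	}
	return "distribute" // basic fallback mode
}

func main() {
	_, err := exec.LookPath("numactl")
	mode := chooseNUMAMode(countNUMANodes(), err == nil)
	if mode == "" {
		fmt.Println("single NUMA node detected: leaving NUMA support off")
		return
	}
	fmt.Println("would pass: --numa", mode)
}
```

Keeping the decision in a pure function separate from the sysfs and PATH probing makes the policy easy to test without a multi-socket machine.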
Michael Yang
39f2bc6bfc
Merge pull request #6167 from ollama/mxyng/line-feed
...
line feed
2024-08-05 00:06:28 -07:00
frob
b73b0940ef
Disable paging for journalctl ( #6154 )
...
Users who collect logs with `journalctl` for issue reports sometimes don't realize that paging causes part of the output to be missed.
2024-08-05 00:10:53 -04:00
Michael Yang
6a07344786
line feed
2024-08-04 17:25:41 -07:00
sryu1
8b920f35a4
Add Gemma 2 2b ( #6151 )
2024-08-04 10:58:39 -04:00
Ivan Charapanau
4221e39867
Reference ollama integration with Harbor ( #6147 )
2024-08-02 17:03:46 -07:00
Michael Yang
a091fadfda
use testing tempdirs
2024-08-02 16:04:06 -07:00
Michael Yang
77ccbf04dc
Merge pull request #6128 from ollama/mxyng/lint
...
enable gofmt/gofumpt/goimports/tenv
2024-08-02 14:58:40 -07:00
royjhan
4addf6b587
Update OpenAI Compatibility Docs with /v1/completions ( #5311 )
...
* Update docs
* token bug corrected
* Update docs/openai.md
* Update docs/openai.md
* add suffix
* merge conflicts
* merge conflicts
2024-08-02 13:16:23 -07:00
royjhan
85c7f11170
Update docs ( #5310 )
2024-08-02 13:05:57 -07:00
Michael Yang
b732beba6a
lint
2024-08-01 17:06:06 -07:00
Kim Hallberg
ce1fb4447e
Fix models/{model} URL ( #6132 )
2024-08-01 16:31:47 -07:00
royjhan
558a54b098
Update OpenAI Compatibility Docs with /v1/embeddings ( #5470 )
...
* docs without usage
* no usage
* rm metric note
2024-08-01 16:00:29 -07:00
royjhan
ed52833bb1
Add to docs ( #5309 )
2024-08-01 15:58:13 -07:00
royjhan
6f133a0bdd
OpenAI: Add Usage to v1/embeddings ( #5886 )
...
* add prompt tokens to embed response
* rm slog
* metrics
* types
* prompt n
* clean up
* reset submodule
* add tokens to v1/embeddings
* separate usage
2024-08-01 15:49:37 -07:00
royjhan
f561eecfb8
Update OpenAI Compatibility Docs with /v1/models ( #5151 )
...
* OpenAI Docs
* Update docs/openai.md
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
* Remove newline
---------
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
2024-08-01 15:48:44 -07:00
Michael Yang
ff7c9060ec
Merge pull request #6115 from slouffka/fix-context
...
Fix context in /api/generate grows too much ( #5980 ).
2024-08-01 15:13:59 -07:00
Michael Yang
0ff42e84b0
Merge pull request #4756 from ollama/mxyng/convert2
...
refactor convert
2024-08-01 14:16:30 -07:00
Vyacheslav Moskalev
8a9f946ca7
Refactor and format code.
2024-08-02 03:50:05 +07:00
Vyacheslav Moskalev
3b5210548e
Refactor code. Remove extra variable.
2024-08-01 19:56:15 +07:00
Vyacheslav Moskalev
b0c216584c
Better types and naming closer to style.
2024-08-01 19:43:44 +07:00
Vyacheslav Moskalev
49a5483139
Change the order of context and prompt.
2024-08-01 19:25:56 +07:00
Vyacheslav Moskalev
6bc5c13758
Fix extra context concatenation in generate handler ( #5980 ).
2024-08-01 15:45:58 +07:00
Michael Yang
3e614260af
Merge pull request #6109 from ollama/mxyng/fix-modelfile
...
fix modelfile message quotes
2024-07-31 17:05:43 -07:00
Michael Yang
d87b4a488e
fix modelfile message quotes
2024-07-31 16:52:09 -07:00
Michael Yang
4c14855ad7
Merge pull request #6106 from ollama/mxyng/default-sliding-window-attention
...
patches: phi3 optional sliding window attention
2024-07-31 16:12:06 -07:00
Blake Mizerany
dc77bbcfa4
server: fix json marshalling of downloadBlobPart ( #6108 )
2024-07-31 16:01:24 -07:00
Michael Yang
d8e2664c33
convert: fix parse functions
2024-07-31 15:58:55 -07:00
Michael Yang
eafc607abb
convert: only extract large files
2024-07-31 15:58:55 -07:00
Michael Yang
781fc2d576
Update convert/reader_safetensors.go
...
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
2024-07-31 15:58:55 -07:00
Michael Yang
df993fa37b
comments
2024-07-31 15:58:55 -07:00
Michael Yang
5e9db9fb0b
refactor convert
2024-07-31 15:58:33 -07:00
Michael Yang
0f3271db88
patches: phi3 default sliding window attention
2024-07-31 14:58:34 -07:00
Michael Yang
6b252918fb
update convert test to check result data
2024-07-31 10:59:38 -07:00
Michael Yang
c4c84b7a0d
Merge pull request #5196 from ollama/mxyng/messages-2
...
include modelfile messages
2024-07-31 10:18:17 -07:00
Michael Yang
5c1912769e
Merge pull request #5473 from ollama/mxyng/environ
...
fix: environ lookup
2024-07-31 10:18:05 -07:00
Daniel Nguyen
71399aa682
Added BoltAI as a desktop UI for Ollama ( #6096 )
2024-07-31 08:44:58 -07:00
Jeffrey Morgan
463a8aa273
Create SECURITY.md
2024-07-30 21:01:12 -07:00
Michael
3579b4966a
Update README to include Firebase Genkit ( #6083 )
...
Firebase Genkit
2024-07-30 18:40:09 -07:00
Jeffrey Morgan
5d66578356
Update README.md
...
Better example for multi-modal input
2024-07-30 18:08:34 -07:00
jmorganca
afa8d6e9d5
patch gemma support
2024-07-30 18:07:29 -07:00
royjhan
1b44d873e7
Add Metrics to api\embed response ( #5709 )
...
* add prompt tokens to embed response
* rm slog
* metrics
* types
* prompt n
* clean up
* reset submodule
* update tests
* test name
* list metrics
2024-07-30 13:12:21 -07:00
Daniel Hiltgen
cef2c6054d
Merge pull request #5859 from dhiltgen/homogeneous_gpus
...
Prevent partial loading on mixed GPU brands
2024-07-30 11:06:42 -07:00
Daniel Hiltgen
345420998e
Prevent partial loading on mixed GPU brands
...
In multi-brand GPU setups, if we couldn't fully load the model we
would fall through the scheduler and mistakenly try to load across
a mix of brands. This makes sure we find the set of GPU(s) that
best fits the partial load.
2024-07-30 11:00:55 -07:00
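
One way to picture the idea of keeping a partial load on a single brand is the sketch below. It is a simplified illustration, not the scheduler code from this commit: the `gpu` type, its fields, and `bestSingleBrand` are hypothetical, and "best fit" is approximated here as the brand group with the most total free memory, which is an assumption.

```go
package main

import "fmt"

// gpu is a hypothetical, simplified device record.
type gpu struct {
	Brand   string // e.g. "cuda" or "rocm" (illustrative labels)
	FreeMiB uint64 // free memory reported for the device
}

// bestSingleBrand groups GPUs by brand and returns the group with the most
// total free memory, so a partial load never spans a mix of brands.
// Ties go to the brand seen first in the input.
func bestSingleBrand(gpus []gpu) []gpu {
	groups := map[string][]gpu{}
	free := map[string]uint64{}
	var order []string // first-seen order keeps the result deterministic
	for _, g := range gpus {
		if _, ok := groups[g.Brand]; !ok {
			order = append(order, g.Brand)
		}
		groups[g.Brand] = append(groups[g.Brand], g)
		free[g.Brand] += g.FreeMiB
	}
	best := -1
	for i, b := range order {
		if best < 0 || free[b] > free[order[best]] {
			best = i
		}
	}
	if best < 0 {
		return nil // no GPUs at all
	}
	return groups[order[best]]
}

func main() {
	picked := bestSingleBrand([]gpu{
		{Brand: "cuda", FreeMiB: 8000},
		{Brand: "rocm", FreeMiB: 6000},
		{Brand: "rocm", FreeMiB: 6000},
	})
	for _, g := range picked {
		fmt.Println(g.Brand, g.FreeMiB)
	}
}
```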
Kim Hallberg
0be8baad2b
Update and Fix example models ( #6065 )
...
* Update example models
* Remove unused README.md
2024-07-29 23:56:37 -07:00
Daniel Hiltgen
1a83581a8e
Merge pull request #5895 from dhiltgen/sched_faq
...
Better explain multi-gpu behavior
2024-07-29 14:25:41 -07:00
Daniel Hiltgen
37926eb991
Merge pull request #5927 from dhiltgen/high_cpu_count
...
Ensure amd gpu nodes are numerically sorted
2024-07-29 14:24:57 -07:00
Daniel Hiltgen
3d4634fdff
Merge pull request #5934 from dhiltgen/missing_cuda_repo
...
Report better error on cuda unsupported os/arch
2024-07-29 14:24:20 -07:00