Commit graph

689 commits

Author SHA1 Message Date
Michael Yang
b732beba6a lint 2024-08-01 17:06:06 -07:00
Michael Yang
ff7c9060ec
Merge pull request #6115 from slouffka/fix-context
Fix context in /api/generate grows too much (#5980).
2024-08-01 15:13:59 -07:00
Michael Yang
0ff42e84b0
Merge pull request #4756 from ollama/mxyng/convert2
refactor convert
2024-08-01 14:16:30 -07:00
Vyacheslav Moskalev
8a9f946ca7 Refactor and format code. 2024-08-02 03:50:05 +07:00
Vyacheslav Moskalev
3b5210548e Refactor code. Remove extra variable. 2024-08-01 19:56:15 +07:00
Vyacheslav Moskalev
b0c216584c Better types and naming closer to style. 2024-08-01 19:43:44 +07:00
Vyacheslav Moskalev
49a5483139 Change the order of context and prompt. 2024-08-01 19:25:56 +07:00
Vyacheslav Moskalev
6bc5c13758 Fix extra context concatenation in generate handler (#5980). 2024-08-01 15:45:58 +07:00
Michael Yang
d87b4a488e fix modelfile message quotes 2024-07-31 16:52:09 -07:00
Blake Mizerany
dc77bbcfa4
server: fix json marshalling of downloadBlobPart (#6108) 2024-07-31 16:01:24 -07:00
Michael Yang
eafc607abb convert: only extract large files 2024-07-31 15:58:55 -07:00
Michael Yang
df993fa37b comments 2024-07-31 15:58:55 -07:00
Michael Yang
5e9db9fb0b refactor convert 2024-07-31 15:58:33 -07:00
Michael Yang
c4c84b7a0d
Merge pull request #5196 from ollama/mxyng/messages-2
include modelfile messages
2024-07-31 10:18:17 -07:00
Michael Yang
5c1912769e
Merge pull request #5473 from ollama/mxyng/environ
fix: environ lookup
2024-07-31 10:18:05 -07:00
royjhan
1b44d873e7
Add Metrics to api\embed response (#5709)
* add prompt tokens to embed response

* rm slog

* metrics

* types

* prompt n

* clean up

* reset submodule

* update tests

* test name

* list metrics
2024-07-30 13:12:21 -07:00
Daniel Hiltgen
345420998e Prevent partial loading on mixed GPU brands
In mult-brand GPU setups, if we couldn't fully load the model we
would fall through the scheduler and mistakenly try to load across
a mix of brands.  This makes sure we find the set of GPU(s) that
best fit for the partial load.
2024-07-30 11:00:55 -07:00
Michael Yang
079b2c3b03
Merge pull request #5999 from ollama/mxyng/fix-push
fix nil deref in auth.go
2024-07-26 14:28:34 -07:00
Blake Mizerany
750c1c55f7
server: fix race conditions during download (#5994)
This fixes various data races scattered throughout the download/pull
client where the client was accessing the download state concurrently.

This commit is mostly a hot-fix and will be replaced by a new client one
day soon.

Also, remove the unnecessary opts argument from downloadChunk.
2024-07-26 14:24:24 -07:00
Michael Yang
a622c47bd3 fix nil deref in auth.go 2024-07-26 14:14:48 -07:00
Michael Yang
ec4c35fe99
Merge pull request #5512 from ollama/mxyng/detect-stop
autodetect stop parameters from template
2024-07-26 13:48:23 -07:00
Michael Yang
15af558423 include modelfile messages 2024-07-26 11:40:11 -07:00
Blake Mizerany
c8af3c2d96
server: reuse original download URL for images (#5962)
This changes the registry client to reuse the original download URL
it gets on the first redirect response for all subsequent requests,
preventing thundering herd issues when hot new LLMs are released.
2024-07-25 15:58:30 -07:00
Josh
db0968f30c
fix dupe err message (#5857) 2024-07-22 15:48:15 -07:00
Michael Yang
85d9d73a72 comments 2024-07-22 11:49:03 -07:00
Michael Yang
1954ec5917 uint64 2024-07-22 11:49:02 -07:00
Michael Yang
0f1910129f int 2024-07-22 11:30:07 -07:00
Michael Yang
8570c1c0ef keepalive 2024-07-22 11:27:22 -07:00
Michael Yang
55cd3ddcca bool 2024-07-22 11:27:21 -07:00
Michael Yang
66fe77f084 models 2024-07-22 11:26:12 -07:00
Michael Yang
d1a5227cad origins 2024-07-22 11:25:30 -07:00
Michael Yang
35b89b2eab rfc: dynamic environ lookup 2024-07-22 11:25:30 -07:00
Jeffrey Morgan
b3e5491e41
server: collect nested tool call objects when parsing (#5824) 2024-07-22 12:38:03 -04:00
Jeffrey Morgan
80ee9b5e47
Remove out of space test temporarily (#5825) 2024-07-21 00:22:11 -04:00
Daniel Hiltgen
06e5d74e34
Merge pull request #5506 from dhiltgen/sched_tests
Refine scheduler unit tests for reliability
2024-07-20 15:48:39 -07:00
Jeffrey Morgan
69a2d4ccff
Fix generate test flakyness (#5804) 2024-07-19 19:11:25 -07:00
Josh
e8b954c646
server: validate template (#5734)
add template validation to modelfile
2024-07-19 15:24:29 -07:00
Michael Yang
43606d6d6a fix parsing tool calls 2024-07-18 12:08:11 -07:00
Jeffrey Morgan
70b1010fa5
server: check for empty tools array too (#5779) 2024-07-18 11:44:57 -07:00
Jeffrey Morgan
319fb1ce03
server: only parse tool calls if tools are provided (#5771)
* server: only parse tool calls if tools are provided

* still set `resp.Message.Content`
2024-07-18 08:50:23 -07:00
Michael Yang
b255445557
marshal json automatically for some template values (#5758) 2024-07-17 15:35:11 -07:00
Michael Yang
5fd6988126 parse tool call as individual objects 2024-07-17 11:19:04 -07:00
Michael Yang
c279f96371 remove ToolCall from GenerateResponse 2024-07-16 15:22:49 -07:00
Michael Yang
499e87c9ba
Merge pull request #5730 from ollama/mxyng/cleanup
remove unneeded tool calls
2024-07-16 14:42:13 -07:00
Michael Yang
d290e87513 add suffix support to generate endpoint
this change is triggered by the presence of "suffix", particularly
useful for code completion tasks
2024-07-16 14:31:35 -07:00
Michael Yang
5a83f79afd remove unneeded tool calls 2024-07-16 13:48:45 -07:00
royjhan
987dbab0b0
OpenAI: /v1/embeddings compatibility (#5285)
* OpenAI v1 models

* Empty List Testing

* Add back envconfig

* v1/models docs

* Remove Docs

* OpenAI batch embed compatibility

* merge conflicts

* integrate with api/embed

* ep

* merge conflicts

* request tests

* rm resp test

* merge conflict

* merge conflict

* test fixes

* test fn renaming

* input validation for empty string

---------

Co-authored-by: jmorganca <jmorganca@gmail.com>
2024-07-16 13:36:08 -07:00
Michael Yang
a8388beb94
Merge pull request #5726 from ollama/mxyng/tools-templates
fix unmarshal type errors
2024-07-16 12:12:10 -07:00
Michael Yang
5afbb60fc4 fix unmarshal type errors 2024-07-16 11:39:34 -07:00
Jeffrey Morgan
4cb5d7decc
server: omit model system prompt if empty (#5717) 2024-07-16 11:09:00 -07:00