Commit graph

3356 commits

Author SHA1 Message Date
Michael Yang
cd0853f2d5
Merge pull request #5207 from ollama/mxyng/suffix
add insert support to generate endpoint
2024-07-16 14:37:32 -07:00
Michael Yang
d290e87513 add suffix support to generate endpoint
this change is triggered by the presence of "suffix", particularly
useful for code completion tasks
2024-07-16 14:31:35 -07:00
Thorsten Sommer
97c20ede33
README: Added AI Studio to the list of UIs (#5721)
* Added AI Studio to the list of UIs
2024-07-16 14:24:27 -07:00
Michael Yang
5a83f79afd remove unneeded tool calls 2024-07-16 13:48:45 -07:00
royjhan
987dbab0b0
OpenAI: /v1/embeddings compatibility (#5285)
* OpenAI v1 models

* Empty List Testing

* Add back envconfig

* v1/models docs

* Remove Docs

* OpenAI batch embed compatibility

* merge conflicts

* integrate with api/embed

* ep

* merge conflicts

* request tests

* rm resp test

* merge conflict

* merge conflict

* test fixes

* test fn renaming

* input validation for empty string

---------

Co-authored-by: jmorganca <jmorganca@gmail.com>
2024-07-16 13:36:08 -07:00
Michael Yang
a8388beb94
Merge pull request #5726 from ollama/mxyng/tools-templates
fix unmarshal type errors
2024-07-16 12:12:10 -07:00
Michael Yang
5afbb60fc4 fix unmarshal type errors 2024-07-16 11:39:34 -07:00
Jeffrey Morgan
4cb5d7decc
server: omit model system prompt if empty (#5717) 2024-07-16 11:09:00 -07:00
87345eda1b
Ditch the runner container entirely and use build environment as the runner environment
Running the binary outside the build environment crashes with signal 127 and i am unable to debug why

Signed-off-by: baalajimaestro <me@baalajimaestro.me>
2024-07-16 22:42:01 +05:30
Michael Yang
8eac50dd4f
Merge pull request #5684 from ollama/mxyng/tests
add chat and generate tests with mock runner
2024-07-16 09:44:45 -07:00
Michael Yang
4a565cbf94 add chat and generate tests with mock runner 2024-07-16 09:39:31 -07:00
696e20eeae
Merge https://github.com/ollama/ollama
Signed-off-by: baalajimaestro <me@baalajimaestro.me>
2024-07-16 21:50:57 +05:30
Michael Yang
64039df6d7
Merge pull request #5284 from ollama/mxyng/tools
tools
2024-07-15 18:03:37 -07:00
Jeffrey Morgan
7ac6d462ec
server: return empty slice on empty /api/embed request (#5713)
* server: return empty slice on empty `/api/embed` request

* fix tests
2024-07-15 17:39:44 -07:00
Michael Yang
ef5136a745 tools test 2024-07-15 17:18:21 -07:00
Daniel Hiltgen
8288ec8824
Merge pull request #5710 from dhiltgen/rocm_bump
Bump linux ROCm to 6.1.2
2024-07-15 15:32:18 -07:00
Michael Yang
d02bbebb11 tools 2024-07-15 15:26:16 -07:00
Daniel Hiltgen
224337b32f Bump linux ROCm to 6.1.2 2024-07-15 15:10:22 -07:00
Jeffrey Morgan
9e35d9bbee
server: lowercase roles for compatibility with clients (#5695) 2024-07-15 13:55:57 -07:00
royjhan
b9f5e16c80
Introduce /api/embed endpoint supporting batch embedding (#5127)
* Initial Batch Embedding

* Revert "Initial Batch Embedding"

This reverts commit c22d54895a280b54c727279d85a5fc94defb5a29.

* Initial Draft

* mock up notes

* api/embed draft

* add server function

* check normalization

* clean up

* normalization

* playing around with truncate stuff

* Truncation

* Truncation

* move normalization to go

* Integration Test Template

* Truncation Integration Tests

* Clean up

* use float32

* move normalize

* move normalize test

* refactoring

* integration float32

* input handling and handler testing

* Refactoring of legacy and new

* clear comments

* merge conflicts

* touches

* embedding type 64

* merge conflicts

* fix hanging on single string

* refactoring

* test values

* set context length

* clean up

* testing clean up

* testing clean up

* remove function closure

* Revert "remove function closure"

This reverts commit 55d48c6ed17abe42e7a122e69d603ef0c1506787.

* remove function closure

* remove redundant error check

* clean up

* more clean up

* clean up
2024-07-15 12:14:24 -07:00
8c6402d194
Merge https://github.com/ollama/ollama 2024-07-14 16:51:20 +05:30
royjhan
e9f7f36029
Support image input for OpenAI chat compatibility (#5208)
* OpenAI v1 models

* Refactor Writers

* Add Test

Co-Authored-By: Attila Kerekes

* Credit Co-Author

Co-Authored-By: Attila Kerekes <439392+keriati@users.noreply.github.com>

* Empty List Testing

* Use Namespace for Ownedby

* Update Test

* Add back envconfig

* v1/models docs

* Use ModelName Parser

* Test Names

* Remove Docs

* Clean Up

* Test name

Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>

* Add Middleware for Chat and List

* Testing Cleanup

* Test with Fatal

* Add functionality to chat test

* Support image input for OpenAI chat

* Decoding

* Fix message processing logic

* openai vision test

* type errors

* clean up

* redundant check

* merge conflicts

* merge conflicts

* merge conflicts

* flattening and smaller image

* add test

* support python and js SDKs and mandate prefixing

* clean up

---------

Co-authored-by: Attila Kerekes <439392+keriati@users.noreply.github.com>
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
2024-07-13 22:07:45 -07:00
Patrick Devine
057d31861e
remove template (#5655) 2024-07-13 20:56:24 -07:00
jmorganca
f7ee012300 server: prepend system message in chat handler 2024-07-13 15:08:00 -07:00
Jeffrey Morgan
1ed0aa8fea
server: fix context, load_duration and total_duration fields (#5676)
* server: fix `contet`, `load_duration` and `total_duration` fields

* Update server/routes.go
2024-07-13 09:25:31 -07:00
Jeffrey Morgan
ef98803d63
llm: looser checks for minimum memory (#5677) 2024-07-13 09:20:05 -07:00
Jarek
02fea420e5
Add Kerlig AI, an app for macOS (#5675) 2024-07-13 08:33:46 -07:00
Michael Yang
22c5451fc2
fix system prompt (#5662)
* fix system prompt

* execute template when hitting previous roles

* fix tests

---------

Co-authored-by: jmorganca <jmorganca@gmail.com>
2024-07-12 21:04:44 -07:00
Michael Yang
ebc529cbb3 autodetect stop parameters from template 2024-07-12 16:01:23 -07:00
Patrick Devine
23ebbaa46e Revert "remove template from tests"
This reverts commit 9ac0a7a50b.
2024-07-12 15:47:17 -07:00
Patrick Devine
9ac0a7a50b remove template from tests 2024-07-12 15:41:31 -07:00
Michael Yang
e5c65a85df
Merge pull request #5653 from ollama/mxyng/collect-system
template: preprocess message and collect system
2024-07-12 12:32:34 -07:00
Jeffrey Morgan
33627331a3
app: also clean up tempdir runners on install (#5646) 2024-07-12 12:29:23 -07:00
Michael Yang
36c87c433b template: preprocess message and collect system 2024-07-12 12:26:43 -07:00
Jeffrey Morgan
179737feb7
Clean up old files when installing on Windows (#5645)
* app: always clean up install dir; force close applications

* remove wildcard

* revert `CloseApplications`

* whitespace

* update `LOCALAPPDATA` var
2024-07-11 22:53:46 -07:00
Michael Yang
47353f5ee4
Merge pull request #5639 from ollama/mxyng/unaggregated-system 2024-07-11 17:48:50 -07:00
Josh
10e768826c
fix: quant err message (#5616) 2024-07-11 17:24:29 -07:00
Michael Yang
5056bb9c01 rename aggregate to contents 2024-07-11 17:00:26 -07:00
Jeffrey Morgan
c4cf8ad559
llm: avoid loading model if system memory is too small (#5637)
* llm: avoid loading model if system memory is too small

* update log

* Instrument swap free space

On linux and windows, expose how much swap space is available
so we can take that into consideration when scheduling models

* use `systemSwapFreeMemory` in check

---------

Co-authored-by: Daniel Hiltgen <daniel@ollama.com>
2024-07-11 16:42:57 -07:00
Michael Yang
57ec6901eb revert embedded templates to use prompt/response
This reverts commit 19753c18c0.

for compat. messages will be added at a later date
2024-07-11 14:49:35 -07:00
Michael Yang
e64f9ebb44 do no automatically aggregate system messages 2024-07-11 14:49:35 -07:00
Jeffrey Morgan
791650ddef
sched: only error when over-allocating system memory (#5626) 2024-07-11 00:53:12 -07:00
Jeffrey Morgan
efbf41ed81
llm: dont link cuda with compat libs (#5621) 2024-07-10 20:01:52 -07:00
Michael Yang
cf15589851
Merge pull request #5620 from ollama/mxyng/templates
update embedded templates
2024-07-10 17:16:24 -07:00
Michael Yang
19753c18c0 update embedded templates 2024-07-10 17:03:08 -07:00
Michael Yang
41be28096a add system prompt to first legacy template 2024-07-10 17:03:08 -07:00
Michael Yang
37a570f962
Merge pull request #5612 from ollama/mxyng/mem
chatglm graph
2024-07-10 14:18:33 -07:00
Michael Yang
5a739ff4cb chatglm graph 2024-07-10 13:43:47 -07:00
Jeffrey Morgan
4e262eb2a8
remove GGML_CUDA_FORCE_MMQ=on from build (#5588) 2024-07-10 13:17:13 -07:00
Daniel Hiltgen
4cfcbc328f
Merge pull request #5124 from dhiltgen/amd_windows
Wire up windows AMD driver reporting
2024-07-10 12:50:23 -07:00