Commit graph

2876 commits

Author SHA1 Message Date
Jeffrey Morgan
943172cbf4
Update api.md 2024-06-08 23:04:32 -07:00
Nischal Jain
85169e8d6f
Added headless-ollama (#4612) 2024-06-08 18:51:16 -07:00
Jeffrey Morgan
34f142797a
llm: always add bos token to prompt (#4941)
* fix embedding by adding fixes from llama.cpp upstream

* remove assert

---------

Co-authored-by: Jesper Ek <deadbeef84@gmail.com>
2024-06-08 18:47:10 -07:00
Erhan
46a7f1e74a
Update README.md with LangChainRust (#4854) 2024-06-08 17:29:36 -07:00
Daniel Hiltgen
cddc63381c
Merge pull request #4909 from dhiltgen/oneapi_disable
Add ability to skip oneapi generate
2024-06-07 14:07:15 -07:00
Michael Yang
385a32ecb5
Merge pull request #4910 from ollama/mxyng/detect-chat-template
fix create model when template detection errors
2024-06-07 11:07:39 -07:00
Michael Yang
030e765e76 fix create model when template detection errors 2024-06-07 10:51:35 -07:00
Daniel Hiltgen
ab8c929e20 Add ability to skip oneapi generate
This follows the same pattern for cuda and rocm to allow
disabling the build even when we detect the dependent libraries
2024-06-07 08:32:49 -07:00
Jeffrey Morgan
ce0dc33cb8
llm: patch to fix qwen 2 temporarily on nvidia (#4897) 2024-06-06 23:14:33 -07:00
Michael Yang
78f81fc0e5
Merge pull request #4800 from ollama/mxyng/detect-chat-template
detect chat template from KV
2024-06-06 16:17:18 -07:00
Michael Yang
9b6c2e6eb6 detect chat template from KV 2024-06-06 16:03:47 -07:00
royjhan
1a29e9a879
API app/browser access (#4879)
* API app/browser access

* Add tauri (resolves #2291, #4791, #3799, #4388)
2024-06-06 15:19:03 -07:00
royjhan
4bf1da4944
Separate ListResponse and ModelResponse for api/tags vs api/ps (#4842)
* Remove false time fields

* Struct Separation for List and Process

* Remove Marshaler
2024-06-06 10:11:45 -07:00
Blake Mizerany
de5beb06b3 server: skip blob verification for already verified blobs 2024-06-05 16:39:11 -07:00
Sam
98e65929dc
docs(tools): add gollama (#4829) 2024-06-05 14:13:39 -07:00
Michael Yang
22fcf8f7de
Merge pull request #3737 from ollama/mxyng/modelname-4
update create handler to use model.Name
2024-06-05 12:05:05 -07:00
royjhan
28c7813ac4
API PS Documentation (#4822)
* API PS Documentation
2024-06-05 11:06:53 -07:00
Kartikeya Mishra
1d8616d30f
docs: update to add LLocal.in to web & desktop integrations (#4719) 2024-06-04 14:43:59 -07:00
Michael Yang
d61ef8b954 update create handler to use model.Name 2024-06-04 13:28:25 -07:00
Michael Yang
89d9900152
Merge pull request #4570 from ollama/mxyng/slices
lint some of the things
2024-06-04 13:27:05 -07:00
Michael
4a048715b6
local wording was confusing people
local wording was confusing people -- Ollama runs on cloud providers
2024-06-04 13:25:25 -07:00
Michael Yang
6297f85606 gofmt, goimports 2024-06-04 13:20:24 -07:00
Michael Yang
ed56428dd7 warn on intrange, usestdlibvars 2024-06-04 11:52:48 -07:00
Michael Yang
ad40b92b6a disable intrange 2024-06-04 11:35:30 -07:00
Michael Yang
8ce4032e72 more lint 2024-06-04 11:13:30 -07:00
Michael Yang
42660466f8 no usestdlibvars 2024-06-04 11:13:30 -07:00
Michael Yang
e919f6811f lint windows 2024-06-04 11:13:30 -07:00
Michael Yang
bf7edb0d5d lint linux 2024-06-04 11:13:30 -07:00
Michael Yang
f38353d6b9 stdin.fd 2024-06-04 11:13:30 -07:00
Michael Yang
201d853fdf nolintlint 2024-06-04 11:13:30 -07:00
Michael Yang
e40145a39d lint 2024-06-04 11:13:30 -07:00
Michael Yang
c895a7d13f some gocritic 2024-06-04 11:13:30 -07:00
Michael Yang
dad7a987ae nosprintfhostport 2024-06-04 11:13:30 -07:00
Michael Yang
8ffb51749f nolintlint 2024-06-04 11:13:30 -07:00
Michael Yang
55f6eba049 gofmt 2024-06-04 11:13:30 -07:00
Michael Yang
04f3c12bb7 replace x/exp/slices with slices 2024-06-04 11:13:30 -07:00
Shubham
60323e0805
add embed model command and fix question invoke (#4766)
* add embed model command and fix question invoke

* Update docs/tutorials/langchainpy.md

Co-authored-by: Kim Hallberg <hallberg.kim@gmail.com>

* Update docs/tutorials/langchainpy.md

---------

Co-authored-by: Kim Hallberg <hallberg.kim@gmail.com>
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
2024-06-03 22:20:48 -07:00
Jeffrey Morgan
d4a86102fd
update welcome prompt in windows to llama3 (#4779) 2024-06-01 21:05:51 -07:00
Jeffrey Morgan
476fb8e892
Limit GPU lib search for now (#4777)
* fix oneapi errors on windows 10
2024-06-01 19:24:33 -07:00
Michael Yang
829ff87bd1
revert tokenize ffi (#4761)
* Revert "use `int32_t` for call to tokenize (#4738)"

This reverts commit 763bb65dbb.

* Revert "vocab only"

This reverts commit bf54c845e9.

* Revert "use ffi for tokenizing/detokenizing"

This reverts commit 26a00a0410.
2024-05-31 18:54:21 -07:00
Josh
f6b622c4b3
Merge pull request #4733 from ollama/jyan/isvalidname
added IsValidNamespace function
2024-05-31 14:08:45 -07:00
Josh Yan
2e4da8eec2 added tests for IsValidNamespace 2024-05-31 11:48:07 -07:00
Jeffrey Morgan
763bb65dbb
use int32_t for call to tokenize (#4738)
* use `int32_t` for call to tokenize

* variable naming

* cleanup

* fix crash
2024-05-30 21:43:30 -07:00
Jeffrey Morgan
7ca9605f54
speed up tests by only building static lib (#4740) 2024-05-30 21:43:15 -07:00
Michael Yang
eb2c443a79
Merge pull request #4736 from ollama/mxyng/vocab-only
vocab only for tokenize
2024-05-30 17:21:00 -07:00
Michael Yang
278e25ea44
Merge pull request #4737 from ollama/mxyng/less-generate
only generate on relevant changes
2024-05-30 17:17:50 -07:00
Jeffrey Morgan
a50a87a7b8
partial offloading: allow flash attention and disable mmap (#4734)
* partial offloading: allow flash attention and disable mmap

* allow mmap with num_gpu=0
2024-05-30 16:58:01 -07:00
Michael Yang
98085015d5 only generate on relevant changes 2024-05-30 16:54:11 -07:00
Michael Yang
bf54c845e9 vocab only 2024-05-30 16:49:28 -07:00
Josh Yan
c365f195a8 directly use isvalidpart 2024-05-30 16:40:04 -07:00