baalajimaestro/ollama

Commit graph

17df6520c8 Remove mmap related output calc logic Daniel Hiltgen 2024-06-13 09:59:36 -0700
6f351bf586 review comments and coverage Daniel Hiltgen 2024-06-05 12:07:20 -0700
ff4f0cbd1d Prevent multiple concurrent loads on the same gpus Daniel Hiltgen 2024-06-04 14:08:36 -0700
fc37c192ae Refine CPU load behavior with system memory visibility Daniel Hiltgen 2024-06-03 19:09:23 -0700
434dfe30c5 Reintroduce nvidia nvml library for windows Daniel Hiltgen 2024-06-03 15:07:50 -0700
4e2b7e181d Refactor intel gpu discovery Daniel Hiltgen 2024-05-29 16:37:34 -0700
48702dd149 Harden unload for empty runners Daniel Hiltgen 2024-05-30 16:43:40 -0700
68dfc6236a refined test timing Daniel Hiltgen 2024-05-31 14:28:02 -0700
5e8ff556cb Support forced spreading for multi GPU Daniel Hiltgen 2024-05-08 14:32:42 -0700
6fd04ca922 Improve multi-gpu handling at the limit Daniel Hiltgen 2024-05-18 12:34:31 -0700
206797bda4 Fix concurrency integration test to work locally Daniel Hiltgen 2024-05-23 13:12:14 -0700
43ed358f9a Refine GPU discovery to bootstrap once Daniel Hiltgen 2024-05-15 15:13:16 -0700
b32ebb4f29 Use DRM driver for VRAM info for amd Daniel Hiltgen 2024-05-14 16:18:42 -0700
fb9cdfa723 Fix server.cpp for the new cuda build macros Daniel Hiltgen 2024-05-18 16:02:13 -0700
efac488675 Revert "Limit GPU lib search for now (#4777)" Daniel Hiltgen 2024-06-03 08:31:48 -0700
6b800aa7b7 openai: do not set temperature to 0 when setting seed (#5045) Jeffrey Morgan 2024-06-14 13:43:56 -0700
dd7c9ebeaf server: longer timeout in TestRequests (#5046) Jeffrey Morgan 2024-06-14 09:48:25 -0700
4dc7fb9525 update 40xx gpu compat matrix (#5036) Patrick Devine 2024-06-13 20:10:33 -0400
c39761c552 Merge pull request #5032 from dhiltgen/actually_skip Daniel Hiltgen 2024-06-13 13:26:09 -0700
aac367636d Actually skip PhysX on windows Daniel Hiltgen 2024-06-13 13:17:19 -0700
15a687ae4b Merge pull request #5031 from ollama/mxyng/fix-multibyte-utf16 Michael Yang 2024-06-13 13:14:55 -0700
d528e1af75 fix utf16 for multibyte runes Michael Yang 2024-06-13 11:39:01 -0700
cd234ce22c parser: add test for multibyte runes Michael Yang 2024-06-13 11:09:22 -0700
94618b2365 add OLLAMA_MODELS to envconfig (#5029) Patrick Devine 2024-06-13 15:52:03 -0400
1fd236d177 server: remove jwt decoding error (#5027) Jeffrey Morgan 2024-06-13 11:21:15 -0700
e87fc7200d Merge pull request #5025 from ollama/mxyng/revert-parser-scan Michael Yang 2024-06-13 10:31:25 -0700
20b9f8e6f4 Revert "proper utf16 support" Michael Yang 2024-06-13 10:22:16 -0700
c69bc19e46 move OLLAMA_HOST to envconfig (#5009) Patrick Devine 2024-06-12 18:48:16 -0400
bba5d177aa Merge pull request #5004 from ollama/mxyng/fix-templates Michael Yang 2024-06-12 14:39:29 -0700
c16f8af911 fix: multiple templates when creating from model Michael Yang 2024-06-12 13:30:08 -0700
217f60c3d9 Merge pull request #4987 from ollama/mxyng/revert-byte-order Michael Yang 2024-06-11 16:04:20 -0700
7bdcd1da94 Revert "Merge pull request #4938 from ollama/mxyng/fix-byte-order" Michael Yang 2024-06-11 15:55:44 -0700
ead259d877 llm: fix seed value not being applied to requests (#4986) Jeffrey Morgan 2024-06-11 14:24:41 -0700
2ff45d571d Add Ollama-hpp to Community Libraries in README. (#4983) James Montgomery 2024-06-11 14:15:05 -0400
157f09acdf fix: "Skip searching for network devices" jayson-cloude 2024-06-11 16:11:35 +0800
0f3cf1d42e Merge pull request #4715 from ollama/mxyng/utf16-parser Michael Yang 2024-06-10 11:41:29 -0700
5bc029c529 Merge pull request #4921 from ollama/mxyng/import-md Michael Yang 2024-06-10 11:41:09 -0700
e9a9c6a8e8 Merge pull request #4965 from ollama/mxyng/skip-layer-remove Michael Yang 2024-06-10 11:40:03 -0700
515f497e6d fix: skip removing layers that no longer exist Michael Yang 2024-06-10 11:15:03 -0700
b27268aaef add test Michael Yang 2024-06-10 11:31:34 -0700
f5f245cc15 Merge pull request #4938 from ollama/mxyng/fix-byte-order Michael Yang 2024-06-10 09:38:12 -0700
94d37fdcae fix: examples/langchain-python-rag-privategpt/requirements.txt (#3382) Jim Scardelis 2024-06-09 10:58:09 -0700
b84aea1685 Critical fix from llama.cpp JSON grammar to forbid un-escaped escape characters inside strings, which breaks parsing. (#3782) Craig Hughes 2024-06-09 13:57:09 -0400
896495de7b Add instructions to easily install specific versions on faq.md (#4084) Napuh 2024-06-09 19:49:03 +0200
5528dd9d11 Error handling load_single_document() in ingest.py (#4852) dcasota 2024-06-09 19:41:07 +0200
943172cbf4 Update api.md Jeffrey Morgan 2024-06-08 23:04:32 -0700
85169e8d6f Added headless-ollama (#4612) Nischal Jain 2024-06-09 07:21:16 +0530
34f142797a llm: always add bos token to prompt (#4941) Jeffrey Morgan 2024-06-08 18:47:10 -0700
46a7f1e74a Update README.md with LangChainRust (#4854) Erhan 2024-06-09 03:29:36 +0300
620d5c569e fix parsing big endian gguf Michael Yang 2024-06-08 12:32:02 -0700
b9ce7bf75e update import.md Michael Yang 2024-06-07 16:45:15 -0700
cddc63381c Merge pull request #4909 from dhiltgen/oneapi_disable Daniel Hiltgen 2024-06-07 14:07:15 -0700
385a32ecb5 Merge pull request #4910 from ollama/mxyng/detect-chat-template Michael Yang 2024-06-07 11:07:39 -0700
030e765e76 fix create model when template detection errors Michael Yang 2024-06-07 08:55:46 -0700
ab8c929e20 Add ability to skip oneapi generate Daniel Hiltgen 2024-06-07 08:32:49 -0700
ce0dc33cb8 llm: patch to fix qwen 2 temporarily on nvidia (#4897) Jeffrey Morgan 2024-06-06 23:14:33 -0700
78f81fc0e5 Merge pull request #4800 from ollama/mxyng/detect-chat-template Michael Yang 2024-06-06 16:17:18 -0700
9b6c2e6eb6 detect chat template from KV Michael Yang 2024-06-03 11:06:29 -0700
1a29e9a879 API app/browser access (#4879) royjhan 2024-06-06 15:19:03 -0700
4bf1da4944 Separate ListResponse and ModelResponse for api/tags vs api/ps (#4842) royjhan 2024-06-06 10:11:45 -0700
de5beb06b3 server: skip blob verification for already verified blobs Blake Mizerany 2024-05-24 08:40:40 -0700
98e65929dc docs(tools): add gollama (#4829) Sam 2024-06-06 09:13:39 +1200
66ab48772f proper utf16 support Michael Yang 2024-05-29 21:37:07 -0700
22fcf8f7de Merge pull request #3737 from ollama/mxyng/modelname-4 Michael Yang 2024-06-05 12:05:05 -0700
28c7813ac4 API PS Documentation (#4822) royjhan 2024-06-05 11:06:53 -0700
1d8616d30f docs: update to add LLocal.in to web & desktop integrations (#4719) Kartikeya Mishra 2024-06-05 03:13:59 +0530
d61ef8b954 update create handler to use model.Name Michael Yang 2024-05-08 14:36:08 -0700
89d9900152 Merge pull request #4570 from ollama/mxyng/slices Michael Yang 2024-06-04 13:27:05 -0700
4a048715b6 local wording was confusing people Michael 2024-06-04 13:25:25 -0700
6297f85606 gofmt, goimports Michael Yang 2024-06-04 11:53:23 -0700
ed56428dd7 warn on intrange, usestdlibvars Michael Yang 2024-06-04 11:51:39 -0700
ad40b92b6a disable intrange Michael Yang 2024-06-04 11:35:30 -0700
8ce4032e72 more lint Michael Yang 2024-05-29 18:22:03 -0700
42660466f8 no usestdlibvars Michael Yang 2024-05-23 11:04:46 -0700
e919f6811f lint windows Michael Yang 2024-05-22 09:26:45 -0700
bf7edb0d5d lint linux Michael Yang 2024-05-22 09:08:01 -0700
f38353d6b9 stdin.fd Michael Yang 2024-05-22 09:00:38 -0700
201d853fdf nolintlint Michael Yang 2024-05-22 08:52:00 -0700
e40145a39d lint Michael Yang 2024-05-21 22:21:04 -0700
c895a7d13f some gocritic Michael Yang 2024-05-21 22:07:57 -0700
dad7a987ae nosprintfhostport Michael Yang 2024-05-21 21:53:44 -0700
8ffb51749f nolintlint Michael Yang 2024-05-21 21:52:20 -0700
55f6eba049 gofmt Michael Yang 2024-05-21 21:32:43 -0700
04f3c12bb7 replace x/exp/slices with slices Michael Yang 2024-05-21 21:30:52 -0700
60323e0805 add embed model command and fix question invoke (#4766) Shubham 2024-06-04 10:50:48 +0530
d4a86102fd update welcome prompt in windows to llama3 (#4779) Jeffrey Morgan 2024-06-01 21:05:51 -0700
476fb8e892 Limit GPU lib search for now (#4777) Jeffrey Morgan 2024-06-01 19:24:33 -0700
829ff87bd1 revert tokenize ffi (#4761) Michael Yang 2024-05-31 18:54:21 -0700
f6b622c4b3 Merge pull request #4733 from ollama/jyan/isvalidname Josh 2024-05-31 14:08:45 -0700
2e4da8eec2 added tests for IsValidNamespace Josh Yan 2024-05-31 11:48:07 -0700
763bb65dbb use int32_t for call to tokenize (#4738) Jeffrey Morgan 2024-05-30 21:43:30 -0700
7ca9605f54 speed up tests by only building static lib (#4740) Jeffrey Morgan 2024-05-30 21:43:15 -0700
eb2c443a79 Merge pull request #4736 from ollama/mxyng/vocab-only Michael Yang 2024-05-30 17:21:00 -0700
278e25ea44 Merge pull request #4737 from ollama/mxyng/less-generate Michael Yang 2024-05-30 17:17:50 -0700
a50a87a7b8 partial offloading: allow flash attention and disable mmap (#4734) Jeffrey Morgan 2024-05-30 16:58:01 -0700
98085015d5 only generate on relevant changes Michael Yang 2024-05-22 09:58:26 -0700
bf54c845e9 vocab only Michael Yang 2024-05-30 16:49:28 -0700
c365f195a8 directly use isvalidpart Josh Yan 2024-05-30 16:40:04 -0700
e91d0ef737 Merge pull request #4728 from ollama/jyan/japanese Josh 2024-05-30 16:25:12 -0700
22f5c12ced Update llama.cpp submodule to 5921b8f0 (#4731) Jeffrey Morgan 2024-05-30 16:20:22 -0700