Commit graph

  • 17df6520c8 Remove mmap related output calc logic Daniel Hiltgen 2024-06-13 09:59:36 -0700
  • 6f351bf586 review comments and coverage Daniel Hiltgen 2024-06-05 12:07:20 -0700
  • ff4f0cbd1d Prevent multiple concurrent loads on the same gpus Daniel Hiltgen 2024-06-04 14:08:36 -0700
  • fc37c192ae Refine CPU load behavior with system memory visibility Daniel Hiltgen 2024-06-03 19:09:23 -0700
  • 434dfe30c5 Reintroduce nvidia nvml library for windows Daniel Hiltgen 2024-06-03 15:07:50 -0700
  • 4e2b7e181d Refactor intel gpu discovery Daniel Hiltgen 2024-05-29 16:37:34 -0700
  • 48702dd149 Harden unload for empty runners Daniel Hiltgen 2024-05-30 16:43:40 -0700
  • 68dfc6236a refined test timing Daniel Hiltgen 2024-05-31 14:28:02 -0700
  • 5e8ff556cb Support forced spreading for multi GPU Daniel Hiltgen 2024-05-08 14:32:42 -0700
  • 6fd04ca922 Improve multi-gpu handling at the limit Daniel Hiltgen 2024-05-18 12:34:31 -0700
  • 206797bda4 Fix concurrency integration test to work locally Daniel Hiltgen 2024-05-23 13:12:14 -0700
  • 43ed358f9a Refine GPU discovery to bootstrap once Daniel Hiltgen 2024-05-15 15:13:16 -0700
  • b32ebb4f29 Use DRM driver for VRAM info for amd Daniel Hiltgen 2024-05-14 16:18:42 -0700
  • fb9cdfa723 Fix server.cpp for the new cuda build macros Daniel Hiltgen 2024-05-18 16:02:13 -0700
  • efac488675 Revert "Limit GPU lib search for now (#4777)" Daniel Hiltgen 2024-06-03 08:31:48 -0700
  • 6b800aa7b7 openai: do not set temperature to 0 when setting seed (#5045) Jeffrey Morgan 2024-06-14 13:43:56 -0700
  • dd7c9ebeaf server: longer timeout in TestRequests (#5046) Jeffrey Morgan 2024-06-14 09:48:25 -0700
  • 4dc7fb9525 update 40xx gpu compat matrix (#5036) Patrick Devine 2024-06-13 20:10:33 -0400
  • c39761c552 Merge pull request #5032 from dhiltgen/actually_skip Daniel Hiltgen 2024-06-13 13:26:09 -0700
  • aac367636d Actually skip PhysX on windows Daniel Hiltgen 2024-06-13 13:17:19 -0700
  • 15a687ae4b Merge pull request #5031 from ollama/mxyng/fix-multibyte-utf16 Michael Yang 2024-06-13 13:14:55 -0700
  • d528e1af75 fix utf16 for multibyte runes Michael Yang 2024-06-13 11:39:01 -0700
  • cd234ce22c parser: add test for multibyte runes Michael Yang 2024-06-13 11:09:22 -0700
  • 94618b2365 add OLLAMA_MODELS to envconfig (#5029) Patrick Devine 2024-06-13 15:52:03 -0400
  • 1fd236d177 server: remove jwt decoding error (#5027) Jeffrey Morgan 2024-06-13 11:21:15 -0700
  • e87fc7200d Merge pull request #5025 from ollama/mxyng/revert-parser-scan Michael Yang 2024-06-13 10:31:25 -0700
  • 20b9f8e6f4 Revert "proper utf16 support" Michael Yang 2024-06-13 10:22:16 -0700
  • c69bc19e46 move OLLAMA_HOST to envconfig (#5009) Patrick Devine 2024-06-12 18:48:16 -0400
  • bba5d177aa Merge pull request #5004 from ollama/mxyng/fix-templates Michael Yang 2024-06-12 14:39:29 -0700
  • c16f8af911 fix: multiple templates when creating from model Michael Yang 2024-06-12 13:30:08 -0700
  • 217f60c3d9 Merge pull request #4987 from ollama/mxyng/revert-byte-order Michael Yang 2024-06-11 16:04:20 -0700
  • 7bdcd1da94 Revert "Merge pull request #4938 from ollama/mxyng/fix-byte-order" Michael Yang 2024-06-11 15:55:44 -0700
  • ead259d877 llm: fix seed value not being applied to requests (#4986) Jeffrey Morgan 2024-06-11 14:24:41 -0700
  • 2ff45d571d Add Ollama-hpp to Community Libraries in README. (#4983) James Montgomery 2024-06-11 14:15:05 -0400
  • 157f09acdf fix: "Skip searching for network devices" jayson-cloude 2024-06-11 16:11:35 +0800
  • 0f3cf1d42e Merge pull request #4715 from ollama/mxyng/utf16-parser Michael Yang 2024-06-10 11:41:29 -0700
  • 5bc029c529 Merge pull request #4921 from ollama/mxyng/import-md Michael Yang 2024-06-10 11:41:09 -0700
  • e9a9c6a8e8 Merge pull request #4965 from ollama/mxyng/skip-layer-remove Michael Yang 2024-06-10 11:40:03 -0700
  • 515f497e6d fix: skip removing layers that no longer exist Michael Yang 2024-06-10 11:15:03 -0700
  • b27268aaef add test Michael Yang 2024-06-10 11:31:34 -0700
  • f5f245cc15 Merge pull request #4938 from ollama/mxyng/fix-byte-order Michael Yang 2024-06-10 09:38:12 -0700
  • 94d37fdcae fix: examples/langchain-python-rag-privategpt/requirements.txt (#3382) Jim Scardelis 2024-06-09 10:58:09 -0700
  • b84aea1685 Critical fix from llama.cpp JSON grammar to forbid un-escaped escape characters inside strings, which breaks parsing. (#3782) Craig Hughes 2024-06-09 13:57:09 -0400
  • 896495de7b Add instructions to easily install specific versions on faq.md (#4084) Napuh 2024-06-09 19:49:03 +0200
  • 5528dd9d11 Error handling load_single_document() in ingest.py (#4852) dcasota 2024-06-09 19:41:07 +0200
  • 943172cbf4 Update api.md Jeffrey Morgan 2024-06-08 23:04:32 -0700
  • 85169e8d6f Added headless-ollama (#4612) Nischal Jain 2024-06-09 07:21:16 +0530
  • 34f142797a llm: always add bos token to prompt (#4941) Jeffrey Morgan 2024-06-08 18:47:10 -0700
  • 46a7f1e74a Update README.md with LangChainRust (#4854) Erhan 2024-06-09 03:29:36 +0300
  • 620d5c569e fix parsing big endian gguf Michael Yang 2024-06-08 12:32:02 -0700
  • b9ce7bf75e update import.md Michael Yang 2024-06-07 16:45:15 -0700
  • cddc63381c Merge pull request #4909 from dhiltgen/oneapi_disable Daniel Hiltgen 2024-06-07 14:07:15 -0700
  • 385a32ecb5 Merge pull request #4910 from ollama/mxyng/detect-chat-template Michael Yang 2024-06-07 11:07:39 -0700
  • 030e765e76 fix create model when template detection errors Michael Yang 2024-06-07 08:55:46 -0700
  • ab8c929e20 Add ability to skip oneapi generate Daniel Hiltgen 2024-06-07 08:32:49 -0700
  • ce0dc33cb8 llm: patch to fix qwen 2 temporarily on nvidia (#4897) Jeffrey Morgan 2024-06-06 23:14:33 -0700
  • 78f81fc0e5 Merge pull request #4800 from ollama/mxyng/detect-chat-template Michael Yang 2024-06-06 16:17:18 -0700
  • 9b6c2e6eb6 detect chat template from KV Michael Yang 2024-06-03 11:06:29 -0700
  • 1a29e9a879 API app/browser access (#4879) royjhan 2024-06-06 15:19:03 -0700
  • 4bf1da4944 Separate ListResponse and ModelResponse for api/tags vs api/ps (#4842) royjhan 2024-06-06 10:11:45 -0700
  • de5beb06b3 server: skip blob verification for already verified blobs Blake Mizerany 2024-05-24 08:40:40 -0700
  • 98e65929dc docs(tools): add gollama (#4829) Sam 2024-06-06 09:13:39 +1200
  • 66ab48772f proper utf16 support Michael Yang 2024-05-29 21:37:07 -0700
  • 22fcf8f7de Merge pull request #3737 from ollama/mxyng/modelname-4 Michael Yang 2024-06-05 12:05:05 -0700
  • 28c7813ac4 API PS Documentation (#4822) royjhan 2024-06-05 11:06:53 -0700
  • 1d8616d30f docs: update to add LLocal.in to web & desktop integrations (#4719) Kartikeya Mishra 2024-06-05 03:13:59 +0530
  • d61ef8b954 update create handler to use model.Name Michael Yang 2024-05-08 14:36:08 -0700
  • 89d9900152 Merge pull request #4570 from ollama/mxyng/slices Michael Yang 2024-06-04 13:27:05 -0700
  • 4a048715b6 local wording was confusing people Michael 2024-06-04 13:25:25 -0700
  • 6297f85606 gofmt, goimports Michael Yang 2024-06-04 11:53:23 -0700
  • ed56428dd7 warn on intrange, usestdlibvars Michael Yang 2024-06-04 11:51:39 -0700
  • ad40b92b6a disable intrange Michael Yang 2024-06-04 11:35:30 -0700
  • 8ce4032e72 more lint Michael Yang 2024-05-29 18:22:03 -0700
  • 42660466f8 no usestdlibvars Michael Yang 2024-05-23 11:04:46 -0700
  • e919f6811f lint windows Michael Yang 2024-05-22 09:26:45 -0700
  • bf7edb0d5d lint linux Michael Yang 2024-05-22 09:08:01 -0700
  • f38353d6b9 stdin.fd Michael Yang 2024-05-22 09:00:38 -0700
  • 201d853fdf nolintlint Michael Yang 2024-05-22 08:52:00 -0700
  • e40145a39d lint Michael Yang 2024-05-21 22:21:04 -0700
  • c895a7d13f some gocritic Michael Yang 2024-05-21 22:07:57 -0700
  • dad7a987ae nosprintfhostport Michael Yang 2024-05-21 21:53:44 -0700
  • 8ffb51749f nolintlint Michael Yang 2024-05-21 21:52:20 -0700
  • 55f6eba049 gofmt Michael Yang 2024-05-21 21:32:43 -0700
  • 04f3c12bb7 replace x/exp/slices with slices Michael Yang 2024-05-21 21:30:52 -0700
  • 60323e0805 add embed model command and fix question invoke (#4766) Shubham 2024-06-04 10:50:48 +0530
  • d4a86102fd update welcome prompt in windows to llama3 (#4779) Jeffrey Morgan 2024-06-01 21:05:51 -0700
  • 476fb8e892 Limit GPU lib search for now (#4777) Jeffrey Morgan 2024-06-01 19:24:33 -0700
  • 829ff87bd1 revert tokenize ffi (#4761) Michael Yang 2024-05-31 18:54:21 -0700
  • f6b622c4b3 Merge pull request #4733 from ollama/jyan/isvalidname Josh 2024-05-31 14:08:45 -0700
  • 2e4da8eec2 added tests for IsValidNamespace Josh Yan 2024-05-31 11:48:07 -0700
  • 763bb65dbb use int32_t for call to tokenize (#4738) Jeffrey Morgan 2024-05-30 21:43:30 -0700
  • 7ca9605f54 speed up tests by only building static lib (#4740) Jeffrey Morgan 2024-05-30 21:43:15 -0700
  • eb2c443a79 Merge pull request #4736 from ollama/mxyng/vocab-only Michael Yang 2024-05-30 17:21:00 -0700
  • 278e25ea44 Merge pull request #4737 from ollama/mxyng/less-generate Michael Yang 2024-05-30 17:17:50 -0700
  • a50a87a7b8 partial offloading: allow flash attention and disable mmap (#4734) Jeffrey Morgan 2024-05-30 16:58:01 -0700
  • 98085015d5 only generate on relevant changes Michael Yang 2024-05-22 09:58:26 -0700
  • bf54c845e9 vocab only Michael Yang 2024-05-30 16:49:28 -0700
  • c365f195a8 directly use isvalidpart Josh Yan 2024-05-30 16:40:04 -0700
  • e91d0ef737 Merge pull request #4728 from ollama/jyan/japanese Josh 2024-05-30 16:25:12 -0700
  • 22f5c12ced Update llama.cpp submodule to 5921b8f0 (#4731) Jeffrey Morgan 2024-05-30 16:20:22 -0700