Commit graph

  • 33627331a3 app: also clean up tempdir runners on install (#5646) Jeffrey Morgan 2024-07-12 12:29:23 -0700
  • 36c87c433b template: preprocess message and collect system Michael Yang 2024-07-12 11:48:06 -0700
  • 179737feb7 Clean up old files when installing on Windows (#5645) Jeffrey Morgan 2024-07-11 22:53:46 -0700
  • 47353f5ee4 Merge pull request #5639 from ollama/mxyng/unaggregated-system Michael Yang 2024-07-11 17:48:50 -0700
  • 10e768826c fix: quant err message (#5616) Josh 2024-07-11 17:24:29 -0700
  • 5056bb9c01 rename aggregate to contents Michael Yang 2024-07-11 16:06:57 -0700
  • c4cf8ad559 llm: avoid loading model if system memory is too small (#5637) Jeffrey Morgan 2024-07-11 16:42:57 -0700
  • 57ec6901eb revert embedded templates to use prompt/response Michael Yang 2024-07-11 13:11:40 -0700
  • e64f9ebb44 do no automatically aggregate system messages Michael Yang 2024-07-11 13:10:13 -0700
  • 791650ddef sched: only error when over-allocating system memory (#5626) Jeffrey Morgan 2024-07-11 00:53:12 -0700
  • efbf41ed81 llm: dont link cuda with compat libs (#5621) Jeffrey Morgan 2024-07-10 20:01:52 -0700
  • cf15589851 Merge pull request #5620 from ollama/mxyng/templates Michael Yang 2024-07-10 17:16:24 -0700
  • 19753c18c0 update embedded templates Michael Yang 2024-07-10 11:00:29 -0700
  • 41be28096a add system prompt to first legacy template Michael Yang 2024-07-10 11:00:07 -0700
  • 37a570f962 Merge pull request #5612 from ollama/mxyng/mem Michael Yang 2024-07-10 14:18:33 -0700
  • 5a739ff4cb chatglm graph Michael Yang 2024-07-10 13:18:04 -0700
  • 4e262eb2a8 remove GGML_CUDA_FORCE_MMQ=on from build (#5588) Jeffrey Morgan 2024-07-10 13:17:13 -0700
  • 4cfcbc328f Merge pull request #5124 from dhiltgen/amd_windows Daniel Hiltgen 2024-07-10 12:50:23 -0700
  • 79292ff3e0 Merge pull request #5555 from dhiltgen/msvc_deps Daniel Hiltgen 2024-07-10 12:50:02 -0700
  • 8ea500441d Merge pull request #5580 from dhiltgen/cuda_overhead Daniel Hiltgen 2024-07-10 12:47:31 -0700
  • b50c818623 Merge pull request #5607 from dhiltgen/win_rocm_v6 Daniel Hiltgen 2024-07-10 12:47:10 -0700
  • b99e750b62 Merge pull request #5605 from dhiltgen/merge_glitch Daniel Hiltgen 2024-07-10 11:47:08 -0700
  • 1f50356e8e Bump ROCm on windows to 6.1.2 Daniel Hiltgen 2024-07-10 11:01:22 -0700
  • 22c81f62ec Remove duplicate merge glitch Daniel Hiltgen 2024-07-10 09:01:33 -0700
  • 73e2c8f68f Fix context exhaustion integration test for small gpus Daniel Hiltgen 2024-07-09 15:28:25 -0700
  • f4408219e9 Refine scheduler unit tests for reliability Daniel Hiltgen 2024-07-05 15:30:06 -0700
  • 2d1e3c3229 Merge pull request #5503 from dhiltgen/dual_rocm Daniel Hiltgen 2024-07-09 15:44:16 -0700
  • 4918fae535 OpenAI v1/completions: allow stop token list (#5551) royjhan 2024-07-09 14:01:26 -0700
  • 0aff67877e separate request tests (#5578) royjhan 2024-07-09 13:48:31 -0700
  • f6f759fc5f Detect CUDA OS Overhead Daniel Hiltgen 2024-07-09 10:27:53 -0700
  • 9544a57ee4 Merge pull request #5579 from dhiltgen/win_static_deps Daniel Hiltgen 2024-07-09 12:21:13 -0700
  • b51e3b63ac Statically link c++ and thread lib Daniel Hiltgen 2024-07-09 11:17:44 -0700
  • 6bbbc50f10 Merge pull request #5440 from ollama/mxyng/messages-templates Michael Yang 2024-07-09 09:36:32 -0700
  • 9bbddc37a7 Merge pull request #5126 from ollama/mxyng/messages Michael Yang 2024-07-09 09:20:44 -0700
  • e4ff73297d server: fix model reloads when setting OLLAMA_NUM_PARALLEL (#5560) Jeffrey Morgan 2024-07-08 22:32:15 -0700
  • b44320db13 Bundle missing CRT libraries Daniel Hiltgen 2024-07-08 18:24:21 -0700
  • 0bacb30007 Workaround broken ROCm p2p copy Daniel Hiltgen 2024-07-05 12:46:28 -0700
  • 53da2c6965 llm: remove ambiguous comment when putting upper limit on predictions to avoid infinite generation (#5535) Jeffrey Morgan 2024-07-07 14:32:05 -0400
  • 3bb134eaa0 Use alpine and remove blas baalajimaestro 2024-07-07 21:49:35 +0530
  • d8def1ff94 llm: allow gemma 2 to context shift (#5534) Jeffrey Morgan 2024-07-07 13:41:51 -0400
  • 571dc61955 Update llama.cpp submodule to a8db2a9c (#5530) Jeffrey Morgan 2024-07-07 13:03:09 -0400
  • 0e09c380fc llm: print caching notices in debug only (#5533) Jeffrey Morgan 2024-07-07 12:38:04 -0400
  • 0ee87615c7 sched: don't error if paging to disk on Windows and macOS (#5523) Jeffrey Morgan 2024-07-06 22:01:52 -0400
  • f8241bfba3 gpu: report system free memory instead of 0 (#5521) Jeffrey Morgan 2024-07-06 19:35:04 -0400
  • 4607c70641 llm: add -DBUILD_SHARED_LIBS=off to common cpu cmake flags (#5520) Jeffrey Morgan 2024-07-06 18:58:16 -0400
  • c12f1c5b99 release: move mingw library cleanup to correct job jmorganca 2024-07-06 16:12:29 -0400
  • a08f20d910 release: remove unwanted mingw dll.a files jmorganca 2024-07-06 15:21:15 -0400
  • 6cea036027 Revert "llm: only statically link libstdc++" jmorganca 2024-07-06 15:10:48 -0400
  • 415d9f0f15 Merge https://github.com/ollama/ollama baalajimaestro 2024-07-06 23:37:14 +0530
  • 5796bfc401 llm: only statically link libstdc++ jmorganca 2024-07-06 14:06:20 -0400
  • 110deb68cf Add more params for llama baalajimaestro 2024-07-06 23:35:58 +0530
  • f1a379aa56 llm: statically link pthread and stdc++ dependencies in windows build jmorganca 2024-07-06 12:54:02 -0400
  • 9ae146993e llm: add GGML_STATIC flag to windows static lib jmorganca 2024-07-06 03:27:05 -0400
  • e0348d3fe8 llm: add COMMON_DARWIN_DEFS to arm static build (#5513) Jeffrey Morgan 2024-07-05 22:42:42 -0400
  • 2cc854f8cb llm: fix missing dylibs by restoring old build behavior on Linux and macOS (#5511) Jeffrey Morgan 2024-07-05 21:48:31 -0400
  • 5304b765b2 llm: put back old include dir (#5507) Jeffrey Morgan 2024-07-05 19:34:21 -0400
  • fb6cbc02fb update named templates Michael Yang 2024-06-27 14:15:17 -0700
  • 4fd5f3526a fix cmake build (#5505) Jeffrey Morgan 2024-07-05 19:07:01 -0400
  • 842f85f758 Merge pull request #5502 from dhiltgen/ci_fixes Daniel Hiltgen 2024-07-05 15:39:11 -0700
  • 9d30f9f8b3 Always go build in CI generate steps Daniel Hiltgen 2024-07-05 12:25:53 -0700
  • 631cfd9e62 types/model: remove knowledge of digest (#5500) Blake Mizerany 2024-07-05 13:42:30 -0700
  • 326363b3a7 no funcs Michael Yang 2024-07-03 13:49:14 -0700
  • ac7a842e55 fix model reloading Michael Yang 2024-07-03 09:00:07 -0700
  • 2c3fe1fd97 comments Michael Yang 2024-06-20 11:00:08 -0700
  • 269ed6e6a2 update message processing Michael Yang 2024-06-17 10:38:55 -0700
  • 78fb33dd07 fix typo in cgo directives in llm.go (#5501) Jeffrey Morgan 2024-07-05 15:18:36 -0400
  • 8f8e736b13 update llama.cpp submodule to d7fd29f (#5475) Jeffrey Morgan 2024-07-05 13:25:58 -0400
  • d89454de80 Use slot with cached prompt instead of least recently used (#5492) Jeffrey Morgan 2024-07-05 12:32:47 -0400
  • af28b94533 Merge pull request #5469 from dhiltgen/prevent_system_oom Daniel Hiltgen 2024-07-05 08:22:20 -0700
  • e9188e971a Fix assert on small embedding inputs (#5491) Jeffrey Morgan 2024-07-05 11:20:57 -0400
  • 78eddfc068 Merge pull request #4412 from dhiltgen/win_docs Daniel Hiltgen 2024-07-05 08:18:22 -0700
  • 02c24d3d01 Merge pull request #5466 from dhiltgen/fix_clip_unicode Daniel Hiltgen 2024-07-05 08:16:58 -0700
  • 52abc8acb7 Document older win10 terminal problems Daniel Hiltgen 2024-05-13 15:08:29 -0700
  • 4d71c559b2 fix error detection by limiting model loading error parsing (#5472) Jeffrey Morgan 2024-07-03 20:04:30 -0400
  • 0d16eb310e fix: use envconfig.ModelsDir directly (#4821) Anatoli Babenia 2024-07-04 01:36:11 +0300
  • 8072e205ff Merge pull request #5447 from dhiltgen/fix_keepalive Daniel Hiltgen 2024-07-03 15:34:38 -0700
  • 955f2a4e03 Only set default keep_alive on initial model load Daniel Hiltgen 2024-07-02 15:12:43 -0700
  • 3c75113e37 Prevent loading models larger than total memory Daniel Hiltgen 2024-07-03 14:47:42 -0700
  • ccd7785859 Merge pull request #5243 from dhiltgen/modelfile_use_mmap Daniel Hiltgen 2024-07-03 13:59:42 -0700
  • 3b5a4a77f3 Return Correct Prompt Eval Count Regardless of Cache Prompt (#5371) royjhan 2024-07-03 13:46:23 -0700
  • daed0634a9 Merge pull request #5467 from dhiltgen/bogus_cpu_mac_error Daniel Hiltgen 2024-07-03 13:39:36 -0700
  • 0d4dd707bc Merge pull request #5465 from dhiltgen/better_cuda_logging Daniel Hiltgen 2024-07-03 13:12:22 -0700
  • 0e982bc1f4 Fix corner cases on tmp cleaner on mac Daniel Hiltgen 2024-07-03 13:10:14 -0700
  • 6298f49816 Fix clip model loading with unicode paths Daniel Hiltgen 2024-07-03 12:37:40 -0700
  • ef757da2c9 Better nvidia GPU discovery logging Daniel Hiltgen 2024-07-03 10:30:07 -0700
  • e5352297d9 Merge pull request #5448 from ollama/mxyng/fix-generate Michael Yang 2024-07-02 16:48:06 -0700
  • 65a5040e09 fix generate template Michael Yang 2024-07-02 16:42:17 -0700
  • d626b99b54 OpenAI: v1/completions compatibility (#5209) royjhan 2024-07-02 16:01:45 -0700
  • dddb58a38b Merge pull request #5051 from ollama/mxyng/capabilities Michael Yang 2024-07-02 14:26:07 -0700
  • 400056e154 Merge pull request #5420 from ollama/mxyng/insecure-path Michael Yang 2024-07-02 14:03:23 -0700
  • d2f19024d0 Merge pull request #5442 from dhiltgen/concurrency_docs Daniel Hiltgen 2024-07-02 12:47:47 -0700
  • 69c04eecc4 Add windows radeon concurreny note Daniel Hiltgen 2024-07-02 12:46:14 -0700
  • 996bb1b85e OpenAI: /v1/models and /v1/models/{model} compatibility (#5007) royjhan 2024-07-02 11:50:56 -0700
  • 422dcc3856 Merge pull request #5439 from dhiltgen/fix_centos_7_build Daniel Hiltgen 2024-07-02 11:01:15 -0700
  • 020bd60ab2 Switch amd container image base to rocky 8 Daniel Hiltgen 2024-07-02 10:23:05 -0700
  • 8e277b72bb Merge pull request #5438 from dhiltgen/fix_centos_7_build Daniel Hiltgen 2024-07-02 09:28:00 -0700
  • 4f67b39d26 Centos 7 EOL broke mirrors Daniel Hiltgen 2024-07-02 09:22:17 -0700
  • 2425281317 Merge pull request #5336 from ollama/jyan/from-errors Josh 2024-07-01 16:32:46 -0700
  • 0403e9860e Merge pull request #5421 from ollama/jyan/ver Josh 2024-07-01 16:32:14 -0700
  • 33a65e3ba3 error Josh Yan 2024-07-01 16:04:13 -0700