Commit graph

1667 commits

Author SHA1 Message Date
Jeffrey Morgan
7fa6e51686
generate binary dependencies based on GOARCH on macos (#459) 2023-09-05 12:53:57 -04:00
Michael Yang
8dc68417e7
Merge pull request #463 from jmorganca/mxyng/fix-last-token
fix not forwarding last token
2023-09-05 09:01:32 -07:00
Michael Yang
681f3c4c42 fix num_keep 2023-09-03 17:47:49 -04:00
Michael Yang
59a705525c fix not forwarding last token 2023-09-03 17:46:50 -04:00
Michael Yang
5d3f314b0b remove marshalPrompt which is no longer needed 2023-09-03 17:01:05 -04:00
Michael Yang
adaa13088b
Merge pull request #457 from sqs/dont-html-escape-prompt
do not HTML-escape prompt
2023-09-01 17:41:53 -07:00
Quinn Slack
62d29b2157 do not HTML-escape prompt
The `html/template` package automatically HTML-escapes interpolated strings in templates. This behavior is undesirable because it causes prompts like `<h1>hello` to be escaped to `&lt;h1&gt;hello` before being passed to the LLM.

The included test case passes, but before the code change, it failed:

```
--- FAIL: TestModelPrompt
    images_test.go:21: got "a&lt;h1&gt;b", want "a<h1>b"
```
2023-09-01 17:16:38 -05:00
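The escaping behavior described in commit 62d29b2157 can be reproduced with a minimal sketch that renders the same template through `html/template` and `text/template`. The `promptVars` type, the helper names `renderHTML` and `renderText`, and the template string `a{{ .Prompt }}b` are assumptions made for illustration; they are not taken from the repository.

```go
package main

import (
	"fmt"
	htmltemplate "html/template"
	"strings"
	texttemplate "text/template"
)

// promptVars is a hypothetical stand-in for the data passed to the prompt template.
type promptVars struct {
	Prompt string
}

// renderHTML executes tmpl with html/template, which HTML-escapes interpolated strings.
func renderHTML(tmpl string, vars promptVars) (string, error) {
	t, err := htmltemplate.New("prompt").Parse(tmpl)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := t.Execute(&sb, vars); err != nil {
		return "", err
	}
	return sb.String(), nil
}

// renderText executes tmpl with text/template, which leaves interpolated strings untouched.
func renderText(tmpl string, vars promptVars) (string, error) {
	t, err := texttemplate.New("prompt").Parse(tmpl)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := t.Execute(&sb, vars); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	const tmpl = "a{{ .Prompt }}b"
	vars := promptVars{Prompt: "<h1>"}

	escaped, err := renderHTML(tmpl, vars)
	if err != nil {
		panic(err)
	}
	plain, err := renderText(tmpl, vars)
	if err != nil {
		panic(err)
	}

	fmt.Printf("html/template: %q\n", escaped) // "a&lt;h1&gt;b"
	fmt.Printf("text/template: %q\n", plain)   // "a<h1>b"
}
```

Both packages share the same template syntax, so switching between them is mostly a matter of changing the import; whether the actual fix took that route is not stated in the commit message.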
Michael Yang
ed19d10aa5
update readme (#451)
* update readme

* readme: more run examples
2023-09-01 16:44:14 -04:00
Michael Yang
36c2f45c40
Merge pull request #450 from jmorganca/mxyng/update-readme
update readme
2023-09-01 08:21:49 -07:00
Michael Yang
742226625f update readme 2023-09-01 10:54:31 -04:00
Matt Williams
6bb8a16ccb
Merge pull request #273 from jmorganca/matt/moreexamples
Create a sentiments example
2023-08-31 16:31:59 -07:00
Jeffrey Morgan
a5dbcf2e73 app: don't package ggml-metal.metal 2023-08-31 17:41:09 -04:00
Michael Yang
9304f0e7a8
Merge pull request #443 from jmorganca/mxyng/fix-list-models
windows: fix filepath bugs
2023-08-31 14:19:10 -07:00
Michael Yang
6578b2f8a1
Merge pull request #448 from callmephilip/patch-1
fix spelling errors in example prompts
2023-08-31 08:57:07 -07:00
Michael Yang
1c8fd627ad windows: fix create modelfile 2023-08-31 09:47:10 -04:00
Michael Yang
ae950b00f1 windows: fix delete 2023-08-31 09:47:10 -04:00
Michael Yang
eeb40a672c fix list models for windows 2023-08-31 09:47:10 -04:00
Michael Yang
0f541a0367 s/ListResponseModel/ModelResponse/ 2023-08-31 09:47:10 -04:00
Philip Nuzhnyi
1363f537ce
fix spelling errors in prompt 2023-08-31 10:02:46 +01:00
Jeffrey Morgan
bc3e21fdc6 update README.md 2023-08-30 17:56:14 -04:00
Jeffrey Morgan
a82eb275ff update docs for subprocess 2023-08-30 17:54:02 -04:00
Bruce MacDonald
f964aea9a2 remove test not applicable to subprocess 2023-08-30 16:36:11 -04:00
Bruce MacDonald
42998d797d
subprocess llama.cpp server (#401)
* remove c code
* pack llama.cpp
* use request context for llama_cpp
* let llama_cpp decide the number of threads to use
* stop llama runner when app stops
* remove sample count and duration metrics
* use go generate to get libraries
* tmp dir for running llm
2023-08-30 16:35:03 -04:00
Quinn Slack
f4432e1dba
treat stop as stop sequences, not exact tokens (#442)
The `stop` option to the generate API is a list of sequences that should cause generation to stop. Although these are commonly called "stop tokens", they do not necessarily correspond to LLM tokens (per the LLM's tokenizer). For example, if the caller sends a generate request with `"stop":["\n"]`, then generation should stop on any token containing `\n` (and trim `\n` from the output), not just if the token exactly matches `\n`. If `stop` were interpreted strictly as LLM tokens, then it would require callers of the generate API to know the LLM's tokenizer and enumerate many tokens in the `stop` list.

Fixes https://github.com/jmorganca/ollama/issues/295.
2023-08-30 11:53:42 -04:00
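The difference between stop sequences and exact stop tokens in commit f4432e1dba can be illustrated with a small sketch. It accumulates streamed tokens and checks the accumulated text for any stop sequence as a substring, rather than comparing each token for equality. The function name `truncateAtStop` and the sample tokens are hypothetical, and cutting the output at the first match is an assumption about how the trimming might work, not the repository's implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// truncateAtStop returns text cut at the earliest occurrence of any stop
// sequence, and reports whether a stop sequence was found.
// Hypothetical helper: it treats stops as substrings, not tokenizer tokens.
func truncateAtStop(text string, stops []string) (string, bool) {
	cut := -1
	for _, s := range stops {
		if s == "" {
			continue // an empty sequence would match everywhere
		}
		if i := strings.Index(text, s); i >= 0 && (cut < 0 || i < cut) {
			cut = i
		}
	}
	if cut < 0 {
		return text, false
	}
	return text[:cut], true
}

func main() {
	// Made-up token stream: note that no token is exactly "\n".
	tokens := []string{"Hello", " world", ".\nNext", " line"}
	stops := []string{"\n"}

	var out strings.Builder
	for _, tok := range tokens {
		out.WriteString(tok)
		if trimmed, stopped := truncateAtStop(out.String(), stops); stopped {
			fmt.Printf("%q\n", trimmed) // "Hello world."
			return
		}
	}
	fmt.Printf("%q\n", out.String())
}
```

An exact-match check would never fire here, because the newline arrives inside the token `".\nNext"`; the substring check stops generation at that token and drops the newline and everything after it.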
Michael Yang
982c535428
Merge pull request #428 from jmorganca/mxyng/upload-chunks
update upload chunks
2023-08-30 07:47:17 -07:00
Michael Yang
7df342a6ea
Merge pull request #421 from jmorganca/mxyng/f16-metal
allow F16 to use metal
2023-08-29 06:32:59 -07:00
Patrick Devine
8bbff2df98
add model IDs (#439) 2023-08-28 20:50:24 -07:00
Michael Yang
16b06699fd remove unused parameter 2023-08-28 18:35:18 -04:00
Michael Yang
246dc65417 loosen http status code checks 2023-08-28 18:34:53 -04:00
Michael Yang
865fceb73c chunked pipe 2023-08-28 18:34:53 -04:00
Michael Yang
72266c7684 bump chunk size to 95MB 2023-08-28 18:34:53 -04:00
Jeffrey Morgan
d3b838ce60 update orca to orca-mini 2023-08-27 13:26:30 -04:00
Michael Yang
e639a12fa1
Merge pull request #412 from jmorganca/mxyng/update-readme
update README.md
2023-08-26 21:26:34 -07:00
Michael Yang
e82fcf30c6
Merge pull request #420 from jmorganca/mxyng/34b-mem-check
add 34b to mem check
2023-08-26 14:15:52 -07:00
Michael Yang
495e8b0a6a
Merge pull request #426 from jmorganca/default-template
set default template
2023-08-26 14:15:38 -07:00
Michael Yang
59734ca24d set default template 2023-08-26 12:20:48 -07:00
Jeffrey Morgan
22ab7f5f88 default host to 127.0.0.1, fixes #424 2023-08-26 11:59:28 -07:00
Michael Yang
b25dd1795d allow F16 to use metal
Warning: F16 uses significantly more memory than quantized models, so the standard requirements don't apply.
2023-08-26 08:38:48 -07:00
Michael Yang
304f2b6c96 add 34b to mem check 2023-08-26 08:29:21 -07:00
Quinn Slack
2ecc3a33c3
delete all models (not just 1st) in ollama rm (#415)
Previously, `ollama rm model1 model2 modelN` would only delete `model1`. The other model command-line arguments would be silently ignored. Now, all models mentioned are deleted.
2023-08-26 00:47:56 -07:00
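The fix described in commit 2ecc3a33c3 amounts to iterating over every command-line argument instead of reading only the first one. A minimal sketch, assuming a hypothetical `removeAll` helper and a caller-supplied `remove` callback in place of the real deletion call:

```go
package main

import "fmt"

// removeAll calls remove for every model name, stopping at the first error.
// Hypothetical helper: remove stands in for whatever actually deletes a model.
func removeAll(names []string, remove func(name string) error) error {
	for _, name := range names {
		if err := remove(name); err != nil {
			return fmt.Errorf("delete %s: %w", name, err)
		}
	}
	return nil
}

func main() {
	// Stand-in for the arguments of `ollama rm model1 model2 modelN`.
	args := []string{"model1", "model2", "modelN"}

	var deleted []string
	err := removeAll(args, func(name string) error {
		deleted = append(deleted, name) // record instead of deleting anything
		return nil
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(deleted) // [model1 model2 modelN]
}
```

Stopping at the first failure is a design choice of this sketch; the commit message does not say how errors for individual models are handled.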
Jeffrey Morgan
ee6e1df118 add codellama to model list in readme 2023-08-25 20:44:26 -07:00
Jeffrey Morgan
177b69a211 add missing entries for 34B 2023-08-25 18:35:35 -07:00
Michael Yang
dad63f0821
Merge pull request #411 from jmorganca/mxyng/34b
patch llama.cpp for 34B
2023-08-25 11:59:05 -07:00
Michael Yang
041f9ad1a1 update README.md 2023-08-25 11:44:25 -07:00
Michael Yang
7a378f8b66 patch llama.cpp for 34B 2023-08-25 10:06:55 -07:00
Michael Yang
de0bdd7f29
Merge pull request #405 from jmorganca/mxyng/34b
add 34b model type
2023-08-24 10:37:22 -07:00
Michael Yang
b1cececb8e add 34b model type 2023-08-24 10:35:44 -07:00
Michael Yang
e0d39fa3bf
Merge pull request #398 from jmorganca/mxyng/cleanup
Mxyng/cleanup
2023-08-22 15:51:41 -07:00
Michael Yang
968ced2e71
Merge pull request #393 from jmorganca/mxyng/net-url
use url.URL
2023-08-22 15:51:33 -07:00
Michael Yang
32d1a00017 remove unused requestContextKey 2023-08-22 10:49:54 -07:00