Michael Yang
ed19d10aa5
update readme ( #451 )
...
* update readme
* readme: more run examples
2023-09-01 16:44:14 -04:00
Michael Yang
36c2f45c40
Merge pull request #450 from jmorganca/mxyng/update-readme
...
update readme
2023-09-01 08:21:49 -07:00
Michael Yang
742226625f
update readme
2023-09-01 10:54:31 -04:00
Matt Williams
6bb8a16ccb
Merge pull request #273 from jmorganca/matt/moreexamples
...
Create a sentiments example
2023-08-31 16:31:59 -07:00
Jeffrey Morgan
a5dbcf2e73
app: dont package ggml-metal.metal
2023-08-31 17:41:09 -04:00
Michael Yang
9304f0e7a8
Merge pull request #443 from jmorganca/mxyng/fix-list-models
...
windows: fix filepath bugs
2023-08-31 14:19:10 -07:00
Michael Yang
6578b2f8a1
Merge pull request #448 from callmephilip/patch-1
...
fix spelling errors in example prompts
2023-08-31 08:57:07 -07:00
Michael Yang
1c8fd627ad
windows: fix create modelfile
2023-08-31 09:47:10 -04:00
Michael Yang
ae950b00f1
windows: fix delete
2023-08-31 09:47:10 -04:00
Michael Yang
eeb40a672c
fix list models for windows
2023-08-31 09:47:10 -04:00
Michael Yang
0f541a0367
s/ListResponseModel/ModelResponse/
2023-08-31 09:47:10 -04:00
Philip Nuzhnyi
1363f537ce
fix spelling errors in prompt
2023-08-31 10:02:46 +01:00
Jeffrey Morgan
bc3e21fdc6
update README.md
2023-08-30 17:56:14 -04:00
Jeffrey Morgan
a82eb275ff
update docs for subprocess
2023-08-30 17:54:02 -04:00
Bruce MacDonald
f964aea9a2
remove test not applicate to subprocess
2023-08-30 16:36:11 -04:00
Bruce MacDonald
42998d797d
subprocess llama.cpp server ( #401 )
...
* remove c code
* pack llama.cpp
* use request context for llama_cpp
* let llama_cpp decide the number of threads to use
* stop llama runner when app stops
* remove sample count and duration metrics
* use go generate to get libraries
* tmp dir for running llm
2023-08-30 16:35:03 -04:00
Quinn Slack
f4432e1dba
treat stop as stop sequences, not exact tokens ( #442 )
...
The `stop` option to the generate API is a list of sequences that should cause generation to stop. Although these are commonly called "stop tokens", they do not necessarily correspond to LLM tokens (per the LLM's tokenizer). For example, if the caller sends a generate request with `"stop":["\n"]`, then generation should stop on any token containing `\n` (and trim `\n` from the output), not just if the token exactly matches `\n`. If `stop` were interpreted strictly as LLM tokens, then it would require callers of the generate API to know the LLM's tokenizer and enumerate many tokens in the `stop` list.
Fixes https://github.com/jmorganca/ollama/issues/295 .
2023-08-30 11:53:42 -04:00
Michael Yang
982c535428
Merge pull request #428 from jmorganca/mxyng/upload-chunks
...
update upload chunks
2023-08-30 07:47:17 -07:00
Michael Yang
7df342a6ea
Merge pull request #421 from jmorganca/mxyng/f16-metal
...
allow F16 to use metal
2023-08-29 06:32:59 -07:00
Patrick Devine
8bbff2df98
add model IDs ( #439 )
2023-08-28 20:50:24 -07:00
Michael Yang
16b06699fd
remove unused parameter
2023-08-28 18:35:18 -04:00
Michael Yang
246dc65417
loosen http status code checks
2023-08-28 18:34:53 -04:00
Michael Yang
865fceb73c
chunked pipe
2023-08-28 18:34:53 -04:00
Michael Yang
72266c7684
bump chunk size to 95MB
2023-08-28 18:34:53 -04:00
Jeffrey Morgan
d3b838ce60
update orca
to orca-mini
2023-08-27 13:26:30 -04:00
Michael Yang
e639a12fa1
Merge pull request #412 from jmorganca/mxyng/update-readme
...
update README.md
2023-08-26 21:26:34 -07:00
Michael Yang
e82fcf30c6
Merge pull request #420 from jmorganca/mxyng/34b-mem-check
...
add 34b to mem check
2023-08-26 14:15:52 -07:00
Michael Yang
495e8b0a6a
Merge pull request #426 from jmorganca/default-template
...
set default template
2023-08-26 14:15:38 -07:00
Michael Yang
59734ca24d
set default template
2023-08-26 12:20:48 -07:00
Jeffrey Morgan
22ab7f5f88
default host to 127.0.0.1
, fixes #424
2023-08-26 11:59:28 -07:00
Michael Yang
b25dd1795d
allow F16 to use metal
...
warning F16 uses significantly more memory than quantized model so the
standard requires don't apply.
2023-08-26 08:38:48 -07:00
Michael Yang
304f2b6c96
add 34b to mem check
2023-08-26 08:29:21 -07:00
Quinn Slack
2ecc3a33c3
delete all models (not just 1st) in ollama rm
( #415 )
...
Previously, `ollama rm model1 model2 modelN` would only delete `model1`. The other model command-line arguments would be silently ignored. Now, all models mentioned are deleted.
2023-08-26 00:47:56 -07:00
Jeffrey Morgan
ee6e1df118
add codellama
to model list in readme
2023-08-25 20:44:26 -07:00
Jeffrey Morgan
177b69a211
add missing entries for 34B
2023-08-25 18:35:35 -07:00
Michael Yang
dad63f0821
Merge pull request #411 from jmorganca/mxyng/34b
...
patch llama.cpp for 34B
2023-08-25 11:59:05 -07:00
Michael Yang
041f9ad1a1
update README.md
2023-08-25 11:44:25 -07:00
Michael Yang
7a378f8b66
patch llama.cpp for 34B
2023-08-25 10:06:55 -07:00
Michael Yang
de0bdd7f29
Merge pull request #405 from jmorganca/mxyng/34b
...
add 34b model type
2023-08-24 10:37:22 -07:00
Michael Yang
b1cececb8e
add 34b model type
2023-08-24 10:35:44 -07:00
Michael Yang
e0d39fa3bf
Merge pull request #398 from jmorganca/mxyng/cleanup
...
Mxyng/cleanup
2023-08-22 15:51:41 -07:00
Michael Yang
968ced2e71
Merge pull request #393 from jmorganca/mxyng/net-url
...
use url.URL
2023-08-22 15:51:33 -07:00
Michael Yang
32d1a00017
remove unused requestContextKey
2023-08-22 10:49:54 -07:00
Michael Yang
04e2128273
move upload funcs to upload.go
2023-08-22 10:49:53 -07:00
Michael Yang
2cc634689b
use url.URL
2023-08-22 10:49:07 -07:00
Michael Yang
8f827641b0
Merge pull request #397 from jmorganca/mxyng/release-mode
...
build release mode
2023-08-22 10:48:44 -07:00
Michael Yang
95187d7e1e
build release mode
2023-08-22 09:52:43 -07:00
Michael Yang
9ec7e37534
Merge pull request #392 from jmorganca/mxyng/version
...
add version
2023-08-22 09:50:25 -07:00
Michael Yang
2c7f956b38
add version
2023-08-22 09:40:58 -07:00
Jeffrey Morgan
a9f6c56652
fix FROM
instruction erroring when referring to a file
2023-08-22 09:39:42 -07:00