Commit graph

82 commits

Author SHA1 Message Date
Michael Yang
f40b3de758 use int64 consistently 2023-09-28 11:07:24 -07:00
Patrick Devine
8efbc5df55
DRAFT: add a simple python client to access ollama () 2023-09-14 16:37:38 -07:00
Bruce MacDonald
f221637053
first pass at linux gpu support ()
* linux gpu support
* handle multiple gpus
* add cuda docker image ()
---------

Co-authored-by: Michael Yang <mxyng@pm.me>
2023-09-12 11:04:35 -04:00
Patrick Devine
790d24eb7b
add show command () 2023-09-06 11:04:17 -07:00
Michael Yang
0f541a0367 s/ListResponseModel/ModelResponse/ 2023-08-31 09:47:10 -04:00
Bruce MacDonald
42998d797d
subprocess llama.cpp server ()
* remove c code
* pack llama.cpp
* use request context for llama_cpp
* let llama_cpp decide the number of threads to use
* stop llama runner when app stops
* remove sample count and duration metrics
* use go generate to get libraries
* tmp dir for running llm
2023-08-30 16:35:03 -04:00
Michael Yang
982c535428
Merge pull request from jmorganca/mxyng/upload-chunks
update upload chunks
2023-08-30 07:47:17 -07:00
Patrick Devine
8bbff2df98
add model IDs () 2023-08-28 20:50:24 -07:00
Michael Yang
246dc65417 loosen http status code checks 2023-08-28 18:34:53 -04:00
Jeffrey Morgan
22ab7f5f88 default host to 127.0.0.1, fixes 2023-08-26 11:59:28 -07:00
Michael Yang
2c7f956b38 add version 2023-08-22 09:40:58 -07:00
Michael Yang
f723bf0879 ignore nil map values 2023-08-17 15:50:46 -07:00
Jeffrey Morgan
54bb49a502 parse protocol for OLLAMA_HOST 2023-08-17 18:20:44 -04:00
Jeffrey Morgan
5ee6116420 set default OLLAMA_HOST to http://localhost:11434 2023-08-16 12:22:59 -04:00
Blake Mizerany
67e593e355
cmd: support OLLAMA_CLIENT_HOST environment variable ()
* cmd: support OLLAMA_HOST environment variable

This commit adds support for the OLLAMA_HOST environment
variable. This variable can be used to specify the host to which
the client should connect. This is useful when the client is
running somewhere other than the host where the server is running.

The new api.FromEnv function is used to read configure clients from the
environment. Clients wishing to use the environment variable being
consistent with the Ollama CLI can use this new function.

* Update api/client.go

Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>

* Update api/client.go

Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>

---------

Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
2023-08-16 11:03:48 -04:00
Michael Yang
f27bc261cf s/parmeter/parameter/ 2023-08-10 16:26:06 -07:00
Michael Yang
81d8d7b73f fix could not convert int 2023-08-10 16:24:17 -07:00
Patrick Devine
be989d89d1
Token auth () 2023-08-10 11:34:25 -07:00
Bruce MacDonald
4b3507f036 embeddings endpoint
Co-Authored-By: Jeffrey Morgan <jmorganca@gmail.com>
2023-08-10 11:45:57 -04:00
Bruce MacDonald
7a5f3616fd
embed text document in modelfile 2023-08-09 10:26:19 -04:00
Bruce MacDonald
21ddcaa1f1 pr comments
- default to embeddings enabled
- move embedding logic for loaded model to request
- allow embedding full directory
- close llm on reload
2023-08-08 13:49:37 -04:00
Michael Yang
f2074ed4c0
Merge pull request from jmorganca/default-keep-system
automatically set num_keep if num_keep < 0
2023-08-08 09:25:34 -07:00
Jeffrey Morgan
8713ac23a8 allow overriding template and system in /api/generate
Fixes 
Fixes 
2023-08-08 00:55:34 -04:00
Michael Yang
4dc5b117dd automatically set num_keep if num_keep < 0
num_keep defines how many tokens to keep in the context when truncating
inputs. if left to its default value of -1, the server will calculate
num_keep to be the left of the system instructions
2023-08-07 16:19:12 -07:00
Michael Yang
b9f4d67554 configurable rope frequency parameters 2023-08-03 22:11:58 -07:00
Bruce MacDonald
8b1e791820
allow specifying zero values in modelfile 2023-08-02 17:07:53 -04:00
Bruce MacDonald
8f8b6288ac
check server is running before running command 2023-08-02 10:51:23 -04:00
Bruce MacDonald
765994362c use head to check heartbeat 2023-08-01 14:50:38 -04:00
Bruce MacDonald
1c5a8770ee read runner parameter options from map
- read runner options from map to see what was specified explicitly and overwrite zero values
2023-08-01 13:38:19 -04:00
Jeffrey Morgan
528bafa585 cache loaded model 2023-08-01 11:24:18 -04:00
Bruce MacDonald
e72fe7945f check server is running before running command 2023-07-31 16:25:57 -04:00
Bruce MacDonald
184ad8f057 allow specifying stop conditions in modelfile 2023-07-28 11:02:04 -04:00
Jeffrey Morgan
822a0e36eb lower batch size to 512 2023-07-28 10:56:21 -04:00
Michael Yang
fadf75f99d add stop conditions 2023-07-27 17:00:47 -07:00
Michael Yang
ad3a7d0e2c add NumGQA 2023-07-27 14:05:11 -07:00
Jeffrey Morgan
688661ab9b increase default batch size to 1024 2023-07-27 16:51:01 -04:00
Michael Yang
cca61181cb sample metrics 2023-07-27 09:31:44 -07:00
Michael Yang
c490416189 lock on llm.lock(); decrease batch size 2023-07-27 09:31:44 -07:00
Michael Yang
f62a882760 add session expiration 2023-07-27 09:31:44 -07:00
Michael Yang
3003fc03fc update predict code 2023-07-27 09:31:44 -07:00
Michael Yang
32aec66e6a add load duration 2023-07-27 09:31:44 -07:00
Michael Yang
35af37a2cb session id 2023-07-27 09:31:44 -07:00
Bruce MacDonald
4c1caa3733 download models when creating from modelfile 2023-07-25 14:25:13 -04:00
Bruce MacDonald
536028c35a better error message when model not found on pull 2023-07-24 17:48:17 -04:00
Patrick Devine
4cb42ca55e
add copy command () 2023-07-24 11:27:28 -04:00
Patrick Devine
6d6b0d3321
change error handler behavior and fix error when a model isn't found () 2023-07-21 23:02:12 -07:00
Patrick Devine
9f6e97865c
allow pushing/pulling to insecure registries () 2023-07-21 15:42:19 -07:00
Bruce MacDonald
7ba1308595
Merge pull request from jmorganca/brucemacd/cli-err-display
Improve CLI error display
2023-07-21 16:10:19 +02:00
Patrick Devine
e7a393de54
add rm command for models () 2023-07-20 16:09:23 -07:00
Michael Yang
1f27d7f1b8 fix stream errors 2023-07-20 12:12:08 -07:00