Daniel Hiltgen
bee2f4a3b0
Record GPU usage information
...
This records more GPU usage information for eventual UX inclusion.
2024-05-08 14:45:39 -07:00
Michael Yang
88cf154483
Merge pull request #4244 from ollama/mxyng/skip-if-same
...
skip if same quantization
2024-05-07 19:03:37 -07:00
Bruce MacDonald
8cbd3e7510
skip hidden files in list models handler ( #4247 )
2024-05-07 19:01:45 -07:00
Michael Yang
eeb695261f
skip if same quantization
2024-05-07 17:44:19 -07:00
Bruce MacDonald
dc9b1111e0
fix invalid destination error message
2024-05-07 17:35:52 -07:00
Tobias Gårdhus
06ac829e70
Fix help string for stop parameter ( #2307 )
2024-05-07 16:48:35 -07:00
boessu
5d3f7fff26
Update langchainpy.md ( #4236 )
...
fixing pip code.
2024-05-07 16:36:34 -07:00
Eli Bendersky
d77c1c5f9d
api: fill up API documentation ( #3596 )
...
* api: fill up API documentation
Followup for #2878
Now that the documentation is more complete, mention it in the README.
Updates #2840
* fix typo/lint
* Update README.md
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
---------
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
2024-05-07 16:27:46 -07:00
Giuseppe Lumia
2a5302a1cf
Fix paste of text with line feed characters ( #3043 )
...
Some terminals may send line feed characters when pasting text with
newlines.
2024-05-07 15:26:07 -07:00
Michael Yang
ffbd3d173f
Merge pull request #3715 from ollama/mxyng/modelname-2
...
update list handler to use model.Name
2024-05-07 15:21:39 -07:00
Michael Yang
1e0a669f75
Merge pull request #3682 from ollama/mxyng/quantize-all-the-things
...
quantize any fp16/fp32 model
2024-05-07 15:20:49 -07:00
Bruce MacDonald
527e9be058
fix: store accurate model parameter size ( #4058 )
...
- add test for number formatting
- fix bug where 1B and 1M were not stored correctly
- display 2 decimal points for million param sizes
- display 1 decimal point for billion param sizes
2024-05-07 14:41:53 -07:00
Renat
34bea2e272
Add macai to list of Web & Desktop integrations ( #3881 )
2024-05-07 13:31:34 -07:00
Fernando Maclen
fe44ae3371
Update README.md ( #3884 )
2024-05-07 13:17:35 -07:00
Michael Yang
adeb40eaf2
Merge pull request #4231 from ollama/mxyng/parser
...
types/model: fix parser for empty values
2024-05-07 10:48:32 -07:00
Michael Yang
d7d33e5255
Merge pull request #951 from ollama/mxyng/example-fly
...
fly example
2024-05-07 10:46:24 -07:00
Michael Yang
63bc884e25
types/model: fix parser for empty values
2024-05-07 10:44:43 -07:00
Michael Yang
ef4e095d24
Merge pull request #4232 from ollama/revert-4190-fix/golang-ci
...
Revert "fix golangci workflow not enable gofmt and goimports"
2024-05-07 10:39:37 -07:00
Michael Yang
4d4f75a8a8
Revert "fix golangci workflow missing gofmt and goimports ( #4190 )"
...
This reverts commit 04f971c84b
.
2024-05-07 10:35:44 -07:00
Mélony QIN
3f71ba406a
Correct the kubernetes terminology ( #3843 )
...
* add details on kubernetes deployment and separate the testing process
* Update examples/kubernetes/README.md
thanks for suggesting this change, I agree with you and let's make this project better together !
Co-authored-by: JonZeolla <Zeolla@gmail.com>
---------
Co-authored-by: QIN Mélony <MQN1@dsone.3ds.com>
Co-authored-by: JonZeolla <Zeolla@gmail.com>
2024-05-07 09:53:08 -07:00
Hause Lin
88a67127d8
Update README.md to include ollama-r library ( #4012 )
...
* Update README.md
Add Ollama for R - ollama-r library
* Update README.md
---------
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
2024-05-07 09:52:30 -07:00
Jeffrey Morgan
f7dc7dcc64
Update .gitattributes
2024-05-07 09:50:19 -07:00
alwqx
04f971c84b
fix golangci workflow missing gofmt and goimports ( #4190 )
2024-05-07 09:49:40 -07:00
Michael Yang
548a7df014
update list handler to use model.Name
2024-05-07 09:38:45 -07:00
Michael Yang
70edb9bc4d
Merge pull request #4215 from ollama/mxyng/mem
...
llm: add minimum based on layer size
2024-05-07 09:26:33 -07:00
Michael Yang
3f0ed03856
Update examples/flyio/README.md
...
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
2024-05-07 09:25:01 -07:00
Michael Yang
4736391bfb
llm: add minimum based on layer size
2024-05-06 17:04:19 -07:00
CrispStrobe
7c5330413b
note on naming restrictions ( #2625 )
...
* note on naming restrictions
else push would fail with cryptic
retrieving manifest
Error: file does not exist
==> maybe change that in code too
* Update docs/import.md
---------
Co-authored-by: C-4-5-3 <154636388+C-4-5-3@users.noreply.github.com>
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
2024-05-06 16:03:21 -07:00
Jeffrey Morgan
39d9d22ca3
close server on receiving signal ( #4213 )
2024-05-06 16:01:37 -07:00
Jackie Li
af47413dba
Add MarshalJSON to Duration ( #3284 )
...
---------
Co-authored-by: Patrick Devine <patrick@infrahq.com>
2024-05-06 15:59:18 -07:00
Michael Yang
b2f00aa977
close zip files
2024-05-06 15:27:19 -07:00
Michael Yang
6694be5e50
convert/llama: use WriteSeeker
2024-05-06 15:24:01 -07:00
Michael Yang
f5e8b207fb
s/DisplayLongest/String/
2024-05-06 15:24:01 -07:00
Michael Yang
d245460362
only quantize language models
2024-05-06 15:24:01 -07:00
Michael Yang
4d0d0fa383
no iterator
2024-05-06 15:24:01 -07:00
Michael Yang
7ffe45734d
rebase
2024-05-06 15:24:01 -07:00
Michael Yang
01811c176a
comments
2024-05-06 15:24:01 -07:00
Michael Yang
a7248f6ea8
update tests
2024-05-06 15:24:01 -07:00
Michael Yang
9685c34509
quantize any fp16/fp32 model
...
- FROM /path/to/{safetensors,pytorch}
- FROM /path/to/fp{16,32}.bin
- FROM model:fp{16,32}
2024-05-06 15:24:01 -07:00
Jeffrey Chen
d091fe3c21
Windows automatically recognizes username ( #3214 )
2024-05-06 15:03:14 -07:00
Mohamed A. Fouad
ee02f548c8
Update linux.md ( #3847 )
...
Add -e to viewing logs in order to show end of ollama logs
2024-05-06 15:02:25 -07:00
Daniel Hiltgen
b08870aff3
Merge pull request #4188 from dhiltgen/use_our_lib
...
User our bundled libraries (cuda) instead of the host library
2024-05-06 14:41:05 -07:00
Darinka
3ecae420ac
Update api.md ( #3945 )
...
* Update api.md
Changed the calculation of tps (token/s) in the documentation
* Update docs/api.md
---------
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
2024-05-06 14:39:58 -07:00
Daniel Hiltgen
4cbbf0e13b
Merge pull request #4090 from dhiltgen/rocm_paths
...
Support Fedoras standard ROCm location
2024-05-06 14:33:41 -07:00
Daniel Hiltgen
380378cc80
Use our libraries first
...
Trying to live off the land for cuda libraries was not the right strategy. We need to use the version we compiled against to ensure things work properly
2024-05-06 14:23:29 -07:00
Daniel Hiltgen
0963c65027
Merge pull request #4208 from dhiltgen/fix_sched_test
...
Fix stale test logic
2024-05-06 14:23:12 -07:00
Jeffrey Morgan
ed740a2504
Fix no slots available
error with concurrent requests ( #4160 )
2024-05-06 14:22:53 -07:00
Jeffrey Morgan
c9f98622b1
Skip scheduling cancelled requests, always reload unloaded runners ( #4189 )
2024-05-06 14:22:24 -07:00
Daniel Hiltgen
0a954e5066
Fix stale test logic
...
The model processing was recently changed to be deferred but
this test scenario hadn't been adjusted for that change in behavior.
2024-05-06 14:15:37 -07:00
Adrien Brault
aa93423fbf
docs: pbcopy on mac ( #3129 )
2024-05-06 13:47:00 -07:00