Michael Yang
72e7a49aa9
seek instead of copyn
2023-12-04 16:59:23 -08:00
Michael Yang
998f1785b6
add modelfamilies
2023-12-04 16:59:23 -08:00
Michael Yang
2cb0fa7d40
split from into one or more models
2023-12-04 16:59:23 -08:00
Michael Yang
7232f1fa41
go mod tidy
2023-12-04 16:59:23 -08:00
Michael Yang
a3737cbd33
use NewLayer for CreateBlobHandler
2023-12-04 16:59:23 -08:00
Michael Yang
70a93057cd
refactor layer creation
...
previous layer creation was not ideal because:
1. it required reading the input file multiple times, once to calculate
the sha256 checksum, another to write it to disk, and potentially one
more to decode the underlying gguf
2. used io.ReadSeeker which is prone to user error. if the file isn't
reset correctly or in the right place, it could end up reading an
empty file
there are also some brittleness when reading existing layers else
writing the inherited layers will error reading an already closed file
this commit aims to fix these issues by restructuring layer creation.
1. it will now write the layer to a temporary file as well as the hash
function and move it to the final location on Commit
2. layers are read once once when copied to the destination. exception
is raw model files which still requires a second read to decode the
model metadata
2023-12-04 16:59:23 -08:00
Michael Yang
b2816bca67
unnecessary ReadSeeker for DecodeGGML
2023-12-04 16:59:23 -08:00
Patrick Devine
bf704423c5
revert cli to use /api/generate ( #1383 )
2023-12-04 16:35:29 -08:00
Bruce MacDonald
7a0899d62d
chat api ( #991 )
...
- update chat docs
- add messages chat endpoint
- remove deprecated context and template generate parameters from docs
- context and template are still supported for the time being and will continue to work as expected
- add partial response to chat history
2023-12-04 18:01:06 -05:00
Michael Yang
0cca1486dd
Merge pull request #1376 from jmorganca/mxyng/rocky-install
...
install: fix rocky kernel packages
2023-12-04 14:23:43 -08:00
Patrick Devine
2113c9d31a
make linewrap still work when the terminal width has changed ( #1350 )
2023-12-04 14:14:56 -08:00
Michael Yang
95cb38ae47
install: fix rocky kernel packages
2023-12-04 11:10:42 -08:00
ruecat
1f126afb2d
Ollama Telegram Bot ( #1364 )
...
* Add "ollama-telegram" to Extensions & Plugins
* Update README.md
2023-12-03 11:19:55 -08:00
Jeffrey Morgan
f6201a7a6c
remove duplicate community integration in README.md
2023-12-02 21:18:13 -08:00
Michael Yang
b3f6c6598f
Merge pull request #1349 from jmorganca/mxyng/ctrl-z
...
handle ctrl+z
2023-12-01 16:21:49 -08:00
Michael Yang
88620e983a
handle ctrl+z
2023-12-01 16:15:20 -08:00
Michael Yang
cedae0d17a
Merge pull request #1347 from jshph/adapter-hash
...
Fix adapter loading from SHA hash
2023-12-01 11:08:25 -08:00
Joshua Pham
bb80a597db
Fix adapter loading from SHA hash
2023-12-01 13:50:55 -05:00
Patrick Devine
6681d37861
allow setting the system and template for prompts in the repl ( #1335 )
2023-12-01 09:28:35 -08:00
Michael Yang
0409c1fa59
docker: set PATH, LD_LIBRARY_PATH, and capabilities ( #1336 )
...
* docker: set PATH, LD_LIBRARY_PATH, and capabilities
* example: update k8s gpu manifest
2023-11-30 21:16:56 -08:00
Michael Yang
b56e92470a
Merge pull request #1229 from jmorganca/mxyng/calculate-as-you-go
...
revert checksum calculation to calculate-as-you-go
2023-11-30 10:54:38 -08:00
Jeffrey Morgan
5687f1a0cf
fix unexpected end of response
errors when cancelling in ollama run
2023-11-30 00:30:21 -05:00
James Radtke
7eda3d0c55
Corrected transposed 129 to 192 for OLLAMA_ORIGINS example ( #1325 )
2023-11-29 22:44:17 -05:00
Bruce MacDonald
7194a07d4d
Add chatd to example projects
2023-11-29 21:18:21 -05:00
Michael Yang
13efd5f218
upload: fix PUT retry
2023-11-29 16:38:35 -08:00
Michael Yang
c4bdfffd96
upload: separate progress tracking
2023-11-29 16:38:33 -08:00
Michael Yang
26c63418e0
new hasher
2023-11-29 14:52:41 -08:00
Michael Yang
2799784ac8
revert checksum calculation to calculate-as-you-go
2023-11-29 13:47:58 -08:00
Alec Hammond
91897a606f
Add OllamaEmbeddings to python LangChain example ( #994 )
...
* Add OllamaEmbeddings to python LangChain example
* typo
---------
Co-authored-by: Alec Hammond <alechammond@fb.com>
2023-11-29 16:25:39 -05:00
Bruce MacDonald
96122b7271
validate model tags on copy ( #1323 )
2023-11-29 15:54:29 -05:00
jeremiahbuckley
39be7fdb98
fix rhel cuda install ( #1321 )
...
Co-authored-by: Cloud User <azureuser@testgpu2.hqzwom21okjenksna4y3c4ymjd.phxx.internal.cloudapp.net>
2023-11-29 14:55:15 -05:00
Timothy Jaeryang Baek
c2e3b89176
fix: disable ':' in tag names ( #1280 )
...
Co-authored-by: rootedbox
2023-11-29 13:33:45 -05:00
Patrick Devine
cde31cb220
Allow setting parameters in the REPL ( #1294 )
2023-11-29 09:56:42 -08:00
ToasterUwU
63097607b2
Correct MacOS Host port example ( #1301 )
2023-11-29 11:44:03 -05:00
Michael
2ae80e1e27
Update README.md
...
add new recent models as examples
2023-11-28 22:16:37 -05:00
Michael Yang
b173cfc558
Merge pull request #1195 from jmorganca/mxyng/fix-bar-rate
...
progress: fix bar rate
2023-11-28 11:55:23 -08:00
Michael Yang
424d53ac70
progress: fix bar rate
2023-11-28 11:44:56 -08:00
ftorto
e1a69d44c9
Update faq.md ( #1299 )
...
Fix a typo in the CA update command
2023-11-28 09:54:42 -05:00
Jason Jacobs
3d620f9462
ignore jetbrain ides ( #1287 )
2023-11-27 15:57:45 -05:00
Bruce MacDonald
928950fcc6
update python client create example ( #1227 )
...
* add remote create to python example client
2023-11-27 15:36:19 -05:00
Kasumi
39c6d949fc
Add Amica to community integrations ( #1281 )
2023-11-27 10:44:37 -05:00
Jeffrey Morgan
16a9006306
add back f16c
instructions on intel mac
2023-11-26 15:59:49 -05:00
Jeffrey Morgan
e9216ea459
fix readline history on linux
2023-11-26 15:59:04 -05:00
Jeffrey Morgan
9e4a316405
update submodule commit
2023-11-26 14:52:00 -05:00
Jeffrey Morgan
9fb5e8399c
Fix issues with inputting and formatting multi line strings in ollama run
...
Co-authored-by: Wen Sun <iwendellsun@gmail.com>
2023-11-26 12:54:29 -05:00
Jing Zhang
82b9b329ff
windows CUDA support ( #1262 )
...
* Support cuda build in Windows
* Enable dynamic NumGPU allocation for Windows
2023-11-24 17:16:36 -05:00
Jongwook Choi
12e8c12d2b
Disable CUDA peer access as a workaround for multi-gpu inference bug ( #1261 )
...
When CUDA peer access is enabled, multi-gpu inference will produce
garbage output. This is a known bug of llama.cpp (or nvidia). Until the
upstream bug is fixed, we can disable CUDA peer access temporarily
to ensure correct output.
See #961 .
2023-11-24 14:05:57 -05:00
Jeffrey Morgan
d77dde126b
consistent cpu instructions on macos and linux
2023-11-22 16:26:46 -05:00
Michael Yang
c7e70cd3bb
Merge pull request #1245 from jmorganca/mxyng/gguf-int
...
fix: gguf int type
2023-11-22 11:42:56 -08:00
Michael Yang
199941cd15
fix: gguf int type
2023-11-22 11:40:30 -08:00