Michael Yang
5e9db9fb0b
refactor convert
2024-07-31 15:58:33 -07:00
Michael Yang
ec4c35fe99
Merge pull request #5512 from ollama/mxyng/detect-stop
...
autodetect stop parameters from template
2024-07-26 13:48:23 -07:00
Jeffrey Morgan
b3e5491e41
server: collect nested tool call objects when parsing (#5824)
2024-07-22 12:38:03 -04:00
Michael Yang
43606d6d6a
fix parsing tool calls
2024-07-18 12:08:11 -07:00
Michael Yang
b255445557
marshal json automatically for some template values (#5758)
2024-07-17 15:35:11 -07:00
Michael Yang
5fd6988126
parse tool call as individual objects
2024-07-17 11:19:04 -07:00
Michael Yang
5a83f79afd
remove unneeded tool calls
2024-07-16 13:48:45 -07:00
Michael Yang
5afbb60fc4
fix unmarshal type errors
2024-07-16 11:39:34 -07:00
Michael Yang
d02bbebb11
tools
2024-07-15 15:26:16 -07:00
Michael Yang
ebc529cbb3
autodetect stop parameters from template
2024-07-12 16:01:23 -07:00
Michael Yang
dddb58a38b
Merge pull request #5051 from ollama/mxyng/capabilities
...
add model capabilities
2024-07-02 14:26:07 -07:00
Michael Yang
88bcd79bb9
err on insecure path
2024-07-01 15:55:59 -07:00
Michael Yang
58e3fff311
rename templates to template
2024-07-01 10:40:54 -07:00
Michael Yang
123a722a6f
zip: prevent extracting files into parent dirs (#5314)
2024-06-26 21:38:21 -07:00
Blake Mizerany
cb42e607c5
llm: speed up gguf decoding by a lot (#5246)
...
Previously, some costly things were causing the loading of GGUF files
and their metadata and tensor information to be VERY slow:
* Too many allocations when decoding strings
* Hitting disk for each read of each key and value, resulting in a
  not-okay amount of syscalls/disk I/O.
The show API is now down to 33ms from 800ms+ for llama3 on a macbook pro
m3.
This commit also prevents collecting large arrays of values when
decoding GGUFs (if desired). When such keys are encountered, their
values are null, and are encoded as such in JSON.
Also, this fixes a broken test that was not encoding valid GGUF.
2024-06-24 21:47:52 -07:00
Michael Yang
c16f8af911
fix: multiple templates when creating from model
...
multiple templates may appear in a model if a model is created from
another model that 1) has an autodetected template and 2) defines a
custom template
2024-06-12 13:35:49 -07:00
Michael Yang
d61ef8b954
update create handler to use model.Name
2024-06-04 13:28:25 -07:00
Michael Yang
e40145a39d
lint
2024-06-04 11:13:30 -07:00
Michael Yang
f36f1d6be9
tidy intermediate blobs
2024-05-20 15:15:06 -07:00
Michael Yang
3520c0e4d5
cache and reuse intermediate blobs
...
particularly useful for zipfiles and f16s
2024-05-20 13:25:10 -07:00
Michael Yang
b2f00aa977
close zip files
2024-05-06 15:27:19 -07:00
Michael Yang
f5e8b207fb
s/DisplayLongest/String/
2024-05-06 15:24:01 -07:00
Michael Yang
4d0d0fa383
no iterator
2024-05-06 15:24:01 -07:00
Michael Yang
01811c176a
comments
2024-05-06 15:24:01 -07:00
Michael Yang
9685c34509
quantize any fp16/fp32 model
...
- FROM /path/to/{safetensors,pytorch}
- FROM /path/to/fp{16,32}.bin
- FROM model:fp{16,32}
2024-05-06 15:24:01 -07:00