ollama

Author	SHA1	Message	Date
Josh	980dd15f81	cmd: speed up gguf creates (#6324)	2024-08-12 11:46:09 -07:00
Josh	1dc3ef3aa9	Revert "server: speed up single gguf creates (#5898)" (#6323) This reverts commit `8aac22438e`.	2024-08-12 09:57:51 -07:00
Josh	8aac22438e	server: speed up single gguf creates (#5898)	2024-08-12 09:28:55 -07:00
Jesse Gross	7edaf6e7e8	manifest: Store layers inside manifests consistently as values. Commit `1829fb61` ("manifest: Fix crash on startup when trying to clean up unused files (#5840)") changed the config layer stored in manifests from a pointer to a value. This was done in order to avoid potential nil pointer dereferences after it is deserialized from JSON in the event that the field is missing. This changes the Layers slice to also be stored by value. This enables consistency in handling across the two objects.	2024-08-07 17:03:06 -07:00
Michael Yang	eafc607abb	convert: only extract large files	2024-07-31 15:58:55 -07:00
Michael Yang	5e9db9fb0b	refactor convert	2024-07-31 15:58:33 -07:00
Michael Yang	ec4c35fe99	Merge pull request #5512 from ollama/mxyng/detect-stop: autodetect stop parameters from template	2024-07-26 13:48:23 -07:00
Jeffrey Morgan	b3e5491e41	server: collect nested tool call objects when parsing (#5824)	2024-07-22 12:38:03 -04:00
Michael Yang	43606d6d6a	fix parsing tool calls	2024-07-18 12:08:11 -07:00
Michael Yang	b255445557	marshal json automatically for some template values (#5758)	2024-07-17 15:35:11 -07:00
Michael Yang	5fd6988126	parse tool call as individual objects	2024-07-17 11:19:04 -07:00
Michael Yang	5a83f79afd	remove unneeded tool calls	2024-07-16 13:48:45 -07:00
Michael Yang	5afbb60fc4	fix unmarshal type errors	2024-07-16 11:39:34 -07:00
Michael Yang	d02bbebb11	tools	2024-07-15 15:26:16 -07:00
Michael Yang	ebc529cbb3	autodetect stop parameters from template	2024-07-12 16:01:23 -07:00
Michael Yang	dddb58a38b	Merge pull request #5051 from ollama/mxyng/capabilities: add model capabilities	2024-07-02 14:26:07 -07:00
Michael Yang	88bcd79bb9	err on insecure path	2024-07-01 15:55:59 -07:00
Michael Yang	58e3fff311	rename templates to template	2024-07-01 10:40:54 -07:00
Michael Yang	123a722a6f	zip: prevent extracting files into parent dirs (#5314)	2024-06-26 21:38:21 -07:00
Blake Mizerany	cb42e607c5	llm: speed up gguf decoding by a lot (#5246). Previously, some costly things were causing the loading of GGUF files and their metadata and tensor information to be VERY slow: (1) too many allocations when decoding strings; (2) hitting disk for each read of each key and value, resulting in a not-okay amount of syscalls/disk I/O. The show API is now down to 33ms from 800ms+ for llama3 on a MacBook Pro M3. This commit also prevents collecting large arrays of values when decoding GGUFs (if desired). When such keys are encountered, their values are null, and are encoded as such in JSON. Also, this fixes a broken test that was not encoding valid GGUF.	2024-06-24 21:47:52 -07:00
Michael Yang	c16f8af911	fix: multiple templates when creating from model. Multiple templates may appear in a model if a model is created from another model that 1) has an autodetected template and 2) defines a custom template.	2024-06-12 13:35:49 -07:00
Michael Yang	d61ef8b954	update create handler to use model.Name	2024-06-04 13:28:25 -07:00
Michael Yang	e40145a39d	lint	2024-06-04 11:13:30 -07:00
Michael Yang	f36f1d6be9	tidy intermediate blobs	2024-05-20 15:15:06 -07:00
Michael Yang	3520c0e4d5	cache and reuse intermediate blobs; particularly useful for zipfiles and f16s	2024-05-20 13:25:10 -07:00
Michael Yang	b2f00aa977	close zip files	2024-05-06 15:27:19 -07:00
Michael Yang	f5e8b207fb	s/DisplayLongest/String/	2024-05-06 15:24:01 -07:00
Michael Yang	4d0d0fa383	no iterator	2024-05-06 15:24:01 -07:00
Michael Yang	01811c176a	comments	2024-05-06 15:24:01 -07:00
Michael Yang	9685c34509	quantize any fp16/fp32 model: FROM /path/to/{safetensors,pytorch}; FROM /path/to/fp{16,32}.bin; FROM model:fp{16,32}	2024-05-06 15:24:01 -07:00

30 commits
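
The message of commit `7edaf6e7e8` explains that storing manifest layers as values rather than pointers avoids nil pointer dereferences when a field is missing from the JSON. A small sketch can illustrate that difference. The type names, field names, and JSON shape below are assumptions made for illustration and are not ollama's actual definitions.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Layer is a hypothetical stand-in for a manifest layer.
type Layer struct {
	Digest string `json:"digest"`
	Size   int64  `json:"size"`
}

// PointerManifest stores layers by pointer: a missing or null entry decodes to nil.
type PointerManifest struct {
	Config *Layer   `json:"config"`
	Layers []*Layer `json:"layers"`
}

// ValueManifest stores layers by value: a missing or null entry decodes to a zero Layer.
type ValueManifest struct {
	Config Layer   `json:"config"`
	Layers []Layer `json:"layers"`
}

func decodePointer(data []byte) (PointerManifest, error) {
	var m PointerManifest
	err := json.Unmarshal(data, &m)
	return m, err
}

func decodeValue(data []byte) (ValueManifest, error) {
	var m ValueManifest
	err := json.Unmarshal(data, &m)
	return m, err
}

func main() {
	// The "config" field is absent and one layer entry is null.
	data := []byte(`{"layers":[{"digest":"sha256:abc","size":3},null]}`)

	p, err := decodePointer(data)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	// Reading p.Config.Digest or p.Layers[1].Size here would panic.
	fmt.Println("pointer config is nil:", p.Config == nil)
	fmt.Println("pointer layer 1 is nil:", p.Layers[1] == nil)

	v, err := decodeValue(data)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	// With values, every field access is safe; missing data is the zero value.
	var total int64
	for _, l := range v.Layers {
		total += l.Size
	}
	fmt.Println("value config digest is empty:", v.Config.Digest == "")
	fmt.Println("total layer size:", total)
}
```

The trade-off is that a value field cannot distinguish "absent" from "present but empty", so code that needs that distinction has to check for the zero value explicitly.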