ollama

Author	SHA1	Message	Date
royjhan	8b00a415ab	Load Embedding Model on Empty Input (#6325) * load on empty input * no load on invalid input	2024-08-13 10:19:56 -07:00
Josh	980dd15f81	cmd: speed up gguf creates (#6324)	2024-08-12 11:46:09 -07:00
Josh	1dc3ef3aa9	Revert "server: speed up single gguf creates (#5898)" (#6323) This reverts commit `8aac22438e`.	2024-08-12 09:57:51 -07:00
Josh	8aac22438e	server: speed up single gguf creates (#5898)	2024-08-12 09:28:55 -07:00
Jeffrey Morgan	15c2d8fe14	server: parallelize embeddings in API web handler instead of in subprocess runner (#6220) For simplicity, perform parallelization of embedding requests in the API handler instead of offloading this to the subprocess runner. This keeps the scheduling story simpler as it builds on existing parallel requests, similar to existing text completion functionality.	2024-08-11 11:57:10 -07:00
Jesse Gross	9b53e39d8e	Merge pull request #6258 from coolljt0725/fix_typo server/download.go: Fix a typo in log	2024-08-09 17:19:48 -07:00
Daniel Hiltgen	2fa1db4345	Don't hard fail on sparse setup error It seems this can fail in some cases, but proceed with the download anyway.	2024-08-09 12:16:19 -07:00
Jitang Lei	7b61eba471	server/download.go: Fix a typo in log Signed-off-by: Jitang Lei <leijitang@outlook.com>	2024-08-08 20:28:01 +08:00
Jesse Gross	7edaf6e7e8	manifest: Store layers inside manifests consistently as values. Commit `1829fb61` ("manifest: Fix crash on startup when trying to clean up unused files (#5840)") changed the config layer stored in manifests from a pointer to a value. This was done in order to avoid potential nil pointer dereferences after it is deserialized from JSON in the event that the field is missing. This changes the Layers slice to also be stored by value. This enables consistency in handling across the two objects.	2024-08-07 17:03:06 -07:00
Jesse Gross	97ec8cfd4e	image: Clarify argument to WriteManifest is config When creating a model the config layer is appended to the list of layers and then the last layer is used as the config when writing the manifest. This change directly uses the config layer to write the manifest. There is no behavior change but it is less error prone.	2024-08-07 16:58:42 -07:00
Jesse Gross	1829fb61bd	manifest: Fix crash on startup when trying to clean up unused files (#5840) Currently if the config field is missing in the manifest file (or corrupted), Ollama will crash when it tries to read it. This can happen at startup or when pulling new models. This data is mostly just used for showing model information so we can be tolerant of it not being present - it is not required to run the models. Besides avoiding crashing, this also gives us the ability to restructure the config in the future by pulling it into the main manifest file.	2024-08-07 10:30:44 -07:00
Jesse Gross	685a53534b	manifest: Don't prune layers if we can't open a manifest file If there is an error when opening a manifest file (corrupted, permission denied, etc.) then the referenced layers will not be included in the list of active layers. This causes them to be deleted when pruning happens at startup or a model is pulled. In such a situation, we should prefer to preserve data in the hopes that it can be recovered rather than being aggressive about deletion.	2024-08-06 23:11:19 -07:00
Daniel Hiltgen	fc85f50a2b	Ensure sparse files on Windows during download The file.Truncate call on Windows will write the whole file unless you set the sparse flag, leading to heavy I/O at the beginning of download. This should improve our I/O behavior on Windows and put less stress on the user's disk.	2024-08-06 10:58:08 -07:00
Michael Yang	a091fadfda	use testing tempdirs	2024-08-02 16:04:06 -07:00
Michael Yang	b732beba6a	lint	2024-08-01 17:06:06 -07:00
Michael Yang	ff7c9060ec	Merge pull request #6115 from slouffka/fix-context Fix context in /api/generate grows too much (#5980).	2024-08-01 15:13:59 -07:00
Michael Yang	0ff42e84b0	Merge pull request #4756 from ollama/mxyng/convert2 refactor convert	2024-08-01 14:16:30 -07:00
Vyacheslav Moskalev	8a9f946ca7	Refactor and format code.	2024-08-02 03:50:05 +07:00
Vyacheslav Moskalev	3b5210548e	Refactor code. Remove extra variable.	2024-08-01 19:56:15 +07:00
Vyacheslav Moskalev	b0c216584c	Better types and naming closer to style.	2024-08-01 19:43:44 +07:00
Vyacheslav Moskalev	49a5483139	Change the order of context and prompt.	2024-08-01 19:25:56 +07:00
Vyacheslav Moskalev	6bc5c13758	Fix extra context concatenation in generate handler (#5980).	2024-08-01 15:45:58 +07:00
Michael Yang	d87b4a488e	fix modelfile message quotes	2024-07-31 16:52:09 -07:00
Blake Mizerany	dc77bbcfa4	server: fix json marshalling of downloadBlobPart (#6108)	2024-07-31 16:01:24 -07:00
Michael Yang	eafc607abb	convert: only extract large files	2024-07-31 15:58:55 -07:00
Michael Yang	df993fa37b	comments	2024-07-31 15:58:55 -07:00
Michael Yang	5e9db9fb0b	refactor convert	2024-07-31 15:58:33 -07:00
Michael Yang	c4c84b7a0d	Merge pull request #5196 from ollama/mxyng/messages-2 include modelfile messages	2024-07-31 10:18:17 -07:00
Michael Yang	5c1912769e	Merge pull request #5473 from ollama/mxyng/environ fix: environ lookup	2024-07-31 10:18:05 -07:00
royjhan	1b44d873e7	Add Metrics to `api/embed` response (#5709) * add prompt tokens to embed response * rm slog * metrics * types * prompt n * clean up * reset submodule * update tests * test name * list metrics	2024-07-30 13:12:21 -07:00
Daniel Hiltgen	345420998e	Prevent partial loading on mixed GPU brands In multi-brand GPU setups, if we couldn't fully load the model we would fall through the scheduler and mistakenly try to load across a mix of brands. This makes sure we find the set of GPU(s) that best fit for the partial load.	2024-07-30 11:00:55 -07:00
Michael Yang	079b2c3b03	Merge pull request #5999 from ollama/mxyng/fix-push fix nil deref in auth.go	2024-07-26 14:28:34 -07:00
Blake Mizerany	750c1c55f7	server: fix race conditions during download (#5994) This fixes various data races scattered throughout the download/pull client where the client was accessing the download state concurrently. This commit is mostly a hot-fix and will be replaced by a new client one day soon. Also, remove the unnecessary opts argument from downloadChunk.	2024-07-26 14:24:24 -07:00
Michael Yang	a622c47bd3	fix nil deref in auth.go	2024-07-26 14:14:48 -07:00
Michael Yang	ec4c35fe99	Merge pull request #5512 from ollama/mxyng/detect-stop autodetect stop parameters from template	2024-07-26 13:48:23 -07:00
Michael Yang	15af558423	include modelfile messages	2024-07-26 11:40:11 -07:00
Blake Mizerany	c8af3c2d96	server: reuse original download URL for images (#5962) This changes the registry client to reuse the original download URL it gets on the first redirect response for all subsequent requests, preventing thundering herd issues when hot new LLMs are released.	2024-07-25 15:58:30 -07:00
Josh	db0968f30c	fix dupe err message (#5857)	2024-07-22 15:48:15 -07:00
Michael Yang	85d9d73a72	comments	2024-07-22 11:49:03 -07:00
Michael Yang	1954ec5917	uint64	2024-07-22 11:49:02 -07:00
Michael Yang	0f1910129f	int	2024-07-22 11:30:07 -07:00
Michael Yang	8570c1c0ef	keepalive	2024-07-22 11:27:22 -07:00
Michael Yang	55cd3ddcca	bool	2024-07-22 11:27:21 -07:00
Michael Yang	66fe77f084	models	2024-07-22 11:26:12 -07:00
Michael Yang	d1a5227cad	origins	2024-07-22 11:25:30 -07:00
Michael Yang	35b89b2eab	rfc: dynamic environ lookup	2024-07-22 11:25:30 -07:00
Jeffrey Morgan	b3e5491e41	server: collect nested tool call objects when parsing (#5824)	2024-07-22 12:38:03 -04:00
Jeffrey Morgan	80ee9b5e47	Remove out of space test temporarily (#5825)	2024-07-21 00:22:11 -04:00
Daniel Hiltgen	06e5d74e34	Merge pull request #5506 from dhiltgen/sched_tests Refine scheduler unit tests for reliability	2024-07-20 15:48:39 -07:00
Jeffrey Morgan	69a2d4ccff	Fix generate test flakiness (#5804)	2024-07-19 19:11:25 -07:00

703 commits