ollama

Author	SHA1	Message	Date
Jesse Gross	97ec8cfd4e	image: Clarify argument to WriteManifest is config When creating a model the config layer is appended to the list of layers and then the last layer is used as the config when writing the manifest. This change directly uses the config layer to write the manifest. There is no behavior change but it is less error prone.	2024-08-07 16:58:42 -07:00
royjhan	5b3a21b578	add metrics to docs (#6079)	2024-08-07 14:43:44 -07:00
Kyle Kelley	ad0c19dde4	Use llama3.1 in tools example (#5985) * Use llama3.1 in tools example * Update api.md	2024-08-07 17:20:50 -04:00
Jesse Gross	69eb06c40e	Merge pull request #6145 from ollama/jessegross/bug5840 Fix crash on startup when trying to clean up unused files (#5840)	2024-08-07 11:24:15 -07:00
Jesse Gross	1829fb61bd	manifest: Fix crash on startup when trying to clean up unused files (#5840) Currently if the config field is missing in the manifest file (or corrupted), Ollama will crash when it tries to read it. This can happen at startup or when pulling new models. This data is mostly just used for showing model information so we can be tolerant of it not being present - it is not required to run the models. Besides avoiding crashing, this also gives us the ability to restructure the config in the future by pulling it into the main manifest file.	2024-08-07 10:30:44 -07:00
Jesse Gross	685a53534b	manifest: Don't prune layers if we can't open a manifest file If there is an error when opening a manifest file (corrupted, permission denied, etc.) then the referenced layers will not be included in the list of active layers. This causes them to be deleted when pruning happens at startup or a model is pulled. In such a situation, we should prefer to preserve data in the hopes that it can be recovered rather than being aggressive about deletion.	2024-08-06 23:11:19 -07:00
Jeffrey Morgan	de4fc29773	llm: reserve required number of slots for embeddings (#6219)	2024-08-06 23:20:49 -04:00
Jeffrey Morgan	e04c7012c2	update llama.cpp submodule to `1e6f6554` (#6208)	2024-08-06 15:11:45 -04:00
Chua Chee Seng	d4a7216c82	Fixed invalid option provided not displaying the invalid option name problem. (#6202)	2024-08-06 14:37:16 -04:00
Daniel Hiltgen	a4fdd03c3b	Merge pull request #6207 from dhiltgen/sparse_win Ensure sparse files on windows during download	2024-08-06 11:06:06 -07:00
Daniel Hiltgen	fc85f50a2b	Ensure sparse files on windows during download The file.Truncate call on windows will write the whole file unless you set the sparse flag, leading to heavy I/O at the beginning of download. This should improve our I/O behavior on windows and put less stress on the user's disk.	2024-08-06 10:58:08 -07:00
royjhan	86b907f82a	sort batch results (#6189)	2024-08-05 16:55:34 -07:00
Michael Yang	10d49bce70	Merge pull request #6190 from ollama/mxyng/fix-integration fix concurrency test	2024-08-05 16:45:49 -07:00
Michael Yang	7ed367419e	fix concurrency test	2024-08-05 16:36:16 -07:00
Daniel Hiltgen	50ee8b5f56	Merge pull request #6186 from dhiltgen/numa Implement linux NUMA detection	2024-08-05 15:20:06 -07:00
Michael Yang	03bdac0595	Merge pull request #6146 from ollama/mxyng/testing use testing tempdirs	2024-08-05 13:00:05 -07:00
Daniel Hiltgen	f457d63400	Implement linux NUMA detection If the system has multiple numa nodes, enable numa support in llama.cpp. If we detect numactl in the path, use that, else use the basic "distribute" mode.	2024-08-05 12:56:20 -07:00
Michael Yang	39f2bc6bfc	Merge pull request #6167 from ollama/mxyng/line-feed line feed	2024-08-05 00:06:28 -07:00
frob	b73b0940ef	Disable paging for journalctl (#6154) Users using `journalctl` to get logs for issue logging sometimes don't realize that paging is causing information to be missed.	2024-08-05 00:10:53 -04:00
Michael Yang	6a07344786	line feed	2024-08-04 17:25:41 -07:00
sryu1	8b920f35a4	Add Gemma 2 2b (#6151)	2024-08-04 10:58:39 -04:00
Ivan Charapanau	4221e39867	Reference ollama integration with Harbor (#6147)	2024-08-02 17:03:46 -07:00
Michael Yang	a091fadfda	use testing tempdirs	2024-08-02 16:04:06 -07:00
Michael Yang	77ccbf04dc	Merge pull request #6128 from ollama/mxyng/lint enable gofmt/gofumpt/goimports/tenv	2024-08-02 14:58:40 -07:00
royjhan	4addf6b587	Update OpenAI Compatibility Docs with /v1/completions (#5311) * Update docs * token bug corrected * Update docs/openai.md * Update docs/openai.md * add suffix * merge conflicts * merge conflicts	2024-08-02 13:16:23 -07:00
royjhan	85c7f11170	Update docs (#5310)	2024-08-02 13:05:57 -07:00
Michael Yang	b732beba6a	lint	2024-08-01 17:06:06 -07:00
Kim Hallberg	ce1fb4447e	Fix models/{model} URL (#6132)	2024-08-01 16:31:47 -07:00
royjhan	558a54b098	Update OpenAI Compatibility Docs with /v1/embeddings (#5470) * docs without usage * no usage * rm metric note	2024-08-01 16:00:29 -07:00
royjhan	ed52833bb1	Add to docs (#5309)	2024-08-01 15:58:13 -07:00
royjhan	6f133a0bdd	OpenAI: Add Usage to `v1/embeddings` (#5886) * add prompt tokens to embed response * rm slog * metrics * types * prompt n * clean up * reset submodule * add tokens to v1/embeddings * separate usage	2024-08-01 15:49:37 -07:00
royjhan	f561eecfb8	Update OpenAI Compatibility Docs with /v1/models (#5151) * OpenAI Docs * Update docs/openai.md Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com> * Remove newline --------- Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>	2024-08-01 15:48:44 -07:00
Michael Yang	ff7c9060ec	Merge pull request #6115 from slouffka/fix-context Fix context in /api/generate grows too much (#5980).	2024-08-01 15:13:59 -07:00
Michael Yang	0ff42e84b0	Merge pull request #4756 from ollama/mxyng/convert2 refactor convert	2024-08-01 14:16:30 -07:00
Vyacheslav Moskalev	8a9f946ca7	Refactor and format code.	2024-08-02 03:50:05 +07:00
Vyacheslav Moskalev	3b5210548e	Refactor code. Remove extra variable.	2024-08-01 19:56:15 +07:00
Vyacheslav Moskalev	b0c216584c	Better types and naming closer to style.	2024-08-01 19:43:44 +07:00
Vyacheslav Moskalev	49a5483139	Change the order of context and prompt.	2024-08-01 19:25:56 +07:00
Vyacheslav Moskalev	6bc5c13758	Fix extra context concatenation in generate handler (#5980).	2024-08-01 15:45:58 +07:00
Michael Yang	3e614260af	Merge pull request #6109 from ollama/mxyng/fix-modelfile fix modelfile message quotes	2024-07-31 17:05:43 -07:00
Michael Yang	d87b4a488e	fix modelfile message quotes	2024-07-31 16:52:09 -07:00
Michael Yang	4c14855ad7	Merge pull request #6106 from ollama/mxyng/default-sliding-window-attention patches: phi3 optional sliding window attention	2024-07-31 16:12:06 -07:00
Blake Mizerany	dc77bbcfa4	server: fix json marshalling of downloadBlobPart (#6108 )	2024-07-31 16:01:24 -07:00
Michael Yang	d8e2664c33	convert: fix parse functions	2024-07-31 15:58:55 -07:00
Michael Yang	eafc607abb	convert: only extract large files	2024-07-31 15:58:55 -07:00
Michael Yang	781fc2d576	Update convert/reader_safetensors.go Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>	2024-07-31 15:58:55 -07:00
Michael Yang	df993fa37b	comments	2024-07-31 15:58:55 -07:00
Michael Yang	5e9db9fb0b	refactor convert	2024-07-31 15:58:33 -07:00
Michael Yang	0f3271db88	patches: phi3 default sliding window attention	2024-07-31 14:58:34 -07:00
Michael Yang	6b252918fb	update convert test to check result data	2024-07-31 10:59:38 -07:00

3299 commits