ollama

Author	SHA1	Message	Date
Michael Yang	547132e820	bpe pretokenizer	2024-05-20 16:13:57 -07:00
Patrick Devine	2d315ba9a9	add missing file	2024-05-20 16:13:57 -07:00
Patrick Devine	d355d2020f	add fixes for llama	2024-05-20 16:13:57 -07:00
Patrick Devine	c8cf0d94ed	llama3 conversion	2024-05-20 16:13:57 -07:00
Patrick Devine	4730762e5c	add safetensors version	2024-05-20 16:13:57 -07:00
Patrick Devine	d88582dffd	some changes for llama3	2024-05-20 16:13:57 -07:00
Michael Yang	2f81b3dce2	Merge pull request #4502 from ollama/mxyng/fix-quantize fix quantize file types	2024-05-20 16:09:27 -07:00
jmorganca	5cab13739e	set llama.cpp submodule commit to `614d3b9`	2024-05-20 15:28:17 -07:00
Josh Yan	8aadad9c72	updated updateURL	2024-05-20 15:24:32 -07:00
Michael Yang	807d092761	fix quantize file types	2024-05-20 15:22:11 -07:00
Michael Yang	f36f1d6be9	tidy intermediate blobs	2024-05-20 15:15:06 -07:00
alwqx	8800c8a59b	chore: fix typo in docs (#4536)	2024-05-20 14:19:03 -07:00
Michael Yang	b4dce13309	Merge pull request #4330 from ollama/mxyng/cache-intermediate-layers cache and reuse intermediate blobs	2024-05-20 13:54:41 -07:00
Sam	e15307fdf4	feat: add support for flash_attn (#4120) * feat: enable flash attention if supported * feat: enable flash attention if supported * feat: enable flash attention if supported * feat: add flash_attn support	2024-05-20 13:36:03 -07:00
Michael Yang	3520c0e4d5	cache and reuse intermediate blobs particularly useful for zipfiles and f16s	2024-05-20 13:25:10 -07:00
Patrick Devine	ccdf0b2a44	Move the parser back + handle utf16 files (#4533)	2024-05-20 11:26:45 -07:00
jmorganca	63a453554d	`go mod tidy`	2024-05-19 23:03:57 -07:00
Patrick Devine	105186aa17	add OLLAMA_NOHISTORY to turn off history in interactive mode (#4508)	2024-05-18 11:51:57 -07:00
Daniel Hiltgen	ba04afc9a4	Merge pull request #4483 from dhiltgen/clean_exit Don't return error on signal exit	2024-05-17 11:41:57 -07:00
Daniel Hiltgen	7e1e0086e7	Merge pull request #4482 from dhiltgen/integration_improvements Skip max queue test on remote	2024-05-16 16:43:48 -07:00
Daniel Hiltgen	02b31c9dc8	Don't return error on signal exit	2024-05-16 16:25:38 -07:00
Daniel Hiltgen	7f2fbad736	Skip max queue test on remote This test needs to be able to adjust the queue size down from our default setting for a reliable test, so it needs to skip on remote test execution mode.	2024-05-16 16:24:18 -07:00
Josh	5bece94509	Merge pull request #4463 from ollama/jyan/line-display changed line display to be calculated with runewidth	2024-05-16 14:15:08 -07:00
Josh Yan	3d90156e99	removed comment	2024-05-16 14:12:03 -07:00
Rose Heart	5e46c5c435	Updating software for read me (#4467) * Update README.md Added chat/moderation bot to list of software. * Update README.md Fixed link error.	2024-05-16 13:55:14 -07:00
Jeffrey Morgan	583c1f472c	update llama.cpp submodule to `614d3b9` (#4414)	2024-05-16 13:53:09 -07:00
Josh Yan	26bfc1c443	go fmt'd cmd.go	2024-05-15 17:26:39 -07:00
Josh Yan	799aa9883c	go fmt'd cmd.go	2024-05-15 17:24:17 -07:00
Michael Yang	84ed77cbd8	Merge pull request #4436 from ollama/mxyng/done-part return on part done	2024-05-15 17:16:24 -07:00
Josh Yan	c9e584fb90	updated double-width display	2024-05-15 16:45:24 -07:00
Josh Yan	17b1e81ca1	fixed width and word count for double spacing	2024-05-15 16:29:33 -07:00
Daniel Hiltgen	7e9a2da097	Merge pull request #4462 from dhiltgen/opt_out_build Port cuda/rocm skip build vars to linux	2024-05-15 16:27:47 -07:00
Daniel Hiltgen	c48c1d7c46	Port cuda/rocm skip build vars to linux Windows already implements these, carry over to linux.	2024-05-15 15:56:43 -07:00
Patrick Devine	d1692fd3e0	fix the cpu estimatedTotal memory + get the expiry time for loading models (#4461)	2024-05-15 15:43:16 -07:00
Daniel Hiltgen	5fa36a0833	Merge pull request #4459 from dhiltgen/sanitize_env_log Sanitize the env var debug log	2024-05-15 14:58:55 -07:00
Daniel Hiltgen	853ae490e1	Sanitize the env var debug log Only dump env vars we care about in the logs	2024-05-15 14:42:57 -07:00
Patrick Devine	f2cf97d6f1	fix typo in modelfile generation (#4439)	2024-05-14 15:34:29 -07:00
Patrick Devine	c344da4c5a	fix keepalive for non-interactive mode (#4438)	2024-05-14 15:17:04 -07:00
Michael Yang	0e331c7168	Merge pull request #4328 from ollama/mxyng/mem count memory up to NumGPU if set by user	2024-05-14 13:47:44 -07:00
Michael Yang	ac145f75ca	return on part done	2024-05-14 13:04:30 -07:00
Patrick Devine	a4b8d1f89a	re-add system context (#4435)	2024-05-14 11:38:20 -07:00
Ryo Machida	798b107f19	Fixed the API endpoint /api/tags when the model list is empty. (#4424) * Fixed the API endpoint /api/tags to return {models: []} instead of {models: null} when the model list is empty. * Update server/routes.go --------- Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>	2024-05-14 11:18:10 -07:00
Daniel Hiltgen	6a1b471365	Merge pull request #4430 from dhiltgen/gpu_info Remove VRAM convergence check for windows	2024-05-14 10:59:06 -07:00
Daniel Hiltgen	ec231a7923	Remove VRAM convergence check for windows The APIs we query are optimistic on free space, and windows pages VRAM, so we don't have to wait to see reported usage recover on unload	2024-05-14 09:53:46 -07:00
Patrick Devine	7ca71a6b0f	don't abort when an invalid model name is used in /save (#4416)	2024-05-13 18:48:28 -07:00
Josh	7607e6e902	Merge pull request #4379 from WolfTheDeveloper/main Update `LlamaScript` to point to new link from Legacy link.	2024-05-13 18:08:32 -07:00
Patrick Devine	f1548ef62d	update the FAQ to be more clear about windows env variables (#4415)	2024-05-13 18:01:13 -07:00
Patrick Devine	6845988807	Ollama `ps` command for showing currently loaded models (#4327)	2024-05-13 17:17:36 -07:00
Josh	9eed4a90ce	Merge pull request #4411 from joshyan1/main removed inconsistent punctuation	2024-05-13 15:30:45 -07:00
Josh Yan	f8464785a6	removed inconsistencies	2024-05-13 14:50:52 -07:00

2762 commits