Daniel Hiltgen
f77713bf1f
Add isolated gpu test to troubleshooting
2024-05-23 09:33:25 -07:00
Jeffrey Morgan
38255d2af1
Use flash attention flag for now ( #4580 )
* put flash attention behind flag for now
* add test
* remove print
* up timeout for scheduler tests
2024-05-22 21:52:09 -07:00
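A minimal sketch of how an opt-in flag like the one described in the commit above might be read; the OLLAMA_FLASH_ATTENTION variable name and the helper below are assumptions drawn from the commit title, not the actual server code.

```go
package envconfig

import (
	"os"
	"strings"
)

// flashAttentionEnabled reports whether the user has opted in to flash
// attention; while the feature is behind a flag it stays off by default.
func flashAttentionEnabled() bool {
	v := strings.ToLower(os.Getenv("OLLAMA_FLASH_ATTENTION"))
	return v == "1" || v == "true"
}
```

When enabled, the server would presumably pass the corresponding flash-attention option down to the llama.cpp runner.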
Michael
73630a7e85
add phi 3 medium ( #4578 )
2024-05-22 12:53:45 -04:00
Ikko Eltociear Ashimine
955c317cab
chore: update tokenizer.go ( #4571 )
PreTokenziers -> PreTokenizers
2024-05-22 00:25:23 -07:00
Josh
9f18b88a06
Merge pull request #4566 from ollama/jyan/shortcuts
add Ctrl + W shortcut
2024-05-21 22:49:36 -07:00
Josh Yan
353f83a9c7
add Ctrl + W shortcut
2024-05-21 16:55:09 -07:00
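Ctrl + W conventionally erases the word before the cursor in a line editor. A sketch of that behavior over a simplified rune buffer, which is an assumption rather than ollama's actual readline code:

```go
// deleteWordBefore removes the word immediately to the left of the cursor,
// returning the edited line and the new cursor position.
func deleteWordBefore(line []rune, cursor int) ([]rune, int) {
	i := cursor
	for i > 0 && line[i-1] == ' ' { // skip spaces left of the cursor
		i--
	}
	for i > 0 && line[i-1] != ' ' { // then consume the previous word
		i--
	}
	return append(line[:i], line[cursor:]...), i
}
```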
Patrick Devine
3bade04e10
doc updates for the faq/troubleshooting ( #4565 )
2024-05-21 15:30:09 -07:00
Michael Yang
a6d0f443eb
Merge pull request #4543 from ollama/mxyng/simple-safetensors
simplify safetensors reading
2024-05-21 14:43:55 -07:00
Michael Yang
96236b7968
Merge pull request #4268 from ollama/pdevine/llama3
Convert directly from llama3
2024-05-21 14:43:37 -07:00
Sang Park
4434d7f447
Correct typo in error message ( #4535 )
Corrects the spelling of the term "request", which was previously mistakenly written as "requeset" in the error log message.
2024-05-21 13:39:01 -07:00
Michael Yang
171eb040fc
simplify safetensors reading
2024-05-21 11:28:22 -07:00
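For context, safetensors files begin with an 8-byte little-endian header length followed by a JSON table describing each tensor. A sketch of reading that header, following the published format rather than the converter code in this commit:

```go
package convert

import (
	"encoding/binary"
	"encoding/json"
	"io"
)

type tensorEntry struct {
	Dtype       string    `json:"dtype"`
	Shape       []uint64  `json:"shape"`
	DataOffsets [2]uint64 `json:"data_offsets"`
}

func readSafetensorsHeader(r io.Reader) (map[string]tensorEntry, error) {
	var n uint64
	if err := binary.Read(r, binary.LittleEndian, &n); err != nil {
		return nil, err
	}
	buf := make([]byte, n)
	if _, err := io.ReadFull(r, buf); err != nil {
		return nil, err
	}
	header := make(map[string]tensorEntry)
	if err := json.Unmarshal(buf, &header); err != nil {
		return nil, err
	}
	delete(header, "__metadata__") // optional metadata entry, not a tensor
	return header, nil
}
```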
Michael Yang
3591bbe56f
add test
2024-05-21 11:28:22 -07:00
Michael Yang
34d5ef29b3
fix conversion for f16 or f32 inputs
2024-05-21 11:28:22 -07:00
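Handling both f16 and f32 inputs usually means widening half-precision values before further processing. A generic IEEE-754 half-to-float sketch, not the code from this commit:

```go
import "math"

// float16ToFloat32 widens an IEEE-754 half-precision value to float32.
func float16ToFloat32(h uint16) float32 {
	sign := uint32(h&0x8000) << 16
	exp := uint32(h>>10) & 0x1f
	mant := uint32(h) & 0x3ff

	switch {
	case exp == 0 && mant == 0: // signed zero
		return math.Float32frombits(sign)
	case exp == 0: // subnormal: value is mant * 2^-24
		f := float32(mant) / (1 << 24)
		if sign != 0 {
			f = -f
		}
		return f
	case exp == 0x1f: // infinity or NaN
		return math.Float32frombits(sign | 0x7f800000 | mant<<13)
	default: // normal: re-bias exponent from 15 to 127
		return math.Float32frombits(sign | (exp+112)<<23 | mant<<13)
	}
}
```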
Michael Yang
bbbd9f20f3
cleanup
2024-05-20 16:13:57 -07:00
Michael Yang
547132e820
bpe pretokenizer
2024-05-20 16:13:57 -07:00
Patrick Devine
2d315ba9a9
add missing file
2024-05-20 16:13:57 -07:00
Patrick Devine
d355d2020f
add fixes for llama
2024-05-20 16:13:57 -07:00
Patrick Devine
c8cf0d94ed
llama3 conversion
2024-05-20 16:13:57 -07:00
Patrick Devine
4730762e5c
add safetensors version
2024-05-20 16:13:57 -07:00
Patrick Devine
d88582dffd
some changes for llama3
2024-05-20 16:13:57 -07:00
Michael Yang
2f81b3dce2
Merge pull request #4502 from ollama/mxyng/fix-quantize
fix quantize file types
2024-05-20 16:09:27 -07:00
jmorganca
5cab13739e
set llama.cpp submodule commit to 614d3b9
2024-05-20 15:28:17 -07:00
Josh Yan
8aadad9c72
updated updateURL
2024-05-20 15:24:32 -07:00
Michael Yang
807d092761
fix quantize file types
2024-05-20 15:22:11 -07:00
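A sketch of the kind of lookup a quantize path needs from user-facing quantization names to the file type recorded in the output model; the names and values below are illustrative assumptions, not ollama's actual table:

```go
import "strings"

type fileType uint32

const (
	fileTypeF32 fileType = iota
	fileTypeF16
	fileTypeQ4_0
	fileTypeQ8_0
	fileTypeUnknown
)

func parseFileType(s string) fileType {
	switch strings.ToUpper(s) {
	case "F32":
		return fileTypeF32
	case "F16":
		return fileTypeF16
	case "Q4_0":
		return fileTypeQ4_0
	case "Q8_0":
		return fileTypeQ8_0
	default:
		return fileTypeUnknown
	}
}
```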
Michael Yang
f36f1d6be9
tidy intermediate blobs
2024-05-20 15:15:06 -07:00
alwqx
8800c8a59b
chore: fix typo in docs ( #4536 )
2024-05-20 14:19:03 -07:00
Michael Yang
b4dce13309
Merge pull request #4330 from ollama/mxyng/cache-intermediate-layers
cache and reuse intermediate blobs
2024-05-20 13:54:41 -07:00
Sam
e15307fdf4
feat: add support for flash_attn ( #4120 )
* feat: enable flash attention if supported
* feat: enable flash attention if supported
* feat: enable flash attention if supported
* feat: add flash_attn support
2024-05-20 13:36:03 -07:00
Michael Yang
3520c0e4d5
cache and reuse intermediate blobs
particularly useful for zipfiles and f16s
2024-05-20 13:25:10 -07:00
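A sketch of content-addressed caching for intermediate conversion outputs (for example a tensor file extracted from a zip, or an f16 intermediate): name the blob after its SHA-256 digest and reuse it when it already exists. The blob layout here is an assumption for illustration.

```go
import (
	"crypto/sha256"
	"encoding/hex"
	"os"
	"path/filepath"
)

// cacheBlob writes data to a digest-named file under dir, skipping the write
// when an identical blob is already present.
func cacheBlob(dir string, data []byte) (string, error) {
	digest := sha256.Sum256(data)
	path := filepath.Join(dir, "sha256-"+hex.EncodeToString(digest[:]))
	if _, err := os.Stat(path); err == nil {
		return path, nil // already cached; reuse the existing blob
	}
	if err := os.WriteFile(path, data, 0o644); err != nil {
		return "", err
	}
	return path, nil
}
```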
Patrick Devine
ccdf0b2a44
Move the parser back + handle utf16 files ( #4533 )
2024-05-20 11:26:45 -07:00
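Handling UTF-16 input (common for files saved by Windows editors) generally means checking for a byte order mark and transcoding to UTF-8 before parsing. A sketch using golang.org/x/text; the parser in this commit may detect and decode differently:

```go
import (
	"bytes"
	"io"

	"golang.org/x/text/encoding/unicode"
	"golang.org/x/text/transform"
)

// decodeMaybeUTF16 returns UTF-8 bytes, transcoding UTF-16 input when a
// byte order mark is present; otherwise the data is returned unchanged.
func decodeMaybeUTF16(data []byte) ([]byte, error) {
	hasBOM := len(data) >= 2 &&
		((data[0] == 0xff && data[1] == 0xfe) || (data[0] == 0xfe && data[1] == 0xff))
	if !hasBOM {
		return data, nil
	}
	dec := unicode.UTF16(unicode.LittleEndian, unicode.UseBOM).NewDecoder()
	return io.ReadAll(transform.NewReader(bytes.NewReader(data), dec))
}
```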
jmorganca
63a453554d
go mod tidy
2024-05-19 23:03:57 -07:00
Patrick Devine
105186aa17
add OLLAMA_NOHISTORY to turn off history in interactive mode ( #4508 )
2024-05-18 11:51:57 -07:00
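A sketch of how such an opt-out might be read; treating any non-empty OLLAMA_NOHISTORY value as "history off" is an assumption based on the commit title.

```go
import "os"

// historyEnabled reports whether interactive readline history should be
// recorded; setting OLLAMA_NOHISTORY to any non-empty value turns it off.
func historyEnabled() bool {
	return os.Getenv("OLLAMA_NOHISTORY") == ""
}
```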
Daniel Hiltgen
ba04afc9a4
Merge pull request #4483 from dhiltgen/clean_exit
Don't return error on signal exit
2024-05-17 11:41:57 -07:00
Daniel Hiltgen
7e1e0086e7
Merge pull request #4482 from dhiltgen/integration_improvements
Skip max queue test on remote
2024-05-16 16:43:48 -07:00
Daniel Hiltgen
02b31c9dc8
Don't return error on signal exit
2024-05-16 16:25:38 -07:00
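One way a caller can treat a child process killed by SIGINT or SIGTERM as a clean shutdown instead of a failure; whether this matches the commit's actual change is an assumption, and the sketch is Unix-oriented.

```go
import (
	"errors"
	"os/exec"
	"syscall"
)

// ignoreSignalExit maps an exit caused by SIGINT/SIGTERM to a nil error.
func ignoreSignalExit(err error) error {
	var exitErr *exec.ExitError
	if errors.As(err, &exitErr) {
		if ws, ok := exitErr.Sys().(syscall.WaitStatus); ok && ws.Signaled() {
			switch ws.Signal() {
			case syscall.SIGINT, syscall.SIGTERM:
				return nil
			}
		}
	}
	return err
}
```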
Daniel Hiltgen
7f2fbad736
Skip max queue test on remote
This test needs to adjust the queue size down from our default setting to be reliable, so it must be skipped when running in remote test execution mode.
2024-05-16 16:24:18 -07:00
Josh
5bece94509
Merge pull request #4463 from ollama/jyan/line-display
changed line display to be calculated with runewidth
2024-05-16 14:15:08 -07:00
Josh Yan
3d90156e99
removed comment
2024-05-16 14:12:03 -07:00
Rose Heart
5e46c5c435
Updating software list in README ( #4467 )
* Update README.md
Added chat/moderation bot to list of software.
* Update README.md
Fixed link error.
2024-05-16 13:55:14 -07:00
Jeffrey Morgan
583c1f472c
update llama.cpp submodule to 614d3b9 ( #4414 )
2024-05-16 13:53:09 -07:00
Josh Yan
26bfc1c443
go fmt'd cmd.go
2024-05-15 17:26:39 -07:00
Josh Yan
799aa9883c
go fmt'd cmd.go
2024-05-15 17:24:17 -07:00
Michael Yang
84ed77cbd8
Merge pull request #4436 from ollama/mxyng/done-part
return on part done
2024-05-15 17:16:24 -07:00
Josh Yan
c9e584fb90
updated double-width display
2024-05-15 16:45:24 -07:00
Josh Yan
17b1e81ca1
fixed width and word count for double spacing
2024-05-15 16:29:33 -07:00
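The display-width commits above rely on counting terminal cells rather than bytes or runes, since CJK and other double-width characters occupy two columns. A sketch using the github.com/mattn/go-runewidth package; the truncation helper is illustrative, not ollama's progress-display code:

```go
import (
	"strings"

	"github.com/mattn/go-runewidth"
)

// truncateToWidth trims s so it fits within the given number of terminal
// columns, counting double-width runes as two cells.
func truncateToWidth(s string, width int) string {
	var b strings.Builder
	cols := 0
	for _, r := range s {
		w := runewidth.RuneWidth(r)
		if cols+w > width {
			break
		}
		b.WriteRune(r)
		cols += w
	}
	return b.String()
}
```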
Daniel Hiltgen
7e9a2da097
Merge pull request #4462 from dhiltgen/opt_out_build
Port cuda/rocm skip build vars to linux
2024-05-15 16:27:47 -07:00
Daniel Hiltgen
c48c1d7c46
Port cuda/rocm skip build vars to linux
Windows already implements these; carry them over to Linux.
2024-05-15 15:56:43 -07:00
Patrick Devine
d1692fd3e0
fix the cpu estimatedTotal memory + get the expiry time for loading models ( #4461 )
2024-05-15 15:43:16 -07:00
Daniel Hiltgen
5fa36a0833
Merge pull request #4459 from dhiltgen/sanitize_env_log
Sanitize the env var debug log
2024-05-15 14:58:55 -07:00
Daniel Hiltgen
853ae490e1
Sanitize the env var debug log
Only dump env vars we care about in the logs
2024-05-15 14:42:57 -07:00
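A sketch of logging only an allowlist of environment variables instead of dumping the whole environment; the variable names listed are examples, not necessarily the exact set this commit keeps.

```go
import (
	"fmt"
	"os"
)

// loggableEnv returns "KEY=value" pairs for a fixed allowlist of variables.
func loggableEnv() []string {
	allow := []string{"OLLAMA_HOST", "OLLAMA_MODELS", "OLLAMA_NUM_PARALLEL", "OLLAMA_MAX_QUEUE", "OLLAMA_DEBUG"}
	var out []string
	for _, k := range allow {
		if v, ok := os.LookupEnv(k); ok {
			out = append(out, fmt.Sprintf("%s=%s", k, v))
		}
	}
	return out
}
```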