Commit graph

418 commits

Author SHA1 Message Date
Michael Yang
b25dd1795d allow F16 to use metal
warning: F16 uses significantly more memory than quantized models, so the
standard requirements don't apply.
2023-08-26 08:38:48 -07:00
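
A hypothetical sketch of the gating this commit describes; the function name and logic are illustrative, not ollama's source. F16 is let through to Metal even when it fails the standard fits-in-memory requirement, per the commit's warning.

```go
package main

import "fmt"

// useMetal is an illustrative gate for Metal offload. Per the commit's
// note, F16 is exempted from the standard memory requirement even though
// it uses far more memory than quantized formats.
func useMetal(fileType string, modelSize, availableMem uint64) bool {
	if fileType == "F16" {
		return true // exempt from the standard requirement
	}
	// Standard requirement: the model must fit in available memory.
	return modelSize <= availableMem
}

func main() {
	fmt.Println(useMetal("F16", 26_000_000_000, 16_000_000_000))  // true: exemption applies
	fmt.Println(useMetal("Q4_0", 26_000_000_000, 16_000_000_000)) // false: does not fit
}
```
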
Michael Yang
304f2b6c96 add 34b to mem check 2023-08-26 08:29:21 -07:00
Jeffrey Morgan
177b69a211 add missing entries for 34B 2023-08-25 18:35:35 -07:00
Michael Yang
7a378f8b66 patch llama.cpp for 34B 2023-08-25 10:06:55 -07:00
Michael Yang
b1cececb8e add 34b model type 2023-08-24 10:35:44 -07:00
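
A hypothetical sketch of what this cluster of 34B commits adds: inferring a model-size label from the GGML layer count. The layer counts match llama-family checkpoints, but the function name and shape are illustrative, not ollama's code.

```go
package main

import "fmt"

// modelType maps a llama-family layer count to a size label. The 34B
// case (48 layers, Code Llama) is what these commits add.
func modelType(numLayers uint32) string {
	switch numLayers {
	case 32:
		return "7B"
	case 40:
		return "13B"
	case 48:
		return "34B" // newly recognized
	case 60:
		return "30B"
	case 80:
		return "65B"
	default:
		return "unknown"
	}
}

func main() {
	fmt.Println(modelType(48)) // 34B
}
```
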
Michael Yang
5ca05c2e88 fix ModelType() 2023-08-18 11:23:38 -07:00
Michael Yang
a894cc792d model and file type as strings 2023-08-17 12:08:04 -07:00
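
A sketch of the kind of change "model and file type as strings" describes: exposing a numeric GGML file-type code through Go's Stringer interface. The numbering follows GGML's ftype values; the constant names are illustrative.

```go
package main

import "fmt"

// FileType wraps GGML's numeric ftype so it prints as a readable string.
type FileType uint32

const (
	FileTypeF32  FileType = 0
	FileTypeF16  FileType = 1
	FileTypeQ4_0 FileType = 2
	FileTypeQ4_1 FileType = 3
)

func (ft FileType) String() string {
	switch ft {
	case FileTypeF32:
		return "F32"
	case FileTypeF16:
		return "F16"
	case FileTypeQ4_0:
		return "Q4_0"
	case FileTypeQ4_1:
		return "Q4_1"
	default:
		return "unknown"
	}
}

func main() {
	fmt.Println(FileType(2)) // Q4_0
}
```
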
Michael Yang
4dcf5c3e0b
Merge pull request #349 from jmorganca/close-files
close open files
2023-08-14 16:15:58 -07:00
Michael Yang
e26085b921 close open files 2023-08-14 16:08:06 -07:00
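
A minimal illustration of the standard Go pattern a "close open files" fix typically applies: defer Close immediately after a successful Open so every return path releases the descriptor. The function here is illustrative, not the actual patched code.

```go
package main

import (
	"fmt"
	"io"
	"os"
)

// readMagic opens a file and reads its first four bytes, deferring Close
// right after Open so no return path leaks a file descriptor.
func readMagic(path string) ([]byte, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close() // runs on every return path below

	buf := make([]byte, 4)
	if _, err := io.ReadFull(f, buf); err != nil {
		return nil, err
	}
	return buf, nil
}

func main() {
	if magic, err := readMagic("model.bin"); err == nil {
		fmt.Printf("magic: % x\n", magic)
	}
}
```
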
Michael Yang
f7b613332c update llama.cpp 2023-08-14 15:47:00 -07:00
Bruce MacDonald
4b2d366c37 Update llama.go 2023-08-14 12:55:50 -03:00
Bruce MacDonald
56fd4e4ef2 log embedding eval timing 2023-08-14 12:51:31 -03:00
Jeffrey Morgan
22885aeaee update llama.cpp to f64d44a 2023-08-12 22:47:15 -04:00
Michael Yang
6ed991c8e2 ggml: fix off-by-one error
remove unused Unknown FileType
2023-08-11 10:45:22 -07:00
Michael Yang
6de5d032e1 implement loading ggml lora adapters through the modelfile 2023-08-10 09:23:39 -07:00
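
A hedged sketch of the Modelfile side of this commit: an ADAPTER instruction naming a GGML LoRA file. The toy parser below handles only that one instruction; ollama's real Modelfile parser differs and handles much more.

```go
package main

import (
	"fmt"
	"strings"
)

// adapterPaths extracts the path arguments of ADAPTER lines from a
// Modelfile. Illustrative only; not ollama's parser.
func adapterPaths(modelfile string) []string {
	var paths []string
	for _, line := range strings.Split(modelfile, "\n") {
		fields := strings.Fields(line)
		if len(fields) >= 2 && strings.EqualFold(fields[0], "ADAPTER") {
			paths = append(paths, fields[1])
		}
	}
	return paths
}

func main() {
	mf := "FROM llama2\nADAPTER ./lora.bin\n"
	fmt.Println(adapterPaths(mf)) // [./lora.bin]
}
```
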
Michael Yang
d791df75dd check memory requirements before loading 2023-08-10 09:23:11 -07:00
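
A hypothetical sketch of "check memory requirements before loading": compare the model file's size, plus assumed headroom for the KV cache and scratch buffers, against available memory before loading. The 1.2 headroom factor and names are illustrative, not ollama's actual accounting.

```go
package main

import (
	"fmt"
	"os"
)

// checkMemory refuses to load a model that likely won't fit in memory.
func checkMemory(modelPath string, availableMem uint64) error {
	fi, err := os.Stat(modelPath)
	if err != nil {
		return err
	}
	required := uint64(float64(fi.Size()) * 1.2) // assumed ~20% headroom
	if required > availableMem {
		return fmt.Errorf("need ~%d bytes, only %d available", required, availableMem)
	}
	return nil
}

func main() {
	fmt.Println(checkMemory("model.bin", 16_000_000_000))
}
```
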
Michael Yang
020a3b3530 disable gpu for q5_0, q5_1, q8_0 quants 2023-08-10 09:23:11 -07:00
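
A sketch of the gate "disable gpu for q5_0, q5_1, q8_0 quants" implies: skip GPU offload for formats the Metal backend presumably lacked kernels for at the time. The function name is illustrative.

```go
package main

import "fmt"

// gpuSupported reports whether a quantization format may be offloaded
// to the GPU; the listed formats fall back to CPU.
func gpuSupported(fileType string) bool {
	switch fileType {
	case "Q5_0", "Q5_1", "Q8_0":
		return false // fall back to CPU for these quantizations
	default:
		return true
	}
}

func main() {
	fmt.Println(gpuSupported("Q8_0")) // false
	fmt.Println(gpuSupported("Q4_0")) // true
}
```
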
Michael Yang
fccf8d179f partial decode ggml bin for more info 2023-08-10 09:23:10 -07:00
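
A hedged sketch of "partial decode ggml bin for more info": read just the magic, version, and llama hyperparameters from the file header so model size and file type can be inspected without loading tensors. The field order follows the legacy GGJT layout; struct and function names are illustrative.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"os"
)

// hparams mirrors the legacy GGJT llama hyperparameter block.
type hparams struct {
	NVocab, NEmbd, NMult, NHead, NLayer, NRot, FType uint32
}

// decodeHeader reads only the header, leaving the tensor data untouched.
func decodeHeader(path string) (*hparams, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()

	var magic, version uint32
	if err := binary.Read(f, binary.LittleEndian, &magic); err != nil {
		return nil, err
	}
	if magic != 0x67676a74 { // "ggjt"
		return nil, fmt.Errorf("unsupported magic %#x", magic)
	}
	if err := binary.Read(f, binary.LittleEndian, &version); err != nil {
		return nil, err
	}

	var hp hparams
	if err := binary.Read(f, binary.LittleEndian, &hp); err != nil {
		return nil, err
	}
	return &hp, nil // caller can map NLayer to a size, FType to a quant
}

func main() {
	if hp, err := decodeHeader("model.bin"); err == nil {
		fmt.Printf("layers=%d ftype=%d\n", hp.NLayer, hp.FType)
	}
}
```
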