ollama

Jesse Gross 34a75102f7 prompt: Use a single token when estimating mllama context size	2024-11-05 10:11:50 -08:00

Currently we assume that images take 768 tokens of context size for the purposes of clipping old messages that exceed the context window. However, our mllama implementation stores the full image embedding in a single token. As a result, there is significant waste of context space.

Ideally, we would handle this more generically and have the implementation report the number of tokens. However, at the moment this would just result in a similar set of 'if' conditions in the runner plus APIs to report it back. So for now, we just keep this simple.

imageproc	add more tests for getting the optimal tiled canvas (#7411)	2024-10-29 16:28:02 -07:00
testdata/tools	server: add tool parsing support for nemotron-mini (#6849)	2024-09-17 18:06:16 -07:00
auth.go	fix nil deref in auth.go	2024-07-26 14:14:48 -07:00
download.go	server: fix blob download when receiving a 200 response (#6656)	2024-09-05 10:48:26 -07:00
fixblobs.go	server: replace blob prefix separator from ':' to '-' (#3146)	2024-03-14 20:18:06 -07:00
fixblobs_test.go	server: replace blob prefix separator from ':' to '-' (#3146)	2024-03-14 20:18:06 -07:00
images.go	Re-introduce the `llama` package (#5034)	2024-10-08 08:53:54 -07:00
layer.go	fix: chmod new layer to 0o644 when creating it	2024-08-16 11:43:19 +08:00
manifest.go	only skip invalid json manifests	2024-08-15 10:29:14 -07:00
manifest_test.go	lint	2024-08-01 17:06:06 -07:00
model.go	image processing for llama3.2 (#6963)	2024-10-18 16:12:35 -07:00
model_test.go	server: add tool parsing support for nemotron-mini (#6849)	2024-09-17 18:06:16 -07:00
modelpath.go	validate model path	2024-08-28 09:32:57 -07:00
modelpath_test.go	validate model path	2024-08-28 09:32:57 -07:00
prompt.go	prompt: Use a single token when estimating mllama context size	2024-11-05 10:11:50 -08:00
prompt_test.go	runner.go: Better abstract vision model integration	2024-10-30 14:53:43 -07:00
routes.go	Quiet down debug log of image payload (#7454)	2024-11-04 13:05:16 -08:00
routes_create_test.go	Merge pull request #6534 from ollama/mxyng/messages	2024-08-30 09:39:59 -07:00
routes_delete_test.go	server: clean up route names for consistency (#6524)	2024-08-26 19:36:11 -07:00
routes_generate_test.go	image processing for llama3.2 (#6963)	2024-10-18 16:12:35 -07:00
routes_list_test.go	server: clean up route names for consistency (#6524)	2024-08-26 19:36:11 -07:00
routes_test.go	image processing for llama3.2 (#6963)	2024-10-18 16:12:35 -07:00
sched.go	Rename gpu package discover (#7143)	2024-10-16 17:45:00 -07:00
sched_test.go	Rename gpu package discover (#7143)	2024-10-16 17:45:00 -07:00
sparse_common.go	Don't hard fail on sparse setup error	2024-08-09 12:16:19 -07:00
sparse_windows.go	Don't hard fail on sparse setup error	2024-08-09 12:16:19 -07:00
upload.go	server: limit upload parts to 16 (#6411)	2024-08-19 09:20:52 -07:00