ollama/llm/patches/01-load-progress.diff

diff --git a/common/common.cpp b/common/common.cpp
index 2c05a4d4..927f0e3d 100644
--- a/common/common.cpp
+++ b/common/common.cpp
@@ -2093,6 +2093,8 @@ struct llama_model_params llama_model_params_from_gpt_params(const gpt_params &
     mparams.use_mmap        = params.use_mmap;
     mparams.use_mlock       = params.use_mlock;
     mparams.check_tensors   = params.check_tensors;
+    mparams.progress_callback = params.progress_callback;
+    mparams.progress_callback_user_data = params.progress_callback_user_data;
     if (params.kv_overrides.empty()) {
         mparams.kv_overrides = NULL;
     } else {
diff --git a/common/common.h b/common/common.h
index 65c0ef81..ebca2c77 100644
--- a/common/common.h
+++ b/common/common.h
@@ -184,6 +184,13 @@ struct gpt_params {
     std::string mmproj = "";        // path to multimodal projector
     std::vector<std::string> image; // path to image file(s)
 
+    // Called with a progress value between 0.0 and 1.0. Pass NULL to disable.
+    // If the provided progress_callback returns true, model loading continues.
+    // If it returns false, model loading is immediately aborted.
+    llama_progress_callback progress_callback = NULL;
+    // context pointer passed to the progress callback
+    void * progress_callback_user_data = NULL;
+
     // embedding
     bool embedding         = false; // get only sentence embedding
     int32_t embd_normalize = 2;     // normalisation for embendings (-1=none, 0=max absolute int16, 1=taxicab, 2=euclidean, >2=p-norm)