ollama/llm

Latest commit: restore model load duration on generate response (#1524)
Bruce MacDonald, 6ee8c80199, 2023-12-14 12:15:50 -05:00

* restore model load duration on generate response
  - set the model load duration on the generate and chat done responses
  - calculate the createdAt time when the response is created
* remove checkpoints predict opts
* Update routes.go
File       Last commit                                                             Date
llama.cpp  Update runner to support mixtral and mixture of experts (MoE) (#1475)   2023-12-13 17:15:10 -05:00
ggml.go    seek to end of file when decoding older model formats                   2023-12-09 21:14:35 -05:00
gguf.go    remove per-model types                                                  2023-12-11 09:40:21 -08:00
llama.go   restore model load duration on generate response (#1524)                2023-12-14 12:15:50 -05:00
llm.go     load projectors                                                         2023-12-05 14:36:12 -08:00
utils.go   partial decode ggml bin for more info                                   2023-08-10 09:23:10 -07:00