ollama

Jeffrey Morgan 15c2d8fe14 server: parallelize embeddings in API web handler instead of in subprocess runner (#6220) For simplicity, perform parallelization of embedding requests in the API handler instead of offloading this to the subprocess runner. This keeps the scheduling story simpler as it builds on existing parallel requests, similar to existing text completion functionality.	2024-08-11 11:57:10 -07:00
CMakeLists.txt	line feed	2024-08-04 17:25:41 -07:00
httplib.h	Import server.cpp as of b2356	2024-03-12 13:58:06 -07:00
json.hpp	Import server.cpp as of b2356	2024-03-12 13:58:06 -07:00
server.cpp	server: parallelize embeddings in API web handler instead of in subprocess runner (#6220)	2024-08-11 11:57:10 -07:00
utils.hpp	log clean up	2024-05-09 14:55:36 -07:00