* Initial Batch Embedding
* Revert "Initial Batch Embedding"
This reverts commit c22d54895a280b54c727279d85a5fc94defb5a29.
* Initial Draft
* mock up notes
* api/embed draft
* add server function
* check normalization
* clean up
* normalization
* playing around with truncate stuff
* Truncation
* Truncation
* move normalization to go
* Integration Test Template
* Truncation Integration Tests
* Clean up
* use float32
* move normalize
* move normalize test
* refactoring
* integration float32
* input handling and handler testing
* Refactoring of legacy and new
* clear comments
* merge conflicts
* touches
* embedding type 64
* merge conflicts
* fix hanging on single string
* refactoring
* test values
* set context length
* clean up
* testing clean up
* testing clean up
* remove function closure
* Revert "remove function closure"
This reverts commit 55d48c6ed17abe42e7a122e69d603ef0c1506787.
* remove function closure
* remove redundant error check
* clean up
* more clean up
* clean up
* OpenAI v1 models
* Refactor Writers
* Add Test
Co-Authored-By: Attila Kerekes
* Credit Co-Author
Co-Authored-By: Attila Kerekes <439392+keriati@users.noreply.github.com>
* Empty List Testing
* Use Namespace for Ownedby
* Update Test
* Add back envconfig
* v1/models docs
* Use ModelName Parser
* Test Names
* Remove Docs
* Clean Up
* Test name
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
* Add Middleware for Chat and List
* Testing Cleanup
* Test with Fatal
* Add functionality to chat test
* OpenAI: /v1/models/{model} compatibility (#5028)
* Retrieve Model
* OpenAI Delete Model
* Retrieve Middleware
* Remove Delete from Branch
* Update Test
* Middleware Test File
* Function name
* Cleanup
* Test Update
* Test Update
---------
Co-authored-by: Attila Kerekes <439392+keriati@users.noreply.github.com>
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
* API Show Extended
* Initial Draft of Information
Co-Authored-By: Patrick Devine <pdevine@sonic.net>
* Clean Up
* Descriptive arg error messages and other fixes
* Second Draft of Show with Projectors Included
* Remove Chat Template
* Touches
* Prevent wrapping from files
* Verbose functionality
* Docs
* Address Feedback
* Lint
* Resolve Conflicts
* Function Name
* Tests for api/show model info
* Show Test File
* Add Projector Test
* Clean routes
* Projector Check
* Move Show Test
* Touches
* Doc update
---------
Co-authored-by: Patrick Devine <pdevine@sonic.net>
* Fixed the API endpoint /api/tags to return {models: []} instead of {models: null} when the model list is empty.
* Update server/routes.go
---------
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
This should resolve a number of memory leak and stability defects by allowing
us to isolate llama.cpp in a separate process and shutdown when idle, and
gracefully restart if it has problems. This also serves as a first step to be
able to run multiple copies to support multiple models concurrently.