# Changelog

All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/), and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [Unreleased]

### Added

- First version of this changelog
- Server: Use async routes
- Use numpy for internal buffers to reduce memory usage and improve performance

### Fixed

- Performance bug in the stop sequence check that slowed down streaming