fix: Circular dependency preventing early Llama object free (#1176)

Commit 901827013b introduced a cyclic dependency within Llama objects. Because of that cycle, old models stay in memory until the garbage collector runs instead of being freed as soon as they are no longer referenced, which bloats memory in applications that switch between models at runtime. This patch removes the offending line so that models are deallocated without relying on GC. If the `llama` attribute must remain exposed on the tokenizer class, one option is to combine `weakref.ref` with a `@property`.
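
The `weakref.ref` plus `@property` alternative mentioned above could look roughly like the following. This is a minimal sketch and not part of this patch: `FakeLlama` and `WeakRefTokenizer` are hypothetical stand-ins rather than the real `llama_cpp` classes, and the `tokenizer_` attribute on the owner is assumed here only to reproduce the owner-to-tokenizer half of the cycle.

```python
import weakref


class WeakRefTokenizer:
    """Hypothetical tokenizer that exposes `llama` without owning it."""

    def __init__(self, llama):
        # Keep only a weak reference to the owner, so no reference cycle forms.
        self._llama_ref = weakref.ref(llama)
        # A strong reference to the model handle alone does not close a cycle.
        self._model = llama._model

    @property
    def llama(self):
        llama = self._llama_ref()
        if llama is None:
            raise ReferenceError("the owning Llama object has been freed")
        return llama


class FakeLlama:
    """Hypothetical stand-in for llama_cpp.Llama (assumption, not the real class)."""

    def __init__(self):
        self._model = object()  # placeholder for the native model handle
        # The owner holds its tokenizer strongly; a strong back-reference
        # from the tokenizer would turn this into a cycle.
        self.tokenizer_ = WeakRefTokenizer(self)


if __name__ == "__main__":
    llama = FakeLlama()
    probe = weakref.ref(llama)
    assert llama.tokenizer_.llama is llama  # attribute still usable
    del llama
    # On CPython, reference counting frees the object here without a GC pass.
    assert probe() is None
```

The trade-off is that a tokenizer which outlives its owner raises `ReferenceError` on `.llama` instead of keeping the model alive.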
2024-02-11 10:57:57 -08:00 · 2024-02-11 10:57:57 -08:00 · a05d90446f
commit a05d90446f
parent 918ff27e50
1 changed file with 0 additions and 1 deletion
--- a/llama_cpp/llama_tokenizer.py
+++ b/llama_cpp/llama_tokenizer.py
@@ -27,7 +27,6 @@ class BaseLlamaTokenizer(abc.ABC):

 class LlamaTokenizer(BaseLlamaTokenizer):
    def __init__(self, llama: llama_cpp.Llama):
-        self.llama = llama
        self._model = llama._model  # type: ignore

    def tokenize(