fix: Circular dependency preventing early Llama object free (#1176)

Commit 901827013b introduced a cyclic dependency within Llama objects. That
change causes old models to linger in memory longer than necessary, which
bloats memory in applications that switch between models at runtime. This
patch removes the problematic line, allowing models to be deallocated without
relying on the GC. If the `llama` attribute must be exposed on the tokenizer
class, one option is to combine `weakref.ref` with a `@property`.
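
The `weakref.ref` plus `@property` alternative mentioned above could look
roughly like the sketch below. It is not part of this patch. `FakeLlama`,
`StrongTokenizer`, `WeakTokenizer`, and `freed_without_gc` are hypothetical
stand-ins, and the sketch assumes that the model object holds a strong
reference to its tokenizer and that the interpreter is CPython, where
reference counting frees acyclic objects immediately.

```python
import gc
import weakref


class StrongTokenizer:
    """Pre-patch shape: a strong back-reference to the owner forms a cycle."""

    def __init__(self, llama):
        self.llama = llama


class WeakTokenizer:
    """Alternative shape: expose `llama` through a weak reference."""

    def __init__(self, llama):
        # A weak reference does not keep the owning model alive.
        self._llama_ref = weakref.ref(llama)

    @property
    def llama(self):
        llama = self._llama_ref()
        if llama is None:
            raise ReferenceError("owning model has already been freed")
        return llama


class FakeLlama:
    """Hypothetical stand-in for the model object; it owns its tokenizer."""

    def __init__(self, tokenizer_cls):
        self.tokenizer_ = tokenizer_cls(self)


def freed_without_gc(tokenizer_cls):
    """Return True if dropping the last name frees the model by refcount alone."""
    gc_was_enabled = gc.isenabled()
    gc.disable()  # rule out the cycle collector while we observe
    try:
        model = FakeLlama(tokenizer_cls)
        probe = weakref.ref(model)
        del model
        freed = probe() is None
    finally:
        if gc_was_enabled:
            gc.enable()
    gc.collect()  # clean up the cycle left behind by the strong variant
    return freed
```

Under these assumptions, `freed_without_gc(WeakTokenizer)` reports that the
model is released as soon as its last strong reference goes away, whereas
`freed_without_gc(StrongTokenizer)` reports that it survives until the cycle
collector runs. The trade-off is that `tokenizer.llama` raises once the owning
model is gone.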
Connor 2024-02-11 10:57:57 -08:00 committed by GitHub
parent 918ff27e50
commit a05d90446f
GPG key ID: B5690EEEBB952194

@@ -27,7 +27,6 @@ class BaseLlamaTokenizer(abc.ABC):
 class LlamaTokenizer(BaseLlamaTokenizer):
     def __init__(self, llama: llama_cpp.Llama):
-        self.llama = llama
         self._model = llama._model  # type: ignore
     def tokenize(