Commit graph

12 commits

Author SHA1 Message Date
Michael Yang
eae3af6807 clean up convert tokenizer 2024-08-27 11:11:43 -07:00
Michael Yang
3eb08377f8 detect chat template from configs that contain lists 2024-08-27 10:49:33 -07:00
Michael Yang
5a28b9cf5f bert 2024-08-20 17:27:34 -07:00
Michael Yang
d8e2664c33 convert: fix parse functions 2024-07-31 15:58:55 -07:00
Michael Yang
eafc607abb convert: only extract large files 2024-07-31 15:58:55 -07:00
Michael Yang
df993fa37b comments 2024-07-31 15:58:55 -07:00
Michael Yang
5e9db9fb0b refactor convert 2024-07-31 15:58:33 -07:00
Michael Yang
c895a7d13f some gocritic 2024-06-04 11:13:30 -07:00
Ikko Eltociear Ashimine
955c317cab
chore: update tokenizer.go (#4571)
PreTokenziers -> PreTokenizers
2024-05-22 00:25:23 -07:00
Michael Yang
bbbd9f20f3 cleanup 2024-05-20 16:13:57 -07:00
Michael Yang
547132e820 bpe pretokenizer 2024-05-20 16:13:57 -07:00
Patrick Devine
2d315ba9a9 add missing file 2024-05-20 16:13:57 -07:00