Michael Yang | eae3af6807 | clean up convert tokenizer | 2024-08-27 11:11:43 -07:00
Michael Yang | 3eb08377f8 | detect chat template from configs that contain lists | 2024-08-27 10:49:33 -07:00
Michael Yang | 5a28b9cf5f | bert | 2024-08-20 17:27:34 -07:00
Michael Yang | d8e2664c33 | convert: fix parse functions | 2024-07-31 15:58:55 -07:00
Michael Yang | eafc607abb | convert: only extract large files | 2024-07-31 15:58:55 -07:00
Michael Yang | df993fa37b | comments | 2024-07-31 15:58:55 -07:00
Michael Yang | 5e9db9fb0b | refactor convert | 2024-07-31 15:58:33 -07:00
Michael Yang | c895a7d13f | some gocritic | 2024-06-04 11:13:30 -07:00
Ikko Eltociear Ashimine | 955c317cab | chore: update tokenizer.go (#4571); PreTokenziers -> PreTokenizers | 2024-05-22 00:25:23 -07:00
Michael Yang | bbbd9f20f3 | cleanup | 2024-05-20 16:13:57 -07:00
Michael Yang | 547132e820 | bpe pretokenizer | 2024-05-20 16:13:57 -07:00
Patrick Devine | 2d315ba9a9 | add missing file | 2024-05-20 16:13:57 -07:00