Fast BPE training
- Iterate over the given merge vocabs
- Add string to vector sorted by length
- Count byte pairs and select the most frequent one
- Put it all together
- Replicate the Python example and time it
- Optimize the single-threaded version
- Multi-thread
- Read from file and output result to file
- Keep a global pair counter so that when a new item appears we don't recount everything; each pair is counted once and then only updated.
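
The counting, best-pair selection, and global-counter steps above could be sketched roughly as follows. This is a minimal single-threaded Python sketch, not the planned implementation: the names `pair_counts`, `merge_word`, and `train_bpe` are hypothetical, the input is assumed to be a word-to-frequency mapping, and ties between equally frequent pairs are broken by comparing the pairs themselves, which is an arbitrary choice made only to keep the result deterministic.

```python
from collections import Counter


def pair_counts(symbols, freq):
    """Count adjacent symbol pairs in one word, weighted by the word's frequency."""
    counts = Counter()
    for a, b in zip(symbols, symbols[1:]):
        counts[(a, b)] += freq
    return counts


def merge_word(symbols, pair):
    """Replace each non-overlapping occurrence of `pair` (left to right) with the joined symbol."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def train_bpe(word_freqs, num_merges):
    """Learn up to `num_merges` byte-pair merges from a {word: frequency} mapping."""
    # Represent every word as a tuple of single-byte symbols.
    words = Counter()
    for word, freq in word_freqs.items():
        words[tuple(bytes([b]) for b in word.encode("utf-8"))] += freq

    # Global pair counter: built once here, then only updated.
    counts = Counter()
    for symbols, freq in words.items():
        counts.update(pair_counts(symbols, freq))

    merges = []
    for _ in range(num_merges):
        if not counts:
            break
        # Most frequent pair; ties broken by the pair itself (assumption).
        best = max(counts, key=lambda p: (counts[p], p))
        merges.append(best)

        # Only words that contain `best` change, so only their counts are touched.
        affected = [s for s in words if best in zip(s, s[1:])]
        for symbols in affected:
            freq = words.pop(symbols)
            counts.subtract(pair_counts(symbols, freq))
            merged = merge_word(symbols, best)
            words[merged] += freq
            counts.update(pair_counts(merged, freq))
        counts = +counts  # drop pairs whose count fell to zero

    return merges
```

Calling `train_bpe({"low": 5, "lower": 2, "newest": 6, "widest": 3}, 10)` returns the learned merges in order as pairs of `bytes`. The sketch still scans every word to find the affected ones on each merge; an index from pair to the words containing it would avoid that scan and is a natural candidate for the single-threaded optimization step.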