- ftfy: Fixes Unicode problems, such as mojibake, after the fact.
- ordered-set: A set that remembers insertion order, the standard library's missing data structure.
- langcodes: A library for working with and comparing language codes.
- pack64: Encodes and decodes floating-point vectors in a compact, base64-like format.
- conceptnet5: The code that powers the ConceptNet project, an open, multilingual semantic network.
- wordfreq: A database of word frequencies in various languages.
- assoc-space: A representation for machine learning that computes association strength over semantic networks.
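
To make "mojibake" concrete, here is a minimal standard-library sketch of the simplest case: UTF-8 bytes that were wrongly decoded as Latin-1. It is not ftfy's implementation and does not use its API; the function names `make_mojibake` and `undo_mojibake` are hypothetical, chosen only for illustration, and real-world text usually needs far more careful heuristics than this.

```python
def make_mojibake(text: str) -> str:
    """Simulate the bug: encode as UTF-8, then wrongly decode as Latin-1."""
    return text.encode("utf-8").decode("latin-1")


def undo_mojibake(garbled: str) -> str:
    """Reverse the mistake above (hypothetical helper, not ftfy's algorithm).

    If the text does not round-trip cleanly, assume it was never garbled
    this way and return it unchanged.
    """
    try:
        return garbled.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return garbled


garbled = make_mojibake("caf\u00e9")   # "café" becomes "cafÃ©"
print(garbled)
print(undo_mojibake(garbled))          # back to "café"
```

Text that was never garbled passes through untouched, because re-encoding it as Latin-1 and decoding as UTF-8 fails and the helper falls back to the input.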