Add tests for the C tokenizer and expose it as a private module (GH-27924)

9 files changed