Token budget optimization

July 7, 2024

I've hacked together some multiline strings in Python, using newlines (\n) and other whitespace (tabs, spaces, non-breaking spaces, etc.) to format prompts. To shave off a few tokens, you can use a regex to collapse each run of whitespace into a single space:

python
import re
prompt = re.sub(r"\s+", " ", prompt)  # collapse each whitespace run into one space
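
To get a feel for the savings, here is a rough sketch that wraps the substitution in a helper and compares lengths before and after. The name `collapse_whitespace` and the sample prompt are made up for illustration, and character count is only a proxy: the real token savings depend on the tokenizer.

```python
import re


def collapse_whitespace(text: str) -> str:
    """Replace each run of whitespace with one space and trim both ends."""
    return re.sub(r"\s+", " ", text).strip()


# Hypothetical prompt, indented the way it might sit in source code.
raw_prompt = """
    You are a helpful assistant.

    Answer the question below in one sentence.
    \tQuestion: What is a token?
"""

compact_prompt = collapse_whitespace(raw_prompt)
print(compact_prompt)
print(len(raw_prompt), "->", len(compact_prompt), "characters")
```

Keep in mind this flattens everything, so it is a bad fit for prompts where the layout carries meaning, such as embedded code or indented examples.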

This trims some fat, but the largest chunk of tokens is usually the text itself. There are prompt compression methods for that, though they usually trade accuracy for compression ratio.

Next time I could probably drop at least one random letter per word to save some dimes.

Why waste time say lot word when few word do trick? - Kevin Malone