Software Tools in Haskell: On Formats

Ostensibly, many software tools operate on streams of text. But not all text is created equal: some text is formatted. Whenever a character in a text stream doesn’t “mean what it says”, so to speak, we have a format. For example, many tools follow the convention that the two-character sequence \n denotes the Unicode character with code point U+000A (line feed). The backslash here is not a literal backslash. One of the reasons text is so useful is that many different human-readable formats can be layered on top of it (even more than one at a time, though this can be dangerous).
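As a minimal sketch of the \n convention described above, the following function interprets a small set of backslash escapes in a string. The name unescape and the particular set of escapes handled (\n, \t, and \\) are assumptions made for illustration, not the behavior of any specific tool.

```haskell
-- Hypothetical helper: interpret a few C-style backslash escapes.
-- Any backslash that does not start a recognized escape is passed
-- through unchanged.
unescape :: String -> String
unescape ('\\' : 'n'  : rest) = '\n' : unescape rest  -- \n -> U+000A
unescape ('\\' : 't'  : rest) = '\t' : unescape rest  -- \t -> U+0009
unescape ('\\' : '\\' : rest) = '\\' : unescape rest  -- \\ -> one literal backslash
unescape (c : rest)           = c    : unescape rest  -- ordinary character
unescape []                   = []
```

Note that a tool which applies unescape twice to the same text is layering one format on top of another, which is exactly where the danger mentioned above comes from: the input \\n (three characters) becomes a literal backslash followed by n after one pass, and a line feed after two.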

Keeping in mind what kind of text a tool expects to receive and produce can lead us to simpler and more consistent programs. On this page we’ll keep a list of a few different textual formats with links to more information about them.

Unformats

Of course, delimited text can be thought of as lined text, which in turn can be thought of as a stream of characters. But the point is that these formats have extra meaning baked in, which informs how a tool should behave.
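The layering described above can be sketched as three views of the same String. This is only an illustration under simple assumptions: the names splitFields and asDelimited are hypothetical, records are separated by line feeds, and the delimiter is a single character with no quoting or escaping.

```haskell
-- Hypothetical helper: split one line into fields on a delimiter
-- character, keeping empty fields.
splitFields :: Char -> String -> [String]
splitFields d s = case break (== d) s of
  (field, [])       -> [field]
  (field, _ : rest) -> field : splitFields d rest

-- View a stream of characters as lined text (via lines from the
-- Prelude), and then view each line as a list of delimited fields.
asDelimited :: Char -> String -> [[String]]
asDelimited d = map (splitFields d) . lines
```

For example, asDelimited ',' "a,b\nc,,d\n" yields [["a","b"],["c","","d"]]. The underlying characters are the same in every view; what changes is which characters a tool treats as structure rather than as content.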

Markup

Really complicated