Software Tools in Haskell

Software Tools is a little book about tool building by Brian Kernighan and P. J. Plauger. It’s a classic, and people far more qualified than me have written very positive things about it. The book includes several example programs, each designed to solve one simple problem on its own and to combine readily with the others to solve larger ones.

I’ve written some small tools for my own use, the largest of which (by far) is the feivel templating language. But I’m not particularly good at tool building and would like to improve. So I will be reading through Software Tools and porting the examples to Haskell. Along the way, I expect to supplement the text’s examples with tools to solve my own problems.

Because I enjoy pain, all of this will be done publicly, with code hosted at GitHub and narrative documentation posted here.

Ground Rules

“Okay, let’s party. But first, let’s go over the rules, because what is fun without the rules?”
(Gru, in Despicable Me 2)

The purpose of this project is to learn, and so there are some self-imposed rules. (Subject to change.)

Produce working tools.
Follow established conventions regarding things like command-line arguments and return codes.
Think very hard before making a tool less consistent or more complicated.

I will prefix the names of these ports with sth-, to avoid clashing with existing real programs. And of course all of them should be considered works in progress. These tools operate on text, which turns out to be more interesting than I realized when I started this project.

The Tools (Book Order)

noop: exit successfully

Chapter 1: Getting Started

copy: copy characters from stdin to stdout
count: count lines or chars on stdin
wordcount: count words on stdin
sentcount: count sentences on stdin
glyphcount: count glyphs on stdin
detab: replace tabs on stdin with spaces
charcombine: replace chars on stdin with precomposed equivalents
charfullwidth: replace chars on stdin with fullwidth equivalents
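
To give a flavor of what a tool like count involves, here is a minimal sketch, not the actual sth-count source. The names countLines and countChars and the --char flag are assumptions made up for this example. Note that length counts Unicode code points, which is one reason counting glyphs needs a separate tool.

```haskell
module Main (main) where

import System.Environment (getArgs)

-- | Count the lines in a string. A trailing newline does not add an
-- extra empty line.
countLines :: String -> Int
countLines = length . lines

-- | Count the characters in a string. A Haskell Char is a Unicode code
-- point, so a base letter followed by a combining accent counts as two.
countChars :: String -> Int
countChars = length

main :: IO ()
main = do
  args  <- getArgs
  input <- getContents
  -- "--char" is a hypothetical flag name chosen for this sketch.
  let n = if "--char" `elem` args
            then countChars input
            else countLines input
  print n
```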

Chapter 2: Filters

entab: replace spaces on stdin with tabs
echo: write arguments to stdout
overstrike: interpret backspaces on stdin
unescape: interpret escape codes on stdin
escape: replace strange chars on stdin with escape sequences
compress: compress text on stdin (run length encoding)
expand: uncompress text on stdin (run length encoding)
crypt: xor text on stdin with a list of keys
translit: transliterate or remove chars on stdin
charreplace: replace chars by strings on stdin
tail: get the last k lines or chars from stdin
getlines: extract lines from stdin by index
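
The compress and expand pair is built on run length encoding. The sketch below shows only the core idea, using an in-memory list of (count, character) pairs. The names rle and unrle are assumptions for illustration, and the on-the-wire format used by the real tools is not shown here.

```haskell
module Main (main) where

import Data.List (group)

-- | Collapse each run of equal characters into a (length, character) pair.
rle :: String -> [(Int, Char)]
rle s = [ (length run, c) | run@(c:_) <- group s ]

-- | Invert 'rle' by writing each character out the recorded number of times.
unrle :: [(Int, Char)] -> String
unrle = concatMap (uncurry replicate)

main :: IO ()
main = do
  let encoded = rle "aaabccdddd"
  print encoded             -- [(3,'a'),(1,'b'),(2,'c'),(4,'d')]
  putStrLn (unrle encoded)  -- aaabccdddd
```

Decoding an encoded string always gives back the original, which makes the pair easy to check against each other.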

Chapter 3: Files

compare: find the first position where two text streams differ
import: splice contents of a file into stdin
concat: concatenate files
wye: write stdin to files and stdout
pslineprint: print stdin to postscript
paginate: format lines with page numbers and headers
examine: interactively view a file
archive: bundle text files
linenumber: number lines on stdin
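
The heart of a tool like compare is finding where two streams first disagree. Here is a rough sketch over plain lists. The name firstDiff and the choice to count positions from 0 are assumptions for this example, not the behavior of the real tool.

```haskell
module Main (main) where

-- | Position (counting from 0) of the first place where two lists differ,
-- or Nothing if they are identical. If one list is a proper prefix of the
-- other, they differ at the position where the shorter one ends.
firstDiff :: Eq a => [a] -> [a] -> Maybe Int
firstDiff = go 0
  where
    go _ [] [] = Nothing
    go n (x:xs) (y:ys)
      | x == y = go (n + 1) xs ys
    go n _ _ = Just n

main :: IO ()
main = do
  print (firstDiff "kernighan" "kernel")  -- Just 4
  print (firstDiff "plauger" "plauger")   -- Nothing
  print (firstDiff "ab" "abc")            -- Just 2
```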

Chapter 4: Sorting

bubble: (bubble)sort lines on stdin
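
For reference, here is one way bubble sort can be written over immutable lists: make a pass that swaps adjacent out-of-order neighbors, and repeat until a pass changes nothing. This is a sketch under my own naming (bubblePass, bubbleSort), not the code from the book or from the port.

```haskell
module Main (main) where

-- | One left-to-right pass that swaps adjacent out-of-order elements.
-- After a pass, the largest element has moved to the end.
bubblePass :: Ord a => [a] -> [a]
bubblePass (x:y:rest)
  | x > y     = y : bubblePass (x : rest)
  | otherwise = x : bubblePass (y : rest)
bubblePass xs = xs

-- | Repeat passes until one leaves the list unchanged.
bubbleSort :: Ord a => [a] -> [a]
bubbleSort xs
  | next == xs = xs
  | otherwise  = bubbleSort next
  where
    next = bubblePass xs

-- Sort the lines of stdin and write them to stdout.
main :: IO ()
main = interact (unlines . bubbleSort . lines)
```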

Why Haskell?

The programs in Software Tools are written in Ratfor, a purpose-built extension of Fortran with control-flow statements. (At the time, control flow in Fortran was done by hand with GOTO.) Kernighan and Plauger explain that this was a pragmatic choice, as no language at the time had the right mix of ubiquity and expressiveness. With 40 years(!) of hindsight, though, I’d say that this was an inspired choice. Books written in real languages quickly become hopelessly outdated. But books written in toy languages can focus on timeless principles. TAOCP by Knuth (which I’ve never read) and Functional Programming: Practice and Theory by MacLennan (which I have) are positive examples of this, and I have a shelf full of nameless algebra books written in APL and Pascal to serve as negative examples.

So why Haskell? I’ve been using Haskell for several years as a “tool of thought”, to paraphrase Ken Iverson, mostly for one-off experiments. Haskell is good for that, and I find that it fits my problem-solving style very well. (Programs are arrows in a category? Of course!) But I want to improve my ability to write “real” programs in the language. So here we are.