Feb 24

Regex is something that’s scared me for a long time. I used to think it could only be understood by Indiana Jones coders, and only with the help of a wax crayon, a sheet of tracing paper and a secret book of biblical drawings. But this morning I discovered otherwise.

I’ve been studying for the Microsoft Certified Technology Specialist foundation exam, reading through and writing code from this big book. I whizzed through the first two chapters with massive enjoyment, then came to regex in the third chapter and hid the book in a drawer for a month, being too busy to look at it. It’s not that it’s overly tough; it’s just tough enough to put you off, with loads of weird characters in undecipherable sequences that are not very intuitive and need committing to memory. Check this out:

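        // Needs "using System.Text.RegularExpressions;" at the top of the file.
        // Returns true if s is a 5-digit zip code, optionally followed by -NNNN.
        static bool IsZip(string s)
        {
            return Regex.IsMatch(s, @"^\d{5}(\-\d{4})?$");
        }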

It’s a piece of regex that accepts US zip codes, which can look like either 90210 or 90210-1111 (five digits, or five digits, a dash and four digits). It doesn’t look too inviting, does it? But let’s break it down:

^ Means match from the start of the string (or the start of each line, if you turn on RegexOptions.Multiline)

\d{5} Means match exactly 5 digits (5 numerical characters)

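\- Means match a literal dash (the backslash is harmless here, but outside square brackets a plain - would work just as well)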

\d{4} Means match 4 digits

(\-\d{4})? Together means optionally match a dash followed by four digits. The ? means optional, and the parentheses mean the dash AND the four digits are treated as one unit. So it will accept a dash and four digits, or nothing, but not a dash alone, or four digits alone.

$ Means match the end of the string (or the end of each line, with RegexOptions.Multiline). This is important. If you miss this you could end up accepting 90210beverlyhillscopsucks as a valid zip code.
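To see the breakdown in action, here’s a rough little console sketch that throws a few strings at the pattern. IsZip is the method from above; the ZipDemo class, the sample strings and the IsZipNoEndAnchor variant (the same pattern with the $ chopped off) are my own additions, made up purely to show what the end anchor buys you.

```csharp
using System;
using System.Text.RegularExpressions;

static class ZipDemo
{
    // The pattern from the post, unchanged.
    public static bool IsZip(string s)
    {
        return Regex.IsMatch(s, @"^\d{5}(\-\d{4})?$");
    }

    // Hypothetical variant for comparison only: same pattern, no trailing $.
    public static bool IsZipNoEndAnchor(string s)
    {
        return Regex.IsMatch(s, @"^\d{5}(\-\d{4})?");
    }

    static void Main()
    {
        // Made-up sample inputs covering the cases discussed above.
        string[] samples =
        {
            "90210",
            "90210-1111",
            "90210-",
            "902101111",
            "9021",
            "90210beverlyhillscopsucks"
        };

        foreach (string s in samples)
        {
            Console.WriteLine("{0,-28} IsZip={1,-6} NoEndAnchor={2}",
                s, IsZip(s), IsZipNoEndAnchor(s));
        }
    }
}
```

Only 90210 and 90210-1111 get through IsZip. Without the $, everything that merely starts with five digits sneaks in, so only 9021 is turned away.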

Piece it all together and you feel a bit like Indiana Jones. Unfortunately deciphering this regex string hasn’t led to a quest to the Brazilian rainforest to dig for skulls in haunted caves, but we’ll see what the rest of the MCTS book brings.

The important thing is that, like Mac and that water heater thing in the basement from Home Alone, I’m not afraid anymore.
