Parser Combinators: a Walkthrough

  • Going further with the language itself: learning about language extensions, the theory behind them, and the advanced features they provide.
  • Learning more about the different ways to structure a program (such as monad transformers or effect systems).
  • Exploring some of the most used libraries, and understanding what makes them so ubiquitous: QuickCheck, Lens… and Parsec.

Choosing a type representation for parsers

A minimal viable implementation

type Parser a = String -> a
type Parser a = String -> Either ParseError a
newtype Parser a = Parser { runParser :: String -> (String, Either ParseError a) }

Writing elementary parsers

any :: Parser Char any = Parser $ \input -> case input of -- some input left: we unpack and return the first character (x:xs) -> (xs, Right x) -- no input left: the parser fails [] -> ("", Left $ ParseError "any character" -- expected "the end of the input" -- encountered ) eof :: Parser () eof = Parser $ \input -> case input of -- no input left: the parser succeeds [] -> ("", Right ()) -- leftover data: the parser fails (c:_) -> (input, Left $ ParseError "the end of the input" -- expected [c] -- encountered )

Sequencing parsers

jsonEntry :: Parser (String, JValue) jsonEntry = Parser $ \input -> -- parse a json string case runParser jsonString input of (input2, Left err) -> (input2, Left err) (input2, Right key) -> -- on success: parse a single colon case runParser (char ':') input2 of (input3, Left err) -> (input3, Left err) (input3, Right _) -> -- on success: parse a json value case runParser jsonValue input3 of (input4, Left err) -> (input4, Left err) (input4, Right value) -> -- on success: return the result (input4, Right (key, value))
andThen :: Parser a -> (a -> Parser b) -> Parser b parserA `andThen` f = Parser $ \input -> case runParser parserA input of (restOfInput, Right a) -> runParser (f a) restOfInput (restOfInput, Left e) -> (restOfInput, Left e)
jsonEntry :: Parser (String, JValue) jsonEntry = -- parse a json string jsonString `andThen` \key -> -- parse a single colon char ':' `andThen` \_ -> -- parse a JSON value jsonValue `andThen` \value -> -- create a constant parser that consumes no input constParser (key, value)

By the power of monads!

(>>=) :: Parser a -> (a -> Parser b) -> Parser b
jsonEntry :: Parser (String, JValue) jsonEntry = do key <- jsonString _ <- char ':' value <- jsonValue return (key, value)
(*>) :: Parser a -> Parser b -> Parser b (<*) :: Parser a -> Parser b -> Parser a (<$) :: a -> Parser b -> Parser a -- ignore leading spaces spaces *> value -- ignore trailing spaces value <* spaces -- substitute a value True <$ string "true"

Recap

satisfy :: String -> (Char -> Bool) -> Parser Char satisfy description predicate = try $ do c <- any if predicate c then return c else parseError description [c]

Combining parsers

(f . g) x = f (g x)

Choice, errors, and backtracking

try :: Parser a -> Parser a try p = Parser $ \state -> case runParser p state of (_newState, Left err) -> (state, Left err) success -> success
(<|>) :: Parser a -> Parser a -> Parser a p1 <|> p2 = Parser $ \s -> case runParser p1 s of (s', Left err) | s' == s -> runParser p2 s | otherwise -> (s', Left err) success -> success
choice :: String -> [Parser a] -> Parser a choice description = foldr (<|>) noMatch where noMatch = parseError description "no match"

Repetition

many, many1 :: Parser a -> Parser [a] many p = many1 p <|> return [] many1 p = do first <- p rest <- many p return (first:rest)
sepBy, sepBy1 :: Parser a -> Parser s -> Parser [a] sepBy p s = sepBy1 p s <|> return [] sepBy1 p s = do first <- p rest <- many (s >> p) return (first:rest)

Recap

Parsers in practice: parsing JSON

data JValue = JObject (HashMap String JValue) | JArray [JValue] | JString String | JNumber Double | JBool Bool | JNull

Recognizing characters

char c = satisfy [c] (== c) space = satisfy "space" isSpace digit = satisfy "digit" isDigit

Language syntax

string = traverse char spaces = many space symbol s = string s <* spaces between open close value = open *> value <* close brackets = between (symbol "[") (symbol "]") braces = between (symbol "{") (symbol "}")

Scalars

jsonNumber = read <$> many1 digit
jsonBool = choice "JSON boolean" [ True <$ symbol "true" , False <$ symbol "false" ]
jsonString = between (char '"') (char '"') (many jsonChar) <* spaces where jsonChar = choice "JSON string character" [ try $ '\n' <$ string "\\n" , try $ '\t' <$ string "\\t" , try $ '"' <$ string "\\\"" , try $ '\\' <$ string "\\\\" , satisfy "not a quote" (/= '"') ]

Arrays and objects

jsonObject = do assocList <- braces $ jsonEntry `sepBy` symbol "," return $ fromList assocList where jsonEntry = do k <- jsonString symbol ":" v <- jsonValue return (k,v)
jsonArray = brackets $ jsonValue `sepBy` symbol ","

Putting it all together

jsonValue = choice "a JSON value" [ JObject <$> jsonObject , JArray <$> jsonArray , JString <$> jsonString , JNumber <$> jsonNumber , JBool <$> jsonBool , JNull <$ symbol "null" ]

Wrapping up

--

--

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store
Hasura

Hasura

⚡️ Instant realtime GraphQL APIs! Connect Hasura to your database & data sources (GraphQL, REST & 3rd party API) and get a unified data access layer instantly.