How Does an Engineer Create a Programming Language?

Besides being a software engineer, Marianne Bellotti is also a kind of technological anthropologist. Back in 2016 at the Systems We Love conference, Bellotti began her talk by saying she appreciated the systems most engineers hate: “messy, archaic, duct-tape-and-chewing-gum.” Then she added, “Fortunately, I work for the federal government.”

At the time, Bellotti was working for the U.S. Digital Service, where talented technology workers are matched to federal systems in need of some consultation. (While there, she’d encountered a web application drawing its JSON-formatted data from a half-century-old IBM 7074 mainframe.)

The rich experiences led her to write a book with the irresistible title “Kill It with Fire: Manage Aging Computer Systems (and Future Proof Modern Ones).” Its official web page at Random House promises it offers “a far more forgiving modernization framework” with “illuminating case studies and jaw-dropping anecdotes from her work in the field,” including “Critical considerations every organization should weigh before moving data to the cloud.”

Bellotti is now working on products for defense and national security agencies as the principal engineer for system safety at Rebellion Defense (handling identity and access control).

But her latest project is a podcast chronicling what she’s learned while trying to write her own programming language.

“Marianne Writes a Programming Language” captures a kind of expedition of the mind, showing how the hunger to know can keep leading a software engineer down ever-more-fascinating rabbit holes. But it’s also an inspiring example of the do-it-yourself spirit, and a fresh new perspective on the parsers, lexers and evaluators that make our code run.

In short, it’s a deeply informative deconstruction of where a programmer’s tools really come from.

Going Deep

In one blog post, Bellotti invited listeners to “start this strange journey with me through parsers, grammars, data structures and the like.”

And it is a journey, filled with hope and ambition, and a lot of unexpected twists and turns. “Along the way, I’ll interview researchers and engineers who are active in this space and go deep on areas of programming not typically discussed,” the podcast host promised. “All in all, I’m hoping to start a conversation around program language design that’s less intimidating and more accessible to beginners.”

But the “Marianne Writes a Programming Language” podcast also comes with a healthy dose of self-deprecation. “Let’s get one question out of the way,” her first episode began. “Does the world really need another programming language? Probably not, no.” But she described it as a passion project, driven by good old-fashioned curiosity. “I have always wanted to write a programming language. I figured I would learn so much from the challenge.”

Fifteen years into a sparkling technology career, “I feel like there are all these weird holes in my knowledge,” Bellotti told her audience. And even with the things she does know — like bytecode and logic gates — “I don’t have a clear sense of how all those things work together.”

In the podcast’s third episode, Bellotti pointed out that, “for me at least, the hardest part of learning something is figuring out how to learn it in the first place.” She discovered a surprising lack of best-practices documents, she wrote in an essay on Medium. “In an industry filled with opinions, where people will fight to the death over tabs -vs.- spaces, there isn’t much guidance for would-be program language designers.”

Still, her podcast’s first episode showed the arrival of those first glimmers of insight. “Even knowing very little upfront, I had a sense that in order for a programming language to work, there had to be some sense of cohesion in its design.”

Where to Begin?

Her Medium post cited a 2012 article titled “Programming Paradigms for Dummies: What Every Programmer Should Know,” which offers a taxonomy of language types based on how exactly they’re providing their abstractions. That article apparently got her thinking about how exactly a programming language helps communicate the connections that exist between its various data structures — which led to more insights. (In a later podcast, Bellotti even says “technology suggests to its user how it should be used.”)

“Eventually I came to my own conclusions,” she wrote in her Medium article. To be successful at creating her own language, she realized that she needed to think of programming paradigms like object-oriented or functional programming “as logical groupings of abstractions and be as intentional about what is included and what isn’t.”

Bellotti is also trying to design a language that will work for her specific needs: to know how likely certain types of problems are in a given system, to achieve model resilience. But on her first podcast episode, Bellotti acknowledged that she still had to begin by typing “How do you design a programming language” into Google, and was surprised by how little came up. (Although she did discover “there’s a whole world of obscure experimental languages that appear in research papers, rack up a host of citations, and never touch an actual computer other than their inventor’s.”)

So where to begin? Avoiding the standard dry collegiate textbooks like “Compilers: Principles, Techniques, and Tools,” she instead found her way to “Writing an Interpreter in Go,” a book that by necessity also created its own programming language (a modified version of Scheme called Monkey) for its interpreter.

That book’s author, Thorsten Ball, became her podcast’s first guest, explaining that his language was not so much designed as experimented into existence. (Later, other people suggested something similar — that Bellotti “pick something you like in another language and copy the implementation to start, because figuring out all the edge cases from scratch is really hard.”)

In that first podcast episode, Bellotti explained her concern that “tiny little design decisions I don’t even realize I’m making could have dramatic impacts… it does seem to be the case that programmers create languages without being able to fully anticipate exactly how they will be used or how technology will change around them.”

Things Get Complicated

There are moments where it all sounds so simple. (“What you’re doing when you write a programming language is actually writing a series of applications that take string input and translate it into something the machine can execute.”)
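To make that description concrete, here is a minimal sketch of such a "series of applications" in Python: a lexer turns the string into tokens, a parser turns the tokens into a tree, and an evaluator executes the tree. The tiny arithmetic language and every name in it (`tokenize`, `parse`, `evaluate`, `run`) are assumptions made up for this illustration. They are not taken from Bellotti's language or from the Monkey interpreter.

```python
import operator

DIGITS = "0123456789"
OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}


def tokenize(source):
    """Stage 1 (lexer): turn a source string into a list of (kind, value) tokens."""
    tokens = []
    i = 0
    while i < len(source):
        ch = source[i]
        if ch.isspace():
            i += 1
        elif ch in DIGITS:
            start = i
            while i < len(source) and source[i] in DIGITS:
                i += 1
            tokens.append(("NUM", int(source[start:i])))
        elif ch in "+-*/()":
            tokens.append(("OP", ch))
            i += 1
        else:
            raise SyntaxError(f"unexpected character {ch!r}")
    return tokens


def parse(tokens):
    """Stage 2 (parser): turn tokens into a nested-tuple syntax tree."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else ("EOF", None)

    def advance():
        nonlocal pos
        tok = peek()
        pos += 1
        return tok

    def expr():  # expr := term (("+" | "-") term)*
        node = term()
        while peek() in (("OP", "+"), ("OP", "-")):
            op = advance()[1]
            node = (op, node, term())
        return node

    def term():  # term := factor (("*" | "/") factor)*
        node = factor()
        while peek() in (("OP", "*"), ("OP", "/")):
            op = advance()[1]
            node = (op, node, factor())
        return node

    def factor():  # factor := NUM | "(" expr ")"
        kind, value = advance()
        if kind == "NUM":
            return value
        if (kind, value) == ("OP", "("):
            node = expr()
            if advance() != ("OP", ")"):
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {value!r}")

    tree = expr()
    if peek()[0] != "EOF":
        raise SyntaxError("unexpected trailing input")
    return tree


def evaluate(node):
    """Stage 3 (evaluator): walk the tree and compute a result."""
    if isinstance(node, int):
        return node
    op, left, right = node
    return OPS[op](evaluate(left), evaluate(right))


def run(source):
    return evaluate(parse(tokenize(source)))


print(run("1 + 2 * 3"))    # 7
print(run("(1 + 2) * 3"))  # 9
```

Each stage only knows about the output of the stage before it, which is why the pieces can be written, tested and replaced one at a time.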

But things get complicated pretty quickly, and by episode three Bellotti started to see a pattern: “Confronting what feels like a tidal wave of information is becoming an all too familiar feeling on this project.” Yet while considering her language’s need for a parser to interpret its source code, she realized that parsers can be auto-generated, as long as she can supply the generating tool with the necessary grammar rules.

“I feel like I’ve been struggling to hang pictures around my home and one day someone knocks on my door and introduces me to the hammer,” she told her podcast audience.

She ends up talking to a linguist who studied under Noam Chomsky. That linguist refers her to another linguistics professor, who begins by discussing whether language can be learned through the brute-force assimilation of machine learning, and ends up explaining why Chomsky’s “context-free grammar” ultimately became the basis for programming languages and compilers.
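For readers who have not met a context-free grammar before, a small sketch may help. In the Python below, the grammar is plain data: each rule names the sequences of symbols it may expand into, and a rule is allowed to refer back to itself, which is what lets it describe arbitrarily deep nesting. A short generic routine then checks strings against whatever rules it is handed, which is loosely the idea behind generating a parser from grammar rules. The toy grammar and the function names are invented for this example, and real parser generators use far more efficient algorithms than this naive backtracking search.

```python
# A toy context-free grammar (hypothetical, for illustration only).
# Keys are non-terminals; any symbol that is not a key is a literal terminal.
GRAMMAR = {
    "expr": [["term", "+", "expr"], ["term"]],
    "term": [["(", "expr", ")"], ["n"]],
}


def derive(symbol, text, pos):
    """Yield every position where `symbol` can finish matching, starting at `pos`."""
    if symbol not in GRAMMAR:
        # Terminal: it must appear literally in the text.
        if text.startswith(symbol, pos):
            yield pos + len(symbol)
        return
    # Non-terminal: try each alternative expansion in turn.
    for production in GRAMMAR[symbol]:
        yield from derive_sequence(production, text, pos)


def derive_sequence(symbols, text, pos):
    """Match a sequence of symbols one after another, backtracking as needed."""
    if not symbols:
        yield pos
        return
    for middle in derive(symbols[0], text, pos):
        yield from derive_sequence(symbols[1:], text, middle)


def accepts(text):
    """True if the whole string can be derived from the start symbol 'expr'."""
    return any(end == len(text) for end in derive("expr", text, 0))


print(accepts("n+n"))      # True
print(accepts("((n))+n"))  # True
print(accepts("(n"))       # False
```

Changing the language means editing only the `GRAMMAR` table; the matching code stays the same. Note that this simple search would loop forever on a left-recursive rule such as `expr -> expr + term`, so the rules here are written to avoid that.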

But there are resources to discover. Along the way, Bellotti found a Reddit forum about programming language design. (“This subreddit is full of great stories and people will give detailed explanations and encouragement, which is rare on the internet these days.”) She’s also found a forum for people building domain-specific languages.

By December, she’d received a comment from a grateful listener who was also writing their own programming language, and was glad to find a relevant podcast. And Bellotti acknowledged in a response that her whole journey “has been so much fun so far.”

Progress is clearly being made. By episode 12, Bellotti considered how hard it would be to add modules to her language. (“From my vantage point, being able to split a system specification into smaller parts means you get to reuse those parts and build progressively more complex systems that are in easily digestible chunks.”) And there’s also already an empty repository on GitHub that’s waiting expectantly for the code to arrive.

Then, in mid-April Bellotti announced that episode 12 would be the last one “for a while. I’ve made some design decisions that I feel really good about, but it’s clear that the only way to validate them is to write code and try things out.”

She’s also spending some time researching how to optimize her compiler, “But really, I just need to just be heads-down, hands-on-a-keyboard for a while on this.”

And so, the podcast has entered a productive hiatus, leaving listeners with this tantalizing promise:

“I’ll be back in a couple of months to let you know how that went.”