Domain-Specific Languages
Michael Hunger
Whenever you listen to a discussion by experts in any domain, be it chess players, kindergarten teachers, or insurance agents, you’ll notice that their vocabulary is quite different from everyday language. That’s part of what domain-specific languages (DSLs) are about: a specific domain has a specialized vocabulary to describe the things that are particular to that domain.
In the world of software, DSLs are about executable expressions in a language specific to a domain, employing a limited vocabulary and grammar that is readable, understandable, and (hopefully) writable by domain experts. DSLs targeted at software developers or scientists have been around for a long time. The Unix “little languages” found in configuration files and the languages created with the power of LISP macros are some of the older examples.
DSLs are commonly classified as either internal or external:
Internal DSLs
Are written in a general-purpose programming language whose syntax has been bent to look much more like natural language. This is easier in languages that offer more syntactic sugar and formatting possibilities (e.g., Ruby and Scala) than in those that do not (e.g., Java). Most internal DSLs wrap existing APIs, libraries, or business code to provide less mind-bending access to the functionality. Because they are ordinary host-language code, they are directly executable. Depending on the implementation and the domain, they are used to build data structures, define dependencies, run processes or tasks, communicate with other systems, or validate user input. The syntax of an internal DSL is constrained by the host language. There are many patterns (e.g., expression builder, method chaining, and annotation) that can help you bend the host language to your DSL. If the host language doesn’t require recompilation, an internal DSL can be developed quite quickly, working side by side with a domain expert.
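To make the method-chaining and expression-builder patterns concrete, here is a minimal sketch of an internal DSL for validating user input, written in Python. Everything in it (the names field, required, at_least, at_most, and validate) is hypothetical and invented for illustration; it is not taken from any particular library.

```python
class Field:
    """Expression builder that collects validation rules for one input field."""

    def __init__(self, name):
        self.name = name
        self._checks = []  # list of (predicate, error message) pairs

    def required(self):
        self._checks.append((lambda v: v not in (None, ""), "is required"))
        return self  # returning self is what makes method chaining possible

    def at_least(self, minimum):
        # A missing value is left to required(), so it is not reported twice.
        self._checks.append(
            (lambda v: v is None or v >= minimum, f"must be at least {minimum}")
        )
        return self

    def at_most(self, maximum):
        self._checks.append(
            (lambda v: v is None or v <= maximum, f"must be at most {maximum}")
        )
        return self

    def errors_for(self, record):
        value = record.get(self.name)
        return [f"{self.name} {msg}" for ok, msg in self._checks if not ok(value)]


def field(name):
    """Entry point of the DSL; reads better than calling Field(...) directly."""
    return Field(name)


def validate(record, rules):
    """Apply every rule to the record and collect all error messages."""
    return [error for rule in rules for error in rule.errors_for(record)]


# The rules below are the "DSL expressions" a domain expert would read or write.
rules = [
    field("name").required(),
    field("age").required().at_least(18).at_most(120),
]

print(validate({"name": "Ada", "age": 36}, rules))
print(validate({"name": "", "age": 12}, rules))
```

The rule definitions stay valid Python, which illustrates the constraint mentioned above: the host language dictates the parentheses, dots, and quotes, and the builder can only soften them.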
97 Things Every Programmer Should Know
External DSLs
Are textual or graphical expressions of the language, although textual DSLs tend to be more common than graphical ones. Textual expressions can be processed by a toolchain that includes a lexer, parser, model transformer, generators, and any other type of post-processing. External DSLs are mostly read into internal models that form the basis for further processing. It is helpful to define a grammar (e.g., in EBNF), which provides the starting point for generating parts of the toolchain (e.g., editor, visualizer, parser generator). For simple DSLs, a handmade parser may be sufficient, using, for instance, regular expressions. Custom parsers can become unwieldy if too much is asked of them, so it makes sense to look at tools designed specifically for working with language grammars and DSLs, e.g., openArchitectureWare, ANTLR, SableCC, and AndroMDA. Defining external DSLs as XML dialects is also quite common, although readability is often an issue, especially for nontechnical readers.
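As an illustration of the simple end of that spectrum, the following Python sketch parses a tiny textual DSL with a handmade, regular-expression-based parser and reads it into an internal model (a plain dictionary) that is then processed further. The task language, its grammar, and the names parse and execution_order are all made up for this example; a real DSL of any size would likely be better served by one of the grammar tools mentioned above.

```python
import re

# Assumed grammar of the hypothetical task language, in EBNF:
#   script = { line } ;
#   line   = "task" name [ "after" name { "," name } ] ;
LINE = re.compile(
    r"^task\s+(?P<name>\w+)(?:\s+after\s+(?P<deps>\w+(?:\s*,\s*\w+)*))?$"
)

SCRIPT = """
# build pipeline
task package after compile, test
task compile
task test after compile
"""


def parse(text):
    """Read the DSL text into an internal model: {task name: [dependencies]}."""
    model = {}
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        match = LINE.match(line)
        if match is None:
            raise SyntaxError(f"line {number}: cannot parse {raw!r}")
        deps = match.group("deps")
        model[match.group("name")] = re.split(r"\s*,\s*", deps) if deps else []
    return model


def execution_order(model):
    """Further processing of the model: order tasks so dependencies run first."""
    order, visiting = [], set()

    def visit(name):
        if name in order:
            return
        if name not in model:
            raise ValueError(f"unknown task {name!r}")
        if name in visiting:
            raise ValueError(f"dependency cycle at {name!r}")
        visiting.add(name)
        for dep in model[name]:
            visit(dep)
        visiting.discard(name)
        order.append(name)

    for name in model:
        visit(name)
    return order


print(execution_order(parse(SCRIPT)))
```

Keeping parsing (text to model) separate from processing (model to result) mirrors the toolchain stages described above, and it means the regular expression could later be swapped for a generated parser without touching execution_order.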
You must always take the target audience of your DSL into account. Are they developers, managers, business customers, or end users? You have to adapt the technical level of the language, the available tools, syntax help (e.g., IntelliSense), early validation, visualization, and representation to the intended audience. By hiding technical details, DSLs can empower users by giving them the ability to adapt systems to their needs without requiring the help of developers. They can also speed up development, because work can be distributed once the initial language framework is in place. The language can be evolved gradually, and different migration paths are available for existing expressions and grammars.