Boot, the Fancy New Clojure Build Framework
Build tools are known to inspire the entire gamut of emotions from bored impatience to Homeric rage (I'm looking at you, Grunt). Personally, I've never given them much thought; they've always seemed like tedious overhead, an unfortunate necessity for getting real work done.
Recently, though, I've started learning about Boot, and I've found that build programming can actually be interesting. This article will explain Boot's underlying concepts and guide you through writing your first Boot tasks. If you're interested in using Boot to build projects right this second, then check out its github README and its wiki.
Boot's Abstractions
Created by Micha Niskin and Alan Dipert, Boot is a completely controversy-free addition to the Clojure tooling landscape. On the surface, it's "merely" a convenient way to build Clojure applications and run Clojure tasks from the command line. But dig a little deeper and you'll see that Boot is like the lisped-up lovechild of Git and Unix in that it provides abstractions that make it much more pleasant to write code that exists at the intersection of your operating system and your application.
Unix provides abstractions that we're all familiar with to the point
of taking them for granted. (I mean, would it kill you to take your
computer out to a nice restaurant once in awhile?) The process
abstraction allows you to reason about programs as isolated units of
logic that can be easily composed into a stream-processing pipeline
through the STDIN
and STDOUT
file descriptors. These abstractions
make certain kinds of operations, like text processing, very easy.
Similarly, Boot provides abstractions that make it actually pleasant to compose independent operations into the kinds of complex, coordinated operations that build tools end up doing, like converting Clojurescript into Javascript. Boot's task abstraction lets you easily define units of logic that communicate through filesets. The fileset abstraction keeps track of the evolving build context and it provides a well-defined, reliable method of task coordination, as opposed to the ill-defined, ad-hoc task coordination which programmers have to impose on other build tools.
That's a lot of high-level description, which hopefully is great for when you want to hook someone's attention, which hopefully I have now done. But I would be ashamed to leave you with a plateful of metaphors. Oh no, dear reader; that was only the appetizer. For the rest of this article you will learn what that word salad means by building your own Boot tasks. Along the way, you'll discover that build tools can actually have a conceptual foundation.
Tasks
Like make, rake, grunt, and other build tools of yore, Boot lets you define tasks. Tasks are
- named operations
- that take command line options
- dispatched by some intermediary program (make, rake, Boot)
Boot provides the dispatching program, boot
, and a Clojure library
that makes it easy for you to define named operations and their
command line options with the deftask
macro. So that you can see
what all the fuss is about, it's time to create your first
task. Normally, programming tutorials encourage have you write
something that prints "Hello World", but I like my examples to have
real-world utility, so your task is going to print "My pants are on
fire!", information which is objectively more useful. First,
install Boot, then create
a new directory named boot-walkthrough
, navigate to that directory,
and finally create a file named build.boot
and put in this in it:
(deftask fire
"Prints 'My pants are on fire!'"
[]
(println "My pants are on fire!"))
Now run this task with boot fire
; you should see the message you
wrote printed to your terminal. This demonstrates two out of the three
task components - the task is named (fire
) and it's dispatched by
boot
. This is already super cool – you've essentially created
a Clojure script, standalone Clojure code that you can easily run from
the command line. No project.clj
or directory structure or
namespaces needed!
Let's extend the example to demonstrate how you'd write command line options:
(deftask fire
"Announces that something is on fire"
[t thing THING str "The thing that's on fire"
p pluralize bool "Whether to pluralize"]
(let [verb (if pluralize "are" "is")]
(println "My" thing verb "on fire!")))
Try running the task like so:
boot fire -t heart
# => My heart is on fire!
boot fire -t logs -p
# => My logs are on fire!
In the first instance, either you're newly in love or you need to be rushed to the emergency room. In the second, you are a boy scout awkwardly exclaiming your excitement over meeting the requirements for a merit badge. In both instances, you were able to easily specify options for the task.
This refinement of the fire
task introduced two command line
options, thing
and pluralize
. These options are defined using the
options DSL. In the case of thing
, t
specifies the option's short
name and thing
specifies the long name. THING
is a little
complicated, and I'll get to it in a second. str
specifies the
option's type, and Boot uses that both to validate the argument and
convert it. "The thing that's on fire"
is the documentation for the
option. You can view a task's documentation with boot task-name -h
:
boot fire -h
# Announces that something is on fire
#
# Options:
# -h, --help Print this help info.
# -t, --thing THING Set the thing that's on fire to THING.
# -p, --pluralize Whether to pluralize
Pretty groovy! Boot makes it very, very easy to write code that's meant to be invoked from the command line.
Now, about THING
. THING
is an optarg and it indicates that this
option expects an argument. You don't have to include an optarg when
you're defining an option (notice that the pluralize
option has no
optarg). The optarg doesn't have to correspond to the full name of the
option; you could replace THING
with BILLY_JOEL
or whatever you
want and the task would work the same. Finally, you can also designate
complex options
using the optarg. (That link will take you to Boot's documentation on
the subject.) Basically, complex options allow you to specify that
option arguments should be treated as as maps, sets, vectors, or even
nested collections. It's pretty powerful.
Boot provides you with all the tools you could ask for in building command-line interfaces with Clojure. And you've only just started learning about it!
The REPL
Boot comes with a good number of useful built-in tasks, including a
REPL task; run boot repl
to fire up that puppy. The Boot REPL is
similar to Leiningen's in that it handles loading your project code so
that you can play around with it. You might not think this applies to
the project you've been writing because you've only written tasks, but
you can actually run tasks in the REPL (I've left out the
boot.user=>
prompt):
;; pass arguments as flags
(fire "-t" "NBA Jam guy")
; My NBA Jam guy is on fire!
;=> nil
;; or as keywords
(fire :thing "NBA Jam guy")
; My NBA Jam guy is on fire!
;=> nil
(fire "-p" "-t" "NBA Jam guys")
; My NBA Jam guys are on fire!
;=> nil
(fire :pluralize true :thing "NBA Jam guys")
; My NBA Jam guys are on fire!
;=> nil
And of course, you can also use deftask
in the REPL – it's just
Clojure, after all. The takeaway is that Boot lets you interact with
your tasks as Clojure functions, because that's what they are.
Composition and Coordination
If what you've seen so far was all that Boot had to offer, it'd be a pretty swell tool, though not very different from other build tools. One thing that sets Boot apart, though, is how it lets you compose tasks. For comparison's sake, here's an example Rake invocation (Rake is the premier Ruby build tool):
rake db:create db:migrate db:seed
In case you were wondering, this will create a database, run
migrations on it, and populate it with seed data when run in a Rails
project. What's worth noting, however, is that Rake doesn't provide
any way for these tasks to communicate with each other. Specifying
multiple tasks is just a convenience, saving you from having to run
rake db:create; rake db:migrate; rake db:seed
. If you want to access
the result of Task A within Task B, the build tool doesn't help you;
you have to manage that coordination yourself. Usually, you'll do this
by shoving the result of Task A into a special place on the
filesystem, and then making sure Task B reads that special place.
This looks like programming with mutable, global variables, and it's
just as brittle.
Handlers and Middleware
Boot addresses this problem by treating tasks as middleware factories. If you're familiar with Ring, Boot's tasks work very similarly; feel free to skip to the next section. If you're not familiar with the concept of middleware, then allow me to explain! First, the term middleware refers to a set of conventions that programmers adhere to so that they can flexibly create domain-specific function pipelines. That's pretty dense, so let's un-dense it. I'll go over the flexible part in this section, and cover domain-specific in the next.
To understand how the middleware approach differs from run-of-the-mill function composition, here's an example of composing everyday functions:
(def strinc (comp str inc))
(strinc 3)
; => "4"
There's nothing interesting about this function composition. This function composition is so unremarkable that it strains my abilities as a writer to try and actually say anything about it. There are two functions, each doing its own thing, and now they've been been composed into one. Whoop-dee-do!
Middleware introduce an extra step to function composition, and this
gives you more flexibility in defining your function
pipeline. Suppose, in the example above, that you wanted to return "I
don't like the number X"
for arbitrary numbers, but still return the
stringified number for everything else. Here's how you could do that:
(defn whiney-str
[rejects]
{:pre [(set? rejects)]}
(fn [x]
(if (rejects x)
(str "I don't like " x)
(str x))))
(def whiney-strinc (comp (whiney-str #{2}) inc))
(whiney-strinc 1)
; => "I don't like 2 :'("
Now let's take it one step further. What if you want to decide whether
or not to call inc
in the first place? Here's how you could do that:
(defn whiney-middleware
[next-handler rejects]
{:pre [(set? rejects)]}
(fn [x]
(if (= x 1) ; ~1~
"I'm not going to bother doing anything to that"
(let [y (next-handler x)]
(if (rejects y)
(str "I don't like " y " :'(")
(str y))))))
(def whiney-strinc (whiney-middleware inc #{3}))
Here, instead of using comp
to create your function pipeline, you
pass the next function in the pipeline as the first argument to the
middleware function. In this case, you're passing inc
as the first
argument to whiney-middleware
. whiney-middleware
then returns an
anonymous functions which closes over inc
and has the ability to
choose whether to call it or not. You can see this at ~1~
.
We say that middleware take a handler as their first argument, and
return a handler. In the example above, whiney-middleware
takes a
handler as its first argument, inc
here, and it returns another
handler, the anonymous function with x
as its only
argument. Middleware can also take extra arguments, like rejects
,
that act as configuration. The result is that the handler returned by
the middleware can behave more flexibly (thanks to configuration) and
it has more control over the function pipeline (because it can choose
whether or not to call the next handler).
Tasks are Middleware Factories
Boot takes this pattern one step further by separating middleware configuration from handler creation. First, you create a function which takes n many configuration arguments. This is the middleware factory and it returns a middleware function. The middleware function expects one argument, the next handler, and it returns a handler, just like in the example above. Here's a whiney middleware factory:
(defn whiney-middleware-factory
[rejects]
{:pre [(set? rejects)]}
(fn [handler]
(fn [x]
(if (= x 1)
"I'm not going to bother doing anything to that"
(let [y (handler x)]
(if (rejects y)
(str "I don't like " y " :'(")
(str y)))))))
(def whiney-strinc ((whiney-middleware-factory #{3}) inc))
As you can see, it's nearly identical to the previous example. The
change is that the topmost function, whiney-middleware-factory
, now
only accepts one argument, rejects
. It returns an anonymous
function, the middleware, which expects one argument, a handler. The
rest of the code is the same.
In Boot, tasks can act as middleware factories. In fact, they usually
do, I just didn't present them that way above in order to keep things
simple. To show this, let's split the fire
task into two tasks:
what
and fire
. what
will let you specify an object and whether
it's plural, and fire
will announce that it's on fire. This is
great, modular software engineering because it allows you to add other
tasks like gnomes
, to announce that a thing is being overrun with
gnomes, which is just as objectively useful. (Exercise for the reader:
create the gnome
task.)
(deftask what
"Specify a thing"
[t thing THING str "An object"
p pluralize bool "Whether to pluralize"]
(fn middleware [next-handler]
(fn handler [_]
(next-handler {:thing thing :pluralize pluralize}))))
(deftask fire
"Announce a thing is on fire"
[]
(fn middleware [next-handler]
(fn handler [thing-map]
(let [updated-thing-map (next-handler thing-map)
verb (if (:pluralize thing-map) "are" "is")]
(println "My" (:thing thing-map) verb "on fire!")))))
Here's how you'd run this on the command line:
boot what -t "pants" -p -- fire
And here's how you'd run it in the REPL:
(boot (what :thing "pants" :pluralize true) (fire))
Wait a minute, what's that boot
call doing there? In Micha's words,
"The boot
macro takes care of setup and cleanup (creating the
initial fileset, stopping servers started by tasks, things like
that). Tasks are functions so you can call them directly, but if they
use the fileset they will fail unless you call them via the boot
macro." Wait a minute, what's a fileset?
Filesets
I mentioned earlier that middleware are for creating domain-specific function pipelines. All that means is that each handler expects to receive domain-specific data, and returns domain-specific data. With Ring, for example, each handler expects to receive a request map representing the HTTP request. This might look something like:
{:server-port 80
:request-method :get
:scheme :http}
Each handler can choose to modify this request map in some way before
passing it on to the next handler, say by adding a :params
key with
a nice Clojure map of all query string and POST parameters. Ring
handlers return a response map, which consists of the keys
:status
, :headers
, and :body
, and once again each handler can
transform this data in some way before returning it to its parent
handler.
In Boot, each handlers receives and returns a
fileset. The
fileset abstraction gives you a way to treat files on your filesystem
as immutable data, and this is a great innovation for build tools
because building projects is so file-centric. For example, your
project might need to place temporary, intermediary files on the
filesystem. Usually, with build tools that aren't Boot, these files
get placed in some specially-named place, say,
project/target/tmp
. The problem with this is that
project/target/tmp
is effectively a global variable, and other tasks
can accidentally muck it up.
The fileset abstraction works by adding a layer of indirection on top
of the filesystem. Let's say Task A creates File X and tells the
fileset to store it. Behind the scenes, the fileset stores the file in
an anonymous, temporary directory. The fileset then gets passed to
Task B, and Task B modifies File X and asks the fileset to store the
result. Behind the scenes, a new file, File Y, is created and stored,
but File X remains untouched. In Task B, an updated fileset is
returned. This is the equivalent of doing assoc-in
with a map; Task
A can still access the original fileset and the files it references.
The mechanics of working with filesets are all explained in the fileset wiki, but I hope this gives a good conceptual overview!
Everything else
The point of this article was to explain the concepts behind
Boot. However, it also has a bunch of features, like set-env!
and
task-options!
that make life easier when you're actually using
it. It does amazing magical things like providing classpath isolation
so that you can run multiple projects using one JVM and letting you
add new dependencies to your project without having to restart your
REPL. If Boot tickles your fancy, check out its
README for more info on real-world
usage. Also, its wiki
provides top-notch documentation.
If you're new to Clojure, then check out Clojure for the Brave and True, an introduction to the language written by yours truly. Have fun!