Hi, I’m Martin. You should follow me: @martinklepsch

January 2015 CLJSJS - Use Javascript Libraries in Clojurescript With Ease

In Clojure, Java interoperability or “interop” is a core feature. In Clojurescript, interop with Javascript libraries does not work out-of-the-box across optimization modes. Extern files or “externs” required for advanced optimizations are often hard to find.

To fix this, a few newly found friends and I created CLJSJS. CLJSJS is an effort to package Javascript libraries together with their respective extern files and to provide tools for integrating them into your project.

My personal hope is that this will make it easier for newcomers to get started with Clojurescript.

Also, existing solutions like deps.clj (more here) only partially address the problem of Javascript dependencies. Maybe CLJSJS can serve as a vehicle to find some “pseudo-standard” for this kind of stuff.

Thanks to Juho Teperi, Micha Niskin & Alan Dipert for their contributions and ideas so far. Now go and check out the project homepage or jump straight into the packages repo and learn how you can contribute.

Announcement post and discussion on the Clojurescript mailing list

November 2014 Why Boot is Relevant For The Clojure Ecosystem

Boot is a build system for Clojure projects. It roughly competes in the same area as Leiningen but Boot’s new version brings some interesting features to the table that make it an alternative worth assessing.

Compose Build Steps

If you’ve used Leiningen for more than packaging jars and uberjars, you have likely come across plugins like lein-cljsbuild or lein-garden. Both compile your sources into a target format (JS and CSS, respectively). If you want to run both of these tasks at the same time, which you probably do during development, you have two options: either you open two terminals and start them separately, or you fall back to something like the code below, which you run in a dev profile (this is how it’s done in Chestnut):

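;; `lein` is assumed to alias leiningen.core.main in the ns form.
(defn start-garden []
  (future ; run in the background so the calling thread isn't blocked
    (print "Starting Garden.\n")
    (lein/-main ["garden" "auto"])))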

In my opinion there are issues with both of these options. Opening two terminals to start your development environment is not very user friendly, and putting build-related code into your codebase adds boilerplate that can cause unnecessary trouble once it gets outdated.

What Boot allows developers to do is to write small composable tasks. These work somewhat similar to stateful transducers and ring middleware in that you can just combine them with regular function composition.

A Quick Example

Playing around with Boot, I tried to write a task. To test this task in an actual project I needed to install it into my local repository (in Leiningen: lein install). Knowing that I’d need to reinstall the task constantly as I change it I was looking for something like Leiningen’s Checkouts so I don’t have to re-install after every change.

Turns out Boot can solve this problem in a very different way that illustrates the composing mechanism nicely. Boot defines a bunch of built-in tasks that help with packaging and installing a jar: pom, add-src, jar & install.

We could call all of these on the command line as follows:

boot pom add-src jar install

Because we’re lazy we’ll define it as a task in our project’s build.boot file. (Command-line tasks and their arguments are symmetric to their Clojure counterparts.)

(require '[boot.core          :refer [deftask]]
         '[boot.task.built-in :refer [pom add-src jar install]])

(deftask build-jar
  "Build jar and install to local repo."
  [] ; no task options
  (comp (pom) (add-src) (jar) (install)))

Now boot build-jar is roughly equivalent to lein install. To have any changes directly reflected on our classpath we can just compose our newly written build-jar task with another task from the repertoire of built-in tasks: watch. The watch task observes the file system for changes and initiates a new build cycle when they occur:

boot watch build-jar

With that command we just composed our already composed task with another task. Look at that cohesion!

There Are Side-Effects Everywhere!

Is one concern that has been raised about Boot. Leiningen is beautifully declarative. It’s one immutable map that describes your whole project. Boot on the other hand looks a bit different. A usual boot file might contain a bunch of side-effectful functions and in general it’s much more a program than it is data.

I understand that this might seem like a step back at first sight. In fact, I looked at it with confusion as well. There are some problems with Leiningen, though, that are probably hard to work out in Leiningen’s declarative manner (think back to running multiple lein X auto commands).

Looking at Boot’s code it becomes apparent that the authors spent a great deal of time on isolating the side effects that might occur in various build steps. I recommend reading the comments on this Hacker News thread for more information on that.

When To Use Boot, When To Use Leiningen

Boot is a build tool. That said, its task composition features only get to shine when multiple build steps are involved. If you’re developing a library I’m really not going to try to convince you to switch to Boot. Leiningen works great for that and is, I’d assume, more stable than Boot.

If you however develop an application that requires various build steps (like Clojurescript, Garden, live reloading, browser-repl) you should totally check out Boot. There are tasks for all of the above mentioned: Clojurescript, Clojurescript REPL, Garden, live reloading. I wrote the Garden task and writing tasks is not hard once you have a basic understanding of Boot.

If you need help or have questions, join the #hoplon channel on freenode IRC. I’ll try to help, and if I can’t, Alan or Micha, the authors of Boot, probably can.

October 2014 S3-Beam — Direct Upload to S3 with Clojure & Clojurescript

In a previous post I described how to upload files from the browser directly to S3 using Clojure and Clojurescript. I now packaged this up into a small (tiny, actually) library: s3-beam.

An interesting note on what changed compared to the process described in the earlier post: the code now uses pipeline-async instead of transducers. After some discussion with Timothy Baldridge this seemed more appropriate, even though there are some aspects of the transducer approach that I liked but didn’t get to explore further.

Maybe in an upcoming version it will make sense to reevaluate that decision. If you have any questions, feedback or suggestions I’m happy to hear them!

October 2014 Patalyze — An Experiment Exploring Publicly Available Patent Data

For a few months now I’ve been working on and off on a little “data-project” analyzing patents published by the US Patent & Trademark Office. Looking at the time I spent on this until now I think I should start talking about it instead of just hacking away evening after evening.

It started with a simple observation: there are companies like Apple that sometimes collaborate with smaller companies building a small part of Apple’s next device. A contract like this usually gives the stock of the small company a significant boost. What if you could foresee those relationships by finding patents that employees from Apple and from the small company filed?

An API for patent data?

Obviously this isn’t going to change the world for the better but just the possibility that such predictions or at least indications are possible kept me curious to look out for APIs offering patent data. I did not find much. So thinking about something small that could be “delivered” I thought a patent API would be great. To build the dataset I’d parse the archives provided on Google’s USPTO Bulk downloads page.

I later found out about Enigma and some offerings by Thomson Reuters. The prices are high and the sort of analysis we wanted to do would have been hard with inflexible query APIs.

For what we wanted to do we only required a small subset of the data a patent contains. We needed the organization, its authors, the title and description, filing and publication dates, and some identifiers. With such a reduced amount of data that’s almost only useful in combination with the complete data set, I discarded the plan to build an API. Maybe it will make sense at some point to publish reduced and more easily parseable versions of the archives Google provides. Let me know if you would be interested in that.

What’s next

So far I’ve built up a system to parse, store and query some 4 million patents that have been filed at the USPTO since the beginning of 2001. While it sure would be great to make some money off of this work, I’m not sure what product could be built from the technology I created. Maybe I could sell the dataset, but the number of potential customers is probably small and in general I’d much prefer to make it public. I’ll continue to explore the possibilities with regard to that.

For now I want to explore the data and share the results of this exploration. I set up a small site that I’d like to use as a home for any further work on this. For now it only has a newsletter signup form (just like any other landing page) but I hope to share some interesting analysis with the subscribers to the list every now and then in the near future. Check it out at patalyze.co. There even is a small chart showing some data.

September 2014 Running a Clojure Uberjar inside Docker

For a side project I wanted to deploy a Clojure uberjar on a remote server using Docker. I imagined that to be fairly straightforward but there are some caveats you need to be aware of.

Naively my first attempt looked somewhat like this:

FROM dockerfile/java
ADD https://example.com/app-standalone.jar /
ENTRYPOINT [ "java", "-verbose", "-jar", "/app-standalone.jar" ]

I expected this to work. But it didn’t. Instead it just printed the following:

[Opened /usr/lib/jvm/java-7-oracle/jre/lib/rt.jar]
# this can vary depending on what JRE you're using

And that has only been printed because I added -verbose when starting the jar. So if you’re not running the jar verbosely it’ll fail without any output. Took me quite some time to figure that out.

As it turns out the dockerfile/java image contains a WORKDIR command that somehow breaks my java invocation, even though it is using absolute paths everywhere.

What worked for me

I ended up splitting the procedure into two files in a way that allowed me to always get the most recent jar when starting the docker container.

The Dockerfile basically just adds a small script to the container that downloads a jar from somewhere (S3 in my case) and starts it.

FROM dockerfile/java
ADD fetch-and-run.sh /
EXPOSE 42042
CMD ["/bin/sh", "/fetch-and-run.sh"]

And here is fetch-and-run.sh:

#! /bin/sh
wget https://s3.amazonaws.com/example/yo-standalone.jar -O /yo-standalone.jar;
java -verbose -jar /yo-standalone.jar
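
A slightly more defensive variant of this script might look like the sketch below. It is only an illustration, not the script used in the original setup: wget -O truncates the target file before downloading, so the sketch downloads to a temporary file first and only starts the JVM when the download succeeded. JAR_URL and JAR_PATH are hypothetical environment variables introduced here; the defaults are the values from the script above.

```sh
#!/bin/sh
# Sketch only. JAR_URL and JAR_PATH are hypothetical overrides that
# are not part of the original setup.

fetch_and_run() {
    jar_url="${JAR_URL:-https://s3.amazonaws.com/example/yo-standalone.jar}"
    jar_path="${JAR_PATH:-/yo-standalone.jar}"

    # Download to a temporary file so a failed download never replaces
    # a previously fetched jar with a truncated one.
    if ! wget "$jar_url" -O "$jar_path.tmp"; then
        echo "download failed: $jar_url" >&2
        rm -f "$jar_path.tmp"
        return 1
    fi
    mv "$jar_path.tmp" "$jar_path"

    # Same invocation as in the original script.
    java -verbose -jar "$jar_path"
}
```

In an actual fetch-and-run.sh the function would be called on the last line of the script (fetch_and_run), so the container exits with a non-zero status when the download fails.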

Now when you build a new image from that Dockerfile it adds the fetch-and-run.sh script to the image’s filesystem. Note that the jar is not part of the image but that it will be downloaded whenever a new container is being started from the image. That way a simple restart will always fetch the most recent version of the jar. In some scenarios it might become confusing to not have precise deployment tracking but in my case it turned out much more convenient than going through the process of destroying the container, deleting the image, creating a new image and starting up a new container.
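
To make the comparison concrete, the two workflows could be sketched roughly as follows. The container name yo and the image name yo-image are placeholders invented for this example, and the port mapping simply mirrors the EXPOSE line of the Dockerfile above.

```sh
#!/bin/sh
# Hypothetical names: "yo" (container) and "yo-image" (image) are
# placeholders, not taken from the original setup.

# Full redeploy, as needed when the jar is baked into the image:
# stop and remove the container, delete and rebuild the image,
# then start a fresh container.
redeploy_with_rebuild() {
    docker stop yo &&
    docker rm yo &&
    docker rmi yo-image &&
    docker build -t yo-image . &&
    docker run -d --name yo -p 42042:42042 yo-image
}

# With fetch-and-run.sh the jar is downloaded at container start,
# so restarting the existing container is enough.
redeploy_with_restart() {
    docker restart yo
}
```

The trade-off is the one described above: the restart is a single command, but nothing records which version of the jar a given container is running.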
