Wednesday, 13 March 2013

Moving the blog

I've moved this blog over to GitHub. Please update your bookmarks and RSS readers.

Click here to go to the new blog

Wednesday, 6 February 2013

Talk on Frinj

I recently gave a talk on frinj at Skills Matter in London.

I go over the basics and also attempt some more involved calculations (*). Lots of live coding in Emacs included.

Click here for the video

(*) that among other things tries to explain the course of human civilisation for the last couple of hundred years :-)

Sunday, 27 January 2013

Embedding a new runtime into your legacy C/C++ app

Let's say you have a big / legacy C++ app; then you're undoubtedly covered by Greenspun's tenth rule. Let's also say that your home-grown, buggy and slow DSL / scripting language has been pushed to the limit and cannot be tweaked any further. What do you do? How can you replace it?

As you might expect, this is quite a common problem, and embedding scripting languages into a big C/C++ monolith is popular. There are famous examples from gaming where Lisps and Lua are widely used.

In this post I'll focus on 4 options: Mono, JavaScript / v8, Guile and Lua. I only picked runtimes where true 'scripting' is possible, thus all of them are managed environments with garbage collection. I will try to categorize these 4 using some key metrics that are of interest when embedding a runtime. I assume that you need to "properly" embed these runtimes, i.e. creating a RESTful micro service is not an option.

Benchmarking figures for most of these runtimes are available on Alioth.

Option 1/ Heavy-weight full blown generic runtime - Mono (.NET)

Mono is an open-source implementation of Microsoft's Common Language Runtime (CLR) plus a few tools such as a C# compiler. The project has been going for 8 years now and has been making steady progress. Version 3 introduced a new generational garbage collector, and overall it's performant and stable. It is possible to embed into your application, but you have to realise that Mono/CLR is a generic VM specified at byte-code level. It's intended to be the target of many language compilers, even if C# is the most commonly known. .NET is one of the cornerstones of Windows, so it comes with mechanisms for versioning and signing its "assemblies" (executables / libraries) and storing them in a central depot (the global assembly cache, GAC). Mono includes most of these features as well.

Pros:
- General purpose, supports many languages
- Big ecosystem with ready-made libraries
- Multi-threaded applications
- Great IDE support

Cons:
- Big
- Somewhat clunky to interop with native code
- Assemblies need to be stored/shipped in binary form -- not as simple scripts
- No natural way to work with the GAC from the embedded VM
- C# and F# are statically typed languages (might not be a great fit for scripting purposes)
- Hard to static-link (embed) into your app

Size of (static-linked) hello world example: 12MB
Time of running hello world example: 70ms
Future proof (10 years from now): 3 out of 5

The static-link issue can be a major headache when embedding Mono. Other than this it's a very powerful and stable runtime. Mono can also use LLVM for its JIT code generation, making it suitable for many different CPU architectures. The fact that C# and F# "scripts" need to be shipped / stored as binary assemblies can be a deal-breaker if you're looking for an easily editable/patchable script solution. Note that this is only true for the compiled CLR languages such as C# and F#; IronPython and Clojure, for instance, can be shipped as source.

Option 2/ Medium-weight, not-so generic runtime - JavaScript / v8

JavaScript is a huge language nowadays, and the runtime implementations in the big browsers (except maybe IE) are now very sophisticated and fast. In fact, v8 is on par with Mono/C#, which is quite astonishing if you consider the nature of the JavaScript language and what v8 has to do in order to run that fast. V8 has been designed to be embeddable and also offers nice and easy interop.

Pros:
- Fast
- Wide industry usage
- Big ecosystem (node.js is a great source)
- Easy to embed / interop (again, node.js is a great example)
- Reads scripts in source format so they can be stored/shipped in that manner
- Dynamically typed
- Huge industry uptake, you can safely assume that all your new devs will know it

Cons:
- Single threaded
- Quirky syntax and other language artefacts

Size of (static-linked) hello world example: 5.5MB
Time of running hello world example: 25ms
License: New BSD License
Future proof: 5 out of 5

Because every browser can run JavaScript, the language has unmatched reach. Over the last couple of years it has become the "bytecode of the web", meaning that lots of languages/compilers have emerged that target JavaScript, for example CoffeeScript, ClojureScript and TypeScript, to mention just a few.

Option 3/ Medium-weight, generic runtime - Guile

Guile is the official extension language in the GNU universe. Originally it was a Scheme implementation only, but Guile 2.0 added parsers for JavaScript and Emacs Lisp. Support for Lua is also in the works. The idea is to expose the innards of your app to Scheme programs, in the form of Scheme functions, thus making it possible for you and your users to use the software in a very flexible way.

Pros:
- Good interop
- Very powerful language
- Dynamically typed

Cons:
- Quite slow, an order of magnitude slower than Mono/v8
- Hard to static-link (embed) into your app
- Small ecosystem of ready-made libraries
- Restrictive licensing

Size of (static-linked) hello world example: 5MB
Time of running hello world example: 20ms
Future-proof: 5 out of 5

Picking a Lisp for your scripting might seem controversial, but the level of expressiveness it gives is unmatched by any other language. If licensing is a problem, there are other Scheme implementations worth considering, such as Chicken, Gambit and Bigloo. Guile tends to be the slowest of them all. Guile also shares some of the headaches with Mono when it comes to statically linking it into your app.

Option 4/ Light-weight, single-language runtime - Lua

Lua is a popular embedded scripting language in games (World of Warcraft) and many embedded systems. It's extremely small and draws a lot of its power from its simplicity. Lua was specifically designed to be embedded, so it's also very easy to interop with your existing code.

Pros:
- Extremely lightweight
- Amazing interop
- Simple yet powerful language
- Broad industry uptake
- Dynamically typed

Cons:
- Slow (about 30x slower than v8/Mono)
- Single threaded
- Small ecosystem of ready-made libraries

Size of (static-linked) hello world example: 198KB
Time of running hello world example: 12ms
Future-proof: 3 out of 5

The slowness of Lua has been addressed in the LuaJIT project, which indeed produces some very impressive numbers and is well worth a look.


While these 4 aren't a complete list, I believe they cover many bases. Other popular embeddable languages include Python and Ruby; I'd put them in the same group as Guile when it comes to complexity and performance. The safest option in most cases is IMHO JavaScript / v8. It's got the speed, industry acceptance and developer familiarity. If you have a resource-constrained system, Lua is very attractive. Finally, if you're looking for maximum expressiveness in your embedded language, Scheme/Guile is hard to beat.
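As a rough way to line up the footprint figures quoted above, here is a small Python sketch that filters the four runtimes by a size budget. The figures are the hello world numbers from this post; the dictionary layout, the `within_budget` helper and the budgets are made up for illustration, and 198KB is converted assuming 1MB = 1000KB.

```python
# Hello world figures quoted in this post (static-linked size, run time).
runtimes = {
    "Mono": {"size_mb": 12.0, "hello_ms": 70},
    "v8": {"size_mb": 5.5, "hello_ms": 25},
    "Guile": {"size_mb": 5.0, "hello_ms": 20},
    "Lua": {"size_mb": 0.198, "hello_ms": 12},  # 198KB, assuming 1MB = 1000KB
}


def within_budget(max_size_mb):
    """Names of runtimes whose hello world fits the size budget, smallest first."""
    fitting = [name for name, m in runtimes.items() if m["size_mb"] <= max_size_mb]
    return sorted(fitting, key=lambda name: runtimes[name]["size_mb"])


# Hypothetical budgets: a roomy desktop app and a constrained device.
print(within_budget(6))
print(within_budget(1))
```

Keep in mind that the hello world time mostly reflects start-up cost, not throughput; as noted above, plain Lua and Guile are considerably slower than Mono and v8 on real workloads.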

Wednesday, 24 October 2012

The future of .NET lies in Mono. The future of F# lies in MonoDevelop.

It's been a year since I last wrote about F# and Mono. What's happened since then?

F# 3.0 has recently been released, bundled in with the new all-grey, ALL-CAPS Visual Studio 2012. The biggest new feature is type providers, bringing some of the benefits of dynamic languages into the type-safe world. Innovations like type providers deserve more industry attention. I really hope these ideas will spread and that languages like Scala will pick them up pretty soon so more developers (including me) can enjoy the benefits.

OK, that's cool, but how is good old F# doing? Well, about the same. It lumbers on in obscurity under the massive shadow of Microsoft and whatever crazy idea the company is currently peddling (Win8 metro UI, touch-screen laptops, WinRT, $100 surface covers, pretending Silverlight never happened, etc...) F# is still awesome and deserves a lot more attention and adoption.

Take a look at ThoughtWorks' latest Tech Radar. F#'s distant relatives (as in fairly new "functional-first" languages) Scala and Clojure are steaming ahead and have both reached "Adopt" status. F# is stuck in "assess" never-land. I don't see many signs of that changing anytime soon.

F# has limited credibility because of Microsoft. Even though F# is actually open source, it has a very small open source community. The development is completely driven from Microsoft, and there is very little "open source buzz" about it, which is typical for any Microsoft product. F# moves with the same slow cadence as Visual Studio, which in software terms means eons between releases. Big and open F# frameworks are sorely lacking. Microsoft's completely r******d (there, I said it) messaging regarding F# and .NET is also to blame.

On messaging: firstly, there is the total confusion about .NET. Where is that going? Windows 8 is all about HTML5. Anders is doing TypeScript now, Silverlight is dead. There are a lot of frustrated .NET developers out there. I've predicted that Mono is the future home and legacy for .NET, and it looks more likely every day. As a die-hard MSDN developer you might frown upon this fact, but really it's not a bad thing. Open source and Mono have a lot to offer, for instance OS independence. This is absolutely critical to continue to drive adoption.

Secondly, F# has always been the odd one out in the .NET space (compared to headline technologies such as C#, VB, ASP). If Microsoft's messaging on the future of .NET is confusing, their messaging on what F# is and what it's supposed to be used for is crazy: "Use C# for everything, and if you're an academic doing some data analysis, check out F#". Screw that, F# is superior to C# in every single way, for any application. Microsoft should promote the hell out of it and stop pussyfooting about. However, I have very little faith that this will ever improve, and F# is (and always has been) dancing close to the edge of oblivion.

There is a big F#-shaped hole in the language space currently, on the JVM and elsewhere. Like I stated a year ago, if F# ran on the JVM the story would be completely different: it would have massive adoption. It beats Scala on every single point, and is a perfect candidate for "the statically typed language of choice" in your language toolbox. Today people are seriously looking into Haskell when they get fed up with their gnarly Python/Ruby projects. That's completely nuts if you ask me. I don't believe Haskell is the answer for any real-world problems, but let's keep that for a future blog post. F# should be what comes to mind first!

So what about Mono and open source then? Don Syme spoke at the recent MonkeySpace conference, and generated a lot of buzz. .NET has never been sexy technology in the hands of Microsoft, but the Xamarin guys are turning Mono into just that. MonoTouch, MonoGame, Unity are some really good products. Mr Don Syme, this is where you and your language belong, this is how you take F# to the next level. Forget about all the in-fighting and bickering at Microsoft and focus on what's good for F#. That is to embrace Mono fully, it's your number one platform.

The culture shift for developers who've been living inside the Microsoft/MSDN bubble and are moving to Mono is drastic. Mono is an all-out open source community with all its upsides and downsides. Say goodbye to stable, supported releases every 3-4 years and hello to nightly builds and pull requests. That certainly won't fit all current .NET developers; like lemmings they'll just move along to whatever Microsoft is feeding them next. Could that be the native renaissance perhaps? :) Real .NET enthusiasts should free themselves from the Microsoft shackles and embrace Mono; they'd be much better for it. Join the community, contribute improvements to the garbage collector. Go to awesome conferences in Boston, have fun!

To summarise, how do we save this gem of a language? F# must break out of the Microsoft prison. Ideally I'd like to see a couple of key members of the team actually leave Microsoft, get some funding and set up an independent F# foundation (or maybe join Xamarin). This foundation could pursue important issues like sponsoring key libraries such as web frameworks, making sure the F# MonoDevelop integration is top notch etc. So while Microsoft is committing corporate suicide with Windows 8, the .NET and F# community needs to move on.

The future of .NET lies in Mono. The future of F# lies in MonoDevelop.

Monday, 27 August 2012

Some more Datalog

I've written about Datalog and Datomic a bit recently. To conclude, here's another post comparing execution speed with the contrib.datalog library by Jeffrey Straszheim. Clojure 1.4-ready source can be found here.

The example I'm using in my benchmark is a simple join between two relations. In datomic/datalog it would look like this:

In contrib.datalog the same query requires a bit more ceremony. You can write it like this:

In my previous posts I described a number of different ways to use core.logic and unify + clojure.set/join to replicate the same query. How do the execution times compare? I use the same benchmark as in the previous post (the same query, with 5000 joins between the 2 'tables').
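To make the benchmarked operation concrete for readers who don't know Clojure, here is a minimal Python sketch of a join between two relations, in the spirit of clojure.set/join. It is not the code from the benchmark: the relation names, attribute names and data are hypothetical, and only the 5000-row size is taken from the text.

```python
from collections import defaultdict


def natural_join(left, right, key):
    """Hash join of two relations (lists of dicts) on a shared attribute."""
    # Index the right-hand relation by the join key.
    index = defaultdict(list)
    for row in right:
        index[row[key]].append(row)
    # Probe the index once per left-hand row and merge the matching rows.
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


# Hypothetical data: 5000 rows per relation, joined on "id".
people = [{"id": i, "name": f"person-{i}"} for i in range(5000)]
ages = [{"id": i, "age": 20 + i % 50} for i in range(5000)]

joined = natural_join(people, ages, "id")
print(len(joined), joined[0])
```

Building the index first keeps the work roughly linear in the size of the two relations, instead of comparing every row with every other row.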

Datomic/datalog is fastest by far, needing ~11ms to complete. Second fastest is the unify + clojure.set/join solution described here, about an order of magnitude slower at ~150ms. The core.logic defrel/fact and contrib.datalog solutions are about equal in speed at ~320ms, i.e. 2x slower than unify+join and ~30x slower than datomic/datalog.
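The ratios quoted above follow directly from the approximate timings. A small Python sketch of the arithmetic, using the rounded figures from the text (the `slowdown` helper is just for illustration):

```python
# Approximate timings from the text, in milliseconds.
timings_ms = {
    "datomic/datalog": 11,
    "unify + clojure.set/join": 150,
    "core.logic / contrib.datalog": 320,
}


def slowdown(slower, faster):
    """How many times slower one approach is than another."""
    return timings_ms[slower] / timings_ms[faster]


print(round(slowdown("unify + clojure.set/join", "datomic/datalog"), 1))
print(round(slowdown("core.logic / contrib.datalog", "unify + clojure.set/join"), 1))
print(round(slowdown("core.logic / contrib.datalog", "datomic/datalog"), 1))
```

Since the inputs are themselves rounded, the resulting ratios should only be read as rough orders of magnitude.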


My recent Datalog posts focused on querying in-memory data structures (like log rings etc). This is not exactly what Datomic was designed to do, but its query engine still performs really well. An added bonus is that it's low on ceremony. The recently announced Datomic free edition eradicates some of my initial fear of using it in my projects. The main drawback is that Datomic is closed source, even the free edition. Another annoying detail is that even if you're just after the datalog/query machinery, adding datomic-free pulls in all of Datomic's 27 jar dependencies, weighing in at 19MB. That's quite a heavy tax on your uberjar.

There are certainly alternatives, like core.logic and contrib.datalog. But the order of magnitude worse execution time can be hard to live with if your datasets are big. By using datomic/datalog queries you also have the flexibility to move to disk-based databases if your datasets explode. More on this in upcoming blog posts.

For completeness, here's the join-test function I used for the contrib.datalog benchmark;