grove is out

After working away at it for a month or two of scraped-together free time, whilst learning Erlang, grove is done and there’s a screencast to demo it.

(By done I mean done for now; I’ve got some other projects to attend to for a bit.)

http://nickelcode.com/grove

Scale Rails Screencasts

http://railslab.newrelic.com/scaling-rails

I’ve been watching these, and they are great from every perspective. Even if your Rails app is small and doesn’t really need to scale, the client-side optimization tips in the first video are awesome stuff to put into practice.

testing = mochijson2 funnyness feature

[Update] I filed a ticket on the mochiweb google code site, and they pointed out that this IS expected behaviour. Mochijson2 expects binary strings, which makes good sense since when you decode with mochijson2, strings are converted to just that! Also, string operations in Erlang are faster on binary strings (or so I’ve been told). See the gist for an example.

So I’m working on my test coverage for grove, which has been otherwise abysmal and late to the game. The bright side is that I caught some funnyness with my testing that I was not dealing with before, and I thought I would share my good fortune with everyone.

Notice I said funnyness and not wrongness: this may be the intended behavior, but I was not expecting it.

I realized this once I started building my unit tests, because until then I’d been dealing with atoms and numbers ONLY in the value position of those key/value tuples. Obviously I thought it a bit odd when the result I got included an array of integers where I expected to see a string.
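To see where the array of integers comes from, here is a minimal sketch in plain Erlang. It does not call mochijson2 at all, and the module name `string_demo` is made up for illustration. A double-quoted string is just a list of character codes, while a binary string is its own type, so an encoder that expects binaries has no way to tell a plain string apart from a list of numbers.

```erlang
-module(string_demo).
-export([run/0]).

%% Illustration only: shows why "abc" can come out as [97,98,99].
run() ->
    Str = "abc",
    Bin = <<"abc">>,
    true = (Str =:= [97, 98, 99]),  % a string IS a list of integers
    true = is_list(Str),
    true = is_binary(Bin),          % a binary string is a distinct type
    Bin = list_to_binary(Str),      % convert before handing it to an encoder
    {Str, Bin}.
```

Converting with `list_to_binary/1` before encoding is the kind of fix the update above points to.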

Solution: use mochijson. Not sure why I was using 2 in the first place.

Blah, blah worthless post, _but_ I am constantly reminded why testing makes the world a better place. To take it a bit farther I’m going to borrow a story my high school econ teacher, Mr. Werbylo, told me to explain why we shouldn’t sit on the desks in class:

Look, [ these desks are old <=> programming is hard ].

If you [ fall off the desk <=> don't build unit tests ] you’re going to be hurt and everyone is going to laugh at you for your clumsiness. Most people think the story ends there, but it doesn’t. When you get home, you’ll still be angry and you might kick your dog. Your dog, being angry and sad from the physical pain you’ve caused him, will bite the mailman. And finally, ultimately, the mailman will go to work the next day with an Uzi and kill everyone.

That is how [ falling off a desk <=> failing to get test coverage ] causes loss of life.

One of my favorite teachers.

Erlang: Pattern Matching Declarations vs Case Statements/Other

[Update] I did some refactoring that I’ve been meaning to do thanks to ayrnieu in #erlang.

So as I’m hacking my first Erlang project I’ve come across a few places where I was unsure what would generally be a more readable/understandable/robust solution to a given problem. The one I’m thinking about right now is using pattern matching in my function definitions or, alternatively, case statements. I’m hoping someone from the community might be able to shed some light.

Part of the project I’m working on turns a tuple like:

into a set comprehension like:

As you can see my goal is to be able to query Mnesia, or rather to produce a standard way to query any data store. The following function definition and support functions do the work:

The tuple is “filtered” down the function definitions, and when each tuple (columns, operations, etc) is handled it is translated into a string and thus not matched again, until ultimately the final definition concatenates all the pieces together to build and return our query. I REALLY like the way this works in that it’s short and simple. Also, the tuples could in theory be replaced by an appropriate string built outside the function and it would still work, which makes it more flexible.

It has other issues, such as its fragile ordering requirement. If it’s not ordered properly it may break, or worse, just not form the comprehension properly. Which leads me to wonder if there’s a better way to implement this, even if it’s not _quite_ as concise.

The following is a quick hack and not tested but another way to handle it might look like:

Honestly, I can’t say which would be better, as I haven’t tried replacing the version I’m using with a case statement. Is there even a third and better way to handle this situation? Or is this just a case of agonizing over the fork or spoon for eating your pie: it doesn’t really matter, both will work equally well.
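For readers who want something concrete to compare, here is a small sketch of the two styles side by side. It is not the code from grove: the module name `match_demo`, the `{Operator, Column, Value}` tuple shape, and the helper `fmt/2` are all assumptions made up for this illustration. Both functions translate one condition tuple into a string and pass an already-built string through untouched.

```erlang
-module(match_demo).
-export([op_clauses/1, op_case/1]).

%% Hypothetical condition tuple: {Operator, Column, Value}.

%% Style 1: one function clause per pattern.
op_clauses({eq, Col, Val}) -> fmt("~s == ~p", [Col, Val]);
op_clauses({gt, Col, Val}) -> fmt("~s > ~p", [Col, Val]);
op_clauses({lt, Col, Val}) -> fmt("~s < ~p", [Col, Val]);
op_clauses(Str) when is_list(Str) -> Str.  % already a string: pass through

%% Style 2: a single clause with a case expression.
op_case(Cond) ->
    case Cond of
        {eq, Col, Val} -> fmt("~s == ~p", [Col, Val]);
        {gt, Col, Val} -> fmt("~s > ~p", [Col, Val]);
        {lt, Col, Val} -> fmt("~s < ~p", [Col, Val]);
        Str when is_list(Str) -> Str       % already a string: pass through
    end.

%% Helper: format to a flat string instead of an iolist.
fmt(Format, Args) -> lists:flatten(io_lib:format(Format, Args)).
```

With these definitions, `match_demo:op_clauses({gt, age, 30})` and `match_demo:op_case({gt, age, 30})` both return `"age > 30"`. At this size the two read about the same; the difference shows up mostly in how errors are reported (`function_clause` versus `case_clause`) and in how easily each clause can be tested on its own.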

Also, I am proposing a new term for pattern matching in function declarations:

patmatchlarations!

Happy Holidays.

I’m not sure if anyone reads this regularly, but with that uptick in people stopping by I thought I would wish everyone a great holiday season just in case they do!

I switched the theme for the site because I was running out of tab space at the top and I will be putting up at least one new project for people to look at relatively soon. My first in Erlang!

Also, if you haven’t already checked it out, erlanguid.com is up and running. Not too many members right now, but I’m hoping that will change as time passes and I get some good tutorials/screencasts up there about working with Erlang and setting up a development environment (yaws appmods, debugging with emacs, io:format vs io_lib:format, etc).

Again, have a great holiday and a wonderful new year!

An Idea: Scripted C# for ASP.NET Development

I work as a .NET consultant in New York, most often with ASP.NET and SharePoint, so the frustration expressed here comes directly from experience. Outside of work I like to dabble with lots of other neat tools, and one thing I’ve really gotten used to in web development is NOT having to compile when I make simple changes to my app. PHP, Python, and Ruby all make this very easy. And if your project is pretty small, using ASP.NET doesn’t add too much overhead. But that doesn’t last for long.

The problem comes when the project gets large, or you group a lot of classes/code into a single DLL and THEN have to compile them to see your changes. Even leaving aside the need to get IIS to clear its cache before anyone can see a change from the code, there’s a wait time of 30 seconds to a minute depending on the size of the DLL.

I hear you say, “30 seconds isn’t that big of a deal”, but do a little math: 30 seconds, maybe a hundred times a day (conservative), is 50 minutes a day, which adds up to more than four hours over a five-day work week. All that time spent just waiting, doing nothing. What’s worse is that you lose your focus on what you’re testing. Your continuity of thought is crushed under the weight of compiling your one-line alteration.

So, lots of whining without many suggestions. Well, here it is: http://tirania.org/blog/archive/2008/Sep-29.html. That’s the blog of the Mono project’s leader, Miguel de Icaza, and he talks about the csharp interactive shell that came out of the Mono project recently.

What I would like to see is a development mode for IIS that deals with C# in a script-like fashion, using something like the shell described at the link above, so that we can stop compiling things while we’re working on them and still get the nice quick execution we’ve come to expect from compiled/cached .NET assemblies in production. Faster iterations mean more efficiency and fewer brain breaks when developing web applications.

Right, the page might load slower when you fire up your server, but I’m tired of recompiling some huge thick dll every time I make an extremely small change.

Erlang and Cloud Computing: A Fine Pair Indeed

[update]: Some clarification (in its favor!) on Mnesia’s limitations thanks to commenter Gleb Peregud

“The Cloud”. Infrastructure as a resource. Whether or not you’ve bought into the hype (or HiPE as you prefer :D ) some of the largest software/IT companies in the world are throwing piles of cash at this idea of a hardware-service. But, for the purposes of new web applications that are looking to take advantage of hardware scaling to meet demand, there’s a lot of work to be done. For a quick roundup on some of the issues facing web applications with the EC2 platform you can check out Tony Arcieri’s Post on Rails with EC2. Just as Tony points out in his article, Erlang has a lot of tools ready-made for these demands, and, with the added side benefit of proven stability/scalability in intense environments, it’s certainly worthy of some consideration for your next cloud-ready app.

A Language (and VM) Built to Scale

Erlang has been around the block a few times ( short history ), and it’s been used in some situations with incredible requirements and results. From its home at Ericsson to other telecom applications with the likes of Nortel, T-Mobile, and Motorola, where it has achieved 5 nines of availability in some instances, Erlang has been proven as a reliable platform on which to build applications.

Platform is the key word there, because it’s not just the language and its syntax that make it great for its appointed task but also the VM that it runs on. Erlang’s VM comes with some really great features that also make concurrent programming a lot easier.

  1. “Green Threads” – There’s some disagreement over whether the name is correctly applied in Erlang’s case, but the benefit is clear: cheap process creation. In comparison, system threads are “heavy”, and a programmer working with them on an intensive concurrent application is forced to worry about going thread crazy because of performance degradation. While Erlang doesn’t completely fix this issue, it does the system thread management for you, allowing more attention to be paid to building the application as it makes sense for the problem set. If you need a thousand processes, then use a thousand.
  2. Message Passing Primitives – Erlang has been designed to pass messages between processes. It has built-in syntax constructs specifically for this purpose, and while there are many languages that have something similar, it’s a great indicator of what the language was meant for. Here’s a very simple example:

    As long as you know the Erlang process id of a given bit of running code you can dial it up and tell it what to do.

  3. Hot Code Swapping – This is a big one. This is where the five to nine nines of uptime come from because, as long as you build your applications with a little forethought, you don’t have to bring them down to deliver bug fixes or features. Before jumping at the “with a little forethought”, take a read here and see how easy it really is. And it is easy, but even if it were hard, when was the last time you worked on an application that didn’t need to be restarted when new code was added?
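As a rough sketch of the first two items in the list above (cheap processes and message passing), here is a tiny ping/pong exchange. The module name `ping_demo` and the message shapes are made up for illustration; only `spawn/3`, `!` and `receive` are the language’s own.

```erlang
-module(ping_demo).
-export([run/0, loop/0]).

%% A tiny process that answers ping messages until told to stop.
loop() ->
    receive
        {ping, From} ->
            From ! {pong, self()},
            loop();
        stop ->
            ok
    end.

%% Spawn the process, send it a ping, and wait up to one second for the reply.
run() ->
    Pid = spawn(?MODULE, loop, []),   % cheap process creation
    Pid ! {ping, self()},             % send a message to that pid
    Reply = receive
                {pong, Pid} -> pong
            after 1000 ->
                timeout
            end,
    Pid ! stop,
    Reply.
```

Calling `ping_demo:run()` returns `pong` once the spawned process has replied. All the caller needs is the pid.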

Libraries Ready Made for Distributing Load

Erlang comes with a whole host of libraries and modules built in support of its primary goal as a massively concurrent programming language. These libraries are important enough that Erlang is often referred to in conjunction with them as “Erlang/OTP”.

  1. Mnesia – “is a distributed DataBase Management System (DBMS), appropriate for telecommunications applications and other Erlang applications which require continuous operation and exhibit soft real-time properties.” (ref) I’m sure many of you are saying to yourselves, “I can distribute load with MySQL so what’s the advantage?”. The advantage comes from being able to add a table copy to a new node as easily as:

    That’s pretty amazing when you consider what it means for easy data distribution.

    Mnesia is not without its drawbacks though. Most importantly, it wasn’t designed to store enormous tables the way SQL databases are. In fact it has a 2 GB table size limit for tables configured as disc_only_copies, while ram_copies and disc_copies tables are limited only by RAM size. Also, with more than 8-10 nodes the amount of network communication may begin to inflict performance degradation. (Big thanks to comment poster Gleb Peregud for both of those tidbits.)

  2. gen_server – gen_server is a module that provides an interface that allows your code to take advantage of much of the OTP goodness like process supervision and logging without lifting a finger. Supervision itself is enough to get excited about, as it allows safe and reliable process topologies to be created with ease. Processes monitoring and restarting other processes sounds like a recipe for reliability when implemented correctly.
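To give the Mnesia item above something concrete, here is a minimal sketch. The module name `mnesia_demo`, the table name `demo` and its attributes are made up for illustration. `setup/0` runs on a single node with a RAM-only schema; `replicate/1` assumes the other node is already running Mnesia and has been joined to this one (for example with `mnesia:change_config(extra_db_nodes, [Node])`), otherwise the call simply returns `{aborted, Reason}`.

```erlang
-module(mnesia_demo).
-export([setup/0, replicate/1]).

%% Start Mnesia on the local node and create a small in-memory table.
%% Meant to be called once; a second call fails because the table exists.
setup() ->
    ok = mnesia:start(),
    {atomic, ok} = mnesia:create_table(demo,
        [{ram_copies, [node()]},
         {attributes, [key, value]}]),
    ok.

%% Add an in-memory replica of the table on another node.
%% Assumes Node is already part of the same Mnesia cluster.
replicate(Node) ->
    mnesia:add_table_copy(demo, Node, ram_copies).
```

The single call in `replicate/1` is the part that matters: once the nodes know about each other, spreading a table to a new one is a one-liner.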

That’s only two notable examples from an enormous set of libraries all targeted at reliable distributed infrastructure, most of which have been proven in the harsh environments mentioned earlier.
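As a rough illustration of the gen_server item above, here is about the smallest server I can think of: a counter. The module name `counter_demo` and its API are made up for this sketch; the callbacks (`init/1`, `handle_call/3`, `handle_cast/2`) are the standard gen_server ones.

```erlang
-module(counter_demo).
-behaviour(gen_server).

%% Public API
-export([start_link/0, increment/1, value/1]).
%% gen_server callbacks
-export([init/1, handle_call/3, handle_cast/2]).

start_link() -> gen_server:start_link(?MODULE, 0, []).

%% Asynchronous: returns immediately.
increment(Pid) -> gen_server:cast(Pid, increment).

%% Synchronous: waits for the server's reply.
value(Pid) -> gen_server:call(Pid, value).

%% The server state is just the current count.
init(Start) -> {ok, Start}.

handle_call(value, _From, Count) -> {reply, Count, Count}.

handle_cast(increment, Count) -> {noreply, Count + 1}.
```

Because the module exposes `start_link/0`, it could be listed as a child of a supervisor, which is where the restart-on-crash behaviour mentioned above would come from.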

Alternatives to Erlang

This is a big enough market that there are bound to be a lot of solutions out there for the problem of easy node addition to applications, and it only takes one look at the comments in one of Yariv’s posts ( Erlang vs Scala ) to see there are a lot of opinions.

At this point I have to confess I can’t really comment with full knowledge on how something like Scala stacks up to Erlang in this specific scenario. What I do know is that Scala runs on the JVM, which is great from a stability perspective but may not be as stellar for massive concurrency because of the way the JVM handles threading with real system threads. Much the same, the CLR from Microsoft uses real system threads, but has the ability to use something akin to green threads (reference?). Clearly using either of these has some great advantages, like access to Java libraries (though many argue this might make concurrency more difficult), so it’s not as cut and dried as Erlang > All.

Clouds on the Brain

However you choose to tackle it, the cloud is growing in popularity and importance when considering web applications. Looking to the future, even the most basic hosting plans may consist of on-demand hardware at our disposal, and then it will really be up to you how you want to utilize the resources you have to better serve your apps.

Let me know what you think in the comments!