Erlang: Pattern Matching Declarations vs Case Statements/Other
[Update] I did some refactoring that I’ve been meaning to do thanks to ayrnieu in #erlang.
So as I’m hacking my first Erlang project I’ve come across a few places where I was unsure what would generally be a more readable/understandable/robust solution to a given problem. The one I’m thinking about right now is using pattern matching in my function definitions or, alternatively, case statements. I’m hoping someone from the community might be able to shed some light.
Part of the project I’m working on turns a tuple like:
into a set comprehension like:
As you can see my goal is to be able to query Mnesia, or rather to produce a standard way to query any data store. The following function definition and support functions do the work:
The tuple is “filtered” down the function definitions: when each element (columns, operations, etc.) is handled it is translated into a string, so it is not matched again, until ultimately the final definition concatenates all the pieces together to build and return our query. I REALLY like the way this works in that it’s short and simple. Also, the tuples could in theory be replaced by an appropriate string built outside the function and it would still work, which makes it more flexible.
It has other issues, such as its fragile ordering requirement. If the tuple isn’t ordered properly it may break, or worse, just not form the comprehension properly. Which leads me to wonder if there’s a better way to implement this, even if it’s not _quite_ as concise.
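To make the head-matching idea concrete, here is a minimal sketch. It is not the code from the project: the module name `query_heads`, the `{Columns, Table, Where}` tuple shape, and the `columns`/`table`/`where` tags are all assumptions made up for illustration. Each clause converts one tagged element to a string and recurses; once every element is a string, only the last clause matches and it concatenates the pieces.

```erlang
-module(query_heads).
-export([to_query/1]).

%% Hypothetical input shape (an assumption, not the project's real format):
%%   {{columns, [Atom]}, {table, Atom}, {where, Field, Op, Value}}

%% Column list still tagged: turn it into "Name, Age" and recurse.
to_query({{columns, Cs}, Table, Where}) ->
    Cols = string:join([atom_to_list(C) || C <- Cs], ", "),
    to_query({Cols, Table, Where});
%% Table still tagged: turn it into a string and recurse.
to_query({Cols, {table, T}, Where}) ->
    to_query({Cols, atom_to_list(T), Where});
%% Condition still tagged: format it as "Field Op Value" and recurse.
to_query({Cols, Table, {where, Field, Op, Value}}) ->
    Cond = lists:flatten(io_lib:format("~s ~s ~p", [Field, Op, Value])),
    to_query({Cols, Table, Cond});
%% Nothing tagged is left, so every element is a string: concatenate.
to_query({Cols, Table, Where}) ->
    lists:flatten(["[{", Cols, "} || {", Table, ", ", Cols,
                   "} <- mnesia:table(", Table, "), ", Where, "]"]).
```

Because a string is a list and never matches a tagged tuple pattern, an element that was built outside the function simply skips its clause. The ordering fragility is also visible here: the last clause accepts anything in any position, so a misplaced element is not rejected by the function heads.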
The following is a quick hack and not tested but another way to handle it might look like:
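For comparison, a case-based variant could be sketched roughly as below. Again this is only an illustration under assumed names (the module `query_case` and the same hypothetical `{Columns, Table, Where}` tuple shape), not the code from the project.

```erlang
-module(query_case).
-export([to_query/1]).

%% Same hypothetical input shape as assumed above:
%%   {{columns, [Atom]}, {table, Atom}, {where, Field, Op, Value}}
%% Each element may also be passed as a ready-made string.
to_query({ColSpec, TableSpec, WhereSpec}) ->
    Cols = case ColSpec of
        {columns, Cs} ->
            string:join([atom_to_list(C) || C <- Cs], ", ");
        ColStr when is_list(ColStr) ->
            ColStr
    end,
    Table = case TableSpec of
        {table, T} ->
            atom_to_list(T);
        TableStr when is_list(TableStr) ->
            TableStr
    end,
    Where = case WhereSpec of
        {where, Field, Op, Value} ->
            lists:flatten(io_lib:format("~s ~s ~p", [Field, Op, Value]));
        WhereStr when is_list(WhereStr) ->
            WhereStr
    end,
    %% All three pieces are strings now: build the comprehension text.
    lists:flatten(["[{", Cols, "} || {", Table, ", ", Cols,
                   "} <- mnesia:table(", Table, "), ", Where, "]"]).
```

This version is still positional, but each position only accepts its own tag or a string, so a misplaced element fails with a `case_clause` error at the point where it is examined.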
Honestly, I can’t say which would be better, as I haven’t tried replacing the version I’m using with a case statement. Is there even a third and better way to handle this situation? Or is this just a case of agonizing over the fork or the spoon for eating your pie: it doesn’t really matter, since both will work equally well.
Also, I am proposing a new term for pattern matching in function declarations:
patmatchlarations!
Comments(10)
I’m voting for the case statement just because it seems more elegant. I’m not sure about speed though. You might want to profile both methods and see what results you get.
@jmkogut
It might be less syntax for the case statement, but I think Erlang’s case/if statements are sort of clunky. That’s the only reason I can come up with for why I like using the head matching.
I’ve always found cases to be pretty elegant. It’s the if statements that irk me.
But as you said, the case is less syntax. If you can write the same function with less to type and without losing clarity, go for it.
Also: http://gist.github.com/49033
That additional formatting is all I can think of to make it easier to read. A quick glance at that function is easier to understand than tracing through head matching functions, imho.
@jmkogut
The more I look at it the more I’m starting to agree with you. Personal preference to a point I guess.
I should probably finish the software before bothering with stuff like this though, eh?
Kind of! Refactoring is fun.
I don’t know what the overhead of pattern matching on a function versus in a case statement is, but I’m pretty sure the “Erlang Way” is the one that avoids a lot of if/else-type statements in favor of pattern-matched tail recursion, as you’ve written it. I also think it reads much more naturally:
operation(Thing) ->
    % something funky which binds Result
    Result = Thing, % placeholder so the example compiles
    Result.

something([], Accum) ->
    Accum;
something([Next | Rest], Accum) ->
    something(Rest, [operation(Next) | Accum]).
To me, that reads: `something` will return Accum if the first parameter is an empty list, `something` will operate and recurse if there’s an element to be used. I can read the purpose of the function by reading its definition, instead of delving into its guts.
Kudos.
@Mason
Agreed, I think if/case statements are sort of an afterthought for Erlang, which is fine with me. I think R. Virding wrote something like this on his blog.
-module(speed).
-export([run/1, loop_case/1, loop_fun/1, is_even/1, fun_it/2]).

run(N) ->
    error_logger:info_report(timer:tc(?MODULE, loop_fun, [N])),
    error_logger:info_report(timer:tc(?MODULE, loop_case, [N])).

is_even(I) ->
    I rem 2 == 0.

loop_case(0) ->
    finish;
loop_case(N) ->
    case_it(N, is_even(N)).

case_it(N, Boolean) ->
    case Boolean of
        true ->
            loop_case(N - 1);
        false ->
            loop_case(N - 1)
    end.

loop_fun(0) ->
    finish;
loop_fun(N) ->
    fun_it(N, is_even(N)).

fun_it(N, true) ->
    loop_fun(N - 1);
fun_it(N, false) ->
    loop_fun(N - 1).
Results
36> speed:run(10000000).
=INFO REPORT==== 14-May-2009::15:51:36 ===
{1484000,finish}
ok
=INFO REPORT==== 14-May-2009::15:51:38 ===
{1390999,finish}
37>
Seems like no difference :)
@Ravindranath
I imagine it compiles down to something similar either way, but it’s good to know you don’t have to take a performance hit.