<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>Mostly Software</title>
	<atom:link href="http://schani.wordpress.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://schani.wordpress.com</link>
	<description></description>
	<lastBuildDate>Fri, 13 Jan 2012 17:16:21 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='schani.wordpress.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://s2.wp.com/i/buttonw-com.png</url>
		<title>Mostly Software</title>
		<link>http://schani.wordpress.com</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://schani.wordpress.com/osd.xml" title="Mostly Software" />
	<atom:link rel='hub' href='http://schani.wordpress.com/?pushpress=hub'/>
		<item>
		<title>The Schism</title>
		<link>http://schani.wordpress.com/2012/01/13/the-schism/</link>
		<comments>http://schani.wordpress.com/2012/01/13/the-schism/#comments</comments>
		<pubDate>Fri, 13 Jan 2012 17:16:20 +0000</pubDate>
		<dc:creator>schani</dc:creator>
				<category><![CDATA[Photography]]></category>

		<guid isPermaLink="false">http://schani.wordpress.com/?p=161</guid>
		<description><![CDATA[<p>I've been a very bad blogger recently.  Part of the reason is certainly laziness, but there's also the fact that I felt more motivated to write about photography recently than about software whereas this blog has become focused almost exclusively on software, so it didn't feel like it would fit in anymore.
</p>
<p>
For this reason I have decided to start a dedicated photography blog, creatively named <a href="http://blog.markprobst.net/">Mark's Photography Blog</a>.  I've started it off with a post on <a href="http://blog.markprobst.net/2012/01/12/10-not-totally-crappy-photos-from-2011/">10 of my favorite images from 2011</a>.
</p>
<p>
What that means for this blog is that, for the rare occasions on which I will post, I shall focus it almost exclusively on software, which is also reflected in its new awesome title.  I'll also try to overcome my aversion to writing smaller, less epic posts, which will hopefully also result in a higher posting frequency.
</p>
<p>
On the more epic side of things, I still owe you one or two more posts on SGen, and I'd like to write a post on the <a href="https://github.com/schani/michael-alloc">lock-free allocator</a> we're now using in SGen, but which is independent of it and is pretty cool. What can it be used for?  Dynamic memory allocation in signal handlers, for example.
</p>]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve been a very bad blogger recently.  Part of the reason is certainly laziness, but there&#8217;s also the fact that I felt more motivated to write about photography recently than about software whereas this blog has become focused almost exclusively on software, so it didn&#8217;t feel like it would fit in anymore.
</p>
<p>
For this reason I have decided to start a dedicated photography blog, creatively named <a href="http://blog.markprobst.net/">Mark&#8217;s Photography Blog</a>.  I&#8217;ve started it off with a post on <a href="http://blog.markprobst.net/2012/01/12/10-not-totally-crappy-photos-from-2011/">10 of my favorite images from 2011</a>.
</p>
<p>
What that means for this blog is that, for the rare occasions on which I will post, I shall focus it almost exclusively on software, which is also reflected in its new awesome title.  I&#8217;ll also try to overcome my aversion to writing smaller, less epic posts, which will hopefully also result in a higher posting frequency.
</p>
<p>
On the more epic side of things, I still owe you one or two more posts on SGen, and I&#8217;d like to write a post on the <a href="https://github.com/schani/michael-alloc">lock-free allocator</a> we&#8217;re now using in SGen, but which is independent of it and is pretty cool. What can it be used for?  Dynamic memory allocation in signal handlers, for example.</p>
]]></content:encoded>
			<wfw:commentRss>http://schani.wordpress.com/2012/01/13/the-schism/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/daf7ba6f06480c52ac459772f2bb5268?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">schani</media:title>
		</media:content>
	</item>
		<item>
		<title>The ICFP Programming Contest 2011</title>
		<link>http://schani.wordpress.com/2011/06/22/the-icfp-programming-contest-2011/</link>
		<comments>http://schani.wordpress.com/2011/06/22/the-icfp-programming-contest-2011/#comments</comments>
		<pubDate>Wed, 22 Jun 2011 22:21:03 +0000</pubDate>
		<dc:creator>schani</dc:creator>
				<category><![CDATA[Software]]></category>
		<category><![CDATA[clojure]]></category>
		<category><![CDATA[icfp]]></category>
		<category><![CDATA[ocaml]]></category>
		<category><![CDATA[programming contest]]></category>

		<guid isPermaLink="false">http://schani.wordpress.com/?p=158</guid>
		<description><![CDATA[<div id="outline-container-1" class="outline-3">
<h3 id="sec-1">Introduction </h3>
<div class="outline-text-3" id="text-1">

<p>Last weekend saw yet another iteration of the <a href="http://www.icfpcontest.org/">ICFP Programming Contest</a>, an event the glorious team <a href="http://schani.wordpress.com/2010/06/27/the-2010-icfp-programming-contest/">Funktion im Kopf der Mensch</a> could not miss.
</p></div>

</div>

<div id="outline-container-2" class="outline-3">
<h3 id="sec-2">The task </h3>
<div class="outline-text-3" id="text-2">

<p>The <a href="http://www.icfpcontest.org/2011/06/task-description-contest-starts-now.html">task</a> this year was to build a program that would play a card game, Lambda The Gathering, to be run on the contest organizers' machines. The game is played by two players taking turns.  On each turn, a player must play one of several kinds of cards, for each of which there is an infinite supply.  Each player owns 256 slots, each of which has some "vitality", initially 10000.  If a slot's vitality drops to zero it is dead and cannot be used anymore (unless it is explicitly revived).  The object of the game is to kill all the opponent's slots.
</p>
<p>
Apart from its vitality, a slot also holds a value, which is where the real action happens.  A value can be either a number or a function. The cards in the game also represent functions, and the number 0. When a card is played, it must be played to a specific slot, meaning that the card's value and the slot's value are combined.  The combination can be either that the slot's value is applied to the card (right) or the other way around (left).  Initially, each slot holds the identity function, <code>I</code>.
</p>
<p>
As an example, let's say we want to generate the number 4 in a slot. Initially it will hold <code>I</code>, so we play <code>zero</code> right, giving <code>I(zero)</code>, which is reduced to 0.  Now we play the function <code>succ</code>, which takes a number and returns that number plus one, left, giving <code>succ(0)</code>, which reduces to 1.  In each of the next two turns we play the card <code>dbl</code> left.  <code>dbl</code> is a function that doubles a number, so we end up with 4.
</p>
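<p>
The four card plays above can be traced in a few lines of Python.  This is only a sketch: the helper names <code>play_left</code> and <code>play_right</code> are made up for illustration, and the model covers a single slot and just the three cards used in the example.
</p>

```python
# Hypothetical mini-model of one slot; only the cards from the example exist.
def I(x):
    return x

def succ(n):
    return n + 1

def dbl(n):
    return n * 2

zero = 0

def play_right(slot_value, card):
    # "right": the slot's value is applied to the card
    return slot_value(card)

def play_left(slot_value, card):
    # "left": the card is applied to the slot's value
    return card(slot_value)

slot = I                        # every slot starts out as I
slot = play_right(slot, zero)   # I(zero) reduces to 0
slot = play_left(slot, succ)    # succ(0) reduces to 1
slot = play_left(slot, dbl)     # dbl(1) reduces to 2
slot = play_left(slot, dbl)     # dbl(2) reduces to 4
print(slot)                     # prints 4
```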
<p>
To actually do something, like attacking your opponent's slots, there are cards whose functions have side effects.  One of them is <code>attack</code>, which you supply with three numbers (it's a <a href="http://en.wikipedia.org/wiki/Currying">curried function</a>).  The first is the number of one of your slots, the second the number of one of your opponent's slots and the third is the "strength" of the attack. Once the function has all of its arguments it will subtract the third number from your slot's vitality and then subtract nine tenths of that number from your opponent's slot.  There is another function, <code>help</code>, which you can use to top up your vitality, so if you first <code>help</code> and then <code>attack</code> with the right strengths you will have damaged your opponent without having lost any vitality yourself.
</p>
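<p>
The arithmetic of <code>attack</code> can be sketched as follows.  The integer rounding, the clamping at zero and the plain list representation are assumptions made for this illustration, and the sketch ignores the slot number inversion mentioned further down as well as any validity checks the real game performs.
</p>

```python
def attack(own, opp, i, j, n):
    """Return updated copies of both vitality lists after attack(i, j, n)."""
    own = list(own)
    opp = list(opp)
    own[i] = own[i] - n                      # the attacker pays the full strength
    opp[j] = max(0, opp[j] - n * 9 // 10)    # the defender loses nine tenths of it
    return own, opp

own = [10000] * 256
opp = [10000] * 256

# Spend the entire initial vitality of our slot 0 on the opponent's slot 0.
own, opp = attack(own, opp, 0, 0, 10000)
print(own[0], opp[0])
```

<p>
Under these assumptions, even an attack that uses up all of a fresh slot's vitality leaves the target alive, which is why a single <code>attack</code> cannot kill an untouched slot.
</p>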
<p>
Also among the cards in the game are <code>S</code>, <code>K</code> and <code>I</code>, representing the homonymous functions of the <a href="http://en.wikipedia.org/wiki/SKI_combinator_calculus">SKI combinator calculus</a>.  These three simple functions, together with function application, are <a href="http://en.wikipedia.org/wiki/Turing_completeness">Turing complete</a>, which means that it is possible to build programs in slots.
</p>
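<p>
For readers who have not met the three combinators before, here they are as curried Python closures.  This is a plain illustration of the standard definitions, with <code>succ</code> and <code>dbl</code> added as ordinary functions to have something to apply.
</p>

```python
# The three combinators as curried closures.
S = lambda f: lambda g: lambda x: f(x)(g(x))   # S f g x  =  (f x) (g x)
K = lambda x: lambda y: x                      # K x y    =  x
I = lambda x: x                                # I x      =  x

succ = lambda n: n + 1
dbl = lambda n: n * 2

print(S(K)(K)(42))          # S K K behaves like I
print(S(K(succ))(dbl)(3))   # S (K f) g x reduces to f (g x)
```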
<p>
<img src="http://www.complang.tuwien.ac.at/schani/blog/icfp2011/L1006883.jpg" alt="http://www.complang.tuwien.ac.at/schani/blog/icfp2011/L1006883.jpg" />
</p></div>

</div>

<div id="outline-container-3" class="outline-3">
<h3 id="sec-3">Strategy </h3>
<div class="outline-text-3" id="text-3">

<p>Actual programs can potentially damage the opponent much quicker than would be possible by building each attack by playing the respective cards anew.  (Note that I'm talking about the programs (functions) in the game's slots now, not about the program that plays the card game).
</p>
<p>
There is a limit to how long a slot's function is allowed to "execute", however, which is 1000 function applications.  If it has not finished by then, execution is stopped and the slot's value is reset to <code>I</code>.  Any side effects stay, however, so an endless attack loop would do damage, though only a limited amount.
</p>
<p>
There is a good reason for not using endless loops, however, namely that, since the slot's value is reset, the function is "lost", so it has to be copied to another slot first (which is possible with the <code>get</code> function, but somewhat costly, since the function's slot number has to be built first) or, even worse, generated again.
</p>
<p>
A better strategy is to use a function that does some more or less fixed amount of damage and then returns itself, or some slightly different variant of itself.  The actual result of the function will, after all, be the new value of the slot.  So, suppose we could build a function like this:
</p>



<pre class="example">function make_killer (slot) {
  return function (dummy) {
    kill_opponent_slot (slot);
    return make_killer (slot + 1);
  };
}
</pre>



<p>
If we generate this function and give it the number 0, it will result in a function that, given a dummy argument, will kill the opponent's slot 0 and return a function that when called with a dummy argument will kill the opponent's slot 1 and return a function that when called with a dummy argument etc.  In other words, once we have this function we can kill one enemy slot per turn.  In comparison, just launching a single <code>attack</code> without building more complex functions requires more than 50 turns, and that's not sufficient to kill an enemy's slot.  If you factor in the <code>help</code> calls to top up your own vitality so as not to die quicker than the enemy you're looking at more than 200 rounds just to kill a single slot of the enemy (assuming it has the initial vitality of 10000).
</p></div>

</div>

<div id="outline-container-4" class="outline-3">
<h3 id="sec-4">Feasibility </h3>
<div class="outline-text-3" id="text-4">

<p>Unfortunately, it's not completely trivial to write a function like this in SKI calculus.  In fact, compared to SKI, x86 machine code looks incredibly high level.  Fortunately, there is help.  As luck would have it, I wrote a <a href="http://www.complang.tuwien.ac.at/schani/oldstuff/index.html#schemeinunlambda">Scheme subset to Unlambda compiler</a> many years ago.  <a href="http://en.wikipedia.org/wiki/Scheme_(programming_language)">Scheme</a> is essentially Lambda calculus and <a href="http://www.madore.org/~david/programs/unlambda/">Unlambda</a> is essentially SKI.  In other words I was already a bit familiar with both SKI and the compilation algorithm (which is quite simple).
</p>
<p>
Here is the function from above in very simple Scheme:
</p>



<pre class="example">((lambda (x) (x x))
 (lambda (self)
   (lambda (slot)
     (lambda (dummy)
       (kill-opponent-slot slot)
       ((self self) (succ slot))))))
</pre>



<p>
Since there are no function names in SKI or pure Lambda calculus we have to achieve self-reference using a trick, namely the function <code>(lambda (x) (x x))</code>, which applies its argument to itself, so if you pass it your function then that function gets itself as its (first) argument, in this case named <code>self</code>.  Note that I am assuming that we have sequential execution available, to first kill the opponent's slot and then return the function for the next slot.  This is not natively available in Lambda calculus, which doesn't even deal with side effects, but can be implemented like so:
</p>



<pre class="example">((lambda (dummy)
   do-this-afterwards)
 do-this-first)
</pre>



<p>
How easily can we generate an attack function like the one above in a slot?  The first step is to translate Lambda to SKI calculus.  The simple algorithm (given in the task description) produces ridiculously huge programs.  The problem is that, since SKI calculus has no named function arguments, the arguments must be passed downwards "manually" from where they originated.  The simple algorithm passes them down all the way to all the leaves of the syntax tree, where they might or might not be needed.  By cutting off the passing down at subtrees which don't need the argument a significant reduction is achieved, but program size is still an issue, and depends a lot on how deeply nested the original function is.
</p>
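<p>
To make the difference concrete, here is a small sketch of the translation with and without the cut-off.  It is not the contest code; the term representation (strings for atoms, pairs for applications, <code>("lam", var, body)</code> for lambdas) and all function names are assumptions made for this example.
</p>

```python
def occurs(var, term):
    # Does var appear anywhere in this lambda-free term?
    if isinstance(term, tuple):
        return occurs(var, term[0]) or occurs(var, term[1])
    return term == var

def abstract(var, term, prune):
    """Remove var from a lambda-free term, yielding an SKI term."""
    if prune and not occurs(var, term):
        return ("K", term)          # cut off: this subtree never needs var
    if term == var:
        return "I"
    if isinstance(term, tuple):
        return (("S", abstract(var, term[0], prune)),
                abstract(var, term[1], prune))
    return ("K", term)              # a leaf that is not var

def compile_lambda(expr, prune):
    if isinstance(expr, tuple) and len(expr) == 3 and expr[0] == "lam":
        return abstract(expr[1], compile_lambda(expr[2], prune), prune)
    if isinstance(expr, tuple):
        return (compile_lambda(expr[0], prune), compile_lambda(expr[1], prune))
    return expr

def size(term):
    # Number of leaves, i.e. cards, in the term.
    if isinstance(term, tuple):
        return size(term[0]) + size(term[1])
    return 1

simple = ("lam", "d", ("inc", "zero"))
nested = ("lam", "x", ("lam", "y", (("f", "x"), "y")))

print(compile_lambda(simple, prune=False))
print(compile_lambda(simple, prune=True))
print(size(compile_lambda(nested, prune=False)),
      size(compile_lambda(nested, prune=True)))
```

<p>
The gap between the two sizes widens with every additional level of nesting, since the simple variant rewrites every leaf once per enclosing lambda.
</p>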
<p>
Now that we have the function in SKI form, can we trivially play cards to summon it to a slot?  It turns out that this is only possible in simple cases because a player's turn can only apply a card left or right, so nesting on both sides of an application is not trivially possible.  For example, it's easy to generate <code>S(S(SK))</code>, starting from <code>I</code>, by applying <code>K</code> right and then <code>S</code> left three times. <code>S((SS)K)</code> also works: <code>S</code> right, <code>S</code> left, <code>K</code> right and <code>S</code> left. However, <code>(SS)(KK)</code> is not doable that easily.
</p>
<p>
There is a trick, though: Assume we already have some arbitrarily complex <code>x</code>, and we want <code>x(MN)</code>.  If we apply <code>K</code> left, <code>S</code> left and then <code>M</code> and <code>N</code> right we end up with <code>((S(Kx))M)N</code> which, thanks to the way the <code>S</code> and <code>K</code> combinators work, reduces to <code>x(MN)</code>.
</p>
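<p>
Both the easy cases and the trick can be checked with a tiny symbolic reducer.  This is a sketch under simplifying assumptions: terms are strings and pairs, the reducer normalizes fully and ignores the game's evaluation order, application limit and side effects, and <code>x</code>, <code>M</code> and <code>N</code> are opaque placeholder atoms.
</p>

```python
def rebuild(head, args):
    # Re-apply head to its arguments, left to right.
    term = head
    for arg in args:
        term = (term, arg)
    return term

def reduce_term(term):
    """Normalize a term built from atoms (str) and applications (f, a)."""
    head, args = term, []
    while isinstance(head, tuple):      # unwind the application spine
        head, arg = head
        args.append(arg)
    args.reverse()
    if head == "I" and len(args) >= 1:
        return reduce_term(rebuild(args[0], args[1:]))
    if head == "K" and len(args) >= 2:
        return reduce_term(rebuild(args[0], args[2:]))
    if head == "S" and len(args) >= 3:
        f, g, x = args[:3]
        return reduce_term(rebuild(((f, x), (g, x)), args[3:]))
    return rebuild(head, [reduce_term(a) for a in args])

def play_left(slot, card):
    return reduce_term((card, slot))

def play_right(slot, card):
    return reduce_term((slot, card))

# The easy case: K right, then S left three times.
easy = play_right("I", "K")
for _ in range(3):
    easy = play_left(easy, "S")
print(easy)

# The trick: K left, S left, M right, N right turns x into x(MN).
slot = "x"
slot = play_left(slot, "K")
slot = play_left(slot, "S")
slot = play_right(slot, "M")
slot = play_right(slot, "N")
print(slot)
```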
<p>
This trick still only works if <code>M</code> and <code>N</code> are single cards.  It can be used recursively however, and, using yet another similar trick, arbitrarily nested right sides can be produced, but at a great increase in the number of cards that have to be played.  Before we figured out those tricks we came across a simpler general solution, using the card <code>get</code>, which, when given a slot number, reduces to the value that's in that slot.  If we use <code>get</code> and <code>zero</code> as the cards <code>M</code> and <code>N</code> in the trick above, the function will reduce to <code>xy</code>, with <code>y</code> being whatever is in slot 0.  The generation algorithm built using this trick has to use more than one slot, and it depends vitally on slot 0, but it is reasonably simple and the overhead is small.
</p>
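<p>
The <code>get</code> variant of the trick looks like this with Python closures.  The dictionary <code>slots</code>, the function chosen for <code>x</code> and the value placed in slot 0 are all made up for the example.
</p>

```python
S = lambda f: lambda g: lambda x: f(x)(g(x))
K = lambda x: lambda y: x

slots = {0: 41}                 # pretend slot 0 already holds the value y
get = lambda i: slots[i]        # get reduces to whatever is in slot i
zero = 0

x = lambda n: n + 1             # some function we have already built

slot = x
slot = K(slot)                  # K left     ->  K x
slot = S(slot)                  # S left     ->  S (K x)
slot = slot(get)                # get right  ->  S (K x) get
slot = slot(zero)               # zero right ->  x (get zero), i.e. x applied to y
print(slot)                     # prints 42
```

<p>
Whatever can be assembled in slot 0 can thus be attached as the right-hand side of an application, at a cost of four cards.
</p>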
<p>
It looks as if we have all the pieces of the puzzle together now and can finally generate a function that does some serious damage quickly, but there is another problem.  Let's say we want to generate a function which, when given a dummy argument does the side effect of <code>inc zero</code> (the details of what <code>inc</code> does are not important here). In Scheme, that is <code>(lambda (d) (inc zero))</code>.  Translated to SKI calculus it is <code>K(inc zero)</code>.  To generate it in a slot (assuming it holds <code>I</code>) we apply <code>zero</code> right, then <code>inc</code> and <code>K</code> left.  After having applied the <code>inc</code> the slot's function is <code>inc zero</code>, though, which is immediately reduced, thereby doing the side effect and giving as a result the function <code>I</code>, so after applying <code>K</code> left we end up with <code>K(I)</code> and a side effect we only wanted after giving the dummy argument.
</p>
<p>
Clearly, we need a way to "guard" side effects against immediate execution.  The easiest way we found to do this is to build an equivalent function, but in a different way.  From a completely functional point of view, i.e. ignoring possible side effects, the functions
</p>



<pre class="example">K(inc zero)
</pre>



<p>
and
</p>



<pre class="example">(S(K inc))(K zero)
</pre>



<p>
are the same - they both take an argument that is ignored and reduce to <code>inc zero</code>, but in the latter form the <code>inc zero</code> can only be reduced once the dummy argument is given.  Let's call such a function a "side effect function" - it requires a dummy argument and only once that is present will it produce a side effect.
</p>
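<p>
Python evaluates arguments eagerly, much like the game reduces a slot as soon as it can, so the difference between the two forms can be observed directly.  In this sketch <code>inc</code> is a stand-in that only records that it ran and then returns <code>I</code>; the list <code>log</code> and the snapshot variables are invented for the example.
</p>

```python
S = lambda f: lambda g: lambda x: f(x)(g(x))
K = lambda x: lambda y: x
I = lambda x: x

log = []

def inc(n):
    log.append(("inc", n))    # the side effect
    return I                  # the result of a side-effecting card

unguarded = K(inc(0))         # inc runs right here, while we are still building
after_unguarded = list(log)

log.clear()
guarded = S(K(inc))(K(0))     # nothing runs yet
after_build = list(log)

result = guarded("dummy")     # only now does inc run
after_call = list(log)

print(after_unguarded, after_build, after_call)
```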
<p>
What if we want to combine two such side effect functions, resulting in a new side effect function that does one first, then the other?  It seems tempting to just apply the second to the first, like this:
</p>



<pre class="example">(lambda (d)
  (second-se-fn (first-se-fn d)))
</pre>



<p>
The error in this approach is that even though <code>(first-se-fn d)</code> can only be reduced once the argument <code>d</code> is present, it is still, in terms of the SKI calculus, a function, and therefore a value, so <code>second-se-fn</code> can be reduced, even without <code>d</code>.  What we need instead is a function that takes an argument and then applies first one function to that argument and then a second one.  As luck would have it, that function is <code>S</code>.  So, to combine two side effect functions, yielding a new side effect function, we have
</p>



<pre class="example">((S first-se-fn) second-se-fn)
</pre>
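<p>
The same combination in Python, as a sketch: <code>side_effect_fn</code> is a hypothetical helper that builds a function which logs its name once the dummy argument arrives and then returns <code>I</code>.  Python evaluates <code>f(x)</code> before <code>g(x)</code> in <code>f(x)(g(x))</code>, which matches the "first one, then the other" order we want.
</p>

```python
S = lambda f: lambda g: lambda x: f(x)(g(x))
I = lambda x: x

log = []

def side_effect_fn(name):
    def run(dummy):
        log.append(name)      # the side effect, triggered by the dummy argument
        return I
    return run

first_se_fn = side_effect_fn("first")
second_se_fn = side_effect_fn("second")

combined = S(first_se_fn)(second_se_fn)
before_call = list(log)       # building the combination triggers nothing

result = combined("dummy")    # runs first, then second, then reduces I(I) to I
print(before_call, log)
```

<p>
Since the combination again waits for a dummy argument and returns <code>I</code>, it is itself a side effect function and can be combined further in the same way.
</p>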



<p>
<img src="http://www.complang.tuwien.ac.at/schani/blog/icfp2011/L1006887.jpg" alt="http://www.complang.tuwien.ac.at/schani/blog/icfp2011/L1006887.jpg" />
</p></div>

</div>

<div id="outline-container-5" class="outline-3">
<h3 id="sec-5">More strategy </h3>
<div class="outline-text-3" id="text-5">

<p>With all of that in place we can come back to our self-returning slot-increasing attack function.  As it turns out, passing around that extra slot number argument is quite an overhead when everything is translated down to SKI calculus, but we had already thought of a simpler solution.  Instead of having the slot number inside the function we can use indirection: We store the number of the slot to be attacked in some other slot and use the <code>get</code> function to retrieve it from within the attack function.  The disadvantage of this approach is that we need to increment the slot number after each attack, costing us one extra turn, but on the other hand we can switch to another slot to attack reasonably quickly.
</p>
<p>
After some optimizations, some of them directly on the generated SKI code, we managed to get the number of turns required to load this function down to below 180.  After that we just have to load <code>zero</code> into a slot and we can kill one enemy slot every other turn.
</p>
<p>
One small detail I omitted above is that when attacking a slot the number you give will actually be inverted, i.e. giving 0 will attack slot 255, and vice versa.  That makes it easy to attack high slot numbers but more difficult to attack lower ones, because higher numbers take more time to generate.  Nonetheless, our bot doesn't actually start its attack from the highest slot, as would be easiest. It first figures out which opponent slot holds the most complex function (using the trivial metric of counting the leaves of the syntax trees) and attacks that first, then increments until it hits slot 0, then starts again from the highest slot, hopefully finishing off the opponent.
</p>
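<p>
A sketch of that target selection, based on my reading of the description above rather than on the bot's OCaml source: slot values are modelled as strings and pairs, ties go to the lowest slot number, and all function names are invented here.
</p>

```python
def leaves(value):
    # The "trivial metric": count the leaves of the syntax tree.
    if isinstance(value, tuple):
        return leaves(value[0]) + leaves(value[1])
    return 1

def attack_argument(slot):
    # The number given to attack is inverted: 0 hits slot 255 and vice versa.
    return 255 - slot

def attack_order(opponent_values):
    """Most complex slot first, on down to slot 0, then from the top slot."""
    count = len(opponent_values)
    start = max(range(count), key=lambda i: leaves(opponent_values[i]))
    # Incrementing the attack argument walks the real slot numbers downwards.
    return list(range(start, -1, -1)) + list(range(count - 1, start, -1))

opponent = ["I"] * 256
opponent[3] = (("S", ("K", "attack")), ("K", "zero"))   # the busiest slot

order = attack_order(opponent)
print(order[:6])
print([attack_argument(s) for s in order[:6]])
```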
<p>
We did not have much time left to refine our bot beyond that.  There are very few measures we take to defend ourselves.  If one of our few vital slots is killed while we are generating our attack function, for example, we just revive the slot and then start over with generating the function.  A smarter bot would try to salvage the pieces left in the other slots.  If the slot we take the vitality from to launch attacks is killed, we're dead in the water - we had code to <code>help</code> it get back into working condition, but we failed at the last minute to integrate it.
</p>
<p>
Overall, I think we're doing quite well.  We launch our first attack within about 200 rounds and can finish off a bot that doesn't defend itself within a bit more than 500 turns after that.  Against <a href="https://github.com/tanakh/ICFP2011">the best teams</a>, however, we're helpless - they can kill all 256 of our slots stone dead in less than 190 rounds, starting from scratch.  Kudos!
</p>
<p>
<img src="http://www.complang.tuwien.ac.at/schani/blog/icfp2011/ltgvis.jpg" alt="http://www.complang.tuwien.ac.at/schani/blog/icfp2011/ltgvis.jpg" />
</p></div>

</div>

<div id="outline-container-6" class="outline-3">
<h3 id="sec-6">Contest organization </h3>
<div class="outline-text-3" id="text-6">

<p>In <a href="http://schani.wordpress.com/2009/07/11/the-icfp-programming-contest-2009/">the past</a> I have taken contest organizers to task for screwing up in minor and major ways.  This year, I am happy to say, there was almost nothing to find fault with.  The task was clear, not artificially obscured, many-layered, interesting and enjoyable.  There was a test submission server but teams didn't depend on it and it worked quite well in any case.
</p>
<p>
So, from Team Funktion im Kopf der Mensch a huge Thank You to the organizers for this amazing job!  May all future contests be organized this well.
</p></div>

</div>

<div id="outline-container-7" class="outline-3">
<h3 id="sec-7">Programming languages </h3>
<div class="outline-text-3" id="text-7">

<p>Last year was the first year we used <a href="http://clojure.org/">Clojure</a> after using <a href="http://caml.inria.fr/">OCaml</a> almost exclusively in the years before.  This time we stepped back a bit and used OCaml for the actual bot, i.e. the program that will run on the organizers' servers.  The main reasons for this were that OCaml has a much smaller footprint and more predictable performance than Clojure and we wanted to make sure we didn't hit against any limits, and that we felt somewhat safer writing an unsupervised program when protected by OCaml's type system.
</p>
<p>
We still used Clojure for developing the attack functions, starting from Lambda calculus, going down to SKI and then generating a sequence of card applications.
</p>
<p>
Our visualizer was written in C.
</p>
<p>
All of these languages did their respective jobs well.  No complaints this time.
</p></div>

</div>

<div id="outline-container-8" class="outline-3">
<h3 id="sec-8">The Code </h3>
<div class="outline-text-3" id="text-8">

<p>All the code we wrote for this year's contest is available on <a href="http://github.com/schani/icfp-2011">GitHub</a>. If you need any assistance with it, please email me.
</p></div>
</div>
]]></description>
			<content:encoded><![CDATA[<div id="outline-container-1" class="outline-3">
<h3 id="sec-1">Introduction </h3>
<div class="outline-text-3" id="text-1">
<p>Last weekend saw yet another iteration of the <a href="http://www.icfpcontest.org/">ICFP Programming Contest</a>, an event the glorious team <a href="http://schani.wordpress.com/2010/06/27/the-2010-icfp-programming-contest/">Funktion im Kopf der Mensch</a> could not miss.
</p>
</div>
</div>
<div id="outline-container-2" class="outline-3">
<h3 id="sec-2">The task </h3>
<div class="outline-text-3" id="text-2">
<p>The <a href="http://www.icfpcontest.org/2011/06/task-description-contest-starts-now.html">task</a> this year was to build a program that would play a card game, Lambda The Gathering, to be run on the contest organizers&#8217; machines. The game is played by two players taking turns.  On each turn, a player must play one of several kinds of cards, for each of which there is an infinite supply.  Each player owns 256 slots, each of which has some &#8220;vitality&#8221;, initially 10000.  If a slot&#8217;s vitality drops to zero it is dead and cannot be used anymore (unless it is explicitly revived).  The object of the game is to kill all the opponent&#8217;s slots.
</p>
<p>
Apart from its vitality, a slot also holds a value, which is where the real action happens.  A value can be either a number or a function. The cards in the game also represent functions, and the number 0. When a card is played, it must be played to a specific slot, meaning that the card&#8217;s value and the slot&#8217;s value are combined.  The combination can be either that the slot&#8217;s value is applied to the card (right) or the other way around (left).  Initially, each slot holds the identity function, <code>I</code>.
</p>
<p>
As an example, let&#8217;s say we want to generate the number 4 in a slot. Initially it will hold <code>I</code>, so we play <code>zero</code> right, giving <code>I(zero)</code>, which is reduced to 0.  Now we play the function <code>succ</code>, which takes a number and returns that number plus one, left, giving <code>succ(0)</code>, which reduces to 1.  In each of the next two turns we play the card <code>dbl</code> left.  <code>dbl</code> is a function that doubles a number, so we end up with 4.
</p>
<p>
To actually do something, like attacking your opponent&#8217;s slots, there are cards whose functions have side effects.  One of them is <code>attack</code>, which you supply with three numbers (it&#8217;s a <a href="http://en.wikipedia.org/wiki/Currying">curried function</a>).  The first is the number of one of your slots, the second the number of one of your opponent&#8217;s slots and the third is the &#8220;strength&#8221; of the attack. Once the function has all of its arguments it will subtract the third number from your slot&#8217;s vitality and then subtract nine tenths of that number from your opponent&#8217;s slot.  There is another function, <code>help</code>, which you can use to top up your vitality, so if you first <code>help</code> and then <code>attack</code> with the right strengths you will have damaged your opponent without having lost any vitality yourself.
</p>
<p>
Also among the cards in the game are <code>S</code>, <code>K</code> and <code>I</code>, representing the homonymous functions of the <a href="http://en.wikipedia.org/wiki/SKI_combinator_calculus">SKI combinator calculus</a>.  These three simple functions, together with function application, are <a href="http://en.wikipedia.org/wiki/Turing_completeness">Turing complete</a>, which means that it is possible to build programs in slots.
</p>
<p>
<img src="http://www.complang.tuwien.ac.at/schani/blog/icfp2011/L1006883.jpg" alt="http://www.complang.tuwien.ac.at/schani/blog/icfp2011/L1006883.jpg" />
</p>
</div>
</div>
<div id="outline-container-3" class="outline-3">
<h3 id="sec-3">Strategy </h3>
<div class="outline-text-3" id="text-3">
<p>Actual programs can potentially damage the opponent much quicker than would be possible by building each attack by playing the respective cards anew.  (Note that I&#8217;m talking about the programs (functions) in the game&#8217;s slots now, not about the program that plays the card game).
</p>
<p>
There is a limit to how long a slot&#8217;s function is allowed to &#8220;execute&#8221;, however, which is 1000 function applications.  If it has not finished by then, execution is stopped and the slot&#8217;s value is reset to <code>I</code>.  Any side effects stay, however, so an endless attack loop would do damage, though only a limited amount.
</p>
<p>
There is a good reason for not using endless loops, however, namely that, since the slot&#8217;s value is reset, the function is &#8220;lost&#8221;, so it has to be copied to another slot first (which is possible with the <code>get</code> function, but somewhat costly, since the function&#8217;s slot number has to be built first) or, even worse, generated again.
</p>
<p>
A better strategy is to use a function that does some more or less fixed amount of damage and then returns itself, or some slightly different variant of itself.  The actual result of the function will, after all, be the new value of the slot.  So, suppose we could build a function like this:
</p>
<pre class="example">function make_killer (slot) {
  return function (dummy) {
    kill_opponent_slot (slot);
    return make_killer (slot + 1);
  };
}
</pre>
<p>
If we generate this function and give it the number 0, it will result in a function that, given a dummy argument, will kill the opponent&#8217;s slot 0 and return a function that, when called with a dummy argument, will kill the opponent&#8217;s slot 1 and return a function that, when called with a dummy argument, kills slot 2, and so on.  In other words, once we have this function we can kill one enemy slot per turn.  In comparison, just launching a single <code>attack</code> without building more complex functions requires more than 50 turns, and that&#8217;s not sufficient to kill an enemy&#8217;s slot.  If you factor in the <code>help</code> calls to top up your own vitality so as not to die more quickly than the enemy, you&#8217;re looking at more than 200 rounds just to kill a single slot of the enemy (assuming it has the initial vitality of 10000).
</p>
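<p>
The one-kill-per-turn behaviour can be tried out in a few lines of Python.  This is only a sketch: <code>kill_opponent_slot</code> is a hypothetical stand-in that records the slot number instead of attacking anything.
</p>

```python
killed = []  # opponent slots killed so far, in order


def kill_opponent_slot(slot):
    """Hypothetical stand-in for the real attack; it only records the slot."""
    killed.append(slot)


def make_killer(slot):
    def killer(dummy):
        kill_opponent_slot(slot)
        return make_killer(slot + 1)   # becomes the new value of our slot
    return killer


our_slot = make_killer(0)
for turn in range(3):
    our_slot = our_slot(None)          # one application per turn

print(killed)
```

<p>
Each turn replaces the slot value with the killer for the next slot, so nothing has to be rebuilt between attacks.
</p>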
</div>
</div>
<div id="outline-container-4" class="outline-3">
<h3 id="sec-4">Feasibility </h3>
<div class="outline-text-3" id="text-4">
<p>Unfortunately, it&#8217;s not completely trivial to write a function like this in SKI calculus.  In fact, compared to SKI, x86 machine code looks incredibly high level.  Fortunately, there is help.  As luck would have it, I wrote a <a href="http://www.complang.tuwien.ac.at/schani/oldstuff/index.html#schemeinunlambda">Scheme subset to Unlambda compiler</a> many years ago.  <a href="http://en.wikipedia.org/wiki/Scheme_(programming_language)">Scheme</a> is essentially Lambda calculus and <a href="http://www.madore.org/~david/programs/unlambda/">Unlambda</a> is essentially SKI.  In other words, I was already a bit familiar with both SKI and the compilation algorithm (which is quite simple).
</p>
<p>
Here is the function from above in very simple Scheme:
</p>
<pre class="example">((lambda (x) (x x))
 (lambda (self)
   (lambda (slot)
     (lambda (dummy)
       (kill-opponent-slot slot)
       ((self self) (succ slot))))))
</pre>
<p>
Since there are no function names in SKI or pure Lambda calculus we have to achieve self-reference using a trick, namely the function <code>(lambda (x) (x x))</code>, which applies its argument to itself, so if you pass it your function then that function gets itself as its (first) argument, in this case named <code>self</code>.  Note that I am assuming that we have sequential execution available, to first kill the opponent&#8217;s slot and then return the function for the next slot.  This is not natively available in Lambda calculus, which doesn&#8217;t even deal with side effects, but can be implemented like so:
</p>
<pre class="example">((lambda (dummy)
   do-this-afterwards)
 do-this-first)
</pre>
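<p>
Both tricks, self-application and sequencing through a dummy argument, carry over to any eager language with anonymous functions.  The following Python sketch mirrors the Scheme above; <code>kill_opponent_slot</code> is again a hypothetical stand-in that only logs its argument.
</p>

```python
log = []  # opponent slots killed so far, in order


def kill_opponent_slot(slot):
    """Hypothetical stand-in for the real attack; it only records the slot."""
    log.append(slot)


# (lambda (x) (x x)) hands the function to itself as `self`.
make_killer = (lambda x: x(x))(
    lambda self: lambda slot: lambda dummy:
        # Sequencing: the argument (the kill) is evaluated before the body.
        (lambda ignored: self(self)(slot + 1))(kill_opponent_slot(slot))
)

killer = make_killer(0)
killer = killer(None)   # kills slot 0, returns the killer for slot 1
killer = killer(None)   # kills slot 1, returns the killer for slot 2
print(log)
```

<p>
No function refers to itself by name here; the recursion comes entirely from <code>self(self)</code>.
</p>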
<p>
How easily can we generate an attack function like the one above in a slot?  The first step is to translate Lambda to SKI calculus.  The simple algorithm (given in the task description) produces ridiculously huge programs.  The problem is that since SKI calculus doesn&#8217;t know named function arguments the arguments must be passed downwards &#8220;manually&#8221; from where they originated.  The simple algorithm passes them down all the way to all the leaves of the syntax tree, where they might or might not be needed.  By cutting off the passing down at subtrees which don&#8217;t need the argument a significant reduction is achieved, but program size is still an issue, and depends a lot on how deeply nested the original function is.
</p>
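<p>
A minimal sketch of the translation, with and without the cut-off, might look as follows in Python.  The term representation (strings for cards and variables, pairs for applications) and the function names are assumptions of this sketch, not our contest code.
</p>

```python
def occurs(var, term):
    """True if the variable appears anywhere in the term."""
    if isinstance(term, tuple):
        return occurs(var, term[0]) or occurs(var, term[1])
    return term == var


def abstract(var, term, prune=True):
    """Translate (lambda (var) term) into a term built from S, K and I."""
    if term == var:
        return "I"
    if prune and not occurs(var, term):
        return ("K", term)            # cut off: this subtree ignores var
    if isinstance(term, tuple):
        f, a = term
        return (("S", abstract(var, f, prune)), abstract(var, a, prune))
    return ("K", term)                # a constant leaf


def count_leaves(term):
    if isinstance(term, tuple):
        return count_leaves(term[0]) + count_leaves(term[1])
    return 1


# (lambda (x) (lambda (y) (g (h x)))), compiled both ways.
body = ("g", ("h", "x"))
simple = abstract("x", abstract("y", body, prune=False), prune=False)
pruned = abstract("x", abstract("y", body, prune=True), prune=True)
print(count_leaves(simple), count_leaves(pruned))
```

<p>
Even for this tiny doubly nested function the simple algorithm produces roughly twice as many leaves, and the gap widens with every additional level of nesting.
</p>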
<p>
Now that we have the function in SKI form, can we trivially play cards to summon it to a slot?  It turns out that this is only possible in simple cases because a player&#8217;s turn can only apply a card left or right, so nesting on both sides of an application is not trivially possible.  For example, it&#8217;s easy to generate <code>S(S(SK))</code>, starting from <code>I</code>, by applying <code>K</code> right and then <code>S</code> left three times. <code>S((SS)K)</code> also works: <code>S</code> right, <code>S</code> left, <code>K</code> right and <code>S</code> left. However, <code>(SS)(KK)</code> is not doable that easily.
</p>
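<p>
The restriction can be made concrete with a small Python model of card plays.  It is deliberately simplified: the only reduction it knows is that <code>I</code> applied to a card yields that card, and the names <code>play</code>, <code>build</code> and <code>directly_playable</code> are invented for this sketch.
</p>

```python
def play(slot_value, card, side):
    """Return the slot value after applying a card on the given side."""
    if side == "left":
        return (card, slot_value)     # card applied to the slot
    if slot_value == "I":
        return card                   # I applied to a card reduces to the card
    return (slot_value, card)         # slot applied to the card


def build(plays):
    value = "I"
    for card, side in plays:
        value = play(value, card, side)
    return value


def directly_playable(term):
    """True if the term can be built card by card without any reduction trick."""
    if not isinstance(term, tuple):
        return True
    f, a = term
    if not isinstance(f, tuple) and directly_playable(a):
        return True                   # last play was a card on the left
    return not isinstance(a, tuple) and directly_playable(f)


first = build([("K", "right"), ("S", "left"), ("S", "left"), ("S", "left")])
second = build([("S", "right"), ("S", "left"), ("K", "right"), ("S", "left")])
print(first)
print(second)
print(directly_playable((("S", "S"), ("K", "K"))))
```

<p>
A term whose outermost application has a compound term on both sides, such as <code>(SS)(KK)</code>, fails the check because the last play can only ever add a single card.
</p>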
<p>
There is a trick, though: Assume we already have some arbitrarily complex <code>x</code>, and we want <code>x(MN)</code>.  If we apply <code>K</code> left, <code>S</code> left and then <code>M</code> and <code>N</code> right we end up with <code>((S(Kx))M)N</code> which, thanks to the way the <code>S</code> and <code>K</code> combinators work, reduces to <code>x(MN)</code>.
</p>
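<p>
The reduction can be checked mechanically.  Below is a minimal eager reducer for <code>S</code>, <code>K</code> and <code>I</code> in Python, treating <code>x</code>, <code>M</code> and <code>N</code> as opaque symbols; the representation is an assumption of this sketch and side effects are ignored.
</p>

```python
def apply(f, a):
    """Apply f to a, reducing S, K and I once they have all their arguments."""
    if f == "I":
        return a
    if isinstance(f, tuple):
        if f[0] == "K":                                   # (K x) a  reduces to  x
            return f[1]
        if isinstance(f[0], tuple) and f[0][0] == "S":    # ((S g) h) a
            return apply(apply(f[0][1], a), apply(f[1], a))
    return (f, a)     # an opaque symbol, or not enough arguments yet


value = "x"                  # stands for some arbitrarily complex function
value = apply("K", value)    # K left
value = apply("S", value)    # S left
value = apply(value, "M")    # M right
value = apply(value, "N")    # N right: S now has three arguments and fires
print(value)
```

<p>
The final value is the pair representing <code>x(MN)</code>, built using only left and right applications of single cards.
</p>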
<p>
This trick still only works if <code>M</code> and <code>N</code> are single cards.  It can be used recursively however, and, using yet another similar trick, arbitrarily nested right sides can be produced, but at a great increase in the number of cards that have to be played.  Before we figured out those tricks we came across a simpler general solution, using the card <code>get</code>, which, when given a slot number, reduces to the value that&#8217;s in that slot.  If we use <code>get</code> and <code>zero</code> as the cards <code>M</code> and <code>N</code> in the trick above, the function will reduce to <code>xy</code>, with <code>y</code> being whatever is in slot 0.  The generation algorithm built using this trick has to use more than one slot, and it depends vitally on slot 0, but it is reasonably simple and the overhead is small.
</p>
<p>
It looks as if we have all the pieces of the puzzle together now and can finally generate a function that does some serious damage quickly, but there is another problem.  Let&#8217;s say we want to generate a function which, when given a dummy argument does the side effect of <code>inc zero</code> (the details of what <code>inc</code> does are not important here). In Scheme, that is <code>(lambda (d) (inc zero))</code>.  Translated to SKI calculus it is <code>K(inc zero)</code>.  To generate it in a slot (assuming it holds <code>I</code>) we apply <code>zero</code> right, then <code>inc</code> and <code>K</code> left.  After having applied the <code>inc</code> the slot&#8217;s function is <code>inc zero</code>, though, which is immediately reduced, thereby doing the side effect and giving as a result the function <code>I</code>, so after applying <code>K</code> left we end up with <code>K(I)</code> and a side effect we only wanted after giving the dummy argument.
</p>
<p>
Clearly, we need a way to &#8220;guard&#8221; side effects against immediate execution.  The easiest way we found to do this is to build an equivalent function, but in a different way.  From a completely functional point of view, i.e. ignoring possible side effects, the functions
</p>
<pre class="example">K(inc zero)
</pre>
<p>
and
</p>
<pre class="example">(S(K inc))(K zero)
</pre>
<p>
are the same &#8211; they both take an argument that is ignored and reduce to <code>inc zero</code>, but in the latter form the <code>inc zero</code> can only be reduced once the dummy argument is given.  Let&#8217;s call such a function a &#8220;side effect function&#8221; &#8211; it requires a dummy argument and only once that is present will it produce a side effect.
</p>
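<p>
The difference between the two forms shows up as soon as a side effect is added to a toy evaluator.  In this Python sketch <code>inc</code> merely appends to a list and returns <code>I</code>; that behaviour and the term representation are assumptions made for illustration.
</p>

```python
effects = []  # side effects performed so far


def apply(f, a):
    """Eagerly apply f to a, the way a slot is reduced after every card play."""
    if f == "I":
        return a
    if f == "inc":                       # stand-in side effect, returns I
        effects.append(("inc", a))
        return "I"
    if isinstance(f, tuple):
        if f[0] == "K":                  # (K x) a  reduces to  x
            return f[1]
        if isinstance(f[0], tuple) and f[0][0] == "S":   # ((S g) h) a
            first = apply(f[0][1], a)
            second = apply(f[1], a)
            return apply(first, second)
    return (f, a)                        # still waiting for more arguments


# Naive build of K(inc zero): zero right, inc left, K left.
naive = apply("I", "zero")
naive = apply("inc", naive)              # fires immediately
naive = apply("K", naive)
effects_after_naive = list(effects)

# Guarded form (S(K inc))(K zero): nothing fires until the dummy arrives.
effects.clear()
guarded = apply(apply("S", apply("K", "inc")), apply("K", "zero"))
effects_after_build = list(effects)
result = apply(guarded, "dummy")
effects_after_call = list(effects)
print(naive, effects_after_naive, effects_after_build, effects_after_call)
```

<p>
The naive build ends up as <code>K(I)</code> with the effect already spent, whereas the guarded term stays inert until it receives its dummy argument.
</p>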
<p>
What if we want to combine two such side effect functions, resulting in a new side effect function that does one first, then the other?  It seems tempting to just apply the second to the first, like this:
</p>
<pre class="example">(lambda (d)
  (second-se-fn (first-se-fn d)))
</pre>
<p>
The error in this approach is that even though <code>(first-se-fn d)</code> can only be reduced once the argument <code>d</code> is present, it is still, in terms of the SKI calculus, a function, and therefore a value, so <code>second-se-fn</code> can be reduced, even without <code>d</code>.  What we need instead is a function that takes an argument and then applies first one function to that argument and then a second one.  As luck would have it, that function is <code>S</code>.  So, to combine two side effect functions, yielding a new side effect function, we have
</p>
<pre class="example">((S first-se-fn) second-se-fn)
</pre>
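<p>
Written with ordinary closures, the combination behaves as follows.  This Python sketch assumes that every side effect function returns the identity once it has done its work, as <code>inc</code> does; the helper <code>make_side_effect_fn</code> is hypothetical.
</p>

```python
order = []  # names of the side effects, in the order they happened


def identity(x):
    return x


def S(f):
    """The S combinator: S f g x = (f x) (g x), with f x evaluated first."""
    return lambda g: lambda x: f(x)(g(x))


def make_side_effect_fn(name):
    """Hypothetical side effect function: logs its name, then returns I."""
    def se_fn(dummy):
        order.append(name)
        return identity
    return se_fn


combined = S(make_side_effect_fn("first"))(make_side_effect_fn("second"))
order_before_call = list(order)      # building the combination does nothing
result = combined(None)
print(order_before_call, order)
```

<p>
Because the result is again the identity, the combination is itself a side effect function and can be chained with further ones in the same way.
</p>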
<p>
<img src="http://www.complang.tuwien.ac.at/schani/blog/icfp2011/L1006887.jpg" alt="http://www.complang.tuwien.ac.at/schani/blog/icfp2011/L1006887.jpg" />
</p>
</div>
</div>
<div id="outline-container-5" class="outline-3">
<h3 id="sec-5">More strategy </h3>
<div class="outline-text-3" id="text-5">
<p>With all of that in place we can come back to our self-returning slot-increasing attack function.  As it turns out, passing around that extra slot number argument is quite an overhead when everything is translated down to SKI calculus, but we had already thought of a simpler solution.  Instead of having the slot number inside the function we can use indirection: We store the number of the slot to be attacked in some other slot and use the <code>get</code> function to retrieve it from within the attack function.  The disadvantage of this approach is that we need to increment the slot number after each attack, costing us one extra turn, but on the other hand we can switch to another slot to attack reasonably quickly.
</p>
<p>
After some optimizations, some of them directly on the generated SKI code, we managed to get the number of turns required to load this function down to below 180.  After that we just have to load <code>zero</code> into a slot and we can kill one enemy slot every other turn.
</p>
<p>
One small detail I omitted above is that when attacking a slot the number you give will actually be inverted, i.e. giving 0 will attack slot 255, and vice versa.  That makes it easy to attack high slot numbers but more difficult to attack lower ones, because higher numbers take more time to generate.  Nonetheless, our bot doesn&#8217;t actually start its attack from the highest slot, as would be easiest. It first figures out which opponent slot holds the most complex function (using the trivial metric of counting the leaves of the syntax trees) and attacks that first, then increments until it hits slot 0, then starts again from the highest slot, hopefully finishing off the opponent.
</p>
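<p>
The target selection can be sketched in Python as follows.  The bookkeeping of the real bot is not shown here, so the ordering below is one reading of the paragraph above, and all names are made up for the sketch.
</p>

```python
SLOT_COUNT = 256


def attack_argument(target_slot):
    """Number to pass to the attack so that it hits target_slot (inverted)."""
    return SLOT_COUNT - 1 - target_slot


def count_leaves(term):
    """Complexity metric: number of leaves in the syntax tree."""
    if isinstance(term, tuple):
        return count_leaves(term[0]) + count_leaves(term[1])
    return 1


def attack_order(opponent_slots):
    """Most complex slot first, then downwards to slot 0, then wrap around."""
    n = len(opponent_slots)
    start = max(range(n), key=lambda i: count_leaves(opponent_slots[i]))
    return [(start - k) % n for k in range(n)]


# Four slots instead of 256 to keep the example readable.
example_slots = ["I", ("S", ("K", "I")), "I", "I"]
print(attack_order(example_slots))
print(attack_argument(255), attack_argument(0))
```

<p>
Walking downwards through the opponent&#8217;s slots corresponds to incrementing the stored number, which is why a single increment per attack suffices.
</p>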
<p>
We did not have much time left to refine our bot beyond that.  There are very few measures we take to defend ourselves.  If one of our few vital slots is killed while we are generating our attack function, for example, we just revive the slot and then start over with generating the function.  A smarter bot would try to salvage the pieces left in the other slots.  If the slot we take the vitality from to launch attacks is killed, we&#8217;re dead in the water &#8211; we had code to <code>help</code> it get back into working condition, but we failed at the last minute to integrate it.
</p>
<p>
Overall, I think we&#8217;re doing quite well.  We launch our first attack within about 200 rounds and can finish off a bot that doesn&#8217;t defend itself within a bit more than 500 turns after that.  Against <a href="https://github.com/tanakh/ICFP2011">the best teams</a>, however, we&#8217;re helpless &#8211; they can kill all 256 of our slots stone dead in less than 190 rounds, starting from scratch.  Kudos!
</p>
<p>
<img src="http://www.complang.tuwien.ac.at/schani/blog/icfp2011/ltgvis.jpg" alt="http://www.complang.tuwien.ac.at/schani/blog/icfp2011/ltgvis.jpg" />
</p>
</div>
</div>
<div id="outline-container-6" class="outline-3">
<h3 id="sec-6">Contest organization </h3>
<div class="outline-text-3" id="text-6">
<p>In <a href="http://schani.wordpress.com/2009/07/11/the-icfp-programming-contest-2009/">the past</a> I have taken contest organizers to task for screwing up in minor and major ways.  This year, I am happy to say, there was almost nothing to find fault with.  The task was clear, not artificially obscured, many-layered, interesting and enjoyable.  There was a test submission server but teams didn&#8217;t depend on it and it worked quite well in any case.
</p>
<p>
So, from Team Funktion im Kopf der Mensch a huge Thank You to the organizers for this amazing job!  May all future contests be organized this well.
</p>
</div>
</div>
<div id="outline-container-7" class="outline-3">
<h3 id="sec-7">Programming languages </h3>
<div class="outline-text-3" id="text-7">
<p>Last year was the first year we used <a href="http://clojure.org/">Clojure</a> after using <a href="http://caml.inria.fr/">OCaml</a> almost exclusively in the years before.  This time we stepped back a bit and used OCaml for the actual bot, i.e. the program that will run on the organizer&#8217;s servers.  The main reasons for this were that OCaml has a much smaller footprint and more predictable performance than Clojure and we wanted to make sure we didn&#8217;t hit against any limits, and that we felt somewhat safer writing an unsupervised program when protected by OCaml&#8217;s type system.
</p>
<p>
We still used Clojure for developing the attack functions, starting from Lambda calculus, going down to SKI and then generating a sequence of card applications.
</p>
<p>
Our visualizer was written in C.
</p>
<p>
All of these languages did their respective jobs well.  No complaints this time.
</p>
</div>
</div>
<div id="outline-container-8" class="outline-3">
<h3 id="sec-8">The Code </h3>
<div class="outline-text-3" id="text-8">
<p>All the code we wrote for this year&#8217;s contest is available on <a href="http://github.com/schani/icfp-2011">GitHub</a>. If you need any assistance with it, please email me.
</p>
</div>
</div>
]]></content:encoded>
			<wfw:commentRss>http://schani.wordpress.com/2011/06/22/the-icfp-programming-contest-2011/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/daf7ba6f06480c52ac459772f2bb5268?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">schani</media:title>
		</media:content>

		<media:content url="http://www.complang.tuwien.ac.at/schani/blog/icfp2011/L1006883.jpg" medium="image">
			<media:title type="html">http://www.complang.tuwien.ac.at/schani/blog/icfp2011/L1006883.jpg</media:title>
		</media:content>

		<media:content url="http://www.complang.tuwien.ac.at/schani/blog/icfp2011/L1006887.jpg" medium="image">
			<media:title type="html">http://www.complang.tuwien.ac.at/schani/blog/icfp2011/L1006887.jpg</media:title>
		</media:content>

		<media:content url="http://www.complang.tuwien.ac.at/schani/blog/icfp2011/ltgvis.jpg" medium="image">
			<media:title type="html">http://www.complang.tuwien.ac.at/schani/blog/icfp2011/ltgvis.jpg</media:title>
		</media:content>
	</item>
		<item>
		<title>SGen – Finalization and Weak References</title>
		<link>http://schani.wordpress.com/2011/02/22/sgen-%e2%80%93-finalization-and-weak-references/</link>
		<comments>http://schani.wordpress.com/2011/02/22/sgen-%e2%80%93-finalization-and-weak-references/#comments</comments>
		<pubDate>Tue, 22 Feb 2011 15:41:01 +0000</pubDate>
		<dc:creator>schani</dc:creator>
				<category><![CDATA[Software]]></category>
		<category><![CDATA[garbage collection]]></category>
		<category><![CDATA[mono]]></category>
		<category><![CDATA[sgen]]></category>

		<guid isPermaLink="false">http://schani.wordpress.com/?p=152</guid>
		<description><![CDATA[<div id="outline-container-1" class="outline-3">
<h3 id="sec-1">Introduction </h3>
<div class="outline-text-3" id="text-1">

<p>In this installment of <a href="http://schani.wordpress.com/2010/12/20/sgen/">my series on SGen</a>, Mono's new garbage collector, we shall be looking at how finalizers and weak references are implemented, and why you (almost certainly) should not use finalizers.
</p></div>

</div>

<div id="outline-container-2" class="outline-3">
<h3 id="sec-2">Tracking object lifetime </h3>
<div class="outline-text-3" id="text-2">

<p>Both finalizers and weak references need to track the lifetime of certain objects in order to take an action when those objects become unreachable.  To that end SGen keeps lists of finalizable objects and weak references which it checks against at the end of every collection.
</p>
<p>
If the object referred to by a weak reference has become unreachable, the weak reference is nulled.
</p>
<p>
If a finalizable object is deemed unreachable by the collector, it is put onto the finalization queue and it is marked, since it must be kept alive until the finalizer has run.  Of course, all objects that it references have to be marked as well, so the main collection loop is activated again.
</p>
</div>

<div id="outline-container-2_1" class="outline-4">
<h4 id="sec-2_1">Generations </h4>
<div class="outline-text-4" id="text-2_1">

<p>A nursery collection cannot collect an object in the major heap, so it is not necessary to check the status of old objects after a nursery collection.  That is why SGen keeps separate finalization and weak reference lists for the nursery and major heaps.
</p></div>
</div>

</div>

<div id="outline-container-3" class="outline-3">
<h3 id="sec-3">Invoking finalizers </h3>
<div class="outline-text-3" id="text-3">

<p>SGen uses a dedicated thread for invoking finalizers.  The finalization queue is processed one object at a time.  As long as an object is in the finalization queue it is also considered live, i.e. the finalization queue is a GC root.
</p>
</div>

<div id="outline-container-3_1" class="outline-4">
<h4 id="sec-3_1">Resurrection </h4>
<div class="outline-text-4" id="text-3_1">

<p>Resurrecting an object means making it reachable again from within its finalizer or a finalizer that can still reach the object (or via a tracking weak reference).  The garbage collector does not have to treat this case specially&#8212;until the finalizer has run the object is considered live by virtue of its being in the finalization queue, and afterwards it is live because it is reachable through some other root(s).
</p></div>
</div>

</div>

<div id="outline-container-4" class="outline-3">
<h3 id="sec-4">Tracking weak references </h3>
<div class="outline-text-3" id="text-4">

<p>If an object is weakly referenced and has a finalizer, the weak reference will be nulled during the same collection in which the object is put on the finalization queue.  That is not always desirable, especially for objects that might be resurrected.
</p>
<p>
Tracking references solve the problem by keeping the reference intact at least until the finalizer has run.  Once the finalizer has finished, a tracking reference acts like a standard weak reference, i.e. it will be nulled once the object becomes unreachable, typically during the next collection (unless the object was resurrected).
</p>
<p>
When SGen encounters a tracking reference, instead of nulling it, it turns it into a non-tracking reference.  The referenced object is now on the finalization queue and therefore considered live again, so the reference will not be nulled before the finalizer has run.
</p></div>

</div>

<div id="outline-container-5" class="outline-3">
<h3 id="sec-5">Why finalization is Evil </h3>
<div class="outline-text-3" id="text-5">

<p>Finalization should not be used to manage scarce resources, such as file descriptors.
</p>
</div>

<div id="outline-container-5_1" class="outline-4">
<h4 id="sec-5_1">Time of finalization is not determined </h4>
<div class="outline-text-4" id="text-5_1">

<p>Unless you force a garbage collection and wait for the finalizers to finish, the time at which they are run is not determined.  If your program does not allocate a lot of memory, or your heap is huge, it might take a very long time until a major collection is triggered, so dead objects on the major heap might not be finalized in time before your scarce resource is depleted.
</p>
<p>
In <a href="http://www.youtube.com/watch?v=Dj7Y7Rd1Ou0">this highly recommended talk</a> (starting at 8:35) Cliff Click gives an example where it was necessary to put a hack into the JVM garbage collector that triggered a collection whenever the system ran out of file handles, because Apache Tomcat relied on finalizers to reclaim them.
</p></div>

</div>

<div id="outline-container-5_2" class="outline-4">
<h4 id="sec-5_2">Finalizers run one after another </h4>
<div class="outline-text-4" id="text-5_2">

<p>A finalizer that performs a time consuming task will delay the execution of other finalizers.  An especially worrisome case is finalizers that do potentially blocking I/O.  Depending on the timeout of the operation, other finalizers can be blocked for a long time.
</p></div>

</div>

<div id="outline-container-5_3" class="outline-4">
<h4 id="sec-5_3">Finalizers make objects live longer </h4>
<div class="outline-text-4" id="text-5_3">

<p>Since finalizable objects are considered live until their finalizers have run, they also consume memory until their finalizers have run and the next garbage collection is triggered.  This not only applies to the finalizable objects themselves but to all objects they reference as well.  This is particularly problematic if finalization is blocked by an ill-behaved finalizer.
</p></div>

</div>

<div id="outline-container-5_4" class="outline-4">
<h4 id="sec-5_4">Order of finalization is not determined </h4>
<div class="outline-text-4" id="text-5_4">

<p>A finalizer cannot count on other finalizers running before or after it, irrespective of whether it references or is referenced by those objects or not.  In other words, a finalizer cannot assume that an object it references has not been finalized yet.  That is true even if that object is known to be strongly held by a GC root, namely when shutting down.
</p>
<p>
An exception to this are critical finalizers, which are run after "normal" finalizers, but there is no determined order among normal finalizers, nor among critical finalizers.
</p></div>
</div>

</div>

<div id="outline-container-6" class="outline-3">
<h3 id="sec-6">[Update] </h3>
<div class="outline-text-3" id="text-6">

<p>It turns out that SGen's handling of tracking references, as described above, is not only conceptually wrong, it was also incorrectly implemented, which, ironically, mostly fixed the bug in the design.
</p>
<p>
Our conceptual mistake was to demote tracking references to non-tracking ones the first time the object became unreachable, which would have led to the reference being non-tracking even if its target was re-registered for finalization.  The bug in the implementation was that we didn't actually do that.  Instead, SGen would keep tracking references around until their targets became really, truly unreachable, even via finalization, i.e. until it was done finalizing, without any further chance of resurrection.  Only at that point would it turn the tracking reference into a non-tracking one, thereby making the object alive once again and keeping it around for one more garbage collection cycle.
</p>
<p>
The bug is now <a href="https://github.com/mono/mono/commit/3481f718fbb7b9bb0260158d056aafa2330390c1">fixed</a>.  Thanks to Alan for noticing this case and investigating.
</p></div>
</div>
]]></description>
			<content:encoded><![CDATA[<div id="outline-container-1" class="outline-3">
<h3 id="sec-1">Introduction </h3>
<div class="outline-text-3" id="text-1">
<p>In this installment of <a href="http://schani.wordpress.com/2010/12/20/sgen/">my series on SGen</a>, Mono&#8217;s new garbage collector, we shall be looking at how finalizers and weak references are implemented, and why you (almost certainly) should not use finalizers.
</p>
</div>
</div>
<div id="outline-container-2" class="outline-3">
<h3 id="sec-2">Tracking object lifetime </h3>
<div class="outline-text-3" id="text-2">
<p>Both finalizers and weak references need to track the lifetime of certain objects in order to take an action when those objects become unreachable.  To that end SGen keeps lists of finalizable objects and weak references which it checks against at the end of every collection.
</p>
<p>
If the object referred to by a weak reference has become unreachable, the weak reference is nulled.
</p>
<p>
If a finalizable object is deemed unreachable by the collector, it is put onto the finalization queue and it is marked, since it must be kept alive until the finalizer has run.  Of course, all objects that it references have to be marked as well, so the main collection loop is activated again.
</p>
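<p>
The end-of-collection bookkeeping can be illustrated with a toy model.  The Python sketch below is not SGen code: the classes, the single flat heap and the order of the two passes are simplifying assumptions, and only plain (non-tracking) weak references are modelled.
</p>

```python
class Obj:
    def __init__(self, name, refs=()):
        self.name = name
        self.refs = list(refs)          # strong references to other objects


class WeakRef:
    def __init__(self, target):
        self.target = target            # set to None when the reference is nulled


def mark_from(objects, marked):
    """Mark everything reachable from the given objects (the main loop)."""
    stack = list(objects)
    while stack:
        obj = stack.pop()
        if obj not in marked:
            marked.add(obj)
            stack.extend(obj.refs)


def collect(roots, finalizable, weak_refs, finalization_queue):
    marked = set()
    # The finalization queue acts as a root alongside the ordinary roots.
    mark_from(list(roots) + list(finalization_queue), marked)
    for ref in weak_refs:
        if ref.target is not None and ref.target not in marked:
            ref.target = None           # target unreachable: null the reference
    for obj in list(finalizable):
        if obj not in marked:
            finalizable.remove(obj)
            finalization_queue.append(obj)
            mark_from([obj], marked)    # keep it and its referents alive
    return marked


a, c, d = Obj("a"), Obj("c"), Obj("d")
b = Obj("b", refs=[c])                  # unreachable, but has a finalizer
weak_a, weak_b, weak_d = WeakRef(a), WeakRef(b), WeakRef(d)
finalizable, queue = [b], []
live = collect([a], finalizable, [weak_a, weak_b, weak_d], queue)
print(sorted(obj.name for obj in live), [obj.name for obj in queue])
```

<p>
In this run <code>b</code> and everything it references survive the collection solely because <code>b</code> was queued for finalization, while <code>d</code> is reclaimed.
</p>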
</div>
<div id="outline-container-2_1" class="outline-4">
<h4 id="sec-2_1">Generations </h4>
<div class="outline-text-4" id="text-2_1">
<p>A nursery collection cannot collect an object in the major heap, so it is not necessary to check the status of old objects after a nursery collection.  That is why SGen keeps separate finalization and weak reference lists for the nursery and major heaps.
</p>
</div>
</div>
</div>
<div id="outline-container-3" class="outline-3">
<h3 id="sec-3">Invoking finalizers </h3>
<div class="outline-text-3" id="text-3">
<p>SGen uses a dedicated thread for invoking finalizers.  The finalization queue is processed one object at a time.  As long as an object is in the finalization queue it is also considered live, i.e. the finalization queue is a GC root.
</p>
</div>
<div id="outline-container-3_1" class="outline-4">
<h4 id="sec-3_1">Resurrection </h4>
<div class="outline-text-4" id="text-3_1">
<p>Resurrecting an object means making it reachable again from within its finalizer or a finalizer that can still reach the object (or via a tracking weak reference).  The garbage collector does not have to treat this case specially&mdash;until the finalizer has run the object is considered live by virtue of its being in the finalization queue, and afterwards it is live because it is reachable through some other root(s).
</p>
</div>
</div>
</div>
<div id="outline-container-4" class="outline-3">
<h3 id="sec-4">Tracking weak references </h3>
<div class="outline-text-3" id="text-4">
<p>If an object is weakly referenced and has a finalizer, the weak reference will be nulled during the same collection in which the object is put on the finalization queue.  That is not always desirable, especially for objects that might be resurrected.
</p>
<p>
Tracking references solve the problem by keeping the reference intact at least until the finalizer has run.  Once the finalizer has finished, a tracking reference acts like a standard weak reference, i.e. it will be nulled once the object becomes unreachable, typically during the next collection (unless the object was resurrected).
</p>
<p>
When SGen encounters a tracking reference, instead of nulling it, it turns it into a non-tracking reference.  The referenced object is now on the finalization queue and therefore considered live again, so the reference will not be nulled before the finalizer has run.
</p>
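<p>
The demotion step described in this paragraph can be modelled in a few lines of Python.  This sketch follows the description given here, which the update at the end of this post revises; the class and function names are invented, and <code>reachable</code> stands for the set of objects found reachable before any finalizable objects are queued.
</p>

```python
class WeakRef:
    def __init__(self, target, tracking=False):
        self.target = target
        self.tracking = tracking


def process_weak_ref(ref, reachable):
    """End-of-collection handling of one weak reference, as described above."""
    if ref.target is None or ref.target in reachable:
        return                      # nothing to do while the target is reachable
    if ref.tracking:
        ref.tracking = False        # demote instead of nulling
    else:
        ref.target = None           # plain weak reference: null it


obj = object()                      # a finalizable object that became unreachable
plain = WeakRef(obj)
tracking = WeakRef(obj, tracking=True)

# First collection: the object is unreachable and gets queued for finalization.
for ref in (plain, tracking):
    process_weak_ref(ref, reachable=set())
after_first = (plain.target is None, tracking.target is obj, tracking.tracking)

# The finalizer runs without resurrecting the object; second collection.
process_weak_ref(tracking, reachable=set())
after_second = tracking.target is None
print(after_first, after_second)
```

<p>
The plain reference is gone after the first collection, while the tracking reference survives it and is nulled only by the second one.
</p>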
</div>
</div>
<div id="outline-container-5" class="outline-3">
<h3 id="sec-5">Why finalization is Evil </h3>
<div class="outline-text-3" id="text-5">
<p>Finalization should not be used to manage scarce resources, such as file descriptors.
</p>
</div>
<div id="outline-container-5_1" class="outline-4">
<h4 id="sec-5_1">Time of finalization is not determined </h4>
<div class="outline-text-4" id="text-5_1">
<p>Unless you force a garbage collection and wait for the finalizers to finish, the time at which they are run is not determined.  If your program does not allocate a lot of memory, or your heap is huge, it might take a very long time until a major collection is triggered, so dead objects on the major heap might not be finalized in time before your scarce resource is depleted.
</p>
<p>
In <a href="http://www.youtube.com/watch?v=Dj7Y7Rd1Ou0">this highly recommended talk</a> (starting at 8:35) Cliff Click gives an example where it was necessary to put a hack into the JVM garbage collector that triggered a collection whenever the system ran out of file handles, because Apache Tomcat relied on finalizers to reclaim them.
</p>
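<p>
The usual alternative is to release the resource explicitly at a well-defined point.  The following Python sketch contrasts the two styles with a hypothetical <code>HandlePool</code> that stands in for a scarce resource; it is an analogy in another language, not Mono or SGen code.
</p>

```python
from contextlib import contextmanager


class HandlePool:
    """Hypothetical scarce resource with a fixed number of handles."""

    def __init__(self, capacity):
        self.free = capacity

    def acquire(self):
        if self.free == 0:
            raise RuntimeError("out of handles")
        self.free -= 1

    def release(self):
        self.free += 1


@contextmanager
def handle(pool):
    pool.acquire()
    try:
        yield
    finally:
        pool.release()              # released as soon as the block exits


def use_with_explicit_release(pool, times):
    for _ in range(times):
        with handle(pool):
            pass                    # work with the handle here


def use_and_leave_to_finalizer(pool, times):
    for _ in range(times):
        pool.acquire()              # release would happen in a finalizer, later


careful = HandlePool(capacity=2)
use_with_explicit_release(careful, 10)

leaky = HandlePool(capacity=2)
try:
    use_and_leave_to_finalizer(leaky, 3)
    ran_out = False
except RuntimeError:
    ran_out = True
print(careful.free, ran_out)
```

<p>
With explicit release the pool never runs dry regardless of when the collector runs, whereas the second style depends entirely on finalizers being invoked in time.
</p>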
</div>
</div>
<div id="outline-container-5_2" class="outline-4">
<h4 id="sec-5_2">Finalizers run one after another </h4>
<div class="outline-text-4" id="text-5_2">
<p>A finalizer that performs a time consuming task will delay the execution of other finalizers.  An especially worrisome case is finalizers that do potentially blocking I/O.  Depending on the timeout of the operation, other finalizers can be blocked for a long time.
</p>
</div>
</div>
<div id="outline-container-5_3" class="outline-4">
<h4 id="sec-5_3">Finalizers make objects live longer </h4>
<div class="outline-text-4" id="text-5_3">
<p>Since finalizable objects are considered live until their finalizers have run, they also consume memory until their finalizers have run and the next garbage collection is triggered.  This not only applies to the finalizable objects themselves but to all objects they reference as well.  This is particularly problematic if finalization is blocked by an ill-behaved finalizer.
</p>
</div>
</div>
<div id="outline-container-5_4" class="outline-4">
<h4 id="sec-5_4">Order of finalization is not determined </h4>
<div class="outline-text-4" id="text-5_4">
<p>A finalizer cannot count on other finalizers running before or after it, irrespective of whether it references or is referenced by those objects or not.  In other words, a finalizer cannot assume that an object it references has not been finalized yet.  That is true even if that object is known to be strongly held by a GC root, namely when shutting down.
</p>
<p>
An exception to this are critical finalizers, which are run after &#8220;normal&#8221; finalizers, but there is no determined order among normal finalizers, nor among critical finalizers.
</p>
</div>
</div>
</div>
<div id="outline-container-6" class="outline-3">
<h3 id="sec-6">[Update] </h3>
<div class="outline-text-3" id="text-6">
<p>It turns out that SGen&#8217;s handling of tracking references, as described above, was not only conceptually wrong but also incorrectly implemented, which, ironically, mostly fixed the bug in the design.
</p>
<p>
Our conceptual mistake was to demote tracking references to non-tracking ones the first time the object became unreachable, which would have led to the reference being non-tracking even if its target was re-registered for finalization.  The bug in the implementation was that we didn&#8217;t actually do that.  Instead, SGen would keep tracking references around until their targets became really, truly unreachable, even via finalization, i.e. until it was done finalizing, without any further chance of resurrection.  Only at that point would it turn the tracking reference into a non-tracking one, thereby making the object alive once again and keeping it around for one more garbage collection cycle.
</p>
<p>
The bug is now <a href="https://github.com/mono/mono/commit/3481f718fbb7b9bb0260158d056aafa2330390c1">fixed</a>.  Thanks to Alan for noticing this case and investigating.
</p>
</div>
</div>
]]></content:encoded>
			<wfw:commentRss>http://schani.wordpress.com/2011/02/22/sgen-%e2%80%93-finalization-and-weak-references/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/daf7ba6f06480c52ac459772f2bb5268?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">schani</media:title>
		</media:content>
	</item>
		<item>
		<title>FOSDEM 2011</title>
		<link>http://schani.wordpress.com/2011/02/16/fosdem-2011/</link>
		<comments>http://schani.wordpress.com/2011/02/16/fosdem-2011/#comments</comments>
		<pubDate>Wed, 16 Feb 2011 15:20:13 +0000</pubDate>
		<dc:creator>schani</dc:creator>
				<category><![CDATA[Photography]]></category>
		<category><![CDATA[Software]]></category>
		<category><![CDATA[mono sgen gc fosdem]]></category>

		<guid isPermaLink="false">http://schani.wordpress.com/?p=148</guid>
		<description><![CDATA[
<div id="outline-container-1" class="outline-3">
<h3 id="sec-1">My SGen Talk </h3>
<div class="outline-text-3" id="text-1">

<p><a href="http://www.myplick.com/view/a9QLYJaDdJR">On myPlick</a>
</p></div>

</div>

<div id="outline-container-2" class="outline-3">
<h3 id="sec-2">A few photos </h3>
<div class="outline-text-3" id="text-2">

<p><img src="http://www.complang.tuwien.ac.at/schani/blog/fosdem2011/fosdem2011-1.jpg" alt="http://www.complang.tuwien.ac.at/schani/blog/fosdem2011/fosdem2011-1.jpg" />
</p>
<p>
<img src="http://www.complang.tuwien.ac.at/schani/blog/fosdem2011/fosdem2011-2.jpg" alt="http://www.complang.tuwien.ac.at/schani/blog/fosdem2011/fosdem2011-2.jpg" />
</p>
<p>
<img src="http://www.complang.tuwien.ac.at/schani/blog/fosdem2011/fosdem2011-3.jpg" alt="http://www.complang.tuwien.ac.at/schani/blog/fosdem2011/fosdem2011-3.jpg" />
</p>
<p>
<img src="http://www.complang.tuwien.ac.at/schani/blog/fosdem2011/fosdem2011-4.jpg" alt="http://www.complang.tuwien.ac.at/schani/blog/fosdem2011/fosdem2011-4.jpg" />
</p>
<p>
More photos <a href="http://www.facebook.com/album.php?aid=277887&#38;id=503328245&#38;l=d6e9958fc6">here</a>.
</p></div>
</div>
]]></description>
			<content:encoded><![CDATA[<div id="outline-container-1" class="outline-3">
<h3 id="sec-1">My SGen Talk </h3>
<div class="outline-text-3" id="text-1">
<p><a href="http://www.myplick.com/view/a9QLYJaDdJR">On myPlick</a>
</p>
</div>
</div>
<div id="outline-container-2" class="outline-3">
<h3 id="sec-2">A few photos </h3>
<div class="outline-text-3" id="text-2">
<p><img src="http://www.complang.tuwien.ac.at/schani/blog/fosdem2011/fosdem2011-1.jpg" alt="http://www.complang.tuwien.ac.at/schani/blog/fosdem2011/fosdem2011-1.jpg" />
</p>
<p>
<img src="http://www.complang.tuwien.ac.at/schani/blog/fosdem2011/fosdem2011-2.jpg" alt="http://www.complang.tuwien.ac.at/schani/blog/fosdem2011/fosdem2011-2.jpg" />
</p>
<p>
<img src="http://www.complang.tuwien.ac.at/schani/blog/fosdem2011/fosdem2011-3.jpg" alt="http://www.complang.tuwien.ac.at/schani/blog/fosdem2011/fosdem2011-3.jpg" />
</p>
<p>
<img src="http://www.complang.tuwien.ac.at/schani/blog/fosdem2011/fosdem2011-4.jpg" alt="http://www.complang.tuwien.ac.at/schani/blog/fosdem2011/fosdem2011-4.jpg" />
</p>
<p>
More photos <a href="http://www.facebook.com/album.php?aid=277887&amp;id=503328245&amp;l=d6e9958fc6">here</a>.
</p>
</div>
</div>
]]></content:encoded>
			<wfw:commentRss>http://schani.wordpress.com/2011/02/16/fosdem-2011/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/daf7ba6f06480c52ac459772f2bb5268?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">schani</media:title>
		</media:content>

		<media:content url="http://www.complang.tuwien.ac.at/schani/blog/fosdem2011/fosdem2011-1.jpg" medium="image">
			<media:title type="html">http://www.complang.tuwien.ac.at/schani/blog/fosdem2011/fosdem2011-1.jpg</media:title>
		</media:content>

		<media:content url="http://www.complang.tuwien.ac.at/schani/blog/fosdem2011/fosdem2011-2.jpg" medium="image">
			<media:title type="html">http://www.complang.tuwien.ac.at/schani/blog/fosdem2011/fosdem2011-2.jpg</media:title>
		</media:content>

		<media:content url="http://www.complang.tuwien.ac.at/schani/blog/fosdem2011/fosdem2011-3.jpg" medium="image">
			<media:title type="html">http://www.complang.tuwien.ac.at/schani/blog/fosdem2011/fosdem2011-3.jpg</media:title>
		</media:content>

		<media:content url="http://www.complang.tuwien.ac.at/schani/blog/fosdem2011/fosdem2011-4.jpg" medium="image">
			<media:title type="html">http://www.complang.tuwien.ac.at/schani/blog/fosdem2011/fosdem2011-4.jpg</media:title>
		</media:content>
	</item>
		<item>
		<title>SGen &#8211; The Major Collectors</title>
		<link>http://schani.wordpress.com/2011/01/10/sgen-the-major-collectors/</link>
		<comments>http://schani.wordpress.com/2011/01/10/sgen-the-major-collectors/#comments</comments>
		<pubDate>Mon, 10 Jan 2011 22:24:40 +0000</pubDate>
		<dc:creator>schani</dc:creator>
				<category><![CDATA[Software]]></category>
		<category><![CDATA[garbage collection]]></category>
		<category><![CDATA[mono]]></category>
		<category><![CDATA[sgen]]></category>

		<guid isPermaLink="false">http://schani.wordpress.com/?p=144</guid>
		<description><![CDATA[
<div id="outline-container-1" class="outline-3">
<h3 id="sec-1">Introduction </h3>
<div class="outline-text-3" id="text-1">

<p>In the <a href="http://schani.wordpress.com/2010/12/29/sgen-the-nursery/">last post</a> of my <a href="http://schani.wordpress.com/2010/12/20/sgen/">series on SGen</a>, Mono's new garbage collector, I talked about the young generation, called the nursery, and its collector.  In this post I shall talk about the old, or major, generation and the two different collectors that implement it.
</p>
<p>
The older of the two, the copying collector, is mostly used for reference nowadays.  In practical use it has been supplanted by the Mark-and-Sweep collector, which performs better in most cases and still has room for improvement.
</p>
<p>
The Mono man page gives details on how to specify which collector to use and how to change their parameters with the environment variable <code>MONO_GC_PARAMS</code>.
</p></div>

</div>

<div id="outline-container-2" class="outline-3">
<h3 id="sec-2">Copying </h3>
<div class="outline-text-3" id="text-2">

<p>The copying major collector is a very simple copying garbage collector.  Instead of using one big contiguous region of memory, however, it uses fixed-size blocks (of 128 KB) that are allocated and freed on demand.  Within those blocks allocation is done by pointer bumping.
</p>
<p>
At a major collection all objects for which this is possible are copied to newly allocated blocks.  Pinned objects clearly cannot be copied, so they stay in place.  All blocks that have been vacated completely are freed.  In the ones that still contain pinned objects we zero the regions that are now empty.  We don't actually reuse that space in those blocks but just keep them around until all their objects have become unpinned.  This is clearly an area where things can be improved.
</p></div>

</div>

<div id="outline-container-3" class="outline-3">
<h3 id="sec-3">Why two collectors? </h3>
<div class="outline-text-3" id="text-3">

<p>The copying collector was implemented when SGen was still young.  It shares most of the copying machinery with the nursery collector, so it was easy to implement.  As a production collector, however, it is not suited for most workloads.  The old generation is expected to be large and composed of objects that have a long life, so copying them is bad from the perspective of memory usage (it might double for the duration of the collection) as well as of cache behavior.
</p></div>

</div>

<div id="outline-container-4" class="outline-3">
<h3 id="sec-4">Mark-and-Sweep </h3>
<div class="outline-text-3" id="text-4">

<p>Mark-and-Sweep is SGen's default major collector and is actively being developed and improved.
</p>
<p>
In contrast to the copying collector a mark-and-sweep collector has to deal with individual objects being freed and new objects filling up that space again.  (This is different from pinned objects leaving holes in blocks for the copying collector because in that case it is expected that there will only be a tiny number of remaining pinned objects, so the free regions will be large, whereas here it is expected that many objects survive, leaving lots of small holes). This creates the problem of fragmentation - lots of small holes between objects that cannot be filled anymore, resulting in lots of wasted space.
</p>
</div>

<div id="outline-container-4_1" class="outline-4">
<h4 id="sec-4_1">Blocks </h4>
<div class="outline-text-4" id="text-4_1">

<p>The basic idea of SGen's Mark-and-Sweep collector is to have fixed-size blocks, each of which is divided into object slots of a single size, with different blocks serving different slot sizes - this is similar to what Boehm does.  One advantage of this over allocating objects sequentially is that the holes in a block are always the same size, so if new objects of that size arrive the holes can always be filled.
</p>
<p>
Of course it is not feasible to support blocks with slots of every conceivable object size, so we space out the different supported sizes so as not to waste too much space (in the future we might want to choose the sizes dynamically, to fit the workload).  Another thing that works to our advantage is that, since the slots have a known size, we don't have to keep tabs on where they start and end.  All we need is one mark bit per object which we keep in a separate region for better memory locality (we could use unused bits in the first object word, like the nursery collector does).  For faster processing we're actually keeping one bit per potential object start address, i.e. 1 bit per 8 bytes.
</p>
<p>
Since we only have a limited number of different slot sizes, we can also keep free lists, so allocation is more efficient because it doesn't involve search.
</p>
<p>
During garbage collection it is necessary to get from an object to its block in order to access its mark bit.  The size of each block is 16 KB, and they are allocated on 16 KB boundaries, so to get from an object that is inside a block to the address of its containing block a simple bitwise <code>AND</code> is sufficient.  Unfortunately, large objects live outside of blocks and for them this operation would yield a bogus block address.  To determine whether an object is large we need to examine its class metadata which is referenced from its VTable, requiring three loads (the "fixed heap" variant of the collector, described below, solves this problem).
</p>
<p>
Each block has a "block info" data structure that contains metadata about the block.  Apart from the size of the object slots and several other pieces of data it also contains the mark bits.
</p></div>

</div>

<div id="outline-container-4_2" class="outline-4">
<h4 id="sec-4_2">Fixed Heap </h4>
<div class="outline-text-4" id="text-4_2">

<p>In the standard Mark-and-Sweep collector the blocks are allocated on demand, so they are potentially scattered all over the process's address space.  The fixed heap variant, on the other hand, allocates (using <code>mmap</code>) a heap of a fixed size (excluding the large object space) on startup and then assigns blocks out of that space.  The advantage here is that we can check whether a pointer is within a block simply by checking whether it falls within the bounds of the allocated space, so no loads are necessary to get to its mark bit.
</p>
<p>
In standard Mark-and-Sweep the first word of each block links to its block info.  Fixed heap can forgo this load as well because the block infos are also allocated sequentially, so all that is needed to get to a block info is its index which is the same as its block's index, which can be determined from the block address.
</p>
<p>
The obvious disadvantage of fixed heap Mark-and-Sweep is that the heap cannot grow beyond its fixed size.  On some workloads fixed heap yields significant performance improvements.  I will try to give some benchmarking results in a later post.
</p></div>

</div>

<div id="outline-container-4_3" class="outline-4">
<h4 id="sec-4_3">Non-Reference Objects </h4>
<div class="outline-text-4" id="text-4_3">

<p>Objects that don't contain references don't have to be scanned, which implies that they also don't have to be put on the gray stack.  To handle this special case more efficiently Mark-and-Sweep uses separate blocks for objects with versus without references.  The block info contains a flag that says which type the block is.
</p>
<p>
For the fixed heap collector this separation means that we don't even have to load a single word of an object or its metadata if it doesn't contain references.
</p></div>

</div>

<div id="outline-container-4_4" class="outline-4">
<h4 id="sec-4_4">Evacuation </h4>
<div class="outline-text-4" id="text-4_4">

<p>Despite the segregation of blocks according to fixed object sizes, one form of memory waste can still occur.  If the workload shifts from having lots of objects of a specific size to having only a few of them, a lot of blocks for that object size will have very low occupancy - perhaps a single object will still be live in the block and no new objects will arrive to fill the empty slots.
</p>
<p>
Unlike Boehm, SGen has the liberty to move objects around.  During the sweep stage of a collection we keep statistics on the occupancy of blocks of all slot sizes.  If the occupancy for a given slot size falls below a certain (user-definable) threshold then we mark that slot size for evacuation on the next major collection.
</p>
<p>
When blocks of a slot size are being evacuated Mark-and-Sweep acts like a copying collector on the objects within those blocks.  Instead of being marked, those objects are copied to newly allocated blocks, which are filled sequentially; this empties the sparsely occupied blocks, which are then freed during the following sweep.
</p>
<p>
Blocks with pinned objects are exempt from this whole process since they cannot be freed anyway.
</p>
<p>
Note that the occupancy situation might change between a sweep and the following major collection - the workload could change in the meantime and produce lots of live objects of that size which would all be copied, without actually saving any space.  SGen cannot tell which objects will be live without doing the mark phase because that's the mark phase's job, so we take the risk of that happening.
</p>
<p>
As a side-note: Sun's <a href="http://research.sun.com/jtech/pubs/04-g1-paper-ismm.pdf">Garbage-First</a> collector solves this problem by running the mark phase concurrently with the mutator, so it is actually able to tell at the start of a garbage collection pause which blocks will contain mostly garbage.
</p></div>

</div>

<div id="outline-container-4_5" class="outline-4">
<h4 id="sec-4_5">Parallel Mark </h4>
<div class="outline-text-4" id="text-4_5">

<p>The mark phase can be parallelized with reasonable effort. Essentially, a number of threads each start with a set of root objects independently.  Each thread has its own gray stack, but care must be taken that work is not repeated.  Two threads can compete for marking the same object, and if they both put the object into their gray stacks, the object will be scanned twice, which is to be avoided.
</p>
<p>
Some care must also be taken with handling objects that still reside in the nursery.  They have to be copied to the major heap and a forwarding pointer must be installed.  The forwarding pointer can only be installed after the space on the major heap has been allocated (otherwise where would it point to?), so two threads might do that independently but only one will win out and the other will have done the allocation in vain.
</p>
<p>
It is not actually sufficient to just partition the set of root objects between the worker threads - some threads might end up having a far smaller share of the live object graph to work with.  SGen currently uses a shared gray stack section buffer where worker threads deposit parts of their gray queues for the others to take.
</p>
<p>
Parallel mark achieves quite impressive performance gains on some workloads, but on most the gains are still somewhat disappointing.  I suspect the work distribution needs improvement.
</p></div>

</div>

<div id="outline-container-4_6" class="outline-4">
<h4 id="sec-4_6">Concurrent Sweep </h4>
<div class="outline-text-4" id="text-4_6">

<p>The sweep phase goes through all blocks in the system, resets the mark bits, zeroes the object slots that were freed and rebuilds the free lists.
</p>
<p>
None of those data structures are actually needed until the next nursery collection, so the sweep lends itself to being done in the background while the mutator is running, which is what Mark-and-Sweep can do optionally.  In that case the next nursery collection will wait for the sweep thread to finish before it proceeds.
</p></div>
</div>
</div>
]]></description>
			<content:encoded><![CDATA[<div id="outline-container-1" class="outline-3">
<h3 id="sec-1">Introduction </h3>
<div class="outline-text-3" id="text-1">
<p>In the <a href="http://schani.wordpress.com/2010/12/29/sgen-the-nursery/">last post</a> of my <a href="http://schani.wordpress.com/2010/12/20/sgen/">series on SGen</a>, Mono&#8217;s new garbage collector, I talked about the young generation, called the nursery, and its collector.  In this post I shall talk about the old, or major, generation and the two different collectors that implement it.
</p>
<p>
The older of the two, the copying collector, is mostly used for reference nowadays.  In practical use it has been supplanted by the Mark-and-Sweep collector, which performs better in most cases and still has room for improvement.
</p>
<p>
The Mono man page gives details on how to specify which collector to use and how to change their parameters with the environment variable <code>MONO_GC_PARAMS</code>.
</p>
</div>
</div>
<div id="outline-container-2" class="outline-3">
<h3 id="sec-2">Copying </h3>
<div class="outline-text-3" id="text-2">
<p>The copying major collector is a very simple copying garbage collector.  Instead of using one big contiguous region of memory, however, it uses fixed-size blocks (of 128 KB) that are allocated and freed on demand.  Within those blocks allocation is done by pointer bumping.
</p>
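<p>
To make pointer bumping inside fixed-size blocks concrete, here is a minimal sketch in C.  It is not SGen's code: the names <code>copy_block</code>, <code>copy_block_new</code> and <code>copy_alloc</code>, the header layout, the 8-byte alignment and the use of <code>malloc</code> to obtain blocks are all assumptions made for illustration.  Only the 128 KB block size comes from the text above.
</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define COPY_BLOCK_SIZE (128 * 1024)  /* block size stated in the text */
#define ALLOC_ALIGN 8                 /* assumed object alignment */

/* Hypothetical block header; not SGen's actual layout. */
typedef struct copy_block {
    struct copy_block *next;  /* previously allocated block */
    char *free_ptr;           /* the "bump" pointer: next free byte */
    char *end;                /* one past the last byte of the block */
} copy_block;

/* Header size rounded up so that the first object is aligned. */
#define COPY_HEADER_SIZE \
    ((sizeof(copy_block) + ALLOC_ALIGN - 1) & ~(size_t)(ALLOC_ALIGN - 1))

/* Allocate a fresh block on demand.  Returns NULL if out of memory. */
static copy_block *copy_block_new(copy_block *next)
{
    copy_block *b = malloc(COPY_BLOCK_SIZE);
    if (!b)
        return NULL;
    b->next = next;
    b->free_ptr = (char *)b + COPY_HEADER_SIZE;
    b->end = (char *)b + COPY_BLOCK_SIZE;
    return b;
}

/* Bump-allocate size bytes from the head block; open a new block
   when the current one cannot hold the request. */
static void *copy_alloc(copy_block **head, size_t size)
{
    size = (size + ALLOC_ALIGN - 1) & ~(size_t)(ALLOC_ALIGN - 1);
    assert(size > 0 && size <= COPY_BLOCK_SIZE - COPY_HEADER_SIZE);
    if (!*head || (size_t)((*head)->end - (*head)->free_ptr) < size) {
        copy_block *b = copy_block_new(*head);
        if (!b)
            return NULL;
        *head = b;
    }
    void *obj = (*head)->free_ptr;
    (*head)->free_ptr += size;  /* the bump */
    return obj;
}
```

<p>
Allocation is a bounds check plus an addition, which is why the scheme is attractive; the price is that space freed in the middle of a block cannot be reused by this allocator.
</p>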
<p>
At a major collection all objects for which this is possible are copied to newly allocated blocks.  Pinned objects clearly cannot be copied, so they stay in place.  All blocks that have been vacated completely are freed.  In the ones that still contain pinned objects we zero the regions that are now empty.  We don&#8217;t actually reuse that space in those blocks but just keep them around until all their objects have become unpinned.  This is clearly an area where things can be improved.
</p>
</div>
</div>
<div id="outline-container-3" class="outline-3">
<h3 id="sec-3">Why two collectors? </h3>
<div class="outline-text-3" id="text-3">
<p>The copying collector was implemented when SGen was still young.  It shares most of the copying machinery with the nursery collector, so it was easy to implement.  As a production collector, however, it is not suited for most workloads.  The old generation is expected to be large and composed of objects that have a long life, so copying them is bad from the perspective of memory usage (it might double for the duration of the collection) as well as of cache behavior.
</p>
</div>
</div>
<div id="outline-container-4" class="outline-3">
<h3 id="sec-4">Mark-and-Sweep </h3>
<div class="outline-text-3" id="text-4">
<p>Mark-and-Sweep is SGen&#8217;s default major collector and is actively being developed and improved.
</p>
<p>
In contrast to the copying collector a mark-and-sweep collector has to deal with individual objects being freed and new objects filling up that space again.  (This is different from pinned objects leaving holes in blocks for the copying collector because in that case it is expected that there will only be a tiny number of remaining pinned objects, so the free regions will be large, whereas here it is expected that many objects survive, leaving lots of small holes). This creates the problem of fragmentation &#8211; lots of small holes between objects that cannot be filled anymore, resulting in lots of wasted space.
</p>
</div>
<div id="outline-container-4_1" class="outline-4">
<h4 id="sec-4_1">Blocks </h4>
<div class="outline-text-4" id="text-4_1">
<p>The basic idea of SGen&#8217;s Mark-and-Sweep collector is to have fixed-size blocks, each of which is divided into object slots of a single size, with different blocks serving different slot sizes &#8211; this is similar to what Boehm does.  One advantage of this over allocating objects sequentially is that the holes in a block are always the same size, so if new objects of that size arrive the holes can always be filled.
</p>
<p>
Of course it is not feasible to support blocks with slots of every conceivable object size, so we space out the different supported sizes so as not to waste too much space (in the future we might want to choose the sizes dynamically, to fit the workload).  Another thing that works to our advantage is that, since the slots have a known size, we don&#8217;t have to keep tabs on where they start and end.  All we need is one mark bit per object which we keep in a separate region for better memory locality (we could use unused bits in the first object word, like the nursery collector does).  For faster processing we&#8217;re actually keeping one bit per potential object start address, i.e. 1 bit per 8 bytes.
</p>
<p>
Since we only have a limited number of different slot sizes, we can also keep free lists, so allocation is more efficient because it doesn&#8217;t involve search.
</p>
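<p>
The combination of a small set of slot sizes and one free list per size might be sketched as follows.  The concrete sizes in <code>slot_sizes</code> are invented for the example (the sizes SGen uses are not given here), and <code>size_class_for</code>, <code>slot_free</code> and <code>slot_alloc</code> are hypothetical names.
</p>

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative slot sizes only; not SGen's actual table. */
static const size_t slot_sizes[] = { 16, 24, 32, 48, 64, 96, 128, 192, 256 };
#define NUM_SLOT_SIZES (sizeof slot_sizes / sizeof slot_sizes[0])

/* A free slot stores the link to the next free slot in its first word. */
typedef struct free_slot { struct free_slot *next; } free_slot;

/* One free list per supported slot size. */
static free_slot *free_lists[NUM_SLOT_SIZES];

/* Smallest size class that fits size, or -1 if the object is too large. */
static int size_class_for(size_t size)
{
    for (size_t i = 0; i < NUM_SLOT_SIZES; i++)
        if (size <= slot_sizes[i])
            return (int)i;
    return -1;
}

/* Return a slot to the free list of its size class. */
static void slot_free(int cls, void *slot)
{
    assert(cls >= 0 && (size_t)cls < NUM_SLOT_SIZES);
    free_slot *s = slot;
    s->next = free_lists[cls];
    free_lists[cls] = s;
}

/* Pop a slot from the free list; NULL means a new block is needed. */
static void *slot_alloc(int cls)
{
    assert(cls >= 0 && (size_t)cls < NUM_SLOT_SIZES);
    free_slot *s = free_lists[cls];
    if (s)
        free_lists[cls] = s->next;
    return s;
}
```

<p>
With these sizes a 40-byte object lands in a 48-byte slot and wastes 8 bytes.  Spacing the sizes further apart means fewer kinds of blocks but more waste per object, which is the trade-off described above.
</p>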
<p>
During garbage collection it is necessary to get from an object to its block in order to access its mark bit.  The size of each block is 16 KB, and they are allocated on 16 KB boundaries, so to get from an object that is inside a block to the address of its containing block a simple bitwise <code>AND</code> is sufficient.  Unfortunately, large objects live outside of blocks and for them this operation would yield a bogus block address.  To determine whether an object is large we need to examine its class metadata which is referenced from its VTable, requiring three loads (the &#8220;fixed heap&#8221; variant of the collector, described below, solves this problem).
</p>
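<p>
The address arithmetic is short enough to spell out.  The following sketch assumes the 16 KB block size and the one-bit-per-8-bytes bitmap mentioned above; the type <code>ms_block_info</code> and the function names are made up for the example and do not mirror SGen's source.
</p>

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MS_BLOCK_SIZE 16384u  /* 16 KB, allocated on 16 KB boundaries */
#define MS_GRANULE 8u         /* one mark bit per 8 bytes */
#define MS_MARK_BITS (MS_BLOCK_SIZE / MS_GRANULE)
#define MS_MARK_WORDS (MS_MARK_BITS / 32)

/* Hypothetical block info holding the mark bitmap. */
typedef struct {
    uint32_t mark_words[MS_MARK_WORDS];
} ms_block_info;

/* Address of the block containing obj: clear the low 14 address bits. */
static uintptr_t ms_block_for(uintptr_t obj)
{
    return obj & ~(uintptr_t)(MS_BLOCK_SIZE - 1);
}

/* Index of the mark bit for obj within its block. */
static size_t ms_mark_index(uintptr_t obj)
{
    return (size_t)(obj & (MS_BLOCK_SIZE - 1)) / MS_GRANULE;
}

/* Set the mark bit; returns true if the object was not marked before. */
static bool ms_mark(ms_block_info *info, uintptr_t obj)
{
    size_t i = ms_mark_index(obj);
    uint32_t bit = (uint32_t)1 << (i % 32);
    if (info->mark_words[i / 32] & bit)
        return false;
    info->mark_words[i / 32] |= bit;
    return true;
}
```

<p>
Applied to a large object, <code>ms_block_for</code> would still return an address, just not one of a real block, which is exactly the problem described above.
</p>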
<p>
Each block has a &#8220;block info&#8221; data structure that contains metadata about the block.  Apart from the size of the object slots and several other pieces of data it also contains the mark bits.
</p>
</div>
</div>
<div id="outline-container-4_2" class="outline-4">
<h4 id="sec-4_2">Fixed Heap </h4>
<div class="outline-text-4" id="text-4_2">
<p>In the standard Mark-and-Sweep collector the blocks are allocated on demand, so they are potentially scattered all over the process&#8217;s address space.  The fixed heap variant, on the other hand, allocates (using <code>mmap</code>) a heap of a fixed size (excluding the large object space) on startup and then assigns blocks out of that space.  The advantage here is that we can check whether a pointer is within a block simply by checking whether it falls within the bounds of the allocated space, so no loads are necessary to get to its mark bit.
</p>
<p>
In standard Mark-and-Sweep the first word of each block links to its block info.  Fixed heap can forgo this load as well because the block infos are also allocated sequentially, so all that is needed to get to a block info is its index which is the same as its block&#8217;s index, which can be determined from the block address.
</p>
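<p>
Both fixed heap shortcuts, the bounds check and the index computation, can be sketched as below.  The structures <code>fixed_heap</code> and <code>fh_block_info</code> and their fields are assumptions for illustration; a real implementation would obtain the heap with <code>mmap</code>, which is left out here.
</p>

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FH_BLOCK_SIZE 16384u

/* Hypothetical per-block metadata. */
typedef struct {
    uint16_t slot_size;
    bool has_references;
} fh_block_info;

/* Hypothetical description of the fixed heap. */
typedef struct {
    uintptr_t start;       /* start of the reserved heap, block-aligned */
    size_t num_blocks;     /* fixed at startup */
    fh_block_info *infos;  /* infos[i] describes block i */
} fixed_heap;

/* True if p points into the fixed heap.  The unsigned subtraction wraps
   around for p below start, so one comparison covers both bounds. */
static bool fh_contains(const fixed_heap *h, uintptr_t p)
{
    return p - h->start < (uintptr_t)h->num_blocks * FH_BLOCK_SIZE;
}

/* Block info for the block containing p, found without touching the
   block itself: the block index doubles as the info index. */
static fh_block_info *fh_info_for(const fixed_heap *h, uintptr_t p)
{
    assert(fh_contains(h, p));
    return &h->infos[(p - h->start) / FH_BLOCK_SIZE];
}
```

<p>
Neither function reads memory belonging to the object or its block, which is the point of the variant.
</p>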
<p>
The obvious disadvantage of fixed heap Mark-and-Sweep is that the heap cannot grow beyond its fixed size.  On some workloads fixed heap yields significant performance improvements.  I will try to give some benchmarking results in a later post.
</p>
</div>
</div>
<div id="outline-container-4_3" class="outline-4">
<h4 id="sec-4_3">Non-Reference Objects </h4>
<div class="outline-text-4" id="text-4_3">
<p>Objects that don&#8217;t contain references don&#8217;t have to be scanned, which implies that they also don&#8217;t have to be put on the gray stack.  To handle this special case more efficiently Mark-and-Sweep uses separate blocks for objects with versus without references.  The block info contains a flag that says which type the block is.
</p>
<p>
For the fixed heap collector this separation means that we don&#8217;t even have to load a single word of an object or its metadata if it doesn&#8217;t contain references.
</p>
</div>
</div>
<div id="outline-container-4_4" class="outline-4">
<h4 id="sec-4_4">Evacuation </h4>
<div class="outline-text-4" id="text-4_4">
<p>Despite the segregation of blocks according to fixed object sizes, one form of memory waste can still occur.  If the workload shifts from having lots of objects of a specific size to having only a few of them, a lot of blocks for that object size will have very low occupancy &#8211; perhaps a single object will still be live in the block and no new objects will arrive to fill the empty slots.
</p>
<p>
Unlike Boehm, SGen has the liberty to move objects around.  During the sweep stage of a collection we keep statistics on the occupancy of blocks of all slot sizes.  If the occupancy for a given slot size falls below a certain (user-definable) threshold then we mark that slot size for evacuation on the next major collection.
</p>
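<p>
The bookkeeping behind that decision could look roughly like this.  The structure <code>size_class_stats</code>, the function names and the choice to express the threshold as a fraction between 0 and 1 are assumptions; the text only says that occupancy is tracked per slot size and compared against a threshold.
</p>

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-slot-size statistics gathered during a sweep. */
typedef struct {
    size_t slots_total;  /* slots in all swept blocks of this size */
    size_t slots_used;   /* slots that still hold a live object */
    bool evacuate_next;  /* evacuate this size at the next major collection */
} size_class_stats;

/* Called once per swept block of the given slot size. */
static void sweep_record_block(size_class_stats *s,
                               size_t slots_in_block, size_t live_slots)
{
    assert(live_slots <= slots_in_block);
    s->slots_total += slots_in_block;
    s->slots_used += live_slots;
}

/* Called when the sweep is done: compare occupancy to the threshold.
   A size class without any blocks is never evacuated. */
static void sweep_finish(size_class_stats *s, double threshold)
{
    double occupancy = s->slots_total
        ? (double)s->slots_used / (double)s->slots_total
        : 1.0;
    s->evacuate_next = occupancy < threshold;
}
```

<p>
Two blocks of 100 slots with 5 and 15 live objects give an occupancy of 0.10, so with a threshold of 0.25 that slot size would be flagged for evacuation.
</p>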
<p>
When blocks of a slot size are being evacuated Mark-and-Sweep acts like a copying collector on the objects within those blocks.  Instead of being marked, those objects are copied to newly allocated blocks, which are filled sequentially; this empties the sparsely occupied blocks, which are then freed during the following sweep.
</p>
<p>
Blocks with pinned objects are exempt from this whole process since they cannot be freed anyway.
</p>
<p>
Note that the occupancy situation might change between a sweep and the following major collection &#8211; the workload could change in the meantime and produce lots of live objects of that size which would all be copied, without actually saving any space.  SGen cannot tell which objects will be live without doing the mark phase because that&#8217;s the mark phase&#8217;s job, so we take the risk of that happening.
</p>
<p>
As a side-note: Sun&#8217;s <a href="http://research.sun.com/jtech/pubs/04-g1-paper-ismm.pdf">Garbage-First</a> collector solves this problem by running the mark phase concurrently with the mutator, so it is actually able to tell at the start of a garbage collection pause which blocks will contain mostly garbage.
</p>
</div>
</div>
<div id="outline-container-4_5" class="outline-4">
<h4 id="sec-4_5">Parallel Mark </h4>
<div class="outline-text-4" id="text-4_5">
<p>The mark phase can be parallelized with reasonable effort. Essentially, a number of threads each start with a set of root objects independently.  Each thread has its own gray stack, but care must be taken that work is not repeated.  Two threads can compete for marking the same object, and if they both put the object into their gray stacks, the object will be scanned twice, which is to be avoided.
</p>
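<p>
One common way to make sure only one thread claims an object is an atomic compare-and-swap on its mark flag.  The following C11 sketch shows the idea under that assumption.  It is not necessarily how SGen does it, and <code>Object</code> and <code>try_mark</code> are made-up names.
</p>

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical object header with a one-bit mark flag. */
typedef struct {
    atomic_bool marked;
} Object;

/* Tries to mark `obj`.  Returns true only for the single thread that flipped
 * the flag from false to true; that thread pushes the object onto its own
 * gray stack.  Every other thread gets false and skips the object. */
bool
try_mark (Object *obj)
{
    assert (obj != NULL);
    bool expected = false;
    return atomic_compare_exchange_strong (&obj->marked, &expected, true);
}
```

<p>
A worker would call <code>try_mark</code> on each reference it finds and push the object only when the call returns true, so no object ends up in two gray stacks.
</p>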
<p>
Some care must also be taken with handling objects that still reside in the nursery.  They have to be copied to the major heap and a forwarding pointer must be installed.  The forwarding pointer can only be installed after the space on the major heap has been allocated (otherwise where would it point to?), so two threads might do that independently but only one will win out and the other will have done the allocation in vain.
</p>
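<p>
The race described above can be sketched as follows.  This is a simplified C11 illustration, not SGen's code: <code>malloc</code> stands in for the major heap allocator, the lowest header bit is assumed to be the forwarding tag, and <code>Object</code> and <code>copy_or_forward</code> are hypothetical names.
</p>

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>

#define FORWARDED_BIT ((uintptr_t) 1)

typedef struct {
    _Atomic uintptr_t header;  /* vtable pointer, or new address | FORWARDED_BIT */
    int payload;
} Object;

/* Copies `old` unless another thread already did; returns the winning copy. */
Object *
copy_or_forward (Object *old)
{
    uintptr_t header = atomic_load (&old->header);
    if (header & FORWARDED_BIT)
        return (Object *) (header & ~FORWARDED_BIT);

    /* Allocate and copy first: the forwarding pointer needs a destination. */
    Object *copy = malloc (sizeof (Object));  /* stand-in for major heap allocation */
    assert (copy != NULL);
    atomic_init (&copy->header, header);
    copy->payload = old->payload;

    uintptr_t forwarded = (uintptr_t) copy | FORWARDED_BIT;
    if (atomic_compare_exchange_strong (&old->header, &header, forwarded))
        return copy;  /* we won the race */

    /* We lost: `header` now holds the winner's forwarding word. */
    free (copy);      /* our allocation was in vain */
    return (Object *) (header & ~FORWARDED_BIT);
}
```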
<p>
It is not actually sufficient to just partition the set of root objects between the worker threads &#8211; some threads might end up having a far smaller share of the live object graph to work with.  SGen currently uses a shared gray stack section buffer where worker threads deposit parts of their gray queues for the others to take.
</p>
<p>
Parallel mark achieves quite impressive performance gains on some workloads, but on most the gains are still somewhat disappointing.  I suspect the work distribution needs improvement.
</p>
</div>
</div>
<div id="outline-container-4_6" class="outline-4">
<h4 id="sec-4_6">Concurrent Sweep </h4>
<div class="outline-text-4" id="text-4_6">
<p>The sweep phase goes through all blocks in the system, resets the mark bits, zeroes the object slots that were freed and rebuilds the free lists.
</p>
<p>
None of those data structures are actually needed until the next nursery collection, so the sweep lends itself to being done in the background while the mutator is running, which is what Mark-and-Sweep can optionally do.  In that case the next nursery collection waits for the sweep thread to finish before it proceeds.
</p>
</div>
</div>
</div>
]]></content:encoded>
			<wfw:commentRss>http://schani.wordpress.com/2011/01/10/sgen-the-major-collectors/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/daf7ba6f06480c52ac459772f2bb5268?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">schani</media:title>
		</media:content>
	</item>
		<item>
		<title>SGen &#8211; The Nursery</title>
		<link>http://schani.wordpress.com/2010/12/29/sgen-the-nursery/</link>
		<comments>http://schani.wordpress.com/2010/12/29/sgen-the-nursery/#comments</comments>
		<pubDate>Wed, 29 Dec 2010 22:49:35 +0000</pubDate>
		<dc:creator>schani</dc:creator>
				<category><![CDATA[Software]]></category>
		<category><![CDATA[garbage collection]]></category>
		<category><![CDATA[mono]]></category>
		<category><![CDATA[sgen]]></category>

		<guid isPermaLink="false">http://schani.wordpress.com/?p=140</guid>
		<description><![CDATA[<div id="outline-container-1" class="outline-3">
<h3 id="sec-1">Introduction </h3>
<div class="outline-text-3" id="text-1">

<p>This is the second part of a <a href="http://schani.wordpress.com/2010/12/20/sgen/">series of blog posts on SGen</a>, Mono's new garbage collector.  In this installment we shall be taking a closer look at the first generation, also called the "nursery".
</p></div>

</div>

<div id="outline-container-2" class="outline-3">
<h3 id="sec-2">Nursery </h3>
<div class="outline-text-3" id="text-2">

<p>As the name implies, the nursery is the generation where objects are allocated, or, to be more poetic, born.  In SGen, it is a contiguous region of memory that is allocated when SGen starts up, and it does not change size.  The default size is 4 MB, but a different size can be specified via the environment variable <code>MONO_GC_PARAMS</code> when invoking Mono.  See the man page for details.
</p></div>

</div>

<div id="outline-container-3" class="outline-3">
<h3 id="sec-3">Allocation </h3>
<div class="outline-text-3" id="text-3">

<p>In a classic semi-space collector, allocation is done in a linear fashion by pointer bumping, i.e. the pointer that indicates the start of the still-unused region of the current semi-space is simply incremented by the amount of memory required for the new object.
</p>
<p>
In a multi-threaded environment like Mono using a single pointer would introduce additional costs because we would have to do at least a compare-and-swap to increment it, potentially looping because it might fail due to contention.  The standard solution to this problem is to give each thread a little piece of the nursery exclusively from which it can bump-allocate without contention.  Synchronization is only needed when a thread has filled up its piece and needs a new one, which should be far less frequent.  These pieces of nursery are called "thread local allocation buffers", or "TLABs" for short.
</p>
<p>
TLABs are assigned to threads lazily.  The relevant pointers, most notably the "bumper" and the address of the TLAB's end, are stored in thread-local variables.  Currently TLABs are fixed in size to 4 KB (actually that's the maximum size - they might be smaller because the nursery can be fragmented).  In the future we might dynamically adapt that size to the current situation.  The fewer threads are allocating objects, the larger the TLABs should be so that they need to be assigned less often.  On the other hand, the more threads we allocate TLABs to the sooner we run out of nursery space, so we will collect prematurely.  In that situation the TLABs should be smaller. Ideally a thread should get a TLAB whose size is proportional to its rate of allocation.
</p>
<p>
Calling from managed into unmanaged code involves doing a relatively costly transition.  To avoid this the fast paths of the allocation functions are managed code.  They only handle allocation from the TLAB.  If the thread does not have a TLAB assigned yet or it is not of sufficient size to fulfill the allocation request we call the full, unmanaged allocation function.
</p></div>

</div>

<div id="outline-container-4" class="outline-3">
<h3 id="sec-4">Collection </h3>
<div class="outline-text-3" id="text-4">

<p>Collecting the nursery is not much different in principle from collecting the major heap, so much of what I discuss here is also valid there.
</p>
</div>

<div id="outline-container-4_1" class="outline-4">
<h4 id="sec-4_1">Coloring </h4>
<div class="outline-text-4" id="text-4_1">

<p>The collection process can be thought of as a coloring game.  At the start of the collection all objects start out as white.  First the collector goes through all the roots and marks those gray, signifying that they are reachable but their contents have not been processed yet.  Each gray object is then scanned for further references.  The objects found during that scanning are also colored gray if they are still white.  When the collector has finished scanning a gray object it paints it black, meaning that object is reachable and fully processed.  This is repeated until there are no gray objects left, at which point all white objects are known not to be reachable and can be disposed of whereas all black objects must be considered reachable.
</p>
<p>
One of the central data structures in this process, then, is the "gray set", i.e. the set of all gray objects.  In SGen it is implemented as a stack.
</p></div>

</div>

<div id="outline-container-4_2" class="outline-4">
<h4 id="sec-4_2">The roots </h4>
<div class="outline-text-4" id="text-4_2">

<p>As I mentioned above as well as in my overview post, the collection starts with the roots.  These include, most prominently, the references on the stacks of all managed threads and static class fields.  In addition to that there are numerous other roots that the runtime registers and uses for internal purposes.  Since SGen is a generational collector, the roots for the nursery collection also include all references from the major heap to objects in the nursery. These are kept track of with write barriers which I will discuss in a later blog post.
</p></div>

</div>

<div id="outline-container-4_3" class="outline-4">
<h4 id="sec-4_3">Forwarding </h4>
<div class="outline-text-4" id="text-4_3">

<p>In a copying collector like SGen's nursery collector, which copies reachable objects from the nursery to the major heap, coloring an object also implies copying it, which in turn requires that all references to it must be updated.  To that end we must keep track of where each object has been copied to.  The easiest way to do this is to use a forwarding pointer.  In Mono, the first pointer-sized word of each object is a pointer to the object's vtable, which is aligned to at least 4 bytes.  This alignment leaves the two least significant bits for SGen to play with during a collection, so we use one of them to indicate whether that particular object is forwarded, i.e. has already been copied to the major heap.  If it is set, the rest of the word doesn't actually point to the vtable anymore, but to the object's new location on the major heap.  The second bit is used for pinning, which I will discuss further down.
</p></div>

</div>

<div id="outline-container-4_4" class="outline-4">
<h4 id="sec-4_4">The loop </h4>
<div class="outline-text-4" id="text-4_4">

<p>Here is a simplified (pinning is not handled) pseudo-code implementation of the central loop for the nursery collector:
</p>

<pre class="example"><span class="linenr"> 1:  </span>while (!gray_stack_is_empty ()) {
<span class="linenr"> 2:  </span>    object = gray_stack_pop ();
<span class="linenr"> 3:  </span>    foreach (refp in object) {
<span class="linenr"> 4:  </span>        old = *refp;
<span class="linenr"> 5:  </span>        if (!ptr_in_nursery (old))
<span class="linenr"> 6:  </span>            continue;
<span class="linenr"> 7:  </span>        if (object_is_forwarded (old)) {
<span class="linenr"> 8:  </span>            new = forwarding_destination (old);
<span class="linenr"> 9:  </span>        } else {
<span class="linenr">10:  </span>            new = major_alloc (object_size (old));
<span class="linenr">11:  </span>            copy_object (new, old);
<span class="linenr">12:  </span>            forwarding_set (old, new);
<span class="linenr">13:  </span>            gray_stack_push (new);
<span class="linenr">14:  </span>        }
<span class="linenr">15:  </span>        *refp = new;
<span class="linenr">16:  </span>    }
<span class="linenr">17:  </span>}
</pre>

<p>
In line <code>2</code> we pop an object out of the gray stack and in line <code>3</code> we loop over all the references that it contains.  <code>refp</code> is a pointer to the reference we're currently looking at.  In line <code>4</code> we fetch the actual reference.  We are only interested in collecting the nursery, so if the reference already points to the old generation we skip it (lines <code>5</code> and <code>6</code>).  If the object is already forwarded (line <code>7</code>) we fetch its new location (line <code>8</code>).  Otherwise we must copy it to the major heap.  To that end we allocate the space required (line <code>10</code>), do the actual copying (line <code>11</code>) and forward the object (line <code>12</code>). That newly copied object must also be processed, so we push it on the gray stack (line <code>13</code>).  Finally, for both cases, we update the reference to point to the new location of the object (line <code>15</code>).
</p></div>

</div>

<div id="outline-container-4_5" class="outline-4">
<h4 id="sec-4_5">Pinning </h4>
<div class="outline-text-4" id="text-4_5">

<p>In the overview post I mentioned that we currently cannot scan stack frames precisely, i.e. we do not know whether a word that looks like a reference just does so by coincidence or is actually one.  Work for scanning managed stack frames precisely is under way, but for unmanaged frames this is not possible anyway because the C compiler does not provide us with the necessary information (some runtimes avoid this problem by not exposing managed references to unmanaged code directly).
</p>
<p>
Because we lack this knowledge we must proceed conservatively and assume that what looks like a reference is one, which means considering the object it points to as reachable.  On the other hand, since we cannot be sure, we are not allowed to modify this supposed reference, because it might just be a number.  That implies that we also cannot move the object.  In other words the object is "pinned".
</p>
<p>
Pinned objects are a bit of a complication.  Having pinned objects in the nursery means that we cannot clean it out completely and so it becomes fragmented.  It also requires an additional check in the central loop.
</p></div>

</div>

<div id="outline-container-4_6" class="outline-4">
<h4 id="sec-4_6">Finishing up </h4>
<div class="outline-text-4" id="text-4_6">

<p>After all the copying is done what's left is to clean up the nursery for the next round of allocations.  SGen doesn't actually zero the nursery memory at this point, however.  Instead that happens at the point when TLABs are assigned to threads.  The advantage is not only that it potentially gets distributed over more than one thread, but also that it has better cache behavior - the TLAB is likely to be touched again soon by that thread.
</p>
<p>
What has to happen at this point, though, is to take stock of the regions of nursery memory that are available for allocation, which means everything apart from pinned objects.  Since we have a list of all pinned objects this is done very efficiently - I will discuss this "pin queue" in a later post.  It is these "fragments" from which TLABs will be assigned in the next mutator round.
</p>
<p>
I should mention that there is a bit of additional work to be done prior to fragment creation, namely making sure unreachable finalizable objects are finalized and weak references are zeroed if the objects they pointed to have become unreachable.  Both are very similar issues and shall be discussed in a later post.
</p></div>
</div>
</div>
]]></description>
			<content:encoded><![CDATA[<div id="outline-container-1" class="outline-3">
<h3 id="sec-1">Introduction </h3>
<div class="outline-text-3" id="text-1">
<p>This is the second part of a <a href="http://schani.wordpress.com/2010/12/20/sgen/">series of blog posts on SGen</a>, Mono&#8217;s new garbage collector.  In this installment we shall be taking a closer look at the first generation, also called the &#8220;nursery&#8221;.
</p>
</div>
</div>
<div id="outline-container-2" class="outline-3">
<h3 id="sec-2">Nursery </h3>
<div class="outline-text-3" id="text-2">
<p>As the name implies, the nursery is the generation where objects are allocated, or, to be more poetic, born.  In SGen, it is a contiguous region of memory that is allocated when SGen starts up, and it does not change size.  The default size is 4 MB, but a different size can be specified via the environment variable <code>MONO_GC_PARAMS</code> when invoking Mono.  See the man page for details.
</p>
</div>
</div>
<div id="outline-container-3" class="outline-3">
<h3 id="sec-3">Allocation </h3>
<div class="outline-text-3" id="text-3">
<p>In a classic semi-space collector, allocation is done in a linear fashion by pointer bumping, i.e. the pointer that indicates the start of the still-unused region of the current semi-space is simply incremented by the amount of memory required for the new object.
</p>
<p>
In a multi-threaded environment like Mono using a single pointer would introduce additional costs because we would have to do at least a compare-and-swap to increment it, potentially looping because it might fail due to contention.  The standard solution to this problem is to give each thread a little piece of the nursery exclusively from which it can bump-allocate without contention.  Synchronization is only needed when a thread has filled up its piece and needs a new one, which should be far less frequent.  These pieces of nursery are called &#8220;thread local allocation buffers&#8221;, or &#8220;TLABs&#8221; for short.
</p>
<p>
TLABs are assigned to threads lazily.  The relevant pointers, most notably the &#8220;bumper&#8221; and the address of the TLAB&#8217;s end, are stored in thread-local variables.  Currently TLABs are fixed in size to 4 KB (actually that&#8217;s the maximum size &#8211; they might be smaller because the nursery can be fragmented).  In the future we might dynamically adapt that size to the current situation.  The fewer threads are allocating objects, the larger the TLABs should be so that they need to be assigned less often.  On the other hand, the more threads we allocate TLABs to the sooner we run out of nursery space, so we will collect prematurely.  In that situation the TLABs should be smaller. Ideally a thread should get a TLAB whose size is proportional to its rate of allocation.
</p>
<p>
Calling from managed into unmanaged code involves doing a relatively costly transition.  To avoid this the fast paths of the allocation functions are managed code.  They only handle allocation from the TLAB.  If the thread does not have a TLAB assigned yet or it is not of sufficient size to fulfill the allocation request we call the full, unmanaged allocation function.
</p>
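<p>
To make the fast path concrete, here is a minimal C sketch of bump allocation from a TLAB.  The <code>Tlab</code> struct, the <code>tlab_alloc</code> function and the 8-byte alignment are assumptions for illustration.  In the real runtime the pointers live in thread-local variables and the fast path is managed code.
</p>

```c
#include <assert.h>
#include <stddef.h>

#define ALIGNMENT 8  /* assumed allocation alignment */

/* Hypothetical per-thread allocation state. */
typedef struct {
    char *next;  /* the "bumper": first unused byte of the TLAB */
    char *end;   /* one past the last byte of the TLAB */
} Tlab;

/* Fast path.  Returns NULL if there is no TLAB or it is too small, in which
 * case the caller falls back to the full allocation function. */
void *
tlab_alloc (Tlab *tlab, size_t size)
{
    assert (size > 0);
    /* Round the request up to the next multiple of ALIGNMENT. */
    size = (size + ALIGNMENT - 1) & ~(size_t) (ALIGNMENT - 1);
    if (tlab->next == NULL || (size_t) (tlab->end - tlab->next) < size)
        return NULL;
    void *result = tlab->next;
    tlab->next += size;  /* no synchronization: the TLAB belongs to this thread */
    return result;
}
```

<p>
The only shared state is touched on the slow path, when a new TLAB has to be carved out of the nursery.
</p>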
</div>
</div>
<div id="outline-container-4" class="outline-3">
<h3 id="sec-4">Collection </h3>
<div class="outline-text-3" id="text-4">
<p>Collecting the nursery is not much different in principle from collecting the major heap, so much of what I discuss here is also valid there.
</p>
</div>
<div id="outline-container-4_1" class="outline-4">
<h4 id="sec-4_1">Coloring </h4>
<div class="outline-text-4" id="text-4_1">
<p>The collection process can be thought of as a coloring game.  At the start of the collection all objects start out as white.  First the collector goes through all the roots and marks those gray, signifying that they are reachable but their contents have not been processed yet.  Each gray object is then scanned for further references.  The objects found during that scanning are also colored gray if they are still white.  When the collector has finished scanning a gray object it paints it black, meaning that object is reachable and fully processed.  This is repeated until there are no gray objects left, at which point all white objects are known not to be reachable and can be disposed of whereas all black objects must be considered reachable.
</p>
<p>
One of the central data structures in this process, then, is the &#8220;gray set&#8221;, i.e. the set of all gray objects.  In SGen it is implemented as a stack.
</p>
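<p>
The coloring game can be written down in a few lines of C.  This is only a sketch with made-up types: each object has at most two references, the gray stack has a fixed capacity, and nothing is copied or freed.
</p>

```c
#include <assert.h>
#include <stddef.h>

#define MAX_REFS 2
#define MAX_OBJECTS 16  /* the sketch assumes at most this many objects */

typedef enum { WHITE, GRAY, BLACK } Color;

typedef struct Object {
    Color color;
    struct Object *refs [MAX_REFS];
} Object;

static Object *gray_stack [MAX_OBJECTS];
static size_t gray_top;

/* Colors a white object gray and queues it for scanning. */
static void
shade (Object *obj)
{
    if (obj != NULL && obj->color == WHITE) {
        obj->color = GRAY;
        assert (gray_top < MAX_OBJECTS);
        gray_stack [gray_top++] = obj;
    }
}

/* Marks everything reachable from `roots`.  Objects that are still white
 * afterwards are unreachable. */
void
mark_from_roots (Object **roots, size_t num_roots)
{
    for (size_t i = 0; i < num_roots; ++i)
        shade (roots [i]);
    while (gray_top > 0) {
        Object *obj = gray_stack [--gray_top];
        for (size_t i = 0; i < MAX_REFS; ++i)
            shade (obj->refs [i]);
        obj->color = BLACK;  /* fully scanned */
    }
}
```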
</div>
</div>
<div id="outline-container-4_2" class="outline-4">
<h4 id="sec-4_2">The roots </h4>
<div class="outline-text-4" id="text-4_2">
<p>As I mentioned above as well as in my overview post, the collection starts with the roots.  These include, most prominently, the references on the stacks of all managed threads and static class fields.  In addition to that there are numerous other roots that the runtime registers and uses for internal purposes.  Since SGen is a generational collector, the roots for the nursery collection also include all references from the major heap to objects in the nursery. These are kept track of with write barriers which I will discuss in a later blog post.
</p>
</div>
</div>
<div id="outline-container-4_3" class="outline-4">
<h4 id="sec-4_3">Forwarding </h4>
<div class="outline-text-4" id="text-4_3">
<p>In a copying collector like SGen&#8217;s nursery collector, which copies reachable objects from the nursery to the major heap, coloring an object also implies copying it, which in turn requires that all references to it must be updated.  To that end we must keep track of where each object has been copied to.  The easiest way to do this is to use a forwarding pointer.  In Mono, the first pointer-sized word of each object is a pointer to the object&#8217;s vtable, which is aligned to at least 4 bytes.  This alignment leaves the two least significant bits for SGen to play with during a collection, so we use one of them to indicate whether that particular object is forwarded, i.e. has already been copied to the major heap.  If it is set, the rest of the word doesn&#8217;t actually point to the vtable anymore, but to the object&#8217;s new location on the major heap.  The second bit is used for pinning, which I will discuss further down.
</p>
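<p>
A minimal C sketch of this tagging scheme follows.  Which of the two low bits means &#8220;forwarded&#8221; and which means &#8220;pinned&#8221; is an assumption here, as is the <code>ObjectHeader</code> layout.  The function names mirror the pseudo-code used for the central loop.
</p>

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FORWARDED_BIT ((uintptr_t) 1)  /* assumed bit assignment */
#define PINNED_BIT    ((uintptr_t) 2)  /* assumed bit assignment */
#define TAG_MASK      (FORWARDED_BIT | PINNED_BIT)

/* Hypothetical object layout: the first word is the (tagged) vtable pointer. */
typedef struct {
    uintptr_t vtable_word;
} ObjectHeader;

bool
object_is_forwarded (const ObjectHeader *obj)
{
    return (obj->vtable_word & FORWARDED_BIT) != 0;
}

/* Overwrites the vtable word with the new address plus the forwarded tag. */
void
forwarding_set (ObjectHeader *obj, void *destination)
{
    /* The destination must be at least 4-byte aligned so both tag bits are free. */
    assert (((uintptr_t) destination & TAG_MASK) == 0);
    obj->vtable_word = (uintptr_t) destination | FORWARDED_BIT;
}

/* Strips the tag bits to recover the object's new location. */
void *
forwarding_destination (const ObjectHeader *obj)
{
    assert (object_is_forwarded (obj));
    return (void *) (obj->vtable_word & ~TAG_MASK);
}
```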
</div>
</div>
<div id="outline-container-4_4" class="outline-4">
<h4 id="sec-4_4">The loop </h4>
<div class="outline-text-4" id="text-4_4">
<p>Here is a simplified (pinning is not handled) pseudo-code implementation of the central loop for the nursery collector:
</p>
<pre class="example"><span class="linenr"> 1:  </span>while (!gray_stack_is_empty ()) {
<span class="linenr"> 2:  </span>    object = gray_stack_pop ();
<span class="linenr"> 3:  </span>    foreach (refp in object) {
<span class="linenr"> 4:  </span>        old = *refp;
<span class="linenr"> 5:  </span>        if (!ptr_in_nursery (old))
<span class="linenr"> 6:  </span>            continue;
<span class="linenr"> 7:  </span>        if (object_is_forwarded (old)) {
<span class="linenr"> 8:  </span>            new = forwarding_destination (old);
<span class="linenr"> 9:  </span>        } else {
<span class="linenr">10:  </span>            new = major_alloc (object_size (old));
<span class="linenr">11:  </span>            copy_object (new, old);
<span class="linenr">12:  </span>            forwarding_set (old, new);
<span class="linenr">13:  </span>            gray_stack_push (new);
<span class="linenr">14:  </span>        }
<span class="linenr">15:  </span>        *refp = new;
<span class="linenr">16:  </span>    }
<span class="linenr">17:  </span>}
</pre>
<p>
In line <code>2</code> we pop an object out of the gray stack and in line <code>3</code> we loop over all the references that it contains.  <code>refp</code> is a pointer to the reference we&#8217;re currently looking at.  In line <code>4</code> we fetch the actual reference.  We are only interested in collecting the nursery, so if the reference already points to the old generation we skip it (lines <code>5</code> and <code>6</code>).  If the object is already forwarded (line <code>7</code>) we fetch its new location (line <code>8</code>).  Otherwise we must copy it to the major heap.  To that end we allocate the space required (line <code>10</code>), do the actual copying (line <code>11</code>) and forward the object (line <code>12</code>). That newly copied object must also be processed, so we push it on the gray stack (line <code>13</code>).  Finally, for both cases, we update the reference to point to the new location of the object (line <code>15</code>).
</p>
</div>
</div>
<div id="outline-container-4_5" class="outline-4">
<h4 id="sec-4_5">Pinning </h4>
<div class="outline-text-4" id="text-4_5">
<p>In the overview post I mentioned that we currently cannot scan stack frames precisely, i.e. we do not know whether a word that looks like a reference just does so by coincidence or is actually one.  Work for scanning managed stack frames precisely is under way, but for unmanaged frames this is not possible anyway because the C compiler does not provide us with the necessary information (some runtimes avoid this problem by not exposing managed references to unmanaged code directly).
</p>
<p>
Because we lack this knowledge we must proceed conservatively and assume that what looks like a reference is one, which means considering the object it points to as reachable.  On the other hand, since we cannot be sure, we are not allowed to modify this supposed reference, because it might just be a number.  That implies that we also cannot move the object.  In other words the object is &#8220;pinned&#8221;.
</p>
<p>
Pinned objects are a bit of a complication.  Having pinned objects in the nursery means that we cannot clean it out completely and so it becomes fragmented.  It also requires an additional check in the central loop.
</p>
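<p>
Conservative scanning can be sketched in C as a simple range check over the words of a stack.  This is an illustration only, with the hypothetical function <code>collect_pin_candidates</code>.  A real collector additionally has to find the object that a candidate address falls into, which is omitted here.
</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Collects every stack word that could be a pointer into the nursery
 * [nursery_start, nursery_end).  The words themselves are never modified.
 * Returns the number of candidates written to `pin_queue`. */
size_t
collect_pin_candidates (const uintptr_t *stack, size_t num_words,
                        uintptr_t nursery_start, uintptr_t nursery_end,
                        uintptr_t *pin_queue, size_t capacity)
{
    size_t count = 0;
    for (size_t i = 0; i < num_words; ++i) {
        uintptr_t word = stack [i];
        if (word < nursery_start || word >= nursery_end)
            continue;  /* cannot be a nursery reference */
        assert (count < capacity);
        /* Might be a reference or just a number: keep the object in place. */
        pin_queue [count++] = word;
    }
    return count;
}
```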
</div>
</div>
<div id="outline-container-4_6" class="outline-4">
<h4 id="sec-4_6">Finishing up </h4>
<div class="outline-text-4" id="text-4_6">
<p>After all the copying is done what&#8217;s left is to clean up the nursery for the next round of allocations.  SGen doesn&#8217;t actually zero the nursery memory at this point, however.  Instead that happens at the point when TLABs are assigned to threads.  The advantage is not only that it potentially gets distributed over more than one thread, but also that it has better cache behavior &#8211; the TLAB is likely to be touched again soon by that thread.
</p>
<p>
What has to happen at this point, though, is to take stock of the regions of nursery memory that are available for allocation, which means everything apart from pinned objects.  Since we have a list of all pinned objects this is done very efficiently &#8211; I will discuss this &#8220;pin queue&#8221; in a later post.  It is these &#8220;fragments&#8221; from which TLABs will be assigned in the next mutator round.
</p>
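<p>
Assuming the pin queue is sorted by address, the fragments are simply the gaps between consecutive pinned objects.  The C sketch below shows that single pass.  <code>PinnedObject</code>, <code>Fragment</code> and <code>build_fragments</code> are illustrative names, not SGen's.
</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { uintptr_t start, size; } PinnedObject;  /* hypothetical pin queue entry */
typedef struct { uintptr_t start, end; } Fragment;       /* free range [start, end) */

/* `pinned` must be sorted by address and non-overlapping.
 * Returns the number of fragments written to `out`. */
size_t
build_fragments (uintptr_t nursery_start, uintptr_t nursery_end,
                 const PinnedObject *pinned, size_t num_pinned,
                 Fragment *out, size_t capacity)
{
    size_t count = 0;
    uintptr_t cursor = nursery_start;
    for (size_t i = 0; i <= num_pinned; ++i) {
        /* The gap ends at the next pinned object, or at the nursery end. */
        uintptr_t gap_end = i < num_pinned ? pinned [i].start : nursery_end;
        if (gap_end > cursor) {
            assert (count < capacity);
            out [count].start = cursor;
            out [count].end = gap_end;
            ++count;
        }
        if (i < num_pinned)
            cursor = pinned [i].start + pinned [i].size;  /* skip the pinned object */
    }
    return count;
}
```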
<p>
I should mention that there is a bit of additional work to be done prior to fragment creation, namely making sure unreachable finalizable objects are finalized and weak references are zeroed if the objects they pointed to have become unreachable.  Both are very similar issues and shall be discussed in a later post.
</p>
</div>
</div>
</div>
]]></content:encoded>
			<wfw:commentRss>http://schani.wordpress.com/2010/12/29/sgen-the-nursery/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/daf7ba6f06480c52ac459772f2bb5268?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">schani</media:title>
		</media:content>
	</item>
		<item>
		<title>SGen</title>
		<link>http://schani.wordpress.com/2010/12/20/sgen/</link>
		<comments>http://schani.wordpress.com/2010/12/20/sgen/#comments</comments>
		<pubDate>Mon, 20 Dec 2010 14:56:59 +0000</pubDate>
		<dc:creator>schani</dc:creator>
				<category><![CDATA[Software]]></category>
		<category><![CDATA[garbage collection]]></category>
		<category><![CDATA[mono]]></category>
		<category><![CDATA[sgen]]></category>

		<guid isPermaLink="false">http://schani.wordpress.com/?p=135</guid>
		<description><![CDATA[<div id="outline-container-1" class="outline-3">
<h3 id="sec-1">Introduction </h3>
<div class="outline-text-3" id="text-1">

<p>SGen is Mono's new garbage collector that we have been working on intensely for almost two years now and that has become <a href="http://lists.ximian.com/pipermail/mono-devel-list/2010-December/036513.html">stable and competitive</a> over the past few months.
</p>
<p>
In this series of blog posts I will try to explain how garbage collection works in general, how SGen works in particular, how to get the best performance out of it and, finally, what our plans are for the future.
</p>
<p>
This first post will give a very brief overview of what a garbage collector does and how it does it, before outlining the broad architecture of SGen.
</p></div>

</div>

<div id="outline-container-2" class="outline-3">
<h3 id="sec-2">Why a new garbage collector? </h3>
<div class="outline-text-3" id="text-2">

<p>Since its inception Mono, like many other garbage collected runtimes, has been using the <a href="http://www.hpl.hp.com/personal/Hans_Boehm/gc/">Boehm-Demers-Weiser collector</a>, which I shall refer to as "Boehm" henceforth.  Boehm's main advantages are its portability, stability and the ease with which it can be embedded.  It is designed as a garbage collector for C and C++, not for managed languages, however, so it does not come as a surprise that it falls short for that purpose compared to collectors dedicated to such languages or runtimes.
</p>
<p>
Our goal with SGen was to overcome Boehm's limitations and provide better performance for managed applications, in terms of allocation speed, collector pauses as well as memory utilization.  In this and the following posts of this series I will mention points where we improve upon Boehm whenever appropriate.
</p></div>

</div>

<div id="outline-container-3" class="outline-3">
<h3 id="sec-3">Garbage collection </h3>
<div class="outline-text-3" id="text-3">

<p>Before digging into the details of SGen it seems prudent to discuss how garbage collection actually works.  I will only discuss topics that are relevant to SGen, and even paint those only with very broad strokes.  For more comprehensive overviews of garbage collection see <a href="ftp://ftp.cs.utexas.edu/pub/garbage/bigsurv.ps">"Uniprocessor Garbage Collection Techniques" by Wilson</a> or, for more detailed information, <a href="http://www.amazon.com/gp/product/0471941484%3Fie=UTF8&#38;tag=juggphotsofta-20&#38;linkCode=as2&#38;camp=1789&#38;creative=9325&#38;creativeASIN=0471941484">Jones and Lins's book</a>.
</p>
<p>
The garbage collector, in short, is that part of the language's or platform's runtime that keeps a program from exhausting the computer's memory, despite the lack of an explicit way to free objects that have been allocated.  It does that by discovering which objects might still be reached by the program and getting rid of the rest.  It considers as reachable objects those that are directly or indirectly referenced by the so-called "roots", which are global variables, local variables on the stacks or in registers and other references explicitly registered as roots.  In other words, all those objects are considered reachable, as well as those that they reference, and those that they reference, and so on.  The pedantic will note that the program might not actually be able to reach all of those.  Computing the minimum set of objects that the program could reach is impossible, however, which we know thanks to <a href="http://www.amazon.com/gp/product/0470229055%3Fie=UTF8&#38;tag=juggphotsofta-20&#38;linkCode=as2&#38;camp=1789&#38;creative=9325&#38;creativeASIN=0470229055">this</a> <a href="http://www.amazon.com/gp/product/0802775802%3Fie=UTF8&#38;tag=juggphotsofta-20&#38;linkCode=as2&#38;camp=1789&#38;creative=9325&#38;creativeASIN=0802775802">guy</a>.
</p>
<p>
Both SGen and Boehm are "stop-the-world" collectors, meaning that they do their work while the actual program, slightingly called the "mutator" in garbage collection lingo, is stopped.
</p>
<p>
There are three classic garbage collection algorithms: mark-and-sweep, copying, and compaction.  Only the first two are relevant to our discussion of SGen, since we have not implemented a compacting collector and have no plans to do so in the foreseeable future.
</p>
</div>

<div id="outline-container-3_1" class="outline-4">
<h4 id="sec-3_1">Mark-and-Sweep </h4>
<div class="outline-text-4" id="text-3_1">

<p>Mark-and-sweep is the oldest and probably most widely implemented garbage collection algorithm.  Boehm is a mark-and-sweep collector.
</p>
<p>
The idea is to have a "mark" bit on each object that is set on all reachable ones in the "mark" phase of the collection, by recursively traversing references starting from the root set.  Of course in practice this is not actually implemented recursively, but by using a queue or stack.
</p>
<p>
The "sweep" phase then frees the memory occupied by the objects that were not marked and resets the mark bits on the marked ones.
</p>
<p>
Of course many variants of this algorithm exist that vary in details.
</p></div>

</div>

<div id="outline-container-3_2" class="outline-4">
<h4 id="sec-3_2">Copying </h4>
<div class="outline-text-4" id="text-3_2">

<p>Traditional mark-and-sweep has two main weaknesses.  First, it needs to visit all objects, including the unreachable ones.  Second, it can suffer from memory fragmentation like a <code>malloc/free</code> allocator.
</p>
<p>
A copying collector addresses both problems by copying all reachable objects to a new memory region, allocating them linearly one after the other.  The old memory region can then be discarded wholesale.  The classic copying collector is the so-called "semi-space" algorithm, where the heap is divided into two halves.  Allocation is done linearly from one half until it is full, at which point the collector copies the reachable objects to the second half.  Allocation proceeds from there, and at the next collection the now empty first half is used as the copying destination, the so-called "to-space".
</p>
<p>
Since with a copying collector objects change their addresses it is obvious that the collector also has to update all the references to the objects that it moves.  In Mono this is not always possible because we cannot in all cases unambiguously identify whether a value in memory (usually on the stack) is really a reference or just a number that points to an object by coincidence.  I will discuss how we deal with this problem in a later installment of this series.
</p></div>
</div>

</div>

<div id="outline-container-4" class="outline-3">
<h3 id="sec-4">Generational garbage collection </h3>
<div class="outline-text-3" id="text-4">

<p>It is observed that for most workloads the majority of objects either die very quickly or live for a very long period of time.  One can take advantage of this so-called "generational hypothesis" by dividing the heap into two or more "generations" that are collected at different frequencies.
</p>
<p>
Objects begin their lives in the first generation, also referred to as the "nursery".  If they manage to stick around long enough they get "promoted" to the second generation, and so on.  Since all objects are born in the nursery, it fills up very quickly and needs to be collected often.  If the generational hypothesis holds, only a small fraction of those objects will make it to the next generation, so that generation needs to be collected less frequently.  Also, it is expected that while only a small fraction of the objects in the nursery will survive a collection, the percentage will be higher for older generations, so a collection algorithm that is better suited to high survival rates can be used for those.  Some collectors go so far as to give objects "tenure" at some point, making them immortal so they don't burden the collection anymore.
</p>
<p>
One difficulty with generational collection is that it's not quite possible to collect a young generation without looking at the older generations at all, because the mutator might have stored a reference to a young generation object in an older generation object.  Even if that young object is not referenced from anywhere else it must still be considered alive.  Clearly scanning all objects in older generations for such references would defeat the whole purpose of separating them in the first place.  To address this problem generational collectors make the mutator register all new references from old to young generations.  This registration is referred to as a "write barrier" and will be discussed in a later installment.  It is also possible to register reads instead of writes, with a "read barrier", but this is impractical without hardware support.  It's obvious that the write barrier must be very efficient since it's invoked for every write to a reference field or array member.
</p></div>

</div>

<div id="outline-container-5" class="outline-3">
<h3 id="sec-5">SGen </h3>
<div class="outline-text-3" id="text-5">

<p>SGen, which historically stands for "Simple Generational", is a generational garbage collector with two generations.  The nursery is collected with a copying collector that immediately promotes all live objects, if possible, to the old generation (or "major heap").
</p>
<p>
For the old generation we have two collectors that the user can choose between: A mark-and-sweep and a copying collector.  The mark-and-sweep collector is available in four variants, with and without a parallel mark phase and with a dynamic or fixed-size heap.
</p>
<p>
In addition to that SGen has a separate space for large objects (larger than 8000 bytes), called the "Large Object Space", or "LOS", which is logically part of the old generation.  Large objects are not born in the nursery but directly in the LOS, and they are not moved.
</p>
<p>
One major difficulty SGen has to deal with is objects that are "pinned", which means that they cannot be moved.  The reason for this is usually that they are referenced from stack frames that we cannot scan "precisely", i.e. for which we do not know what is a reference and what is not.  Work is currently under way to scan managed stack frames precisely, but we will always have to handle unmanaged stack frames, too, for which no type information is available and which can therefore only be scanned "conservatively".  More on this in one of the following posts.
</p></div>

</div>

<div id="outline-container-6" class="outline-3">
<h3 id="sec-6">Future installments </h3>
<div class="outline-text-3" id="text-6">

<p>Having covered a few basic concepts of garbage collection I shall go into more detail about SGen in the blog posts following this one. Here is a tentative list of the posts I intend to write in the coming weeks.  Comments and suggestions are of course welcome.
</p>
</div>

<div id="outline-container-6_1" class="outline-4">
<h4 id="sec-6_1">Allocating objects and the minor collector </h4>
<div class="outline-text-4" id="text-6_1">

<p>How is the nursery organized, how are new objects allocated and how is the nursery collected?
</p></div>

</div>

<div id="outline-container-6_2" class="outline-4">
<h4 id="sec-6_2">The major collectors </h4>
<div class="outline-text-4" id="text-6_2">

<p>An overview of SGen's major collectors, mark-and-sweep and copying. I'll discuss how they organize the heap and how they collect.
</p></div>

</div>

<div id="outline-container-6_3" class="outline-4">
<h4 id="sec-6_3">Sundry </h4>
<div class="outline-text-4" id="text-6_3">

<p>Several shorter topics: Large objects, GC descriptors, conservative scanning, pinning, write barriers, finalization, weak references and domain unloading.
</p></div>

</div>

<div id="outline-container-6_4" class="outline-4">
<h4 id="sec-6_4">Debugging a garbage collector </h4>
<div class="outline-text-4" id="text-6_4">

<p>Finding bugs in a garbage collector is ridiculously hard.  Here I'll describe some of the tools we use for assistance.
</p></div>

</div>

<div id="outline-container-6_5" class="outline-4">
<h4 id="sec-6_5">Getting the best performance out of SGen </h4>
<div class="outline-text-4" id="text-6_5">

<p>How to fine-tune SGen for a workload and how to write applications to take advantage of SGen's performance characteristics.
</p></div>

</div>

<div id="outline-container-6_6" class="outline-4">
<h4 id="sec-6_6">The future </h4>
<div class="outline-text-4" id="text-6_6">

<p>What's in store for SGen in the future?
</p></div>
</div>
</div>
]]></description>
			<content:encoded><![CDATA[<div id="outline-container-1" class="outline-3">
<h3 id="sec-1">Introduction </h3>
<div class="outline-text-3" id="text-1">
<p>SGen is Mono&#8217;s new garbage collector that we have been working on intensely for almost two years now and that has been becoming <a href="http://lists.ximian.com/pipermail/mono-devel-list/2010-December/036513.html">stable and competitive</a> during the past few months.
</p>
<p>
In this series of blog posts I will try to explain how garbage collection works in general, how SGen works in particular, how to get the best performance out of it and, finally, what our plans are for the future.
</p>
<p>
This first post will give a very brief overview of what a garbage collector does and how it does it, before outlining the broad architecture of SGen.
</p>
</div>
</div>
<div id="outline-container-2" class="outline-3">
<h3 id="sec-2">Why a new garbage collector? </h3>
<div class="outline-text-3" id="text-2">
<p>Since its inception Mono, like many other garbage collected runtimes, has been using the <a href="http://www.hpl.hp.com/personal/Hans_Boehm/gc/">Boehm-Demers-Weiser collector</a>, which I shall refer to as &#8220;Boehm&#8221; henceforth.  Boehm&#8217;s main advantages are its portability, stability and the ease with which it can be embedded.  It is designed as a garbage collector for C and C++, not for managed languages, however, so it does not come as a surprise that it falls short for that purpose compared to collectors dedicated to such languages or runtimes.
</p>
<p>
Our goal with SGen was to overcome Boehm&#8217;s limitations and provide better performance for managed applications, in terms of allocation speed, collector pauses as well as memory utilization.  In this and the following posts of this series I will mention points where we improve upon Boehm whenever appropriate.
</p>
</div>
</div>
<div id="outline-container-3" class="outline-3">
<h3 id="sec-3">Garbage collection </h3>
<div class="outline-text-3" id="text-3">
<p>Before digging into the details of SGen it seems prudent to discuss how garbage collection actually works.  I will only discuss topics that are relevant to SGen, and even paint those only with very broad strokes.  For more comprehensive overviews of garbage collection see <a href="ftp://ftp.cs.utexas.edu/pub/garbage/bigsurv.ps">&#8220;Uniprocessor Garbage Collection Techniques&#8221; by Wilson</a> or, for more detailed information, <a href="http://www.amazon.com/gp/product/0471941484%3Fie=UTF8&amp;tag=juggphotsofta-20&amp;linkCode=as2&amp;camp=1789&amp;creative=9325&amp;creativeASIN=0471941484">Jones and Lins&#8217;s book</a>.
</p>
<p>
The garbage collector, in short, is that part of the language&#8217;s or platform&#8217;s runtime that keeps a program from exhausting the computer&#8217;s memory, despite the lack of an explicit way to free objects that have been allocated.  It does that by discovering which objects might still be reached by the program and getting rid of the rest.  It considers as reachable objects those that are directly or indirectly referenced by the so-called &#8220;roots&#8221;, which are global variables, local variables on the stacks or in registers and other references explicitly registered as roots.  In other words, all those objects are considered reachable, as well as those that they reference, and those that they reference, and so on.  The pedantic will note that the program might not actually be able to reach all of those.  Computing the minimum set of objects that the program could reach is impossible, however, which we know thanks to <a href="http://www.amazon.com/gp/product/0470229055%3Fie=UTF8&amp;tag=juggphotsofta-20&amp;linkCode=as2&amp;camp=1789&amp;creative=9325&amp;creativeASIN=0470229055">this</a> <a href="http://www.amazon.com/gp/product/0802775802%3Fie=UTF8&amp;tag=juggphotsofta-20&amp;linkCode=as2&amp;camp=1789&amp;creative=9325&amp;creativeASIN=0802775802">guy</a>.
</p>
<p>
Both SGen and Boehm are &#8220;stop-the-world&#8221; collectors, meaning that they do their work while the actual program, slightingly called the &#8220;mutator&#8221; in garbage collection lingo, is stopped.
</p>
<p>
There are three classic garbage collection algorithms: mark-and-sweep, copying, and compaction.  Only the first two are relevant to our discussion of SGen, since we have not implemented a compacting collector and have no plans to do so in the foreseeable future.
</p>
</div>
<div id="outline-container-3_1" class="outline-4">
<h4 id="sec-3_1">Mark-and-Sweep </h4>
<div class="outline-text-4" id="text-3_1">
<p>Mark-and-sweep is the oldest and probably most widely implemented garbage collection algorithm.  Boehm is a mark-and-sweep collector.
</p>
<p>
The idea is to have a &#8220;mark&#8221; bit on each object that is set on all reachable ones in the &#8220;mark&#8221; phase of the collection, by recursively traversing references starting from the root set.  Of course in practice this is not actually implemented recursively, but by using a queue or stack.
</p>
<p>
The &#8220;sweep&#8221; phase then frees the memory occupied by the objects that were not marked and resets the mark bits on the marked ones.
</p>
<p>
Of course many variants of this algorithm exist that vary in details.
</p>
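<p>
To make the two phases concrete, here is a minimal sketch in Python.  It is only an illustration, not how Boehm or SGen are implemented: the toy heap, the dictionary layout of an object and all the function names are assumptions made up for this example.  Note the explicit stack in the mark phase instead of recursion.
</p>

```python
# Toy model (an assumption for illustration): the heap is a list of objects,
# and each object is a dict with a mark bit and a list of references.
def new_object(heap, name):
    obj = {"name": name, "marked": False, "refs": []}
    heap.append(obj)
    return obj

def mark(roots):
    # Mark phase: traverse references from the roots with an explicit stack.
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if obj["marked"]:
            continue  # already visited, which also makes cycles harmless
        obj["marked"] = True
        stack.extend(obj["refs"])

def sweep(heap):
    # Sweep phase: visit every object, keep the marked ones and reset
    # their mark bits; everything else is dropped ("freed").
    survivors = []
    for obj in heap:
        if obj["marked"]:
            obj["marked"] = False
            survivors.append(obj)
    return survivors

def collect(heap, roots):
    mark(roots)
    return sweep(heap)

heap = []
a = new_object(heap, "a")
b = new_object(heap, "b")
c = new_object(heap, "c")
d = new_object(heap, "d")
a["refs"].append(b)   # a is a root and references b
c["refs"].append(d)   # c and d reference each other,
d["refs"].append(c)   # but nothing reachable points at them

heap = collect(heap, roots=[a])
live_names = sorted(obj["name"] for obj in heap)
```

<p>
In this sketch only <code>a</code> and <code>b</code> survive; the unreachable cycle formed by <code>c</code> and <code>d</code> is collected.  The sweep loop also shows the first weakness discussed in the next section: it touches every object, dead or alive.
</p>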
</div>
</div>
<div id="outline-container-3_2" class="outline-4">
<h4 id="sec-3_2">Copying </h4>
<div class="outline-text-4" id="text-3_2">
<p>Traditional mark-and-sweep has two main weaknesses.  First, it needs to visit all objects, including the unreachable ones.  Second, it can suffer from memory fragmentation like a <code>malloc/free</code> allocator.
</p>
<p>
A copying collector addresses both problems by copying all reachable objects to a new memory region, allocating them linearly one after the other.  The old memory region can then be discarded wholesale.  The classic copying collector is the so-called &#8220;semi-space&#8221; algorithm, where the heap is divided into two halves.  Allocation is done linearly from one half until it is full, at which point the collector copies the reachable objects to the second half.  Allocation proceeds from there, and at the next collection the now empty first half is used as the copying destination, the so-called &#8220;to-space&#8221;.
</p>
<p>
Since with a copying collector objects change their addresses it is obvious that the collector also has to update all the references to the objects that it moves.  In Mono this is not always possible because we cannot in all cases unambiguously identify whether a value in memory (usually on the stack) is really a reference or just a number that points to an object by coincidence.  I will discuss how we deal with this problem in a later installment of this series.
</p>
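<p>
The following Python sketch shows the copying step of a semi-space collector, including the reference updates.  It is a toy model under stated assumptions, not SGen&#8217;s code: an &#8220;address&#8221; is simply an index into a list, every reference is known precisely, and the scan loop follows the well-known Cheney approach, which I picked for brevity.  All names are hypothetical.
</p>

```python
def copy_collect(from_space, roots):
    """Copy everything reachable from `roots` into a fresh to-space.

    from_space: list of objects, each {"value": ..., "refs": [addresses]}
    roots:      list of from-space addresses
    Returns (to_space, new_roots) with all addresses updated.
    """
    to_space = []
    forwarding = {}  # old address -> new address of objects already copied

    def forward(addr):
        # Copy the object on first visit; afterwards just report where it went.
        if addr not in forwarding:
            forwarding[addr] = len(to_space)  # linear allocation in to-space
            old = from_space[addr]
            to_space.append({"value": old["value"], "refs": list(old["refs"])})
        return forwarding[addr]

    new_roots = [forward(r) for r in roots]

    # Scan the copied objects in order and fix up their references.
    # Forwarding a reference may append more objects, which the loop
    # will reach later.
    scan = 0
    while scan != len(to_space):
        obj = to_space[scan]
        obj["refs"] = [forward(r) for r in obj["refs"]]
        scan += 1

    return to_space, new_roots

from_space = [
    {"value": "garbage", "refs": []},   # address 0, unreachable
    {"value": "x", "refs": [3]},        # address 1, the root
    {"value": "junk", "refs": [1]},     # address 2, unreachable
    {"value": "y", "refs": [1]},        # address 3, reachable via x
]
to_space, new_roots = copy_collect(from_space, roots=[1])
```

<p>
Unlike the sweep phase of mark-and-sweep, this never looks at the two unreachable objects, and the survivors end up packed next to each other.  It also only works because every reference can be rewritten, which is exactly what conservative scanning prevents.
</p>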
</div>
</div>
</div>
<div id="outline-container-4" class="outline-3">
<h3 id="sec-4">Generational garbage collection </h3>
<div class="outline-text-3" id="text-4">
<p>It is observed that for most workloads the majority of objects either die very quickly or live for a very long period of time.  One can take advantage of this so-called &#8220;generational hypothesis&#8221; by dividing the heap into two or more &#8220;generations&#8221; that are collected at different frequencies.
</p>
<p>
Objects begin their lives in the first generation, also referred to as the &#8220;nursery&#8221;.  If they manage to stick around long enough they get &#8220;promoted&#8221; to the second generation, and so on.  Since all objects are born in the nursery, it fills up very quickly and needs to be collected often.  If the generational hypothesis holds, only a small fraction of those objects will make it to the next generation, so that generation needs to be collected less frequently.  Also, it is expected that while only a small fraction of the objects in the nursery will survive a collection, the percentage will be higher for older generations, so a collection algorithm that is better suited to high survival rates can be used for those.  Some collectors go so far as to give objects &#8220;tenure&#8221; at some point, making them immortal so they don&#8217;t burden the collection anymore.
</p>
<p>
One difficulty with generational collection is that it&#8217;s not quite possible to collect a young generation without looking at the older generations at all, because the mutator might have stored a reference to a young generation object in an older generation object.  Even if that young object is not referenced from anywhere else it must still be considered alive.  Clearly scanning all objects in older generations for such references would defeat the whole purpose of separating them in the first place.  To address this problem generational collectors make the mutator register all new references from old to young generations.  This registration is referred to as a &#8220;write barrier&#8221; and will be discussed in a later installment.  It is also possible to register reads instead of writes, with a &#8220;read barrier&#8221;, but this is impractical without hardware support.  It&#8217;s obvious that the write barrier must be very efficient since it&#8217;s invoked for every write to a reference field or array member.
</p>
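<p>
As a rough illustration of the idea, here is a Python sketch of a write barrier that records old objects holding references to young ones in a &#8220;remembered set&#8221;.  This is one common way of doing the registration and an assumption for the example, not a description of SGen&#8217;s barrier; the class and function names are made up.
</p>

```python
class Obj:
    def __init__(self, name, generation):
        self.name = name
        self.generation = generation  # "young" or "old"
        self.fields = {}

# Old objects that may contain references into the young generation.
remembered_set = set()

def is_young(value):
    return isinstance(value, Obj) and value.generation == "young"

def write_field(holder, field, value):
    # The write barrier: every reference store goes through here,
    # so it has to be cheap.
    if holder.generation == "old" and is_young(value):
        remembered_set.add(holder)
    holder.fields[field] = value

def live_young(roots):
    # Minor collection: start from the roots plus the remembered old
    # objects, and never traverse the rest of the old generation.
    stack = [r for r in roots if is_young(r)]
    for holder in remembered_set:
        stack.extend(v for v in holder.fields.values() if is_young(v))
    live = set()
    while stack:
        obj = stack.pop()
        if obj in live:
            continue
        live.add(obj)
        stack.extend(v for v in obj.fields.values() if is_young(v))
    return live

old_cache = Obj("cache", "old")
entry = Obj("entry", "young")
temp = Obj("temp", "young")
write_field(old_cache, "entry", entry)  # old -> young: gets remembered
write_field(temp, "peer", entry)        # young -> young: nothing to record

live = live_young(roots=[])
live_names = sorted(obj.name for obj in live)
```

<p>
Here <code>entry</code> survives the minor collection solely because of the remembered reference from <code>old_cache</code>, while <code>temp</code> is garbage.
</p>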
</div>
</div>
<div id="outline-container-5" class="outline-3">
<h3 id="sec-5">SGen </h3>
<div class="outline-text-3" id="text-5">
<p>SGen, which historically stands for &#8220;Simple Generational&#8221;, is a generational garbage collector with two generations.  The nursery is collected with a copying collector that immediately promotes all live objects, if possible, to the old generation (or &#8220;major heap&#8221;).
</p>
<p>
For the old generation we have two collectors that the user can choose between: A mark-and-sweep and a copying collector.  The mark-and-sweep collector is available in four variants, with and without a parallel mark phase and with a dynamic or fixed-size heap.
</p>
<p>
In addition to that SGen has a separate space for large objects (larger than 8000 bytes), called the &#8220;Large Object Space&#8221;, or &#8220;LOS&#8221;, which is logically part of the old generation.  Large objects are not born in the nursery but directly in the LOS, and they are not moved.
</p>
<p>
One major difficulty SGen has to deal with is objects that are &#8220;pinned&#8221;, which means that they cannot be moved.  The reason for this is usually that they are referenced from stack frames that we cannot scan &#8220;precisely&#8221;, i.e. for which we do not know what is a reference and what is not.  Work is currently under way to scan managed stack frames precisely, but we will always have to handle unmanaged stack frames, too, for which no type information is available and which can therefore only be scanned &#8220;conservatively&#8221;.  More on this in one of the following posts.
</p>
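<p>
A small Python sketch may help to show why conservative scanning leads to pinning.  It is a simplification with made-up names and addresses, and it assumes that any stack word falling inside an object, not only one equal to its start address, counts as a potential reference.
</p>

```python
def find_pinned(stack_words, objects):
    """Return the start addresses of objects that must not be moved.

    stack_words: raw machine words found on a stack we cannot scan precisely
    objects:     dict mapping an object's start address to its size in bytes
    """
    pinned = set()
    for word in stack_words:
        for start, size in objects.items():
            # The word might be a pointer or just an integer that happens
            # to look like one.  We cannot tell, so we must not move the
            # object and must not change the word either.
            if word in range(start, start + size):
                pinned.add(start)
    return pinned

objects = {0x1000: 32, 0x1020: 16, 0x2000: 64}
stack_words = [7, 0x1008, 0x2040, 0x1020]
pinned = find_pinned(stack_words, objects)
```

<p>
In this example the objects at <code>0x1000</code> and <code>0x1020</code> end up pinned, whereas <code>0x2040</code> lies just past the end of the third object and pins nothing.  A real collector would of course use a faster lookup than this nested loop.
</p>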
</div>
</div>
<div id="outline-container-6" class="outline-3">
<h3 id="sec-6">Future installments </h3>
<div class="outline-text-3" id="text-6">
<p>Having covered a few basic concepts of garbage collection I shall go into more detail about SGen in the blog posts following this one. Here is a tentative list of the posts I intend to write in the coming weeks.  Comments and suggestions are of course welcome.
</p>
</div>
<div id="outline-container-6_1" class="outline-4">
<h4 id="sec-6_1">Allocating objects and the minor collector </h4>
<div class="outline-text-4" id="text-6_1">
<p>How is the nursery organized, how are new objects allocated and how is the nursery collected?
</p>
</div>
</div>
<div id="outline-container-6_2" class="outline-4">
<h4 id="sec-6_2">The major collectors </h4>
<div class="outline-text-4" id="text-6_2">
<p>An overview of SGen&#8217;s major collectors, mark-and-sweep and copying. I&#8217;ll discuss how they organize the heap and how they collect.
</p>
</div>
</div>
<div id="outline-container-6_3" class="outline-4">
<h4 id="sec-6_3">Sundry </h4>
<div class="outline-text-4" id="text-6_3">
<p>Several shorter topics: Large objects, GC descriptors, conservative scanning, pinning, write barriers, finalization, weak references and domain unloading.
</p>
</div>
</div>
<div id="outline-container-6_4" class="outline-4">
<h4 id="sec-6_4">Debugging a garbage collector </h4>
<div class="outline-text-4" id="text-6_4">
<p>Finding bugs in a garbage collector is ridiculously hard.  Here I&#8217;ll describe some of the tools we use for assistance.
</p>
</div>
</div>
<div id="outline-container-6_5" class="outline-4">
<h4 id="sec-6_5">Getting the best performance out of SGen </h4>
<div class="outline-text-4" id="text-6_5">
<p>How to fine-tune SGen for a workload and how to write applications to take advantage of SGen&#8217;s performance characteristics.
</p>
</div>
</div>
<div id="outline-container-6_6" class="outline-4">
<h4 id="sec-6_6">The future </h4>
<div class="outline-text-4" id="text-6_6">
<p>What&#8217;s in store for SGen in the future?
</p>
</div>
</div>
</div>
]]></content:encoded>
			<wfw:commentRss>http://schani.wordpress.com/2010/12/20/sgen/feed/</wfw:commentRss>
		<slash:comments>22</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/daf7ba6f06480c52ac459772f2bb5268?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">schani</media:title>
		</media:content>
	</item>
		<item>
		<title>Combat</title>
		<link>http://schani.wordpress.com/2010/09/20/combat/</link>
		<comments>http://schani.wordpress.com/2010/09/20/combat/#comments</comments>
		<pubDate>Mon, 20 Sep 2010 12:48:18 +0000</pubDate>
		<dc:creator>schani</dc:creator>
				<category><![CDATA[Juggling]]></category>
		<category><![CDATA[Photography]]></category>
		<category><![CDATA[combat]]></category>
		<category><![CDATA[game]]></category>

		<guid isPermaLink="false">http://schani.wordpress.com/?p=131</guid>
		<description><![CDATA[<p><a href="http://www.facebook.com/photo.php?pid=5395787&#38;l=910fd5a8a9&#38;id=503328245"><img src="http://www.complang.tuwien.ac.at/schani/blog/combat/_MG_3179.jpg" alt="Combat Photograph" /></a></p>
<p><a href="http://www.facebook.com/photo.php?pid=5395790&#38;l=3e84c17196&#38;id=503328245"><img src="http://www.complang.tuwien.ac.at/schani/blog/combat/_MG_3228.jpg" alt="Combat Photograph" /></a></p>
<p><a href="http://www.facebook.com/photo.php?pid=5395800&#38;l=d053833997&#38;id=503328245"><img src="http://www.complang.tuwien.ac.at/schani/blog/combat/_MG_3286.jpg" alt="Combat Photograph" /></a></p>
<p><a href="http://www.facebook.com/photo.php?pid=5395806&#38;l=394f86c81b&#38;id=503328245"><img src="http://www.complang.tuwien.ac.at/schani/blog/combat/_MG_3376.jpg" alt="Combat Photograph" /></a></p>

<p><a href="http://www.facebook.com/album.php?aid=234729&#38;id=503328245&#38;l=9c6053de25">More</a></p>
]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.facebook.com/photo.php?pid=5395787&amp;l=910fd5a8a9&amp;id=503328245"><img src="http://www.complang.tuwien.ac.at/schani/blog/combat/_MG_3179.jpg" alt="Combat Photograph" /></a></p>
<p><a href="http://www.facebook.com/photo.php?pid=5395790&amp;l=3e84c17196&amp;id=503328245"><img src="http://www.complang.tuwien.ac.at/schani/blog/combat/_MG_3228.jpg" alt="Combat Photograph" /></a></p>
<p><a href="http://www.facebook.com/photo.php?pid=5395800&amp;l=d053833997&amp;id=503328245"><img src="http://www.complang.tuwien.ac.at/schani/blog/combat/_MG_3286.jpg" alt="Combat Photograph" /></a></p>
<p><a href="http://www.facebook.com/photo.php?pid=5395806&amp;l=394f86c81b&amp;id=503328245"><img src="http://www.complang.tuwien.ac.at/schani/blog/combat/_MG_3376.jpg" alt="Combat Photograph" /></a></p>
<p><a href="http://www.facebook.com/album.php?aid=234729&amp;id=503328245&amp;l=9c6053de25">More</a></p>
]]></content:encoded>
			<wfw:commentRss>http://schani.wordpress.com/2010/09/20/combat/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/daf7ba6f06480c52ac459772f2bb5268?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">schani</media:title>
		</media:content>

		<media:content url="http://www.complang.tuwien.ac.at/schani/blog/combat/_MG_3179.jpg" medium="image">
			<media:title type="html">Combat Photograph</media:title>
		</media:content>

		<media:content url="http://www.complang.tuwien.ac.at/schani/blog/combat/_MG_3228.jpg" medium="image">
			<media:title type="html">Combat Photograph</media:title>
		</media:content>

		<media:content url="http://www.complang.tuwien.ac.at/schani/blog/combat/_MG_3286.jpg" medium="image">
			<media:title type="html">Combat Photograph</media:title>
		</media:content>

		<media:content url="http://www.complang.tuwien.ac.at/schani/blog/combat/_MG_3376.jpg" medium="image">
			<media:title type="html">Combat Photograph</media:title>
		</media:content>
	</item>
		<item>
		<title>More Siteswap Puzzles</title>
		<link>http://schani.wordpress.com/2010/07/10/more-siteswap-puzzles/</link>
		<comments>http://schani.wordpress.com/2010/07/10/more-siteswap-puzzles/#comments</comments>
		<pubDate>Sat, 10 Jul 2010 23:56:45 +0000</pubDate>
		<dc:creator>schani</dc:creator>
				<category><![CDATA[Juggling]]></category>
		<category><![CDATA[Software]]></category>
		<category><![CDATA[clojure]]></category>
		<category><![CDATA[puzzle]]></category>
		<category><![CDATA[siteswap]]></category>
		<category><![CDATA[sudoku]]></category>

		<guid isPermaLink="false">http://schani.wordpress.com/?p=127</guid>
		<description><![CDATA[<div id="outline-container-1" class="outline-3">
<h3 id="sec-1">Introduction </h3>
<div class="outline-text-3" id="text-1">

<p>A few years ago I wrote about <a href="http://schani.wordpress.com/2006/04/09/a-siteswap-puzzle/">Sudoku-like puzzles involving siteswaps</a>. Back then I used Prolog with <a href="http://en.wikipedia.org/wiki/Constraint_logic_programming#Finite_domains">CLP/FD</a> to generate a few small puzzles. The use of Prolog had the unfortunate side-effect that it was rather hard to use the generator on one's own machine.  Due to <a href="http://www.siteswapgeneration.com/">popular demand</a> I've rewritten the generator on a more popular platform and in the process added a somewhat usable command-line interface and a few features.
</p></div>

</div>

<div id="outline-container-2" class="outline-3">
<h3 id="sec-2">Download and instructions </h3>
<div class="outline-text-3" id="text-2">

<p>The application can be <a href="http://github.com/schani/clj-siteswap-sudoku/downloads">downloaded from GitHub</a>.  For instructions, read the <a href="http://github.com/schani/clj-siteswap-sudoku/blob/master/README.md">README file</a>.
</p></div>

</div>

<div id="outline-container-3" class="outline-3">
<h3 id="sec-3">New features </h3>
<div class="outline-text-3" id="text-3">


</div>

<div id="outline-container-3.1" class="outline-4">
<h4 id="sec-3.1">Bigger puzzles </h4>
<div class="outline-text-4" id="text-3.1">

<p>The new generator can produce arbitrarily large puzzles, given enough time.  In practice, the limit seems to be at around 9x10.  Try to solve this one:
</p>
<table border="2" cellspacing="0" cellpadding="6" rules="groups">
<caption></caption>
<col align="left" /><col align="left" /><col align="left" /><col align="left" /><col align="left" /><col align="left" /><col align="left" /><col align="left" /><col align="left" /><col align="left" />

<tbody>
<tr><td>.</td><td>.</td><td>4</td><td>2</td><td>.</td><td>6</td><td>3</td><td>5</td><td>.</td><td>.</td></tr>
<tr><td>.</td><td>6</td><td>.</td><td>5</td><td>.</td><td>.</td><td>.</td><td>5</td><td>.</td><td>5</td></tr>
<tr><td>.</td><td>.</td><td>.</td><td>.</td><td>.</td><td>.</td><td>.</td><td>5</td><td>.</td><td>.</td></tr>
<tr><td>6</td><td>.</td><td>.</td><td>.</td><td>4</td><td>.</td><td>5</td><td>6</td><td>4</td><td>.</td></tr>
<tr><td>7</td><td>.</td><td>.</td><td>3</td><td>.</td><td>6</td><td>.</td><td>.</td><td>.</td><td>.</td></tr>
<tr><td>.</td><td>.</td><td>.</td><td>5</td><td>.</td><td>4</td><td>.</td><td>.</td><td>.</td><td>7</td></tr>
<tr><td>.</td><td>.</td><td>.</td><td>8</td><td>.</td><td>2</td><td>8</td><td>6</td><td>.</td><td>1</td></tr>
<tr><td>6</td><td>4</td><td>.</td><td>.</td><td>.</td><td>5</td><td>.</td><td>4</td><td>.</td><td>.</td></tr>
<tr><td>.</td><td>4</td><td>.</td><td>.</td><td>.</td><td>6</td><td>.</td><td>.</td><td>.</td><td>.</td></tr>
</tbody>
</table>

</div>

</div>

<div id="outline-container-3.2" class="outline-4">
<h4 id="sec-3.2">Allowed throws </h4>
<div class="outline-text-4" id="text-3.2">

<p>The old generator would always use throws 1 to 9.  Now the range is configurable.  The more throws you allow, the higher the number of possible solutions, which means that the puzzle must have fewer unknown elements to still be uniquely solvable.  The more you restrict the range of throws, however, the more unknowns the puzzle can take.  This one, for example, only allows throws 1 to 5:
</p>
<table border="2" cellspacing="0" cellpadding="6" rules="groups">
<caption></caption>
<col align="left" /><col align="left" /><col align="left" /><col align="left" /><col align="left" /><col align="left" /><col align="left" />

<tbody>
<tr><td>.</td><td>.</td><td>.</td><td>.</td><td>.</td><td>5</td><td>.</td></tr>
<tr><td>.</td><td>.</td><td>.</td><td>.</td><td>.</td><td>.</td><td>.</td></tr>
<tr><td>.</td><td>.</td><td>.</td><td>.</td><td>.</td><td>.</td><td>1</td></tr>
<tr><td>.</td><td>.</td><td>.</td><td>.</td><td>.</td><td>1</td><td>.</td></tr>
<tr><td>2</td><td>.</td><td>.</td><td>.</td><td>.</td><td>.</td><td>.</td></tr>
<tr><td>.</td><td>4</td><td>.</td><td>.</td><td>.</td><td>.</td><td>.</td></tr>
</tbody>
</table>

</div>

</div>

<div id="outline-container-3.3" class="outline-4">
<h4 id="sec-3.3">Arbitrary shapes </h4>
<div class="outline-text-4" id="text-3.3">

<p>There's no reason why the puzzles should always be rectangular in shape.  Here's a triangle:
</p>
<table border="2" cellspacing="0" cellpadding="6" rules="groups">
<caption></caption>
<col align="left" /><col align="left" /><col align="left" /><col align="left" /><col align="left" /><col align="left" /><col align="left" /><col align="left" /><col align="left" />

<tbody>
<tr><td>.</td><td></td><td></td><td></td><td></td><td></td><td></td><td></td><td></td></tr>
<tr><td>1</td><td>.</td><td></td><td></td><td></td><td></td><td></td><td></td><td></td></tr>
<tr><td>1</td><td>1</td><td>.</td><td></td><td></td><td></td><td></td><td></td><td></td></tr>
<tr><td>.</td><td>5</td><td>1</td><td>2</td><td></td><td></td><td></td><td></td><td></td></tr>
<tr><td>.</td><td>3</td><td>.</td><td>.</td><td>2</td><td></td><td></td><td></td><td></td></tr>
<tr><td>1</td><td>.</td><td>2</td><td>5</td><td>.</td><td>4</td><td></td><td></td><td></td></tr>
<tr><td>.</td><td>.</td><td>.</td><td>.</td><td>3</td><td>.</td><td>.</td><td></td><td></td></tr>
<tr><td>.</td><td>.</td><td>.</td><td>.</td><td>.</td><td>.</td><td>9</td><td>.</td><td></td></tr>
<tr><td>.</td><td>.</td><td>7</td><td>.</td><td>.</td><td>.</td><td>1</td><td>7</td><td>4</td></tr>
</tbody>
</table>


<p>
You can specify the shape of the puzzle in a simple text file, like <a href="http://github.com/schani/clj-siteswap-sudoku/blob/master/shapes/grid5x6">this one</a>.
</p></div>

</div>

<div id="outline-container-3.4" class="outline-4">
<h4 id="sec-3.4">Simple rules </h4>
<div class="outline-text-4" id="text-3.4">

<p>The rules for the old generator required that no two siteswaps in the puzzle be the same, which included all the rotations.  Many people found this confusing and stumbled over it, so I removed the restriction by default.  The old rules are still accessible through a command-line option.
</p></div>
</div>

</div>

<div id="outline-container-4" class="outline-3">
<h3 id="sec-4">The Implementation </h3>
<div class="outline-text-3" id="text-4">

<p>The new generator is written in Clojure, using the <a href="http://jacop.osolpro.com/">JaCoP finite domain constraint solver</a> to do the heavy lifting.  One of the nice features of JaCoP is that the order in which it goes through the range of a variable is configurable, including a random option.  Whereas in the old generator I had to pre-assign random values to a few of the elements to get a random puzzle, with JaCoP I can just specify the constraints for the puzzle and then let it generate a random one.
</p>
<p>
The second step in puzzle generation is the introduction of unknowns. In the new generator, the user has to specify how many unknowns they want in the puzzle.  The generator then removes a random subset of that size from the puzzle and checks if it's still uniquely solvable. The higher the number of unknowns, the smaller the chance that this is the case, meaning that the search will take longer or might indeed never end.  I haven't been able to find a 2x3 puzzle (with throws 1 to 9 and complex rules) with more than 3 unknowns, for example.
</p></div>

</div>

<div id="outline-container-5" class="outline-3">
<h3 id="sec-5">The Code </h3>
<div class="outline-text-3" id="text-5">

<p>If you want to play around with the code or are just curious, take a look at it on <a href="http://github.com/schani/clj-siteswap-sudoku">GitHub</a>.
</p></div>

</div>

<div id="outline-container-6" class="outline-3">
<h3 id="sec-6">More puzzles </h3>
<div class="outline-text-3" id="text-6">

<p>You can find a few more puzzles of varying difficulty, including solutions, <a href="http://github.com/schani/clj-siteswap-sudoku/blob/master/puzzles.org">here</a>.
</p></div>
</div>
]]></description>
			<content:encoded><![CDATA[<div id="outline-container-1" class="outline-3">
<h3 id="sec-1">Introduction </h3>
<div class="outline-text-3" id="text-1">
<p>A few years ago I wrote about <a href="http://schani.wordpress.com/2006/04/09/a-siteswap-puzzle/">Sudoku-like puzzles involving siteswaps</a>. Back then I used Prolog with <a href="http://en.wikipedia.org/wiki/Constraint_logic_programming#Finite_domains">CLP/FD</a> to generate a few small puzzles. The use of Prolog had the unfortunate side effect that it was rather hard to use the generator on one&#8217;s own machine.  Due to <a href="http://www.siteswapgeneration.com/">popular demand</a> I&#8217;ve rewritten the generator on a more popular platform and in the process added a somewhat usable command-line interface and a few features.
</p>
</div>
</div>
<div id="outline-container-2" class="outline-3">
<h3 id="sec-2">Download and instructions </h3>
<div class="outline-text-3" id="text-2">
<p>The application can be <a href="http://github.com/schani/clj-siteswap-sudoku/downloads">downloaded from GitHub</a>.  For instructions, read the <a href="http://github.com/schani/clj-siteswap-sudoku/blob/master/README.md">README file</a>.
</p>
</div>
</div>
<div id="outline-container-3" class="outline-3">
<h3 id="sec-3">New features </h3>
<div class="outline-text-3" id="text-3">
</div>
<div id="outline-container-3.1" class="outline-4">
<h4 id="sec-3.1">Bigger puzzles </h4>
<div class="outline-text-4" id="text-3.1">
<p>The new generator can produce arbitrarily large puzzles, given enough time.  In practice, the limit seems to be at around 9&#215;10.  Try to solve this one:
</p>
<table border="2" cellspacing="0" cellpadding="6" rules="groups">
<caption></caption>
<col align="left" />
<col align="left" />
<col align="left" />
<col align="left" />
<col align="left" />
<col align="left" />
<col align="left" />
<col align="left" />
<col align="left" />
<col align="left" />
<tbody>
<tr>
<td>.</td>
<td>.</td>
<td>4</td>
<td>2</td>
<td>.</td>
<td>6</td>
<td>3</td>
<td>5</td>
<td>.</td>
<td>.</td>
</tr>
<tr>
<td>.</td>
<td>6</td>
<td>.</td>
<td>5</td>
<td>.</td>
<td>.</td>
<td>.</td>
<td>5</td>
<td>.</td>
<td>5</td>
</tr>
<tr>
<td>.</td>
<td>.</td>
<td>.</td>
<td>.</td>
<td>.</td>
<td>.</td>
<td>.</td>
<td>5</td>
<td>.</td>
<td>.</td>
</tr>
<tr>
<td>6</td>
<td>.</td>
<td>.</td>
<td>.</td>
<td>4</td>
<td>.</td>
<td>5</td>
<td>6</td>
<td>4</td>
<td>.</td>
</tr>
<tr>
<td>7</td>
<td>.</td>
<td>.</td>
<td>3</td>
<td>.</td>
<td>6</td>
<td>.</td>
<td>.</td>
<td>.</td>
<td>.</td>
</tr>
<tr>
<td>.</td>
<td>.</td>
<td>.</td>
<td>5</td>
<td>.</td>
<td>4</td>
<td>.</td>
<td>.</td>
<td>.</td>
<td>7</td>
</tr>
<tr>
<td>.</td>
<td>.</td>
<td>.</td>
<td>8</td>
<td>.</td>
<td>2</td>
<td>8</td>
<td>6</td>
<td>.</td>
<td>1</td>
</tr>
<tr>
<td>6</td>
<td>4</td>
<td>.</td>
<td>.</td>
<td>.</td>
<td>5</td>
<td>.</td>
<td>4</td>
<td>.</td>
<td>.</td>
</tr>
<tr>
<td>.</td>
<td>4</td>
<td>.</td>
<td>.</td>
<td>.</td>
<td>6</td>
<td>.</td>
<td>.</td>
<td>.</td>
<td>.</td>
</tr>
</tbody>
</table>
</div>
</div>
<div id="outline-container-3.2" class="outline-4">
<h4 id="sec-3.2">Allowed throws </h4>
<div class="outline-text-4" id="text-3.2">
<p>The old generator would always use throws 1 to 9.  Now the range is configurable.  The more throws you allow, the higher the number of possible solutions, which means that the puzzle must have fewer unknown elements to still be uniquely solvable.  The more you restrict the range of throws, however, the more unknowns the puzzle can take.  This one, for example, only allows throws 1 to 5:
</p>
<table border="2" cellspacing="0" cellpadding="6" rules="groups">
<caption></caption>
<col align="left" />
<col align="left" />
<col align="left" />
<col align="left" />
<col align="left" />
<col align="left" />
<col align="left" />
<tbody>
<tr>
<td>.</td>
<td>.</td>
<td>.</td>
<td>.</td>
<td>.</td>
<td>5</td>
<td>.</td>
</tr>
<tr>
<td>.</td>
<td>.</td>
<td>.</td>
<td>.</td>
<td>.</td>
<td>.</td>
<td>.</td>
</tr>
<tr>
<td>.</td>
<td>.</td>
<td>.</td>
<td>.</td>
<td>.</td>
<td>.</td>
<td>1</td>
</tr>
<tr>
<td>.</td>
<td>.</td>
<td>.</td>
<td>.</td>
<td>.</td>
<td>1</td>
<td>.</td>
</tr>
<tr>
<td>2</td>
<td>.</td>
<td>.</td>
<td>.</td>
<td>.</td>
<td>.</td>
<td>.</td>
</tr>
<tr>
<td>.</td>
<td>4</td>
<td>.</td>
<td>.</td>
<td>.</td>
<td>.</td>
<td>.</td>
</tr>
</tbody>
</table>
</div>
</div>
<div id="outline-container-3.3" class="outline-4">
<h4 id="sec-3.3">Arbitrary shapes </h4>
<div class="outline-text-4" id="text-3.3">
<p>There&#8217;s no reason why the puzzles should always be rectangular in shape.  Here&#8217;s a triangle:
</p>
<table border="2" cellspacing="0" cellpadding="6" rules="groups">
<caption></caption>
<col align="left" />
<col align="left" />
<col align="left" />
<col align="left" />
<col align="left" />
<col align="left" />
<col align="left" />
<col align="left" />
<col align="left" />
<tbody>
<tr>
<td>.</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>1</td>
<td>.</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>1</td>
<td>1</td>
<td>.</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>.</td>
<td>5</td>
<td>1</td>
<td>2</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>.</td>
<td>3</td>
<td>.</td>
<td>.</td>
<td>2</td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>1</td>
<td>.</td>
<td>2</td>
<td>5</td>
<td>.</td>
<td>4</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>.</td>
<td>.</td>
<td>.</td>
<td>.</td>
<td>3</td>
<td>.</td>
<td>.</td>
<td></td>
<td></td>
</tr>
<tr>
<td>.</td>
<td>.</td>
<td>.</td>
<td>.</td>
<td>.</td>
<td>.</td>
<td>9</td>
<td>.</td>
<td></td>
</tr>
<tr>
<td>.</td>
<td>.</td>
<td>7</td>
<td>.</td>
<td>.</td>
<td>.</td>
<td>1</td>
<td>7</td>
<td>4</td>
</tr>
</tbody>
</table>
<p>
You can specify the shape of the puzzle in a simple text file, like <a href="http://github.com/schani/clj-siteswap-sudoku/blob/master/shapes/grid5x6">this one</a>.
</p>
</div>
</div>
<div id="outline-container-3.4" class="outline-4">
<h4 id="sec-3.4">Simple rules </h4>
<div class="outline-text-4" id="text-3.4">
<p>The rules for the old generator required that no two siteswaps in the puzzle be the same, which included all the rotations.  Many people found this confusing and stumbled over it, so I removed the restriction by default.  The old rules are still accessible through a command-line option.
</p>
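<p>The difference between the two rule sets can be sketched in a few lines of Python. This is only an illustration, not the generator's Clojure code: the function names are made up here, and the sketch assumes that "the same up to rotation" means that one throw sequence is a cyclic shift of the other.
</p>

```python
def canonical_rotation(siteswap):
    """Return the lexicographically smallest rotation of a non-empty throw sequence."""
    throws = tuple(siteswap)
    return min(throws[i:] + throws[:i] for i in range(len(throws)))


def all_distinct(siteswaps, up_to_rotation):
    """Check that no siteswap occurs twice, optionally treating rotations as equal."""
    if up_to_rotation:
        keys = [canonical_rotation(s) for s in siteswaps]
    else:
        keys = [tuple(s) for s in siteswaps]
    return len(set(keys)) == len(keys)


# 441 and 144 are rotations of each other, so they only clash
# when rotations are treated as the same siteswap.
print(all_distinct([[4, 4, 1], [1, 4, 4]], up_to_rotation=True))
print(all_distinct([[4, 4, 1], [1, 4, 4]], up_to_rotation=False))
```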
</div>
</div>
</div>
<div id="outline-container-4" class="outline-3">
<h3 id="sec-4">The Implementation </h3>
<div class="outline-text-3" id="text-4">
<p>The new generator is written in Clojure, using the <a href="http://jacop.osolpro.com/">JaCoP finite domain constraint solver</a> to do the heavy lifting.  One of the nice features of JaCoP is that the order in which it goes through the range of a variable is configurable, including a random option.  Whereas in the old generator I had to pre-assign random values to a few of the elements to get a random puzzle, with JaCoP I can just specify the constraints for the puzzle and then let it generate a random one.
</p>
<p>
The second step in puzzle generation is the introduction of unknowns. In the new generator, the user has to specify how many unknowns they want in the puzzle.  The generator then removes a random subset of that size from the puzzle and checks if it&#8217;s still uniquely solvable. The higher the number of unknowns, the smaller the chance that this is the case, meaning that the search will take longer or might indeed never end.  I haven&#8217;t been able to find a 2&#215;3 puzzle (with throws 1 to 9 and complex rules) with more than 3 unknowns, for example.
</p>
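<p>The hole-punching step can be sketched roughly as follows. This is a toy Python version, not the actual Clojure/JaCoP code: it assumes that every row and every column of the grid must be a valid siteswap, it replaces the constraint solver with plain brute force (so it is only usable for tiny grids), and all the names in it are invented for the illustration.
</p>

```python
import itertools
import random


def is_siteswap(throws):
    """A throw sequence is valid if no two throws land on the same beat."""
    n = len(throws)
    return len({(i + t) % n for i, t in enumerate(throws)}) == n


def is_valid_grid(grid):
    """Assumed rule: every row and every column must be a siteswap."""
    rows = [list(row) for row in grid]
    cols = [list(col) for col in zip(*grid)]
    return all(is_siteswap(line) for line in rows + cols)


def count_solutions(puzzle, throws, limit=2):
    """Count the ways to fill the unknowns (None), stopping once `limit` is reached."""
    holes = [(r, c) for r, row in enumerate(puzzle)
             for c, value in enumerate(row) if value is None]
    found = 0
    for values in itertools.product(throws, repeat=len(holes)):
        grid = [list(row) for row in puzzle]
        for (r, c), value in zip(holes, values):
            grid[r][c] = value
        if is_valid_grid(grid):
            found += 1
            if found >= limit:
                break
    return found


def punch_holes(solution, unknowns, throws, rng, max_tries=1000):
    """Blank out random cells until the puzzle is uniquely solvable, or give up."""
    cells = [(r, c) for r in range(len(solution))
             for c in range(len(solution[0]))]
    for _ in range(max_tries):
        puzzle = [list(row) for row in solution]
        for r, c in rng.sample(cells, unknowns):
            puzzle[r][c] = None
        if count_solutions(puzzle, throws) == 1:
            return puzzle
    return None  # no uniquely solvable puzzle found within max_tries


solved = [[3, 3, 3], [3, 3, 3]]
print(punch_holes(solved, 1, range(1, 6), random.Random(0)))
```

<p>The <code>max_tries</code> cut-off is there because, as noted above, the search may never succeed if too many unknowns are requested.
</p>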
</div>
</div>
<div id="outline-container-5" class="outline-3">
<h3 id="sec-5">The Code </h3>
<div class="outline-text-3" id="text-5">
<p>If you want to play around with the code or are just curious, take a look at it on <a href="http://github.com/schani/clj-siteswap-sudoku">GitHub</a>.
</p>
</div>
</div>
<div id="outline-container-6" class="outline-3">
<h3 id="sec-6">More puzzles </h3>
<div class="outline-text-3" id="text-6">
<p>You can find a few more puzzles of varying difficulty, including solutions, <a href="http://github.com/schani/clj-siteswap-sudoku/blob/master/puzzles.org">here</a>.
</p>
</div>
</div>
]]></content:encoded>
			<wfw:commentRss>http://schani.wordpress.com/2010/07/10/more-siteswap-puzzles/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/daf7ba6f06480c52ac459772f2bb5268?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">schani</media:title>
		</media:content>
	</item>
		<item>
		<title>The 2010 ICFP Programming Contest</title>
		<link>http://schani.wordpress.com/2010/06/27/the-2010-icfp-programming-contest/</link>
		<comments>http://schani.wordpress.com/2010/06/27/the-2010-icfp-programming-contest/#comments</comments>
		<pubDate>Sun, 27 Jun 2010 02:41:20 +0000</pubDate>
		<dc:creator>schani</dc:creator>
				<category><![CDATA[Software]]></category>
		<category><![CDATA[clojure]]></category>
		<category><![CDATA[icfp]]></category>
		<category><![CDATA[ocaml]]></category>
		<category><![CDATA[programming contest]]></category>

		<guid isPermaLink="false">http://schani.wordpress.com/?p=116</guid>
		<description><![CDATA[<div id="outline-container-1" class="outline-3">
<h3 id="sec-1">Introduction </h3>
<div class="outline-text-3" id="text-1">

<p>Last weekend several hundred programming teams all over the world participated in the <a href="http://icfpcontest.org/2010/">2010 ICFP Programming Contest</a>.  Among them, of course, was the glorious team <a href="http://schani.wordpress.com/2009/07/11/the-icfp-programming-contest-2009/">Funktion im Kopf der Mensch</a>.
</p></div>

</div>

<div id="outline-container-2" class="outline-3">
<h3 id="sec-2">The task </h3>
<div class="outline-text-3" id="text-2">

<p>This year's <a href="http://icfpcontest.org/2010/task/">task</a> was multi-layered.  There was a core problem, but to even get to the point where we could submit part of a solution to it, we had to first make our way through a couple of secondary tasks.
</p>
</div>

<div id="outline-container-2.1" class="outline-4">
<h4 id="sec-2.1">The core task </h4>
<div class="outline-text-4" id="text-2.1">

<p>The primary task was to submit "cars" and "fuels" to the organizers' submission server.  "Cars" are data structures describing operations performed on vectors called "air", parameterized by matrices called "fuel".  The important point here is that a car doesn't necessarily work with every fuel.  Points were awarded only for fuel.  The fewer fuels were submitted for a specific car by different teams, the more those few teams scored for their fuels.  Among fuels submitted for the same car, the one that was "shortest" scored the most.
</p></div>

</div>

<div id="outline-container-2.2" class="outline-4">
<h4 id="sec-2.2">Encodings </h4>
<div class="outline-text-4" id="text-2.2">

<p>Cars and fuels could not be submitted directly, however.  Both had to be encoded as ternary strings, in an undisclosed format.  What's more, before we could submit a car we had to submit a fuel for an existing car first, and fuels had an additional encoding step added - fuel factories.
</p></div>

</div>

<div id="outline-container-2.3" class="outline-4">
<h4 id="sec-2.3">Fuel factories </h4>
<div class="outline-text-4" id="text-2.3">

<p>Before a ternary fuel string could be submitted, it had to be encoded as a "fuel factory".  A fuel factory is a circuit of individual gates of identical function.  Each gate has two inputs and two outputs, all of which have to be wired.  The whole circuit has one input and one output.  The gate function was not disclosed.
</p>
<p>
To make it possible to figure out how the gates work, the submission server would accept circuits and give back the first 17 "trits" of the circuit's output.  By submitting an empty circuit, wiring the circuit input to the output directly, without any intervening gates, we learned the first 17 trits of the input.  Next we let the server produce the output for all circuits containing one (wired) gate.  Those four output strings were enough to determine the gate function by brute-forcing (trying all possible gate functions and seeing which ones fit).  We assumed (correctly) that the gates have no state.
</p></div>

</div>

<div id="outline-container-2.4" class="outline-4">
<h4 id="sec-2.4">The key </h4>
<div class="outline-text-4" id="text-2.4">

<p>Another roadblock on the way to submitting fuels was that the fuel circuit had to produce a string that contained not only the encoded fuel but also a 17-trit prefix, the "key", before it.  In order to gain the right to submit a car we first had to produce a circuit that output this prefix.  We brute-forced this circuit as well, trying all combinations of gates until we found one that generated the key.  The <a href="http://github.com/schani/icfp-2010/blob/master/circuits/key-generators">shortest ones</a> have 6 gates.
</p>
<p>
Now that we could submit cars as well as fuels, we had two problems to work on in parallel: How are cars and fuels encoded, and how do we produce a circuit that generates a given, arbitrarily long, ternary string?
</p></div>

</div>

<div id="outline-container-2.5" class="outline-4">
<h4 id="sec-2.5">Cars </h4>
<div class="outline-text-4" id="text-2.5">

<p>We knew from the task description that cars were encoded as a combination of lists, tuples, and natural numbers.  Discovering the actual encoding was aided not only by the availability of one sample car and several cars already submitted by other teams, but also by the contest organizers' server, which gave helpful error messages for incorrectly encoded cars.
</p>
<p>
<img src="http://www.complang.tuwien.ac.at/schani/blog/icfp2010/blackboard2.jpg" alt="http://www.complang.tuwien.ac.at/schani/blog/icfp2010/blackboard2.jpg" />
</p>
<p>
Here are a few examples of numbers and lists of numbers.  Note that the encoding assumes that the type of the encoded object is known, i.e., the encoding of a particular list and some number might be the same, but since it is known which one of the two types is expected, that is not a problem.
</p>
<table border="2" cellspacing="0" cellpadding="6" rules="groups">
<caption></caption>
<col align="right" /><col align="right" />
<thead>
<tr><th scope="col">Thing</th><th scope="col">Encoding</th></tr>
</thead>
<tbody>
<tr><td>0</td><td>0</td></tr>
<tr><td>1</td><td>10</td></tr>
<tr><td>2</td><td>11</td></tr>
<tr><td>3</td><td>12</td></tr>
<tr><td>4</td><td>22000</td></tr>
<tr><td>7</td><td>22010</td></tr>
<tr><td>13</td><td>2210000</td></tr>
<tr><td>()</td><td>0</td></tr>
<tr><td>(1)</td><td>110</td></tr>
<tr><td>(1 2)</td><td>2201011</td></tr>
<tr><td>(1 2 3)</td><td>2210101112</td></tr>
</tbody>
</table>
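<p>The pattern behind the table can be written down as a short Python sketch. Note that the rules below are inferred purely from the eleven entries above, so they may not match the real contest encoding beyond these examples, and the function names are invented for the illustration. In this reading, a length is written as <code>0</code>, <code>1</code>, or <code>22</code> followed by the number "length minus two", and a number is a length-prefixed run of base-3 digits.
</p>

```python
def encode_length(k):
    """Length prefix: 0 and 1 are literal, longer lengths are 22 plus a number."""
    if k == 0:
        return "0"
    if k == 1:
        return "1"
    return "22" + encode_number(k - 2)


def encode_number(n):
    """Encode a natural number as a length-prefixed run of ternary digits."""
    k, offset = 0, 0
    # Numbers with k digits cover the range offset .. offset + 3**k - 1.
    while n >= offset + 3 ** k:
        offset += 3 ** k
        k += 1
    rest = n - offset
    digits = ""
    for _ in range(k):
        digits = str(rest % 3) + digits
        rest //= 3
    return encode_length(k) + digits


def encode_list(numbers):
    """Encode a list of numbers: length prefix followed by the encoded elements."""
    return encode_length(len(numbers)) + "".join(encode_number(n) for n in numbers)


for n in (0, 1, 2, 3, 4, 7, 13):
    print(n, encode_number(n))
for numbers in ([], [1], [1, 2], [1, 2, 3]):
    print(numbers, encode_list(numbers))
```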

</div>

</div>

<div id="outline-container-2.6" class="outline-4">
<h4 id="sec-2.6">Generator circuits </h4>
<div class="outline-text-4" id="text-2.6">

<p>We had several ideas for how to generate circuits to produce arbitrary ternary strings.  The one we settled on was to reverse-chain together a sequence of adding sub-circuits.  Consider this circuit:
</p>

<p>
<img src="http://www.complang.tuwien.ac.at/schani/blog/icfp2010/circuit_0d9ff760343ae3fbaec1ad6f2b5309368fdcd017.png" alt="circuit" />
</p>

<p>
One thing I haven't mentioned about circuits yet is that transfer along forward wires is instantaneous, but is delayed by one step along backward wires.  In the first step, all backward wires carry the value 0.  What that means for this circuit is that in the first step, the one-adder gets an input of 0 and produces 1, which is the first step's circuit output.  Also, the two-adder produces a 2, but its delivery to the one-adder is delayed until the second step, at which point the one-adder produces a 0 (because 1+2 is 0 modulo 3).  The output for the third step already depends on the circuit input from the first step.  This circuit, then, will, no matter what the input, always produce the two trits 10 in its first two steps.  It is easy to see how to extend this pattern to produce an arbitrary string, given only the three adders as building blocks.
</p>
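<p>
To make the pattern concrete, here is a small Python simulation. It is only a sketch and not our contest code: it models each sub-circuit abstractly as "add a constant modulo 3" instead of building it from real gates, it assumes every adder reads its input over a backward wire (one step of delay, initial value 0), and the function names are made up for the illustration.
</p>

```python
def adders_for(target):
    """Adder constants, output side first, that make the chain emit `target`."""
    constants = []
    previous = 0
    for trit in target:
        # Each further adder contributes the difference to the previous trit.
        constants.append((trit - previous) % 3)
        previous = trit
    return constants


def run_chain(constants, circuit_input, steps):
    """Simulate a non-empty reverse chain of adders for `steps` steps.

    `circuit_input` must provide at least `steps` trits.
    """
    n = len(constants)
    wires = [0] * n  # wires[i] is the value arriving at adder i in this step
    output = []
    for step in range(steps):
        outs = [(wires[i] + constants[i]) % 3 for i in range(n)]
        output.append(outs[0])
        # Adder i is fed by adder i + 1; the last adder is fed by the circuit input.
        wires = outs[1:] + [circuit_input[step]]
    return output


print(adders_for([1, 0]))                # the one-adder and two-adder from above
print(run_chain([1, 2], [2, 2, 2], 3))   # third trit already depends on the input
```

<p>
In this model the trit emitted at a given step is simply the sum, modulo 3, of the constants of all adders the signal has passed through by then, which is why the differences between consecutive target trits are the constants to pick.
</p>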
<p>
<img src="http://www.complang.tuwien.ac.at/schani/blog/icfp2010/blackboard1.jpg" alt="http://www.complang.tuwien.ac.at/schani/blog/icfp2010/blackboard1.jpg" />
</p>
<p>
Of course we didn't have the three adders starting out, but it was not too hard to brute-force them, like we did for the key generator.  In fact, we even went so far as to brute-force circuits that combined up to three of those adders, reverse-chained together, to make our circuits shorter.  Our adders also had a delay built-in, which made the last sub-circuit superfluous.
</p></div>

</div>

<div id="outline-container-2.7" class="outline-4">
<h4 id="sec-2.7">Fuel encoding </h4>
<div class="outline-text-4" id="text-2.7">

<p>At this point we could use the server's submission system to figure out the ternary encoding of fuels, which turned out to be quite simple - a list of lists of lists of numbers.  With that done, we could concentrate on the actual task of the contest, namely the production of cars and fuel.
</p></div>

</div>

<div id="outline-container-2.8" class="outline-4">
<h4 id="sec-2.8">Producing fuels </h4>
<div class="outline-text-4" id="text-2.8">

<p>The only way to score points was to produce fuels, either for our own cars, or for the cars of others, which we could download.  As it turned out, a sizable proportion of the cars other teams submitted worked with one out of a handful of very basic fuels, so we wrote scripts that automatically tried to submit those fuels for any new cars that showed up.
</p>
<p>
We had two strategies for solving cars that didn't give in so easily, both of which were rather unsophisticated and give away our complete lack of any deeper understanding of the problem.
</p>
<p>
The first approach, which I worked on, was to employ a genetic algorithm.  The second was to translate the constraints a car placed on a fuel into a series of inequalities and feed them into Mathematica, hoping it would spit out a solution.  Both approaches yielded a number of matching fuels.
</p></div>

</div>

<div id="outline-container-2.9" class="outline-4">
<h4 id="sec-2.9">Producing cars </h4>
<div class="outline-text-4" id="text-2.9">

<p>My approach to producing cars was the same as for producing fuel - a genetic algorithm.  We have a running joke in our team: For every problem that comes up I first suggest a genetic algorithm, even if it's completely unsuited.  This year was the first time a genetic algorithm was actually a somewhat feasible approach.  The problem with producing a car this way is the fitness function.  We had the idea of generating a random pool of fuels and then scoring cars so that they came out on top if they matched exactly one of those fuels, hoping that would make finding a matching fuel harder.  The cars produced with this approach were solved by about 20 to 25 other teams.
</p>
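<p>
The scoring idea can be illustrated with a few lines of Python. This is a hypothetical sketch, not the fitness function we actually used: <code>matches</code> stands in for a real car/fuel check, which is not shown here, and the 1/hits scoring is just one simple way of putting cars that match exactly one fuel on top.
</p>

```python
def car_fitness(car, fuel_pool, matches):
    """Score a car: 1.0 if exactly one pool fuel matches, lower otherwise."""
    hits = sum(1 for fuel in fuel_pool if matches(car, fuel))
    if hits == 0:
        return 0.0  # no fuel in the pool works for this car
    return 1.0 / hits


# Toy stand-in for the real check: a "car" accepts every "fuel" it divides.
def toy_matches(car, fuel):
    return fuel % car == 0


pool = [1, 2, 3, 4, 5]
for car in (2, 3, 7):
    print(car, car_fitness(car, pool, toy_matches))
```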
<p>
<a href="http://github.com/schani/icfp-2010/blob/master/src/clj/at/ac/tuwien/tilab/icfp2010/carmaker.clj">Another approach</a>, which I was not involved with and therefore cannot describe in any detail, was more successful: Only 5 or 6 teams solved those cars, so we ended up not using the genetically-produced cars much - there was a limit of 72 cars per team.
</p></div>

</div>

<div id="outline-container-2.10" class="outline-4">
<h4 id="sec-2.10">The result </h4>
<div class="outline-text-4" id="text-2.10">

<p>We finished the contest in 23rd place.  <a href="http://icfpcontest.org/2010/teams/">Around 210 teams</a> ended up with a non-zero score, with the total number of participating teams probably considerably higher.
</p></div>
</div>

</div>

<div id="outline-container-3" class="outline-3">
<h3 id="sec-3">Contest organization </h3>
<div class="outline-text-3" id="text-3">

<p>This contest was organized quite well, especially compared with <a href="http://schani.wordpress.com/2009/07/11/the-icfp-programming-contest-2009/">last year's debacle</a>.  The biggest organizational issue was the submission server, which was down often, as a result of being hammered by hundreds of teams, most of which probably used scripts to automatically download cars and submit fuels.  In fact, for the first few hours of the contest, the server was completely unresponsive.
</p>
<p>
The organizers tried to remedy the situation with a variety of approaches, at one point limiting the length of submitted cars to something ridiculously low - around 100 trits if memory serves me right.
</p>
<p>
Still, I do not think that we were significantly held back by these problems, at least not more so than all the other teams.
</p></div>

</div>

<div id="outline-container-4" class="outline-3">
<h3 id="sec-4">Suggestions to future organizers </h3>
<div class="outline-text-3" id="text-4">

<p>I've been thinking a bit about what I liked and didn't like about the ICFP contest tasks over the past 10 or so years.  Here are a few wishes to the organizers of future contests:
</p>
</div>

<div id="outline-container-4.1" class="outline-4">
<h4 id="sec-4.1">Visual </h4>
<div class="outline-text-4" id="text-4.1">

<p>I found that having a problem that can be visualized makes it much more enjoyable to work on.  The best example of this is probably 2004's contest, where we had to write software for ants' brains that had to work together to search for and gather food and to defend against attackers.  Watching those ants move about and improve with each version of the code was a joy.
</p>
<p>
Despite all the other issues, I did like this aspect of last year's task, as well.
</p></div>

</div>

<div id="outline-container-4.2" class="outline-4">
<h4 id="sec-4.2">Programming </h4>
<div class="outline-text-4" id="text-4.2">

<p>The task should really be a programming task at heart.  The contest is, after all, a programming contest.  This year's task, once you strip away all the intentional obscurity, was a mathematical problem. I'm not saying that mathematical problems aren't nice or interesting. Neither am I saying that there shouldn't be any math involved in the contest, but don't make the core problem a mathematical one.
</p></div>

</div>

<div id="outline-container-4.3" class="outline-4">
<h4 id="sec-4.3">No obscurity </h4>
<div class="outline-text-4" id="text-4.3">

<p>It seems that all the obscure bits that were put into this year's task had the purpose of disguising the fact that, as already mentioned, the core task was not first and foremost a programming one.  Please don't do this in the future!  Give us a task where all the non-essentials have been stripped, where the hard and interesting part is actually the task, not all the stuff you have to go through to finally get to it, discovering that it wasn't all it was cracked up to be.
</p>
<p>
Last year's task also had a bit of completely unmotivated obscurity - the format of their VM instructions was slightly different depending on whether the instruction address was odd or even.  Why, oh why?
</p></div>

</div>

<div id="outline-container-4.4" class="outline-4">
<h4 id="sec-4.4">Self-containment </h4>
<div class="outline-text-4" id="text-4.4">

<p>The more we have to rely on the contest organizers' servers to go about our task, it seems, the more problems crop up.  Last year was a total failure due to a bug in the submission system, and this year the server was down or at least very hard to reach for considerable stretches of time.  Please give us something self-contained again!  If you must have some interaction, make it as simple as possible.  The submission system for the 2006 contest, for example, only required sending a short hash for each solved sub-problem, and it worked very well.
</p></div>

</div>

<div id="outline-container-4.5" class="outline-4">
<h4 id="sec-4.5">Don't show off </h4>
<div class="outline-text-4" id="text-4.5">

<p>You're smart, we know, but that's not what the contest is about.  The contest is about how smart we are, so please refrain from making the task a monument to your intellect, as was the case in 2007.
</p></div>
</div>

</div>

<div id="outline-container-5" class="outline-3">
<h3 id="sec-5">Clojure </h3>
<div class="outline-text-3" id="text-5">

<p>After our <a href="http://schani.wordpress.com/2007/07/23/why-ocaml-is-not-my-favorite-programming-language/">disillusionment with OCaml</a> we had been looking for a language to replace it with as our first choice. To my surprise and joy we managed to settle on <a href="http://clojure.org/">Clojure</a>, which turned out to work quite well for us.
</p>
<p>
We stumbled across some rough edges, most of them rooted in the integration of Clojure with the JVM, but they posed no significant obstacles, and will most likely be ironed out soon.  The biggest problem with Clojure right now, and this is generally acknowledged in the community, is its poor error reporting, which can make finding bugs quite an ordeal until one gets used to it, at which point it is still bearable at best.
</p>
<p>
On the positive side most of us found working in Clojure fun and, after some learning period, quite productive.  Having an interactive environment like Emacs/SLIME certainly accelerates not only the learning process, but development in general.
</p>
<p>
Clojure's dynamic nature allowed us to do stuff we hardly would have considered with, say, OCaml.  A few weeks before the contest I put together a <a href="http://github.com/schani/clj-simple-dist">simple job distribution service</a> which allowed us to interactively distribute several workloads over a <a href="http://www.zid.tuwien.ac.at/vsc/">number of computing nodes</a> with relative ease.
</p>
<p>
The dynamism does come at a price, though: as with many Lisps, Clojure's performance model is non-intuitive.  It is not clear, at least without considerable experience, which operations are cheap and which are expensive under which circumstances, and the gulf between the two can be vast.  As a result, we resorted to implementing a few critical pieces of code in Java, which, to be fair, was quick, easy, and gave us efficient code.
</p>
<p>
I am looking forward to using Clojure more in the future.
</p></div>

</div>

<div id="outline-container-6" class="outline-3">
<h3 id="sec-6">The Code </h3>
<div class="outline-text-3" id="text-6">

<p>All the code we wrote for this year's contest is available on <a href="http://github.com/schani/icfp-2010">GitHub</a>. If you need any assistance with it, please email me.
</p></div>
</div>
<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=schani.wordpress.com&amp;blog=143659&amp;post=116&amp;subd=schani&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<div id="outline-container-1" class="outline-3">
<h3 id="sec-1">Introduction </h3>
<div class="outline-text-3" id="text-1">
<p>Last weekend several hundred programming teams all over the world participated in the <a href="http://icfpcontest.org/2010/">2010 ICFP Programming Contest</a>.  Among them, of course, was the glorious team <a href="http://schani.wordpress.com/2009/07/11/the-icfp-programming-contest-2009/">Funktion im Kopf der Mensch</a>.
</p>
</div>
</div>
<div id="outline-container-2" class="outline-3">
<h3 id="sec-2">The task </h3>
<div class="outline-text-3" id="text-2">
<p>This year&#8217;s <a href="http://icfpcontest.org/2010/task/">task</a> was multi-layered.  There was a core problem, but to even get to the point where we could submit part of a solution to it, we had to first make our way through a couple of secondary tasks.
</p>
</div>
<div id="outline-container-2.1" class="outline-4">
<h4 id="sec-2.1">The core task </h4>
<div class="outline-text-4" id="text-2.1">
<p>The primary task was to submit &#8220;cars&#8221; and &#8220;fuels&#8221; to the organizers&#8217; submission server.  &#8220;Cars&#8221; are data structures describing operations performed on vectors called &#8220;air&#8221;, parameterized by matrices called &#8220;fuel&#8221;.  The important thing here is that a car doesn&#8217;t necessarily work with every fuel.  Points were awarded only for fuel.  The fewer fuels submitted for a specific car by different teams, the more those few teams scored for their fuels.  Among fuels submitted for the same car, the one that was &#8220;shortest&#8221; scored the most.
</p>
</div>
</div>
<div id="outline-container-2.2" class="outline-4">
<h4 id="sec-2.2">Encodings </h4>
<div class="outline-text-4" id="text-2.2">
<p>Cars and fuels could not be submitted directly, however.  Both had to be encoded as ternary strings, in an undisclosed format.  What&#8217;s more, before we could submit a car we had to submit a fuel for an existing car first, and fuels had an additional encoding step added &#8211; fuel factories.
</p>
</div>
</div>
<div id="outline-container-2.3" class="outline-4">
<h4 id="sec-2.3">Fuel factories </h4>
<div class="outline-text-4" id="text-2.3">
<p>To submit a ternary fuel string it had to be encoded as a &#8220;fuel factory&#8221; first.  A fuel factory is a circuit of individual gates of identical function.  Each gate has two inputs and two outputs, all of which have to be wired.  The whole circuit has one input and one output.  The gate function was not disclosed.
</p>
<p>
To make the task of figuring out how gates work possible, the submission server would accept circuits and give back the first 17 &#8220;trits&#8221; of the circuit&#8217;s output.  By submitting an empty circuit, wiring the circuit input to the output directly, without any intervening gates, we learned the first 17 trits of the input.  Next we let the server produce the output for all circuits containing one (wired) gate.  Those four output strings were enough to determine the gate function by brute-forcing (trying all possible gate functions and seeing which ones fit).  We assumed (correctly) that the gates have no state.
</p>
</div>
</div>
<div id="outline-container-2.4" class="outline-4">
<h4 id="sec-2.4">The key </h4>
<div class="outline-text-4" id="text-2.4">
<p>Another roadblock on the way to submitting fuels was that the fuel circuit had to produce a string that contained not only the encoded fuel but also a 17-trit prefix &#8211; the &#8220;key&#8221; &#8211; before it.  In order to gain the right to submit a car we first had to produce a circuit that output this prefix.  We brute-forced this circuit as well, trying all combinations of gates until we found one that generated the key.  The <a href="http://github.com/schani/icfp-2010/blob/master/circuits/key-generators">shortest ones</a> have 6 gates.
</p>
<p>
Now that we could submit cars as well as fuels, we had two problems to work on in parallel: How are cars and fuels encoded, and how do we produce a circuit that generates a given, arbitrarily long, ternary string?
</p>
</div>
</div>
<div id="outline-container-2.5" class="outline-4">
<h4 id="sec-2.5">Cars </h4>
<div class="outline-text-4" id="text-2.5">
<p>We knew from the task description that cars were encoded as a combination of lists, tuples, and natural numbers.  Discovering the actual encoding was aided not only by the availability of one sample car and several cars already submitted by other teams, but also by the contest organizers&#8217; server, which gave helpful error messages for incorrectly encoded cars.
</p>
<p>
<img src="http://www.complang.tuwien.ac.at/schani/blog/icfp2010/blackboard2.jpg" alt="http://www.complang.tuwien.ac.at/schani/blog/icfp2010/blackboard2.jpg" />
</p>
<p>
Here are a few examples of numbers and lists of numbers.  Note that the encoding assumes that the type of the encoded object is known, i.e., the encoding of a particular list and some number might be the same, but since it is known which one of the two types is expected, that is not a problem.
</p>
<table border="2" cellspacing="0" cellpadding="6" rules="groups">
<caption></caption>
<col align="right" />
<col align="right" />
<thead>
<tr>
<th scope="col">Thing</th>
<th scope="col">Encoding</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>1</td>
<td>10</td>
</tr>
<tr>
<td>2</td>
<td>11</td>
</tr>
<tr>
<td>3</td>
<td>12</td>
</tr>
<tr>
<td>4</td>
<td>22000</td>
</tr>
<tr>
<td>7</td>
<td>22010</td>
</tr>
<tr>
<td>13</td>
<td>2210000</td>
</tr>
<tr>
<td>()</td>
<td>0</td>
</tr>
<tr>
<td>(1)</td>
<td>110</td>
</tr>
<tr>
<td>(1 2)</td>
<td>2201011</td>
</tr>
<tr>
<td>(1 2 3)</td>
<td>2210101112</td>
</tr>
</tbody>
</table>
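<p>
The examples in the table are consistent with the following scheme, sketched here in Python.  This is a reconstruction inferred from the table rows alone, not the official specification: a number is a length prefix followed by that many ternary digits, a list is a length prefix followed by its encoded elements, and both prefixes share one format.  The function names are made up for this sketch.
</p>

```python
def to_trits(value, width):
    """Write value as exactly `width` ternary digits, most significant first."""
    digits = []
    for _ in range(width):
        digits.append(str(value % 3))
        value //= 3
    return "".join(reversed(digits))


def encode_length(k):
    """Length prefix shared by numbers and lists (inferred from the examples)."""
    if k == 0:
        return "0"
    if k == 1:
        return "1"
    return "22" + encode_number(k - 2)


def encode_number(n):
    """Encode a natural number as a length prefix plus k ternary digits."""
    k = 0
    offset = 0  # how many numbers fit into fewer than k digits
    while n >= offset + 3 ** k:
        offset += 3 ** k
        k += 1
    return encode_length(k) + to_trits(n - offset, k)


def encode_list(numbers):
    """Encode a list of natural numbers."""
    return encode_length(len(numbers)) + "".join(encode_number(n) for n in numbers)


print(encode_number(7))        # 22010
print(encode_list([1, 2, 3]))  # 2210101112
```

<p>
Under this reading the number 0 and the empty list both come out as 0, which is the ambiguity mentioned above: the decoder has to know which type it expects.
</p>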
</div>
</div>
<div id="outline-container-2.6" class="outline-4">
<h4 id="sec-2.6">Generator circuits </h4>
<div class="outline-text-4" id="text-2.6">
<p>We had several ideas for how to generate circuits to produce arbitrary ternary strings.  The one we settled on was to reverse-chain together a sequence of adding sub-circuits.  Consider this circuit:
</p>
<p>
<img src="http://www.complang.tuwien.ac.at/schani/blog/icfp2010/circuit_0d9ff760343ae3fbaec1ad6f2b5309368fdcd017.png" alt="circuit" />
</p>
<p>
One thing I haven&#8217;t mentioned about circuits yet is that transfer along forward wires is instantaneous, but is delayed by one step along backward wires.  In the first step, all backward wires carry the value 0.  What that means for this circuit is that in the first step, the one-adder gets an input of 0 and produces 1, which is the first step&#8217;s circuit output.  Also, the two-adder produces a 2, but its delivery to the one-adder is delayed until the second step, at which point the one-adder produces a 0 (because 1+2 is 0 modulo 3).  The output for the third step already depends on the circuit input from the first step.  This circuit, then, will, no matter what the input, always produce the two trits 10 in its first two steps.  It is easy to see how to extend this pattern to produce an arbitrary string, given only the three adders as building blocks.
</p>
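<p>
The pattern can be sketched with a small simulation in Python.  This is a simplified model, not our contest code: each adder is treated as a black box that adds a constant modulo 3, every adder receives its value over a backward wire (one step of delay, initially 0), and the real two-input, two-output gates are ignored entirely.  The names <code>adder_constants</code> and <code>simulate</code> are made up for this sketch.
</p>

```python
def adder_constants(target):
    """Pick one adder per trit; the first constant belongs to the adder nearest the output."""
    constants = []
    previous = 0
    for ch in target:
        trit = int(ch)
        # Each adder only has to supply the difference to the previous output trit.
        constants.append((trit - previous) % 3)
        previous = trit
    return constants


def simulate(constants, circuit_input):
    """Run the reverse chain for one step per input trit and return the output string."""
    if not constants:
        raise ValueError("need at least one adder")
    wires = [0] * len(constants)  # backward wires carry 0 in the first step
    output = []
    for value in circuit_input:
        results = [(w + c) % 3 for w, c in zip(wires, constants)]
        output.append(str(results[0]))
        # Results travel one adder closer to the output, arriving in the next step;
        # the adder farthest from the output receives the delayed circuit input.
        wires = results[1:] + [value]
    return "".join(output)


print(adder_constants("10"))         # [1, 2]
print(simulate([1, 2], [2, 0, 0]))   # 102
```

<p>
For the target string 10 the sketch picks a one-adder followed by a two-adder, as in the circuit above, and the third output trit is the first one that depends on the circuit input.
</p>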
<p>
<img src="http://www.complang.tuwien.ac.at/schani/blog/icfp2010/blackboard1.jpg" alt="http://www.complang.tuwien.ac.at/schani/blog/icfp2010/blackboard1.jpg" />
</p>
<p>
Of course we didn&#8217;t have the three adders starting out, but it was not too hard to brute-force them, like we did for the key generator.  In fact, we even went so far as to brute-force circuits that combined up to three of those adders, reverse-chained together, to make our circuits shorter.  Our adders also had a delay built-in, which made the last sub-circuit superfluous.
</p>
</div>
</div>
<div id="outline-container-2.7" class="outline-4">
<h4 id="sec-2.7">Fuel encoding </h4>
<div class="outline-text-4" id="text-2.7">
<p>At this point we could use the server&#8217;s submission system to figure out the ternary encoding of fuels, which turned out to be quite simple &#8211; a list of lists of lists of numbers.  With that done, we could concentrate on the actual task of the contest, namely the production of cars and fuel.
</p>
</div>
</div>
<div id="outline-container-2.8" class="outline-4">
<h4 id="sec-2.8">Producing fuels </h4>
<div class="outline-text-4" id="text-2.8">
<p>The only way to score points was to produce fuels, either for our own cars, or for the cars of others, which we could download.  As it turned out, a sizable proportion of the cars other teams submitted worked with one out of a handful of very basic fuels, so we wrote scripts that automatically tried to submit those fuels for any new cars that showed up.
</p>
<p>
We had two strategies for solving cars that didn&#8217;t give in so easily, both of which were rather unsophisticated and gave away our complete lack of any deeper understanding of the problem.
</p>
<p>
The first approach, which I worked on, was to employ a genetic algorithm.  The second was to translate the constraints a car placed on a fuel into a series of inequalities and feed them into Mathematica, hoping it would spit out a solution.  Both approaches yielded a number of matching fuels.
</p>
</div>
</div>
<div id="outline-container-2.9" class="outline-4">
<h4 id="sec-2.9">Producing cars </h4>
<div class="outline-text-4" id="text-2.9">
<p>My approach to producing cars was the same as for producing fuel &#8211; a genetic algorithm.  We have a running joke in our team: For every problem that comes up I first suggest a genetic algorithm, even if it&#8217;s completely unsuited.  This year was the first time a genetic algorithm was actually a somewhat feasible approach.  The problem with producing a car this way is the fitness function.  We had the idea of generating a random pool of fuels and then scoring cars so that they came out on top if they matched exactly one of those fuels, hoping that would make finding a matching fuel harder.  The cars produced with this approach were solved by about 20 to 25 other teams.
</p>
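<p>
The fitness idea can be sketched in Python as follows.  This is an illustration, not our actual code: <code>matches</code> is a hypothetical predicate standing in for the real car/fuel check, which is not described here, and the handling of cars that match no fuel or several fuels is an assumption.
</p>

```python
def car_fitness(car, fuel_pool, matches):
    """Score a candidate car against a pool of randomly generated fuels.

    A car that matches exactly one fuel in the pool scores highest.
    Assumptions: a car matching nothing is treated as worthless, and
    each additional matching fuel lowers the score.
    """
    hits = sum(1 for fuel in fuel_pool if matches(car, fuel))
    if hits == 0:
        return 0.0
    return 1.0 / hits


def toy_matches(car, fuel):
    """Purely illustrative stand-in: a 'fuel' matches if the 'car' divides it."""
    return fuel % car == 0


pool = [2, 3, 4, 5, 6, 7]
print(car_fitness(7, pool, toy_matches))   # 1.0
print(car_fitness(11, pool, toy_matches))  # 0.0
```

<p>
The weakness of such a score is that it only measures a car against the random pool, which says little about how hard it is for another team to find a matching fuel.
</p>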
<p>
<a href="http://github.com/schani/icfp-2010/blob/master/src/clj/at/ac/tuwien/tilab/icfp2010/carmaker.clj">Another approach</a>, which I was not involved with and therefore cannot describe in any detail, was more successful: Only 5 or 6 teams solved those cars, so we ended up not using the genetically-produced cars much &#8211; there was a limit of 72 cars per team.
</p>
</div>
</div>
<div id="outline-container-2.10" class="outline-4">
<h4 id="sec-2.10">The result </h4>
<div class="outline-text-4" id="text-2.10">
<p>We finished the contest in 23rd place.  <a href="http://icfpcontest.org/2010/teams/">Around 210 teams</a> ended up with a non-zero score, with the total number of participating teams probably considerably higher.
</p>
</div>
</div>
</div>
<div id="outline-container-3" class="outline-3">
<h3 id="sec-3">Contest organization </h3>
<div class="outline-text-3" id="text-3">
<p>This contest was organized quite well, especially in contrast to <a href="http://schani.wordpress.com/2009/07/11/the-icfp-programming-contest-2009/">last year&#8217;s debacle</a>.  The biggest organizational issue was the submission server, which was down often, as a result of being hammered by hundreds of teams, most of which probably used scripts to automatically download cars and submit fuels.  In fact, for the first few hours of the contest, the server was completely unresponsive.
</p>
<p>
The organizers tried to remedy the situation with a variety of approaches, at one point limiting the length of submitted cars to something ridiculously low &#8211; around 100 trits if memory serves me right.
</p>
<p>
Still, I do not think that we were significantly held back by these problems, at least not more so than all the other teams.
</p>
</div>
</div>
<div id="outline-container-4" class="outline-3">
<h3 id="sec-4">Suggestions to future organizers </h3>
<div class="outline-text-3" id="text-4">
<p>I&#8217;ve been thinking a bit about what I liked and didn&#8217;t like about the ICFP contest tasks over the past 10 or so years.  Here are a few wishes to the organizers of future contests:
</p>
</div>
<div id="outline-container-4.1" class="outline-4">
<h4 id="sec-4.1">Visual </h4>
<div class="outline-text-4" id="text-4.1">
<p>I found that having a problem that can be visualized makes it much more enjoyable to work on.  The best example of this is probably 2004&#8217;s contest, where we had to write software for ants&#8217; brains that had to work together to search for and gather food and to defend against attackers.  Watching those ants move about and improve with each version of the code was a joy.
</p>
<p>
Despite all the other issues, I did like this aspect of last year&#8217;s task as well.
</p>
</div>
</div>
<div id="outline-container-4.2" class="outline-4">
<h4 id="sec-4.2">Programming </h4>
<div class="outline-text-4" id="text-4.2">
<p>The task should really be a programming task at heart.  The contest is, after all, a Programming contest.  This year&#8217;s task, after you stripped all the intentional obscurity, was a mathematical problem. I&#8217;m not saying that mathematical problems aren&#8217;t nice or interesting. Neither am I saying that there shouldn&#8217;t be any math involved in the contest, but don&#8217;t make the core problem a mathematical one.
</p>
</div>
</div>
<div id="outline-container-4.3" class="outline-4">
<h4 id="sec-4.3">No obscurity </h4>
<div class="outline-text-4" id="text-4.3">
<p>It seems that all the obscure bits that were put into this year&#8217;s task had the purpose of disguising the fact that, as already mentioned, the core task was not first and foremost a programming one.  Please don&#8217;t do this in the future!  Give us a task where all the non-essentials have been stripped away, where the hard and interesting part is the task itself, not all the stuff you have to go through to finally get to it, only to discover that it wasn&#8217;t all it was cracked up to be.
</p>
<p>
Last year&#8217;s task also had a bit of completely unmotivated obscurity &#8211; the format of their VM instructions was slightly different depending on whether the instruction address was odd or even.  Why, oh why?
</p>
</div>
</div>
<div id="outline-container-4.4" class="outline-4">
<h4 id="sec-4.4">Self-containment </h4>
<div class="outline-text-4" id="text-4.4">
<p>The more we have to rely on the contest organizers&#8217; servers to go about our task, it seems, the more problems crop up.  Last year was a total failure due to a bug in the submission system, and this year the server was down or at least very hard to reach for considerable stretches of time.  Please give us something self-contained again!  If you must have some interaction, make it as simple as possible.  The submission system for the 2006 contest, for example, only required sending a short hash for each solved sub-problem, and it worked very well.
</p>
</div>
</div>
<div id="outline-container-4.5" class="outline-4">
<h4 id="sec-4.5">Don&#8217;t show off </h4>
<div class="outline-text-4" id="text-4.5">
<p>You&#8217;re smart, we know, but that&#8217;s not what the contest is about.  The contest is about how smart we are, so please refrain from making the task a monument to your intellect, as was the case in 2007.
</p>
</div>
</div>
</div>
<div id="outline-container-5" class="outline-3">
<h3 id="sec-5">Clojure </h3>
<div class="outline-text-3" id="text-5">
<p>After our <a href="http://schani.wordpress.com/2007/07/23/why-ocaml-is-not-my-favorite-programming-language/">disillusionment with OCaml</a> we had been looking for a language to replace it with as our first choice. To my surprise and joy we managed to settle on <a href="http://clojure.org/">Clojure</a>, which turned out to work quite well for us.
</p>
<p>
We stumbled across some rough edges, most of them rooted in the integration of Clojure with the JVM, but they posed no significant obstacles, and will most likely be ironed out soon.  The biggest problem with Clojure right now, and this is generally acknowledged in the community, is its poor error reporting, which can make finding bugs quite an ordeal until one gets used to it, at which point it is still bearable at best.
</p>
<p>
On the positive side most of us found working in Clojure fun and, after some learning period, quite productive.  Having an interactive environment like Emacs/SLIME certainly accelerates not only the learning process, but development in general.
</p>
<p>
Clojure&#8217;s dynamic nature allowed us to do stuff we hardly would have considered with, say, OCaml.  A few weeks before the contest I put together a <a href="http://github.com/schani/clj-simple-dist">simple job distribution service</a> which allowed us to interactively distribute several workloads over a <a href="http://www.zid.tuwien.ac.at/vsc/">number of computing nodes</a> with relative ease.
</p>
<p>
The dynamism does come at a price, though: as with many Lisps, Clojure&#8217;s performance model is non-intuitive.  It is not clear, at least without considerable experience, which operations are cheap and which are expensive under which circumstances, and the gulf between the two can be vast.  As a result, we resorted to implementing a few critical pieces of code in Java, which, to be fair, was quick, easy, and gave us efficient code.
</p>
<p>
I am looking forward to using Clojure more in the future.
</p>
</div>
</div>
<div id="outline-container-6" class="outline-3">
<h3 id="sec-6">The Code </h3>
<div class="outline-text-3" id="text-6">
<p>All the code we wrote for this year&#8217;s contest is available on <a href="http://github.com/schani/icfp-2010">GitHub</a>. If you need any assistance with it, please email me.
</p>
</div>
</div>
<br />  <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gocomments/schani.wordpress.com/116/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/comments/schani.wordpress.com/116/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/godelicious/schani.wordpress.com/116/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/delicious/schani.wordpress.com/116/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gofacebook/schani.wordpress.com/116/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/facebook/schani.wordpress.com/116/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gotwitter/schani.wordpress.com/116/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/twitter/schani.wordpress.com/116/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gostumble/schani.wordpress.com/116/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/stumble/schani.wordpress.com/116/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/godigg/schani.wordpress.com/116/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/digg/schani.wordpress.com/116/" /></a> <a rel="nofollow" href="http://feeds.wordpress.com/1.0/goreddit/schani.wordpress.com/116/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/reddit/schani.wordpress.com/116/" /></a> <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=schani.wordpress.com&amp;blog=143659&amp;post=116&amp;subd=schani&amp;ref=&amp;feed=1" width="1" height="1" />]]></content:encoded>
			<wfw:commentRss>http://schani.wordpress.com/2010/06/27/the-2010-icfp-programming-contest/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/daf7ba6f06480c52ac459772f2bb5268?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">schani</media:title>
		</media:content>

		<media:content url="http://www.complang.tuwien.ac.at/schani/blog/icfp2010/blackboard2.jpg" medium="image">
			<media:title type="html">http://www.complang.tuwien.ac.at/schani/blog/icfp2010/blackboard2.jpg</media:title>
		</media:content>

		<media:content url="http://www.complang.tuwien.ac.at/schani/blog/icfp2010/circuit_0d9ff760343ae3fbaec1ad6f2b5309368fdcd017.png" medium="image">
			<media:title type="html">circuit</media:title>
		</media:content>

		<media:content url="http://www.complang.tuwien.ac.at/schani/blog/icfp2010/blackboard1.jpg" medium="image">
			<media:title type="html">http://www.complang.tuwien.ac.at/schani/blog/icfp2010/blackboard1.jpg</media:title>
		</media:content>
	</item>
	</channel>
</rss>
