Cocoa for Scientists (Part XXVII): Getting Closure with Objective-C

Author: Drew McCormack
Web Sites: www.maccoremac.com, www.macanics.net

Last week, Chris Lattner — who manages the Clang, LLVM, and GCC groups at Apple — announced that work was well underway to bring ‘blocks’ to the GCC and Clang compilers. ‘So what?’, I hear you ask, ‘My kid has been using blocks since he was 9 months old.’ Fair point, but maybe not these blocks.

A Demonstration of ‘Blocks’

Blocks, or closures as they are often called, have existed in other languages for quite some time. Ruby, for instance, is famous for them. They also exist in Python, which I’ll use here to demonstrate the principle.

Take this Python code

def EvalFuncOnGrid(f, forceConst):
    for i in range(5):
        x = i*0.1
        print x, f(forceConst, x)

def QuadraticFunc(forceConst, x): 
    return 0.5 * forceConst * x * x 

def Caller():
    forceConst = 3.445
    EvalFuncOnGrid(QuadraticFunc, forceConst)

Caller()

This simple program begins with a call to the Caller function. The Caller function calls to the EvalFuncOnGrid function to evaluate the function passed, in this case QuadraticFunc, which represents a simple quadratic function. The result is the value of the quadratic function on a grid of points.

0.0 0.0
0.1 0.017225
0.2 0.0689
0.3 0.155025
0.4 0.2756

Unquestionably exciting stuff, but what I want to draw attention to is the extra data that was passed along with the function itself. The QuadraticFunc function takes two arguments: the coordinate (x), and a force constant. This force constant needs to be passed along with function, because the function itself has no way to store it. This may not seem like a big deal, but suppose now we want to reuse EvalFuncOnGrid to print values of a different type of function, one that does not have a force constant, and instead takes a wave number parameter. Hopefully you can see that passing ‘state’ for the function, in the form of data and parameters, is limiting the flexibility of our code.

One viable solution would be to make QuadraticFunc a class, but that is a bit heavy-handed. Besides, this solution would work for our own functions, but not for built-in functions, or functions from libraries. We need some way to pass state to EvalFuncOnGrid, so that it can use that state when evaluating the function. This is exactly what ‘blocks’ allow us to do.

Here is the Python code rewritten to use a block:

def EvalFuncOnGrid(f):
    for i in range(5):
        x = i*0.1
        print x, f(x)


def Caller():
    const = 3.445

    def QuadraticFunc(x):
        return 0.5 * const * x * x 

    EvalFuncOnGrid(QuadraticFunc)

Caller()

If you run it, you will find this code produces the same output as before.

So what’s changed? You’ll note that the force constant has been removed from all argument lists, and no reference is made to it at all in EvalFuncOnGrid. This was a key objective: to have EvalFuncOnGrid be completely general, and work with any function. But the force constant must still be there, otherwise how does the quadratic function get evaluated?

You will have noticed that the QuadraticFunc function has been moved into the Caller function. The effect of this is that QuadraticFunc gets a copy of the stack of Caller, that is, it ‘inherits’ any variables and constants that are set in Caller. Because const is set, QuadraticFunc copies its value to its own stack, and can access it later in EvalFuncOnGrid. This is the essence of blocks: it is similar to passing a function argument, with the difference that the block has a copy of the stack from the scope where it was defined.

Blocks in Objective-C

Chris Lattner’s announcement details how blocks will be used in C and Objective-C, and — in essence — it is similar to the Python example above. Here is that example rewritten in the new C syntax:

void EvalFuncOnGrid( float(^block)(float) ) {
    int i;
    for ( i = 0; i < 5 ; ++i ) {
        float x = i * 0.1;
        printf("%f %f", x, block(x));
    }
}

void Caller(void) {
    float forceConst = 3.445;
    EvalFuncOnGrid(^(float x){ return 0.5 * forceConst * x * x; });
}

void main(void) {
   Caller();
}

(I’m not sure if this is 100% correct, because I haven’t tried to compile it yet, but it should at least give you the idea.)

The block syntax in C is very similar to the standard syntax for function pointers, but you use a caret (^) in place of the standard asterisk pointer (*). The block itself looks like a function definition, but is anonymous, and is embedded directly in the argument list. (Note that we named our ‘block’ in Python, but Python does also support anonymous functions.)

Inside-Out Programming

Another way to think about closures/blocks is that they allow you to rewrite the inside of functions, such as EvalFuncOnGrid in the example. I like to think of this as ‘inside-out programming’: Traditionally, you call functions from outside, and pass them what they need to get the job done. With blocks, you get to pass in the guts of a function, effectively rewriting it on the fly.

Why Blocks?

Why is all of this important, and why now? Well, as you are undoubtedly aware, there has been a vicious war raging the last few years, and it is only going to get worse before it gets better. That’s right — it’s the War on Multicore.

Our chips no longer get faster, they just get more abundant, like the broomsticks in Disney’s Fantasia. Chipmakers just take existing designs, and chop them in half, and then in half again, and software developers are expected to do something useful with that extra ‘power’.

It turns out that blocks could be a very useful weapon in the War on Multicore, because they allow you to create units of work, which each have their own copy of the stack, and don’t step on each others toes as a result. What’s more, you can pass these units around like they are values, when in actual fact they contain a whole stack of values (pun intended), and executable code to perform some operation.

In fact, blocks could be seen as a low-level form of NSOperation. For example, if you are parallelizing a loop, you could easily generate blocks for each of the iterations in the loop, and schedule them to run in parallel, in the same way that NSOperationQueue does this with instances of NSOperation. The advantage of blocks is that they are at a lower level, built into the language, and require much less overhead. Stay tuned, because Apple undoubtedly has some big things planned along these lines in Snow Leopard.

Comments

more about Blocks

Thanks for the introduction, Drew, that was quick! There is also an interesting post coming from the perspective of SmallTalk at http://mooseyard.com/Jens/2008/08/blocksclosures-for-c/.

Python anonymous functions

While not quite "anonymous functions", Python does support "lambdas":


def Caller():
const = 3.445
EvalFuncOnGrid(lambda x: 0.5 * const * x * x)

Re: Anonymous functions

Well, I would say a 'lambda' is a simple anonymous function.
You can fake a full-on anonymous function by just calling it something simple, like '_'.

def Caller():
    const = 3.445
    def _(x):
        return 0.5 * const * x**2
    EvalFuncOnGrid(_)

D.

---------------------------
Drew McCormack
http://www.maccoremac.com
http://www.macanics.net
http://www.macresearch.org

Yuck. The C syntax is

Yuck. The C syntax is understandably horrid. But the adding of closures to Objective-C should make us think about the other parent of the language: Smalltalk. It is desirable to make the language closer to Smalltalk than to C.

Nasty

I agree that this is beyond hideous and wish that Apple would dedicate more of this time to replacing Objective-C rather than grafting more bizarre nonstandard nonsense onto it. I'm sure it was very excitingly dynamic back in 1988 but the NeXT expats and their pet evolutionary cul-de-sac have been lapped several times over since then.

Multiple things.

There are multiple things that this article slightly gets not quite right. Not that anything is strictly wrong (tho the term "block" is potentially misleading), it just seems to conflate a number of things.

My first issue is that the article conflates two or three issues: first-class functions, and closures, and possibly C lambdas.

First-class functions are functions that can be used as data, passed too functions as arguments, and returned from functions as values, and stored in variables. The ability to do this in the new system is convenient (more so than using function pointers) because of the flexibility that closures and lambdas provide.

Closures, which aren't related to first-class functions except only very tenuously, are functions that have variables bound outside the definition of the function, in the scope that they were defined in. These are already present in C, for instance any global function (all, prior to this, if I'm not mistaken), form a closure with all global variables. The benefit of closures comes from when you can define a variable in a non-global scope, and also define a function within that same scope, so that the function itself has access to that variable but nothing outside that scope has access. An example that makes this clear, using Python, might be this.

Closures are incredibly easy to construct if you're using a lexically scoped language, in which variables in a function body that are not formal parameters of the function are looked up by inspecting the environment in which the function was created. Dynamically scoped languages, however, have a much harder time creating closures, required special constructs for passing the variable bindings along with a function. Any lexically scoped language with both functions and variables outside the function that can be viewed from within the function (e.g. C prior to this modification, as noted earlier) technically has closures. Whether or not they're useful depends heavily on the ability to used functions as first-class objects, thus allowing the closure to be passed around with the function.

Finally, lambdas, are what make this possible in C. Without lambdas, you can still have closures, as in the def-ed Python examples, and in a fictional version of C that allows function definitions within other functions. Lambdas are convenient because they allow functions to exist without requiring that they have names, and making the definition itself an expression (whereas a function definition in C, being a statement, has no value). This makes it easy to create functions on the fly.

The ability to have closures in C, as demonstrated above, is currently more a result of lambdas, and to a lesser extent first-class functions, than anything else. As I noted, C already had closures, you just couldn't define functions anywhere other than in the global scope, and so only global variables could be closed over. The real innovation here is that functions can now be created within other functions, as this is what allows closures to have the really useful properties.

Some examples of things that can now be done in C, thanks to all of this, are class-like behavior without the shadey C-with-Classes methodology: A class can instead be defined as a closure-generating function, like so:

In your function body, declare and initiate private variables, and private functions. Then, return a struct with public variables, and public and private methods. The methods, being functions with closure over the private variables and functions, can still access those variables after the constructor function returns, but nothing else can access those variables and functions.

Also interesting is that along with first-class functions, these modifications to C provide the ability require certain types of first-class functions. In duck-typed languages, and with function pointers in C, you have no way of specifying the type of the function you want to have as an argument. So for example, from the above example, Python just has EvalFuncOnGrid(f) and the equivalent C function pointer version might be EvalFuncOnGrid(float* f), while the new version has EvalFuncOnGrid( float^(float) ). With the Python and C function pointer versions, any function can be passed, which makes it possible that you might not have the appropriate argument number, or return value. The new version is much more like Haskell's type system, where this EvalFuncOnGrid function would have the type (float -> float) -> nil or something similar. The type signature float -> float is a type signature for a function that takes a single float argument to float value. These new typed function arguments provide for some interesting, and confusing, type signatures, such as these ones in C:

void SomeFn( float^(float^(float^(float))) )
(its equivalent in Haskell being roughly (((float -> float) -> float) -> float) -> nil)

or

float^(float^(float^(float))) SomeFn()
(in Haskell: nil -> (((float -> float) -> float) -> float))

This is obviously an enormously confusing type signature, but it's not uncommon when dealing with higher order functions. Consider an imaginary function that, when given some function A, creates another function B that uses A to sort an array. This might have the type signature void^(void*) Sorter( void^(void*) ) (Haskell: (float -> float) -> (float -> float)). The C type system is now a whole lot more interesting. :)

GCC and end-user effects

Well, I read the original announcement from Chris Lattner, and I just thought I'd mention, that GCC seems to already have built-in support for blocks, so no compiler re-write again.
Nobody seems to have mentioned the actual use of this new blocks-featurette. I believe it is not only flexible for programmers, and adding more power. But it also makes the programs rund smoother and consistently faster, as less memory becomes unused. In combination with garbage collection, we are one step before idealising our nice language Objective-C. With this in mind, the performance should run much safer too, though we shouldn't forget, the easier it is for the programmer, the easier it is for the Black Hat!

Grand Central Dispatch

Nobody seems to have mentioned the actual use of this new blocks-featurette.

I can tell you why Apple is going this route right now: Check out the 'Multicore' section of Apple's Snow Leopard preview page:

"Grand Central takes full advantage by making all of Mac OS X multicore aware and optimizing it for allocating tasks across multiple cores and processors."

Now substitute 'blocks' for 'tasks'. Yeah, like that. Neat, huh?

So, ultimately, you have a machine with n cores. You run a computation, but instead of putting code in a plain ol' C loop, you use blocks to implement the logic. You hand off that block either to something like a block-aware enumerator on an Objective-C object, or directly to the block dispatch routines of the Grand Central API. Now, because your loop code is block-based, Grand Central can run n loop iterations at a time — even when the blocks access local variables belonging to the function which defined the block. Because you're writing block-based code, you've arranged things such that each block is completely independent of those either side (otherwise, you're probably not suited to blocks).

Oh, and the best thing? Because Grand Central Dispatch is sitting on top of the mach thread/processor management APIs, once a block gets assigned to a processor/core it stays on that core, even when the current task/process is swapped out and back in by the mach task scheduler. That means that it is highly unlikely that the code within the block will ever need to touch main memory or swap data between each CPU core's level 1 caches. Which means that the code in the block can potentially run at the CPU/core's full processing speed, with no blocking for memory IO. It also means that performance will scale in a close-to-linear manner with the number of cores available, whereas currently as you add more cores, you get more overhead in synchronizing memory and cache access between all those cores. By separating everything and locking code blocks to a single specific core with its associated cache, Grand Central Dispatch can sidestep that problem really quite nicely.

In response to Alan

In response to Alan Quartermain, the idea of bundling things up into little self contained closure functions that perform all the requisite computation for a particular task without needing arguments, etc. (called "thunks") is old hat in languages like Scheme or Haskell, or any other language with closures. Their use in parallelization is one of the most appealing features of Haskell, etc. as well. One of the problems with doing this, however, is that if you program in a non-functional fashion (that is, if those thunks modify their closure environment), any other thunks dependent on the closed-over variables may not work in expected ways.

Consider some function where a thunk is generated for an array, each element being closed over trivially, like say..

for (i = 0; i < length(arr); i++) {
thunks[i] = int^(int) {
return arr[i] + some_variable_in_the_higher_scope++;
};
}

or something like that.

If the array is, say, [1,2,3] and the some_variable_in_the_higher_scope is 0 at the beginning, the distributed evaluation of the functions in thunks has 6 possible outcomes, depending on which thunk is executed first. If they're executed in order as they appear in thunks then the result will be [1,2,4], if they're executed in reverse order the result is [3,3,3], etc.

This is one reason why languages that intend to use thunks for parallelization (Haskell, Erlang, etc.) tend to be purely functional — that is to say, they don't allow you to set variables more than once (or at all, if you don't consider argument-value binding to be "setting" a variable). Because C is definitely not purely functional, issues related to concurrency, etc. will result from this addition to the language, so code careful.

Global functions are closures??

To psygnisfive: can global functions accessing global variables really considered closures? could you clarify the behavior of this (assuming the following in defined in the global scope):


int some_global_var = 1;

void some_global_function ()
{
   printf("%d\n", some_global_var);
}

some_global_function();
some_global_var = 2;
some_global_function();

I thought closures meant that both calls would give the same output, and that the closures would somehow 'save' the values of higher scope variables at the point where the function is defined. But I believe the above will print 1 then print 2?

Sorry I have not had much formal education in programming, so I am just trying to make sure I understand these concepts :-)

Yeah Yeah...

Lisp has had first class anonymous functions and dynamic variables since the 50s. Wake me when there's something new.

Meanwhile everyone else can download a free personal edition of LispWorks and go crazy.

--
Gorbag

Another interesting look at implementing Smalltalk-esque blocks

Another interesting perspective is that of the Portable Object Compiler team. They have a paper on the topic of Objective-C Blocks penned in 1998 in which they go over the design and implementation of Smalltalk like blocks in their compiler. It's a really detailed and interesting perspective on how to add blocks to the language and get as close to Smalltalk blocks as possible without interfering with their C underpinnings.

______________________________________________

Christopher Roach
http://christopherroach.com

In response to cparnot

No, closures don't have to always return the same values. Closure merely means that a function has access to the environment it was defined in (or some other specified environment). Whether or not that environment can change depends entirely on the language in question. Scheme has mutable environments, Haskell doesn't, so in Scheme multiple calls can result in different values, but in Haskell this isn't the case at all.

What makes it a closure is merely the fact that variables defined outside the function are accessible from within the function. That's a relatively trivial thing, people in C do this intuitively with global variables. The magic is when you can do it in some scope other than the global scope, like the Python example I gave above. Now, you'd be correct in saying that the variables can't be accessed outside the scope of the function in which they're defined, but that doesn't mean that functions also defined in that scope can't access them.