Thursday, August 11th, 2011

<script type="text/javascript">
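function addBlink() {
   var blinks = document.getElementsByTagName("blink");
   var timeoutLength = 300;

   // Toggle the visibility of every <blink> element, then schedule the next toggle
   var blink = function () {
      for (var i = 0; i < blinks.length; i++) {
         var b = blinks[i];
         b.style.visibility = (b.style.visibility === "visible" ? "hidden" : "visible");
      }

      setTimeout(blink, timeoutLength);
   };

   // Only start the timer if the page actually contains <blink> elements
   if (blinks.length > 0)
      setTimeout(blink, timeoutLength);
}

window.onload = addBlink;
</script>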

Offered without comment except to note that there’s no reason to expect this to work on non-modern browsers. I don’t have that kind of time.

In a recent episode of Hypercritical, Messrs. Benjamin and Siracusa revealed the secret for disabling a system-wide window animation that Apple added in their most recent OS release. I’ve seen these sorts of tips (where you open Terminal and use a command to update a system-wide, undocumented preference value) before, but I never really gave much thought to how people discover them. If I had to guess, I’d have said that some engineer at Apple told a few friends and then it spread.

But, it turns out that there’s a more methodical way to discover these things. And Siracusa dropped the clue in that podcast. He said that Mr. Franzén had swizzled some methods. “Ah hah!” I said. “So that’s the secret!” And then I said, “Well, duh. Why didn’t I think of that myself?! How else would it work?!”

But even then, I only knew this secret in the abstract. I knew how to swizzle methods in Objective-C, but I’d never really used it for anything. So I decided to investigate. Allow me to drop some knowledge. I’ll start with some background.

The main application development language for OS X is Objective-C. Objective-C is a strict superset of the older language C with object support added to the C base in a backwards compatible fashion. One of the most interesting things about Objective-C is that it’s a dynamic language even though it’s also a compiled language (unlike the popular dynamic languages Ruby and Python, which are usually interpreted). This dynamic nature lets the language do some cool things. Indeed, by using things like Key-Value Observing, many developers take advantage of Objective-C’s dynamic nature without even realizing it. A lot of the things that Cocoa developers miss in other languages turn out to be a result of Objective-C’s dynamic nature.

Now, you know how Objective-C is built on top of C? This goes all the way down to the very core of the language. In the end, Objective-C objects are really just C structs and Objective-C methods are really just pointers to standard C functions. All of this is managed invisibly by the Objective-C runtime. But it doesn’t have to be invisible: the full power of the runtime is available to any application developer by importing <objc/runtime.h>. I’m going to use this power to swap out some Apple code with my own: but which code?

These secret preferences (and particularly the one I’m interested in at the moment) are usually given in the form of terminal commands which write a value to the system defaults database, the preference database system that Mac OS X provides. For Objective-C programs, Apple has provided the NSUserDefaults class for talking to this database. NSUserDefaults has several instance methods for retrieving defaults values like objectForKey:, stringForKey:, integerForKey:, etc. Since I already knew I was looking for NSAutomaticWindowAnimationsEnabled and this sure looks like a boolean value, I decided to start by swizzling boolForKey:.

First, of course, I need a fake boolForKey: to swizzle out. I’m going to implement this as a regular old C function and take advantage of the fact that every Objective-C method is really a standard C function with a hidden id and SEL parameter tacked on to the beginning of the parameter list. The id is the object that the method is being called on (in fact, when you access self in an Objective-C method, you’re really just using this hidden parameter!) and the SEL is the selector that was originally used to call the method.
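To make the hidden-parameter idea concrete, here is a minimal sketch in plain C (which is also valid Objective-C). It is a toy model and not the real runtime: ToyObject, ToySel, ToyImp, toy_send and the "AnimationsEnabled" key are all made-up names for illustration, and real message dispatch is considerably more involved.

```c
#include <assert.h>
#include <string.h>

/* Toy stand-ins for id, SEL and IMP. Hypothetical simplifications,
   not the definitions found in <objc/runtime.h>. */
typedef struct ToyObject ToyObject;
typedef const char *ToySel; /* a selector is just a name in this model */
typedef int (*ToyImp)(ToyObject *self, ToySel cmd, const char *key);

typedef struct {
    ToySel sel; /* which message this entry answers */
    ToyImp imp; /* the plain C function that implements it */
} ToyMethod;

typedef struct {
    ToyMethod methods[4];
    int method_count;
} ToyClass;

struct ToyObject {
    ToyClass *isa;          /* the object is a struct that points at its class */
    int animations_enabled; /* an "instance variable" */
};

/* A "method": an ordinary function whose first two parameters are the
   receiver (what a method body sees as self) and the selector used. */
static int toy_boolForKey(ToyObject *self, ToySel cmd, const char *key)
{
    (void)cmd; /* unused here, but always passed */
    if (strcmp(key, "AnimationsEnabled") == 0)
        return self->animations_enabled;
    return 0;
}

/* A rough model of sending a message: find the function pointer registered
   for the selector and call it with self and cmd filled in. */
static int toy_send(ToyObject *obj, ToySel sel, const char *key)
{
    ToyClass *cls = obj->isa;
    for (int i = 0; i < cls->method_count; i++) {
        if (strcmp(cls->methods[i].sel, sel) == 0)
            return cls->methods[i].imp(obj, sel, key);
    }
    assert(0 && "unrecognized selector");
    return 0;
}
```

Given a ToyClass whose table maps "boolForKey:" to toy_boolForKey, a call to toy_send(&obj, "boolForKey:", "AnimationsEnabled") ends up calling toy_boolForKey(&obj, "boolForKey:", "AnimationsEnabled"). In this model, swizzling amounts to changing which function pointer sits in that table.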

I’ve also created a static IMP called realMethodImplementation_boolForKey. When I swizzle the methods, I’ll store the original in this variable so my fake method can call it to hand the real value back to the original caller. IMP is just a function pointer to a function that returns an id and has an id and SEL as the first two parameters, so I can call the realMethodImplementation_boolForKey just like I would a normal function.

#import <Foundation/Foundation.h>
#import <objc/runtime.h>

//Apple's original boolForKey: implementation, cached when we swizzle
static IMP realMethodImplementation_boolForKey = NULL;

BOOL fakeBoolForKey(id s, SEL cmd, NSString *defaultName)
{
    NSString *result = @"No real method was found"; //The string that we'll log to the console
    BOOL r = NO; //The value we'll return to the caller
    //If we cached a real method, call it (remember, IMP is just a function pointer). 
    if(realMethodImplementation_boolForKey) { 
        //IMP is declared as returning an id, so cast it to the method's real signature before calling it
        BOOL (*realBoolForKey)(id, SEL, NSString *) = (BOOL (*)(id, SEL, NSString *)) realMethodImplementation_boolForKey;
        r = realBoolForKey(s, cmd, defaultName);

        //Pretty-print whatever we got back from the real function
        result = r ? @"YES" : @"NO"; 
    }
    NSLog(@"boolForKey: %@ and got answer: %@", defaultName, result);
    return r;
}

Once I have my custom implementation, I can write a function to actually swap out Apple’s boolForKey: for mine. You can see here where I save the IMP to Apple’s function before swapping it out for mine with class_replaceMethod(). The only other interesting thing here is that class_replaceMethod wants the method signature encoded as a char*. I think you should be able to do this with the @encode() directive, but I’m not sure how everything needs to line up. Since I already had the real method handy and my signature was identical, I just got the encoding from the real method and passed that in to class_replaceMethod.

void swizzleBoolForKey(void) 
{
    //We want to swizzle boolForKey: on NSUserDefaults, so get the basic metadata for those
    Class cls = [NSUserDefaults class];
    SEL selector = @selector(boolForKey:);
    //Convert the selector into a string so we can log which method we're swizzling
    NSString *selectorString = NSStringFromSelector(selector);
    //fakeBoolForKey(...) is the function we want to call instead of whatever Apple wrote. Turn it into 
    //an IMP (which is really a function pointer, so a simple cast will suffice)
    IMP newImplementation = (IMP) &fakeBoolForKey;
    //Lookup the method that Apple wrote
    Method realMethod = class_getInstanceMethod(cls, selector);
    //Save the function pointer for the method that Apple wrote so we can call it from the swizzled function
    realMethodImplementation_boolForKey = method_getImplementation(realMethod);
    //Try to replace Apple's IMP with ours. method_getTypeEncoding returns a char* encoding of the method's signature and 
    //parameters. Since we're swizzling out a method with identical parameters and signature, just get the type encoding from 
    //the real method. 
    if(!class_replaceMethod(cls, selector, newImplementation, method_getTypeEncoding(realMethod))) {
        NSLog(@"Could not replace %@", selectorString);
    } else {
        NSLog(@"Replaced %@!", selectorString);
    }
}
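Stripped of the runtime calls, the save-then-replace pattern is small enough to model in plain C (which is also valid Objective-C). This is only a sketch under made-up assumptions: method_slot stands in for the class's method table entry for boolForKey:, and every name here is hypothetical. It also guards against swizzling twice. swizzleBoolForKey() gets away without a guard as long as it is called exactly once, but a second unguarded swap would save the fake as the "original" and the fake would then call itself forever.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same shape as a method implementation: receiver, selector, then arguments. */
typedef int (*BoolForKeyFn)(void *self, const char *cmd, const char *key);

/* Stand-in for the implementation Apple wrote. */
static int original_boolForKey(void *self, const char *cmd, const char *key)
{
    (void)self;
    (void)cmd;
    return strcmp(key, "AnimationsEnabled") == 0;
}

/* Stand-in for the class's method table entry for boolForKey:. */
static BoolForKeyFn method_slot = original_boolForKey;

/* Where the swizzle caches the original so the fake can call through. */
static BoolForKeyFn saved_original = NULL;

/* What the fake "logs": how many lookups it saw and the most recent key. */
static int call_count = 0;
static char last_key[64] = "";

static int fake_boolForKey(void *self, const char *cmd, const char *key)
{
    call_count++;
    strncpy(last_key, key, sizeof last_key - 1);
    last_key[sizeof last_key - 1] = '\0';
    /* Hand the real answer back so callers never notice the swap. */
    return saved_original ? saved_original(self, cmd, key) : 0;
}

static void swizzle_boolForKey(void)
{
    if (saved_original != NULL)
        return; /* already swizzled; swapping again would cause the recursion described above */
    assert(method_slot != NULL);
    saved_original = method_slot;  /* save first... */
    method_slot = fake_boolForKey; /* ...then replace */
}
```

After swizzle_boolForKey() runs, every call through method_slot reaches fake_boolForKey first, which records the key and then forwards to the saved original.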

For my first pass, I had a little window with a “Swizzle!” button on it. Clicking the button would call my swizzleBoolForKey() function. And then I’d watch my console as…nothing happened. By the time I was able to click a button, the system had already loaded all of the preferences it needed. I had to get my swizzle in earlier.

So I did something you almost never do when writing a Mac app. I did something you’re not really even supposed to do. I opened up main.m and put my call to swizzleBoolForKey() directly before the normal NSApplicationMain() call. Once the swizzle was literally the first thing that happens in the program, I started seeing things fly by on the console as a bunch of boolForKey: messages got logged. But, despite seeing tons of fun settings (like NSOnlyFlipFontsWithIdentityMatrix, whatever that does), I didn’t see anything about window animations. Hmm.

Since I already knew what I was looking for, I knew it had to be there. But I wasn’t seeing it. It was hiding. So I decided that my guess that it would show up in boolForKey: was wrong. They must be doing something like retrieving it as an integer and seeing if it’s 1 or 0. Or something like that. I didn’t know, so I realized that I had to do something drastic. I’d have to swizzle all of the {type}ForKey: methods.

I didn’t want to write a new fake function for each of these methods (there are 11 in all), so I decided that my second attempt would be a bit more generic. My new function would take advantage of the cmd argument to see which selector had been used to call it. It could then use that both to log which method it was pretending to be and to look up the real implementation in a static NSDictionary (again, I’ll save it there when I actually swizzle the method). Since an IMP is really a pointer and NSDictionary cannot store raw pointers, I have to wrap it in an NSValue.

Perhaps more importantly, I don’t really know what the return type will be (a float? A BOOL? An object? Who knows?) so I don’t have a good way to format the result. So I decided to just not output that.

id fakeAnythingForKey(id s, SEL cmd, NSString *defaultName)
{
    //Translate the selector to a string so we can log which method got called and 
    //so we can look up the real method implementation in our dictionary
    NSString *selectorString = NSStringFromSelector(cmd);
    NSLog(@"Called %@ for %@", selectorString, defaultName);
    //Look up the real implementation method in our dictionary using the selector string as key. 
    //IMP is a pointer, so we were able to store it in an NSValue
    NSValue *realImpValue = [realMethodImplementations valueForKey:selectorString];
    IMP realImp = NULL;
    //IMP is typed to return id
    id r = nil;
    //Check the NSValue we looked up: realImp is still NULL at this point
    if(realImpValue) {
        realImp = (IMP) [realImpValue pointerValue]; //Get the function pointer value from the NSValue and call it
        r = realImp(s, cmd, defaultName);
    }
    //Since I don't know if r is really an object or a primitive (were we objectForKey: or floatForKey:?),
    //I don't want to log the output. If that were important, we could switch over the selector to format it
    //properly. Alternately, once we knew which method we were interested in, we could write a one-off swizzle 
    //like we did for boolForKey:
    return r;
}
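The lookup-by-selector trick can be modeled in plain C too (again, also valid Objective-C). This is a hedged sketch, not the runtime's behavior: method_table, saved_table, swizzle_for_key and toy_send are hypothetical names, and every toy method returns a long so that one fake can stand in for all of them. The real {type}ForKey: methods don't share a return type, which is exactly why fakeAnythingForKey can't safely format its result.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Every toy "method" shares one signature; the real {type}ForKey: methods don't. */
typedef long (*ForKeyFn)(void *self, const char *cmd, const char *key);

typedef struct {
    const char *sel; /* selector name, used as the lookup key */
    ForKeyFn imp;
} Entry;

static long real_boolForKey(void *self, const char *cmd, const char *key)
{
    (void)self; (void)cmd; (void)key;
    return 1;
}

static long real_integerForKey(void *self, const char *cmd, const char *key)
{
    (void)self; (void)cmd; (void)key;
    return 42;
}

/* The "class": selector name -> currently installed implementation. */
static Entry method_table[] = {
    { "boolForKey:", real_boolForKey },
    { "integerForKey:", real_integerForKey },
};
enum { METHOD_COUNT = sizeof method_table / sizeof method_table[0] };

/* Originals saved by the swizzle, keyed by the same selector names. */
static Entry saved_table[METHOD_COUNT];
static int saved_count = 0;
static int calls_seen = 0; /* stands in for the NSLog line */

static ForKeyFn find(const Entry *table, int count, const char *sel)
{
    for (int i = 0; i < count; i++)
        if (strcmp(table[i].sel, sel) == 0)
            return table[i].imp;
    return NULL;
}

/* One fake for every selector: cmd says which method it is pretending to be. */
static long fake_anythingForKey(void *self, const char *cmd, const char *key)
{
    calls_seen++;
    ForKeyFn real = find(saved_table, saved_count, cmd);
    return real ? real(self, cmd, key) : 0;
}

/* Save the original under its selector name, then point the slot at the fake.
   Returns 1 if a swap happened, 0 if not. */
static int swizzle_for_key(const char *sel)
{
    for (int i = 0; i < METHOD_COUNT; i++) {
        if (strcmp(method_table[i].sel, sel) != 0)
            continue;
        if (method_table[i].imp == fake_anythingForKey)
            return 0; /* already swizzled */
        assert(saved_count < METHOD_COUNT);
        saved_table[saved_count] = method_table[i];
        saved_count++;
        method_table[i].imp = fake_anythingForKey;
        return 1;
    }
    return 0; /* no such method */
}

/* Toy message send: look the selector up and call whatever is installed. */
static long toy_send(const char *sel, const char *key)
{
    ForKeyFn imp = find(method_table, METHOD_COUNT, sel);
    return imp ? imp(NULL, sel, key) : 0;
}
```

Keying the saved table by selector name is what lets a single fake serve any number of swizzled methods: the only per-method state is the saved pointer.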

I then knocked up a quick generic swizzling function. The most interesting thing here is where I shove the IMP into an NSValue (and that’s not particularly interesting).

void swizzleForKeyMethod(SEL selector) 
{
    //We want to swizzle selector on NSUserDefaults, so get its class metadata
    Class cls = [NSUserDefaults class];
    //Convert the selector into a string so we can log which method we're swizzling and use it as a 
    //dictionary key for saving the real IMP to call from the swizzled method
    NSString *selectorString = NSStringFromSelector(selector);
    //We want to replace Apple's method with a fakeAnythingForKey. IMP is a function pointer so a simple cast
    //will suffice
    IMP newImplementation = (IMP) fakeAnythingForKey;
    //Lookup the method that Apple wrote
    Method realMethod = class_getInstanceMethod(cls, selector);
    //Get the function pointer for Apple's method and save it to an NSValue. Remember to use the alloc-init
    //pattern to avoid creating an autoreleased object with the convenience methods
    IMP realMethodImplementation = method_getImplementation(realMethod);
    NSValue *realMethodImplementationValue = [[NSValue alloc] initWithBytes:&realMethodImplementation objCType:@encode(IMP*)];

    //Store the implementation of the real method into a global dictionary with the selector string as the key
    [realMethodImplementations setValue:realMethodImplementationValue forKey:selectorString];
    //Try to replace Apple's IMP with ours. method_getTypeEncoding returns a char* encoding of the method's signature and 
    //parameters. Since we're swizzling out a method with identical parameters and signature, just get the type encoding from 
    //the real method. 
    if(!class_replaceMethod(cls, selector, newImplementation, method_getTypeEncoding(realMethod))) {
        NSLog(@"Could not replace %@", selectorString);
    } else {
        NSLog(@"Replaced %@!", selectorString);
    }
}

And that’s it! As the first thing in main(), I can just call something like swizzleForKeyMethod(@selector(boolForKey:)) and I’ll have a swizzled method. Once I swizzled all 11 of the NSUserDefaults {type}ForKey: methods, I started seeing tons of stuff fly across my console. One of them (well, five of them for some reason) was an objectForKey: call for NSAutomaticWindowAnimationsEnabled. If I had been doing the original research, I could have done a quick Find for something like “anim” and I’d have found it. Then I could’ve been the big Lion hero instead of this Tomas Franzén guy. Alas. I don’t know why they’re using objectForKey: instead of boolForKey:. If you have any insight into that, I’d love to hear it.

And that’s how to sniff out secret, undocumented preferences on OS X. It’s mostly not necessary, of course, since the interesting ones will show up on the Internet anyway. If nothing else, I hope you’ve enjoyed this look at the power of Objective-C.

I’ve uploaded a sample project you can play with. Unlike just about every other Cocoa project out there, all of the interesting code is in main.m. I learned about Objective-C method swizzling from Scott Stevenson and Mike Ash. You can probably learn a lot from them too.

Friday, July 22nd, 2011

This morning, my boss said something to the effect of: “This test-server is pretty under-powered. It still only has 16 gigs of RAM.”

Yesterday, I saw an estimate that Apple shifted almost 4 petabytes during the first twenty-four hours after they released their latest operating system on the Internet. The petabyte was a unit that my beloved spouse had never heard of before.

I’ve recently started using Spotify, which is a music service that gives me instant access to any one of 15 million songs.

I currently follow 404 people on Twitter. These people represent nations, professions, and hobbies that span the globe and the gamut from high-tech to iron-age-tech. I’m even fortunate to call some large percentage of that 404 friends — even people that I’ve never actually met.

The sheer scale of technology is increasing at almost frightening speeds. More and more, it seems that our improvements are running into a wall created by the laws of physics: from heat dissipation in microchips to the size of magnetic filings on hard disks. I read an article the other day noting that we’re getting to the point in our fiber optic lines where the pulses of laser light are so short that they start to blur into each other.

This is a far cry from the DEC Rainbow I first used while sitting on my dad’s knee. And if I’m not careful, I’ll forget just how much magic is in the things I use every day.

I try not to forget, though. Because that sense of childlike wonder makes my own job even better than it already is.

Saturday, July 16th, 2011

I was perusing the threads at my favorite message board when I came across a thread discussing Word vs. LaTeX. I’ve always liked the idea of LaTeX but since I don’t do much writing, it’s always been far more powerful than I need. That power leads to complexity that I have to learn and manage with no gain. So I usually just turn to a word processor when I need to bang out some paragraphs. And, of course, I always get supremely frustrated with trying to use styles in Word or Pages. They work, but they’re terrible.

So I asked myself, “Why can’t writing for print be as simple as styling my blog posts with CSS?” I typed something similar into Google and found A List Apart’s article about writing their book in CSS.

So, people who actually produce books are thinking about it. Neat! I’m sure there’s a lot more work that could be done to ease the workflow, but the basics seem pretty solid. The only sticking-point is that the tool they use to go from CSS+HTML to PDF is an expensive piece of commercial software. I went through the article assuming it would be open-source. But alas, no.

I can’t fault anyone for making money with their hard work (I enjoy profiting from my company’s software, after all!), but I don’t think it’s something I’ll be able to justify playing around with any time soon.

But, if I were a professional publishing type, I’d take a hard look at it. It feels like this sort of workflow could fit in a sweet spot between Word and InDesign.

As an aside, CSS3 is really powerful. I’m really looking forward to increased browser support for some of this stuff. And I’m not even really a web developer!

I was recently fortunate enough to attend some Microsoft SQL Server training on my employer’s dime. The training had been billed to me as performance training, but it ended up being a SQL Server developer deep-dive instead. Performance was certainly a major topic, but we covered a lot of other things as well.

Most of my “training” with relational databases (and SQL Server in particular) has been pretty ad-hoc and cargo-cult oriented. I did take an Oracle class in college, so I was at least familiar with thinking in sets when I started working here, but the particular idioms and characteristics of SQL Server have mostly come from watching what the more experienced programmers in my workplace have been doing.

Which is to say: I had some things to learn. I’m not entirely sure that I took away the right things from this training, though. Previously, my relationship with SQL Server has been friendly. We’ve been colleagues working together to retrieve data correctly and quickly for our customers. Now, though? I see SQL Server as a devious force working to undermine me at every turn.

I want to share a few examples. Partly to cement these things in my head and partly to whinge. These are in no particular order.


SQL Server does not support nested transactions. It looks like it supports nested transactions. It tells you it can support nested transactions. But it cannot support nested transactions. Observe.

create table dbo.A ([KEY] int, [VALUE] nvarchar(100));
create table dbo.B ([KEY] int, [VALUE] nvarchar(100));

begin transaction FIRST
insert into dbo.A ([KEY], [VALUE]) 
	values (10, 'hello'), (11, 'goodbye'), (42, null);

--Doesn't actually start a new transaction 
begin transaction SECOND 
insert into dbo.B ([KEY], [VALUE]) 
	values (1, 'one'), (2, 'two'), (3, 'four'); 

--Rolls back the inserts to A and B
rollback transaction;

--Returns an empty result set because the transaction 
--was rolled back 
select * from dbo.A;

--Throws an error message because there's no 
--longer a transaction running. It's already been rolled back. 
commit transaction;

The good news, if you want to call it that, is that it works the other way, too. If you commit the inner (“second”) transaction and then roll back the outer (“first”) one, both inserts will be rolled back. Take a look at the docs for @@TRANCOUNT to see how it all works (“begin transaction” increments @@TRANCOUNT, “commit transaction” decrements it, and “rollback” undoes everything and sets @@TRANCOUNT to 0. It’s a mess). The take-away, I think, is that nested transactions must be considered harmful, even though you can probably get away with them as long as you only ever commit in the “inner” transactions and keep your error handling and rolling back in your outermost code. It still seems too fragile, though.

Temp Tables

Unfortunately, it gets worse. Take a look at this:

create procedure dbo.A 
as
begin
	create table #temptable (thing1 int, thing2 int); 
	insert into #temptable values (42, 43);
	exec dbo.B;
end
go

create procedure dbo.B
as
begin
	create table #temptable (anotherThing nvarchar(100));
	insert into #temptable values ('This One Here');
	select * from #temptable;
end
go

--This won't work: (depending on what's found its way into your proccache)
exec dbo.A 

When I ran this, SQL Server threw the 213 message “Procedure B, Line 5 Column name or number of supplied values does not match table definition.” But we can clearly see that the insert in dbo.B is fine. It gets worse, though. If I clear the procedure cache (dbcc freeproccache) and then run

exec dbo.B;
exec dbo.A;

everything works fine. When you execute A first, SQL Server creates a plan for inserting into #temptable. Then, when it goes into A’s call to B, it looks for a plan for inserting into #temptable. But even though it’s a completely different #temptable, it thinks the plan it had for A’s #temptable will work. But the number of arguments is wrong. So that doesn’t work at all.

The second time around, I called B first. Since SQL Server already had a plan in cache for running B, it didn’t try to generate another one. So that worked. Of course, if in the fullness of time B’s plan ends up dropping out of the cache…well, B will break again. Randomly. Through no fault of its own (or its developer’s).

So temp tables are pretty unsafe as well. I think the only sane way to use temp tables is to enforce strict naming discipline. If I’d named them #dbo_a_temptable_1 and #dbo_b_temptable_2 there wouldn’t have been an issue without A being actively malicious (and if you’re letting someone write actively malicious code on your database, this is the least of your worries). Still, this sort of discipline isn’t something I see documented very often. That’s pretty unfortunate.

Miscellaneous Evils

Then there are the smaller things that I already knew in the back of my mind but seeing them all at once in a formal class sort of drove home just how programmer-unfriendly SQL Server can be.

I had a vague idea that scalar functions were slow and that inline (or “single statement”) table-valued functions were fast. It was driven home to me just how slow scalars are, though: so don’t use them (I’d actually just spent a week or two converting a really slow query that used a scalar function to a nice and speedy inline table function, so this didn’t really come from the class). The instructor was also not fond of multi-statement table functions, preferring SPs instead (since that’s how these functions are actually run in the database). Personally, I think there’s a lot to be said for being able to join directly to the table-valued functions, though. That said, they can only return table variables and those can only have at most one index (they’ll become a clustered index if you put a clustered primary key on them when you declare them). So if your function is returning a lot of data and an index would be helpful, all there is for it is to use an SP to toss data into a real table (or a temp table, I suppose) and join to that instead.

Parameter sniffing came up a lot in the class. If you have something like

create procedure dbo.FIND_ENTITY(@ENTITYTYPE tinyint, @ENTITYNAME nvarchar(100))
as
	--Assumed encoding: 0 means "person", anything else means "company"
	if @ENTITYTYPE = 0
		select * from dbo.PEOPLE where NAME like @ENTITYNAME;
	else
		select * from dbo.COMPANIES where NAME like @ENTITYNAME;

you’ll almost certainly be running the query with the wrong plan some of the time based on what’s in the cache. If the procedure is planned out for finding people, the query will be optimized for companies with names like “John Smith”. If the procedure is planned out for finding companies, the query will be optimized for people with names like “Microsoft”. Since the optimizer is dependent on a statistical analysis of what the results are likely to be (in order to decide, say, between a merge join or a nested loop join), optimizing based on the wrong parameters can be devastating. And it all comes down to which version is used when the SP isn’t in the cache.

The solution is to do something like

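create procedure dbo.FIND_ENTITY(@ENTITYTYPE tinyint, @ENTITYNAME nvarchar(100))
as
	--Assumed: 0 means "person"; the two sub-procedure names are hypothetical
	if @ENTITYTYPE = 0
		exec dbo.FIND_PEOPLE_SP @ENTITYNAME;
	else
		exec dbo.FIND_COMPANIES_SP @ENTITYNAME;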

since the separate SPs won’t have plans generated for them until they’re actually executed (and when they’re executed, they’ll have been called with the “right” @ENTITYNAME for that use). You could also use option recompile to make the procedure recompile every time, but that might not be a great solution for procedures you expect to get called hundreds of times a second. Still, it just feels sneaky.

You might think that you can be clever and combine this trick with the fact that inline table functions are fast. So you try to do

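create procedure dbo.FIND_ENTITY(@ENTITYTYPE tinyint, @ENTITYNAME nvarchar(100))
as
	--Assumed encoding: 0 means "person", anything else means "company"
	if @ENTITYTYPE = 0
		select * from dbo.FIND_PEOPLE(@ENTITYNAME); --Inline table-valued function
	else
		select * from dbo.FIND_COMPANIES(@ENTITYNAME); --Inline table-valued function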

But the key to an inline table function is that they’re just that: inline. SQL Server sees that procedure exactly as if you’d copied and pasted the text from the functions into the procedure. So you end up with the same problem as before. So SQL Server would defeat your cleverness in this case.

Multi-statement functions are fine, though, allowing for the fact that the data they return won’t be indexed. I’m not sure it’s fair that the function can look identical to the caller, but have such completely different performance characteristics.


And that’s really my problem with most of this. I didn’t come up in the mainframe days. I’ve never punched cards. I don’t worry too much about what my processor is doing. I’m a high-level programmer raised in a high-level programming world. My entire education and most of my career has taught me the value of black box programming. I should be able to call a function and, as long as it obeys its contract, everything should be fine. Anyone should be able to call the function I’m writing, and as long as they’re obeying their contract, everything should be fine. I don’t have to know the details of my callers and I don’t have to know the details of what I’m calling.

SQL Server doesn’t really give me that though. I have to be familiar with everything that’s happening throughout the call stack. I have to know if it’s safe to begin or rollback a transaction (because someone else in the chain might want to start or commit one). I have to know what the safe names for my temp tables are. I have to know how everything I call was written so I can consider its performance correctly.

If I have something that’s best “wrapped up” as a scalar function (and doesn’t really lend itself to being a table function), I have to copy and paste that code into all my queries that can use it (or pay an outrageous performance penalty). So any bug fixes have to be made in all my queries instead of just one function. (Alternately, I guess, I can create a table function that returns one row and cross apply it in my other queries. But I try really hard to not confuse the guy coming after me.)

And that, really, seems to be the best way to achieve good performance in SQL: forget everything we’ve learned about programming in the last few decades and do all the things we thought we’d left behind.

I can understand a lot of the reasons why: I can understand the difficulties of ACID. I can understand the mishmash between procedural languages and declarative ones. I can understand the tradeoffs that have to be made between maintainability and speed. And because I can understand all of that, I can learn it and I can practice with it, and I can (hopefully) write professional applications built on this tool with a minimum of whinging.

That said, I can still wish it would warn me when I did things that aren’t so good. SQL Server demands perfection, but I’m just not there yet.