String Obfuscation

Online services and APIs are an inseparable part of most apps. Often they require a secret key to identify the subscribing client, usually in the form of a long string of alphanumeric characters. Invariably, it would be a bad thing™ for a malicious user to get their hands on this key. Perfect security is impossible, but there are some simple steps you can take to make it harder than trivially easy for snoopers to extract your API keys from your app.

strings: A Snooper’s Best Friend

There is a command line app called strings that is designed to scan binary files and print out anything it thinks is a string embedded within. Here’s the description from the man page:

Strings looks for ASCII strings in a binary file or standard input. Strings is useful for identifying random object files and many other things. A string is any sequence of 4 (the default) or more printing characters ending with a newline or a null.

Here’s a tiny bit of the output when I pointed it at the Photos application binary:

Burst favoriting action doesn't currently support the 'none' option
-[IPXChangeBurstFavoritesAction _revertFavoritingChanges]
***** Burst action: _revertFavoriting changes
Will undo/redo for Keep Selection option
Total: %ld. Trashed: %ld. Untrashed: %ld. Pick type set: %ld
Will undo/redo for Keep Everything option
Total: %ld. Fav: %ld. Unfav: %ld.
Warning: burst == nil
-[IPXChangeBurstFavoritesAction _setFavoritingOnVersion:stackPick:]
Invalid state: version exists in both favorite and unfavorite sets for action.
Burst Change - Favorite: %@
Burst Change - Unfavorite: %@
IPXChangeBurstFavoritesActionKey
IPXActionAlertThresholdMessageGeneric
IPXTestActionProgress
-[IPXActionProgressController endModal]
/Library/Caches/com.apple.xbs/Sources/PhotoApp/PhotoApp-370.42/app/spark/Source/Actions/IPXActionProgressController.m
-[IPXActionProgressController performActionSelector:]
Invalid selector
-[IPXActionProgressController checkModalSession]
Progress window still visible after action complete. Possibly hung? Action log:
appIcon

In the case of the Photos app, strings found 38675 string candidates. A lot of them were garbage, and there were literally thousands of Objective-C selectors, but there were also a lot of strings that were obviously never intended for user consumption. If it’s a string in your code, it will be found by strings and you can bet that someone snooping for API keys has pattern matching schemes that will make them trivial to find.

Obfuscation Basics

The easiest way to prevent strings from finding your API keys is simply to not include them as strings. However, do not think that putting a sequence of ASCII bytes into an array is going to help you. If your array’s bytes match the ASCII codes for the characters, you’ve just made a cumbersome string, and it will probably still be detected as such.

A good first step for obfuscation would be to mutate those bytes in some way so that they don’t all fall within the ASCII alphanumeric range. The two simplest, non-destructive ways of doing this would be:

  1. Invert the bytes by subtracting them from 255. So, a value of 10 becomes 245 and a value of 50 becomes 205, etc. Note: this is identical to using XOR with a nonce of 255.
  2. XOR each byte with a single-byte “nonce”, which is just a random number between 1 and 255 (XOR with 0 produces no change). XOR is a reversible operation: if you XOR with a given byte twice, you end up with your original value. In practice, you’d want to pick a nonce byte that has at least 3 of the 8 bits as 1s to ensure sufficient mutation of your API key bytes.

Then you would simply store the converted bytes in your app instead of the string and convert it back to the string by reversing the operation at runtime to produce the original string.
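As a minimal sketch of the single-byte approach (the helper name xorBytes, the nonce value and the placeholder key are all my own assumptions for illustration), the round trip could look like this:

```swift
// Hypothetical helper: XOR every byte with a single-byte nonce.
// Applying it twice with the same nonce restores the original bytes.
func xorBytes(_ bytes: [UInt8], with nonce: UInt8) -> [UInt8] {
    return bytes.map { $0 ^ nonce }
}

let secret = "abc123"        // placeholder, not a real API key
let nonce: UInt8 = 0xA7      // binary 1010_0111, five of eight bits set

// Done once, ahead of time: these are the bytes you would ship in the app.
let obfuscated = xorBytes(Array(secret.utf8), with: nonce)

// Done at runtime: reverse the operation to get the string back.
let recovered = String(decoding: xorBytes(obfuscated, with: nonce), as: UTF8.self)
```

Calling xorBytes with a nonce of 255 gives you the inversion described in option 1.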

To be quite honest, either of these approaches is probably good enough. But if you want to be more thorough…

Multi-byte XOR

If you are using the single-byte XOR approach from above, your API key would be safe from a simple strings search, but there are still only 255 ways you can possibly obfuscate the string and a really determined snooper might still be able to find it. Let’s make their job exponentially harder and use a multi-byte nonce!

The basis for this approach is a new Sequence type I created called RepeatingSequence. The general idea is that it initializes with any collection type and returns the elements in sequence, wrapping back to the first element once the last one has been emitted.

This lets us use a sequence of random bytes instead of just one. I created a Playground that you can use to generate a multi-byte nonce and use it to encode a string. Then, just include the byte array it prints out instead of the string in your app.
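The original type isn’t reproduced here, so treat the following as a rough sketch of how a repeating sequence could be built, not as my exact implementation. The nonce bytes and the string are placeholders.

```swift
// Sketch of a repeating sequence: it walks a collection endlessly,
// wrapping back to the first element after the last one.
struct RepeatingSequence<Base: Collection>: Sequence, IteratorProtocol {
    private let base: Base
    private var index: Base.Index

    init(_ base: Base) {
        self.base = base
        self.index = base.startIndex
    }

    mutating func next() -> Base.Element? {
        guard !base.isEmpty else { return nil }   // nothing to repeat
        let element = base[index]
        index = base.index(after: index)
        if index == base.endIndex { index = base.startIndex }
        return element
    }
}

// zip stops as soon as the finite sequence runs out, so pairing the
// string bytes with the endless nonce is safe.
let keyBytes: [UInt8] = [0x3C, 0xA5, 0x96]   // placeholder multi-byte nonce
let plain = Array("secret".utf8)
let encoded = zip(plain, RepeatingSequence(keyBytes)).map { $0 ^ $1 }
```

Never pass a RepeatingSequence directly to something that consumes the whole sequence (Array(...), for example); it will not terminate unless the base collection is empty.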

Reconstituted Bytes

Of course, that byte array isn’t going to do you any good unless you can turn it back into a string. Here’s a struct with a static method that does just that:
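The struct itself is missing from this copy of the post, so here is a hedged stand-in. The name Deobfuscator, its method signature and the sample bytes are assumptions made for the example; it indexes into the nonce with a modulo instead of relying on RepeatingSequence so that it stands alone.

```swift
import Foundation

struct Deobfuscator {
    /// Reverses a repeating-nonce XOR and decodes the result as UTF-8.
    /// Returns nil if the nonce is empty or the bytes are not valid UTF-8.
    static func string(from bytes: [UInt8], nonce: [UInt8]) -> String? {
        guard !nonce.isEmpty else { return nil }
        let decoded = bytes.enumerated().map { offset, byte in
            byte ^ nonce[offset % nonce.count]
        }
        return String(bytes: decoded, encoding: .utf8)
    }
}

// "secret" XORed with the placeholder nonce below.
let storedBytes: [UInt8] = [0x4F, 0xC0, 0xF5, 0x4E, 0xC0, 0xE2]
let storedNonce: [UInt8] = [0x3C, 0xA5, 0x96]
let apiKey = Deobfuscator.string(from: storedBytes, nonce: storedNonce)
```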

This code is pretty simple to incorporate into your workflow and can give you a lot of peace-of-mind that your app’s API keys won’t be trivially easy to steal.

Protocols, Default Implementations and Class Inheritance

Say you have the following setup:

  • A protocol named Doable that defines the doSomething() method.
  • A default implementation for the doSomething() method in a protocol extension.
  • A base class that conforms to Doable, but does not implement the doSomething() method itself.
  • A sub-class inheriting from the base class which provides a custom implementation of the doSomething() method.
  • An array of mixed base and subclass instances that is type [Doable].

The results of invoking doSomething() on all elements of the array may surprise you. When the for loop / reduce / whatever invokes doSomething() on a member of the array which is a subclass, you will not get the subclass’ custom implementation. Instead, you will get the default implementation!

When the runtime goes looking for doSomething() on the current object (of type Doable) in the loop, it looks to the object which actually conforms to the Doable protocol, which is the base class. The runtime checks to see if the class implements the method, and when it sees that the base class does not, it falls back to the default implementation, rather than seeing if the subclass implements it. Apparently, the subclass is only checked in instances where it is overriding a method explicitly defined on its superclass.

So, the solution is actually quite simple:

  • Provide an implementation of doSomething() on the base class. It can just be a duplicate of the default implementation, if that’s the behavior you want for it.
  • Change the subclass’ doSomething() implementation to include an override declaration.

That’s it! The next time you run your loop, the sub-class will have its doSomething() method called. I made a playground for you to check this out (turn on documentation rendering):
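If you don’t have the playground handy, this condensed sketch shows both the problem and the fix side by side. The class names are made up for the example, and doSomething() returns a String here (an assumption) purely so the results are easy to inspect.

```swift
protocol Doable {
    func doSomething() -> String
}

extension Doable {
    // Default implementation from the protocol extension.
    func doSomething() -> String { return "default" }
}

// Problem setup: Base conforms but does not implement doSomething().
class Base: Doable {}
class Sub: Base {
    func doSomething() -> String { return "sub" }   // not an override
}

// Fix: the base class implements the method and the subclass overrides it.
class FixedBase: Doable {
    func doSomething() -> String { return "default" }
}
class FixedSub: FixedBase {
    override func doSomething() -> String { return "sub" }
}

let broken: [Doable] = [Base(), Sub()]
let fixed: [Doable] = [FixedBase(), FixedSub()]

let brokenResults = broken.map { $0.doSomething() }   // Sub's version is skipped
let fixedResults = fixed.map { $0.doSomething() }     // FixedSub's version is used
```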

CIColorCube

The CIColorCube filter is quite an interesting beast. It is incredibly hard to set up properly, given the odd data requirement, but can recreate very complex color effects efficiently.

inputCubeData

The cube data is an NSData / Data object containing a sequence of floating-point color values where the red, green, blue, and alpha channels are represented not by the usual 8-bit UInt, but by 32-bit Floats. This is Core Image’s internal working color format, which allows much greater precision when mixing colors and reduces rounding errors. The size of the NSData must be precisely (size^3 * 4 * sizeof(Float)) bytes where size is one of the following: 4, 16, 64, or 256. That is to say, the width * height * depth * 4 color channels * the size of a Float (4 bytes).

The CIColorCube documentation describes the format the data should take:

In the color table, the R component varies fastest, followed by G, then B.

Using this rule, we can produce a reference image that looks like this:
[Image: colorcubeimage64]

Certainly not your standard spectrum image, but it’s designed for Core Image’s consumption not our aesthetic enjoyment.
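To make the ordering rule concrete, here is a small sketch that builds the values for an identity cube (one that leaves colors unchanged) as plain Floats. The function name is hypothetical and this is not the generator from my Gist; it only illustrates "R fastest, then G, then B" and the byte-count arithmetic.

```swift
/// Builds identity color-cube values as RGBA Floats in 0...1,
/// with R varying fastest, then G, then B. Assumes dimension >= 2.
func identityCubeValues(dimension: Int) -> [Float] {
    var values = [Float]()
    values.reserveCapacity(dimension * dimension * dimension * 4)
    let maxIndex = Float(dimension - 1)
    for b in 0..<dimension {              // slowest-changing channel
        for g in 0..<dimension {
            for r in 0..<dimension {      // fastest-changing channel
                values += [Float(r) / maxIndex, Float(g) / maxIndex, Float(b) / maxIndex, 1.0]
            }
        }
    }
    return values
}

let cube = identityCubeValues(dimension: 4)
// size^3 texels * 4 channels * 4 bytes per Float
let byteCount = cube.count * MemoryLayout<Float>.size
```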

PNGs are the key

One major problem encountered working with cube data as NSData / Data is that it is quite large. Color cube data with a dimension of 64 requires (64 * 64 * 64 * 4 * 4) = 4,194,304 bytes or 4 megabytes. Each color cube you store in your app consumes 4MB of storage, which is pretty excessive! Luckily, there is a better way.

While storing the color cube data as Floats might be more precise, it is almost never necessary to have that level of precision when defining a color effect. We can use PNG images to encode the data for the color cube in a much more efficient format. For example, the reference image I included above (which is for a size 64 cube) occupies only 8kB on disk!

The other primary benefit of storing the data as a PNG is that we can use readily-available bitmap editing programs like Photoshop to modify them. This is crucial unless you want the Color Cube filter to produce output that looks identical to the input.

Here is a Gist that has Swift 2 and Swift 3 versions of a class that can generate these reference images: https://gist.github.com/JoshuaSullivan/5951e08ff0f3e155ef52220a181864e8

Alternatively, if you just want to download and use the images, you can get them here: Color Cube Reference Images

Create your effect

The next step is to create a color effect. You can use any kind of color transformation you like on the reference bitmap. It is important not to use any distortion, blurring or other kinds of filters that would change the layout of the pixels, unless you’re interested in some extremely glitchy looking results.

In Photoshop, I find it handy to work on an actual photograph, applying the filters as layer effects until I have something I’m happy with. Then I simply copy the layer effects onto the reference image and save the result. Here are some examples I made for my Core Image Explorer app:

This is a very high-contrast B&W filter, approximating having a deep-red filter on the camera using black and white film.
[Image: highcontrastbwcolorcube]

This is an inverted B&W filter which recreates the “hot black” infra-red view of a scene.
[Image: hotblackcolorcube]

Changes the whole scene to shades of blue, as in a cyanotype photograph.
[Image: justblueitcolorcube]

Changes things to a low-contrast green filter that approximates the view through night vision goggles.
[Image: nightvisioncolorcube]

The only limit for creating your effect is what you can imagine and accomplish without any pixel rearrangement.

Apply the effect

The final step is to apply the PNG you have created to an image in your app. I’ve created a class which converts the PNG files to NSData / Data for the Color Cube filter:

https://gist.github.com/JoshuaSullivan/b575d96bfd8a4177d8d46352e5f36458

The usage is simple; at runtime, simply pass the static method your effect PNG as a UIImage and the color cube size that you’re using. The class will validate the image size and then attempt to convert the 8-bit per-channel PNG data into the 32-bit per-channel format that is required by Core Image.
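The heart of that conversion is just a size check plus a rescale from 0...255 to 0...1. The sketch below is a simplified stand-in for the class in the Gist, with a hypothetical function name; it assumes you have already extracted the raw 8-bit RGBA bytes from the image and that they are in cube order.

```swift
/// Converts 8-bit RGBA bytes into the 0...1 Floats the filter expects.
/// Returns nil when the byte count does not match the cube dimension.
func cubeValues(fromRGBA8 pixels: [UInt8], dimension: Int) -> [Float]? {
    let expectedCount = dimension * dimension * dimension * 4
    guard pixels.count == expectedCount else { return nil }
    return pixels.map { Float($0) / 255.0 }
}

// A 4x4x4 cube of opaque white, purely as sample input.
let whitePixels = [UInt8](repeating: 255, count: 4 * 4 * 4 * 4)
let whiteCube = cubeValues(fromRGBA8: whitePixels, dimension: 4)
```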

Once the filter has been created, you can use it with whatever input image you want, including video input. Color Cube is a very performant filter, so it is a fantastic way to include color transformations into a filter stack.

New Core Image filters in iOS 10

Unlike the tidal wave of new filters we saw in iOS 9, we only get 6 new filters in iOS 10. Here is our source image, a fetching portrait of Sir Jony Ive receiving his knighthood.

[Image: unmodified]

CIClamp

CIClamp is very similar to CIAffineClamp, except without the applied Affine Transform. The main input parameter is a rect which defines the image region which is unmodified. Everything outside that rect is just repeated edge pixels.
[Image: ciclamp]

CIHueSaturationValueGradient

This filter is capable of producing a color wheel of arbitrary size. Input parameters include the radius of the wheel and the color space used. Very handy if you want to create a color picker tool in your app. Note: The color wheel is bottom-left aligned within the image bounds because Core Image’s origin point is the bottom-left corner, not the top-left corner as in UIKit.
[Image: cihuesaturationvaluegradient]

CINinePartStretched

Specify a 9-part region in an image and it can be scaled up just like the image slicing in the Asset Catalog.
[Image: cininepartstretched]

CINinePartTiled

Similar to the above, except that the slice pixels are tiled rather than stretched. This would be more applicable to things like custom interface frames than photographs of people.
[Image: cinineparttiled]

CIThermal

The faux thermal imaging effect, available for years as part of the Photo Booth app, is now accessible to developers.
[Image: cithermal]

CIXRay

Also a long-time part of Photo Booth, the faux X-ray filter is now available to everyone.
[Image: cixray]

Optional Return vs throws: Pick One

Swift 2 provides a number of ways to indicate a problem in a method that returns a value, but the two most common are optional return values and throwing errors. Which you choose comes down to your requirements for a particular function, which I will discuss below. But if you take nothing else away from this post, take this: Don’t use both!

That is not to say you can’t use both approaches in your app, just don’t use them both in a single function.

Optional Returns

Pros: Simple to write, simple to handle
Cons: Uninformative

The most common method for indicating failure in a function that returns a value is to make that value optional and return nil when a problem occurs that prevents the successful completion of the function. This has been available since Swift 1.0 and the language has a lot of syntactical features to allow you to efficiently detect and handle nil responses.

Optional returns should be your first choice when all you care about is whether or not a value was returned from a function, not why the function may have failed.

Example:
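A minimal sketch of the optional-return style (parsePort is a made-up function for illustration):

```swift
/// Returns the port number, or nil if the text is not a valid port.
func parsePort(_ text: String) -> Int? {
    guard let value = Int(text), (1...65535).contains(value) else { return nil }
    return value
}

// The caller only learns whether it worked, not why it failed.
if let port = parsePort("8080") {
    print("Using port \(port)")
}
let fallbackPort = parsePort("not a port") ?? 80
```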

Throwing Functions

Pros: Informative errors allow robust recovery options, non-optional return values
Cons: More verbose syntax to handle errors (if you don’t just ignore them with try?)

Swift 2 introduced a robust error handling mechanism to Swift. Functions marked with throws can throw errors which calling objects must explicitly handle or ignore. Having a well-defined set of errors allows calling objects to implement an intelligent recovery plan in the event of a failure, such as correcting erroneous input and trying again. The down-side is that writing error handling code gets verbose, with the do-catch blocks.

Example:
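Here is the same hypothetical port parser rewritten as a throwing function. Note that this sketch uses the current Error protocol name (Swift 2 called it ErrorType); PortError and both function names are invented for the example.

```swift
enum PortError: Error {
    case notANumber
    case outOfRange(Int)
}

/// Always returns a valid port or throws an error explaining why not.
func validatedPort(_ text: String) throws -> Int {
    guard let value = Int(text) else { throw PortError.notANumber }
    guard (1...65535).contains(value) else { throw PortError.outOfRange(value) }
    return value
}

/// The caller can react differently to each kind of failure.
func describePort(_ text: String) -> String {
    do {
        let port = try validatedPort(text)
        return "port \(port)"
    } catch PortError.outOfRange(let value) {
        return "\(value) is outside 1...65535"
    } catch {
        return "not a number"
    }
}
```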

Combining Optional Returns and Errors

Don’t! If you don’t want to bother with errors, then your method should just return an optional value. If you do go to the effort of adding errors, let them communicate problems and guarantee a non-nil return value from your function. If the calling object doesn’t want to handle the errors, it can simply invoke your function with try? and treat the return value as optional. If the calling function does handle the errors, allow it to forgo the extra steps of handling optional return values.
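A quick sketch of a caller opting out of error handling with try? (the function and error type are placeholders):

```swift
struct EmptyInput: Error {}

/// Non-optional return value; failure is reported only by throwing.
func firstWord(of text: String) throws -> String {
    guard let word = text.split(separator: " ").first else { throw EmptyInput() }
    return String(word)
}

// try? turns the throwing call into an optional for callers that
// don't care why it failed.
let word = try? firstWord(of: "hello world")
let missing = try? firstWord(of: "")
```

The function itself never has to return an optional; the caller decides whether it wants one.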

Just how expensive is creating an NSDateFormatter?

One of the many mantras that is drilled into iOS devs is that you shouldn’t recreate expensive, complex objects like formatters every time you need them. Instead, keep them around in a static or instance variable and use them as needed. To test this, I created a playground and put this file in the Sources folder:
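The original Sources file isn’t included in this copy, so the following is only a rough sketch of what such a test could look like. It uses the current DateFormatter name (the class was NSDateFormatter in Swift 2), and the function names, styles and iteration count are my own assumptions. I’ve left out the timings on purpose; they vary by machine.

```swift
import Foundation

/// Creates and configures a new formatter for every date.
func formatCreatingEachTime(_ dates: [Date]) -> [String] {
    return dates.map { (date: Date) -> String in
        let formatter = DateFormatter()
        formatter.dateStyle = .medium
        formatter.timeStyle = .short
        return formatter.string(from: date)
    }
}

/// Creates one formatter and reuses it for every date.
func formatReusing(_ dates: [Date]) -> [String] {
    let formatter = DateFormatter()
    formatter.dateStyle = .medium
    formatter.timeStyle = .short
    return dates.map { formatter.string(from: $0) }
}

/// Wall-clock seconds taken by a block of work.
func seconds(_ work: () -> Void) -> TimeInterval {
    let start = Date()
    work()
    return Date().timeIntervalSince(start)
}

let sampleDates = (0..<1_000).map { Date(timeIntervalSince1970: Double($0) * 86_400) }
let perUseTime = seconds { _ = formatCreatingEachTime(sampleDates) }
let reusedTime = seconds { _ = formatReusing(sampleDates) }
print("new formatter each time: \(perUseTime)s, reused formatter: \(reusedTime)s")
```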

I didn’t put the code directly into the playground because doing ANYTHING in a playground 100k times is going to take a very long time and not be very representative of real-world conditions.

Here’s the playground code, which simply invokes the tests and records the results:

The results were unequivocal: creating an NSDateFormatter each time you need to use it is roughly 9x slower than creating it once and using it repeatedly. That said, this is only likely to be an issue for tasks where there is a lot of date formatting going on, such as in a table view with dates in every cell. If you have a situation where you only need to format a single date infrequently (today’s date in a page header, for example), then you shouldn’t worry about hanging on to the date formatter; even though it’s heavy, it only takes a tiny fraction of a second to instantiate one.

CONVENTIONAL WISDOM: CONFIRMED

Addendum: At the request of a coworker (@rexeisen), I added a test of the static NSDateFormatter.localizedStringFromDate(_:dateStyle:timeStyle:) method. As you can see, the results are no better than creating an instance each time. That said, it would be more convenient to use for one-off formatting tasks, where keeping the formatter around is unnecessary.

Addendum II (2016-01-15): Further testing has revealed that changing the timeStyle and dateStyle of an NSDateFormatter is tremendously expensive. Even more so than just creating a new formatter for each use! Across several trials, performance using a single NSDateFormatter that is re-parameterized on each use was 10% slower than creating a new formatter each time and a full 10x slower than using a single, pre-configured formatter. The takeaway here is to create a formatter for each repeating case; don’t try to make a single shared formatter do all the work.

Using didSet to configure IBOutlet views…

Swift’s didSet property observer is a great way to configure views linked via IBOutlet. It allows you to set properties that need to be dynamic at runtime or that can’t be configured via Interface Builder:
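Since the original snippet isn’t shown here, this is a stripped-down sketch of the idea. To keep it runnable without UIKit, Label stands in for UILabel, and both it and this StyleManager are hypothetical; in a real view controller the property would be declared as an @IBOutlet.

```swift
// Stand-ins so the sketch runs without UIKit.
final class Label {
    var fontName = "System"
    var text = ""
}

enum StyleManager {
    static let headlineFontName = "Avenir-Heavy"
}

final class ProfileViewController {
    // In an app: @IBOutlet weak var titleLabel: UILabel!
    var titleLabel: Label! {
        didSet {
            // Configure only this view, using non-optional external values.
            titleLabel.fontName = StyleManager.headlineFontName
        }
    }
}

let profileController = ProfileViewController()
profileController.titleLabel = Label()   // simulates the outlet being connected
```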

However, it’s worth keeping a simple rule in mind when you go about configuring your views with didSet:

Don’t reference other IBOutlet views or implicitly unwrapped properties in your didSet block.

The reason for this is simple: you have no way of knowing which order your IBOutlets will be set in, so the other view or property you’re trying to access may not be there. At best, you’re going to be nil-checking a lot and end up only partially configuring your views, requiring follow-up elsewhere in code. At worst, you’re going to accidentally force-unwrap a nil and crash your app.

It’s okay to reference external objects, as seen in the call to the StyleManager above, as long as they’re non-optional and not subject to race conditions. The proper place to establish things like view layout relationships or properties from one view that copy the properties from another is still in viewDidLoad.

Avoid [unowned self] whenever possible!

One of the confusing aspects of Swift is how capture semantics work with closures. When used improperly, they can result in retain cycles or crash the app with the dreaded EXC_BAD_ACCESS. It is worth keeping in mind that capture semantics only apply to reference-based objects (classes); value objects can be used freely without worrying about this. The situation I’ll be looking at here is the capture of self, which is far-and-away the most common situation.

Many asynchronous processes such as API calls include a completion closure, and the most common way to provide it is as an inline closure. In this example, we’re storing the API request object that is created by the call to our API client so that we can allow the user to cancel the request if it’s taking too long. Let’s assume this code appears in our LoginViewController class.

Whoops! The compiler is mad at us because we have calls to other methods in the class which have an implicit self in front of them. Swift requires that you be explicit about capturing references to objects to avoid unexpected behavior. Fine, let’s add self:

Better! However, the closure is now holding a strong reference to self and the LoginViewController is holding a strong reference to the closure. This creates a retain cycle and will cause the LoginViewController to be kept alive, even if the user navigates away from this screen. Even if the login was successful and the completion closure was invoked, it still exists and maintains its capture of self unless you explicitly nil the reference as part of the completion. We don’t want to go leaking view controllers all willy-nilly, so let’s try that [unowned self] thing we saw in some WWDC video:

Right, now we definitely don’t have a strong reference to self! However, users are reporting the app is crashing when login is taking too long and they leave the screen before it completes. Looking at the crash logs, you see a rash of EXC_BAD_ACCESS events occurring. Because you declared self was unowned, it was deallocated when the users left the screen, but when the API call completed, it attempted to call the methods referenced in the closure to disastrous effect.

Using unowned as a capture semantic is the equivalent of force-unwrapping an optional. It’s never a great idea and should only be done when you are 100% sure there’s no chance the captured object will be deallocated before it is invoked. When you’re dealing with long-running asynchronous tasks like API requests, this is a bad bet unless the originating class is a singleton or some other pattern which will guarantee the object exists for the lifetime of the app.

For all other cases, consider this pattern using weak capture semantics:
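The sketch below shows the shape of that pattern. APIClient and APIRequest are simplified stand-ins I made up so the example runs on its own; only the [weak self] capture and the guard are the point.

```swift
// Hypothetical stand-ins for an API client and its cancellable request.
final class APIRequest {
    var completion: ((Bool) -> Void)?
    func finish(success: Bool) { completion?(success) }
}

final class APIClient {
    func login(completion: @escaping (Bool) -> Void) -> APIRequest {
        let request = APIRequest()
        request.completion = completion
        return request
    }
}

final class LoginViewController {
    let client = APIClient()
    var request: APIRequest?
    var log: [String] = []

    func startLogin() {
        request = client.login { [weak self] success in
            // If the controller is gone, nobody cares about the result.
            guard let strongSelf = self else { return }
            strongSelf.request = nil
            strongSelf.log.append(success ? "logged in" : "failed")
        }
    }
}

var loginController: LoginViewController? = LoginViewController()
loginController?.startLogin()
let pendingRequest = loginController?.request
loginController = nil                    // the user leaves the screen
pendingRequest?.finish(success: true)    // closure exits at the guard
```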

The secret sauce is the guard statement, which handles the possibility that the LoginViewController was deallocated prior to the API request completing. If self no longer exists, it doesn’t care about the API result, and we can bail out of the closure right away. We simply use the strongSelf reference for the remainder of the closure to avoid unwrapping self at every step and we’re good to go!

Why you shouldn’t mourn the removal of --, ++ and C-style for loops from Swift

One of the neatest things about Swift going open-source earlier this year is that the deliberation process for the future of the language, including breaking changes to the syntax, is out in the open. Case in point are the two accepted proposals to remove the unary -- and ++ operators and to remove C-style for loops.

The case against -- and ++

View the proposal.

Chris Lattner, the principal architect of Swift, has stated in the past that the ++ and -- operators were added very early in Swift’s inception simply because Objective-C had them. Now that Swift has had a chance to mature, there are a few factors which indicate they are a poor fit for the language. First and foremost is that they are confusing as hell to programmers who haven’t already spent time banging their heads against them in one of the C-derived programming languages. Consider this case:

The trailing versions of these operators are particularly confusing, where the value is being returned prior to being changed. This difference in pre- and post-incrementing of the variable is a particularly fruitful source of errors in code, with things like index values going out of range or holding unexpected values because the wrong operator was used.
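Because the operators no longer compile in current Swift, this sketch spells out what each form used to do with the explicit equivalents. The old syntax appears only in comments.

```swift
var n = 5

// Swift 2: let a = n++   (a receives the value from BEFORE the increment)
let a = n
n += 1        // a is 5, n is 6

// Swift 2: let b = ++n   (b receives the value from AFTER the increment)
n += 1
let b = n     // b is 7, n is 7
```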

The strongest case for keeping them is their brevity, but Swift does not favor brevity over security and, as Chris Lattner points out, the more expressive n += 1 is hardly an onerous amount of typing. The main use for the operators seems to be in C-style for loops (based on a survey of Swift-based GitHub projects). Thus, with the imminent removal of those for loops from the language, the main use-case for the operators will die with them.

The case against C-style for loops

View the proposal.

The C-style for loop, like the unary increment and decrement operators, was added early in Swift’s development simply because Objective-C had it. As Erica Sadun so eloquently points out in her proposal, C-style for loops are a hold-over from an earlier era of programming and have a complex and error-prone syntax. They accomplish nothing which can’t be accomplished in a more succinct and expressive fashion using the Swift for in loop. Consider these two examples, both of which append strings from an array to a base string and print them out:

C-style for loop

Note that the for loop syntax is completely non-expressive. There’s no indication of what each of the “;”-separated fields aims to accomplish…you have to already be familiar with it.

Swift map() function
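The original snippets are missing here, so this is a reconstruction in spirit only, with made-up sample data. The C-style loop is shown as a comment because it no longer compiles.

```swift
let base = "Hello, "
let names = ["Ann", "Bob", "Cy"]

// Swift 2 C-style loop, for comparison:
// for var i = 0; i < names.count; i++ { print(base + names[i]) }

// for-in: no index bookkeeping at all.
for name in names {
    print(base + name)
}

// map(): build the combined strings first, then print them.
let greetings = names.map { base + $0 }
greetings.forEach { print($0) }
```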

“But wait,” you might say, “I need the index value as well!” There are a couple of ways to do it in Swift without relying on C-style for loops:

Method 1: enumerate()
The handy enumerate() method, present on all types conforming to SequenceType, is one way:

Method 2: for-in over a Range

The 2nd method is stylistically closest to the C-style for loop, but it is still much easier to understand what values i will hold and isn’t subject to the problem of a statement inside the for loop modifying the index and causing it to go out of bounds (i is constant).
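Both methods, sketched with placeholder data. Note that enumerate() was renamed to enumerated() in Swift 3, which is the spelling used here so the code compiles today.

```swift
let items = ["a", "b", "c"]
var viaEnumerated: [String] = []
var viaRange: [String] = []

// Method 1: enumerated() hands you (index, element) pairs.
for (index, item) in items.enumerated() {
    viaEnumerated.append("\(index): \(item)")
}

// Method 2: for-in over a range; i is a constant within each pass.
for i in 0..<items.count {
    viaRange.append("\(i): \(items[i])")
}
```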

Conclusion

It can feel a little jarring to lose language features, but with some thought it is clear to see that the removal of these 2 features will result in a language that is more expressive and less error prone.

Don’t use Swift enums to contain magic strings!

At first glance, using a Swift enum with a raw type of String seems to be a great way to package (or, if you like, enumerate) magic strings used by things like Notifications:

However, this is a poor application of the Swift enum for the following reason: you are not interested in the enum case, only its raw value. Any place you want to use the magic string in your code you’re forced into fully qualifying the enum case (because the argument type is a string, not the type of your enum) and then accessing the rawValue property.

A better approach is to use a Swift struct with static constant string members defining the magic strings:

The look is very similar to an enum, but in practice it ends up being shorter and cleaner to use:
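The original snippets aren’t shown in this copy, so here is an illustrative comparison using userInfo-style dictionary keys. All type and key names are invented for the example.

```swift
// Enum approach: the case is never what you want, only its raw value.
enum UserInfoKeyEnum: String {
    case username = "username"
    case loginDate = "loginDate"
}

// Struct approach: the static constants are already Strings.
struct UserInfoKeys {
    static let username = "username"
    static let loginDate = "loginDate"
}

// Fully qualified case plus .rawValue at every call site...
let userInfoFromEnum: [String: Any] = [UserInfoKeyEnum.username.rawValue: "ann"]

// ...versus just the constant.
let userInfoFromStruct: [String: Any] = [UserInfoKeys.username: "ann"]
```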

Now, this argument is moot if you have a situation where methods in your classes take your enum type as an argument and use the rawValue at some point internally, but for things like userInfo dictionary keys, user defaults keys, notification names, segue names, etc. you are better off with the struct approach, since the string is all you’re interested in.

2015-12-16 Addendum:
As with any advice on using Swift, this should not be viewed as an Immutable Truth of the Universe™. There are still situations when an enum would be a perfectly reasonable container for your strings: namely, when you have a model built around the use of the enums and not just the strings they contain.