Advanced string comparison and sorting in Swift

Going beyond the standard contains() and hasPrefix() methods. This post explains locale-aware sorting and how to ignore diacritics.

I bet you are well aware of the contains() method available in Swift to check whether one string, well, contains a substring 🙂 We can also readily use the hasPrefix() and hasSuffix() methods to check whether a string starts or ends with a particular substring.

The first issues arise when we have to consider case. This is very important when dealing with user input. One solution is to convert both strings to either uppercase or lowercase and then use the standard methods mentioned above.
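Here is a minimal sketch of that approach. The input and query strings are made up for illustration; the only APIs used are lowercased(), contains(), hasPrefix() and hasSuffix().

```swift
import Foundation

// Hypothetical user input and search term, chosen only for illustration.
let userInput = "Swift String Comparison"
let query = "STRING"

// Lowercase both sides so that case no longer matters.
let normalizedInput = userInput.lowercased()
let normalizedQuery = query.lowercased()

let containsQuery = normalizedInput.contains(normalizedQuery)   // true
let startsWithSwift = normalizedInput.hasPrefix("swift")         // true
let endsWithQuery = normalizedInput.hasSuffix(normalizedQuery)   // false

print(containsQuery, startsWithSwift, endsWithQuery)
```

The important detail is that both the input and the search term are normalized the same way; lowercasing only one of them gives wrong results.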

If we want to compare two strings and ignore case, we don't have to uppercase or lowercase them. Instead we can use the caseInsensitiveCompare() method, which does exactly what its name suggests. There is a small difference: the return value is not a Bool but an instance of ComparisonResult. This is an enum that additionally tells us which string comes first when sorting.

import Foundation

let str = "Hello, playground"
str.caseInsensitiveCompare("hello, playGround") == .orderedSame // true

We can compare the result to .orderedSame, which in this case will be true. There is also a locale-aware version available, localizedCaseInsensitiveCompare().
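To make the three possible ComparisonResult values more concrete, here is a small sketch. The describe() helper is hypothetical and exists only for this example; the comparison methods themselves come from Foundation.

```swift
import Foundation

// Hypothetical helper (not part of Foundation) that puts a ComparisonResult into words.
func describe(_ result: ComparisonResult) -> String {
    switch result {
    case .orderedAscending:
        return "receiver sorts first"
    case .orderedSame:
        return "equal when ignoring case"
    case .orderedDescending:
        return "argument sorts first"
    }
}

print(describe("apple".caseInsensitiveCompare("BANANA")))  // receiver sorts first
print(describe("Hello".caseInsensitiveCompare("hELLO")))   // equal when ignoring case
print(describe("zebra".caseInsensitiveCompare("Apple")))   // argument sorts first

// The locale-aware variant is used the same way.
let sameInCurrentLocale = "Hello".localizedCaseInsensitiveCompare("hELLO") == .orderedSame
print(sameInCurrentLocale)
```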

There is no case-insensitive variant of hasPrefix() or hasSuffix(). In this case we have to resort once again to the uppercase or lowercase form, or use the handy folding() method on String.
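One possible way to wrap this up is a small String extension. Note that hasPrefixIgnoringCase() and hasSuffixIgnoringCase() are names I made up for this sketch; they are not part of Swift or Foundation.

```swift
import Foundation

extension String {
    // Hypothetical helper: case-insensitive version of hasPrefix().
    func hasPrefixIgnoringCase(_ prefix: String) -> Bool {
        let foldedSelf = self.folding(options: .caseInsensitive, locale: nil)
        let foldedPrefix = prefix.folding(options: .caseInsensitive, locale: nil)
        return foldedSelf.hasPrefix(foldedPrefix)
    }

    // Hypothetical helper: case-insensitive version of hasSuffix().
    func hasSuffixIgnoringCase(_ suffix: String) -> Bool {
        let foldedSelf = self.folding(options: .caseInsensitive, locale: nil)
        let foldedSuffix = suffix.folding(options: .caseInsensitive, locale: nil)
        return foldedSelf.hasSuffix(foldedSuffix)
    }
}

print("Hello, playground".hasPrefixIgnoringCase("HELLO"))       // true
print("Hello, playground".hasSuffixIgnoringCase("PlayGround"))  // true
print("Hello, playground".hasPrefixIgnoringCase("world"))       // false
```

Both helpers simply apply folding() with the .caseInsensitive option to each side before calling the standard method.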

This can be very handy for more complex tasks. For example, in my native language we have some "funny" letters like č, ě, ž and more.

If I wanted to match my surname even when the string uses c instead of č and e instead of ě, I can use folding() to modify the input before comparison.

let surname = "Němeček"
let folded = surname.folding(options: .diacriticInsensitive, locale: Locale.current)
// folded = "Nemecek"
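The options can also be combined. The following sketch ignores both diacritics and case when matching the surname; the normalized() helper and the typed input are assumptions made up for this example, and I pass nil as the locale here to keep the result independent of the device settings.

```swift
import Foundation

// Hypothetical helper: strips diacritics and case differences for loose matching.
func normalized(_ text: String) -> String {
    return text.folding(options: [.diacriticInsensitive, .caseInsensitive], locale: nil)
}

let storedSurname = "Němeček"
let typedByUser = "NEMECEK"  // hypothetical input without diacritics, in all caps

let matches = normalized(storedSurname) == normalized(typedByUser)
print(matches)  // true
```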

You can also specify any other locale to take regional rules into account. Since we have already touched on Locale, let's look at how to order strings correctly based on the user's locale. Once again, the č from my language comes right after c, but standard Swift sorting compares Unicode values rather than applying language rules, so č ends up at the very end, after z.

let letters = ["c", "a", "d", "x", "č"]
print(letters.sorted())

And the output is: ["a", "c", "d", "x", "č"]

Let's fix this:

let letters = ["c", "a", "d", "x", "č"]
print(letters.sorted { $0.localizedCompare($1) == .orderedAscending })

Which results in: ["a", "c", "č", "d", "x"], exactly what we are after.
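localizedCompare() always uses the current locale. If you need the ordering of one specific language regardless of the device settings, a possible sketch is to pass a Locale to compare(_:options:range:locale:). The sortedStrings(_:using:) helper and the "cs_CZ" identifier are my own choices for illustration, and the exact order you get depends on the collation data available on the platform.

```swift
import Foundation

// Hypothetical helper: sorts strings using the collation rules of the given locale.
func sortedStrings(_ strings: [String], using locale: Locale) -> [String] {
    return strings.sorted { first, second in
        first.compare(second, locale: locale) == .orderedAscending
    }
}

let unsortedLetters = ["c", "a", "d", "x", "č"]
let czech = Locale(identifier: "cs_CZ")  // assumed identifier for Czech rules

let czechOrder = sortedStrings(unsortedLetters, using: czech)
print(czechOrder)
```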

Initially I wanted to write a super comprehensive guide on string comparison, but then I realized my goal is not to write an encyclopedia on string comparison in Swift but rather to show the common use cases and how to handle them effectively.

I believe it is much better to be aware that these options exist and know where to find them than to memorize all available APIs.

Thanks for reading!

Uses: Xcode 11 & Swift 5