Demystifying Core Data: A guide for newcomers

This post is intended for newcomers to development on Apple platforms. Core Data can feel daunting to a lot of new developers, so I decided to try to explain it in simpler terms.

If you are already comfortable with Core Data and feel that some simplifications in this article may do more to confuse than to help, please let me know on Twitter and I will fix them.

Also, before we get started, I would like to thank Donny Wals for reading the draft and providing useful hints and feedback, which made this post better. Donny is currently working on a book about Core Data, so check it out.


Okay, with that out of the way, let's get started. If you search the Internet for a definition, you can find something like this on Apple Developer:

Core Data is a framework that you use to manage the model layer objects in your application. It provides generalized and automated solutions to common tasks associated with object life cycle and object graph management, including persistence.

Or maybe something like this on the Wiki:

Core Data is an object graph and persistence framework provided by Apple in the macOS and iOS operating systems.

That does not explain a lot.

Before we get to what Core Data is in simple terms, let's start with the "why" question.

Why do we need something like Core Data?

Apps often need to store a non-trivial amount of important data. Typical examples are a notes app, a messaging app, or basically any kind of app that downloads data from a server and wants to avoid a potentially long reload of the same data.

While you could store the data for a basic notes app in a plain file, maybe with the help of Codable, this approach does not scale well. For each small change, you need to write the entire file again. And if you later modify your data model (in this case your Codable struct or class), you need to be very careful: users whose file was encoded with the old definition can easily lose data. As you are likely aware, UserDefaults is meant for user preferences and should not be used to store large amounts of data.
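To make the plain-file approach concrete, here is a minimal sketch. The Note struct and both function names are made up for illustration; they are not from any framework.

```swift
import Foundation

// Hypothetical model for a very simple notes app.
struct Note: Codable, Equatable {
    var title: String
    var text: String
}

// Encodes ALL notes and overwrites the file, even if only one note changed.
func saveNotes(_ notes: [Note], to url: URL) throws {
    let data = try JSONEncoder().encode(notes)
    try data.write(to: url, options: .atomic)
}

// Throws if the file was written with an incompatible Note definition,
// for example after a property was renamed or a required one was added.
func loadNotes(from url: URL) throws -> [Note] {
    let data = try Data(contentsOf: url)
    return try JSONDecoder().decode([Note].self, from: data)
}
```

This works fine for a handful of notes, but every save rewrites everything and every change to Note risks making old files unreadable.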

To store and retrieve large amounts of data in practice, you need a database. Standard databases usually require their own language (SQL) to retrieve and save data. If you wanted to use a database directly from your Swift code, you would need to construct SQL commands manually, execute them, and then parse the results yourself.

Thankfully, we have Core Data, which handles all of this for us.

What is Core Data really?

I think in the simplest terms, Core Data can be understood as a layer between the raw database and your Swift code. By default it uses an SQLite database internally, which is basically a single file that holds the data and its structure. What is important to us is that we can work with just Core Data and instances of Swift classes, which Core Data saves to and retrieves from the underlying database.

The "object graph" and other fancy descriptions refer to the fact that Core Data can intelligently monitor your objects and keep track of changes. When you call save on the managed object context, these changes get saved to the database. Core Data also handles migration, which is a term you find quite often in the SQL database world. Because these databases have a strictly defined structure for how they store data, you need to do a migration whenever you change your model (adding or removing properties on your classes, renaming properties, or adding relationships). A migration is the process that makes sure the database structure (called a schema) is up to date and ready to work with the new model definitions. For example, if your class has a name property, the database needs a column to store it.

If you were to use the database directly, you would need to do these migrations by hand, and you could potentially really mess up the database. Fortunately, Core Data can handle a lot of changes with automatic migrations, which means that in many cases you don't have to do anything.

Apart from fetching and saving data and handling migrations, there is another important thing Core Data handles for you: relationships between objects. If you have an object, for example a Folder, that has an array of Notes, you need to save this link along with the actual folder and notes. In the SQL world, you would need to define special keys (foreign keys) that bind these together. If you set up relationships through Core Data, you get all of this automatically, so you can work with related objects without creating and managing these links by hand.
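Here is a rough sketch of how the Folder and Notes example could look in Swift. The class, property and relationship names are my assumptions, and the model file would need matching entities with the two relationships marked as inverses of each other.

```swift
import CoreData

// One Folder has many Notes (a one-to-many relationship).
final class Folder: NSManagedObject {
    @NSManaged var name: String
    // To-many relationship, assumed to be called "notes" in the model.
    @NSManaged var notes: Set<Note>
}

final class Note: NSManagedObject {
    @NSManaged var title: String
    // To-one inverse relationship, assumed to be called "folder" in the model.
    @NSManaged var folder: Folder?
}

// Setting one side is enough. Core Data maintains the inverse side
// (folder.notes) for us, so there are no keys to manage by hand.
func move(_ note: Note, to folder: Folder) {
    note.folder = folder
}
```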

There is a ton more we could go into, but since this is an introduction, let's skip that and check out the parts that make up Core Data.

Core Data parts

In this section, we will take a look at the Core Data pieces you encounter when you start using it.

I am personally not a fan of the default Core Data setup you get if you check the Core Data box while creating a project in Xcode. It kind of feels like magic and like something only Xcode can add to your project... Anyway, that is also a topic for another time.

.xcdatamodeld

This is the Core Data model file, typically named something like "Model", "Database" or the same as the current project. It is basically a template that tells Core Data what kind of objects we plan to store, what their properties are and what relationships exist between them.

If you open it in Xcode, you can add entities and customize their properties, among other things. These entities then have corresponding classes that inherit from NSManagedObject. You can either let Xcode generate those or create your own, which is something I always do and recommend.

This file is nothing mysterious; the model is stored as plain XML. (XML is a standardised format for data exchange, which just means various programming languages know how to read it.) In fact, we can inspect the file and see its contents.

Xcode won't let you see the source, but you can find it on disk and open it in any text editor. The .xcdatamodeld is a package (a folder), and the XML lives in a file named contents inside it. If you pick an editor with XML syntax support, it will be easier to read.

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<model type="com.apple.IDECoreDataModeler.DataModel" documentVersion="1.0" lastSavedToolsVersion="17511" systemVersion="19H2" minimumToolsVersion="Automatic" sourceLanguage="Swift" userDefinedModelVersionIdentifier="">
    <entity name="Joke" representedClassName="Joke" syncable="YES">
        <attribute name="id" attributeType="Integer 64" defaultValueString="0" usesScalarValueType="YES"/>
        <attribute name="punchline" attributeType="String"/>
        <attribute name="setup" attributeType="String"/>
    </entity>
    <elements>
        <element name="Joke" positionX="-63" positionY="-18" width="128" height="88"/>
    </elements>
</model>

This is the entire content of a data model that defines a single entity, Joke. Unlike something like .storyboard or .xib, these files are pretty readable.

NSManagedObject

All Core Data classes subclass this class, which is what makes them work with Core Data. They are standard Swift classes with a few extra annotations. Above we saw the Joke entity represented in the model, and here it is as a class:

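import CoreData

// Backs the Joke entity defined in the model.
class Joke: NSManagedObject {
    @NSManaged var id: Int64 // "Integer 64" in the model
    @NSManaged var setup: String
    @NSManaged var punchline: String
    // Needs a matching "created" Date attribute, which the XML excerpt above does not include.
    @NSManaged var created: Date
}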

@NSManaged is a special attribute that lets Core Data work with these properties in a special way. It is not a property wrapper, although the syntax looks the same. Thanks to this attribute, Core Data can monitor changes to the object's properties and therefore know what needs to be saved.

It also lets Core Data populate these properties as needed; they may be empty without us knowing or caring about it. Core Data objects are frequently returned as "faults", which in Core Data jargon means placeholder objects whose data is loaded only once your app asks for it. This is a great optimization. If you fetch 1000 objects from Core Data and your collection view displays only 20 of them until the user scrolls, the rest of the objects don't need to have their data in memory.

NSPersistentContainer

This is the "main" class that encompasses working with Core Data. Its responsibility is to load the data model (the .xcdatamodeld file) and complain if it can't find it or if classes for the defined entities are missing.

You typically instantiate it with the name of the model file and then call loadPersistentStores to load it.
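A minimal sketch of that setup could look like this. The function name and the default "Model" name are my assumptions; the name has to match your own .xcdatamodeld file.

```swift
import CoreData

// "Model" is an assumed name. It must match the name of your .xcdatamodeld file.
func makePersistentContainer(modelName: String = "Model") -> NSPersistentContainer {
    let container = NSPersistentContainer(name: modelName)
    container.loadPersistentStores { _, error in
        if let error = error {
            // Crashing is fine for a sketch; a real app may want to recover more gracefully.
            fatalError("Failed to load persistent stores: \(error)")
        }
    }
    return container
}
```

You would typically create the container once, early in the app's life, and keep it around for as long as the app runs.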

It also has a very important property, viewContext, which is an NSManagedObjectContext. We will look at that next.

NSManagedObjectContext

This class lets us get data from the database and also save it. You get data by calling fetch on the context, which requires an instance of NSFetchRequest. Your app typically has one main context (which is enough in a lot of cases). The context keeps track of the objects associated with it, so it can detect changes and save them when needed.

That is also the reason why you need to pass a context to the NSManagedObject init. This associates the object with the context that will manage it. Apart from fetch, already mentioned, you will likely use the save method a lot, and there is also the hasChanges property to check before saving.
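Creating and saving an object could be sketched like this. The addJoke function is a made-up helper, and the Joke class here is a trimmed copy that assumes a model with a matching Joke entity.

```swift
import CoreData

// Trimmed Joke class; assumes the model defines a matching Joke entity.
final class Joke: NSManagedObject {
    @NSManaged var id: Int64
    @NSManaged var setup: String
    @NSManaged var punchline: String
}

// Hypothetical helper that creates a joke and saves the context.
@discardableResult
func addJoke(setup: String, punchline: String, in context: NSManagedObjectContext) throws -> Joke {
    // The init ties the new object to the context that will track and save it.
    let joke = Joke(context: context)
    joke.setup = setup
    joke.punchline = punchline

    // Only hit the database when there is something to save.
    if context.hasChanges {
        try context.save()
    }
    return joke
}
```

Usage would be something like try addJoke(setup: "...", punchline: "...", in: container.viewContext).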

Before we move on, we need to take a step back to NSPersistentContainer. We can call performBackgroundTask on it, which runs a closure that receives a background NSManagedObjectContext as its input parameter. This way we can pretty easily execute database work in the background without affecting the visible performance of the app. You can use fetch on this background context, modify objects and then save it.
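As a rough sketch, background work could look like this. The function name, the "Note" entity and its "wasDeleted" attribute are assumptions for illustration; I use key-value coding here so the snippet does not depend on a particular class.

```swift
import CoreData

// Hypothetical example: soft-delete every note on a background context.
func markAllNotesDeleted(in container: NSPersistentContainer) {
    container.performBackgroundTask { context in
        // This closure runs off the main thread with its own private context.
        let request = NSFetchRequest<NSManagedObject>(entityName: "Note")
        do {
            let notes = try context.fetch(request)
            for note in notes {
                note.setValue(true, forKey: "wasDeleted")
            }
            if context.hasChanges {
                try context.save()
            }
        } catch {
            print("Background work failed: \(error)")
        }
    }
}
```

If the main context should pick up these changes on its own, you will likely also want to set automaticallyMergesChangesFromParent to true on viewContext.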

NSFetchRequest

An instance of NSFetchRequest tells the NSManagedObjectContext what you want and in what fashion. It is generic over the NSManagedObject subclass, which lets us specify what entity type we want to fetch.

You can then set up a filter (using NSPredicate) and also sorting (using an array of NSSortDescriptor instances). In basic cases only the sorting is needed, because you likely want to show all the data to the user.
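Put together, a fetch could be sketched like this. The Note class and the fetchVisibleNotes function are assumptions for illustration, and the model would need a Note entity with matching title and wasDeleted attributes.

```swift
import CoreData

// Assumed Note class; the model needs a matching Note entity.
final class Note: NSManagedObject {
    @NSManaged var title: String
    @NSManaged var wasDeleted: Bool
}

// Hypothetical helper that returns all notes that are not soft-deleted, sorted by title.
func fetchVisibleNotes(in context: NSManagedObjectContext) throws -> [Note] {
    // The generic parameter gives us [Note] back instead of untyped objects.
    let request = NSFetchRequest<Note>(entityName: "Note")
    // Filter: only notes that were not marked as deleted.
    request.predicate = NSPredicate(format: "%K == %@",
                                    #keyPath(Note.wasDeleted),
                                    NSNumber(value: false))
    // Sorting: alphabetically by title.
    request.sortDescriptors = [
        NSSortDescriptor(key: #keyPath(Note.title), ascending: true)
    ]
    return try context.fetch(request)
}
```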

There are a few advanced use cases, but I think this is fine for now.

NSPredicate

This is the "filter method" of the Core Data world. Its weak spot is that you specify it as a special format string, and if you make a mistake, you find out only when running the app.

Here is a very basic example:

NSPredicate(format: "%K == %@", #keyPath(Note.wasDeleted), NSNumber(value: false))

This uses the #keyPath feature of Swift to at least have some safety. Let's look more closely at the short format string: %K == %@. These percent placeholders mark parts of the string that should be substituted with a value. %K is reserved for key paths and %@ is for objects. A string substituted for %@ is automatically wrapped in quotes. NSNumber is an ObjC holdover and a kind of acceptable way to work with bool values in NSPredicate.

The predicate above matches Note entities that have the wasDeleted property set to false. Marking items as deleted instead of actually deleting them is called "soft deletion" and it is pretty useful. You can easily implement something like a "Trash" feature, and if you are handling cloud synchronization, keeping track of deleted items is basically a must.

We could rewrite the predicate above in different ways:

NSPredicate(format: "wasDeleted == %@", NSNumber(value: false))

This is shorter, but if we rename wasDeleted in the future, this will stop working.

Another, even shorter, option would be this:

NSPredicate(format: "wasDeleted == NO")

This requires the knowledge that false maps to "NO" and true maps to "YES". I think the first, most verbose option is the safest bet and the least likely to break.
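If you are curious what a format string turns into after substitution, you can print the predicateFormat property of the predicate. This is a small sketch with a made-up "title" key and value, just to show how %K and %@ behave with a string.

```swift
import Foundation

// %K inserts the key path as-is, %@ inserts the string value wrapped in quotes.
let byTitle = NSPredicate(format: "%K == %@", "title", "Groceries")
print(byTitle.predicateFormat) // title == "Groceries"
```

Printing predicates like this is a handy way to spot mistakes in the format string before they turn into confusing fetch results.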

NSSortDescriptor

With instances of this class, we can tell NSFetchRequest how to sort our entities. You can provide an array to sort by multiple properties; the order of the descriptors determines their priority.

The init expects a key and a bool that specifies whether the sorting is ascending or descending (12345 vs. 54321). As with NSPredicate, I highly recommend using #keyPath to avoid potential problems when renaming attributes.

NSSortDescriptor(key: #keyPath(Note.title), ascending: true)

This is a basic example that will sort Note entities by their title. If we also wanted to show favorite notes first, we would pass this array to NSFetchRequest:

[
    // Descending, so that true (favorite) sorts before false.
    NSSortDescriptor(key: #keyPath(Note.isFavorite), ascending: false),
    NSSortDescriptor(key: #keyPath(Note.title), ascending: true)
]

We would first get all the favorite notes sorted by title and then all the rest, also sorted by title.

NSFetchedResultsController

Commonly shortened to "FRC", this class is basically made for UITableView and UICollectionView. It manages fetching the data from the database for you and also tells you the number of sections and the number of items in a particular section, which the data source implementation requires.

Another great benefit is that it notifies us of changes in the database. We can then implement the delegate methods and react accordingly: insert new rows, delete rows or update them. This is not trivial, but it is still easier than doing all of this manually.
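Here is a minimal sketch of an FRC setup. The NotesProvider class and its members are names I made up, and the Note class assumes a matching entity in the model. The view updates are left as a comment because they depend on your UI.

```swift
import CoreData

// Assumed Note class; the model needs a matching Note entity.
final class Note: NSManagedObject {
    @NSManaged var title: String
}

// Hypothetical wrapper that owns the FRC and exposes what a data source needs.
final class NotesProvider: NSObject, NSFetchedResultsControllerDelegate {
    let controller: NSFetchedResultsController<Note>

    init(context: NSManagedObjectContext) throws {
        let request = NSFetchRequest<Note>(entityName: "Note")
        // An FRC requires at least one sort descriptor.
        request.sortDescriptors = [
            NSSortDescriptor(key: #keyPath(Note.title), ascending: true)
        ]
        controller = NSFetchedResultsController(fetchRequest: request,
                                                managedObjectContext: context,
                                                sectionNameKeyPath: nil,
                                                cacheName: nil)
        super.init()
        controller.delegate = self
        try controller.performFetch()
    }

    var numberOfSections: Int {
        controller.sections?.count ?? 0
    }

    func numberOfItems(inSection section: Int) -> Int {
        controller.sections?[section].numberOfObjects ?? 0
    }

    // Called after the FRC has processed a batch of changes.
    func controllerDidChangeContent(_ controller: NSFetchedResultsController<NSFetchRequestResult>) {
        // Reload or update your table or collection view here.
    }
}
```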

Since iOS 13, I would recommend going with diffable data sources, which greatly simplify working with these collection views and FRC. But that is a topic for another day.

I hope this post helped you understand the basics of Core Data. If anything is missing or not clear, please feel free to let me know on Twitter and I will do my best to improve it.

*Thanks for reading. And now go build something with Core Data :-)*

Looking for more? You can check out my approach to setting up a Core Data stack in iOS apps, or a minimal example of setting up Core Data with diffable data sources.

Uses: Xcode 12 & Swift 5.3

