Introduction to Data

Data is a critical component of many applications. On the surface, data access appears simple: you have a store for your data, whether that is a traditional RDBMS, a modern NoSQL store, or local storage. The reality, however, is much more complex.

The Kinvey Platform's data abstraction provides a framework for dealing with these complexities and gives developers tools that simplify the development of rich, data-driven apps.

Collections

The core data abstraction of Kinvey is the collection. A collection is simply a group of related data entities. Kinvey makes no assumptions about the makeup or schema of a collection (other than a few pre-defined system properties).

A collection can be backed by any data source. Regardless of the data source, collections provide a common interface for the application developer, whether through the Kinvey SDK or the REST API. This simplifies client development by making all data access behave consistently.

There are two types of collection data sources:

KinveyData: A document-based NoSQL data store.
Service: A service can be used to integrate with any external data source. There are two types of services:
- RapidData: A set of configuration-based data services for integrating with common data sources or protocols.
- FlexData: For non-standard or complex integrations, Kinvey provides the Flex-SDK, which allows you to write your own Node.js services to integrate with data sources for which there is no RapidData connector, or to implement integrations with more complex data orchestration needs.
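
The idea of one interface over different data sources can be sketched as follows. This is a minimal illustration only: DataEntity, DataSource, DocumentSource, ExternalSource, and CollectionSketch are hypothetical names invented for this example and are not part of the Kinvey SDK or REST API.

```typescript
// Hypothetical types for illustration only; this is not the Kinvey SDK API.
interface DataEntity {
  _id: string;
  [property: string]: unknown;
}

// Anything that can back a collection implements the same two operations.
interface DataSource {
  findAll(): DataEntity[];
  save(entity: DataEntity): DataEntity;
}

// Stand-in for a document store: entities are kept as they are saved.
class DocumentSource implements DataSource {
  private docs: DataEntity[] = [];

  findAll(): DataEntity[] {
    return this.docs;
  }

  save(entity: DataEntity): DataEntity {
    // Replace any existing entity with the same _id, then append.
    this.docs = this.docs.filter((doc) => doc._id !== entity._id).concat([entity]);
    return entity;
  }
}

// Stand-in for a service over an external system that stores (id, name) rows.
class ExternalSource implements DataSource {
  private rows: Array<[string, string]> = [];

  findAll(): DataEntity[] {
    return this.rows.map(([id, name]): DataEntity => ({ _id: id, name: name }));
  }

  save(entity: DataEntity): DataEntity {
    const row: [string, string] = [entity._id, String(entity["name"])];
    this.rows = this.rows.filter(([id]) => id !== entity._id).concat([row]);
    return entity;
  }
}

// The client-facing abstraction: identical calls whatever the backing source.
class CollectionSketch {
  readonly name: string;
  private source: DataSource;

  constructor(name: string, source: DataSource) {
    this.name = name;
    this.source = source;
  }

  find(): DataEntity[] {
    return this.source.findAll();
  }

  save(entity: DataEntity): DataEntity {
    return this.source.save(entity);
  }
}
```

Client code written against CollectionSketch does not change when the backing source is swapped, which is the property the collection abstraction is meant to provide.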

Services and Service Objects

Data integration is accomplished by the use of a Service. A Service can be either a RapidData or FlexData service. In both cases, data services expose Service Objects. A Service Object represents a collection of data in a remote system (for example, a database table, a Salesforce Object, or a REST endpoint) along with that object's properties.

A Service Object can map and transform data from the remote data source object. It is common for Service Objects to represent only a subset of properties, and to rename, flatten, or expand properties so that they fit an object-based schema that can be represented in JSON.

When defining field mappings and transformations, aim to match the format the application needs to implement its models.
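
As a rough sketch of what such a mapping does, the function below takes a subset of a remote row, renames its columns, and expands two flat address columns into a nested object. The column names and the Customer model are invented for this example; in practice, mappings are defined on the Service Object rather than hand-written in app code.

```typescript
// Hypothetical shape of a row as the remote system returns it.
interface RemoteCustomerRow {
  CUST_ID: number;
  FIRST_NM: string;
  LAST_NM: string;
  ADDR_CITY: string;
  ADDR_ZIP: string;
  INTERNAL_AUDIT_FLAG: string; // not needed by the app, so it is not mapped
}

// Hypothetical shape the app's model wants to work with.
interface Customer {
  _id: string;
  name: string;
  address: { city: string; postalCode: string };
}

// Subset, rename, and expand: flat columns become a nested JSON-friendly object.
function toCustomer(row: RemoteCustomerRow): Customer {
  return {
    _id: String(row.CUST_ID),
    name: `${row.FIRST_NM} ${row.LAST_NM}`,
    address: { city: row.ADDR_CITY, postalCode: row.ADDR_ZIP },
  };
}
```

Designing the target shape from the app's model first, and then working back to the remote columns, keeps transformation logic out of the client.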

A Service Object also allows you to select what types of CRUD operations are allowed on your data source object. For example, you may want to prevent the app from deleting from your data source. By selecting one or more data handler methods, you can define the actions apps can perform on your data.

A Service Object exposes several data events (e.g. onInsert, onUpdate, etc.). Each data event on a Service Object can be enabled or disabled, allowing you to select which data operations are allowed.

For RapidData Services, the data events are automatically mapped depending on the data source type. For some data source types (e.g. Rapid REST), the default behavior of the data event can be overridden.

In the case of a FlexData service, each data event that you want to support needs to be implemented with a Flex Function handler.
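
The relationship between data events and handlers can be sketched as below. This is a hypothetical model, not the Flex SDK API: ServiceObjectSketch and its on and dispatch methods are invented for illustration, and the event names other than onInsert and onUpdate are assumptions. The point is only that an event without a handler is an operation the app cannot perform.

```typescript
// Hypothetical sketch; not the Flex SDK API. Event names other than onInsert
// and onUpdate are assumed for illustration.
type ServiceDataEvent = "onGetAll" | "onInsert" | "onUpdate" | "onDeleteById";

type DataHandler = (body: unknown) => unknown;

type DispatchResult =
  | { ok: true; body: unknown }
  | { ok: false; error: string };

class ServiceObjectSketch {
  readonly name: string;
  private handlers: { [event: string]: DataHandler | undefined } = {};

  constructor(name: string) {
    this.name = name;
  }

  // Registering a handler is what enables the event.
  on(event: ServiceDataEvent, handler: DataHandler): ServiceObjectSketch {
    this.handlers[event] = handler;
    return this;
  }

  // Events without a handler are rejected instead of reaching the data source.
  dispatch(event: ServiceDataEvent, body: unknown): DispatchResult {
    const handler = this.handlers[event];
    if (handler === undefined) {
      return { ok: false, error: `${event} is not enabled on ${this.name}` };
    }
    return { ok: true, body: handler(body) };
  }
}

// Read and create are supported; update and delete are deliberately left out.
const widgets = new ServiceObjectSketch("widgets")
  .on("onGetAll", () => [{ _id: "1", name: "gear" }])
  .on("onInsert", (body) => body);
```

With this setup, widgets.dispatch("onDeleteById", ...) returns an error result, so the app cannot delete from the underlying data source.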

To complete a data integration, each Service Object needs to be exposed to clients by mapping a collection to it.

Offline Considerations

In many contexts today, apps are expected to work whether or not they have internet connectivity. Kinvey's client SDKs support working with offline data through two kinds of local datastore: cache and sync.

A cache datastore is used to temporarily store received data in local storage for performance optimization, and to allow the app to continue to work with data when the device goes offline. This type of datastore is ideal for apps that are generally used with an active network, but may experience short periods of network loss.

A sync datastore is used to pull a copy of your data to the device and work with it completely offline. The library provides APIs to synchronize local data with the backend. This type of datastore is ideal for apps that need to work for long periods without a network connection.

When building an application, you should determine which offline model fits your users' needs. In general, a cache datastore provides an ideal balance of offline use and performance along with real-time access to data and fewer conflicts. Sync datastores should mainly be used for apps where offline is the rule, not the exception (e.g. a field service app where internet connectivity is unlikely in many scenarios), but they come with the tradeoff of more stale data and a higher potential for save conflicts than the cache datastore.
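
The difference between the two models can be sketched with a toy in-memory backend. Everything here is hypothetical (FakeBackend, CacheStoreSketch, SyncStoreSketch) and synchronous for brevity; it does not reproduce the SDK's actual datastore behavior or method names, only the general shape of each model.

```typescript
interface LocalItem {
  _id: string;
  [property: string]: unknown;
}

// Hypothetical stand-in for the backend; `online` simulates connectivity.
class FakeBackend {
  online = true;
  private items: LocalItem[] = [];

  fetchAll(): LocalItem[] {
    if (!this.online) throw new Error("network unavailable");
    return this.items.map((item) => ({ ...item }));
  }

  save(item: LocalItem): void {
    if (!this.online) throw new Error("network unavailable");
    this.items = this.items.filter((i) => i._id !== item._id).concat([{ ...item }]);
  }
}

// Cache model: prefer fresh data, fall back to the last copy when offline.
class CacheStoreSketch {
  private local: LocalItem[] = [];
  private backend: FakeBackend;

  constructor(backend: FakeBackend) {
    this.backend = backend;
  }

  find(): LocalItem[] {
    try {
      this.local = this.backend.fetchAll();
    } catch {
      // Offline: keep serving whatever was cached last.
    }
    return this.local;
  }
}

// Sync model: all reads and writes are local until push or pull is called.
class SyncStoreSketch {
  private local: LocalItem[] = [];
  private pending: LocalItem[] = [];
  private backend: FakeBackend;

  constructor(backend: FakeBackend) {
    this.backend = backend;
  }

  find(): LocalItem[] {
    return this.local;
  }

  save(item: LocalItem): void {
    this.local = this.local.filter((i) => i._id !== item._id).concat([item]);
    this.pending.push(item);
  }

  pendingCount(): number {
    return this.pending.length;
  }

  // Throws while offline, leaving the queue intact for a later retry.
  push(): void {
    for (const item of this.pending) {
      this.backend.save(item);
    }
    this.pending = [];
  }

  pull(): void {
    this.local = this.backend.fetchAll();
  }
}
```

The sync sketch makes the tradeoff visible: local data is only as fresh as the last pull, and queued writes can collide with changes made on the backend in the meantime.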

Data Acceleration

Data sources have various data access performance characteristics, ranging from the very fast (sub-second) to the very slow (multi-second). In modern application contexts, slow data access is not acceptable to users. There are several tools and practices that can accelerate slow systems of record to keep the app experience positive.

By utilizing Cloud Caching, developers can store commonly used data from slow systems in the Kinvey Cloud for fast retrieval.
With Delta Sync, apps can retrieve only new and changed data when synchronizing data on a device.
By parallelizing asynchronous requests, the client can quickly retrieve data from multiple sources while other actions are taking place - rendering, login, etc.
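
The idea behind delta synchronization can be sketched as two steps: the server works out what changed since the client's last sync, and the client merges only that difference into its local copy. The types and function names below (SyncedDoc, DeltaSet, computeDelta, applyDelta) and the shape of the delta are assumptions made for illustration, not the platform's actual protocol.

```typescript
interface SyncedDoc {
  _id: string;
  updatedAt: string; // ISO 8601 UTC timestamp; same-format strings sort chronologically
  [property: string]: unknown;
}

interface DeletionRecord {
  _id: string;
  deletedAt: string;
}

interface DeltaSet {
  changed: SyncedDoc[];
  deleted: string[];
}

// Server side (hypothetical): everything created, updated, or deleted after `since`.
function computeDelta(docs: SyncedDoc[], deletions: DeletionRecord[], since: string): DeltaSet {
  return {
    changed: docs.filter((doc) => doc.updatedAt > since),
    deleted: deletions.filter((d) => d.deletedAt > since).map((d) => d._id),
  };
}

// Client side: drop deleted and outdated entities, then add the changed ones.
function applyDelta(local: SyncedDoc[], delta: DeltaSet): SyncedDoc[] {
  const replaced = delta.deleted.concat(delta.changed.map((doc) => doc._id));
  const kept = local.filter((doc) => replaced.indexOf(doc._id) === -1);
  return kept.concat(delta.changed);
}
```

Only the changed and deleted entities cross the network, so the cost of a sync scales with how much changed and not with the size of the collection.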

Best Practices

Modern data-driven applications have different needs and considerations than traditional client/server or website models. For developers used to working in these environments, this may require a certain shift in design approach.

Don't build mobile apps like a traditional website (page loads, click link, new page loads once data is retrieved). Instead:
- Create user-centric pages that focus on specific tasks that a user needs to complete
- Instantly render a new page, rendering any local data you may have
- Load only the data you need from the cloud (by applying filters and selecting only the properties you need for your model)
- Design your data model to facilitate this
Render the UI as quickly as possible. Do not sync or cache upon login except to "warm" the store, and then only in the background.
Load data as you go - don't try to load it all at once before presenting it to the user. Instead, utilize paging to load small segments of data, and lazy load the rest as the user navigates through the app.
If you need to sync, do it in a background thread. Don't prevent the user from using the app during a sync.
Break up larger transactions from the client into manageable pieces. Several smaller transactions are preferable to one single large transaction.
Avoid RPC-style collections, such as getMyOrders and saveMyOrders. Treat each collection as an object that has various actions (e.g. Save, Fetch, etc.).
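
Two of the practices above, loading only the properties you need and paging through data as the user navigates, can be sketched together. The fetchPage function and LazyList class are hypothetical, and the skip, limit, and fields parameter names are assumptions for this example; an in-memory array stands in for the backend.

```typescript
type PageRow = { [property: string]: unknown };

interface PageQuery {
  skip: number;
  limit: number;
  fields: string[];
}

// Stand-in for a backend query: return one page, keeping only the requested fields.
function fetchPage(source: PageRow[], query: PageQuery): PageRow[] {
  return source.slice(query.skip, query.skip + query.limit).map((row) => {
    const projected: PageRow = {};
    for (const field of query.fields) {
      projected[field] = row[field];
    }
    return projected;
  });
}

// Loads one page at a time, for example each time the user scrolls further.
class LazyList {
  private loaded: PageRow[] = [];
  private done = false;
  private source: PageRow[];
  private pageSize: number;
  private fields: string[];

  constructor(source: PageRow[], pageSize: number, fields: string[]) {
    this.source = source;
    this.pageSize = pageSize;
    this.fields = fields;
  }

  loadMore(): PageRow[] {
    if (this.done) return [];
    const page = fetchPage(this.source, {
      skip: this.loaded.length,
      limit: this.pageSize,
      fields: this.fields,
    });
    this.loaded = this.loaded.concat(page);
    // A short page means there is nothing left to fetch.
    if (page.length < this.pageSize) this.done = true;
    return page;
  }

  items(): PageRow[] {
    return this.loaded;
  }
}
```

The first screen can render as soon as the first page arrives, and large properties that the list view never shows are not transferred at all.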