David Linthicum
Contributor

Edge computing can be a data cache for public clouds

analysis
Apr 27, 2021 | 3 mins
Cloud Computing, Data Architecture

Better performance and reliability plus lower costs and greater security make this architecture worth keeping in the toolbox.

A data (or database) cache is a high-performance data storage layer that stores a subset of transient data so that future requests for that data are served faster than by accessing the data's primary storage location. In the world of edge computing, the “primary data” resides on the public cloud, and the edge device acts as an intermediary for that data, sometimes providing decoupled data processing as well.

We already understand the use of edge devices as points of data processing that are closer to the producer of the data. The key advantage here is performance.

If the data does not have to be sent to back-end processing systems, such as on public clouds, then it can be processed immediately on the edge device. This is helpful when performance could be critical, such as shutting down a jet engine that is drastically overheating. You don’t want to check with a centralized cloud system to determine a course of action for that.
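To make the idea concrete, here is a minimal Python sketch of a decision made entirely on the edge device, with no round trip to the cloud. The threshold value, the `Reading` type, and the `decide_locally` function are hypothetical names chosen for illustration; a real limit would come from the engine's specifications.

```python
from dataclasses import dataclass

# Hypothetical threshold, for illustration only.
MAX_SAFE_TEMP_C = 900.0


@dataclass
class Reading:
    sensor_id: str
    temp_c: float


def decide_locally(reading, limit=MAX_SAFE_TEMP_C):
    """Return the action the edge device takes without contacting the cloud."""
    if reading.temp_c > limit:
        return "shutdown"
    return "continue"
```

The point of the sketch is only that the check runs where the data is produced, so the response time does not depend on network latency.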

Of course, that’s just one architecture, which I call edge partitioning or tiering. This is when you divide the processing between the edge device and a centralized system and data store that typically run on a public cloud, with some data moving between the two.

Another approach to edge architecture comes from the notion that an edge device can serve as a remote data cache as well. This is a bit different from partitioning; a partition has its own independent database or data store, as well as decoupled processing occurring on that data. A data cache is simply intermediate storage for data normally stored centrally. The data cache’s single purpose is to provide better performance and reliability.

For example, say you have an edge device that controls a factory robot. It’s connected to a centralized data and processing engine hosted on a public cloud. In this case, the edge device relies on the centralized system for the production and consumption of data, as well as to provide processing of that data.

Although the edge device controlling your factory robot does not have an independent database or data store, it does host a data cache. The most-accessed data is stored locally and is directly accessible by the edge device with almost no latency.
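One way such a cache might look is a small read-through cache that keeps recently used items in memory and asks the central system only on a miss. This is a sketch under stated assumptions: `EdgeCache` and `fetch_from_cloud` are hypothetical names, and least-recently-used eviction is used here as a simple stand-in for "most-accessed" data rather than as the only reasonable policy.

```python
from collections import OrderedDict


class EdgeCache:
    """Read-through cache with least-recently-used eviction."""

    def __init__(self, fetch_from_cloud, capacity=128):
        # fetch_from_cloud is a hypothetical callable that reads one key
        # from the centralized data store.
        self._fetch = fetch_from_cloud
        self._capacity = capacity
        self._items = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self._items:
            # Cache hit: mark the key as most recently used.
            self._items.move_to_end(key)
            self.hits += 1
            return self._items[key]
        # Cache miss: go to the central store, then keep a local copy.
        self.misses += 1
        value = self._fetch(key)
        self._items[key] = value
        if len(self._items) > self._capacity:
            # Evict the least recently used entry.
            self._items.popitem(last=False)
        return value
```

Because the cache holds only copies, losing the device loses no primary data; the central store remains the source of truth.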

This is helpful when the network in the factory is less than reliable. However, this use case does not require a full-blown database on the edge device.
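The reliability benefit can be sketched as a cache that tries the central system first and falls back to the last known value when the network is down. The class name, the `fetch_from_cloud` callable, and the use of `ConnectionError` to signal an outage are assumptions made for this illustration.

```python
class StaleTolerantCache:
    """Serve the last known value when the central system is unreachable."""

    def __init__(self, fetch_from_cloud):
        # fetch_from_cloud is a hypothetical callable that raises
        # ConnectionError when the network is unavailable.
        self._fetch = fetch_from_cloud
        self._last_good = {}

    def get(self, key):
        """Return (value, is_stale)."""
        try:
            value = self._fetch(key)
        except ConnectionError:
            if key in self._last_good:
                # Network is down: fall back to the cached copy.
                return self._last_good[key], True
            # Nothing cached for this key, so the failure is passed on.
            raise
        self._last_good[key] = value
        return value, False
```

Returning the stale flag lets the caller decide whether an old value is acceptable for the task at hand.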

The advantage here is lower cost of operations and edge storage. By deciding not to place a decoupled database on the edge device, you don’t have to maintain that database or worry about sync issues with the centralized database. Moreover, the edge devices can be much smaller and cheaper—something to think about if you’re deploying thousands of them.

Security is much easier as well. If you’re storing data centrally, you can focus on security there. This does not mean that the caching system should be left exposed, but it is much easier to secure than a complete database, which has more attack vectors.

The key idea here is optimization. Using edge differently, such as leveraging data caches on those edge devices, makes sense when you can save money and time, as well as reduce risks. It’s not the right architecture every time, but it is another tool to make sure you’re doing your best to serve the business.

David S. Linthicum is an internationally recognized industry expert and thought leader. Dave has authored 13 books on computing, the latest of which is An Insider’s Guide to Cloud Computing. Dave’s industry experience includes tenures as CTO and CEO of several successful software companies, and upper-level management positions in Fortune 100 companies. He keynotes leading technology conferences on cloud computing, SOA, enterprise application integration, and enterprise architecture. Dave writes the Cloud Computing blog for InfoWorld. His views are his own.
