Many enterprises simply move their data problems to the cloud. Instead, invest the time and money to clean up your data so that it becomes more valuable to the business.

To put it kindly, most enterprise data is less than optimal. Want to test this statement at your company? Just ask where the customer data of record resides. Ask people in four different departments and you'll get four very different answers.

This issue is the natural byproduct of 20 to 30 years spent creating new databases using whatever technology was popular at the time: mainframe databases, big relational databases, open source SQL databases, object databases, and now special-purpose databases. Heterogeneity and complexity are an undeniable reality for those looking to move terabytes of data to the cloud. You must find a database analog in the cloud that is either an exact brand match or one that requires a minimal amount of restructuring and conversion. Unfortunately, this approach perpetuates the database silo problem. It's a classic and seemingly endless example of kicking the can down the road for the next generation of IT. The trouble is that the "kick the can" path is relatively cheap. The "fix everything" path? Not so much.

Those with a short-term view often find that migrating data to a public cloud provides no real gains in cost savings, agility, or productivity. Indeed, the problem that resided in their data center is now a problem that resides in the cloud.

The pandemic drove many organizations to create a larger role for the public cloud within the enterprise. Most enterprises just want their move to the cloud to be fast and cheap, so they take a lift-and-shift approach to data migration. At first, this method may make budgetary sense. Taking the long view, however, lift and shift means you'll have to migrate your data twice: once the wrong way, and then again the right way. Here's the bad news: The most effective data migration efforts take years, not months.

Today, some look at migrating data to the cloud as an opportunity to finally fix their enterprise data: to make data a first-class citizen and do wonderful things with all the data the enterprise has collected over the years. The best migration efforts focus on normalizing and improving all the data as it moves to the public cloud. Here are three fundamentals of a more effective data migration:

Single source of truth. One database should manage the data of record about customers, inventory, sales, and so on. The business should not have to gather data from 20 different places and deal with the resulting data quality issues. This may mean major surgery on your data, and perhaps the first real normalization of your databases in 30 years. However, it is a basic step that makes enterprise data more usable and more valuable to the company.

Heterogeneous metadata management. Put an abstraction layer over all cloud and on-premises databases so you can alter the structure and meaning of the data from a single interface.

Data virtualization. A common architecture trick is to leverage data virtualization, which lets you view any number of physical databases and virtually combine or split them to meet your existing needs. The power of data virtualization is that it does not require back-end physical database changes to restructure data. It's a quick way to move databases to the cloud and still deal with data in much more efficient and agile ways. If this sounds like new technology, it's not.
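To make that last idea concrete, here is a minimal sketch of the data virtualization pattern in Python. The two SQLite databases and the VirtualCustomerView class are illustrative stand-ins, not any particular product; a real virtualization layer would federate queries across production back ends at far greater scale. The point of the design is that neither physical database is restructured; the combining happens entirely in the view layer.

```python
# Minimal illustration of data virtualization: one logical "customers" view
# presented over two separate physical databases, without changing either
# back end. The SQLite sources below are stand-ins for real systems.
import sqlite3


def make_source(rows):
    """Create a stand-in physical database with its own customer table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER, name TEXT, region TEXT)")
    conn.executemany("INSERT INTO customers VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


class VirtualCustomerView:
    """Logical view that unions customer records from every registered source."""

    def __init__(self, sources):
        self.sources = sources  # the physical databases stay untouched

    def query(self, region=None):
        results = []
        for source_name, conn in self.sources.items():
            sql = "SELECT id, name, region FROM customers"
            params = ()
            if region:
                sql += " WHERE region = ?"
                params = (region,)
            for row in conn.execute(sql, params):
                # Tag each record with the system it came from.
                results.append({"source": source_name, "id": row[0],
                                "name": row[1], "region": row[2]})
        return results


# Two "physical" databases, e.g. a legacy CRM and an e-commerce system.
crm = make_source([(1, "Acme Corp", "EMEA")])
shop = make_source([(2, "Globex", "AMER")])

view = VirtualCustomerView({"crm": crm, "shop": shop})
print(view.query())        # one logical result set drawn from both back ends
print(view.query("EMEA"))  # filters are pushed down to each source
```

In practice the same idea is delivered by data virtualization tools and cloud federated-query services rather than hand-rolled code, but the design choice is the same: restructure the logical view, not the physical databases.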
Data virtualization has been around since the 1990s and is now available in the public clouds. Some view data virtualization as cheating. It's actually a sensible compromise if there is only a small budget to augment and improve data moving to the cloud.

If you want to lock in failure, relocating your databases as-is to a public cloud will ensure it. Let's face facts: Your data is probably a mess. There comes a time when Band-Aids can no longer hold together decades of data slices and dices. It's past time for most enterprise data to undergo the surgery required to fix the underlying problems. Moving the problem to the cloud simply creates a bigger problem. Do you really want to be that company?