Let's review. First, semantic debt is inevitable. It arises in an organization when, broadly speaking, the maps no longer match the territory. This always happens, sometimes faster than anyone realizes, because the business changes: regulatory changes like GDPR, competitive changes like the introduction of new products and services, and technological changes like improved automation or process instrumentation all leave the old ways of understanding an organization inadequate. The difference between where we think we are, with respect to our model of the organization's entities and relationships, and where the organization actually is: that's semantic debt.
Every time data is moved from one storage system to another it has to have, in purely practical terms, a format. This format is an implicit data model. Usually it's some subset of the data model the data originates in - e.g. the order management system or device - with some kind of mapping function into the data model or structure it's moving into. So we take our order management extract, with its specific set of fields, and we rename the fields and apply some transformation logic to their contents so the data will flow smoothly and without error into the target structure. We call this combination of data model and data a "semantic asset."
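To make the mapping function concrete, here is a minimal sketch of an order management extract being renamed and transformed on its way into a target structure. All field names, the rename table, and the transformation rules are hypothetical, invented for illustration; they don't come from any particular system.

```python
# Hypothetical rename table: source field -> target field.
# Source fields that are not listed here are dropped, which is why the
# target format is only a subset of the originating data model.
FIELD_MAP = {
    "ord_id": "order_id",
    "cust_nm": "customer_name",
    "ord_amt": "amount_cents",
}

# Hypothetical transformation logic, keyed by target field name.
TRANSFORMS = {
    "customer_name": lambda v: v.strip().title(),
    "amount_cents": lambda v: round(float(v) * 100),
}


def to_target(record):
    """Map one source record into the target structure."""
    out = {}
    for source_field, target_field in FIELD_MAP.items():
        if source_field not in record:
            continue  # the extract may not carry every mapped field
        value = record[source_field]
        transform = TRANSFORMS.get(target_field)
        out[target_field] = transform(value) if transform else value
    return out


extract_row = {
    "ord_id": "A-1001",
    "cust_nm": "  ada lovelace ",
    "ord_amt": "19.99",
    "legacy_flag": "Y",  # exists in the source model, has no place in the target
}
target_row = to_target(extract_row)
print(target_row)
```

The rename table and the transforms together are the implicit data model: nobody drew it as a diagram, but it decides which parts of the source survive the move and what they mean on arrival.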
When we look at the picture above, we've got a very large semantic asset that by time t4 no longer correctly maps to its business model. The models are out of sync. The semantic asset may still capture many useful portions of the business model, but it needs to be updated. We say then that the asset is in semantic debt. How much semantic debt it carries is a function of the distance between the data model and the business model: At t1 there is no debt. At t2 and t3 there is a little more debt each time. By t4 our data model includes entities and relationships that are no longer in the business model, and our business model includes entities and relationships that aren't in the data model.
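One rough way to put a number on that distance is sketched below. It treats each model as a set of entity names and counts what is in one model but not the other. This is an assumption made for illustration, not a definition: the entity names and the t1 to t4 snapshots are hypothetical, relationships are ignored, and a real measure would probably weight entities by how much they matter.

```python
def semantic_debt(data_model, business_model):
    """Compare two sets of entity names and count the mismatches."""
    stale = data_model - business_model    # modeled, but no longer in the business
    missing = business_model - data_model  # in the business, but not modeled
    return {"stale": stale, "missing": missing, "debt": len(stale) + len(missing)}


# The data model was designed at t1 and never updated (hypothetical entities).
data_model = {"Customer", "Order", "Product", "Store"}

# Hypothetical snapshots of the business model drifting over time.
business_snapshots = {
    "t1": {"Customer", "Order", "Product", "Store"},
    "t2": {"Customer", "Order", "Product", "Store", "Subscription"},
    "t3": {"Customer", "Order", "Product", "Store", "Subscription", "Device"},
    "t4": {"Customer", "Order", "Product", "Subscription", "Device", "ConsentRecord"},
}

debts = {t: semantic_debt(data_model, bm) for t, bm in business_snapshots.items()}
for t, result in debts.items():
    print(t, result["debt"], sorted(result["stale"]), sorted(result["missing"]))
```

In this toy version the debt is zero at t1, grows by one entity at t2 and again at t3, and by t4 shows both kinds of mismatch at once: a stale entity the business has dropped and several new ones the data model has no place for.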
If at t4 we were to try to report on the business model using the data model in ways that are useful to the organization, we'd have to ignore good portions of the data model. We'd also have to find out where the organization was putting the data it's creating that won't fit in the data model. We might have to import Excel spreadsheets into our reporting system, for example, and/or engage in endless arguments over governance and entity decomposition to try to rationalize keeping the old model.
Far better to just make the change. There's more going on in the business than is reflected in the data model.