Semantic Debt: January 2024

Monday, January 29, 2024

2.4 Most IT VPs are software people, not database people

This subsection is the closing essay of the section entitled Data Management is Hard, which explains why data people are the way they are. It is more sociological than the others, and also more critical and controversial: it takes on a management problem.

It is a strange historical accident that most VPs in the IT industry gain their experience in software development and not data management. One can imagine an alternate universe where this is not so: organizations go decades producing horrible software until they decide to hire an experienced software architect, but in the meantime their databases are all in fifth normal form and none of their data warehouses are crude copies of all the tables in the accounting module. We don’t live in this alternate universe.

That might be a sad comment on the triumph of PR over studious hard work. But it’s more likely due to the simple fact that software teams are often at least three times the size of the database teams they work with, if there is even a database team at all. Moreover, database teams are often said to “support” software teams, which puts DBAs in a role not much higher on the corporate totem pole than customer support or desktop support. When you’re looking around for someone to run a large team, you look for people who’ve run large teams, and not those who’ve spent their careers running small “support” teams. That consideration ends up, by design, selecting people who’ve run large software groups for VP positions.

Outline for _Practical Data Management_

A few years back, when I found myself with a lot of time on my hands, I started writing a very practical guide to data management. It was wordy and opinionated, as you might expect from me. But it filled a gap in data management, specifically the gap between cookbooks that published schemas, like Silverston's book, and data modeling practice books, which started with the Enterprise Architecture and worked their way down to the mere details. The goal of the book was to be practical at the detail level. A practitioner, whether beginner or advanced, should come away from the book having learned something. To do that, the book needed to include bits of SQL where relevant, but also a discussion of the theory, the practice, and how the two balance in the real world.

I find myself with time on my hands once again, although this time I'm much more comfortable and in a more stable frame of mind. The Universe seems to be telling me something, and it sounds like "maybe you should be writing this down." I've avoided documentation in favor of action for most of my professional life. Now I'm presented with an opportunity once again, and I find myself arguing that writing it down violates the habit of action. But it's time for new habits.

So this is, practically, a different kind of action. The tension between theory and practice is fundamental to data management, which is a new science still feeling its way through the growing pains of formalization. I think I've got something to say about it. So here's an outline of the book. I'll post the chapters that have already been written, and the rest as they come to fruition.

Practical Data Management

1. Introduction

1b. A note on terminology. What do I mean by…?

2. Data Management is Hard

2.1. There are a lot of sources
2.2. Applications change over time
2.3. It takes time to get the database right
2.4. Most IT VPs are software people, not database people

3. Patterns, heuristics, logic and philosophical disagreements

3.1. Codd’s essential insight
3.2. Normal forms
3.3. Layers
3.4. Patterns of persistence
3.5. Patterns of movement
3.6. Patterns of development
3.7. Patterns of organization

3.7.1. Analysis vs. Operations
3.7.2. Customer-driven vs. Architecture-driven

3.8. Bizness Logic
3.9. Theories of Knowledge and Truth
3.10. Governance

4. The pieces, defined

4.1. Source systems
4.2. Data lakes
4.3. Operational Data Stores
4.4. Data warehouses
4.5. Data Vaults
4.6. Master Data Management
4.7. Pipelines
4.8. Logs

5. Platform choices

5.1. The value of a fixed schema
5.2. The value of schema-on-demand
5.3. Brands of DB
5.4. NoSQL choices
5.5. Pipeline tool choices

6. The build process

6.1. Can it ever be Agile?
6.2. Practical concerns
6.3. Addressing long-term concerns
6.4. The value of feedback
6.5. A simple model

7. Fallacies

7.1. “Agile”
7.2. Throwing stuff into the lake
7.3. Naming conventions
7.4. Governance
7.5. Short-term wins
7.6. Long-term architecture
