A few years back, when I found myself with a lot of time on my hands, I started writing a very practical guide to data management. It was wordy and opinionated, as you might expect from me. But it filled a gap in the data management literature: the gap between cookbooks that publish schemas, like Silverston's book, and data modeling practice books, which start with the Enterprise Architecture and work their way down to the mere details. The goal of the book was practical, at the detail level. A practitioner, whether beginner or advanced, should come away from the book having learned something. To do that, the book needed to include bits of SQL where relevant, but also a discussion of the theory, the practice, and how the two balance in the real world.
I find myself with time on my hands once again, although this time I'm much more comfortable and in a more stable frame of mind. The Universe seems to be telling me something, and it sounds like "maybe you should be writing this down." I've avoided documentation in favor of action for most of my professional life. Now I'm presented with an opportunity once again, and I find myself arguing that writing it down violates the habit of action. But it's time for new habits.
So this is, practically, a different kind of action. The tension between theory and practice is fundamental to data management, which is a new science still feeling its way through the growing pains of formalization. I think I've got something to say about it. So here's an outline of the book. I'll post the chapters that are already written, and the rest as they come to fruition.
Practical Data Management
1b. A note on terminology. What do I mean by…?
2.1. There are a lot of sources
2.2. Applications change over time
2.3. It takes time to get the database right
2.4. Most IT VPs are software people, not database people
3.1. Codd’s essential insight
3.2. Normal forms
3.3. Layers
3.4. Patterns of persistence
3.5. Patterns of movement
3.6. Patterns of development
3.7. Patterns of organization
3.7.1. Analysis vs. Operations
3.7.2. Customer-driven vs. Architecture-driven
3.8. Bizness Logic
3.9. Theories of Knowledge and Truth
3.10. Governance
4.1. Source systems
4.2. Data lakes
4.3. Operational Data Stores
4.4. Data warehouses
4.5. Data Vaults
4.6. Master Data Management
4.7. Pipelines
4.8. Logs
5.1. The value of a fixed schema
5.2. The value of schema-on-demand
5.3. Brands of DB
5.4. NoSQL choices
5.5. Pipeline tool choices
6.1. Can it ever be Agile?
6.2. Practical concerns
6.3. Addressing long-term concerns
6.4. The value of feedback
6.5. A simple model
7.1. “Agile”
7.2. Throwing stuff into the lake
7.3. Naming conventions
7.4. Governance
7.5. Short-term wins
7.6. Long-term architecture