7 Dangerous Myths DBAs Believe about Data Modeling
Karen Lopez and John Sterrett
Yes, please do tweet/share today’s event
@datachick
@JohnSterrett
@ERStudio
Karen López
Karen has 20+ years of data and information architecture
experience on large, multi-project programs.
She is a frequent speaker on data modeling, data-driven
methodologies and pattern data models.
She blogs at community.embarcadero.com, datamodel.com
and Dataversity.net.
She wants you to love your data.
John Sterrett
John has 10+ years of experience with SQL Server
performance tuning, high availability, disaster recovery,
database design, development and database administration.
He is a frequent speaker on performance tuning, high
availability, disaster recovery and proactive monitoring.
He blogs at community.embarcadero.com, johnsterrett.com,
and sqlauthority.com.
He likes to use tools and automate processes so he has more
time to play with his family and mix records on his
turntables.
About You...
Who Are You?
About You...
Model Much?
Outcomes TODAY
Understand & Describe Enterprise Data Modeling Process and Values
Describe the Costs, Benefits & Risks of EDM-driven projects
Understand how Modern Data Modeling efforts contribute to agility
Understand how Modern Data Modeling efforts contribute to quality
7 Dangerous Myths
1. Data models are only for creating new databases
2. Data modeling is just documentation
3. DDL generated from data models is unusable
4. Data modeling isn’t needed – ORM scripts are good enough
5. I can code that myself faster & better than the data modeling tool can
6. All normalization just slows things down
7. Data modeling is just boxes and lines (the diagram is the model)
What Other Myths Do You Hear?
Put them in the Q&A!
Dangerous, really?
Computers are good at syntax and
consistency; humans aren’t
A non-trivial database will have tens or
hundreds of thousands of objects in it
Human error costs money, delays projects
and puts business and customers at risk
Facts of Life, 2015
Faster staff turnover on projects
Geographically distributed resources
Overloaded with projects
IT must be agile
Just the right “documentation”
Models are better than documents
Modeling leads to better designs
Just go ask
the guy who
wrote it.
Just New Databases
Enterprise Data Modeling
encourages a practical balance between enterprise and project points of view
comprises both application and enterprise data models
enables IT groups to respond more quickly and effectively to business needs
delivers information that is the most useful to the business
uses the proper tools and techniques in delivering project outcomes
Data Models
Conceptual (Data) Model
Logical Data Model
Physical Data Model(s)
More than New Databases
Reverse Engineering
More than a picture
Vendor Applications
Legacy Databases
Data Lineage
Just Documentation
Application development project
Agile/Scrum development project
Integration project
Business modeling project
Package Application Project: Assessment/fit
Package Application Project: Logical Modeling
ORM scripts
…are good enough
ORM Code…
SELECT "ORMTable"."Name", "ORMTable"."Address1", "ORMTable"."City",
       "ORMTable"."State", "ORMTable"."ZipCode", "ORMTable"."PKID",
       "ORMTable"."InceptionDate", "ORMTable"."PostDate",
       "ORMTable"."Address2", "ORMTable"."LoanID"
FROM "Database"."dbo"."ORMTable" "ORMTable"
INNER JOIN "Database"."dbo"."ORMTable" "ORMTable2" ON (
        ("ORMTable"."PKID" = "ORMTable2"."PKID")
        AND ("ORMTable"."InceptionDate" = "ORMTable2"."InceptionDate")
    )
    AND ("ORMTable"."PostDate" = "ORMTable2"."PostDate")
WHERE (
    "ORMTable"."PKID" = N'ABC1234'
    AND (
        "ORMTable"."InceptionDate" >= {ts '2015-06-07 00:00:00' }
        AND "ORMTable"."InceptionDate" < {ts '2015-06-07 00:00:01' }
    )
    AND (
        "ORMTable"."PostDate" >= {ts '2015-04-08 00:41:26' }
        AND "ORMTable"."PostDate" < {ts '2015-04-08 00:41:27' }
    )
    AND "ORMTable"."ID" = N'1'
    OR "ORMTable2"."PKID" = N'ABC1234'
SELECT *
Multiple pages of SQL for a simple query
Full Table Scans
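One way to check whether a generated query ends up scanning a whole table is to ask the engine for its plan. The sketch below is only an illustration: it uses SQLite (through Python's standard `sqlite3` module) as a stand-in for SQL Server, and the `loan` table and `ix_loan_pkid` index are hypothetical names that do not come from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and index, used only to illustrate plan inspection.
conn.execute("CREATE TABLE loan (pkid TEXT, post_date TEXT, name TEXT)")
conn.execute("CREATE INDEX ix_loan_pkid ON loan (pkid)")


def plan(sql):
    """Return the 'detail' text of each step SQLite reports for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[3] for row in rows]


# Predicate on the indexed column: the planner can seek through the index.
indexed = plan("SELECT name FROM loan WHERE pkid = 'ABC1234'")

# Predicate on a column with no index: every row has to be read.
unindexed = plan("SELECT name FROM loan WHERE name = 'Smith'")

print(indexed)
print(unindexed)
```

The exact wording of the plan text differs between SQLite versions, but a step that begins with `SCAN` reads the whole table, while `SEARCH ... USING INDEX` does not. SQL Server exposes the same information through its execution plans.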
Unusable DDL/DML
Really??
Once upon a time, you might have been
able to argue that data modeling
tools didn’t generate
usable DDL and DML scripts.
That ship has sailed.
Don’t break my source code…
I spent hours developing my code. Please
don’t break it!
SELECT * FROM TABLE
SELECT * FROM TABLE ORDER BY 2 DESC
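The two statements above are fragile because they depend on column order rather than column names. The following is a minimal sketch of that fragility, assuming a schema change that inserts a new column in second position. The `book_v1` and `book_v2` tables are hypothetical stand-ins for "before" and "after" versions of one table, and SQLite stands in for SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical "before" and "after" versions of the same table.
conn.execute("CREATE TABLE book_v1 (title TEXT, price REAL)")
conn.execute("CREATE TABLE book_v2 (title TEXT, isbn TEXT, price REAL)")
conn.executemany("INSERT INTO book_v1 VALUES (?, ?)", [
    ("Space Command Manual", 10.00),
    ("Bookmark VenusBarbie", 3.99),
])
conn.executemany("INSERT INTO book_v2 VALUES (?, ?, ?)", [
    ("Space Command Manual", "22222", 10.00),
    ("Bookmark VenusBarbie", "33333", 3.99),
])

# The same statement text is run against both versions of the table.
QUERY = "SELECT * FROM {table} ORDER BY 2 DESC"
before = conn.execute(QUERY.format(table="book_v1")).fetchall()
after = conn.execute(QUERY.format(table="book_v2")).fetchall()

# Column 2 was price and is now isbn, so the sort order silently flips,
# and SELECT * hands the application a row of a different width.
print([row[0] for row in before])
print([row[0] for row in after])
```

Naming the columns in both the select list and the `ORDER BY` clause keeps the statement stable when columns are added or reordered.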
I can script it better & faster
Data Models are Complex ….
Requirements
Data Model
Database*
More requirements / changes / tuning / whims
+ Non Model Stuff
Data Model Driven
Data Model Driven
Just boxes and lines
[Diagram: entity-relationship model]
JobCandidate: JobCandidateID, Resume, ModifiedDate
Employee: BusinessEntityID, NationalIDNumber, JobTitle, Gender, HireDate
Department: DepartmentID, Name, GroupName, ModifiedDate
EmployeePayHistory: BusinessEntityID, RateChangeDate, Rate, PayFrequency, ModifiedDate
EmployeeDepartmentHistory: EndDate, ModifiedDate
Relationship labels: Becomes, Has (1-to-N cardinalities)
Metadata
• More information than goes into the database
• Used to manage models
• Can be used to manage:
    • Security Requirements
    • Privacy Requirements
    • Stewardship
    • Quality Requirements
    • Semantics of data
Normalization is slow
Normalization
Is all about the keys, ‘bout the keys, ‘bout the keys…
Depends on understanding the MEANING of the keys and columns
Goes all to heck* when you have surrogate keys
Depends on the makeup of the key parts (columns)
Normalization, briefly
1NF – all instances (rows) have the same facts (columns). There are no repeating or duplicate columns
2NF – only applies to multi-part keys. No fact is about just part of the key
3NF – No fact depends on another non-key column
3NF
Every fact is either part of a key or depends upon the key,
the whole key, and nothing but the key.
…so help you Codd
Michael J Swart
Remember…
“Normalization is like marriage…
…you always end up with more relations”
-Data Modeling Essentials, 3rd Edition
Simsion & Witt
It’s taught wrong?
Taught as a process, not a
measurement
1NF, 2NF, 3NF, etc.
Used like a grade, instead
of a measurement
Magical “3NF”
No data professional
Builds a data model/design by going from 1NF to 2NF to 3NF
to ….
Says a database has a normal form (only tables have normal
forms)
Thinks of #NF as a “Grade” for their database design
Wants to put data at risk just to make a poorly-written query
run faster
Data Models are Complex …
Sample data
Seller ID  Book ID  Book Name                 Price  Seller Name
AB         ABC      I Can Be VenusBarbie       9.99  AmazingBooks
AB         ABD      Space Command Manual      10.00  AmazingBooks
AB         ADH      Bookmark VenusBarbie       3.99  Amazing Books
BD         3000001  I Can Be…VenusBarbie       6.00  Book Deals
BD         3000015  Data Stories              27.00  BookDeals
BD         4000200  Data Modeling Essentials  45.00  Book Deals
BD         4000002  I Can Be…VenusBarbie       9.99  Book Deals
DMC        110ABC   Data Stories              28.00  Datamodel.com
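The sample rows repeat the seller name on every line, and the spellings have already drifted apart. A small sketch of how that can be detected follows, assuming (the slide does not say so explicitly) that Seller ID is meant to determine Seller Name. The `violations` helper is a hypothetical name introduced only for this illustration.

```python
from collections import defaultdict

# Rows copied from the sample data:
# (seller_id, book_id, book_name, price, seller_name)
ROWS = [
    ("AB", "ABC", "I Can Be VenusBarbie", 9.99, "AmazingBooks"),
    ("AB", "ABD", "Space Command Manual", 10.00, "AmazingBooks"),
    ("AB", "ADH", "Bookmark VenusBarbie", 3.99, "Amazing Books"),
    ("BD", "3000001", "I Can Be…VenusBarbie", 6.00, "Book Deals"),
    ("BD", "3000015", "Data Stories", 27.00, "BookDeals"),
    ("BD", "4000200", "Data Modeling Essentials", 45.00, "Book Deals"),
    ("BD", "4000002", "I Can Be…VenusBarbie", 9.99, "Book Deals"),
    ("DMC", "110ABC", "Data Stories", 28.00, "Datamodel.com"),
]


def violations(rows, determinant, dependent):
    """Return determinant values that map to more than one dependent value."""
    seen = defaultdict(set)
    for row in rows:
        seen[row[determinant]].add(row[dependent])
    return {key: sorted(values) for key, values in seen.items() if len(values) > 1}


SELLER_ID, SELLER_NAME = 0, 4  # column positions in ROWS
conflicts = violations(ROWS, SELLER_ID, SELLER_NAME)
for seller_id, names in conflicts.items():
    print(seller_id, names)
```

Two of the three sellers turn up with more than one spelling of their name, which is the kind of inconsistency that storing each seller name exactly once would rule out.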
Better…
Seller ID  Seller Name
AB         AmazingBooks
BD         Book Deals
DMC        Datamodel.com

Seller ID  Book ID  ISBN/UPC  Price
AB         ABC      11111      9.99
AB         ABD      22222     10.00
AB         ADH      33333      3.99
BD         X000001  11111      6.00
BD         X000015  44444     17.00
BD         4000200  55555     45.00
BD         4000002  11111      9.99
DMC        110ABC   44444     28.00

ISBN/UPC  Book Name
11111     I Can Be… VenusBarbie
22222     Space Command Manual
33333     Bookmark VenusBarbie
44444     Data Stories
55555     Data Modeling Essentials

Bonus: Now we can keep data on Sellers and books we don’t have prices for yet.
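The three-table design can be sketched in SQL as follows, again using SQLite through Python as a stand-in for SQL Server. The table names, column names, data types and key choices are assumptions made for this illustration, only a few of the slide's rows are loaded, and the seller `NEW` is a hypothetical addition that shows the "bonus" of storing a seller before any prices exist.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE seller (
        seller_id   TEXT PRIMARY KEY,
        seller_name TEXT NOT NULL
    );
    CREATE TABLE book (
        isbn_upc  TEXT PRIMARY KEY,
        book_name TEXT NOT NULL
    );
    CREATE TABLE seller_book (
        seller_id TEXT NOT NULL REFERENCES seller (seller_id),
        book_id   TEXT NOT NULL,
        isbn_upc  TEXT NOT NULL REFERENCES book (isbn_upc),
        price     REAL NOT NULL,
        PRIMARY KEY (seller_id, book_id)
    );
""")
conn.executemany("INSERT INTO seller VALUES (?, ?)", [
    ("AB", "AmazingBooks"), ("BD", "Book Deals"), ("DMC", "Datamodel.com"),
    ("NEW", "Seller With No Listings"),  # hypothetical, not on the slide
])
conn.executemany("INSERT INTO book VALUES (?, ?)", [
    ("11111", "I Can Be… VenusBarbie"), ("44444", "Data Stories"),
])
conn.executemany("INSERT INTO seller_book VALUES (?, ?, ?, ?)", [
    ("AB", "ABC", "11111", 9.99),
    ("BD", "4000002", "11111", 9.99),
    ("DMC", "110ABC", "44444", 28.00),
])

# Joining the three tables rebuilds the original wide rows on demand.
wide = conn.execute("""
    SELECT sb.seller_id, sb.book_id, b.book_name, sb.price, s.seller_name
    FROM seller_book AS sb
    JOIN seller AS s ON s.seller_id = sb.seller_id
    JOIN book AS b ON b.isbn_upc = sb.isbn_upc
    ORDER BY sb.seller_id, sb.book_id
""").fetchall()

# Sellers that are on file but have no priced book yet.
unlisted = [row[0] for row in conn.execute("""
    SELECT s.seller_id
    FROM seller AS s
    LEFT JOIN seller_book AS sb ON sb.seller_id = s.seller_id
    WHERE sb.seller_id IS NULL
    ORDER BY s.seller_id
""")]
print(wide)
print(unlisted)
```

Each seller name and book name is now stored once, and the wide rows remain available through the join whenever a report needs them.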
Normalization & Data Volumes
Less redundant data means less data overall
Less redundant data means much faster DUI (delete, update, insert) processing
Less redundant data means faster backups and faster recovery
Less redundant data means smaller tables, indexes and databases
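The effect on update processing can be illustrated by counting how many rows a single rename has to touch. This is a sketch under stated assumptions: the `listing_wide` and `seller` tables and the new name `Book Deals Inc.` are hypothetical, and SQLite stands in for SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Wide table: the seller name is repeated on every listing row.
    CREATE TABLE listing_wide (
        seller_id TEXT, book_id TEXT, price REAL, seller_name TEXT
    );
    -- Normalized table: the seller name is stored once per seller.
    CREATE TABLE seller (seller_id TEXT PRIMARY KEY, seller_name TEXT);
""")
conn.executemany("INSERT INTO listing_wide VALUES (?, ?, ?, ?)", [
    ("BD", "3000001", 6.00, "Book Deals"),
    ("BD", "3000015", 27.00, "Book Deals"),
    ("BD", "4000200", 45.00, "Book Deals"),
    ("BD", "4000002", 9.99, "Book Deals"),
])
conn.execute("INSERT INTO seller VALUES ('BD', 'Book Deals')")

# The same logical change, a seller rename, applied to each design.
RENAME = "UPDATE {table} SET seller_name = 'Book Deals Inc.' WHERE seller_id = 'BD'"
wide_rows_touched = conn.execute(RENAME.format(table="listing_wide")).rowcount
normalized_rows_touched = conn.execute(RENAME.format(table="seller")).rowcount

print(wide_rows_touched, normalized_rows_touched)
```

In the wide design the cost of the rename grows with the number of listings, while in the normalized design it stays at one row per seller.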
What Other Myths Do You Hear?
Put them in the Q&A!
Checking...
Which Myths?
7 Dangerous Myths
1. Data models are only for creating new databases
2. Data modeling is just documentation
3. DDL generated from data models is unusable
4. Data modeling isn’t needed – ORM scripts are good enough
5. I can code that myself faster & better than the data modeling tool can
6. All normalization just slows things down
7. Data modeling is just boxes and lines (the diagram is the model)
Recommended Books
Thank you, you were great.
Let’s do this again some time!
Karen Lopez @datachick
John Sterrett @johnsterrett
