DB (safe) MIGRATIONS
rails db:migrate:safeeeeeee...
Migrations
● A migration is a set of database instructions.
● They describe database changes.
Rails migrations
● Rails migrations let you use Ruby to define changes to
your database schema, so the schema can be kept under
version control and stay synchronized with the actual
code.
Some of the unsafe migrations
● Adding a column
● Backfilling data
● Removing a column
● Changing the type of a column
● Renaming a column
● Renaming a table
● Creating a table with the force option
● Adding a check constraint
● Setting NOT NULL on an existing column
● Executing SQL directly
Postgres-specific checks
● Adding an index non-concurrently
● Adding a reference
● Adding a foreign key
● Adding a json column
Adding a column
Not really!
Adding a column
Locks!
DB locks
● Locks are a mechanism for ensuring multiple operations
don’t update the same row at the same time.
● Postgres has 8 table-level lock modes, ranging from ACCESS
SHARE (taken by plain reads; everyone else can still read
and write) to ACCESS EXCLUSIVE (no one else is permitted to
read or write the table).
● Certain database migrations will obtain an ACCESS
EXCLUSIVE lock, and prevent the rest of your application
from reading or writing that table until the migration
completes.
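To make the lock modes concrete, here is a minimal sketch of the Postgres table-level lock conflict matrix in plain Ruby. The constant and method names (LOCK_MODES, CONFLICTS, STATEMENT_LOCKS, blocked?) are made up for illustration; the statement-to-lock mapping covers only a few common statements.

```ruby
# The 8 table-level lock modes in Postgres, weakest to strongest.
LOCK_MODES = [
  "ACCESS SHARE", "ROW SHARE", "ROW EXCLUSIVE", "SHARE UPDATE EXCLUSIVE",
  "SHARE", "SHARE ROW EXCLUSIVE", "EXCLUSIVE", "ACCESS EXCLUSIVE"
].freeze

# For each requested mode, the modes held by others that block it.
CONFLICTS = {
  "ACCESS SHARE"           => ["ACCESS EXCLUSIVE"],
  "ROW SHARE"              => ["EXCLUSIVE", "ACCESS EXCLUSIVE"],
  "ROW EXCLUSIVE"          => ["SHARE", "SHARE ROW EXCLUSIVE", "EXCLUSIVE", "ACCESS EXCLUSIVE"],
  "SHARE UPDATE EXCLUSIVE" => ["SHARE UPDATE EXCLUSIVE", "SHARE", "SHARE ROW EXCLUSIVE",
                               "EXCLUSIVE", "ACCESS EXCLUSIVE"],
  "SHARE"                  => ["ROW EXCLUSIVE", "SHARE UPDATE EXCLUSIVE", "SHARE ROW EXCLUSIVE",
                               "EXCLUSIVE", "ACCESS EXCLUSIVE"],
  "SHARE ROW EXCLUSIVE"    => ["ROW EXCLUSIVE", "SHARE UPDATE EXCLUSIVE", "SHARE",
                               "SHARE ROW EXCLUSIVE", "EXCLUSIVE", "ACCESS EXCLUSIVE"],
  "EXCLUSIVE"              => LOCK_MODES - ["ACCESS SHARE"],
  "ACCESS EXCLUSIVE"       => LOCK_MODES
}.freeze

# Lock mode taken by a few common statements (illustrative subset).
STATEMENT_LOCKS = {
  "SELECT"                    => "ACCESS SHARE",
  "UPDATE"                    => "ROW EXCLUSIVE",
  "CREATE INDEX"              => "SHARE",
  "CREATE INDEX CONCURRENTLY" => "SHARE UPDATE EXCLUSIVE",
  "ALTER TABLE ADD COLUMN"    => "ACCESS EXCLUSIVE"
}.freeze

# True when a request for `requested` must wait behind a held `held` lock.
def blocked?(requested, held)
  CONFLICTS.fetch(requested).include?(held)
end

# Does a plain SELECT wait while ALTER TABLE holds its lock?
puts blocked?(STATEMENT_LOCKS["SELECT"], STATEMENT_LOCKS["ALTER TABLE ADD COLUMN"])
# Does an UPDATE wait behind a concurrent index build?
puts blocked?(STATEMENT_LOCKS["UPDATE"], STATEMENT_LOCKS["CREATE INDEX CONCURRENTLY"])
```

The first check shows why a migration holding ACCESS EXCLUSIVE stalls even read traffic; the second shows why a concurrent index build leaves writers alone.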
Users table: table locked → column added → default value set → migrated
How can this be
avoided???
DON’T add columns with
a default value.
Because:
● Adding a column with a default takes an ACCESS EXCLUSIVE
lock and, on older Postgres versions, rewrites every row
while holding it. With enough rows in the table and enough
traffic on the system, that can and will cause downtime.
● Postgres 11 addresses this problem in certain
circumstances: adding a column with a static default value
no longer rewrites the table, so the ACCESS EXCLUSIVE lock
is held only briefly. But note the caveat: under certain
circumstances.
● For example, adding a new column with a volatile default,
such as a generated UUID, still rewrites the table while
holding that lock.
Adding a column
(Without a default value)
Now let’s try that again
Adding a column (without a default value)
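A minimal sketch of what such a migration might look like, assuming a users table and a boolean column named active. The class name and the Rails version tag 7.0 are assumptions. The first block is only a stand-in so the snippet loads outside a Rails app; it is not part of a real migration.

```ruby
# Stand-in so this sketch loads outside a Rails app (illustration only);
# inside a Rails app ActiveRecord is already loaded and this is skipped.
unless defined?(ActiveRecord)
  module ActiveRecord
    class Migration
      def self.[](_version)
        self
      end
    end
  end
end

# Hypothetical migration: table, column and class names are assumptions.
class AddActiveToUsers < ActiveRecord::Migration[7.0]
  def up
    # No default here, so existing rows are not touched.
    add_column :users, :active, :boolean
    # Setting the default afterwards affects only rows inserted from now on.
    change_column_default :users, :active, true
  end

  def down
    remove_column :users, :active
  end
end
```

Existing rows keep NULL in the new column until they are backfilled, which is a separate step.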
DONE!
Actually no!
Transactions!
DB transactions
● Transactions combine multiple database operations into a
single, “all-or-nothing” operation.
● They provide four guarantees: atomicity, consistency,
isolation, and durability (“ACID”).
● Consistency and isolation rely in part on locks.
● When a row is being updated, an exclusive lock is taken on
it, and no one else can update that same row until the
updating transaction completes.
DB transactions
● Locks are issued on a first-come, first-served basis, and live
for the duration of a transaction, even if the statement that
requested the lock has already executed.
● Rails wraps each migration in a transaction automatically
on databases that support transactional DDL, such as
Postgres.
● For most database operations this is not a problem, as
they usually complete in the order of milliseconds.
● It becomes a problem when you have to perform millions of
operations on a very large dataset, because every lock is
held until the whole transaction commits.
Updating in a transaction
Updated successfully
So, how do transactions affect migrations?
● Our column was added. When we backfill row 1, we are not
locking the entire table; only that row is locked while we
mark it true and move on. But even though the update
succeeded, that row lock doesn't get released until the
transaction is committed, so by the end of the backfill
every row is locked at once.
Adding a column (THE CORRECT WAY)
DON’T BACKFILL DATA
INSIDE A TRANSACTION.
Backfilling data (THE CORRECT WAY)
disable_ddl_transaction!
● It disables the transaction that wraps the whole migration.
● That transaction is enabled implicitly, but you can
explicitly disable it for a particular migration.
● So, we write a separate migration and run it once the column
has been added.
● Rather than marking every single user inside one
transaction, we iterate over users in batches and wrap each
individual batch in its own transaction.
● Batch size defaults to 1000; of course, it's configurable
based on your individual needs.
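The points above can be sketched as a separate backfill migration. This is an illustration under assumptions: a User model with a boolean active column, a hypothetical class name, and Rails version tag 7.0. The first block is only a stand-in so the snippet loads outside a Rails app.

```ruby
# Stand-in so this sketch loads outside a Rails app (illustration only);
# inside a Rails app ActiveRecord is already loaded and this is skipped.
unless defined?(ActiveRecord)
  module ActiveRecord
    class Migration
      def self.[](_version)
        self
      end

      def self.disable_ddl_transaction!; end
    end
  end
end

# Hypothetical backfill migration, run after the column has been added.
class BackfillActiveOnUsers < ActiveRecord::Migration[7.0]
  # Do not wrap the whole backfill in one long transaction.
  disable_ddl_transaction!

  def up
    # in_batches yields relations of 1000 rows by default (of: changes it).
    User.unscoped.in_batches do |batch|
      # Each update_all is a single UPDATE that commits on its own, so row
      # locks are held only for that batch.
      batch.update_all(active: true)
      sleep(0.01) # brief pause to ease load on the database
    end
  end

  def down
    # Nothing to undo: the backfilled values disappear with the column.
  end
end
```

Passing of: to in_batches, for example in_batches(of: 500), is one way to tune the batch size.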
What’s the difference??
● A transaction that updates 1000 rows is going to
complete and commit much faster than a transaction
updating 10 million rows.
● That changes your lag time from minutes to the order of
seconds or even less, where an individual subset of users
might receive a slightly delayed response.
● So, users most likely won't even notice that anything
happened.
● So our rule of thumb here is???
DON’T MIX SCHEMA AND
DATA CHANGES.
What now??
● We have successfully marked users as active.
● But how are we going to look up active users?
● Any idea??
Adding an index
Not really!
For Postgres only
Adding an index
A standard index build will
● Interfere with regular operation of the database.
● Lock the table being indexed against writes (reads are
still allowed) and perform the entire index build with a
single scan of the table.
● Have a severe effect if the system is a live production
database.
● Take many hours on very large tables; even for smaller
tables, an index build can lock out writers for periods
that are unacceptably long for a production system.
DO ADD POSTGRES INDICES
CONCURRENTLY.
Adding an index (THE CORRECT WAY)
algorithm: :concurrently
● Builds the index without locking out writes, but waits for
all existing transactions that could potentially modify or
use the index to terminate.
● Requires more total work than a standard index build and
takes significantly longer to complete.
● Useful for adding new indexes in a production environment.
● Of course, the extra CPU and I/O load imposed by the index
creation might slow other operations.
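A minimal sketch of a concurrent index migration, assuming the index goes on the active column of users; the class name and Rails version tag 7.0 are assumptions. Postgres cannot run CREATE INDEX CONCURRENTLY inside a transaction block, so the migration also needs disable_ddl_transaction!. The first block is only a stand-in so the snippet loads outside a Rails app.

```ruby
# Stand-in so this sketch loads outside a Rails app (illustration only);
# inside a Rails app ActiveRecord is already loaded and this is skipped.
unless defined?(ActiveRecord)
  module ActiveRecord
    class Migration
      def self.[](_version)
        self
      end

      def self.disable_ddl_transaction!; end
    end
  end
end

# Hypothetical migration: table, column and class names are assumptions.
class AddIndexToUsersActive < ActiveRecord::Migration[7.0]
  # CREATE INDEX CONCURRENTLY is not allowed inside a transaction block.
  disable_ddl_transaction!

  def change
    # Builds the index while the table stays open for writes.
    add_index :users, :active, algorithm: :concurrently
  end
end
```

If a concurrent build fails partway, Postgres leaves an invalid index behind that has to be dropped before retrying.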
Concurrency!
Little’s law
L = λ * W
(Concurrency = Throughput * Response Time)
4 = 100 * 40 ms
(Concurrent requests = Req’s / sec * Response Time)
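The arithmetic on this slide can be checked with a few lines of Ruby. The method names are hypothetical, and the pool size of 5 in the second call is an assumed figure, not one from the talk.

```ruby
# Little's law: L = lambda * W.
# throughput is in requests per second, response time in milliseconds.
def concurrency(throughput_per_sec, response_time_ms)
  throughput_per_sec * response_time_ms / 1000.0
end

# Rearranged: the throughput a fixed number of workers can sustain.
def max_throughput(concurrent_slots, response_time_ms)
  concurrent_slots * 1000.0 / response_time_ms
end

puts concurrency(100, 40)    # requests in flight at 100 req/sec and 40 ms
puts max_throughput(5, 40)   # req/sec that 5 slots can sustain at 40 ms
```

The second form is the useful one when sizing a connection pool: slower queries lower the throughput the same pool can carry.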
Concurrency
● Every application has a theoretical maximum level of
concurrency it can support at any given time.
● Your database obeys the same principles. How fast your
queries are, and how large your connection pool is,
determine how many queries you can handle concurrently.
● Requests start queueing when they arrive faster than your
application, or its database, can respond to them.
● If a database operation blocks many requests for a long
time, your entire application will grind to a halt.
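A rough back-of-the-envelope sketch of how quickly a blocking operation hurts. It assumes, pessimistically, that every incoming request needs the locked table; the method names and the figures (100 req/sec, a pool of 5 connections, a 30 second lock) are illustrative assumptions.

```ruby
# Requests that pile up while a lock blocks queries for block_seconds.
def backlog(arrival_rate_per_sec, block_seconds)
  arrival_rate_per_sec * block_seconds
end

# Seconds until every connection in the pool is stuck waiting on the lock.
def seconds_until_pool_exhausted(pool_size, arrival_rate_per_sec)
  pool_size.to_f / arrival_rate_per_sec
end

puts backlog(100, 30)                      # requests waiting after 30 s
puts seconds_until_pool_exhausted(5, 100)  # time before the pool is full
```

Under these assumptions the pool fills in a fraction of a second, which is why even a short table lock is felt across the whole application.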
DO TEST DATABASE
PERFORMANCE.
DB Performance
● You don't have to understand the performance
characteristics of the application.
● But you have to understand how they change before, during,
and after your migration.
● You have to do this on a regular basis.
● If we understand the effects of a migration before we run
it live, we are much better placed to avoid outages.
Tools and resources
Gems
● They help keep your database healthy while you still ship
schema changes.
● Static analysis will warn in advance about certain unsafe
migrations.
● Catch problems at dev time, not deploy time.
● ankane/strong_migrations
● LendingHome/zero_downtime_migrations
● Not technically a gem, but: GitLab migration helpers
Strong migrations
● Catch unsafe migrations in development
● Detects potentially dangerous operations
● Prevents them from running by default
● Provides instructions on safer ways to do what you want
● Supports PostgreSQL, MySQL, and MariaDB
Strong migrations - Warning and Suggestions
Application Performance Monitoring
● Understanding your application's baseline performance is
critical to understanding how migrations will change its
performance characteristics.
Takeaways
● DON’T add columns with a default value.
● DON’T backfill data inside a transaction.
● DON’T mix schema and data changes in the same migration.
● DO add Postgres indexes concurrently.
● DO monitor and test database performance before, during,
and after migrations.
Questions???
IF WE WRITE SAFE MIGRATIONS,
WE'LL RUN SAFE MIGRATIONS.
Thank you!
