Recovery of lost or corrupted InnoDB tables
MySQL User Conference 2010, Santa Clara
Percona Inc.
http://MySQLPerformanceBlog.com

Agenda
- InnoDB format overview
- Internal system tables SYS_INDEXES and SYS_TABLES
- InnoDB Primary and Secondary keys
- Typical failure scenarios
- InnoDB recovery tool
Three things are certain: Death, taxes and lost data. Guess which has occurred?

1. InnoDB format overview
How MySQL stores data in InnoDB
- A tablespace (ibdata1)
  - System tablespace (data dictionary, undo, insert buffer, etc.)
  - PRIMARY indices (PK + data)
  - SECONDARY indices (SK + PK). If the key is (f1, f2), it is stored as (f1, f2, PK)
- File per table (.ibd)
  - PRIMARY index
  - SECONDARY indices
- InnoDB page size is 16k (uncompressed)
- Every index is identified by index_id
 
How MySQL stores data in InnoDB
The page identifier is index_id. To see it, create the table monitor and read the error log:
    mysql> CREATE TABLE innodb_table_monitor(x int) engine=innodb
Error log:
    TABLE: name test/site_folders, id 0 119, columns 9, indexes 1, appr.rows 1
      COLUMNS: id: DATA_INT len 4 prec 0; name: type 12 len 765 prec 0; sites_count: DATA_INT len 4 prec 0;
               created_at: DATA_INT len 8 prec 0; updated_at: DATA_INT len 8 prec 0;
               DB_ROW_ID: DATA_SYS prtype 256 len 6 prec 0; DB_TRX_ID: DATA_SYS prtype 257 len 6 prec 0;
               DB_ROLL_PTR: DATA_SYS prtype 258 len 7 prec 0;
      INDEX: name PRIMARY, id 0 254, fields 1/7, type 3
       root page 271, appr.key vals 1, leaf pages 1, size pages 1
       FIELDS: id DB_TRX_ID DB_ROLL_PTR name sites_count created_at updated_at
Here the index_id of the PRIMARY index is "0 254".

InnoDB page format (from the start of the page to the end)
    FIL HEADER
    PAGE_HEADER
    INFIMUM + SUPREMUM RECORDS
    USER RECORDS
    FREE SPACE
    Page Directory
    Fil Trailer

InnoDB page format: Fil Header
    Name                     Size  Remarks
    FIL_PAGE_SPACE           4     ID of the space the page is in
    FIL_PAGE_OFFSET          4     ordinal page number from start of space
    FIL_PAGE_PREV            4     offset of previous page in key order
    FIL_PAGE_NEXT            4     offset of next page in key order
    FIL_PAGE_LSN             8     log serial number of page's latest log record
    FIL_PAGE_TYPE            2     currently defined types are: FIL_PAGE_INDEX, FIL_PAGE_UNDO_LOG, FIL_PAGE_INODE, FIL_PAGE_IBUF_FREE_LIST
    FIL_PAGE_FILE_FLUSH_LSN  8     "the file has been flushed to disk at least up to this lsn" (log serial number), valid only on the first page of the file
    FIL_PAGE_ARCH_LOG_NO     4     the latest archived log file number at the time that FIL_PAGE_FILE_FLUSH_LSN was written (in the log)
Table data is stored in pages of type FIL_PAGE_INDEX == 0x45BF (FIL_PAGE_INODE == 0x03 marks inode pages).

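The Fil Header table above can be read with a fixed-layout unpack. The following is only a sketch, assuming a 16k uncompressed page held in memory and big-endian fields in the order and sizes listed on the slide; the names `FilHeader` and `parse_fil_header` are hypothetical and are not part of the recovery tool.

```python
import struct
from typing import NamedTuple

# 4+4+4+4+8+2+8+4 = 38 bytes, big-endian, in the order of the table above.
FIL_HEADER = struct.Struct(">IIIIQHQI")
FIL_PAGE_INDEX = 0x45BF  # page type of B-tree (index/data) pages


class FilHeader(NamedTuple):
    space: int            # FIL_PAGE_SPACE
    page_offset: int      # FIL_PAGE_OFFSET
    prev_page: int        # FIL_PAGE_PREV
    next_page: int        # FIL_PAGE_NEXT
    lsn: int              # FIL_PAGE_LSN
    page_type: int        # FIL_PAGE_TYPE
    file_flush_lsn: int   # FIL_PAGE_FILE_FLUSH_LSN
    arch_log_no: int      # FIL_PAGE_ARCH_LOG_NO


def parse_fil_header(page: bytes) -> FilHeader:
    """Decode the first 38 bytes of a page into named fields."""
    return FilHeader(*FIL_HEADER.unpack_from(page, 0))
```

A page whose `page_type` equals `FIL_PAGE_INDEX` is a candidate for holding table rows; other page types can be skipped during recovery.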
InnoDB page format: Page Header
    Name               Size  Remarks
    PAGE_N_DIR_SLOTS   2     number of directory slots in the Page Directory part; initial value = 2
    PAGE_HEAP_TOP      2     record pointer to first record in heap
    PAGE_N_HEAP        2     number of heap records; initial value = 2. The highest bit is the row format (1 = COMPACT, 0 = REDUNDANT)
    PAGE_FREE          2     record pointer to first free record
    PAGE_GARBAGE       2     "number of bytes in deleted records"
    PAGE_LAST_INSERT   2     record pointer to the last inserted record
    PAGE_DIRECTION     2     either PAGE_LEFT, PAGE_RIGHT, or PAGE_NO_DIRECTION
    PAGE_N_DIRECTION   2     number of consecutive inserts in the same direction, e.g. "last 5 were all to the left"
    PAGE_N_RECS        2     number of user records
    PAGE_MAX_TRX_ID    8     the highest ID of a transaction which might have changed a record on the page (only set for secondary indexes)
    PAGE_LEVEL         2     level within the index (0 for a leaf page)
    PAGE_INDEX_ID      8     identifier of the index the page belongs to (this is the index_id)
    PAGE_BTR_SEG_LEAF  10    "file segment header for the leaf pages in a B-tree" (this is irrelevant here)
    PAGE_BTR_SEG_TOP   10    "file segment header for the non-leaf pages in a B-tree" (this is irrelevant here)

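As an illustration of the Page Header table above, here is a minimal sketch that decodes the 56 bytes following the 38-byte Fil Header. It assumes big-endian fields in the listed order; the function name and the dictionary keys are hypothetical.

```python
import struct

FIL_HEADER_SIZE = 38
# Nine 2-byte fields, PAGE_MAX_TRX_ID (8), PAGE_LEVEL (2), PAGE_INDEX_ID (8),
# then the two 10-byte segment headers: 56 bytes in total.
PAGE_HEADER = struct.Struct(">9HQHQ10s10s")
FIELDS = (
    "n_dir_slots", "heap_top", "n_heap", "free", "garbage", "last_insert",
    "direction", "n_direction", "n_recs", "max_trx_id", "level", "index_id",
    "btr_seg_leaf", "btr_seg_top",
)


def parse_page_header(page: bytes) -> dict:
    """Decode the page header and split the format bit out of PAGE_N_HEAP."""
    header = dict(zip(FIELDS, PAGE_HEADER.unpack_from(page, FIL_HEADER_SIZE)))
    header["is_compact"] = bool(header["n_heap"] & 0x8000)  # highest bit
    header["n_heap"] &= 0x7FFF                               # remaining 15 bits
    return header
```

For recovery, `index_id` says which index a page belongs to, `level == 0` marks leaf pages, and `n_recs` gives the number of user records to expect.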
InnoDB page format (REDUNDANT): Extra bytes
    Name             Size     Description
    record_status    2 bits   _ORDINARY, _NODE_PTR, _INFIMUM, _SUPREMUM
    deleted_flag     1 bit    1 if record is deleted
    min_rec_flag     1 bit    1 if record is predefined minimum record
    n_owned          4 bits   number of records owned by this record
    heap_no          13 bits  record's order number in heap of index page
    n_fields         10 bits  number of fields in this record, 1 to 1023
    1byte_offs_flag  1 bit    1 if each Field Start Offsets is 1 byte long (this item is also called the "short" flag)
    next 16 bits     16 bits  pointer to next record in page

InnoDB page format (COMPACT): Extra bytes
    Name             Size, bits  Description
    (flag bits)      4           4 bits used to delete mark a record, and mark a predefined minimum record in alphabetical order
    n_owned          4           the number of records owned by this record (this term is explained in page0page.h)
    heap_no          13          the order number of this record in the heap of the index page
    record type      3           000=conventional, 001=node pointer (inside B-tree), 010=infimum, 011=supremum, 1xx=reserved
    next 16 bits     16          a relative pointer to the next record in the page

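The COMPACT extra bytes above pack into 5 bytes. The sketch below unpacks them under these assumptions: the bytes are given in storage order (the 5 bytes immediately before the record), the delete mark is bit 0x20 and the minimum-record mark is bit 0x10 of the first byte, and the next pointer is a signed 16-bit value. The function name is hypothetical.

```python
import struct

RECORD_TYPES = {0: "conventional", 1: "node pointer", 2: "infimum", 3: "supremum"}


def decode_compact_extra(extra: bytes) -> dict:
    """Decode the 5 extra bytes that precede a COMPACT record."""
    if len(extra) != 5:
        raise ValueError("COMPACT records have exactly 5 extra bytes")
    flags_owned, heap_type, next_rel = struct.unpack(">BHh", extra)
    return {
        "deleted": bool(flags_owned & 0x20),   # delete mark (assumed bit 0x20)
        "min_rec": bool(flags_owned & 0x10),   # minimum-record mark (assumed 0x10)
        "n_owned": flags_owned & 0x0F,
        "heap_no": heap_type >> 3,             # upper 13 bits
        "record_type": RECORD_TYPES.get(heap_type & 0x07, "reserved"),
        "next": next_rel,                      # relative to this record's position
    }
```

A delete-marked record keeps all its field data, so the `deleted` flag is what a recovery tool would check to separate live rows from deleted ones.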
How to check row format?
Look at the highest bit of PAGE_N_HEAP in the page header: 0 stands for REDUNDANT, 1 for COMPACT.
    dc -e "2o `hexdump -C d pagefile | grep 00000020 | awk '{ print $12}'` p" | sed 's/./& /g' | awk '{ print $1}'

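The same check can be sketched without the shell pipeline. This assumes PAGE_N_HEAP sits at byte offset 42 of the page (38 bytes of Fil Header plus two 2-byte page header fields) and is stored big-endian; `row_format` is a hypothetical helper name.

```python
import struct

PAGE_N_HEAP_OFFSET = 38 + 2 + 2  # Fil Header, PAGE_N_DIR_SLOTS, PAGE_HEAP_TOP


def row_format(page: bytes) -> str:
    """Return the row format encoded in the highest bit of PAGE_N_HEAP."""
    (n_heap,) = struct.unpack_from(">H", page, PAGE_N_HEAP_OFFSET)
    return "COMPACT" if n_heap & 0x8000 else "REDUNDANT"
```

Reading the whole 2-byte field and masking the top bit avoids depending on how a hex dump happens to be formatted.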
Rows in an InnoDB page
- Rows in a single page form a linked list
- The first record is INFIMUM
- The last record is SUPREMUM
- Sorted by Primary key
    infimum -> (100, data...) -> (101, data...) -> (102, data...) -> (103, data...) -> supremum

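Following the linked list described above can be sketched as below. The sketch assumes a 16k COMPACT page, where infimum sits at offset 99 (0x63), supremum at 112 (0x70), and each record's 2 bytes just before its position hold a signed offset to the next record relative to the current one. `walk_records` is a hypothetical name.

```python
import struct
from typing import List

PAGE_SIZE = 16384
INFIMUM = 99     # 0x63 in COMPACT pages
SUPREMUM = 112   # 0x70 in COMPACT pages


def walk_records(page: bytes) -> List[int]:
    """Return the offsets of user records in key order, infimum to supremum."""
    offsets: List[int] = []
    seen = set()
    pos = INFIMUM
    while True:
        (rel,) = struct.unpack_from(">h", page, pos - 2)
        pos = (pos + rel) % PAGE_SIZE
        if pos == SUPREMUM:
            return offsets
        # A pointer into the headers or a revisited offset means a broken chain.
        if pos < SUPREMUM or pos in seen:
            raise ValueError("record chain is broken at offset %d" % pos)
        seen.add(pos)
        offsets.append(pos)
```

If the walk reaches supremum without raising, the page's record chain is intact, and the returned offsets are the only places where records need to be parsed.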
Records are saved in insert order
    insert into t1 values(10, 'aaa');
    insert into t1 values(30, 'ccc');
    insert into t1 values(20, 'bbb');
Page dump (aaa, ccc, bbb appear in insert order, not key order):
    JG....................N<E.......
    ................................
    .............................2..
    ...infimum......supremum......6.
    ........)....2.. aaa .............
    ...*....2.. ccc .... ...........+.
    ...2.. bbb .......................
    ................................

Row format
EXAMPLE:
    CREATE TABLE `t1` (
      `ID` int(11) unsigned NOT NULL,
      `NAME` varchar(120),
      `N_FIELDS` int(10),
      PRIMARY KEY (`ID`)
    ) ENGINE=InnoDB DEFAULT CHARSET=latin1
    Name                 Size
    Field Start Offsets  (F*1) or (F*2) bytes
    Extra Bytes          6 bytes (5 bytes if COMPACT format)
    Field Contents       depends on content

REDUNDANT
A row: (10, 'abcdef', 20)
Actually stored as: (10, TRX_ID, PTR_ID, 'abcdef', 20), with field sizes 4, 6, 7, 6 and 4 bytes
Record layout: Field Offsets | Extra 6 bytes | Fields
    Extra 6 bytes: record_status, deleted_flag, min_rec_flag, n_owned, heap_no, n_fields, 1byte_offs_flag, next
    Fields: 0x00 00 00 0A | ... | ... | abcdef | 0x80 00 00 14

COMPACT
A row: (10, 'abcdef', 20)
Actually stored as: (10, TRX_ID, PTR_ID, 'abcdef', 20)
Record layout: Field Offsets | NULLS | Extra 5 bytes | Fields
    Field Offsets: 6 (the length of 'abcdef')
    NULLS: a bit per NULL-able field
    Fields: 0x00 00 00 0A | ... | ... | abcdef | 0x80 00 00 14

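The byte strings 0x00 00 00 0A and 0x80 00 00 14 in the two row layouts above show how integers are laid out: big-endian, with the sign bit flipped for signed columns so that values sort correctly as raw bytes. A minimal decoding sketch, with `decode_int` as a hypothetical helper name:

```python
def decode_int(raw: bytes, signed: bool) -> int:
    """Decode an InnoDB integer field: big-endian, sign bit flipped if signed."""
    value = int.from_bytes(raw, "big")
    if signed:
        # Undo the flipped sign bit by subtracting 2^(bits - 1).
        value -= 1 << (8 * len(raw) - 1)
    return value
```

With the example table, `ID` is `int unsigned` and decodes with `signed=False`, while `N_FIELDS` is a plain `int` and decodes with `signed=True`.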
Data types
- INT types (fixed-size)
- String types
  - VARCHAR(x): variable-size
  - CHAR(x): fixed-size, variable-size if UTF-8
- BLOBs
  - If record size < (UNIV_PAGE_SIZE/2-200) == ~7k, the record is stored internally
- DECIMAL
  - Stored in strings before 5.0.3, variable in size
  - Binary format after 5.0.3, fixed-size

BLOB and other long fields
- Field length (the so-called offset) is one or two bytes long
- Page size is 16k
- If record size < (UNIV_PAGE_SIZE/2-200) == ~7k, the record is stored internally
- Otherwise, 768 bytes are stored internally and the rest in an external page

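The threshold on the slide works out to 16384/2 - 200 = 7992 bytes. The sketch below only restates that rule as arithmetic for a record with a single long field; it is a simplification for illustration, and `split_long_field` is a hypothetical name, not an InnoDB function.

```python
from typing import Tuple

UNIV_PAGE_SIZE = 16 * 1024
INLINE_LIMIT = UNIV_PAGE_SIZE // 2 - 200   # 7992 bytes, the "~7k" on the slide
LOCAL_PREFIX = 768                         # bytes kept in the record itself


def split_long_field(record_size: int, field_len: int) -> Tuple[int, int]:
    """Return (bytes stored in the page, bytes stored externally) for one field."""
    if record_size < INLINE_LIMIT:
        return field_len, 0
    local = min(field_len, LOCAL_PREFIX)
    return local, field_len - local
```

For recovery this matters because a record found in an index page may carry only the first 768 bytes of a long value; the remainder has to be fetched from the external page.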
2. Internal system tables SYS_INDEXES and SYS_TABLES
Why are SYS_* tables needed?
- Correspondence “table name” -> “index_id”
- Storage for other internal information
How MySQL stores data in InnoDB: SYS_TABLES and SYS_INDEXES
Always REDUNDANT format!
SYS_INDEXES (index_id = 0-3):
    CREATE TABLE `SYS_INDEXES` (
      `TABLE_ID` bigint(20) unsigned NOT NULL default '0',
      `ID` bigint(20) unsigned NOT NULL default '0',
      `NAME` varchar(120) default NULL,
      `N_FIELDS` int(10) unsigned default NULL,
      `TYPE` int(10) unsigned default NULL,
      `SPACE` int(10) unsigned default NULL,
      `PAGE_NO` int(10) unsigned default NULL,
      PRIMARY KEY (`TABLE_ID`,`ID`)
    ) ENGINE=InnoDB DEFAULT CHARSET=latin1
    NAME is PRIMARY, GEN_CLUST_INDEX, or the unique index name.
SYS_TABLES (index_id = 0-1):
    CREATE TABLE `SYS_TABLES` (
      `NAME` varchar(255) NOT NULL default '',
      `ID` bigint(20) unsigned NOT NULL default '0',
      `N_COLS` int(10) unsigned default NULL,
      `TYPE` int(10) unsigned default NULL,
      `MIX_ID` bigint(20) unsigned default NULL,
      `MIX_LEN` int(10) unsigned default NULL,
      `CLUSTER_NAME` varchar(255) default NULL,
      `SPACE` int(10) unsigned default NULL,
      PRIMARY KEY (`NAME`)
    ) ENGINE=InnoDB DEFAULT CHARSET=latin1

How MySQL stores data in InnoDB
Example:
    SYS_TABLES (NAME, ID, ...):
    SYS_TABLES "archive19/9299_msg_store" 40694 8 1 0 0 NULL 0
    SYS_INDEXES (TABLE_ID, ID, NAME, ...):
    SYS_INDEXES 40694 196389 "PRIMARY" 2 3 0 21031026
    SYS_INDEXES 40694 196390 "msg_hash" 1 0 0 21031028

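The lookup from table name to index_id that these two tables provide can be sketched as a two-step join. The rows are the ones from the example above, reduced to the columns the lookup needs; the function name `index_ids_for` and the in-memory tuple layout are illustrative assumptions.

```python
from typing import Dict, List, Tuple

# (NAME, ID) from SYS_TABLES and (TABLE_ID, ID, NAME) from SYS_INDEXES.
SYS_TABLES: List[Tuple[str, int]] = [("archive19/9299_msg_store", 40694)]
SYS_INDEXES: List[Tuple[int, int, str]] = [
    (40694, 196389, "PRIMARY"),
    (40694, 196390, "msg_hash"),
]


def index_ids_for(table_name: str) -> Dict[str, int]:
    """Map each index name of a table to its index_id."""
    table_ids = {tid for name, tid in SYS_TABLES if name == table_name}
    return {
        index_name: index_id
        for table_id, index_id, index_name in SYS_INDEXES
        if table_id in table_ids
    }
```

Here the rows of `archive19/9299_msg_store` live in the pages whose index_id is 196389, the PRIMARY index.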
3. InnoDB Primary and Secondary keys
Primary key
The table:
    CREATE TABLE `t1` (
      `ID` int(11),
      `NAME` varchar(120),
      `N_FIELDS` int(10),
      PRIMARY KEY (`ID`),
      KEY `NAME` (`NAME`)
    ) ENGINE=InnoDB DEFAULT CHARSET=latin1
Fields in the PK: ID, DB_TRX_ID, DB_ROLL_PTR, NAME, N_FIELDS

Secondary key
The table is the same `t1` as above.
Fields in the SK: NAME, ID (the Primary key)

4. Typical failure scenarios
Deleted records
- Intended statement: DELETE FROM table WHERE id = 5;
- Cause: forgotten WHERE clause
- Band-aid: stop/kill mysqld ASAP

How is a delete performed?
From "row/row0upd.c":
    "... /* How is a delete performed? ... The delete is performed by setting the delete bit in the record and substituting the id of the deleting transaction for the original trx id, and substituting a new roll ptr for previous roll ptr. The old trx id and roll ptr are saved in the undo log record. Thus, no physical changes occur in the index tree structure at the time of the delete. Only when the undo log is purged, the index records will be physically deleted from the index trees. ..."

Dropped table/database
- DROP TABLE table; DROP DATABASE database;
- Often happens when restoring from an SQL dump
- Bad because the .FRM file goes away
- Especially painful with innodb_file_per_table
- Band-aid: stop/kill mysqld ASAP, then stop IO on the HDD, mount it read-only, or take a raw image

Corrupted InnoDB tablespace
- Hardware failures
- OS or filesystem failures
- InnoDB bugs
- InnoDB tablespace corrupted by other processes
- Band-aid: stop mysqld and take a copy of the InnoDB files

Wrong UPDATE statement
- Intended statement: UPDATE user SET Password = PASSWORD('qwerty') WHERE User='root';
- Cause: again a forgotten WHERE clause
- Bad because changes are applied in the PRIMARY index immediately
- The old version goes to the UNDO segment
- Band-aid: stop/kill mysqld ASAP

5. InnoDB recovery tool
Recovery prerequisites
- Media: ibdata1, *.ibd, or an HDD image
- Table structure: SQL dump or *.FRM files

How to get CREATE info from .frm files
1. Through mysqld:
   1. CREATE TABLE t1 (id int) Engine=INNODB;
   2. Replace t1.frm with the one whose schema you need
   3. Run "show create table t1"
   If mysqld crashes, see the end of "bvi t1.frm": .ID.NAME.N_FIELDS..
2. *.FRM viewer !TODO
3. InnoDB dictionary !TODO

InnoDB recovery tool
http://launchpad.net/percona-innodb-recovery-tool/
- Written at Percona
- Contributed to by Percona and the community
- Supported by Percona
- Consists of two major tools:
  - page_parser: splits an InnoDB tablespace into 16k pages
  - constraints_parser: scans a page and finds good records

InnoDB recovery tool: page_parser
    server # ./page_parser -4 -f /var/lib/mysql/ibdata1
    Opening file: /var/lib/mysql/ibdata1
    Read data from fn=3...
    Read page #0.. saving it to pages-1259793800/0-18219008/0-00000000.page
    Read page #1.. saving it to pages-1259793800/0-0/1-00000001.page
    Read page #2.. saving it to pages-1259793800/4294967295-65535/2-00000002.page
    Read page #3.. saving it to pages-1259793800/0-0/3-00000003.page

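What page_parser does can be approximated as: cut the tablespace into 16k pages and group them by the index_id in each page header. The sketch below is not the tool's implementation; it assumes PAGE_INDEX_ID is the 8 bytes at offset 66 (38-byte Fil Header plus 28 bytes of page header fields) and that the directory names in the output above are the two 32-bit halves of that value. `group_pages_by_index` is a hypothetical name.

```python
import struct
from collections import defaultdict
from typing import BinaryIO, Dict, List

PAGE_SIZE = 16384
PAGE_INDEX_ID_OFFSET = 38 + 28  # Fil Header + page header fields before it


def group_pages_by_index(tablespace: BinaryIO) -> Dict[str, List[int]]:
    """Map "high-low" index_id strings to the page numbers carrying them."""
    groups: Dict[str, List[int]] = defaultdict(list)
    page_no = 0
    while True:
        page = tablespace.read(PAGE_SIZE)
        if len(page) < PAGE_SIZE:   # end of file or a truncated last page
            break
        high, low = struct.unpack_from(">II", page, PAGE_INDEX_ID_OFFSET)
        groups["%d-%d" % (high, low)].append(page_no)
        page_no += 1
    return dict(groups)
```

Grouping by index_id is what lets the next step work on one table at a time: all pages of a table's PRIMARY index end up under a single key.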
Page signature check
    0{....0...4...4......=..E.......
    ........<..~...A.......|........
    ................................
    ... infimum ...... supremum f .....qT
    M/T/196001834/Titan Industries L
    imited_TITAN INTERNATIONAL HOLDI
    NGS B.V._CHRONATA_Cyprus_45829_4
    5829_.. .e.....pTM/T/196001845/T
    itan Industries Limited_TITAN IN
    TERNATIONAL HOLDINGS B.V._TANISH
- INFIMUM and SUPREMUM records are in fixed positions
- Works with corrupted pages

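Because the infimum and supremum records sit at fixed positions, a page can be recognized by looking for their names there. A minimal sketch, assuming a 16k page: in COMPACT pages the strings start at offsets 99 and 112; for REDUNDANT pages the offsets 101 and 116 are assumed here, derived from the 6 extra bytes per record. `page_signature` is a hypothetical name.

```python
from typing import Optional


def page_signature(page: bytes) -> Optional[str]:
    """Return the row format if the page carries the infimum/supremum markers."""
    if page[99:106] == b"infimum" and page[112:120] == b"supremum":
        return "COMPACT"
    if page[101:108] == b"infimum" and page[116:124] == b"supremum":
        return "REDUNDANT"   # assumed offsets, see the note above
    return None
```

The check does not rely on the page type or checksum fields, so it still identifies an index page when other parts of the header are damaged.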
InnoDB recovery tool: constraints_parser
    server # ./constraints_parser -4 -f pages-1259793800/0-16/51-00000051.page
- The table structure is defined in "include/table_defs.h"
- See the HOWTO for details: http://code.google.com/p/innodb-tools/wiki/InnodbRecoveryHowto
- Filters inside table_defs.h are very important

Check InnoDB page before reading recs
    # ./constraints_parser -5 -U -f pages/0-418/12665-00012665.page -V
    Initializing table definitions...
    Processing table: document_type_fieldsets_link
     - total fields: 5
     - nullable fields: 0
     - minimum header size: 5
     - minimum rec size: 25
     - maximum rec size: 25
    Read data from fn=3...
    Page id: 12665
    Checking a page
    Infimum offset: 0x63
    Supremum offset: 0x70
    Next record at offset: 0x9F (159)
    Next record at offset: 0xB0 (176)
    Next record at offset: 0x3D95 (15765)
    …
    Next record at offset: 0x70 (112)
    Page is good
- Check whether the tool can follow all records by their addresses
- If so, look for a record only at the exact positions where the records are
- Helps a lot for the COMPACT format!

Import result
    t1 1 "browse" 10
    t1 2 "dashboard" 20
    t1 3 "addFolder" 18
    t1 4 "editFolder" 15
    mysql> LOAD DATA INFILE '/path/to/datafile'
           REPLACE INTO TABLE <table_name>
           FIELDS TERMINATED BY '\t' OPTIONALLY ENCLOSED BY '"'
           LINES STARTING BY '<table_name>\t'
           (id,sessionid,uniqueid,username,nasipaddress,@var1,@var2,etc)
           SET datefield1 = FROM_UNIXTIME(@var1),
               datefield2 = FROM_UNIXTIME(@var2,'%Y %D %M %h:%i:%s %x');

Questions?
Thank you for coming!
References:
- http://www.mysqlperformanceblog.com/
- http://percona.com/
Applause :-)

Recovery of lost or corrupted inno db tables(mysql uc 2010)

  • 1. Recovery of lost or corrupted InnoDB tables MySQL User Conference 2010, Santa Clara [email_address] Percona Inc. http://MySQLPerformanceBlog.com
  • 2. Agenda InnoDB format overview Internal system tables SYS_INDEXES and SYS_TABLES InnoDB Primary and Secondary keys Typical failure scenarios InnoDB recovery tool - - Three things are certain: Death, taxes and lost data. Guess which has occurred?
  • 3. 1. InnoDB format overview
  • 4. How MySQL stores data in InnoDB A table space (ibdata1) System tablespace(data dictionary, undo, insert buffer, etc.) PRIMARY indices (PK + data) SECONDARY indices (SK + PK) If the key is (f1, f2) it is stored as (f1, f2, PK) file per table (.ibd) PRIMARY index SECONDARY indices InnoDB pages size 16k (uncompressed) Every index is identified by index_id
  • 5.  
  • 6. How MySQL stores data in InnoDB Page identifier index_id     TABLE: name test/site_folders, id 0 119, columns 9, indexes 1, appr.rows 1       COLUMNS: id: DATA_INT len 4 prec 0; name: type 12 len 765 prec 0; sites_count: DATA_INT len 4 prec 0;                            created_at: DATA_INT len 8 prec 0; updated_at: DATA_INT len 8 prec 0;                    DB_ROW_ID: DATA_SYS prtype 256 len 6 prec 0; DB_TRX_ID: DATA_SYS prtype 257 len 6 prec 0;                    DB_ROLL_PTR: DATA_SYS prtype 258 len 7 prec 0;           INDEX: name PRIMARY, id 0 254 , fields 1/7, type 3            root page 271, appr.key vals 1, leaf pages 1, size pages 1            FIELDS:  id DB_TRX_ID DB_ROLL_PTR name sites_count created_at updated_at mysql> CREATE TABLE innodb_table_monitor(x int) engine=innodb Error log:
  • 7. InnoDB page format Fil Trailer Page Directory FREE SPACE USER RECORDS INFINNUM+SUPREMUM RECORDS PAGE_HEADER FIL HEADER
  • 8. InnoDB page format Fil Header the latest archived log file number at the time that FIL_PAGE_FILE_FLUSH_LSN was written (in the log) 4 FIL_PAGE_ARCH_LOG_NO &quot;the file has been flushed to disk at least up to this lsn&quot; (log serial number), valid only on the first page of the file 8 FIL_PAGE_FILE_FLUSH_LSN current defined types are: FIL_PAGE_INDEX , FIL_PAGE_UNDO_LOG , FIL_PAGE_INODE , FIL_PAGE_IBUF_FREE_LIST 2 FIL_PAGE_TYPE log serial number of page's latest log record 8 FIL_PAGE_LSN offset of next page in key order 4 FIL_PAGE_NEXT offset of previous page in key order 4 FIL_PAGE_PREV ordinal page number from start of space 4 FIL_PAGE_OFFSET 4 ID of the space the page is in 4 FIL_PAGE_SPACE Remarks Size Name Data are stored in FIL_PAGE_INODE == 0x03
  • 9. InnoDB page format Page Header &quot;file segment header for the non-leaf pages in a B-tree&quot; (this is irrelevant here) 10 PAGE_BTR_SEG_TOP &quot;file segment header for the leaf pages in a B-tree&quot; (this is irrelevant here) 10 PAGE_BTR_SEG_LEAF identifier of the index the page belongs to 8 PAGE_INDEX_ID level within the index (0 for a leaf page) 2 PAGE_LEVEL the highest ID of a transaction which might have changed a record on the page (only set for secondary indexes) 8 PAGE_MAX_TRX_ID number of user records 2 PAGE_N_RECS number of consecutive inserts in the same direction, e.g. &quot;last 5 were all to the left&quot; 2 PAGE_N_DIRECTION either PAGE_LEFT , PAGE_RIGHT , or PAGE_NO_DIRECTION 2 PAGE_DIRECTION record pointer to the last inserted record 2 PAGE_LAST_INSERT &quot;number of bytes in deleted records&quot; 2 PAGE_GARBAGE record pointer to first free record 2 PAGE_FREE number of heap records; initial value = 2 2 PAGE_N_HEAP record pointer to first record in heap 2 PAGE_HEAP_TOP number of directory slots in the Page Directory part; initial value = 2 2 PAGE_N_DIR_SLOTS Remarks Size Name index_id Highest bit is row format(1 -COMPACT, 0 - REDUNDANT )
  • 10. InnoDB page format (REDUNDANT) Extra bytes pointer to next record in page 16 bits next 16 bits 1 if each Field Start Offsets is 1 byte long (this item is also called the &quot;short&quot; flag) 1 bit 1byte_offs_flag number of fields in this record, 1 to 1023 10 bits n_fields record's order number in heap of index page 13 bits heap_no number of records owned by this record 4 bits n_owned 1 if record is predefined minimum record 1 bit min_rec_flag 1 if record is deleted 1 bit deleted_flag _ORDINAR Y, _NODE_PTR , _INFIMUM , _SUPREMUM 2 bit s record_status Description Size Name
  • 11. InnoDB page format (COMPACT) Extra bytes a relative pointer to the next record in the page 16 next 16 bits 000=conventional, 001=node pointer (inside B-tree), 010=infimum, 011=supremum, 1xx=reserved 3 record type the order number of this record in the heap of the index page 1 3 heap_no the number of records owned by this record (this term is explained in page0page.h) 4 n_owned 4 bits used to delete mark a record, and mark a predefined minimum record in alphabetical order 4 Description Size , bits Name
  • 12. How to check row format? The highest bit of the PAGE_N_HEAP from the page header 0 stands for version REDUNDANT , 1 - for COMACT dc -e &quot;2o `hexdump –C d pagefile | grep 00000020 | awk '{ print $12}'` p&quot; | sed 's/./& /g' | awk '{ print $1}'
  • 13. Rows in an InnoDB page Rows in a single pages is a linked list The first record INFIMUM The last record SUPREMUM Sorted by Primar y key infimum next supremum 0 100 data... next 101 data... next 102 data... next 103 data... next
  • 14. Records are saved in insert order insert into t1 values(10, 'aaa'); insert into t1 values(30, ' ccc '); insert into t1 values(20, ' bbb '); JG....................N<E....... ................................ .............................2.. ...infimum......supremum......6. ........)....2.. aaa ............. ...*....2.. ccc .... ...........+. ...2.. bbb ....................... ................................
  • 15. Row format EXAMPLE: CREATE TABLE ` t1 ` ( ` ID ` int( 11 ) unsigned NOT NULL, ` NAME ` varchar(120), ` N_FIELDS ` int(10), PRIMARY KEY (`ID`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1 depends on content Field Contents 6 bytes (5 bytes if COMPACT format) Extra Bytes (F*1) or (F*2) bytes Field Start Offsets Size Name
  • 16. REDUNDANT A row: (10 , ‘abcdef’, 20 ) 4 6 7 Actualy stored as: (10 , TRX_ID, PTR_ID, ‘abcdef’, 20 ) 6 4 Field Offsets … . next Extra 6 bytes: 0x00 00 00 0A record_status deleted_flag min_rec_flag n_owned heap_no n_fields 1byte_offs_flag Fields ... ... abcdef 0x80 00 00 14
  • 17. COMPACT A row: (10 , ‘abcdef’, 20 ) 6 NULLS Actualy stored as: (10 , TRX_ID, PTR_ID, ‘abcdef’, 20 ) Field Offsets … . next Extra 5 bytes: 0x00 00 00 0A Fields ... ... abcdef 0x80 00 00 14 A bit per NULL-able field
  • 18. Data types INT types (fixed-size) String types VARCHAR(x) – variable-size CHAR(x) – fixed-size, variable-size if UTF-8 BLOBs If record size < (UNIV_PAGE_SIZE/2-200) == ~7k – the record is stored internally DECIMAL Stored in strings before 5.0.3, variable in size Binary format after 5.0.3, fixed-size.
  • 19. BLOB and other long fields Field length(so called offset) is one or two byte long Page size is 16k If record size < (UNIV_PAGE_SIZE/2-200) == ~7k – the record is stored internally Otherwise – 768 bytes internally, the rest in external page
  • 20. 2 . Internal system tables SYS_INDEXES and SYS_TABLES
  • 21. Why are SYS_* tables needed? Correspondence “table name” -> “index_id” Storage for other internal information
  • 22. How MySQL stores data in InnoDB SYS_TABLES and SYS_INDEXES Always REDUNDANT format! CREATE TABLE `SYS_INDEXES` ( ` TABLE_ID ` bigint(20) unsigned NOT NULL default '0', ` ID ` bigint(20) unsigned NOT NULL default '0', ` NAME ` varchar(120) default NULL, ` N_FIELDS ` int(10) unsigned default NULL, ` TYPE ` int(10) unsigned default NULL, ` SPACE ` int(10) unsigned default NULL, ` PAGE_NO ` int(10) unsigned default NULL, PRIMARY KEY (`TABLE_ID`,`ID`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1 CREATE TABLE `SYS_TABLES` ( ` NAME ` varchar(255) NOT NULL default '', ` ID ` bigint(20) unsigned NOT NULL default '0', ` N_COLS ` int(10) unsigned default NULL, ` TYPE ` int(10) unsigned default NULL, ` MIX_ID ` bigint(20) unsigned default NULL, ` MIX_LEN ` int(10) unsigned default NULL, ` CLUSTER_NAME ` varchar(255) default NULL, ` SPACE ` int(10) unsigned default NULL, PRIMARY KEY (`NAME`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1 index_id = 0-3 index_id = 0-1 Name: PRIMARY GEN_CLUSTER_ID or unique index name
  • 23. How MySQL stores data in InnoDB NAME ID … SYS_TABLES &quot;archive19/9299_msg_store&quot; 40694 8 1 0 0 NULL 0 SYS_TABLES &quot;archive19/9299_msg_store&quot; 40694 8 1 0 0 NULL 0 SYS_TABLES &quot;archive19/9299_msg_store&quot; 40694 8 1 0 0 NULL 0 TABLE_ID ID NAME … SYS_INDEXES 40694 196389 &quot;PRIMARY&quot; 2 3 0 21031026 SYS_INDEXES 40694 196390 &quot;msg_hash&quot; 1 0 0 21031028 SYS_TABLES SYS_INDEXES Example:
  • 24. 3. InnoDB Primary and Secondary keys
  • 25. Primary key The table: CREATE TABLE `t1` ( `ID` int(11), `NAME` varchar(120), `N_FIELDS` int(10), PRIMARY KEY (`ID`), KEY `NAME` (`NAME`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1 Fields in the PK: ID DB_TRX_ID DB_ROLL_PTR NAME N_FIELDS
  • 26. Secondary key The table: CREATE TABLE `t1` ( `ID` int(11), `NAME` varchar(120), `N_FIELDS` int(10), PRIMARY KEY (`ID`), KEY `NAME` (`NAME`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1 Fields in the SK: NAME ID  Primary key
  • 27. 4. Typical failure scenarios
  • 28. Deleted records DELETE FROM table WHERE id = 5; Forgotten WHERE clause Band-aid: Stop/kill mysqld ASAP
  • 29. How delete is performed? &quot;row/row0upd.c“ : “… /* How is a delete performed?...The delete is performed by setting the delete bit in the record and substituting the id of the deleting transaction for the original trx id, and substituting a new roll ptr for previous roll ptr. The old trx id and roll ptr are saved in the undo log record. Thus, no physical changes occur in the index tree structure at the time of the delete . Only when the undo log is purged, the index records will be physically deleted from the index trees.…”
  • 30. Dropped table/database DROP TABLE table; DROP DATABASE database; Often happens when restoring from SQL dump Bad because .FRM file goes away Especially painful when innodb_file_per_table Band-aid: Stop/kill mysqld ASAP Stop IO on an HDD or mount read-only or take a raw image
  • 31. Corrupted InnoDB tablespace Hardware failures OS or filesystem failures InnoDB bugs Corrupted InnoDB tablespace by other processes Band-aid: Stop mysqld Take a copy of InnoDB files
  • 32. Wrong UPDATE statement UPDATE user SET Password = PASSWORD('qwerty') WHERE User='root'; Again forgotten WHERE clause Bad because changes are applied in a PRIMARY index immediately Old version goes to UNDO segment Band-aid: Stop/kill mysqld ASAP
  • 34. Recovery prerequisites Media: ibdata1, *.ibd, HDD image. Table structure: SQL dump, *.FRM files
  • 35. How to get CREATE info from .frm files 1. Using a dummy table: (a) CREATE TABLE t1 (id int) ENGINE=InnoDB; (b) replace t1.frm with the one whose schema you need; (c) run "SHOW CREATE TABLE t1". If mysqld crashes, see the end of the file in bvi t1.frm: .ID.NAME.N_FIELDS.. 2. *.FRM viewer (TODO) 3. InnoDB dictionary (TODO)
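The fallback of reading column names at the end of the .frm file in a hex editor can be approximated by extracting printable runs from the file's tail. A minimal Python sketch, assuming the names appear as ASCII separated by non-printable bytes as in the `.ID.NAME.N_FIELDS..` view; the helper names and the 512-byte tail size are arbitrary choices for illustration, and this does not parse the .frm format.

```python
import re


def printable_runs(data, min_len=2):
    """Return runs of printable ASCII that are at least min_len bytes long."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [m.decode("ascii") for m in re.findall(pattern, data)]


def tail_strings(path, tail_bytes=512, min_len=2):
    """Printable runs from the last tail_bytes of a file (hypothetical helper)."""
    with open(path, "rb") as f:
        data = f.read()
    return printable_runs(data[-tail_bytes:], min_len)
```

This only recovers names; column types still have to come from another source.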
  • 36. InnoDB recovery tool http://launchpad.net/percona-innodb-recovery-tool/ Written in Percona Contributed by Percona and community Supported by Percona Consists of two major tools: page_parser – splits InnoDB tablespace into 16k pages; constraints_parser – scans a page and finds good records
  • 37. InnoDB recovery tool: page_parser server # ./page_parser -4 -f /var/lib/mysql/ibdata1 Opening file: /var/lib/mysql/ibdata1 Read data from fn=3... Read page #0.. saving it to pages-1259793800/0-18219008/0-00000000.page Read page #1.. saving it to pages-1259793800/0-0/1-00000001.page Read page #2.. saving it to pages-1259793800/4294967295-65535/2-00000002.page Read page #3.. saving it to pages-1259793800/0-0/3-00000003.page
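The splitting step can be pictured as cutting the file into 16k chunks and grouping them by the index_id stored in each page. The Python sketch below is not page_parser's code: the byte offsets are assumptions derived from the FIL header layout shown earlier plus the standard page header, and the path pattern is inferred from the sample output above (the real tool may derive the first number in the file name differently).

```python
import struct

PAGE_SIZE = 16 * 1024
FIL_PAGE_OFFSET = 4        # 4-byte page number in the FIL header
PAGE_INDEX_ID = 38 + 28    # assumption: 8-byte index_id, 28 bytes into the page header


def split_pages(data):
    """Yield consecutive full 16k pages from a tablespace image."""
    for start in range(0, len(data) - PAGE_SIZE + 1, PAGE_SIZE):
        yield data[start:start + PAGE_SIZE]


def page_path(page):
    """Build '<high>-<low>/<page_no>-<page_no, 8 digits>.page' for one page."""
    (page_no,) = struct.unpack_from(">I", page, FIL_PAGE_OFFSET)
    high, low = struct.unpack_from(">II", page, PAGE_INDEX_ID)
    return f"{high}-{low}/{page_no}-{page_no:08d}.page"
```

Pages that are not index pages (such as page 0) have unrelated bytes at that offset, which would explain directory names like `0-18219008` in the output.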
  • 38. Page signature check: INFIMUM and SUPREMUM records are in fixed positions. Works with corrupted pages. Page dump: 0{....0...4...4......=..E....... ........<..~...A.......|........ ................................ ... infimum ...... supremum f .....qT M/T/196001834/Titan Industries L imited_TITAN INTERNATIONAL HOLDI NGS B.V._CHRONATA_Cyprus_45829_4 5829_.. .e.....pTM/T/196001845/T itan Industries Limited_TITAN IN TERNATIONAL HOLDINGS B.V._TANISH
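Since the two system records sit at fixed offsets, a page can be recognized as an index page by looking for the literal strings there. A minimal Python sketch of such a check; the COMPACT offsets (0x63, 0x70) match the constraints_parser output later in these slides, while the REDUNDANT offsets (101, 116) are my recollection of the InnoDB source and should be treated as an assumption.

```python
# (infimum offset, supremum offset) per row format.
SIGNATURES = {
    "compact": (99, 112),      # 0x63 / 0x70
    "redundant": (101, 116),   # assumption, not shown in the slides
}


def detect_row_format(page):
    """Return 'compact', 'redundant', or None if no signature is found."""
    for fmt, (inf, sup) in SIGNATURES.items():
        if page[inf:inf + 7] == b"infimum" and page[sup:sup + 8] == b"supremum":
            return fmt
    return None
```

The check reads only a few bytes near the start of the page, so damage elsewhere in the page does not affect it.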
  • 39. InnoDB recovery tool: constraints_parser server # ./constraints_parser -4 -f pages-1259793800/0-16/51-00000051.page Table structure is defined in "include/table_defs.h" See HOWTO for details: http://code.google.com/p/innodb-tools/wiki/InnodbRecoveryHowto Filters inside table_defs.h are very important
  • 40. Check InnoDB page before reading recs # ./constraints_parser -5 -U -f pages/0-418/12665-00012665.page -V Initializing table definitions... Processing table: document_type_fieldsets_link - total fields: 5 - nullable fields: 0 - minimum header size: 5 - minimum rec size: 25 - maximum rec size: 25 Read data from fn=3... Page id: 12665 Checking a page Infimum offset: 0x63 Supremum offset: 0x70 Next record at offset: 0x9F (159) Next record at offset: 0xB0 (176) Next record at offset: 0x3D95 (15765) … Next record at offset: 0x70 (112) Page is good Check whether the tool can follow all records by their addresses. If so, look for records only at the exact positions where the records are. Helps a lot for COMPACT format!
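The "Page is good" check amounts to walking the record list from infimum until it reaches supremum. Below is a rough Python sketch for COMPACT pages, assuming the next-record pointer is a 2-byte big-endian relative offset stored just before each record origin (my understanding of the COMPACT header, not taken from these slides); `follow_records` is a hypothetical name, not the tool's function.

```python
import struct

PAGE_SIZE = 16 * 1024
INFIMUM, SUPREMUM = 0x63, 0x70  # COMPACT offsets, as printed in the output above


def follow_records(page, max_records=PAGE_SIZE // 5):
    """Return user record offsets in list order, or None if the chain is broken."""
    offsets, seen, cur = [], set(), INFIMUM
    for _ in range(max_records + 1):
        # Assumption: relative pointer lives in the 2 bytes before the origin.
        (rel,) = struct.unpack_from(">H", page, cur - 2)
        nxt = (cur + rel) % PAGE_SIZE  # relative offsets wrap around
        if rel != 0 and nxt == SUPREMUM:
            return offsets
        # Reject null pointers, out-of-range targets, and loops.
        if rel == 0 or not (SUPREMUM < nxt < PAGE_SIZE - 8) or nxt in seen:
            return None
        offsets.append(nxt)
        seen.add(nxt)
        cur = nxt
    return None
```

When the chain is intact, the returned offsets are the only places a record needs to be parsed, instead of trying every byte position in the page.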
  • 41. Import result t1 1 "browse" 10 t1 2 "dashboard" 20 t1 3 "addFolder" 18 t1 4 "editFolder" 15 mysql> LOAD DATA INFILE '/path/to/datafile' REPLACE INTO TABLE <table_name> FIELDS TERMINATED BY '\t' OPTIONALLY ENCLOSED BY '"' LINES STARTING BY '<table_name>\t' (id,sessionid,uniqueid,username,nasipaddress,@var1,@var2,etc) SET datefield1 = FROM_UNIXTIME(@var1), datefield2 = FROM_UNIXTIME(@var2,'%Y %D %M %h:%i:%s %x');
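Before loading the recovered rows it can help to inspect them. A small Python sketch that reads the dump in the shape shown above, assuming tab-separated fields with optional double quotes and the table name as the first field (which is what the LOAD DATA options imply); `parse_dump` is a hypothetical helper.

```python
import csv
import io


def parse_dump(text, table_name):
    """Return the rows of one table, with the table-name prefix stripped."""
    rows = []
    reader = csv.reader(io.StringIO(text), delimiter="\t", quotechar='"')
    for fields in reader:
        # Keep only lines that start with the requested table name.
        if fields and fields[0] == table_name:
            rows.append(fields[1:])
    return rows
```

All values come back as strings; type conversion is left to LOAD DATA or to the caller.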
  • 42. Questions? Thank you for coming! References: http://www.mysqlperformanceblog.com/ http://percona.com/ Applause :-)