Migrating from PostgreSQL to MySQL at Cocolog
  Naoto Yokoyama, NIFTY Corporation
  Garth Webb, Six Apart
  Lisa Phillips, Six Apart
  Credits: Kenji Hirohama, Sumisho Computer Systems Corp.
Agenda 1. What is Cocolog 2. History of Cocolog 3. DBP: Database Partitioning 4. Migration From PostgreSQL to MySQL
1. What is Cocolog
What is Cocolog
  NIFTY Corporation
    Established in 1986
    A Fujitsu Group Company
    NIFTY-Serve (licensed and interconnected with CompuServe)
    One of the largest ISPs in Japan
  Cocolog
    First blog community at a Japanese ISP
    Based on TypePad technology by Six Apart
    Several hundred million PV/month
  History
    Dec/02/2003: Cocolog launched for ISP users
    Nov/24/2005: Cocolog Free launched (free of charge)
    April/05/2007: Cocolog for Mobile Phone launched
Cocolog (Screenshot of home page) 2008/04 700 Thousand Users
Cocolog (Screenshot of home page) TypePad Cocolog
Cocolog template sets
Cocolog Growth (User) [chart: Cocolog and Cocolog Free, phase1 to phase4]
Cocolog Growth (Entry) [chart: Cocolog and Cocolog Free, phase1 to phase4]
Technology at Cocolog
  Core System
    Linux 2.4/2.6
    Apache 1.3/2.0/2.2 & mod_perl
    Perl 5.8 + CPAN
    PostgreSQL 8.1
    MySQL 5.0
    memcached / TheSchwartz / cfengine
  Eco System
    LAMP, LAPP, Ruby + ActiveRecord, Capistrano, etc.
Monitoring
  Management Tool
    Proprietary in-house development with PostgreSQL, PHP, and Perl
  Monitoring points (in order of priority)
    response time of each post
    number of spam comments/trackbacks
    number of comments/trackbacks
    source IP address of spam
    number of entries
    number of comments via mobile devices
    page views via mobile devices
    time of batch completion
    amount of API usage
    bandwidth usage
    DB: Disk I/O; memory and CPU usage; time of VACUUM ANALYZE
    APP: number of active processes; CPU usage; memory usage
  Diagram layers: Hard, DB, Service, APL
Tips for migration
  Troubles with PostgreSQL 7.4-8.1 & Linux 2.4/2.6
    VACUUM
    Data size
    Character set
  Cleaning data
  Troubles with MySQL
    convert_tz function
    sort order
2. History of Cocolog
Phase1 2003/12 ~ (Entry: 0.04 Million)
  Before DBP, 10 servers
  Diagram: Register, TypePad, PostgreSQL, NAS, WEB (static contents published)
Phase2 2004/12 ~ (Entry: 7 Million)
  Before DBP, 50 servers
  Diagram: Register, TypePad, PostgreSQL, NAS, WEB (static contents published), Podcast, Portal, Profile, etc., Rich template, Publish Book, Tel Operator Support
  Date labels in diagram: 2004/12 ~, 2005/5 ~
Phase2 - Problems
  The system is tightly coupled.
  The database server receives requests from multiple points.
  It is difficult to change the system design and the database schema.
Phase3 2006/3 ~ (Entry: 12 Million)
  Before DBP, 200 servers
  Diagram: Register, TypePad, PostgreSQL, NAS, WEB (static contents published), Web-API, memcached, Podcast, Portal, Profile, etc., Rich template, Publish Book, Tel Operator Support
Phase4 2007/4 ~ (Entry: 16 Million)
  Before DBP, 300 servers
  Diagram: Register, TypePad, PostgreSQL, NAS, WEB (static contents published), Web-API, memcached, Atom, Mobile WEB, Rich template, Publish Book, Tel Operator Support
Now 2008/4 ~
  After DBP, 150 servers
  Diagram: Register, TypePad, Multi MySQL, NAS, WEB (static contents published), Web-API, memcached, Atom, Mobile WEB, Rich template, Publish Book, Tel Operator Support
3.  TypePad Database Partitioning
Steps for Transitioning
  Server Preparation: Hardware and software setup
  Global Write: Write user information to the global DB
  Global Read: Read/write user information on the global DB
  Move Sequence: Table sequences served by global DB
  User Data Move: Move user data to user partitions
  New User Partition: All new users saved directly to user partition 1
  New User Strategy: Decide on a strategy for the new user partition
  Non User Data Move: Move all non-user owned data
TypePad Overview (Pre-DBP)
  Clients (via Internet): Blog Readers, Blog Owners, Mobile Blog Readers
  Components
    Web Server
    Application Server
    ATOM Server
    TypeCast Server: dedicated server for TypeCast (via ATOM)
    MEMCACHED: data caching servers to reduce DB load
    Database (Postgres)
    Storage: static content (HTML, images, etc)
    Mail Server
    ADMIN (CRON) Server: cron server for periodic asynchronous tasks
  Connections: https(443), http(80), http(80): atom api, memcached(11211), postgres(5432), nfs(2049), smtp(25) / pop(110)
Why Partition?
  Current setup: all inquiries (accesses) go to one DB (Postgres)
    TypePad instances -> Non-User Role, User Role (User0)
  After DBP: inquiries (accesses) are divided among several DBs (MySQL)
    TypePad instances -> Global Role, Non-User Role, User Role (User1), User Role (User2), User Role (User3)
Server Preparation
  Current setup
    TypePad -> DB (PostgreSQL): Non-User Role, User Role (User0)
  New expanded setup
    DB (MySQL) for partitioned data
      Global Role: maintains user mapping and primary key generation
      User Role (User1, User2, User3): user information is partitioned
      Non-User Role: information that does not need to be partitioned (such as session information)
    Asynchronous Job Server
      Job Server + TypePad + Schwartz: server for executing jobs
      Schwartz DB: stores job details
  ※ Grey areas are not used in current steps
Global Write
  Creating the user map
  Explanation
    ①: For new registrations only, uniquely identifying user data is written to the global DB
    ②: This same data continues to be written to the existing DB
  Diagram: TypePad; DB (PostgreSQL): Non-User Role, User Role (User0); DB (MySQL) for partitioned data: Global Role (maintains user mapping and primary key generation), User Role (User1, User2, User3), Non-User Role; Asynchronous Job Server: Job Server + TypePad + Schwartz, Schwartz DB
  ※ Grey areas are not used in current steps
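The dual write in steps ① and ② can be sketched as follows. This is a minimal illustration, not TypePad's code: the two in-memory SQLite connections stand in for the global MySQL DB and the existing PostgreSQL DB, and the table names, column names, and `register_user` function are all hypothetical.

```python
import sqlite3


def open_db(schema: str) -> sqlite3.Connection:
    """Open an in-memory database and create one table in it."""
    conn = sqlite3.connect(":memory:")
    conn.execute(schema)
    return conn


# Hypothetical stand-ins for the existing DB and the new global DB.
legacy_db = open_db("CREATE TABLE users (user_id INTEGER PRIMARY KEY, name TEXT)")
global_db = open_db(
    "CREATE TABLE user_map ("
    "user_id INTEGER PRIMARY KEY, name TEXT, partition_name TEXT)"
)


def register_user(user_id: int, name: str) -> None:
    # Step 1: identifying data for the new user goes to the global DB.
    # Every user still lives on the original partition, called "user0" here.
    global_db.execute(
        "INSERT INTO user_map VALUES (?, ?, ?)", (user_id, name, "user0")
    )
    # Step 2: the same data continues to be written to the existing DB.
    legacy_db.execute("INSERT INTO users VALUES (?, ?)", (user_id, name))
    global_db.commit()
    legacy_db.commit()


register_user(1, "alice")
```

Writing to both stores keeps the existing DB authoritative while the user map fills up, so the step can be rolled back simply by ignoring the global DB.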
Global Read
  Use the user map to find the user partition
  Explanation
    ①: Migrate existing user data to the global DB
    ②: At the start of the request, the application queries the global DB for the location of the user data
    ③: The application then talks to this DB for all queries about this user. At this stage the global DB points to the user0 partition in all cases.
  Diagram: TypePad; DB (PostgreSQL): Non-User Role, User Role (User0); DB (MySQL) for partitioned data: Global Role (maintains user mapping and primary key generation), User Role (User1, User2, User3), Non-User Role; Asynchronous Job Server: Job Server + TypePad + Schwartz, Schwartz DB
  ※ Grey areas are not used in current steps
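A minimal sketch of the lookup in steps ② and ③, assuming the user map is a simple user-to-partition table. The class names, `locate`, and `partition_for_request` are hypothetical and use in-memory dictionaries in place of real databases.

```python
from dataclasses import dataclass, field


@dataclass
class Partition:
    """Stand-in for one user-role database."""
    name: str
    rows: dict = field(default_factory=dict)  # user_id -> list of entries


@dataclass
class GlobalDB:
    """Stand-in for the global role: holds only the user map."""
    user_map: dict = field(default_factory=dict)  # user_id -> partition name

    def locate(self, user_id: int) -> str:
        return self.user_map[user_id]


PARTITIONS = {name: Partition(name) for name in ("user0", "user1", "user2", "user3")}
GLOBAL = GlobalDB()


def migrate_existing_users(user_ids) -> None:
    # Step 1: existing users are added to the map, all pointing at user0.
    for user_id in user_ids:
        GLOBAL.user_map[user_id] = "user0"


def partition_for_request(user_id: int) -> Partition:
    # Step 2: ask the global DB where this user's data lives.
    # Step 3: the caller uses the returned partition for the whole request.
    return PARTITIONS[GLOBAL.locate(user_id)]


migrate_existing_users([101, 102])
part = partition_for_request(101)
part.rows.setdefault(101, []).append("hello world")
```

Because every map entry points at user0 at this stage, the indirection can be exercised in production before any data has moved.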
Move Sequence
  Migrating primary key generation
  Explanation
    ①: Postgres sequences (for generating unique primary keys) are migrated to tables on the global DB that act as "pseudo-sequences".
    ②: The application requests new primary keys from the global DB rather than the user partition.
  Diagram: TypePad; DB (PostgreSQL): Non-User Role, User Role (User0); DB (MySQL) for partitioned data: Global Role (maintains user mapping and primary key generation; sequence management migrated here), User Role (User1, User2, User3), Non-User Role; Asynchronous Job Server: Job Server + TypePad + Schwartz, Schwartz DB
  ※ Grey areas are not used in current steps
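One way a table can act as a "pseudo-sequence" is sketched below. The slide does not show TypePad's actual schema, so the `pseudo_sequence` table, its columns, and the `next_id` helper are assumptions, and an in-memory SQLite database stands in for the global MySQL DB.

```python
import sqlite3

# Stand-in for the global DB; table and column names are hypothetical.
global_db = sqlite3.connect(":memory:")
global_db.execute(
    "CREATE TABLE pseudo_sequence (name TEXT PRIMARY KEY, last_id INTEGER NOT NULL)"
)


def create_sequence(name: str, start_after: int) -> None:
    """Seed a pseudo-sequence with the last value the old sequence issued."""
    with global_db:
        global_db.execute(
            "INSERT INTO pseudo_sequence VALUES (?, ?)", (name, start_after)
        )


def next_id(name: str) -> int:
    """Increment and read the counter inside a single transaction."""
    with global_db:
        cur = global_db.execute(
            "UPDATE pseudo_sequence SET last_id = last_id + 1 WHERE name = ?",
            (name,),
        )
        if cur.rowcount != 1:
            raise KeyError(name)
        (value,) = global_db.execute(
            "SELECT last_id FROM pseudo_sequence WHERE name = ?", (name,)
        ).fetchone()
    return value


create_sequence("entry_id", 16_000_000)
first = next_id("entry_id")
second = next_id("entry_id")
```

Serving all keys from one place keeps primary keys unique across partitions, which matters once a user's rows can move from one partition to another.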
User Data Move
  Moving user data to the new user-role partitions
  Explanation
    ①: Existing users that should be migrated by the Job Server are submitted as new Schwartz jobs. User data is then migrated asynchronously.
    ②: If a comment arrives while the user is being migrated, it is saved in the Schwartz DB to be published later.
    ③: After being migrated, all user data will exist on the user-role DB partitions.
    ④: Once all user data is migrated, only non-user data is on Postgres.
  Diagram: TypePad; DB (PostgreSQL): Non-User Role, User Role (User0); DB (MySQL) for partitioned data: Global Role (maintains user mapping and primary key generation), User Role (User1, User2, User3; user information is partitioned), Non-User Role; Asynchronous Job Server: Job Server + TypePad + Schwartz (server for executing jobs), Schwartz DB (stores job details)
  ※ Grey areas are not used in current steps
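The asynchronous move and the deferred comments in steps ① and ② can be sketched as below. This is an in-memory illustration only: the queues stand in for TheSchwartz and its DB, and every name (`submit_migration`, `post_comment`, `run_next_job`) is hypothetical.

```python
from collections import deque

# In-memory stand-ins; all names are hypothetical.
user_map = {1: "user0", 2: "user0"}
partitions = {"user0": {1: ["post A"], 2: ["post B"]}, "user1": {}}
migrating = set()            # users whose data is currently being moved
job_queue = deque()          # stand-in for queued Schwartz jobs
deferred_comments = deque()  # stand-in for comments parked in the Schwartz DB


def submit_migration(user_id: int, target: str) -> None:
    # Step 1: each user to migrate becomes one asynchronous job.
    job_queue.append((user_id, target))
    migrating.add(user_id)


def post_comment(user_id: int, text: str) -> None:
    # Step 2: comments for a user in flight are saved to be published later.
    if user_id in migrating:
        deferred_comments.append((user_id, text))
    else:
        partitions[user_map[user_id]][user_id].append(text)


def run_next_job() -> None:
    user_id, target = job_queue.popleft()
    source = user_map[user_id]
    # Step 3: move the data, then repoint the user map at the new partition.
    partitions[target][user_id] = partitions[source].pop(user_id)
    user_map[user_id] = target
    migrating.discard(user_id)
    # Publish comments that arrived for this user during the move.
    remaining = deque()
    while deferred_comments:
        uid, text = deferred_comments.popleft()
        if uid == user_id:
            post_comment(uid, text)
        else:
            remaining.append((uid, text))
    deferred_comments.extend(remaining)


submit_migration(1, "user1")
post_comment(1, "late comment")
run_next_job()
```

Moving one user per job keeps each unit of work small, so a failed job affects a single user and can be retried on its own.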
New User Partition
  New registrations are created on one user role partition
  Explanation
    ①: When new users register, user data is written to a user role partition.
    ②: Non-user data continues to be served off Postgres.
  Diagram: TypePad; DB (PostgreSQL): Non-User Role, User Role (User0); DB (MySQL) for partitioned data: Global Role (maintains user mapping and primary key generation), User Role (User1, User2, User3; user information is partitioned), Non-User Role; Asynchronous Job Server: Job Server + TypePad + Schwartz, Schwartz DB
  ※ Grey areas are not used in current steps
New User Strategy
  Pick a scheme for distributing new users
  Explanation
    ①: When new users register, user data is written to one of the user role partitions, depending on a set distribution method (round robin, random, etc.)
    ②: Non-user data continues to be served off Postgres.
  Diagram: TypePad; DB (PostgreSQL): Non-User Role, User Role (User0); DB (MySQL) for partitioned data: Global Role (maintains user mapping and primary key generation), User Role (User1, User2, User3; user information is partitioned), Non-User Role; Asynchronous Job Server: Job Server + TypePad + Schwartz, Schwartz DB
  ※ Grey areas are not used in current steps
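The two distribution methods named in step ① can be sketched as interchangeable chooser functions. The partition names and function names here are hypothetical; the slide does not say which method Cocolog selected.

```python
import itertools
import random

PARTITIONS = ["user1", "user2", "user3"]


def round_robin_chooser(partitions):
    """Return a function that hands out partitions in rotation."""
    cycle = itertools.cycle(partitions)
    return lambda: next(cycle)


def random_chooser(partitions, seed=None):
    """Return a function that picks a partition uniformly at random."""
    rng = random.Random(seed)
    return lambda: rng.choice(partitions)


# Assign six new registrations with the round-robin scheme.
choose = round_robin_chooser(PARTITIONS)
user_map = {user_id: choose() for user_id in range(1, 7)}
```

Because the chosen partition is recorded in the user map at registration time, the strategy can be changed later without affecting users who are already placed.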
Non User Data Move
  Migrate data that cannot be partitioned by user
  Explanation
    ①: Migrate non-user role data left on PostgreSQL to the MySQL side.
  Diagram: TypePad; DB (PostgreSQL): Non-User Role, User Role (User0); DB (MySQL) for partitioned data: Global Role (maintains user mapping and primary key generation), User Role (User1, User2, User3; user information is partitioned), Non-User Role (information that does not need to be partitioned, such as session information); Asynchronous Job Server: Job Server + TypePad + Schwartz, Schwartz DB
  ※ Grey areas are not used in current steps
Data migration done
  Explanation
    ①: All data access is now done through MySQL
    ②: Continue to use The Schwartz for asynchronous jobs
  Diagram: TypePad; DB (Postgres): Non-User Role, User Role (User0); DB (MySQL) for partitioned data: Global Role (maintains user mapping and primary key generation), User Role (User1, User2, User3; user information is partitioned), Non-User Role (information that does not need to be partitioned, such as session information); Asynchronous Job Server: Job Server + TypePad + Schwartz (server for executing jobs), Schwartz DB (stores job details)
  ※ Grey areas are not used in current steps
The New TypePad configuration
  Clients (via Internet): Blog Readers, Blog Owners (management interface), Mobile Blog Readers
  Components
    Web Server
    Application Server
    ATOM Server
    TypeCast Server: dedicated server for TypeCast (via ATOM)
    MEMCACHED: data caching servers to reduce DB load
    Database (MySQL)
    Storage: static content (HTML, images, etc)
    Mail Server
    ADMIN (CRON) Server: cron server for periodic asynchronous tasks
    Job Server: TheSchwartz server for running ad-hoc jobs asynchronously
  Connections: https(443), http(80), http(80): atom api, memcached(11211), MySQL(3306), nfs(2049), smtp(25) / pop(110)
4. Migration from PostgreSQL to MySQL
DB Node Spec History
  History of scale up PostgreSQL server, Before DBP
  Time: 2003/12 (first row) to 2007/11 (last row)
  OS (RedHat)     | CPU (Xeon)                   | MEM  | DiskArray
  7.4 (2.4.9)     | 1.8GHz/512k×1                | 1GB  | No
  ES2.1 (2.4.9)   | 3.2GHz/1M×2                  | 4GB  | No
  ES2.1 (2.4.9)   | 3.2GHz/1M×2                  | 4GB  | Yes
  AS2.1 (2.4.9)   | 3.2GHz/1M×4                  | 12GB | Yes
  AS4 (2.6.9)     | 3.2GHz/1M×4                  | 12GB | Yes
  AS4 (2.6.9)     | MP 3.3GHz/1M×4 〔2Core×4〕   | 16GB | Yes
DB DiskArray Spec
  History of scale up PostgreSQL server, Before DBP
  [FUJITSU ETERNUS8000]
    Best I/O transaction performance in the world
    146GB (15 krpm) * 32 disks with RAID-10
    MultiPath FibreChannel 4Gbps
    QuickOPC (One Point Copy): OPC copy functions let you create a duplicate copy of any data from the original at any chosen time.
    http://www.computers.us.fujitsu.com/www/products_storage.shtml?products/storage/fujitsu/e8000/e8000
Scale out MySQL servers, After DBP
  A role configuration
    Each role is configured as an HA cluster
    HA Software: NEC ClusterPro
    Shared Storage
Scale out MySQL servers, After DBP
  Diagram: TypePad Application; MySQL Role1, MySQL Role2, MySQL Role3 (heart beat); PostgreSQL; FibreChannel SAN; DiskArray
Scale out MySQL servers, After DBP
  Backup
    Replication w/ Hot backup
Scale out MySQL servers, After DBP
  Diagram: TypePad Application; MySQL Role1, MySQL Role2, MySQL Role3 (mysqld, heart beat); replication (rep) to mysqld instances on MySQL BackupRole; opc; PostgreSQL; FibreChannel SAN; DiskArray
Troubles with PostgreSQL 7.4 – 8.1
  Data size
    Over 100 GB
    40% is index
    Severe data fragmentation
  VACUUM
    "VACUUM ANALYZE" causes performance problems
    Takes too long to VACUUM large amounts of data
    dump/restore is the only solution for de-fragmentation
  Auto VACUUM
    We don't use Auto VACUUM since we are worried about its effect on response latency
Troubles with PostgreSQL 7.4 – 8.1
  Character set
    PostgreSQL accepts out-of-range UTF-8 data, such as Japanese extended character sets and other multibyte character sets, instead of rejecting it with an error as it normally should.
"Cleaning" data
  Removing characters that fall outside the valid UTF-8 character set.
  Steps
    PostgreSQL.dumpALL
    Split for piconv
    UTF8 -> UCS2 -> UTF8 & Merge
    PostgreSQL.restore
  Flow: dump -> Split -> UTF8->UCS2->UTF8 -> Merge -> restore
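The UTF8 -> UCS2 -> UTF8 round trip can be sketched in Python as follows. The talk used piconv; this sketch is a substitute, not the same tool, and it assumes that invalid bytes and characters outside UCS-2 are simply dropped, which may differ from how piconv was configured. The function name `clean_utf8` is hypothetical.

```python
def clean_utf8(raw: bytes) -> bytes:
    """Drop bytes that do not survive a UTF-8 -> UCS-2 -> UTF-8 round trip."""
    # Decode as UTF-8, discarding byte sequences that are not valid UTF-8.
    text = raw.decode("utf-8", errors="ignore")
    # UCS-2 covers only the Basic Multilingual Plane, so discard anything above it.
    bmp_only = "".join(ch for ch in text if ord(ch) <= 0xFFFF)
    # For BMP-only text, UTF-16 is byte-identical to UCS-2.
    ucs2 = bmp_only.encode("utf-16-be")
    return ucs2.decode("utf-16-be").encode("utf-8")


def clean_dump(lines):
    """Clean a dump that has been split into lines (the Split and Merge steps)."""
    return [clean_utf8(line) for line in lines]


dirty = "日本語".encode("utf-8") + b"\xff\xfe" + b"ok"
cleaned = clean_utf8(dirty)
```

Splitting on newlines is safe for UTF-8 because the newline byte never appears inside a multibyte sequence, so each chunk can be cleaned independently and merged back in order.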
Migration from PostgreSQL to MySQL using TypePad script
  Steps
    PostgreSQL -> PerlObject & tmp publish -> MySQL -> PerlObject & last publish
    diff tmp & last Object (data check)
    diff tmp & last publish (file check)
  Diagram: TypePad, PostgreSQL, Document Object tmp, Document Object last, data check, file check
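The two checks can be sketched as below. The real checks ran against Perl objects and published files from the TypePad script; here plain dictionaries stand in for both, and the function names are hypothetical.

```python
import hashlib

_MISSING = object()  # marks a field present on one side only


def diff_objects(tmp_obj: dict, last_obj: dict) -> list:
    """Data check: fields that differ between the object loaded before
    the migration (tmp) and the one loaded after it (last)."""
    keys = tmp_obj.keys() | last_obj.keys()
    return sorted(
        k for k in keys
        if tmp_obj.get(k, _MISSING) != last_obj.get(k, _MISSING)
    )


def digest(content):
    """Hash file content; None means the file was not published."""
    return None if content is None else hashlib.sha256(content).hexdigest()


def diff_published(tmp_files: dict, last_files: dict) -> list:
    """File check: paths whose published output differs or is missing."""
    paths = tmp_files.keys() | last_files.keys()
    return sorted(
        p for p in paths
        if digest(tmp_files.get(p)) != digest(last_files.get(p))
    )


changed_fields = diff_objects(
    {"title": "a", "body": "b"}, {"title": "a", "body": "B"}
)
changed_files = diff_published(
    {"index.html": b"<p>x</p>"}, {"index.html": b"<p>x</p>"}
)
```

Checking both the loaded objects and the published files catches problems at two levels: data that changed in the database, and data that loads correctly but renders differently.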
Troubles with MySQL
  convert_tz function
    Does not support input values outside the range of Unix time
  sort order
    The sort order differs when there is no "ORDER BY" clause
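One possible workaround for both issues is to handle them in the application rather than in SQL, sketched below. This is not described in the talk: the fixed +09:00 offset for Japan, the row layout, and both function names are assumptions.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed +09:00 offset is sufficient for the stored data.
JST = timezone(timedelta(hours=9), "JST")


def utc_to_jst(naive_utc: datetime) -> datetime:
    """Convert a naive UTC datetime to naive JST without calling convert_tz,
    so dates outside the Unix time range are handled as well."""
    aware = naive_utc.replace(tzinfo=timezone.utc)
    return aware.astimezone(JST).replace(tzinfo=None)


def entries_newest_first(rows):
    """Impose an explicit, deterministic order instead of relying on the
    order the database happens to return; id breaks timestamp ties."""
    return sorted(rows, key=lambda row: (row["created"], row["id"]), reverse=True)


old_date = utc_to_jst(datetime(1900, 1, 1, 0, 0))
rows = [
    {"id": 1, "created": datetime(2008, 4, 1)},
    {"id": 2, "created": datetime(2008, 4, 2)},
]
ordered_ids = [row["id"] for row in entries_newest_first(rows)]
```

The same ordering rule applies on the SQL side: any query whose result order matters needs an explicit ORDER BY, since neither database guarantees an order without one.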
Cocolog Future Plans Dynamic Job queue
Consulting by Sumisho Computer Systems Corp.
  System Integrator
    First and best partner of MySQL in Japan since 2003
    Provides MySQL consulting, support, and training services
  HA
  Maintenance
  Online backup
  Japanese character support
Questions

Editor's Notes

  • #2 I am Yokoyama from NIFTY. Thank you for having me today. I am grateful to everyone involved for the opportunity to present at such a distinguished venue.