Developing Replication Plugins for Drizzle
Jay Pipes, http://joinfu.com
Padraig O'Sullivan, http://posulliv.com
These slides are released under the Creative Commons Attribution-Noncommercial-Share Alike 3.0 License
what we'll cover today
Overview of Drizzle's architecture
Code walkthrough of Drizzle plugin basics
Overview of Drizzle's replication system
Understanding Google Protobuffers
The Transaction message
In-depth walkthrough of the filtered replicator
In-depth walkthrough of the transaction log
Overview of Drizzle's Architecture
drizzle's system architecture
"Microkernel" design means most features are built as plugins
  Authentication, replication, logging, information schema, storage engine, etc.
The kernel is really just the parser, optimizer, and runtime
We are C++, not C+
We use open source libraries as much as possible
  STL, gettext, Boost, pcre, GPB, etc.
Don't reinvent the wheel
drizzle's system architecture
No single "right way" of implementing something
  Your solution may be great for your environment, but not good for others
  And that's fine: it's what the plugin system is all about
We focus on the APIs so you can focus on the implementation
Drizzle is just one part of a large ecosystem
  Web servers, caching layers, authentication systems
[Diagram: Drizzle architecture. Labels: Clients; kernel; Parser; Optimizer; Executor; Listener Plugin (Protocol); Pluggable Storage Engine API (MyISAM, InnoDB, MEMORY, Archive, PBXT); Authentication Plugin; Authorization Plugin; Query Cache Plugin; Logging Plugin (Pre); Logging Plugin (Post); Replication Plugins; Replication Services; Transaction Services; Scheduler Plugin; User-Defined Function Plugins; Dictionary Plugin; Plugin Registration; Metadata Services]
ignore the kernel
You should be able to ignore the kernel as a "black box"
Plugin developers should focus on their plugin or module and not change anything in the kernel
If you need to meddle with or change something in the kernel, it is a sign of a bad interface
  And you should file a bug! :)
Walkthrough of Drizzle Plugin Basics
plugin/module development basics
A working C++ development environment
  http://www.joinfu.com/2008/08/getting-a-working-c-c-plusplus-development-environment-for-developing-drizzle/
A module in Drizzle is a set of source files in /plugin/ that implements some functionality
  For instance, /plugin/transaction_log/* contains all files for the Transaction Log module
Each module must have a plugin.ini file
  The fabulous work by Monty Taylor on the Pandora build system automates most work for you
plugin/module development basics
A module contains one or more implementations of a plugin class
A plugin class is any class interface declared in /drizzled/plugin/
  For instance, the header file /drizzled/plugin/transaction_applier.h declares the interface for the plugin::TransactionApplier API
The header files contain documentation for the plugin interfaces
You can also see documentation on the drizzle.org website: http://drizzle.org/doxygen/
the plugin.ini
A description file for the plugin
Read during compilation; the Pandora build system uses it to create the appropriate linkage for you
Required fields:
  headers= <list of all header files in module>
  sources= <list of all source files in module>
  title= <name of the module/plugin>
  description= <description for the module>
from plugin.ini to data dictionary

  [plugin]
  title=Filtered Replicator
  author=Padraig O Sullivan
  version=0.2
  license=PLUGIN_LICENSE_GPL
  description=A simple filtered replicator which allows a user to filter out events based on a schema or table name
  load_by_default=yes
  sources=filtered_replicator.cc
  headers=filtered_replicator.h

  drizzle> SELECT * FROM DATA_DICTIONARY.MODULES
        -> WHERE MODULE_NAME LIKE 'FILTERED%'\G
  *************************** 1. row ***************************
         MODULE_NAME: filtered_replicator
      MODULE_VERSION: 0.2
       MODULE_AUTHOR: Padraig O'Sullivan
          IS_BUILTIN: FALSE
      MODULE_LIBRARY: filtered_replicator
  MODULE_DESCRIPTION: Filtered Replicator
      MODULE_LICENSE: GPL

  drizzle> SELECT * FROM DATA_DICTIONARY.PLUGINS
        -> WHERE PLUGIN_NAME LIKE 'FILTERED%'\G
  *************************** 1. row ***************************
  PLUGIN_NAME: filtered_replicator
  PLUGIN_TYPE: TransactionReplicator
    IS_ACTIVE: TRUE
  MODULE_NAME: filtered_replicator
module initialization
We recommend placing module-level variables and routines in /plugin/$module/module.cc
Required: an initialization function taking a reference to the module::Context object for your module as its only parameter
  Typically named init()
Optional: module-level system variables
Required: the DRIZZLE_PLUGIN($init, $vars) macro inside the above source file
module initialization example

  static DefaultReplicator *default_replicator= NULL; /* The singleton replicator */

  static int init(module::Context &context)
  {
    default_replicator= new DefaultReplicator("default_replicator");
    context.add(default_replicator);
    return 0;
  }

  DRIZZLE_PLUGIN(init, NULL);
what are plugin hooks?
Places in the source code that notify plugins about certain events are called plugin hooks
During the course of a query's execution, many plugin hooks can be called
The subclass of  plugin::Plugin  determines on which events a plugin is notified and what gets passed as a state parameter to the plugin during notification
These plugin hooks define the plugin's  API
Example: plugin::Authentication

  class Authentication : public Plugin
  {
  public:
    explicit Authentication(std::string name_arg)
      : Plugin(name_arg, "Authentication") {}
    virtual ~Authentication() {}

    virtual bool authenticate(const SecurityContext &sctx,
                              const std::string &passwd)= 0;

    static bool isAuthenticated(const SecurityContext &sctx,
                                const std::string &password);
  };

authenticate() is the pure virtual method that an implementing class must define
isAuthenticated() is the plugin hook that the kernel calls to authenticate a user
example plugin hook

  class AuthenticateBy : public unary_function<plugin::Authentication *, bool>
  {
    ...
    inline result_type operator()(argument_type auth)
    {
      return auth->authenticate(sctx, password);
    }
  };

  bool plugin::Authentication::isAuthenticated(const SecurityContext &sctx,
                                               const string &password)
  {
    ...
    /* Use find_if instead of foreach so that we can collect return codes */
    vector<plugin::Authentication *>::iterator iter=
      find_if(all_authentication.begin(), all_authentication.end(),
              AuthenticateBy(sctx, password));
    ...
    if (iter == all_authentication.end())
    {
      my_error(ER_ACCESS_DENIED_ERROR, MYF(0),
               sctx.getUser().c_str(),
               sctx.getIp().c_str(),
               password.empty() ? ER(ER_NO) : ER(ER_YES));
      return false;
    }
    return true;
  }
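The hook pattern on this slide can be tried outside the Drizzle tree with a minimal, self-contained sketch. Everything here is a hypothetical stand-in: Authenticator, FixedPasswordAuthenticator, and the add()/clear() registry functions are invented for illustration and are not Drizzle classes. Only the shape mirrors the slide: a pure virtual method that plugins implement, plus a static hook that uses find_if to stop at the first plugin that accepts.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a plugin interface such as plugin::Authentication.
class Authenticator {
public:
  explicit Authenticator(std::string name) : name_(std::move(name)) {}
  virtual ~Authenticator() = default;

  // The method every implementing plugin must provide.
  virtual bool authenticate(const std::string &user,
                            const std::string &password) = 0;

  const std::string &name() const { return name_; }

  // Registration, loosely analogous to context.add() in a module's init().
  static void add(Authenticator *plugin) {
    assert(plugin != nullptr);
    registry().push_back(plugin);
  }
  static void clear() { registry().clear(); }

  // The hook: returns true as soon as any registered plugin accepts.
  static bool isAuthenticated(const std::string &user,
                              const std::string &password) {
    const auto &plugins = registry();
    auto iter = std::find_if(
        plugins.begin(), plugins.end(),
        [&](Authenticator *p) { return p->authenticate(user, password); });
    return iter != plugins.end();
  }

private:
  // Function-local static avoids initialization-order problems.
  static std::vector<Authenticator *> &registry() {
    static std::vector<Authenticator *> all;
    return all;
  }
  std::string name_;
};

// A toy plugin (assumption for illustration): accepts one user/password pair.
class FixedPasswordAuthenticator : public Authenticator {
public:
  FixedPasswordAuthenticator(std::string user, std::string password)
      : Authenticator("fixed_password"),
        user_(std::move(user)),
        password_(std::move(password)) {}

  bool authenticate(const std::string &user,
                    const std::string &password) override {
    return user == user_ && password == password_;
  }

private:
  std::string user_;
  std::string password_;
};
```

With no plugins registered, the hook in this sketch denies everyone; whether a real server should behave that way is a policy decision the sketch does not try to settle.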
testing your plugin
No plugin should be without corresponding test cases
Luckily, again because of the work of Monty Taylor, your plugin can easily hook into the Drizzle testing system
Create a  tests/  directory in your plugin's directory, containing a  t/  and an  r/  subdirectory (for “test” and “result”)
creating test cases
Your plugin will most likely not be set to load by default
To activate your plugin, you need to start the server during your tests with:
  --plugin-add=$module
To have the Drizzle test suite start the server with command-line options automatically, create a file called $testname-master.opt and place it alongside your test case in your /plugin/$module/tests/t/ directory
running your test cases
Simply run the test-run.pl script with your suite:

  jpipes@serialcoder:~/repos/drizzle/trunk$ cd tests/
  jpipes@serialcoder:~/repos/drizzle/trunk/tests$ ./test-run --suite=transaction_log
  Drizzle Version 2010.04.1439
  ...
  ================================================================================
  DEFAULT STORAGE ENGINE: innodb
  TEST                                            RESULT    TIME (ms)
  --------------------------------------------------------------------------------
  transaction_log.alter                           [ pass ]       1025
  transaction_log.auto_commit                     [ pass ]        650
  transaction_log.blob                            [ pass ]        661
  transaction_log.create_select                   [ pass ]        688
  transaction_log.create_table                    [ pass ]        413
  transaction_log.delete                          [ pass ]       1744
  transaction_log.filtered_replicator             [ pass ]       6132
  ...
  transaction_log.schema                          [ pass ]        137
  transaction_log.select_for_update               [ pass ]       6496
  transaction_log.slap                            [ pass ]      42522
  transaction_log.sync_method_every_write         [ pass ]         23
  transaction_log.temp_tables                     [ pass ]        549
  transaction_log.truncate                        [ pass ]        441
  transaction_log.truncate_log                    [ pass ]        390
  transaction_log.udf_print_transaction_message   [ pass ]        408
  transaction_log.update                          [ pass ]       1916
  --------------------------------------------------------------------------------
  Stopping All Servers
  All 28 tests were successful.
Overview of Drizzle's Replication System
not in Kansas MySQL anymore
Drizzle's replication system looks nothing like MySQL
Drizzle is entirely row-based
Forget the terms  master ,  slave , and  binlog
We use the terms  publisher ,  subscriber ,  replicator  and  applier
We have a transaction log, but it is not required for replication
  Drizzle's transaction log is a module
The transaction log module has example implementations of an  applier
role of the kernel in replication
Marshal all sources of and targets for replicated data
Construct  objects of type  message::Transaction  that represent the changes made in the server
Push  the Transaction messages out to the replication streams
Coordinate  requests from Subscribers with registered Publishers
Flow of events when client changes data state
[Diagram labels, in extracted order; "kernel" marks the kernel boundary]
  Client issues DML that modifies data
  TransactionServices constructs Transaction message object
  ReplicationServices pushes Transaction message out to all replication streams
  plugin::StorageEngine makes changes to data store
  TransactionServices calls commitTransaction()
  plugin::TransactionReplicator calls replicate()
  plugin::TransactionApplier calls apply()
what is a replication stream?
A replication stream is the pair of a replicator and an applier
Each applier must be matched with a replicator
  Can be done via command-line arguments
  Can be hard-coded
To see the replication streams that are active, you can query DATA_DICTIONARY.REPLICATION_STREAMS:

  drizzle> select * from data_dictionary.replication_streams;
  +--------------------+-------------------------+
  | REPLICATOR         | APPLIER                 |
  +--------------------+-------------------------+
  | default_replicator | transaction_log_applier |
  +--------------------+-------------------------+
  1 row in set (0 sec)
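To make the replicator/applier pairing concrete, here is a small sketch of a stream as a (replicator, applier) pair, with a services object that pushes each message to every attached stream. All names are assumptions made for illustration: the simplified Transaction struct, the Replicator and Applier interfaces, and ReplicationServices::attach()/push() only imitate the roles described on these slides and are not the real Drizzle interfaces.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, heavily simplified stand-in for message::Transaction.
struct Transaction {
  std::uint64_t id;
  std::string schema;
};

// Applies a message to some target (a log, a remote server, ...).
class Applier {
public:
  virtual ~Applier() = default;
  virtual void apply(const Transaction &txn) = 0;
};

// Decides whether and how a message is handed to its paired applier.
class Replicator {
public:
  virtual ~Replicator() = default;
  virtual void replicate(Applier &applier, const Transaction &txn) = 0;
};

// Drops every transaction that touches one schema, forwards the rest.
class SchemaFilterReplicator : public Replicator {
public:
  explicit SchemaFilterReplicator(std::string filtered)
      : filtered_(std::move(filtered)) {}
  void replicate(Applier &applier, const Transaction &txn) override {
    if (txn.schema != filtered_) applier.apply(txn);
  }

private:
  std::string filtered_;
};

// Records the IDs it was asked to apply, so the result is observable.
class RecordingApplier : public Applier {
public:
  void apply(const Transaction &txn) override { applied_ids.push_back(txn.id); }
  std::vector<std::uint64_t> applied_ids;
};

// Holds the streams and pushes each message out to all of them.
class ReplicationServices {
public:
  void attach(Replicator &replicator, Applier &applier) {
    streams_.emplace_back(&replicator, &applier);
  }
  void push(const Transaction &txn) {
    for (auto &stream : streams_) stream.first->replicate(*stream.second, txn);
  }
  std::size_t stream_count() const { return streams_.size(); }

private:
  std::vector<std::pair<Replicator *, Applier *>> streams_;
};

// Ties the pieces together: one filtered stream, three messages pushed.
std::vector<std::uint64_t> demo_filtered_stream() {
  SchemaFilterReplicator replicator("scratch");
  RecordingApplier applier;
  ReplicationServices services;
  services.attach(replicator, applier);
  assert(services.stream_count() == 1);
  services.push({1, "app"});
  services.push({2, "scratch"});  // filtered out by the replicator
  services.push({3, "app"});
  return applier.applied_ids;
}
```

Keeping the filtering decision in the replicator and the side effect in the applier is what lets the same applier be reused behind different replicators.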

the Transaction message
The Transaction message is the basic unit of work in the replication system
Represents a set of changes that were made to a server
protobuffers are XML on crack
Google protobuffers
  Compiler (protoc)
  Library (libprotobuf)
Compiler consumes a .proto file and produces source code files containing classes that represent your data
  In a variety of programming languages
Library contains routines and classes used in working with, serializing, and parsing protobuffer messages
http://code.google.com/apis/protocolbuffers/docs/overview.html
The .proto file
Declares message definitions
  Simple Java/C++-like format
Messages have one or more fields
Fields are of a specific type
  uint32, string, bytes, etc.
Fields have a specifier
  required, optional, repeated
Submessages and enumerations too!
example .proto file

  package drizzled.message;

  /* Context for a transaction. */
  message TransactionContext
  {
    required uint32 server_id = 1;       /* Unique identifier of a server */
    required uint64 transaction_id = 2;  /* Globally-unique transaction ID */
    required uint64 start_timestamp = 3; /* Timestamp of when the transaction started */
    required uint64 end_timestamp = 4;   /* Timestamp of when the transaction ended */
  }

package sets the namespace for the generated code
  In C++, the TransactionContext class would be created in the drizzled::message:: namespace
To compile the .proto, we use the protoc compiler:

  $> protoc --cpp_out=. transaction.proto
generated code files
For C++, protoc produces two files, one header and one source file
  transaction.pb.h, transaction.pb.cc
To use these classes, simply #include the header file and start using your new message classes:

  #include "transaction.pb.h"

  using namespace drizzled;
  message::TransactionContext tc;
  tc.set_transaction_id(100000);
  ...
The C++ POD GPB API in one slide
To access the data, call the method with the same name as the field
To set the data, prepend set_ to the field name
To check existence, prepend has_ to the field name
To add a new element to a repeated field, prepend add_ to the field name
To get a pointer to a field that is a submessage, prepend mutable_ to the field name
All memory for fields is managed by GPB; when you delete the main object, all memory is freed
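The naming rules above can be seen side by side in a hand-written imitation. The real classes would be generated by protoc from a .proto file; the Context, Stmt, and Txn classes below are assumptions written by hand purely so the sketch compiles without protoc or libprotobuf, and they copy only the accessor naming convention (field(), set_field(), has_field(), add_field(), mutable_field(), field_size()).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <string>

// Hand-written stand-in for a generated message with one scalar field.
class Context {
public:
  std::uint64_t transaction_id() const { return transaction_id_; }  // getter
  bool has_transaction_id() const { return has_transaction_id_; }   // has_
  void set_transaction_id(std::uint64_t value) {                    // set_
    transaction_id_ = value;
    has_transaction_id_ = true;
  }

private:
  std::uint64_t transaction_id_ = 0;
  bool has_transaction_id_ = false;
};

// Stand-in for a message with one optional string field.
class Stmt {
public:
  const std::string &sql() const { return sql_; }
  bool has_sql() const { return has_sql_; }
  void set_sql(const std::string &value) {
    sql_ = value;
    has_sql_ = true;
  }

private:
  std::string sql_;
  bool has_sql_ = false;
};

// Stand-in for a message with a submessage and a repeated field.
class Txn {
public:
  const Context &transaction_context() const { return context_; }
  Context *mutable_transaction_context() { return &context_; }  // mutable_

  int statement_size() const { return static_cast<int>(statements_.size()); }
  const Stmt &statement(int index) const {
    return statements_[static_cast<std::size_t>(index)];
  }
  Stmt *add_statement() {  // add_: appends an element and returns it
    statements_.emplace_back();
    return &statements_.back();
  }

private:
  Context context_;
  // deque keeps earlier element pointers valid when new elements are appended.
  std::deque<Stmt> statements_;
};

// Builds a message using only the conventions listed on the slide.
Txn build_example() {
  Txn txn;
  txn.mutable_transaction_context()->set_transaction_id(100000);
  Stmt *stmt = txn.add_statement();
  stmt->set_sql("INSERT INTO t1 VALUES (1)");
  txn.add_statement();  // second statement, sql deliberately left unset
  assert(txn.statement_size() == 2);
  return txn;
}
```

The owning object holds every submessage and repeated element by value here, which mirrors the slide's point that destroying the main object releases everything beneath it.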
the Transaction message
The Transaction message is the basic unit of work in the replication system
Represents a set of changes that were made to a server
Most of the time, the Transaction message represents the work done in a single SQL transaction
  Large SQL transactions may be broken into multiple Transaction messages
the Transaction message format
...
TransactionContext
  Transaction ID
  Start and end timestamps
  Channel ID (optional)
Statements
  One or more Statement submessages
  Describes the rows modified in a SQL statement
[Diagram: Transaction Context, followed by Statements: Statement 1, Statement 2, ..., Statement N]
TransactionContext message

  message Transaction
  {
    required TransactionContext transaction_context = 1;
    repeated Statement statement = 2;
  }

  message TransactionContext
  {
    required uint32 server_id = 1;       /* Unique identifier of a server */
    required uint64 transaction_id = 2;  /* Channel-unique transaction ID */
    required uint64 start_timestamp = 3; /* Timestamp of when the transaction started */
    required uint64 end_timestamp = 4;   /* Timestamp of when the transaction ended */
    optional uint32 channel_id = 5;      /* Scope of uniqueness of transaction ID */
  }

Would you add additional fields? user_id? session_id? something else?
Add the fields as optional, recompile, and you can use those custom fields right away in your plugins
Now that's extensible!
  • 93. the Statement message format Required fields Type
  • 94. Start and end timestamps Optional SQL string
  • 95. Statement-dependent fields For DML: header and data message
  • 96. For DDL: submessage representing a DDL statement [Diagram: a Statement message made up of Required Fields, Statement-dependent Fields, and an Optional SQL string]
  • 97. the Statement message message Statement { enum Type { ROLLBACK = 0; /* A ROLLBACK indicator */ INSERT = 1; /* An INSERT statement */ DELETE = 2; /* A DELETE statement */ UPDATE = 3; /* An UPDATE statement */ TRUNCATE_TABLE = 4; /* A TRUNCATE TABLE statement */ CREATE_SCHEMA = 5; /* A CREATE SCHEMA statement */ ALTER_SCHEMA = 6; /* An ALTER SCHEMA statement */ DROP_SCHEMA = 7; /* A DROP SCHEMA statement */ CREATE_TABLE = 8; /* A CREATE TABLE statement */ ALTER_TABLE = 9; /* An ALTER TABLE statement */ DROP_TABLE = 10; /* A DROP TABLE statement */ SET_VARIABLE = 98; /* A SET statement */ RAW_SQL = 99; /* A raw SQL statement */ } required Type type = 1; /* The type of the Statement */ required uint64 start_timestamp = 2; /* Nanosecond precision timestamp of when the Statement was started on the server */ required uint64 end_timestamp = 3; /* Nanosecond precision timestamp of when the Statement finished executing on the server */ optional string sql = 4; /* May contain the original SQL string */ /* ... (cont'd on later slide) */ }
  • 98. getting data from the message For data fields in a message, to get the value of the field, simply call the method with the same name as the field: message::Transaction &transaction= getSomeTransaction(); const message::TransactionContext &trx_ctx= transaction.transaction_context(); cout << "Transaction ID: " << trx_ctx.transaction_id() << endl; Enumerations are also easily used: message::Statement::Type type= statement.type(); switch (type) { case message::Statement::INSERT: /* do something for an insert... */ break; case message::Statement::UPDATE: /* do something for an update... */ break; }
  • 99. accessing a repeated element Elements in a repeated field are accessed via an index, and a $fieldname_size() method returns the number of elements: using namespace drizzled; const message::Transaction &transaction= getSomeTransaction(); /* Get the number of elements in the repeated field */ size_t num_statements= transaction.statement_size(); for (size_t x= 0; x < num_statements; ++x) { /* Access the element via the 0-based index */ const message::Statement &statement= transaction.statement(x); /* For optional fields, a has_$fieldname() method is available to check for existence */ if (statement.has_sql()) { cout << statement.sql() << endl; } }
  • 100. the specific Statement message message Statement { /* ... cont'd from a previous slide */ /* * Each Statement message may contain one or more of * the below sub-messages, depending on the Statement's type. */ optional InsertHeader insert_header = 5; optional InsertData insert_data = 6; optional UpdateHeader update_header = 7; optional UpdateData update_data = 8; optional DeleteHeader delete_header = 9; optional DeleteData delete_data = 10; optional TruncateTableStatement truncate_table_statement = 11; optional CreateSchemaStatement create_schema_statement = 12; optional DropSchemaStatement drop_schema_statement = 13; optional AlterSchemaStatement alter_schema_statement = 14; optional CreateTableStatement create_table_statement = 15; optional AlterTableStatement alter_table_statement = 16; optional DropTableStatement drop_table_statement = 17; optional SetVariableStatement set_variable_statement = 18; } Example: for an INSERT SQL statement, the Statement message will contain an insert_header and insert_data field
  • 101. insert header and data messages /* * Represents statements which insert data into the database: * * INSERT * INSERT SELECT * LOAD DATA INFILE * REPLACE (is a delete and an insert) * * @note * * Bulk insert operations will have >1 data segment, with the last data * segment having its end_segment member set to true. */ message InsertHeader { required TableMetadata table_metadata = 1; /* Metadata about the table affected */ repeated FieldMetadata field_metadata = 2; /* Metadata about fields affected */ } message InsertData { required uint32 segment_id = 1; /* The segment number */ required bool end_segment = 2; /* Is this the final segment? */ repeated InsertRecord record = 3; /* The records inserted */ } /* * Represents a single record being inserted into a single table. */ message InsertRecord { repeated bytes insert_value = 1; }
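The note about bulk inserts spanning more than one data segment suggests that a consumer has to keep collecting records until it sees end_segment set to true. A minimal sketch of that idea follows, assuming segments of one bulk insert arrive in order; the structs are hand-written stand-ins for the generated classes, and BulkInsertCollector and makeSegment() are hypothetical names that do not exist in Drizzle.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-ins mirroring the InsertRecord / InsertData fields.
struct InsertRecord {
  std::vector<std::string> insert_value;  // one raw value per affected field
};

struct InsertData {
  uint32_t segment_id;
  bool end_segment;
  std::vector<InsertRecord> record;
};

// Accumulates the records of one bulk insert across its data segments.
// ASSUMPTION: segments are handed over in the order they were produced.
class BulkInsertCollector {
public:
  // Appends the segment's records; returns true once the final segment
  // (end_segment == true) has been added.
  bool addSegment(const InsertData &segment) {
    assert(!complete_);  // nothing should follow the final segment
    records_.insert(records_.end(), segment.record.begin(), segment.record.end());
    complete_ = segment.end_segment;
    return complete_;
  }

  bool complete() const { return complete_; }
  const std::vector<InsertRecord> &records() const { return records_; }

private:
  std::vector<InsertRecord> records_;
  bool complete_ = false;
};

// Hypothetical helper: builds a segment holding `rows` single-column records.
inline InsertData makeSegment(uint32_t segment_id, bool is_last, int rows) {
  InsertData segment{segment_id, is_last, {}};
  for (int i = 0; i < rows; ++i) {
    InsertRecord record;
    record.insert_value.push_back(std::to_string(i));
    segment.record.push_back(record);
  }
  return segment;
}
```

A caller would feed each InsertData it encounters to addSegment() and only act on records() once the call returns true.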
  • 102. tip: statement_transform Looking for examples of how to use the Transaction and Statement messages?
  • 103. The /drizzled/message/transaction.proto file has extensive documentation
  • 104. Also check out the statement_transform library in /drizzled/message/statement_transform.cc
  • 105. Shows how to construct SQL statements from the information in a Transaction message
  • 106. The statement_transform library is used in utility programs such as /drizzled/message/table_raw_reader.cc
  • 107. Code walkthrough of the Filtered Replicator module
  • 108. replicators can filter/transform plugin::TransactionReplicator's function is to replicate the Transaction message to the plugin::TransactionApplier in a replication stream
  • 109. You can filter or transform a Transaction message before passing it off to the applier
  • 110. Only one method in the API: /** * Replicate a Transaction message to a TransactionApplier. * * @param Pointer to the applier of the command message * @param Reference to the current session * @param Transaction message to be replicated */ virtual ReplicationReturnCode replicate(TransactionApplier *in_applier, Session &session, message::Transaction &to_replicate)= 0;
  • 111. module overview Allows filtering of transaction messages by schema name or table name We construct a new transaction message containing only Statement messages that have not been filtered Includes support for the use of regular expressions
  • 112. Schemas and tables to filter are specified in system variables filtered_replicator_filteredschemas
  • 114. module initialization Very similar to what we saw with the default replicator: static FilteredReplicator *filtered_replicator= NULL; static int init(plugin::Context &context) { filtered_replicator= new(std::nothrow) FilteredReplicator("filtered_replicator", sysvar_filtered_replicator_sch_filters, sysvar_filtered_replicator_tab_filters); if (filtered_replicator == NULL) { return 1; } context.add(filtered_replicator); return 0; }
  • 115. obtaining schema/table name For each statement in the transaction message, we obtain the schema name and table name in the parseStatementTableMetadata method: void parseStatementTableMetadata(const message::Statement &in_statement, string &in_schema_name, string &in_table_name) const { switch (in_statement.type()) { case message::Statement::INSERT: { const message::TableMetadata &metadata= in_statement.insert_header().table_metadata(); in_schema_name.assign(metadata.schema_name()); in_table_name.assign(metadata.table_name()); break; } case message::Statement::UPDATE: … } }
  • 116. filtering by schema name We search through the list of schemas to filter to see if there is a match: pthread_mutex_lock(&sch_vector_lock); vector<string>::iterator it= find(schemas_to_filter.begin(), schemas_to_filter.end(), schema_name); if (it != schemas_to_filter.end()) { pthread_mutex_unlock(&sch_vector_lock); return true; } pthread_mutex_unlock(&sch_vector_lock);
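The lookup above has to unlock the mutex on two separate paths. As a hedged alternative, the sketch below swaps the pthread calls for C++11 std::mutex with std::lock_guard, so the lock is released automatically on every return. This is a substitution made for illustration and not the module's actual code; the SchemaFilter class, toLower() helper, and schemaFilterExample() function are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <mutex>
#include <string>
#include <vector>

// Lower-cases a copy of the input. The unsigned char parameter avoids
// undefined behaviour when std::tolower is handed a negative char value.
inline std::string toLower(std::string text) {
  std::transform(text.begin(), text.end(), text.begin(),
                 [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
  return text;
}

// Hypothetical stand-in for the schema-filtering part of the replicator.
class SchemaFilter {
public:
  explicit SchemaFilter(const std::vector<std::string> &schemas) {
    for (const std::string &name : schemas) {
      schemas_to_filter_.push_back(toLower(name));  // stored in lower case
    }
  }

  bool isSchemaFiltered(const std::string &schema_name) const {
    const std::string needle = toLower(schema_name);
    // The guard unlocks in its destructor, whichever way we leave the scope.
    std::lock_guard<std::mutex> guard(sch_vector_lock_);
    return std::find(schemas_to_filter_.begin(), schemas_to_filter_.end(), needle) !=
           schemas_to_filter_.end();
  }

private:
  mutable std::mutex sch_vector_lock_;
  std::vector<std::string> schemas_to_filter_;
};

// Usage example: matching is case-insensitive because both sides are lowered.
inline void schemaFilterExample() {
  SchemaFilter filter({"Test", "scratch"});
  assert(filter.isSchemaFiltered("TEST"));
  assert(!filter.isSchemaFiltered("production"));
}
```

The lock still matters only if the filter list can change while statements are being filtered, for example when a system variable is updated at runtime.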
  • 117. filtering Statements Schema and table name are converted to lower case since we store the list of schemas and tables to filter in lower case
  • 118. If neither matches a filtering condition, we add the statement to our new filtered transaction: /* convert schema name and table name to lower case */ std::transform(schema_name.begin(), schema_name.end(), schema_name.begin(), ::tolower); std::transform(table_name.begin(), table_name.end(), table_name.begin(), ::tolower); if (! isSchemaFiltered(schema_name) && ! isTableFiltered(table_name)) { message::Statement *s= filtered_transaction.add_statement(); *s= statement; /* copy assign */ }
  • 119. pass Transaction on to applier Finally, we pass on our filtered transaction to an applier: if (filtered_transaction.statement_size() > 0) { /* * We can now simply call the applier's apply() method, passing * along the filtered transaction. */ message::TransactionContext *tc= filtered_transaction.mutable_transaction_context(); *tc= to_replicate.transaction_context(); /* copy assign */ return in_applier->apply(in_session, filtered_transaction); }
  • 120. Code walkthrough of the Transaction Log module
  • 121. appliers can log/analyze/apply plugin::TransactionApplier's function is to apply the Transaction message to some target or analyze the transaction in some way
  • 122. You cannot modify the Transaction message. If you need to modify the message, you likely should be using TransactionReplicator::replicate(). Only one method in the API: /** * Applies a Transaction message to some target * * @param Reference to the current session * @param Transaction message to be applied */ virtual ReplicationReturnCode apply(Session &session, const message::Transaction &to_apply)= 0;
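To make the shape of an applier concrete, here is a minimal sketch of one that only analyzes what it receives by counting transactions and statements. Everything inside the sketch namespace is a hand-written stand-in for the drizzled types, reduced to what the example needs; CountingApplier and countingApplierExample() are hypothetical names, and the real base class has more to it than this single method.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

namespace sketch {

// Hypothetical stand-ins for the drizzled types used by the applier API.
struct Session {};
struct Statement { std::string sql; };
struct Transaction { std::vector<Statement> statements; };
enum ReplicationReturnCode { SUCCESS = 0 };

// Stand-in for plugin::TransactionApplier: one pure virtual method.
class TransactionApplier {
public:
  virtual ~TransactionApplier() {}
  virtual ReplicationReturnCode apply(Session &session,
                                      const Transaction &to_apply) = 0;
};

// An "analyzing" applier: the message is const, so it can be read but not changed.
// Not thread-safe as written; a real plugin would need to guard the counters.
class CountingApplier : public TransactionApplier {
public:
  ReplicationReturnCode apply(Session &, const Transaction &to_apply) override {
    ++num_transactions_;
    num_statements_ += to_apply.statements.size();
    return SUCCESS;
  }

  size_t numTransactions() const { return num_transactions_; }
  size_t numStatements() const { return num_statements_; }

private:
  size_t num_transactions_ = 0;
  size_t num_statements_ = 0;
};

// Usage example: a replicator would hold the applier through the base pointer.
inline void countingApplierExample() {
  CountingApplier counter;
  TransactionApplier *applier = &counter;
  Session session;
  Transaction transaction;
  transaction.statements.push_back(Statement{"INSERT INTO t1 VALUES (1)"});
  transaction.statements.push_back(Statement{"DELETE FROM t1"});
  assert(applier->apply(session, transaction) == SUCCESS);
  assert(counter.numStatements() == 2);
}

}  // namespace sketch
```

The const reference in apply() is what enforces the "cannot modify" rule at compile time; any transformation belongs one step earlier, in a replicator.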
  • 123. module overview Provides a log of compressed, serialized Transaction messages
  • 124. Supports checksumming of written messages
  • 125. Flexible file sync behaviour, similar to innodb_flush_log_at_trx_commit. Uses a scoreboard of write buffers to minimize memory usage
  • 126. Components are all plugin examples TransactionApplier, Data Dictionary, user-defined Functions
  • 127. transaction log components TransactionLogApplier vector<WriteBuffer> TransactionLog Data Dictionary TransactionLogView TransactionLogEntriesView TransactionLogTransactionsView TransactionLogIndex vector<TransactionLogIndexEntry> User Defined Functions HexdumpTransactionMessageFunction PrintTransactionMessageFunction
  • 128. code flow through module: TransactionLogApplier::apply(), TransactionLog::packTransactionIntoLogEntry(), TransactionLog::writeEntry(), TransactionLogIndex::addEntry(), MessageLite::SerializeWithCachedSizesToArray(), pwrite(). transaction log entry format: entry type (4 bytes), entry length (4 bytes), transaction message (variable # bytes), checksum (4 bytes)
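The entry layout above (type, length, message, checksum) can be sketched as a small packing routine. This is an illustration under stated assumptions, not the module's code: the slides do not give the byte order, so little-endian is assumed; the length field is assumed to count only the message bytes; and the CRC32 is assumed to cover only the message bytes. The names packLogEntry(), putUint32(), getUint32(), and sketchCrc32() are hypothetical, and a plain string stands in for the serialized Transaction message.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// 4-byte type header + 4-byte length header + 4-byte checksum trailer.
static const size_t HEADER_TRAILER_BYTES = 3 * sizeof(uint32_t);

// Plain bitwise CRC-32 (reflected polynomial 0xEDB88320).
inline uint32_t sketchCrc32(const uint8_t *data, size_t length) {
  uint32_t crc = 0xFFFFFFFFu;
  for (size_t i = 0; i < length; ++i) {
    crc ^= data[i];
    for (int bit = 0; bit < 8; ++bit) {
      if (crc & 1u) {
        crc = (crc >> 1) ^ 0xEDB88320u;
      } else {
        crc >>= 1;
      }
    }
  }
  return crc ^ 0xFFFFFFFFu;
}

// ASSUMPTION: integers are stored little-endian; the slides do not say.
inline void putUint32(uint8_t *destination, uint32_t value) {
  for (int i = 0; i < 4; ++i) {
    destination[i] = static_cast<uint8_t>((value >> (8 * i)) & 0xFFu);
  }
}

inline uint32_t getUint32(const uint8_t *source) {
  uint32_t value = 0;
  for (int i = 0; i < 4; ++i) {
    value |= static_cast<uint32_t>(source[i]) << (8 * i);
  }
  return value;
}

// Packs [type][length][message bytes][checksum] into one contiguous entry.
inline std::vector<uint8_t> packLogEntry(uint32_t entry_type,
                                         const std::string &message_bytes,
                                         uint32_t *checksum_out) {
  std::vector<uint8_t> entry(message_bytes.size() + HEADER_TRAILER_BYTES);
  uint8_t *cursor = entry.data();

  putUint32(cursor, entry_type);
  cursor += sizeof(uint32_t);
  putUint32(cursor, static_cast<uint32_t>(message_bytes.size()));
  cursor += sizeof(uint32_t);

  if (!message_bytes.empty()) {
    std::memcpy(cursor, message_bytes.data(), message_bytes.size());
  }
  const uint32_t checksum = sketchCrc32(cursor, message_bytes.size());
  cursor += message_bytes.size();

  putUint32(cursor, checksum);
  if (checksum_out != nullptr) {
    *checksum_out = checksum;
  }
  return entry;
}
```

Keeping the length in the header lets a reader skip over an entry without parsing the message, and the trailing checksum can be verified after the payload has been read.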
  • 129. TransactionLogApplier header class TransactionLogApplier: public drizzled::plugin::TransactionApplier { public: TransactionLogApplier(const std::string name_arg, TransactionLog *in_transaction_log, uint32_t in_num_write_buffers); /** Destructor */ ~TransactionLogApplier(); /** * Applies a Transaction to the transaction log * * @param Session descriptor * @param Transaction message to be replicated */ drizzled::plugin::ReplicationReturnCode apply(drizzled::Session &in_session, const drizzled::message::Transaction &to_apply); private: TransactionLog &transaction_log; /* This Applier owns the memory of the associated TransactionLog - so we have to track it. */ TransactionLog *transaction_log_ptr; uint32_t num_write_buffers; ///< Number of write buffers used std::vector<WriteBuffer *> write_buffers; ///< array of write buffers /** * Returns the write buffer for the supplied session * * @param Session descriptor */ WriteBuffer *getWriteBuffer(const drizzled::Session &session); };
  • 130. TransactionLog header class TransactionLog { public: static size_t getLogEntrySize(const drizzled::message::Transaction &trx); uint8_t *packTransactionIntoLogEntry(const drizzled::message::Transaction &trx, uint8_t *buffer, uint32_t *checksum_out); off_t writeEntry(const uint8_t *data, size_t data_length); private: static const uint32_t HEADER_TRAILER_BYTES= sizeof(uint32_t) + /* 4-byte msg type header */ sizeof(uint32_t) + /* 4-byte length header */ sizeof(uint32_t); /* 4 byte checksum trailer */ int syncLogFile(); int log_file; ///< Handle for our log file drizzled::atomic<off_t> log_offset; ///< Offset in log file where we write next entry uint32_t sync_method; ///< Determines behaviour of syncing log file time_t last_sync_time; ///< Last time the log file was synced bool do_checksum; ///< Do a CRC32 checksum when writing Transaction message to log? };
  • 131. TransactionLogApplier::apply() plugin::ReplicationReturnCode TransactionLogApplier::apply(Session &in_session, const message::Transaction &to_apply) { size_t entry_size= TransactionLog::getLogEntrySize(to_apply); WriteBuffer *write_buffer= getWriteBuffer(in_session); uint32_t checksum; write_buffer->lock(); write_buffer->resize(entry_size); uint8_t *bytes= write_buffer->getRawBytes(); bytes= transaction_log.packTransactionIntoLogEntry(to_apply, bytes, &checksum); off_t written_to= transaction_log.writeEntry(bytes, entry_size); write_buffer->unlock(); /* Add an entry to the index describing what was just applied */ transaction_log_index->addEntry(TransactionLogEntry(ReplicationServices::TRANSACTION, written_to, entry_size), to_apply, checksum); return plugin::SUCCESS; }
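The apply() method above relies on getWriteBuffer() to pick a buffer for the session, but the slides do not show how that choice is made. One plausible scheme, sketched below, is a fixed pool indexed by session ID modulo the pool size, so memory stays bounded however many sessions exist. The modulo mapping is an assumption, as are the WriteBufferScoreboard class, the getCapacity() method, and the use of std::mutex; only the lock(), unlock(), resize(), and getRawBytes() names come from the slide.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <mutex>
#include <vector>

// Hypothetical stand-in for the module's WriteBuffer.
class WriteBuffer {
public:
  void lock() { mutex_.lock(); }
  void unlock() { mutex_.unlock(); }

  // Grows when needed and never shrinks, so one allocation is reused
  // by every later entry that fits.
  void resize(size_t new_size) {
    if (bytes_.size() < new_size) {
      bytes_.resize(new_size);
    }
  }

  uint8_t *getRawBytes() { return bytes_.data(); }
  size_t getCapacity() const { return bytes_.size(); }

private:
  std::mutex mutex_;
  std::vector<uint8_t> bytes_;
};

// A fixed pool ("scoreboard") of buffers shared by all sessions.
class WriteBufferScoreboard {
public:
  explicit WriteBufferScoreboard(uint32_t num_write_buffers) {
    assert(num_write_buffers > 0);
    for (uint32_t i = 0; i < num_write_buffers; ++i) {
      buffers_.push_back(std::unique_ptr<WriteBuffer>(new WriteBuffer()));
    }
  }

  // ASSUMPTION: sessions map to buffers by session ID modulo pool size.
  WriteBuffer *getWriteBuffer(uint64_t session_id) {
    const size_t slot = static_cast<size_t>(session_id % buffers_.size());
    return buffers_[slot].get();
  }

private:
  std::vector<std::unique_ptr<WriteBuffer> > buffers_;
};
```

Two sessions that land on the same slot simply take turns, which is why apply() brackets its use of the buffer with lock() and unlock().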
  • 132. What's up with the Publisher and Subscriber plugins?
  • 133. we need your input These plugins' APIs are still being developed
  • 134. The idea is for responsibility to be divided like so: plugin::Publisher will be responsible for describing the state of each replication channel and communicating with subscribers on separate ports
  • 135. plugin::Subscriber will be responsible for pulling data from a plugin::Publisher and applying that data to a replica node
  • 136. rabbitmq and replication Developed by Marcus Eriksson (http://developian.com). Can replicate externally or internally. External: by reading the Drizzle transaction log and sending logs to RabbitMQ. A multi-threaded applier constructs SQL statements from transaction messages in log files on the replica. Internal: via a C++ plugin in /plugin/rabbitmq/
  • 139. Cassandra Applier Plugin Simple plugin that applies transactions to a Cassandra keyspace http://posulliv.github.com/2010/06/01/replication-plugins.html Implements plugin::TransactionApplier
  • 140. Sends inserts/updates/deletes to a pre-specified keyspace
  • 142. Similar plugin could be developed for any database solution (PostgreSQL?)
  • 143. A Memcached Query Cache Google Summer of Code project
  • 144. Two students Djellel Difallah
  • 145. Siddharth Singh Uses plugin::TransactionApplier and plugin::QueryCache to implement a query cache with fine-grained invalidation MySQL query cache has coarse invalidation plugin::TransactionApplier API uses the row-based Transaction message to determine tuple ranges that must be invalidated
  • 146. Summary & More Info Slides on SlideShare http://www.slideshare.net/posullivan/developing-drizzle-replication-plugins Lots of plugin examples, probably the best way to start filtered_replicator
  • 148. cassandra_applier If you are interested or have any input or suggestions, the mailing list is a great place to start a discussion (http://launchpad.net/~drizzle-discuss)