Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka
Consulting
Kafka & Avro: Confluent Schema Registry
Managing Record Schema in Kafka
Confluent Schema Registry
❖ Confluent Schema Registry stores Avro schemas for Kafka clients
❖ Provides a REST interface for registering and retrieving Avro schemas
❖ Stores a versioned history of schemas
❖ Lets you configure the compatibility setting
❖ Supports evolution of schemas
❖ Provides serializers that plug into Kafka clients and handle schema storage and retrieval for records serialized with Avro
Why Schema Registry?
❖ Producer creates a record/message, which is an Avro record
❖ The record contains both the schema and the data
❖ Schema Registry Avro Serializer serializes the data and the schema ID (just the ID, not the full schema)
❖ The serializer keeps a cache mapping registered schemas to their IDs
❖ Consumer receives the payload and deserializes it with the Schema Registry Avro Deserializer
❖ Deserializer looks up the full schema by ID, from its cache or from Schema Registry
❖ Consumer has its own schema, the one it expects the record/message to conform to
❖ A compatibility check is performed on the two schemas
❖ If they do not match but are compatible, the payload is transformed (aka Schema Evolution)
❖ If they are not compatible, deserialization fails
❖ Kafka records have a key and a value, and each can have its own schema
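The "just id" point can be made concrete with a small sketch. Confluent's serializers prefix each Avro-encoded payload with a magic byte of 0 and a 4-byte big-endian schema ID; the function names `frame` and `unframe` are made up for this illustration, and the Avro encoding itself is assumed to have happened elsewhere.

```python
import struct
from typing import Tuple

MAGIC_BYTE = 0  # first byte of the framed message


def frame(schema_id: int, avro_payload: bytes) -> bytes:
    """Prefix an already Avro-encoded payload with the magic byte and schema ID."""
    return struct.pack(">bI", MAGIC_BYTE, schema_id) + avro_payload


def unframe(message: bytes) -> Tuple[int, bytes]:
    """Split a framed message back into (schema_id, avro_payload)."""
    if len(message) < 5:
        raise ValueError("message too short to carry a schema ID")
    magic, schema_id = struct.unpack(">bI", message[:5])
    if magic != MAGIC_BYTE:
        raise ValueError(f"unexpected magic byte: {magic}")
    return schema_id, message[5:]
```

The consumer side would take the ID returned by `unframe`, look up the writer's schema (cache first, then Schema Registry), and only then decode the payload.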
Schema Compatibility
❖ Backward compatibility (default)
❖ A new, backward-compatible schema will not break consumers: it can read data written with the older schema
❖ Producers can keep using an older schema as long as the Consumer's newer schema is backward compatible with it
❖ Forward compatibility
❖ Records sent with a new, forward-compatible schema can be deserialized with older schemas
❖ Consumers can stay on an older schema and never be updated (maybe they never need the new fields)
❖ Full compatibility
❖ A new version of a schema is both backward and forward compatible
❖ None
❖ Schemas are not validated for compatibility at all
Schema Registry Config
❖ Compatibility can be configured globally or per schema (subject)
❖ Options are:
❖ NONE - don't check for schema compatibility
❖ FORWARD - make sure the latest registered schema can read data produced with the new schema
❖ BACKWARD (default) - make sure the new schema can read data produced with the latest registered schema
❖ FULL - make sure the new schema is both forward and backward compatible with the latest registered schema
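A toy model of the four levels may help. This is a sketch, not the registry's real check: `can_read` and `is_allowed` are hypothetical names, schemas are plain dicts in Avro's JSON shape, and the check only looks at field names and defaults (it ignores types, aliases and unions).

```python
def can_read(reader_schema: dict, writer_schema: dict) -> bool:
    """Toy rule: every reader field was either written or has a default."""
    written = {f["name"] for f in writer_schema["fields"]}
    return all(f["name"] in written or "default" in f
               for f in reader_schema["fields"])


def is_allowed(level: str, new_schema: dict, latest_schema: dict) -> bool:
    """Would this level accept new_schema, given the latest registered one?"""
    backward = can_read(new_schema, latest_schema)  # new schema reads old data
    forward = can_read(latest_schema, new_schema)   # old schema reads new data
    if level == "NONE":
        return True
    if level == "BACKWARD":
        return backward
    if level == "FORWARD":
        return forward
    if level == "FULL":
        return backward and forward
    raise ValueError(f"unknown compatibility level: {level}")
```

Under this model, adding a field without a default passes FORWARD but fails BACKWARD and FULL; adding it with a default passes all four.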
Schema Registry Actions
❖ Register schemas for key and values of Kafka records
❖ List schemas (subjects)
❖ List all versions of a subject (schema)
❖ Retrieve a schema by version or ID
❖ Get the latest version of a schema
❖ Check whether a schema is compatible with a certain version
❖ Get the compatibility level setting of the Schema Registry
❖ BACKWARD, FORWARD, FULL, NONE
❖ Set the compatibility level for a subject/schema
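Each action above corresponds to a REST call. The sketch below maps them to HTTP method and path as commonly documented for the Schema Registry REST API; the action names and the `endpoint` helper are invented for this illustration, and no request is actually sent.

```python
from urllib.parse import quote

# (HTTP method, path template) per action; the keys are hypothetical labels.
ACTIONS = {
    "register_schema": ("POST", "/subjects/{subject}/versions"),
    "list_subjects": ("GET", "/subjects"),
    "list_versions": ("GET", "/subjects/{subject}/versions"),
    "get_by_version": ("GET", "/subjects/{subject}/versions/{version}"),
    "get_by_id": ("GET", "/schemas/ids/{id}"),
    "check_compatibility": ("POST", "/compatibility/subjects/{subject}/versions/{version}"),
    "get_global_config": ("GET", "/config"),
    "set_subject_config": ("PUT", "/config/{subject}"),
}


def endpoint(action: str, **params) -> tuple:
    """Return (method, path) for an action, URL-encoding each path parameter."""
    method, template = ACTIONS[action]
    safe = {key: quote(str(value), safe="") for key, value in params.items()}
    return method, template.format(**safe)
```

For example, `endpoint("get_by_version", subject="Employee", version="latest")` yields the call for "get latest version of schema".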
Schema Evolution
❖ If an Avro schema is changed after data has been written to a store using an older version of that schema, Avro might perform a schema evolution when that data is read
❖ Schema evolution is the automatic transformation of a record from one version of an Avro schema to another
❖ The transformation is between the Consumer's schema version and the version the Producer used when writing to the Kafka log
❖ When the Consumer schema is not identical to the Producer schema used to serialize the Kafka record, a data transformation is performed on the record's key or value
❖ If the schemas match, no transformation is needed
❖ Schema evolution happens only during deserialization, at the Consumer
❖ If the Consumer's schema differs from the Producer's schema, the value or key is automatically modified during deserialization to conform to the Consumer's reader schema
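To see what the transformation involves, the sketch below compares a writer (Producer) schema with a reader (Consumer) schema and reports what would happen to each field. It is a simplification that assumes record schemas as plain dicts and matches fields by name only; `evolution_plan` is a hypothetical helper, not part of Avro.

```python
def evolution_plan(writer_schema: dict, reader_schema: dict) -> dict:
    """Classify fields by what deserialization with the reader schema would do."""
    writer = {f["name"]: f for f in writer_schema["fields"]}
    reader = {f["name"]: f for f in reader_schema["fields"]}
    plan = {"copied": [], "defaulted": [], "dropped": [], "unresolvable": []}
    for name, field in reader.items():
        if name in writer:
            plan["copied"].append(name)        # present on both sides
        elif "default" in field:
            plan["defaulted"].append(name)     # filled from the reader's default
        else:
            plan["unresolvable"].append(name)  # reader needs it, nothing to use
    # Fields the writer sent that the reader does not know about are discarded.
    plan["dropped"] = [name for name in writer if name not in reader]
    return plan
```

When the two schemas are identical, every field lands in `copied` and nothing needs to change.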
Allowed Schema Modifications
❖ Add a field with a default
❖ Remove a field that had a default value
❖ Change a field's order attribute
❖ Change a field's default value
❖ Remove or add a field alias
❖ Remove or add a type alias
❖ Change a type to a union that contains original type
Rules of the Road for Modifying Schema
❖ Provide a default value for fields in your schema
❖ This allows you to delete the field later
❖ Don't change a field's data type
❖ When adding a new field to your schema, you must provide a default value for the field
❖ Don't rename an existing field
❖ You can add an alias instead
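These rules can be turned into a quick pre-flight check before registering a new version. The sketch below is deliberately strict and simplified: `rule_violations` is a hypothetical helper, it flags any type change (even widening to a union), and a rename shows up as one removed field plus one added field.

```python
def rule_violations(old_schema: dict, new_schema: dict) -> list:
    """List the ways new_schema breaks the rules of the road relative to old_schema."""
    old = {f["name"]: f for f in old_schema["fields"]}
    new = {f["name"]: f for f in new_schema["fields"]}
    problems = []
    for name, field in new.items():
        if name not in old and "default" not in field:
            problems.append(f"added field {name!r} has no default")
    for name, field in old.items():
        if name not in new:
            if "default" not in field:
                problems.append(f"removed field {name!r} had no default")
        elif new[name]["type"] != field["type"]:
            problems.append(f"field {name!r} changed type")
    return problems
```

An empty list means the change follows the rules as modeled here; it is not a substitute for the registry's own compatibility check.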
Remember our example
Employee
Avro covered in Avro/Kafka Tutorial
Let's say
❖ Employee did not have an age field in version 1 of the schema
❖ Later we decided to add an age field with a default value of -1
❖ Now let's say we have a Producer using version 2 and a Consumer using version 1
Scenario: adding a new field age with a default value
❖ Producer uses version 2 of the Employee schema, creates a com.cloudurable.Employee record, sets the age field to 42, and sends it to the Kafka topic new-employees
❖ Consumer consumes records from new-employees using version 1 of the Employee schema
❖ Since the Consumer is using version 1 of the schema, the age field is removed during deserialization
❖ The same Consumer modifies the name field and then writes the record to a NoSQL store
❖ When it does this, the age field is missing from the value that it writes to the store
❖ Another client using version 2 reads the record from the NoSQL store
❖ The age field is missing from the record (because the Consumer wrote it with version 1), so age is set to the default value of -1
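The round trip can be replayed in a few lines. This is a simulation on plain dicts, not real Avro decoding: the schemas contain only the two fields the scenario mentions, the employee name is made up, and `read_as` is a hypothetical stand-in for deserializing with a given reader schema.

```python
V1 = {"name": "Employee", "fields": [{"name": "name", "type": "string"}]}
V2 = {"name": "Employee", "fields": [
    {"name": "name", "type": "string"},
    {"name": "age", "type": "int", "default": -1},
]}


def read_as(record: dict, reader_schema: dict) -> dict:
    """Keep only the reader's fields; fill absent ones from the reader's defaults."""
    result = {}
    for field in reader_schema["fields"]:
        if field["name"] in record:
            result[field["name"]] = record[field["name"]]
        else:
            result[field["name"]] = field["default"]  # KeyError if no default exists
    return result


produced = {"name": "Ada", "age": 42}  # Producer, schema version 2
consumed = read_as(produced, V1)       # Consumer, version 1: age is dropped
consumed["name"] = "Ada L."            # Consumer modifies the name field
stored = dict(consumed)                # written to the NoSQL store without age
reread = read_as(stored, V2)           # another client, version 2
print(reread)  # {'name': 'Ada L.', 'age': -1}
```

The original age of 42 is gone for good: the default only papers over the missing field.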
Schema Registry Actions
❖ Register schemas for key and values of Kafka records
❖ List schemas (subjects)
❖ List all versions of a subject (schema)
❖ Retrieve a schema by version or ID
❖ Get the latest version of a schema
❖ Check whether a schema is compatible with a certain version
❖ Get the compatibility level setting of the Schema Registry
❖ BACKWARD, FORWARD, FULL, NONE
❖ Set the compatibility level for a subject/schema
Register a Schema
curl -X POST -H "Content-Type: application/vnd.schemaregistry.v1+json" \
  --data '{"schema": "{\"type\": …}"}' \
  http://localhost:8081/subjects/Employee/versions
{"id":2}
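The quoting in the request body is easy to get wrong: the Avro schema travels as a JSON string inside a JSON object, so it is encoded twice. A minimal sketch of building the same request in Python follows; `registration_request` is a hypothetical helper, and the request object is only constructed here, not sent.

```python
import json
from urllib.request import Request


def registration_request(base_url: str, subject: str, avro_schema: dict) -> Request:
    """Build the POST that registers avro_schema under subject."""
    # Inner dumps turns the schema into a string; outer dumps wraps and escapes it.
    body = json.dumps({"schema": json.dumps(avro_schema)}).encode("utf-8")
    return Request(
        url=f"{base_url}/subjects/{subject}/versions",
        data=body,
        method="POST",
        headers={"Content-Type": "application/vnd.schemaregistry.v1+json"},
    )
```

Passing the result to `urllib.request.urlopen` against a running registry would be expected to return a body like the `{"id":2}` shown on the slide.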
List All Schemas
curl -X GET http://localhost:8081/subjects
["Employee","Employee2","FooBar"]
Working with versions
[1,2,3,4,5]
{"subject":"Employee","version":2,"id":4,"schema":"{\"type\":\"record\",\"name\":\"Employee\",\"namespace\":\"com.cloudurable.phonebook\", …
{"subject":"Employee","version":1,"id":3,"schema":"{\"type\":\"record\",\"name\":\"Employee\",\"namespace\":\"com.cloudurable.phonebook\", …
Working with Schemas
Changing Compatibility Checks
Incompatible Change
{"error_code":409,"message":"Schema being registered is incompatible with an e…
Incompatible Change
{"is_compatible":false}
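The two "Incompatible Change" slides show two different response shapes: an error body with code 409 when registering, and an `is_compatible` flag when only checking. A small sketch of telling them apart follows; `describe` is a hypothetical helper and the message text in the usage note is a placeholder, since the slide's message is cut off.

```python
import json


def describe(response_body: str) -> str:
    """Summarize a Schema Registry response body from the two slides above."""
    doc = json.loads(response_body)
    if "is_compatible" in doc:
        # Result of a compatibility check: nothing was registered.
        return "compatible" if doc["is_compatible"] else "incompatible"
    if doc.get("error_code") == 409:
        # Registration was attempted and rejected by the compatibility setting.
        return "registration rejected: " + doc.get("message", "")
    return "unrecognized response"
```

For instance, `describe('{"is_compatible":false}')` gives `"incompatible"`, while a 409 body such as `'{"error_code":409,"message":"placeholder"}'` gives `"registration rejected: placeholder"`.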
Use Schema Registry
❖ Start up the Schema Registry server, pointing it at the ZooKeeper cluster
❖ Import the Kafka Avro Serializer and Avro JARs
❖ Configure Producer to use Schema Registry
❖ Use KafkaAvroSerializer from Producer
❖ Configure Consumer to use Schema Registry
❖ Use KafkaAvroDeserializer from Consumer
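The producer and consumer steps mostly come down to a handful of client properties. The sketch below writes them as Python dicts and renders them as Java-style properties text. The broker address, the group ID and the choice of String keys are assumptions made for illustration; the serializer class names and `schema.registry.url` are the standard Confluent settings.

```python
SCHEMA_REGISTRY_URL = "http://localhost:8081"  # the listener port used in this deck
BOOTSTRAP_SERVERS = "localhost:9092"           # assumption: a local broker

producer_props = {
    "bootstrap.servers": BOOTSTRAP_SERVERS,
    "key.serializer": "org.apache.kafka.common.serialization.StringSerializer",
    "value.serializer": "io.confluent.kafka.serializers.KafkaAvroSerializer",
    "schema.registry.url": SCHEMA_REGISTRY_URL,
}

consumer_props = {
    "bootstrap.servers": BOOTSTRAP_SERVERS,
    "group.id": "employee-readers",  # hypothetical consumer group
    "key.deserializer": "org.apache.kafka.common.serialization.StringDeserializer",
    "value.deserializer": "io.confluent.kafka.serializers.KafkaAvroDeserializer",
    "schema.registry.url": SCHEMA_REGISTRY_URL,
}


def to_properties(props: dict) -> str:
    """Render a settings dict as key=value lines, one per property."""
    return "\n".join(f"{key}={value}" for key, value in props.items())
```

Both sides point at the same `schema.registry.url`, which is how the serializer's schema ID can be resolved by the deserializer.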
Start up Schema Registry Server
cat ~/tools/confluent-3.2.1/etc/schema-registry/schema-registry.properties
listeners=http://0.0.0.0:8081
kafkastore.connection.url=localhost:2181
kafkastore.topic=_schemas
debug=false
Import Kafka Avro Serializer & Avro Jars
Configure Producer to use Schema Registry
Use KafkaAvroSerializer from Producer
Configure Consumer to use Schema Registry
Use KafkaAvroDeserializer from Consumer
Schema Registry
❖ Confluent provides Schema Registry to manage Avro schemas for Kafka Consumers and Producers
❖ Avro provides schema migration
❖ Schema Registry uses schema compatibility checks to see whether the Producer and Consumer schemas are compatible, and schema evolution is performed if needed
❖ Use KafkaAvroSerializer from Producer
❖ Use KafkaAvroDeserializer from Consumer
