structuredmerge Logo by Aboling0, CC BY-SA 4.0

☯️ Markdown::Merge

Version GitHub tag (latest SemVer) License: AGPL-3.0-only OR PolyForm-Small-Business-1.0.0 Downloads Rank CI Current

if ci_badges.map(&:color).detect { it != "green"} ☝️ let me know, as I may have missed the discord notification.


if ci_badges.map(&:color).all? { it == "green"} 👇️ send money so I can do more of this. FLOSS maintenance is now my full-time job.

Sponsor Me on Github Liberapay Goal Progress Donate on PayPal Buy me a coffee Donate at ko-fi.com

👣 How will this project approach the September 2025 hostile takeover of RubyGems? 🚑️

I've summarized my thoughts in this blog post.

🌻 Synopsis

Galtzo FLOSS Logo by Aboling0, CC BY-SA 4.0 ruby-lang Logo, Yukihiro Matsumoto, Ruby Visual Identity Team, CC BY-SA 2.5

Markdown::Merge provides intelligent Markdown file merging using tree_haver backends. It can be used standalone or through parser-specific wrappers.

Direct usage (with auto-detected or specified backend):

require "markdown/merge"

# Auto-detect available backend (commonmarker or markly)
merger = Markdown::Merge::SmartMerger.new(template_content, dest_content)
result = merger.merge

# Or specify a backend explicitly
merger = Markdown::Merge::SmartMerger.new(template_content, dest_content, backend: :markly)
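
Auto-detection can be pictured as a first-match search over candidate libraries. The sketch below is plain Ruby for illustration only: `detect_backend` is a hypothetical helper, not part of markdown-merge or tree_haver, and the real detection logic may differ.

```ruby
# Hypothetical helper: returns the first library that can be required, as a
# Symbol, or nil when none is installed. Not the gem's actual detection code.
def detect_backend(candidates = %w[commonmarker markly])
  found = candidates.find do |lib|
    require lib
    true
  rescue LoadError
    false
  end
  found&.to_sym
end

backend = detect_backend
puts backend ? "Using #{backend}" : "No Markdown backend installed"
```

Passing `backend:` explicitly avoids depending on whichever library happens to be installed first.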

Via parser-specific wrappers (for hard dependencies and backend-specific defaults): see Parser-Specific Wrappers under Configuration.

Key Features

  • Multiple Backends: Supports Commonmarker and Markly through tree_haver's unified API
  • Type Normalization: Canonical node types (:heading, :paragraph, etc.) work across all backends
  • Extensible: Register custom backends via NodeTypeNormalizer.register_backend
  • Structure-Aware: Understands headings, paragraphs, lists, code blocks, tables, and other block elements
  • Freeze Block Support: Respects freeze markers (default: markdown-merge:freeze / markdown-merge:unfreeze) for template merge control - customizable to match your project's conventions
  • Inner-Merge Code Blocks: Optionally merge fenced code blocks using language-specific mergers (Ruby → prism-merge, YAML → psych-merge, JSON → json-merge, TOML → toml-merge)
  • Table Match Refiner: Fuzzy matching algorithm for tables with similar but not identical headers
  • Full Provenance: Tracks origin of every node
  • Customizable:
    • backend - select :commonmarker, :markly, or :auto
    • signature_generator - callable custom signature generators
    • preference - setting of :template, :destination, or a Hash for per-node-type preferences
    • add_template_only_nodes - setting to retain sections that do not exist in destination
    • freeze_token - customize freeze block markers (default: "markdown-merge")
    • inner_merge_code_blocks - enable language-aware code block merging
    • match_refiner - fuzzy matching for unmatched nodes (e.g., TableMatchRefiner)
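
As a rough illustration of the preference option, here is how a Symbol or a per-node-type Hash could be resolved for a single node. `resolve_preference` is a hypothetical helper, and the `:default` key is an assumption made for this sketch; check the gem's documentation for the Hash shape it actually accepts.

```ruby
# Hypothetical helper: resolves which side wins for one node type.
# A Symbol applies to every node; a Hash is looked up per node type.
# The :default key and the :destination fallback are assumptions.
def resolve_preference(preference, node_type)
  case preference
  when Symbol then preference
  when Hash then preference.fetch(node_type) { preference.fetch(:default, :destination) }
  else raise ArgumentError, "preference must be a Symbol or Hash"
  end
end

prefs = {default: :destination, table: :template}
resolve_preference(prefs, :table)     # => :template
resolve_preference(prefs, :paragraph) # => :destination
resolve_preference(:template, :table) # => :template
```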

Supported Node Types

Signatures computed by default for common Markdown block elements:

| Node Type | Signature Format | Matching Behavior |
|-----------|------------------|-------------------|
| Heading | [:heading, level, text] | Headings match by level and text content |
| Paragraph | [:paragraph, content_hash] | Paragraphs match by content hash |
| List | [:list, type, item_count] | Lists match by type (bullet/ordered) and item count |
| Code Block | [:code_block, language, content_hash] | Code blocks match by language and content |
| Block Quote | [:blockquote, content_hash] | Block quotes match by content hash |
| Table | [:table, row_count, header_hash] | Tables match by structure and header content |
| HTML Block | [:html, content_hash] | HTML blocks match by content hash |
| Thematic Break | [:hrule] | Horizontal rules always match |
| Footnote Definition | [:footnote_definition, label] | Footnotes match by label/name |
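
To make the signature idea concrete, the sketch below builds signatures of the shapes listed above for a few node types. `SketchNode`, `content_hash`, and `signature_for` are hypothetical stand-ins, and the SHA-256 digest is an assumption; the gem's own hashing and node classes may differ.

```ruby
require "digest"

# Hypothetical stand-in for a parsed node; real nodes come from the backend.
SketchNode = Struct.new(:type, :level, :text, :language, keyword_init: true)

# Short, stable digest of a node's text. SHA-256 is an assumption here.
def content_hash(text)
  Digest::SHA256.hexdigest(text)[0, 16]
end

# Returns a signature array for known types, or nil for anything else.
def signature_for(node)
  case node.type
  when :heading then [:heading, node.level, node.text]
  when :paragraph then [:paragraph, content_hash(node.text)]
  when :code_block then [:code_block, node.language, content_hash(node.text)]
  when :hrule then [:hrule]
  end
end

a = SketchNode.new(type: :heading, level: 2, text: "Usage")
b = SketchNode.new(type: :heading, level: 2, text: "Usage")
c = SketchNode.new(type: :heading, level: 3, text: "Usage")

signature_for(a) == signature_for(b) # => true  (same level and text)
signature_for(a) == signature_for(c) # => false (different level)
```

Two nodes are candidates for matching only when their signatures are equal, which is why a paragraph that changes by a single character no longer matches its counterpart.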

💡 Info you can shake a stick at

Tokens to Remember Gem name Gem namespace
Works with MRI Ruby 4 Ruby current Compat
Support & Community Join Me on Daily.dev's RubyFriends Live Chat on Discord Get help from me on Upwork Get help from me on Codementor
Source Source on GitLab.com Source on CodeBerg.org Source on Github.com The best SHA: dQw4w9WgXcQ!
Documentation Current release on RubyDoc.info YARD on Galtzo.com Maintainer Blog GitLab Wiki GitHub Wiki
Compliance License: AGPL-3.0-only OR PolyForm-Small-Business-1.0.0 Apache license compatibility: Category X 📄ilo-declaration-img Security Policy Contributor Covenant 2.1 SemVer 2.0.0
Style Enforced Code Style Linter Keep-A-Changelog 1.0.0 Gitmoji Commits Compatibility appraised by: appraisal2
Maintainer 🎖️ Follow Me on LinkedIn Follow Me on Ruby.Social Follow Me on Bluesky Contact Maintainer My technical writing
... 💖 Find Me on WellFound: Find Me on CrunchBase My LinkTree More About Me 🧊 🐙 🛖 🧪

Compatibility

Compatible with MRI Ruby 4.0.0+ and concordant releases of JRuby and TruffleRuby. CI workflows and Appraisals are generated for MRI Ruby 4.0.0+. This test floor is configured by ruby.test_minimum in .kettle-jem.yml and may be higher than the gem's runtime compatibility floor when legacy Rubies are not practical for the current toolchain.

kettle-dev Logo by Aboling0, CC BY-SA 4.0

The amazing test matrix is powered by the kettle-dev stack.

How kettle-dev manages complexity in tests
| Gem | Source | Role | Daily download rank |
|-----|--------|------|---------------------|
| appraisal2 | GitHub | multi-dependency Appraisal matrix generation | Daily download rank for appraisal2 |
| appraisal2-rubocop | GitHub | RuboCop Appraisal generator integration | Daily download rank for appraisal2-rubocop |
| kettle-dev | GitHub | development, release, and CI workflow tooling | Daily download rank for kettle-dev |
| kettle-jem | GitHub | Appraisals & CI workflow templates | Daily download rank for kettle-jem |
| kettle-soup-cover | GitHub | SimpleCov coverage policy and reporting | Daily download rank for kettle-soup-cover |
| kettle-test | GitHub | standard test runner and coverage harness | Daily download rank for kettle-test |
| rubocop-lts | GitHub | Ruby-version-aware linting | Daily download rank for rubocop-lts |
| turbo_tests2 | GitHub | parallel test execution | Daily download rank for turbo_tests2 |

✨ Installation

Install the gem and add it to the application's Gemfile by executing:

bundle add markdown-merge

If bundler is not being used to manage dependencies, install the gem by executing:

gem install markdown-merge

⚙️ Configuration

SmartMerger Configuration

The SmartMerger class is the main entry point for merging Markdown files:

require "markdown/merge"

merger = Markdown::Merge::SmartMerger.new(
  template_content,
  dest_content,

  # Backend selection (default: :auto)
  # :auto - auto-detect available backend (tries commonmarker first, then markly)
  # :commonmarker - use Commonmarker (comrak Rust parser)
  # :markly - use Markly (cmark-gfm C library)
  backend: :auto,

  # Which version to prefer when nodes match but differ
  # :destination (default) - keep destination content (preserves customizations)
  # :template - use template content (applies updates)
  preference: :destination,

  # Whether to add template-only nodes to the result
  # false (default) - only include sections that exist in destination
  # true - include all template sections
  add_template_only_nodes: false,

  # Token for freeze block markers
  # Default: "markdown-merge"
  # Looks for: <!-- markdown-merge:freeze --> / <!-- markdown-merge:unfreeze -->
  freeze_token: "markdown-merge",

  # Enable inner-merge for fenced code blocks
  # false (default) - use standard conflict resolution for code blocks
  # true - merge code block contents using language-specific mergers
  # CodeBlockMerger instance - use custom CodeBlockMerger
  inner_merge_code_blocks: false,

  # Match refiner for fuzzy matching of unmatched nodes
  # nil (default) - exact matching only
  # TableMatchRefiner.new - enable fuzzy table matching
  match_refiner: nil,

  # Custom signature generator (optional)
  # Receives a node (wrapped with canonical merge_type), returns a signature array or nil
  # Return the node itself to fall through to default signature
  signature_generator: nil,

  # Backend-specific options (passed through to parser)
  # For commonmarker: options: {}
  # For markly: flags: Markly::DEFAULT, extensions: [:table]
)
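
The interplay between preference and add_template_only_nodes can be pictured with a toy model. This is not the gem's algorithm: `Section` and `toy_merge` are hypothetical, each section is matched by a precomputed signature, and appending template-only sections at the end is an assumption of the sketch.

```ruby
# Hypothetical section: a signature used for matching plus its source text.
Section = Struct.new(:signature, :text)

# Toy merge: walks the destination, swaps in the template version of a matched
# section when preference is :template, and optionally appends sections that
# exist only in the template.
def toy_merge(template, destination, preference: :destination, add_template_only_nodes: false)
  template_by_signature = template.to_h { |section| [section.signature, section] }
  destination_signatures = destination.map(&:signature)

  merged = destination.map do |dest_section|
    match = template_by_signature[dest_section.signature]
    (match && preference == :template) ? match : dest_section
  end

  if add_template_only_nodes
    merged.concat(template.reject { |s| destination_signatures.include?(s.signature) })
  end

  merged.map(&:text)
end

template = [
  Section.new([:heading, 2, "Usage"], "## Usage\n\nUpdated text"),
  Section.new([:heading, 2, "FAQ"], "## FAQ"),
]
destination = [Section.new([:heading, 2, "Usage"], "## Usage\n\nCustom text")]

toy_merge(template, destination)
# => ["## Usage\n\nCustom text"]
toy_merge(template, destination, preference: :template)
# => ["## Usage\n\nUpdated text"]
toy_merge(template, destination, add_template_only_nodes: true)
# => ["## Usage\n\nCustom text", "## FAQ"]
```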

Text Matching Behavior

Important: When matching nodes by text content (such as for anchor patterns in PartialTemplateMerger), the .text method returns plain text without markdown formatting.

This means:

  • Markdown: ### The `*-merge` Gem Family
  • .text returns: "The *-merge Gem Family\n"

The backticks around *-merge are stripped because they are inline formatting, not content. This is true for both Commonmarker and Markly backends.

Anchor pattern examples:

# ❌ WRONG - backticks are stripped, so this won't match
anchor: { type: :heading, text: /`\*-merge` Gem Family/ }

# ✅ CORRECT - match the plain text content
anchor: { type: :heading, text: /\*-merge.*Gem Family/ }

# ✅ CORRECT - use beginning anchor for exact heading match
anchor: { type: :heading, text: /^The \*-merge Gem Family/ }
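
The three patterns can be checked in plain Ruby against the string that .text returns for this heading, without loading any parser:

```ruby
# Plain text reported for the heading: ### The `*-merge` Gem Family
heading_text = "The *-merge Gem Family\n"

with_backticks = /`\*-merge` Gem Family/
plain = /\*-merge.*Gem Family/
anchored = /^The \*-merge Gem Family/

with_backticks.match?(heading_text) # => false
plain.match?(heading_text)          # => true
anchored.match?(heading_text)       # => true
```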

Other markdown formatting that is stripped from .text:

  • Bold: **text** → text
  • Italic: *text* or _text_ → text
  • Code: `code` → code
  • Links: [text](url) → text
  • Images: ![alt](src) → alt

Note: Different parsers may have other idiosyncrasies. For example:

  • Trailing newlines may or may not be present
  • Whitespace normalization may differ
  • Entity encoding may vary

Always test your patterns against actual parsed content when building merge recipes.
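
One way to reduce sensitivity to these differences when experimenting with patterns is to normalize whitespace before comparing. `normalize_text` below is a hypothetical helper for your own test scripts, not something the gem provides.

```ruby
# Hypothetical helper: collapses whitespace runs and trims both ends so a
# comparison does not depend on a backend's newline or spacing quirks.
def normalize_text(text)
  text.gsub(/\s+/, " ").strip
end

normalize_text("The *-merge  Gem Family\n") # => "The *-merge Gem Family"
normalize_text("The *-merge Gem Family") == normalize_text("The *-merge Gem Family\n") # => true
```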

Node Type Normalization

markdown-merge normalizes node types across backends so merge rules are portable:

# These are equivalent regardless of backend
# Markly's :header becomes :heading
# Markly's :hrule becomes :thematic_break
# etc.

# Register a custom backend's type mappings
Markdown::Merge::NodeTypeNormalizer.register_backend(:my_parser, {
  h1: :heading,
  h2: :heading,
  para: :paragraph,
  # ...
})
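
Conceptually, normalization is a per-backend lookup table. The toy registry below illustrates the idea in plain Ruby; `ToyNormalizer` is hypothetical and is not the gem's NodeTypeNormalizer, and passing unknown types through unchanged is an assumption of the sketch.

```ruby
# Toy registry: maps backend-specific node types to canonical ones.
module ToyNormalizer
  MAPPINGS = Hash.new { |hash, backend| hash[backend] = {} }

  # Adds or extends the mapping table for one backend.
  def self.register_backend(backend, mappings)
    MAPPINGS[backend].merge!(mappings)
  end

  # Returns the canonical type, or the original type when no mapping exists.
  def self.canonical_type(backend, type)
    MAPPINGS[backend].fetch(type, type)
  end
end

ToyNormalizer.register_backend(:markly, {header: :heading, hrule: :thematic_break})
ToyNormalizer.register_backend(:my_parser, {h1: :heading, h2: :heading, para: :paragraph})

ToyNormalizer.canonical_type(:markly, :header)  # => :heading
ToyNormalizer.canonical_type(:my_parser, :para) # => :paragraph
ToyNormalizer.canonical_type(:markly, :table)   # => :table (passed through)
```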

Parser-Specific Wrappers

For convenience, parser-specific wrappers provide backend-specific defaults:

# commonmarker-merge (freeze_token: "commonmarker-merge", inner_merge_code_blocks: false)
require "commonmarker/merge"
merger = Commonmarker::Merge::SmartMerger.new(template, dest, options: {})

# markly-merge (freeze_token: "markly-merge", inner_merge_code_blocks: true)
require "markly/merge"
merger = Markly::Merge::SmartMerger.new(template, dest, flags: Markly::DEFAULT, extensions: [:table])

Freeze Blocks

Freeze blocks protect sections from being modified during merges. They are marked with HTML comments that are invisible when the Markdown is rendered:

<!-- markdown-merge:freeze -->

## This Section Is Protected

Any content here will be preserved exactly as-is during merges.
The merge tool will not modify, replace, or remove this content.

<!-- markdown-merge:unfreeze -->

Add an optional reason after the freeze marker to document why the section is frozen:

<!-- markdown-merge:freeze Custom table - manually maintained -->
| Feature | Status |
|---------|--------|
| Custom  ||
<!-- markdown-merge:unfreeze -->
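
For intuition, freeze markers can be located with a simple line scan. The sketch below is not the gem's parser: `FreezeBlock` and `find_freeze_blocks` are hypothetical, and the regular expressions assume each marker sits alone on its own line.

```ruby
# Hypothetical record of one frozen region (1-based line numbers).
FreezeBlock = Struct.new(:start_line, :end_line, :reason)

# Scans the source line by line and pairs each freeze marker with the next
# unfreeze marker. Unterminated blocks are ignored in this sketch.
def find_freeze_blocks(source, token: "markdown-merge")
  open_re = /\A\s*<!--\s*#{Regexp.escape(token)}:freeze\b(.*?)-->\s*\z/
  close_re = /\A\s*<!--\s*#{Regexp.escape(token)}:unfreeze\s*-->\s*\z/
  blocks = []
  current = nil

  source.each_line.with_index(1) do |line, number|
    line = line.chomp
    if current.nil? && (match = open_re.match(line))
      reason = match[1].strip
      current = FreezeBlock.new(number, nil, reason.empty? ? nil : reason)
    elsif current && close_re.match?(line)
      current.end_line = number
      blocks << current
      current = nil
    end
  end

  blocks
end

doc = <<~MARKDOWN
  # Title

  <!-- markdown-merge:freeze Custom table - manually maintained -->
  | Feature | Status |
  |---------|--------|
  <!-- markdown-merge:unfreeze -->
MARKDOWN

block = find_freeze_blocks(doc).first
block.start_line # => 3
block.end_line   # => 6
block.reason     # => "Custom table - manually maintained"
```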

Inner-Merge Code Blocks

When enabled, fenced code blocks are merged using language-specific *-merge gems:

merger = SomeParser::Merge::SmartMerger.new(
  template,
  destination,
  inner_merge_code_blocks: true,
)

Supported languages and their mergers:

| Language | Fence Info | Merger |
|----------|------------|--------|
| Ruby | ruby, rb | prism-merge |
| YAML | yaml, yml | psych-merge |
| JSON | json | json-merge |
| TOML | toml | toml-merge |
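
The mapping from fence info to merger amounts to a lookup. The sketch below encodes the languages listed above; `merger_for` is a hypothetical helper, and reading only the first word of the info string and ignoring case are assumptions of the sketch rather than documented behavior.

```ruby
# Fence info strings and the merger gem responsible for each.
MERGER_BY_FENCE_INFO = {
  "ruby" => "prism-merge",
  "rb" => "prism-merge",
  "yaml" => "psych-merge",
  "yml" => "psych-merge",
  "json" => "json-merge",
  "toml" => "toml-merge",
}.freeze

# Returns the merger gem name for a fence info string, or nil when the
# language is unsupported (such blocks would use standard resolution).
def merger_for(fence_info)
  language = fence_info.to_s.split.first
  language && MERGER_BY_FENCE_INFO[language.downcase]
end

merger_for("rb")     # => "prism-merge"
merger_for("yaml")   # => "psych-merge"
merger_for("python") # => nil
```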

Example with a Ruby code block:

```ruby
# Template
class MyClass
  def new_method
    puts "from template"
  end
end
```

When merged with:

```ruby
# Destination
class MyClass
  def existing_method
    puts "custom"
  end
end
```

Result (with inner_merge_code_blocks: true):

```ruby
class MyClass
  def existing_method
    puts "custom"
  end

  def new_method
    puts "from template"
  end
end
```

Table Match Refiner

When tables don't match by exact signature, the TableMatchRefiner uses fuzzy matching to pair tables with similar structure:

refiner = Markdown::Merge::TableMatchRefiner.new(
  threshold: 0.5,  # Minimum similarity (0.0-1.0)
  algorithm_options: {
    weights: {
      header_match: 0.25,  # Header cell similarity
      first_column: 0.20,  # Row label similarity
      row_content: 0.25,   # Row content overlap
      total_cells: 0.15,   # Overall cell matching
      position: 0.15,      # Position distance
    },
  },
)

merger = SomeParser::Merge::SmartMerger.new(
  template,
  destination,
  match_refiner: refiner,
)
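
To see how the weights and threshold relate, the sketch below combines per-component scores into one number, assuming a plain weighted sum. That assumption, and the helper names `weighted_similarity` and `tables_match?`, are illustrative only; the refiner's real scoring may be computed differently.

```ruby
# Weights copied from the configuration example above; they sum to 1.0.
WEIGHTS = {
  header_match: 0.25,
  first_column: 0.20,
  row_content: 0.25,
  total_cells: 0.15,
  position: 0.15,
}.freeze

# Each component score is expected in 0.0..1.0; missing ones count as 0.0.
def weighted_similarity(scores, weights: WEIGHTS)
  weights.sum { |name, weight| weight * scores.fetch(name, 0.0) }
end

# A pair of tables is accepted when the combined score reaches the threshold.
def tables_match?(scores, threshold: 0.5)
  weighted_similarity(scores) >= threshold
end

scores = {header_match: 1.0, first_column: 0.5, row_content: 0.4, total_cells: 0.5, position: 1.0}
weighted_similarity(scores).round(3)    # => 0.675
tables_match?(scores)                   # => true
tables_match?(scores, threshold: 0.8)   # => false
```

Raising the threshold makes pairing stricter; with these weights, identical headers alone contribute only 0.25 and would not reach the default 0.5.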

Debug Logging

Enable debug logging to see merge decisions:

export MARKDOWN_MERGE_DEBUG=1

🔧 Basic Usage

Note: This gem provides base classes for implementers. End users should use commonmarker-merge or markly-merge instead.

For End Users

Use a parser-specific implementation:

Option 1: Using commonmarker-merge (Comrak/Rust)

require "commonmarker/merge"

template = File.read("template.md")
destination = File.read("destination.md")

merger = Commonmarker::Merge::SmartMerger.new(template, destination)
result = merger.merge

File.write("merged.md", result.content)

Option 2: Using markly-merge (libcmark-gfm/C)

require "markly/merge"

template = File.read("template.md")
destination = File.read("destination.md")

merger = Markly::Merge::SmartMerger.new(template, destination)
result = merger.merge

File.write("merged.md", result.to_markdown)

For Implementers

Creating a new parser-specific implementation:

require "markdown/merge"

module MyParser
  module Merge
    class FileAnalysis < Markdown::Merge::FileAnalysisBase
      def parse_document(source)
        # Parse source and return root document node
        MyParser.parse(source)
      end

      def next_sibling(node)
        # Return the next sibling of a node
        node.next_sibling
      end

      def compute_parser_signature(node)
        # Compute signature for parser-specific nodes
        # Or call super for default implementation
        super
      end
    end

    class SmartMerger < Markdown::Merge::SmartMergerBase
      def create_file_analysis(content, **options)
        FileAnalysis.new(content, **options)
      end

      def node_to_source(node, analysis)
        case node
        when Markdown::Merge::FreezeNode
          node.full_text
        else
          # Convert node back to source text
          node.to_markdown
        end
      end
    end
  end
end

Freeze Block Protection

Both implementations support freeze blocks for protecting customized sections:

# My Project

## Installation

<!-- markdown-merge:freeze Custom install instructions -->
This installation section has been customized and will be preserved
during template merges, regardless of what the template contains.
<!-- markdown-merge:unfreeze -->

## Usage

Standard usage section - can be updated from template.

Content between freeze markers is always preserved from the destination file, even when the template has different content for that section.

🔐 Security

See SECURITY.md.

🤝 Contributing

If you need some ideas of where to help, you could work on adding more code coverage, or if it is already 💯 (see below) check issues or PRs, or use the gem and think about how it could be better.

We Keep A Changelog so if you make changes, remember to update it.

See CONTRIBUTING.md for more detailed instructions.

📌 Versioning

This library follows Semantic Versioning 2.0.0 for its public API where practical. For most applications, prefer the Pessimistic Version Constraint with two digits of precision.

For example:

spec.add_dependency("markdown-merge", "~> 7.0")
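
What the two-digit pessimistic constraint admits can be checked with RubyGems itself. The version numbers below are arbitrary examples, not actual releases of this gem.

```ruby
require "rubygems"

# "~> 7.0" means >= 7.0 and < 8.0
requirement = Gem::Requirement.new("~> 7.0")

requirement.satisfied_by?(Gem::Version.new("7.0.0")) # => true
requirement.satisfied_by?(Gem::Version.new("7.4.2")) # => true
requirement.satisfied_by?(Gem::Version.new("8.0.0")) # => false
requirement.satisfied_by?(Gem::Version.new("6.9.9")) # => false
```
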
📌 Is "Platform Support" part of the public API? More details inside.

Dropping support for a platform can be a breaking change for affected users. If a release changes supported platforms, it should be called out clearly in the changelog and versioned with that impact in mind.

To get a better understanding of how SemVer is intended to work over a project's lifetime, read this article from the creator of SemVer:

See CHANGELOG.md for a list of releases.

📄 License

The gem is available under the following licenses: AGPL-3.0-only, PolyForm-Small-Business-1.0.0. See LICENSE.md for details.

If none of the available licenses suit your use case, please contact us to discuss a custom commercial license.