
Confluence Data Center to Notion Migration: The On-Prem Exit Path

Confluence Data Center to Notion migration has no native path. Covers extraction methods, Notion API limits, format translation, and what breaks.

Raaj Raaj · · 15 min read

Migrating from Confluence Data Center to Notion is not the same as migrating from Confluence Cloud to Notion. The standard Atlassian migration funnel (CCMA to Cloud, then Notion's native importer) does not apply to most DC environments. DC instances have no clean export-to-Notion path, typically carry far more data volume and macro complexity than Cloud tenants, and were usually self-hosted for reasons that a move to a US-hosted SaaS platform can directly contradict.

This guide covers the real extraction methods, the hard API constraints on both sides, what survives the translation between Confluence's XHTML storage format and Notion's block model, and what doesn't.

If you're weighing the architectural differences between the two platforms before committing, read our Notion vs. Confluence (2026) comparison first. For the Cloud-to-Notion path and Notion's native importer limits, see our Confluence to Notion migration limits and macro guide.

Why Confluence Data Center to Notion Is a Different Beast

Three things make this migration fundamentally harder than Cloud-to-Notion:

No native migration path exists. The Confluence Cloud Migration Assistant (CCMA) only moves data from Server/DC to Confluence Cloud — it cannot target third-party platforms. (support.atlassian.com) Notion's built-in Confluence importer offers a ZIP upload path and an API import path. The ZIP path accepts HTML exports from any Confluence instance (including DC) but is capped at roughly 5GB. The API path connects directly to a Confluence instance and handles up to ~30GB or ~50,000 pages — but it requires network access to your server, which most DC deployments behind firewalls cannot provide. (notion.com)

DC instances are bigger and more complex. Enterprise DC deployments routinely have 50,000–500,000+ pages across hundreds of spaces, heavy use of custom and Marketplace macros (draw.io, Scroll Versions, ScriptRunner), and space-level permission schemes that Cloud tenants rarely match. The data volume alone changes the engineering approach. Notion warns that large imports may fail beyond ~50,000 pages or 30GB, and that most macros and plugins degrade. (notion.com)

You chose self-hosting for a reason. DC customers typically selected on-prem deployment for data sovereignty, regulatory compliance (GDPR, ITAR, FedRAMP), or air-gapped environments. Moving that content to Notion — a SaaS platform hosted on AWS — requires explicitly addressing those original concerns. Notion offers EU data residency for Enterprise plans, but that is not equivalent to controlling your own infrastructure. With Atlassian Data Center reaching end of life on March 28, 2029, the pressure to move is real, but the destination needs to satisfy the same constraints that put you on DC in the first place.

Data Extraction From Confluence Data Center: XML vs REST API vs Direct Database

Getting data out of a DC instance is the first engineering decision. There are three real options, each with different trade-offs. For a deeper dive on all export methods, see our complete Confluence export guide.

XML Space Export

Confluence's native XML export produces a ZIP archive containing every page, comment, and attachment in a space. It's the simplest extraction method and requires only the Export Space permission. (confluence.atlassian.com)

What it drops silently:

  • Page-level restrictions are permission-dependent. If a site admin runs the export, all content is included — even restricted pages. If a space admin runs it, restricted content the admin can't view is silently excluded with no error message.
  • Team Calendars are excluded entirely.
  • Large spaces with many attachments can fail on certain file systems. Confluence copies attachments to a temp directory during export, and file systems like ext3 have subdirectory limits that large spaces can exceed. Atlassian's KB cites failures when a space has more than 32,000 pages with attachments on ext3, producing a cryptic "Error creating temp file in folder" failure. (support.atlassian.com)

Warning

XML exports are space-by-space. There is no native "export all spaces" button. For a 200-space DC instance, this means 200 separate export operations — each of which queues behind any other running export job.

The XML format is also designed for Confluence-to-Confluence restore, not third-party transformation. Parsing it to extract granular user mapping or feed a Notion pipeline is awkward — it's restore-oriented data, not a clean page-by-page transform format.

REST API Extraction

The Confluence REST API (/rest/api/content) gives you programmatic control over what you extract: page bodies in storage format, metadata, labels, attachments, and child page hierarchies. This is the most flexible option for building a migration pipeline and the default for most serious projects.

A minimal page-body fetch:

# PAT, BASE, and ID are shell variables set beforehand; double quotes let the shell expand them
curl -H "Authorization: Bearer $PAT" \
  "$BASE/rest/api/content/$ID?expand=body.storage,version"

DC-specific gotchas:

  • Rate limiting is admin-configurable and uses a token bucket model. Admins set a requests-per-interval limit (e.g., 100 requests per hour) that applies per node in a clustered deployment. If rate limiting is enabled, your extraction scripts will hit HTTP 429 responses. The retry-after header tells you how long to wait. (confluence.atlassian.com)
  • Rate limiting targets external REST API requests only. Requests from within the Confluence UI are not throttled. But the classification logic checks for specific headers and cookies — reverse proxy misconfiguration (e.g., Nginx stripping the Referer header) can cause internal UI requests to be misclassified as external, triggering rate limits on regular users.
  • Version differences matter. Confluence 6.x has a more limited REST API surface and no personal access tokens (PATs); expect older auth and more brittle operating conditions. PATs arrive in Confluence 7.9 — that is the practical breakpoint for safer scripted extraction. (confluence.atlassian.com) Rate limiting was introduced in the 7.x line. Confluence 8.x has broader API capabilities and better rate limit headers (x-ratelimit-limit, x-ratelimit-remaining), but 8.0 removes Hibernate2-based plugin access — any extractor or legacy utility leaning on old persistence assumptions needs review. (developer.atlassian.com)
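
The 429 handling described above can be sketched as a small helper. This is a minimal illustration, not Atlassian's recommended client logic: the function name `retry_delay`, the `base` and `cap` defaults, and the fallback to exponential backoff when the header is missing or unparseable are all assumptions made for the example.

```python
from typing import Mapping, Optional


def retry_delay(status: int, headers: Mapping[str, str], attempt: int,
                base: float = 1.0, cap: float = 60.0) -> Optional[float]:
    """Return how many seconds to wait before retrying, or None if no retry is needed."""
    if status != 429:
        return None
    # HTTP header names are case-insensitive, so normalise before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    retry_after = lowered.get("retry-after")
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # not a plain number of seconds; fall through to backoff
    # No usable header (assumed fallback): exponential backoff with a cap.
    return min(cap, base * 2 ** attempt)
```

An extraction loop would call this after every response and sleep for the returned value before re-issuing the same request. Keeping the decision in a pure function makes it easy to test without touching a live instance.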

Direct Database Extraction

For DC instances with 100K+ pages, the fastest extraction method is querying the underlying PostgreSQL or MySQL database directly. Page content lives in the BODYCONTENT table, linked to CONTENT (pages) via CONTENTID. Attachments are stored on the file system under <confluence-home>/attachments/.

Why this is hard:

  • Confluence's body storage format changed from wiki markup to XHTML-based storage format in Confluence 4.0. If your DC instance was upgraded from a very old version, historical page versions may still contain wiki markup wrapped in an unmigrated-wiki-markup macro.
  • The storage format is XML with custom Atlassian-namespaced elements (ac:structured-macro, ac:image, ac:link, ri:attachment). Standard HTML parsers will choke. You need a parser that understands Atlassian's XHTML extensions. (confluence.atlassian.com)
  • Attachment file paths use Confluence's internal directory structure (ver003/<space-id>/<page-id>/<attachment-id>/), not human-readable filenames. You need the database to map file paths back to page context.
  • Atlassian does not document direct SQL extraction as a public migration interface. Internal persistence assumptions change across major versions. This is an engineering project, not a feature toggle.

Method | Best for | Speed | Completeness | Complexity
XML Export | <10K pages, simple spaces | Slow (queued) | High, minus restrictions | Low
REST API | 10K–100K pages | Medium | High with correct permissions | Medium
Direct DB | 100K+ pages | Fast | Highest (with file system access) | High
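
For the direct database route, the join between CONTENT and BODYCONTENT on CONTENTID might look like the sketch below. Only those two table names and the CONTENTID link come from the description above. The other columns (TITLE, BODY, CONTENTTYPE, PREVVER, CONTENT_STATUS) and the filters for current versions are assumptions to verify against your Confluence version's schema. An in-memory SQLite database stands in for PostgreSQL or MySQL so the example runs on its own.

```python
import sqlite3

# Assumed filter: current, non-historical pages only. Verify column names
# and status values against your own Confluence schema before relying on this.
CURRENT_PAGES_SQL = """
SELECT c.CONTENTID, c.TITLE, b.BODY
FROM CONTENT c
JOIN BODYCONTENT b ON b.CONTENTID = c.CONTENTID
WHERE c.CONTENTTYPE = 'PAGE'
  AND c.PREVVER IS NULL
  AND c.CONTENT_STATUS = 'current'
ORDER BY c.CONTENTID
"""


def fetch_current_pages(conn):
    """Return (content_id, title, storage_body) rows for current pages."""
    return conn.execute(CURRENT_PAGES_SQL).fetchall()


def demo_connection():
    """Build a tiny stand-in database with the assumed columns."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE CONTENT (CONTENTID INTEGER PRIMARY KEY, TITLE TEXT,
                          CONTENTTYPE TEXT, PREVVER INTEGER, CONTENT_STATUS TEXT);
    CREATE TABLE BODYCONTENT (BODYCONTENTID INTEGER PRIMARY KEY,
                              CONTENTID INTEGER, BODY TEXT);
    INSERT INTO CONTENT VALUES (1, 'Runbook', 'PAGE', NULL, 'current');
    INSERT INTO CONTENT VALUES (2, 'Runbook', 'PAGE', 1, 'current');
    INSERT INTO CONTENT VALUES (3, 'Old page', 'PAGE', NULL, 'deleted');
    INSERT INTO BODYCONTENT VALUES (10, 1, '<p>Restart the node.</p>');
    INSERT INTO BODYCONTENT VALUES (11, 2, '<p>Old text.</p>');
    INSERT INTO BODYCONTENT VALUES (12, 3, '<p>Gone.</p>');
    """)
    return conn


pages = fetch_current_pages(demo_connection())
```

In the stand-in data, row 2 plays the role of a historical version and row 3 a deleted page, so only the first row is returned. Run queries like this against a read replica or a restored backup rather than the live production database.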

The Notion API Ingest Bottleneck: Rate Limits and Block Ceilings

Getting data into Notion is where most migration scripts break. The Notion API has some of the tightest ingest constraints in the SaaS ecosystem, and there is no native bulk import endpoint for custom data payloads.

Hard limits (developers.notion.com):

  • 3 requests/second average rate limit per integration token. Bursts above this are allowed briefly, but sustained throughput is capped. Exceeding it returns HTTP 429 with a Retry-After header.
  • 100 block children per append request. A single PATCH /v1/blocks/{id}/children call accepts a maximum of 100 blocks. A long Confluence page with 300 paragraphs requires at least 3 API calls just for the body content. (developers.notion.com)
  • 2,000 characters per rich text object. Any text block exceeding this must be split into multiple rich text objects. Confluence pages with large unbroken paragraphs or code blocks will hit this.
  • 2 levels of nesting per request. The API allows at most two levels of nested children in a single append call. Confluence supports virtually unlimited nesting (a list inside a panel inside a table). Reconstructing deeply nested structures requires recursive API calls — each consuming rate limit budget.
  • 500KB request payload ceiling and a 1,000-block element maximum per request.
  • File uploads are more nuanced than the often-repeated "5MB limit." Current Notion API docs show that free-plan bots are capped at 5MB per file, while paid workspaces can upload small files (up to 20MB) via a two-step process and larger files (up to 5GB) via multi-part uploads. You still need an attachment pipeline, because Confluence inline files often exceed the simplest upload path. (developers.notion.com)

Info

The math on a 200K-page migration: At 3 requests/second, you get ~10,800 API calls per hour. Each page requires at minimum: 1 call to create the page + 1–3 calls to append body content + 1 call per attachment. A conservative estimate of 5 calls per page × 200,000 pages = 1,000,000 API calls. At 10,800/hour, that's ~93 hours of continuous API throughput — assuming zero errors, zero retries, and no attachment uploads. In practice, expect 2–4 weeks for a large DC migration using a single integration token.
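
The same arithmetic can be parameterized to test other page counts or token counts. This is a best-case estimator under the assumptions stated above (no errors, no retries). The function name and the `tokens` parameter are illustrative, and it assumes throughput scales linearly with the number of integration tokens.

```python
def estimate_hours(pages: int, calls_per_page: float,
                   requests_per_second: float = 3.0, tokens: int = 1) -> float:
    """Best-case hours of continuous API throughput for a migration."""
    total_calls = pages * calls_per_page
    # Assumes each token gets its own independent rate-limit budget.
    calls_per_hour = requests_per_second * 3600 * tokens
    return total_calls / calls_per_hour


single_token = estimate_hours(200_000, 5)          # the 200K-page case above
four_tokens = estimate_hours(200_000, 5, tokens=4)
```

With the figures from the text, `single_token` comes out to roughly 93 hours. Treat the result as a floor for planning, not a schedule.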

Parallelization strategies:

  • Use multiple integration tokens (each with its own 3 req/s budget) authorized against different teamspaces.
  • Batch block creation — always fill each append request to the 100-block maximum.
  • Queue and retry with exponential backoff on 429 responses. Keep retries idempotent.
  • Pre-process content to minimize API calls: resolve all text splitting, nesting flattening, and attachment staging before hitting the Notion write boundary. The more work you do upstream of the API, the faster the migration finishes.
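
Two of the pre-processing steps above, splitting long text and filling append batches, can be sketched as pure functions that run before any API call. The function names are hypothetical. The limits are the 2,000-character and 100-block figures quoted earlier. The split is a naive fixed-offset cut that ignores word boundaries and per-span formatting.

```python
from typing import Dict, Iterator, List

MAX_BLOCKS_PER_APPEND = 100   # block children per append request
MAX_RICH_TEXT_CHARS = 2000    # characters per rich text object


def split_rich_text(text: str, limit: int = MAX_RICH_TEXT_CHARS) -> List[Dict]:
    """Split one long string into consecutive rich text objects."""
    pieces = [text[i:i + limit] for i in range(0, len(text), limit)] or [""]
    return [{"type": "text", "text": {"content": piece}} for piece in pieces]


def batch_blocks(blocks: List[Dict],
                 size: int = MAX_BLOCKS_PER_APPEND) -> Iterator[List[Dict]]:
    """Yield full batches so each append request carries as many blocks as allowed."""
    for start in range(0, len(blocks), size):
        yield blocks[start:start + size]
```

A page with 250 converted blocks would yield batches of 100, 100, and 50, which is three append calls. Because both functions are deterministic, the whole transformation can be validated offline before spending any rate-limit budget.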

Translating Confluence Storage Format to Notion Blocks

Confluence DC stores content in an XHTML-based storage format with Atlassian-specific XML namespaces (ac:structured-macro, ri:attachment). Notion uses a block-based data model where every element — paragraph, heading, image, table row — is a discrete JSON block object. The two models overlap just enough to fool teams into underestimating the conversion layer. (confluence.atlassian.com)
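
A storage-format fragment uses the ac: and ri: prefixes without declaring them, so a plain XML parser rejects it as-is. One common workaround, sketched here with the standard library, is to wrap the fragment in a root element that declares both prefixes. The namespace URIs below are placeholders chosen for the example, and the helper names are hypothetical. The sketch also assumes the body contains no HTML named entities such as `&nbsp;`, which would need to be resolved first.

```python
import xml.etree.ElementTree as ET

# Placeholder URIs: any consistent values work for parsing purposes.
AC = "http://example.invalid/ac"
RI = "http://example.invalid/ri"


def parse_storage(fragment: str) -> ET.Element:
    """Wrap a storage-format fragment so the ac:/ri: prefixes resolve."""
    wrapped = f'<root xmlns:ac="{AC}" xmlns:ri="{RI}">{fragment}</root>'
    return ET.fromstring(wrapped)


def macro_names(root: ET.Element) -> list:
    """Names of all ac:structured-macro elements, in document order."""
    return [m.get(f"{{{AC}}}name") for m in root.iter(f"{{{AC}}}structured-macro")]


def attachment_filenames(root: ET.Element) -> list:
    """Filenames referenced through ri:attachment elements."""
    return [a.get(f"{{{RI}}}filename") for a in root.iter(f"{{{RI}}}attachment")]


sample = (
    "<p>Deploy steps</p>"
    '<ac:structured-macro ac:name="code"><ac:plain-text-body>'
    "<![CDATA[make deploy]]></ac:plain-text-body></ac:structured-macro>"
    '<ac:image><ri:attachment ri:filename="diagram.png" /></ac:image>'
)
root = parse_storage(sample)
```

For the sample fragment, `macro_names(root)` finds the single code macro and `attachment_filenames(root)` finds the one image attachment. That is enough to drive a per-element converter.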

What Translates Cleanly

  • Headings (<h1> through <h6>) → Notion heading_1, heading_2, heading_3 blocks. Notion only supports 3 heading levels, so <h4> through <h6> must be downgraded or converted to bold paragraphs.
  • Paragraphs (<p>) → paragraph blocks. Inline formatting (bold, italic, code, links) maps to Notion's rich text annotations.
  • Ordered and unordered lists (<ol>, <ul>) → numbered_list_item and bulleted_list_item blocks.
  • Task lists (<ac:task-list>) → to_do blocks.
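
The heading rule above can be sketched as a single mapping function. It takes the "convert to bold paragraph" option for levels 4 through 6, which is one of the two choices mentioned. The function name is hypothetical, and the block shape follows the general pattern of Notion block objects, so check it against the current API reference before use.

```python
def heading_block(level: int, text: str) -> dict:
    """Map an HTML heading level to a Notion block, degrading h4-h6 to bold text."""
    rich_text = [{"type": "text", "text": {"content": text}}]
    if 1 <= level <= 3:
        key = f"heading_{level}"
        return {"object": "block", "type": key, key: {"rich_text": rich_text}}
    # Levels 4-6 have no heading equivalent here: fall back to a bold paragraph.
    rich_text[0]["annotations"] = {"bold": True}
    return {"object": "block", "type": "paragraph",
            "paragraph": {"rich_text": rich_text}}
```

Downgrading every deep heading to heading_3 instead would keep them in the page outline at the cost of flattening the hierarchy. Which trade-off is better depends on how the source pages use deep headings.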

What Translates With Loss

  • Tables translate to Notion table blocks, but merged cells are lost (Notion doesn't support colspan or rowspan), column widths are lost (Notion auto-sizes), and colored rows/cells are lost (no cell-level styling). Notion's own docs warn that table styling is lost and complicated data can fail. (notion.com)
  • Code blocks (<ac:structured-macro ac:name="code">) translate to Notion code blocks. The language parameter maps in most cases, but Confluence supports languages Notion doesn't recognize — those default to plain text. Don't trust language metadata or display fidelity until you test representative pages.
  • Inline images (<ac:image>) require extraction from the DC instance's attachment store (referenced by ri:attachment with a filename), upload to Notion via the File Upload API, and embedding as image blocks. This is the single most time-consuming part of the migration pipeline. Do not assume an XHTML image reference automatically becomes a durable Notion asset.
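
The code-block language fallback described above can be sketched as a lookup with a plain-text default. The mapping entries are illustrative only, not a complete or verified list of either platform's language identifiers, and the function name is hypothetical.

```python
# Illustrative subset: Confluence code-macro language -> Notion code language.
LANGUAGE_MAP = {
    "py": "python",
    "python": "python",
    "js": "javascript",
    "javascript": "javascript",
    "bash": "bash",
    "sql": "sql",
    "java": "java",
    "xml": "xml",
}

FALLBACK_LANGUAGE = "plain text"


def notion_language(confluence_language) -> str:
    """Return the mapped language, or plain text when the value is missing or unknown."""
    if not confluence_language:
        return FALLBACK_LANGUAGE
    return LANGUAGE_MAP.get(confluence_language.strip().lower(), FALLBACK_LANGUAGE)
```

Logging every value that falls through to the fallback during a pilot run gives a concrete list of languages to add to the map or accept as plain text.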

What Has No Notion Equivalent

  • Page includes (include and excerpt-include macros): Notion has no transclusion. Referenced content must be inlined before migration or replaced with a link.
  • Jira issue macros: dynamic content pulling live data from Jira. Must be captured as a static snapshot or replaced with a link to the Jira issue.
  • draw.io / Gliffy diagrams: embedded as macro content in Confluence. Must be exported as PNG/SVG and re-uploaded as static images. Notion's troubleshooting docs explicitly call out Gliffy content as a known problem area. (notion.com)
  • Roadmap macros, Chart macros: dynamic Confluence features with no Notion parallel.
  • Section/column layout macros: Confluence supports multi-column layouts. Notion's API does expose column_list and column blocks, but they must be created together with their child content, and nested or irregular Confluence layouts rarely map one-to-one, so plan to flatten them.

For Data Center estates with custom Marketplace macros, macro resolution is its own workstream — not post-go-live cleanup.

What Dies in Migration: The Honest List

Every platform migration loses something. Here's what you cannot bring from Confluence DC to Notion:

Data Type | What Happens
Info/warning/note/tip panels | Convert to Notion callout blocks, but lose Confluence's color customization. All panels become generic callouts.
Status macros | No Notion equivalent. Convert to colored text or inline tags; color fidelity varies.
Page-level restrictions | Notion uses a different sharing model (workspace → teamspace → page sharing). Confluence's per-page, per-group view/edit restrictions do not map. Must be re-implemented manually. Notion confirms imported pages land private and permissions must be reapplied. (confluence.atlassian.com)
Confluence labels | No direct Notion equivalent for regular pages. Can be converted to database properties if pages are migrated as database entries. Lost entirely if migrated as standalone pages.
Page version history | No API to import historical versions. Only the current (latest) version of each page can be migrated. (notion.com)
Comments | Notion's built-in API importer can retain comments on the API import path. ZIP imports do not carry them. Custom public-API pipelines cannot import them because the Comments API is extremely limited. (notion.com)
Watch/notification subscriptions | No migration path. Users must re-follow pages manually.
Blog posts | Confluence treats blogs as distinct content types with date-based organization. Notion has no blog concept, so they become regular pages.

For a detailed breakdown of macro-specific losses, see our Confluence to Notion macro mapping guide.

Space-to-Workspace Architecture Mapping

Confluence DC organizes content into spaces — each with its own page tree, permissions, and admin. Enterprise instances commonly have 100–500+ spaces organized by team, project, department, or function.

Notion's hierarchy is workspace → teamspace → page tree. Two mapping strategies are sane:

Option A: One Confluence space = one Notion teamspace. Cleanest mapping. Each space gets its own teamspace with an isolated page tree and the simplest ownership model. The downside: 300 spaces means 300 teamspaces. Discoverability suffers, and full permission control on teamspaces requires a Notion Enterprise plan.

Option B: Consolidate related spaces into fewer teamspaces. Group related Confluence spaces (e.g., all engineering spaces into one "Engineering" teamspace) with top-level divider pages. Better navigation, but deeper page trees and less clean permission boundaries.

If a Confluence space already maps to a stable audience and owner group, give it its own teamspace. If it's mostly historical sprawl, consolidate and use the migration to clean house.

Permission model mismatch: Confluence space permissions operate at the space level with per-group granularity (view, edit, admin, export). Notion's sharing model is per-page or per-teamspace, with simpler permission levels (full access, can edit, can view, can comment). There is no way to replicate Confluence's group-based, action-specific permission matrix in Notion without manual configuration after migration.

Tip

Audit before you migrate. Archive inactive spaces. Merge overlapping spaces. The migration is your chance to clean up years of organic sprawl — don't replicate a messy structure into a new platform.

Migration Methods Compared

Manual Copy-Paste

Viable for: fewer than ~100 pages with simple content.

Strips all page hierarchy, metadata, and formatting beyond what clipboard paste preserves. No internal links, no attachments, no labels. This is content rescue, not migration.

Time cost: 2–5 minutes per page. 100 pages = 3–8 hours of manual work.

Notion's Built-In Importer

Notion's importer offers a ZIP upload path and an API import path. The API path connects directly to a Confluence instance and can retain comments and user mapping — an advantage over every other method. The ZIP path accepts exported HTML archives but does not carry comments. (notion.com)

Info

Built-in importer and public API are different tools. If you bypass the built-in importer and build directly against Notion's public API, you gain full control over structure and edge cases, but you also own block chunking, retries, attachment handling, and permission remapping. The built-in importer handles much of this automatically — within its documented limits.

For DC instances, the API import path requires network access to your Confluence server. Most DC deployments behind firewalls or VPNs cannot provide this without exposing the instance. The ZIP path works with DC exports but is limited to ~5GB. The API path handles up to ~30GB or ~50,000 pages; larger instances may fail.

If your DC estate fits inside those envelopes and you can provide network access, the built-in importer is the fastest starting point. If it doesn't, you need a custom pipeline.

Open-Source DIY Scripts

Tools like the confluence2notion Python script iterate over Confluence spaces via the REST API and recursively create Notion pages. They work for small to mid-size migrations but hit walls on DC-specific scenarios:

  • No direct DB extraction — they rely on the REST API, subject to DC rate limiting.
  • Limited macro handling — most scripts convert known macros (code blocks, panels) but silently drop unknown or custom macros.
  • No attachment pipeline — inline images and file attachments need separate download/upload logic that most scripts don't include.
  • No nesting management — Notion's 2-level nesting limit per API request requires recursive chunking logic that basic scripts lack.

Time cost: 1–3 weeks of engineering effort to adapt for a DC instance, plus migration runtime (days to weeks depending on volume).

ClonePartner's Custom Pipeline

This is the approach we use for DC-to-Notion migrations at scale, engineered around the specific constraints of both platforms:

  • Direct database extraction for instances with 100K+ pages — bypassing REST API rate limits entirely by reading from PostgreSQL/MySQL and the attachment file system.
  • Macro audit and resolution before migration. We inventory every macro across all spaces, categorize them (translatable, resolvable to static content, or no-equivalent), and resolve dynamic content pre-migration so nothing is silently dropped.
  • Attachment pipeline that handles Notion's file upload tiers — extracting inline images from Confluence's attachment store, validating file sizes, and uploading with proper retry logic.
  • Hierarchy reconstruction for deeply nested page trees. Our scripts recursively create parent pages and append children in the correct order, preserving the full tree structure without hitting API nesting limits.
  • Post-migration validation — page count reconciliation (source vs. destination), broken link detection (rewriting internal Confluence links to Notion page URLs), and attachment integrity checks.
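
The reconciliation and link checks in the last item can be illustrated with two small functions. This is a generic sketch, not ClonePartner's actual tooling. It assumes the pipeline has recorded a mapping from Confluence page ID to the created Notion page URL, and the function names and URLs are placeholders.

```python
from typing import Dict, Iterable, List, Tuple


def reconcile(source_ids: Iterable[str], id_map: Dict[str, str]) -> Dict[str, List[str]]:
    """Compare source page IDs against the IDs that were actually migrated."""
    source = set(source_ids)
    migrated = set(id_map)
    return {
        "missing": sorted(source - migrated),      # in Confluence, not in Notion
        "unexpected": sorted(migrated - source),   # migrated but not in the source list
    }


def rewrite_links(targets: Iterable[str], id_map: Dict[str, str]) -> Tuple[List[str], List[str]]:
    """Resolve Confluence link targets to Notion URLs and collect the ones that cannot be resolved."""
    rewritten, broken = [], []
    for target in targets:
        if target in id_map:
            rewritten.append(id_map[target])
        else:
            broken.append(target)
    return rewritten, broken


# Placeholder mapping produced by a hypothetical migration run.
id_map = {"101": "https://notion.example/page-a", "102": "https://notion.example/page-b"}
report = reconcile(["101", "102", "103"], id_map)
links, broken = rewrite_links(["101", "999"], id_map)
```

Here page 103 shows up as missing and the link to 999 is flagged as broken. Both lists should be empty before cutover.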

Each stage is testable. You know, before cutover, which macros will be flattened, which pages need restructuring, which attachments need a special path, and whether your final page count and link graph match the source.

Time cost: Typically 3–10 business days depending on volume and macro complexity.

Method Page Limit Macros Attachments Hierarchy Time
Manual copy-paste <100 ❌ Lost ❌ Manual ❌ Flat Hours
Notion native importer ~50K (needs network access for DC) ⚠️ Flattened ✅ Included ✅ Preserved Hours–days
DIY scripts <50K ⚠️ Partial ⚠️ Manual ⚠️ Partial Weeks
ClonePartner Unlimited ✅ Audited & resolved ✅ Full pipeline ✅ Full tree Days

Before You Start

A Confluence DC to Notion migration is not a product switch — it's a platform architecture change. You're moving from a self-hosted, page-tree wiki with deep Atlassian ecosystem integration to a SaaS block-based workspace with a fundamentally different data model.

  1. Audit your macros. Run a macro usage report across all spaces. If more than 20% of your pages use dynamic macros (Jira issues, draw.io, page includes), budget significant time for pre-migration resolution.
  2. Address compliance first. If your DC deployment exists for data sovereignty reasons, confirm that Notion's data residency options satisfy your requirements before starting the technical work.
  3. Map your hierarchy on paper. Decide which spaces become teamspaces, which become top-level pages, and where permissions will be simplified rather than copied exactly. Get stakeholder sign-off before page one lands — restructuring after migration is painful.
  4. Accept the losses. Page history, page-level restrictions, watch subscriptions, and comments (unless using Notion's API importer) will not survive. Communicate this to your teams early.
  5. Run a pilot. Pick a representative space — one with macros, attachments, and nested pages — and migrate it first. Measure what translates, what breaks, and how long it takes under real Notion API constraints. A pilot turns assumptions into data.
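
The macro audit in step 1 can start as a quick scan over exported storage-format bodies. The sketch below counts pages per macro and computes the share of pages that use dynamic macros, for comparison with the 20% threshold. The macro names in `DYNAMIC_MACROS` are assumptions, so confirm the actual names in your instance. The regular expression is a rough inventory tool, not a full parser.

```python
import re
from collections import Counter
from typing import Dict, Tuple

MACRO_RE = re.compile(r'<ac:structured-macro\b[^>]*\bac:name="([^"]+)"')

# Assumed macro names for dynamic content; adjust to what your instance uses.
DYNAMIC_MACROS = {"jira", "drawio", "include", "excerpt-include"}


def audit_macros(bodies: Dict[str, str]) -> Tuple[Counter, float]:
    """Return (pages-per-macro counts, share of pages using a dynamic macro)."""
    pages_per_macro = Counter()
    dynamic_pages = 0
    for body in bodies.values():
        names = set(MACRO_RE.findall(body))
        pages_per_macro.update(names)   # each page counts once per macro
        if names & DYNAMIC_MACROS:
            dynamic_pages += 1
    share = dynamic_pages / len(bodies) if bodies else 0.0
    return pages_per_macro, share


bodies = {
    "1": '<p>Plain page</p>',
    "2": '<ac:structured-macro ac:name="code"></ac:structured-macro>',
    "3": '<ac:structured-macro ac:name="jira" ac:schema-version="1"></ac:structured-macro>',
    "4": '<p>Another plain page</p>',
    "5": '<ac:structured-macro ac:name="info"></ac:structured-macro>',
}
usage, dynamic_share = audit_macros(bodies)
```

In this toy set, one page in five uses a dynamic macro, so the share sits exactly at the threshold rather than above it. On a real estate, the per-macro counts also show which Marketplace macros deserve their own resolution plan.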

The on-prem exit path works when you accept the constraints early: Confluence extraction is version-sensitive, Notion ingest is rate-limited, macros need explicit decisions, and permissions need redesign.

Frequently Asked Questions

Can I use Notion's native importer for Confluence Data Center?
Notion's built-in importer has a ZIP upload path (which accepts exports from DC instances, capped at ~5GB) and an API import path (which connects directly to a Confluence instance, handling up to ~30GB or ~50,000 pages). For DC behind a firewall, the API path requires network access most on-prem deployments don't expose. You can also first migrate DC to Confluence Cloud via CCMA, then use the importer, but that's a two-hop migration with data loss at each step.

How long does a Confluence Data Center to Notion migration take?
It depends on page volume. The Notion API is rate-limited to 3 requests/second per integration token. A 200K-page instance requires roughly 1 million API calls, or about 93 hours of continuous throughput at best. With retries, attachment uploads, and error handling, expect 2–4 weeks for DIY scripts using a single token, or 3–10 business days with a parallelized pipeline like ClonePartner's.

What Confluence data is lost when migrating to Notion?
Page version history, page-level restrictions, Confluence labels (unless migrated as database properties), watch/notification subscriptions, and dynamic macro content (Jira issues, draw.io diagrams, roadmap macros) have no import path into Notion. Info/warning panels convert to callouts but lose color customization. Comments can be retained by Notion's built-in API importer, but not by ZIP imports or custom API pipelines.

Does Confluence XML export include page-level restrictions?
It depends on who runs the export. A site admin's XML export includes all content, even restricted pages. A space admin export silently excludes pages they cannot view, with no error or warning. This makes XML exports unreliable for migration unless run by a site admin.

What happens to Confluence macros when migrating to Notion?
Most dynamic Atlassian macros (Jira issue links, draw.io diagrams, page includes, roadmaps) have no native Notion equivalent. They must be pre-processed and converted into static text, images, or standard Notion blocks before ingestion. Basic macros like code blocks and info panels have partial translations. Notion warns that most macros and plugins degrade during import.
