How to Export All Data from Confluence: Methods, Limits & Tools

Learn how to export all data from Confluence using native formats, Backup Manager, the REST API, and tools — plus what each method silently drops.

Raaj Raaj · 13 min read

How to Export All Data from Confluence (The Quick Answer)

There is no single "Export Everything" button in Confluence. Getting all data out means running space-by-space exports across every space, choosing the right format for each use case, and accepting that every native format drops something.

Here is what each format actually includes:

| Format | Blog Posts | Attachments | Comments | Best For |
| --- | --- | --- | --- | --- |
| PDF | ❌ No | ❌ No | ❌ No | Printable manuals, human-readable docs |
| HTML | ❌ No | ✅ Yes (by page ID folder) | ❌ No | Static website conversion |
| CSV | ✅ Yes | ✅ Yes | ✅ Yes | Cloud-to-Cloud space import |
| XML | ✅ Yes | ✅ Yes | ✅ Yes | Import into Confluence Data Center |
| Word | Single page only | ✅ First 50 images only | ❌ No | Sharing individual pages externally |

If your goal is a compliance-grade backup or a platform migration, CSV or XML are the only real options among native space exports. PDF and HTML silently drop blog posts and comments without warning.

For a full site backup (not space-level), Confluence Cloud offers a Backup Manager under Settings → Data management → Backup manager. But this is a site-wide dump with its own constraints — including a gap around Confluence Databases — covered below.

The practical rule: use native exports for reading, use site backups for backup, and use API-led extraction for migration or compliance work. (support.atlassian.com)

Warning

If your site uses Confluence Databases, treat them as a separate workstream. Whole-site backups show databases in the tree but do not retain their content, data, or functionality. Database exports must be handled separately, per database, in CSV, HTML, or PDF. (support.atlassian.com)

If you are evaluating Confluence against alternatives, our Notion vs. Confluence comparison covers the architectural differences that often drive migration decisions.

Native Space Exports: Formats, Steps, and Limitations

How to export a Confluence space

The process is identical across formats in Confluence Cloud:

  1. Navigate to the space you want to export.
  2. Select More actions (•••) next to the space name in the sidebar, then Space settings.
  3. Open the General menu and select Export space.
  4. Choose PDF, CSV, HTML, or XML.
  5. For PDF and HTML, select Normal Export (all pages) or Custom Export (selected pages). CSV only supports full export.
  6. Select Export space and download the file when the process completes.

You need the Export Space permission. If a site admin performs the CSV or XML export, all content is exported — including content they cannot view through the UI. A normal space admin running the same export will only get content visible to their account, and restricted content disappears from the output without any error message. (support.atlassian.com)

Organization admins can also block exports entirely for some instances or classification levels. If the export button is missing or disabled, check admin policies before assuming a permissions bug.

What each format drops

PDF export:

  • Blog posts are excluded from space-level PDF exports. You can export individual blog posts to PDF one at a time, but there is no bulk option.
  • Comments are always excluded.
  • The newer PDF renderer does not support @page CSS rules (for example, the @top-right and @top-left margin boxes). If your custom CSS uses them, Confluence silently falls back to an older, less accurate rendering engine, which often breaks the layout completely. (support.atlassian.com)
  • Large spaces with complex pages can fail with out-of-memory errors or timeouts during the PDF stitching process.

HTML export:

  • Blog posts are not included. This is a long-standing limitation with no fix on Atlassian's roadmap.
  • Attachments are placed in folders named by internal page ID (...\download\attachments\xxxxxx). You lose the human-readable file-to-page mapping. If you are writing a migration script, you must parse every page's HTML to find relative links to those attachment folders, extract the files, upload them to the target system, and rewrite the HTML tags to point to the new URLs.
  • Inline comments and page comments are excluded.
  • Dynamic macros (Jira issue macros, third-party diagramming tools) render as static images or blank placeholders.
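
The link-rewriting step described for HTML exports can be sketched roughly as follows. This is a minimal illustration, not a complete migration script: it assumes attachment links appear in href/src attributes as relative paths of the form download/attachments/<page_id>/<filename> (inferred from the folder layout above), and url_for is a hypothetical callback that returns the file's new URL once it has been uploaded to the target system.

```python
import re
from urllib.parse import unquote

# Assumed link shape: download/attachments/<page_id>/<filename>, optionally
# prefixed with ../ segments and followed by a query string.
ATTACHMENT_LINK = re.compile(
    r'(?P<attr>\b(?:href|src)=")'
    r'(?:\.\./)*download/attachments/'
    r'(?P<page_id>\d+)/(?P<name>[^"?#]+)[^"]*"'
)


def rewrite_attachment_links(html, url_for):
    """Swap exported attachment paths for URLs in the target system.

    url_for(page_id, filename) is a hypothetical callback supplied by the
    caller. Returns the rewritten HTML and the (page_id, filename) pairs found.
    """
    found = []

    def _swap(match):
        page_id = match.group("page_id")
        name = unquote(match.group("name"))  # exported names may be URL-encoded
        found.append((page_id, name))
        return f'{match.group("attr")}{url_for(page_id, name)}"'

    return ATTACHMENT_LINK.sub(_swap, html), found
```

The returned list of pairs doubles as an upload manifest, which is one way to recover the file-to-page mapping that the export folder names obscure. A real export may need an HTML parser rather than a regular expression if attributes use single quotes or unusual spacing.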

CSV export:

  • The most complete native format for Confluence Cloud. Includes blog posts, attachments, and comments.
  • Best option for importing into another Confluence Cloud instance.
  • Destroys page hierarchy — everything is dumped into a flat table. Importing this into a non-Confluence wiki produces a flat list of thousands of pages, forcing manual reconstruction of the knowledge base.
  • If a data security policy blocks exports for a specific classification level, child pages of classified pages are also excluded — a behavior unique to CSV exports.

XML export:

  • Includes pages, blog posts, comments, and attachments.
  • Team Calendars are not included.
  • Can only be imported into the same or later version of Confluence — never an earlier version.
  • Atlassian explicitly recommends against using XML exports as your primary backup strategy.
  • The schema is highly specific to Atlassian's internal database architecture. Parsing Confluence XML to migrate into a different platform is a significant engineering challenge.

Word export:

  • Individual pages and blog posts only — no bulk space export.
  • Capped at 50 attached images per export. Anything beyond that is silently dropped to prevent out-of-memory errors.
  • The exported .doc file is only compatible with Microsoft Word. It will not open correctly in LibreOffice, Google Docs, or OpenOffice.

Warning

Every native export format captures only the published version of each page. Unpublished draft changes are silently excluded. There is no way to export unpublished drafts through the Confluence UI. (support.atlassian.com)

Confluence Cloud Backup Manager: What It Includes and What It Does Not

For a whole-instance export, go to Confluence Administration → Settings → Data management → Backup manager, optionally enable Backup attachments, then choose Create backup for cloud. After it finishes, the download link stays available for 14 days. (support.atlassian.com)

A site backup includes pages, blog posts, whiteboards, selected user and group settings, team calendar data, and attachments (if selected). Attachments in the archive or trash are also included, which can make the backup larger than your reported storage usage.

Key constraints:

  • One backup at a time. Creating a new backup overwrites the previous one.
  • Download link expires after 14 days.
  • Maximum size: 30 GB of content plus 800 GB of attachments.
  • Cloud site import is capped at 200 MB of uncompressed XML data inside the ZIP — regardless of how large the backup file itself is. Teams regularly miss this and only discover it during cutover testing. Atlassian's workaround is to move content space by space instead of using a direct site import. (support.atlassian.com)
  • Confluence Databases are not retained. Databases appear in the backup tree, but their content, data, and functionality are not preserved. They must be exported separately per database.
  • Soft-deleted spaces in the trash are not backed up.
  • Automated backups require an Enterprise plan. The backup and restore solution with automation via backup policies is available exclusively to Enterprise customers.
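
One way to catch the 200 MB import cap before cutover testing is to inspect the backup ZIP locally. The sketch below sums the uncompressed size of the XML members in the archive. It assumes the cap applies to files ending in .xml and that "200 MB" means 200 × 1024 × 1024 bytes; neither detail is confirmed by the text above, so treat the result as an early warning rather than a guarantee.

```python
import zipfile

# Assumption: the cap is measured in binary megabytes.
IMPORT_XML_LIMIT_BYTES = 200 * 1024 * 1024


def uncompressed_xml_bytes(zip_source):
    """Sum the uncompressed size of every .xml member in a backup ZIP.

    zip_source may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(zip_source) as archive:
        return sum(
            info.file_size  # size after decompression, read from the ZIP index
            for info in archive.infolist()
            if info.filename.lower().endswith(".xml")
        )


def fits_site_import(zip_source, limit=IMPORT_XML_LIMIT_BYTES):
    """Return True when the XML payload is at or under the import cap."""
    return uncompressed_xml_bytes(zip_source) <= limit
```

Because the sizes come from the ZIP's central directory, the check does not need to extract anything, so it stays fast even on very large backups.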

Atlassian also enforces a cooldown between backups. If you hit the "Backup frequency is limited" message, Atlassian Support has to reset the timer. And if the file is large enough that browser downloads time out, Atlassian recommends downloading with curl or wget (support.atlassian.com):

curl --location --output '<name_your_backup>.zip' --retry 450 -C - --request GET 'https://<your_site>.atlassian.net/wiki/download/temp/filestore/<backup_fileID>' --header 'Authorization: Basic <Base64_encodedstring>'

That pattern comes directly from Atlassian's KB for large Confluence Cloud backup downloads. (support.atlassian.com)
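
The <Base64_encodedstring> placeholder in that command is the part people most often get wrong. A small sketch of how the value is typically built, assuming the usual Atlassian Cloud basic-auth convention of encoding the account email and an API token joined by a colon (the function name is illustrative):

```python
import base64


def basic_auth_header(email, api_token):
    """Build the Authorization header value for the download command.

    Assumes the standard "email:api_token" pairing for Atlassian Cloud.
    """
    raw = f"{email}:{api_token}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")
```

The returned string is what goes after 'Authorization: ' in the --header argument. Base64 is an encoding, not encryption, so the resulting value should be handled like the token itself.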

The Real Failure Modes: Drafts, Permissions, Attachments, and Databases

Four failure modes catch teams off guard during export projects. All of them are easy to miss until you have already started loading data into a target system.

Unpublished drafts are invisible to exports

Every native export — PDF, HTML, CSV, XML, and Word — captures only published content. If a contributor has been editing a page for weeks without hitting "Publish," those changes do not exist in your export. Atlassian treats this as a feature ("you can export even while people are working"), but for migration purposes it is a data loss risk.

Before exporting, audit your spaces for pages with unpublished drafts. There is no built-in report for this. You need to use the REST API's status=draft filter or ask contributors to publish their work. For engineering teams using Confluence for long-running architectural design documents, losing drafts means losing days of uncommitted work — and that loss often goes unnoticed until weeks after the old system is shut down.

Attachments over 100 MB lose indexed content

Confluence stops extracting text and indexing file contents for any attachment larger than 100 MB. Only the filename remains searchable. (support.atlassian.com)

This does not mean the file is missing from the export. It means migration scripts that rely on API search queries to discover and map large PDF, Excel, or ZIP attachments will skip them entirely. These files become opaque blobs — you can move the binary, but any downstream system expecting searchable content comes up empty.

Smaller files have their own limits. Confluence extracts a maximum of 1 MB of text from Excel and PowerPoint files, and 16 MB from Word documents. If extraction fails, Confluence does not retry — it is a one-shot operation.
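
Since search-based discovery can miss large files, a safer pattern is to enumerate attachments through the attachment endpoints and flag anything above the indexing threshold for manual review. The sketch below assumes each attachment record is a dict carrying a fileSize value in bytes, and it treats 100 MB as 100 × 1024 × 1024 bytes; both are assumptions to verify against real responses.

```python
# Assumption: the 100 MB indexing threshold is measured in binary megabytes.
INDEXING_LIMIT_BYTES = 100 * 1024 * 1024


def split_by_indexing_limit(attachments, limit=INDEXING_LIMIT_BYTES):
    """Partition attachment metadata into (indexed, unindexed) lists.

    Files strictly larger than the limit are treated as unindexed, meaning
    only their filename would be searchable in Confluence.
    """
    indexed, unindexed = [], []
    for att in attachments:
        bucket = unindexed if att["fileSize"] > limit else indexed
        bucket.append(att)
    return indexed, unindexed
```

The unindexed list is a useful input for the reconciliation step: every entry in it should still appear as a binary in the extract, even though search will not find its contents.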

Restricted and private content

The REST API endpoints are permission-aware. Only content the calling user can view will be returned. Atlassian staff have confirmed that admins cannot use the regular REST API to view other users' private items. (developer.atlassian.com)

A service account with incomplete space access will produce an incomplete crawl, even if your team assumes it is acting like a site admin. Plan accordingly — the service account needs explicit access to every space and every restricted page.

Databases are a separate export path

This is the quietest trap in current Confluence Cloud. Whole-site backup does not retain database content, but individual databases can be exported to CSV, HTML, or PDF from the database's own menu. If your workspace uses databases heavily, you do not have one export plan — you have at least two. (support.atlassian.com)

Confluence API Limits and the Deprecated Backup Endpoint

The legacy backup API

For years, admins automated Confluence Cloud backups by calling /wiki/rest/obm/1.0/runbackup. Community-maintained scripts (the atlassianlabs/automatic-cloud-backup repository on Bitbucket) used this endpoint with basic auth and API tokens.

This endpoint is now unreliable. Multiple Atlassian Community reports from 2025 confirm it stopped working for some instances, though others report it still functions with minor script modifications. Atlassian's developer community describes the endpoint as undocumented and unsupported. The intended replacement is the Backup Management REST API. (community.developer.atlassian.com)

Danger

Do not build new automation against /wiki/rest/obm/1.0/runbackup. Its behavior is inconsistent across instances and Atlassian provides no official support for it. If you need automated backups, evaluate the Enterprise-tier Backup Management API or a third-party tool.

The Backup Management REST API

Atlassian's documented backup automation path is the Backup Management API, currently at v2 (v1 APIs will be deprecated). Data stored through this system has a 30-day retention period. (developer.atlassian.com)

Rate limits are explicit: 100 GET requests per minute per org, with lower limits for some POST and PUT operations. If you are exporting at scale, that constrains how aggressively you can parallelize backup automation. (developer.atlassian.com)
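
A client can stay under a per-minute budget like this by pacing itself instead of waiting to be throttled. The following is a minimal rolling-window sketch. The class name and defaults are illustrative, and it assumes a single process owns the whole org-level budget, which will not hold if several jobs share it.

```python
import time
from collections import deque


class MinuteBudget:
    """Block until a request fits inside a rolling per-minute budget."""

    def __init__(self, max_calls=100, window=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.window = window
        self.clock = clock      # injectable for testing
        self.sleep = sleep      # injectable for testing
        self.sent = deque()     # timestamps of requests still in the window

    def acquire(self):
        """Call once before each GET request."""
        now = self.clock()
        # Forget requests that have aged out of the window.
        while self.sent and now - self.sent[0] >= self.window:
            self.sent.popleft()
        if len(self.sent) >= self.max_calls:
            # Wait until the oldest request leaves the window, then drop it.
            self.sleep(self.window - (now - self.sent[0]))
            now = self.clock()
            self.sent.popleft()
        self.sent.append(now)
```

Calling acquire() before every request keeps bursts from exhausting the budget early in the minute. Running slightly below the documented limit (for example, max_calls=90) leaves headroom for other tooling on the same org.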

REST API v2 for content extraction

If you are building a custom export script using the content API rather than the backup path, you will work with the v2 API for pages and blog posts. Think in terms of content families — pages, blog posts, attachments, descendants, inline comments, footer comments, and content properties all live behind separate endpoints:

GET /wiki/api/v2/pages?status=current&limit=250
GET /wiki/api/v2/blogposts?status=current&limit=250
GET /wiki/api/v2/pages/{id}/attachments?limit=250
GET /wiki/api/v2/pages/{id}/footer-comments?limit=250
GET /wiki/api/v2/pages/{id}/inline-comments?limit=250
GET /wiki/api/v2/blogposts/{id}/footer-comments?limit=250
GET /wiki/api/v2/blogposts/{id}/inline-comments?limit=250
GET /wiki/api/v2/pages/{id}/descendants?limit=250

Key constraints:

  • Cursor-based pagination: The v2 API uses limit and cursor parameters. You must follow the _links.next URL in each response to retrieve subsequent results.
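  • Expansion limits: On the v1 content API, using the expand parameter to request body.export_view or body.styled_view caps responses at 25 results per page, regardless of your limit parameter. The v2 endpoints select the body representation with the body-format parameter rather than expand.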
  • Points-based rate limiting: Starting March 2, 2026, Atlassian is enforcing a new points-based rate-limiting model for all Forge, Connect, and OAuth 2.0 apps. Each API call consumes points based on operation complexity, not just request count. API-token-based traffic continues under existing burst rate limits.

A crawler that only reads /pages will miss blog posts by design. Any export script needs checkpointing and retry logic — one big request will not work. (developer.atlassian.com)
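
The pagination and checkpointing requirements can be combined in one small generator. In this sketch, fetch_json is a hypothetical callable that takes a URL and returns the decoded JSON body (it would also own authentication, retries, and prefixing the site base URL if the next link is relative). The checkpoint dict is likewise illustrative; in practice it would be persisted to disk between runs.

```python
def iter_results(fetch_json, first_url, checkpoint=None):
    """Yield every item across pages by following _links.next.

    checkpoint, when given, is a dict that records the URL of the page
    being processed next, so an interrupted crawl can resume from there.
    """
    url = first_url
    if checkpoint is not None and "next_url" in checkpoint:
        url = checkpoint["next_url"]  # resume; None means already finished
    while url:
        page = fetch_json(url)
        yield from page.get("results", [])
        url = page.get("_links", {}).get("next")
        if checkpoint is not None:
            # Updated only after the whole page was consumed, so a crash
            # mid-page re-fetches that page instead of skipping it.
            checkpoint["next_url"] = url
```

The same generator works for each content family listed above: run it once for pages, once for blog posts, and once per page for attachments and comments. Because a resumed crawl may re-deliver items from the interrupted page, downstream writes should be idempotent.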

No official REST API for space exports on Cloud

This is the gap that frustrates most admins: Confluence Cloud does not provide a REST API endpoint for triggering space exports. The feature request (CONFCLOUD-40457) has been open since 2016 and remains unresolved. The old XML-RPC and SOAP APIs that included exportSpace functionality were deprecated in Confluence 5.5.

Confluence Data Center 8.3+ has a REST API endpoint for XML exports, but this does not apply to Cloud. On Cloud, your options for programmatic space export are:

  1. The atlassian-python-api library, which provides a get_space_export method that reverse-engineers the browser-based export flow.
  2. Custom scripts that call the content API page-by-page, pulling body content and attachments individually.
  3. Third-party backup tools that wrap the API complexity for you.

If you are on Data Center and planning a migration to Cloud or another platform, our Atlassian Data Center End of Life guide covers the full timeline and migration paths.

Comparing Confluence Export and Migration Tools

Approach Scope Automation Blog Posts Attachments Drafts Best For
Native PDF/HTML One space Manual only Partial (HTML only) Documentation handoffs
Native CSV/XML One space Manual only (Cloud) Cloud-to-Cloud or Cloud-to-DC import
Backup Manager Entire site Manual (Enterprise: scheduled) ✅ (opt-in) Disaster recovery
Rewind / GitProtect Entire site Scheduled Automated cloud backup
K15t Scroll Exporter Per space Manual Varies Varies Polished PDF/Word output
Custom API scripts Flexible Fully scriptable Possible (v2 API) Full-control migrations
ClonePartner Entire instance Engineer-managed Platform migrations, compliance-grade extraction

Third-party backup tools like Rewind and GitProtect are designed for disaster recovery: getting data back into Confluence if something goes wrong. Rewind positions itself around automated daily backups, 365-day history, and item-level restore. GitProtect offers bring-your-own S3-compatible storage, though its Confluence offering is currently labeled alpha. Neither is designed for extracting raw data to migrate to a different platform. (rewind.com)

K15t's Scroll Exporter apps solve the formatting problem — producing clean, branded PDFs and Word documents. They fix the @page tag and layout issues that plague native exports. But they are document generators, not data extraction tools. They will not give you a structured dataset for import into a different system.

Custom API scripts give maximum control but require significant engineering effort: handling pagination, respecting rate limits, downloading attachment binaries via the v1 API, and reconstructing page hierarchies from parent-child relationships.
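
Of those tasks, hierarchy reconstruction is the easiest to illustrate. The sketch below rebuilds a page tree from flat records, assuming each record is a dict with an id and a parentId (None for top-level pages). Those field names are an assumption about the crawl output and should be checked against actual responses.

```python
from collections import defaultdict


def build_page_tree(pages):
    """Return (roots, children) from flat page records.

    children maps a parent id to the ids of its direct children, in crawl
    order. Pages whose parent is missing from the crawl are treated as roots.
    """
    by_id = {page["id"]: page for page in pages}
    children = defaultdict(list)
    roots = []
    for page in pages:
        parent = page.get("parentId")
        if parent in by_id:
            children[parent].append(page["id"])
        else:
            roots.append(page["id"])  # top-level, or parent not in the crawl
    return roots, dict(children)


def walk(roots, children, depth=0):
    """Yield (depth, page_id) in tree order, parents before children."""
    for page_id in roots:
        yield depth, page_id
        yield from walk(children.get(page_id, []), children, depth + 1)
```

Walking parents before children matters on import, because most target systems need the parent to exist before a child can be attached to it. Pages that land in roots only because their parent was missing are worth logging, since that usually points to a permissions gap in the crawl.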

How ClonePartner Extracts Confluence Data at Scale

When a team comes to us with dozens of Confluence spaces and needs everything out — pages, blogs, comments, attachments, metadata, page trees, internal links — we do not use the native export UI. We build custom extraction scripts against the Confluence API, handling the edge cases that manual exports miss.

What this looks like in practice:

  • Full page hierarchy preservation. Native CSV and HTML exports flatten or break parent-child relationships. Our scripts reconstruct the complete page tree with correct nesting, so the target system receives content in context and cross-page links stay intact.
  • Attachment handling at any size. We pull every attachment binary, including files over 100 MB that Confluence has stopped indexing. No file gets left behind because it exceeded an internal threshold.
  • Unpublished draft extraction. The REST API can access draft content that native exports ignore. If your team needs every page state — published and in-progress — we extract both.
  • Rate-limit-aware orchestration. Our extraction pipelines monitor X-RateLimit-Remaining headers in real time and throttle with backoff-and-retry logic, so the export never stalls or gets blocked by Atlassian's rate limiters.
  • Internal link resolution. Confluence wiki links reference internal page IDs. During extraction, we resolve these to stable identifiers that can be remapped in the target system, preventing broken links after migration.
  • Structural transformation. When moving from Confluence to platforms like Notion, the data structures are fundamentally incompatible. Confluence uses a rigid Space → Page → Child Page hierarchy, while modern tools use nested databases and dynamic blocks. We handle that transformation in transit.
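
The backoff-and-retry idea in the rate-limit bullet can be sketched generically. This is not ClonePartner's pipeline, only an illustration of the technique: it assumes a throttled response may carry a numeric Retry-After header and otherwise falls back to capped exponential backoff with jitter. The function name and defaults are hypothetical.

```python
import random


def retry_delay(attempt, headers, base=1.0, cap=60.0, jitter=random.random):
    """Seconds to wait before retrying a throttled request.

    attempt is the zero-based retry count. headers is a plain dict, so the
    lookup is case-sensitive; HTTP clients often provide a case-insensitive
    mapping instead.
    """
    retry_after = headers.get("Retry-After")
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # HTTP-date form is not handled in this sketch
    ceiling = min(cap, base * (2 ** attempt))
    # Wait between 50% and 100% of the ceiling to avoid synchronized retries.
    return ceiling * (0.5 + 0.5 * jitter())
```

A caller would sleep for the returned number of seconds after a 429 response and then retry, giving up after a fixed number of attempts so that a persistent failure surfaces instead of looping forever.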

Every extraction comes with a post-export reconciliation report: source page count vs. extracted page count, attachment checksums, and a diff of any content the export could not capture. And because we work through the API, we guarantee zero-downtime migration — your team keeps working in Confluence while we move the data in the background.
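
A reconciliation report of this kind can be approximated with a checksum comparison. The sketch below is a generic illustration, not the actual report format: it assumes you have two maps from an item id to a SHA-256 digest, one computed from bytes downloaded from the source and one computed from what landed in the extract.

```python
import hashlib


def sha256_of(data):
    """Hex digest of a bytes object."""
    return hashlib.sha256(data).hexdigest()


def reconcile(source, extracted):
    """Compare {item_id: checksum} maps and summarize the differences."""
    source_ids, extracted_ids = set(source), set(extracted)
    return {
        "source_count": len(source),
        "extracted_count": len(extracted),
        "missing": sorted(source_ids - extracted_ids),
        "unexpected": sorted(extracted_ids - source_ids),
        "mismatched": sorted(
            item for item in source_ids & extracted_ids
            if source[item] != extracted[item]
        ),
    }
```

Matching counts alone can hide a swapped or truncated file, which is why the mismatched list is kept separate from missing and unexpected.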

Which Export Method Should You Use?

The right approach depends entirely on what happens after the export:

  • Archiving for compliance? Use XML or CSV space exports, one space at a time. Accept the manual effort. Verify blog posts and attachments are present in the archive.
  • Disaster recovery? Use Backup Manager if you are on Cloud, or evaluate Rewind/GitProtect for automated scheduling.
  • Migrating to another platform? Native exports get you partway there, but you lose page hierarchies, internal links, unpublished content, and database data. A custom API extraction — whether you build it yourself or bring in a team like ours — is the only way to guarantee complete data fidelity.
  • Sharing docs externally? PDF or Word exports work for individual pages. For polished multi-page output, K15t's Scroll Exporter is the standard Marketplace solution.

Tip

Before starting any bulk export, run a quick inventory: How many spaces? How many total pages and blog posts? What is the total attachment volume? Any spaces with data security policies blocking exports? Any pages with unpublished drafts that stakeholders care about? Any Confluence Databases? These answers determine whether native exports are sufficient or whether you need an API-based approach.

Frequently Asked Questions

Does Confluence HTML export include blog posts?
No. Confluence's HTML space export does not include blog posts. This is a documented, long-standing limitation. Use CSV or XML exports to include blog posts in a bulk space export. ([support.atlassian.com](https://support.atlassian.com/confluence-cloud/docs/export-content-to-word-pdf-html-and-xml/))

How do I export unpublished drafts from Confluence?
Native exports (PDF, HTML, CSV, XML, Word) only capture published content. To export unpublished drafts, you need to use the Confluence REST API v2 with a status=draft filter. There is no UI-based method.

Are Confluence Databases included in a full site backup?
No. Atlassian says databases appear in the backup tree but do not retain their content, data, or functionality. They must be exported separately as CSV, HTML, or PDF from each database's own menu. ([support.atlassian.com](https://support.atlassian.com/confluence-cloud/docs/create-a-site-backup/))

Is there a REST API to export a Confluence Cloud space?
No. Confluence Cloud does not have an official REST API endpoint for triggering space exports. The feature request (CONFCLOUD-40457) has been open since 2016. You can use the atlassian-python-api library or build custom page-by-page extraction scripts.

What is the maximum size for a Confluence Cloud backup?
Confluence Cloud Backup Manager supports up to 30 GB of content plus 800 GB of attachments. Only one backup file is stored at a time, the download link expires after 14 days, and importing that backup into another Cloud site is limited to 200 MB of uncompressed XML data. ([support.atlassian.com](https://support.atlassian.com/confluence-cloud/docs/create-a-site-backup/))
