
How to Export Blogs from Confluence (PDF, Word, HTML, XML & API)

Learn how to export Confluence blog posts to PDF, Word, HTML, CSV, XML, and via the REST API — including bulk blog export workarounds and hidden limitations.

Raaj Raaj · 17 min read

Anyone who has managed an Atlassian environment knows that getting data in is easy. Getting it out exactly how you want it is an entirely different story.

Whether you are archiving old company announcements, extracting release notes for an external vendor, or preparing for a full-scale migration away from Confluence, you need a reliable way to export your blog posts. Confluence gives you five formats — PDF, Word, HTML, CSV, and XML — but not every format works for every content type. Blog posts in particular have specific limitations that trip people up constantly.

This guide covers every export method, which content types each one actually supports, the hidden traps in the native tools, and how to use the REST API when the UI fails.

How Confluence Structures Blog Data Under the Hood

Before you start clicking export buttons or writing API scripts, understand how Confluence treats blog posts compared to standard wiki pages.

In the Atlassian architecture, a blog post is essentially a page with a different type attribute (blogpost instead of page) and a strict chronological hierarchy. Unlike pages, which are nested in parent-child trees, blogs are organized by their creation date (Year > Month > Day). They live outside the page tree entirely.

This structural difference is exactly why many space export tools fail to capture blogs correctly. Export formats designed to crawl a space's page tree often miss the chronological blog index entirely — which explains the gaps you will see in the compatibility matrix below.
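
To make the chronological hierarchy concrete, here is a minimal sketch that files posts under Year > Month > Day using their creation timestamp. The `posts` dictionaries and their `created` key are hypothetical stand-ins for whatever your export returns; this only illustrates the date-based layout and is not Confluence's actual implementation.

```python
from collections import defaultdict
from datetime import datetime


def blog_index(posts):
    """Group post titles by (year, month, day) of creation.

    `posts` is a hypothetical list of dicts with a "title" and an
    ISO 8601 "created" string.
    """
    index = defaultdict(list)
    for post in posts:
        # Only the date part matters for the hierarchy; ignore the time.
        created = datetime.strptime(post["created"][:10], "%Y-%m-%d")
        index[(created.year, created.month, created.day)].append(post["title"])
    return dict(index)


sample = [
    {"title": "Release 2.1", "created": "2024-03-05T09:30:00.000Z"},
    {"title": "Outage post-mortem", "created": "2024-03-05T16:00:00.000Z"},
    {"title": "Q2 roadmap", "created": "2024-04-01T08:00:00.000Z"},
]
print(blog_index(sample))
```

Notice that nothing in this structure refers to a parent page, which is why a crawler that only walks the page tree never reaches these entries.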

The Confluence Export Compatibility Matrix

Before picking an export method, consult this table. It will save you from the most common mistake: assuming all export formats treat all content types the same way.

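| Export Format | Pages | Blog Posts | Comments | Attachments | Requires |
|---|---|---|---|---|---|
| PDF (single) | Yes | Yes | No | Images only | View permission |
| PDF (space) | Yes | No | No | Images only | Space admin |
| Word (single) | Yes | Yes | | First 50 images | View permission |
| HTML (space) | Yes | No | No | | Space admin |
| CSV (space) | Yes | Yes | Yes | Yes | Space admin |
| XML (space) | Yes | Yes | Yes | Yes | Space admin |
| REST API | Yes | Yes | Yes | Yes | API token |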
Warning

The blog post export gap is real. Space-level PDF and HTML exports do not include blog posts. Comments are never included in PDF exports. If your primary goal is bulk-exporting blog posts, you need the CSV export, the XML export, or the REST API.

The quick rule of thumb:

  • One blog post: use the built-in single-post export.
  • A whole space including blogs: use CSV or XML, not PDF or HTML.
  • Migration-grade data with metadata: use the REST API, then pull attachments separately.
  • Everything in the site: use Backup Manager.
Tip

Atlassian also supports public links as an alternative when you want a view-only version that stays current instead of a static file. If the real job is sharing rather than archiving or migrating, a public link is often the better choice.

Method 1: Export a Single Blog Post (PDF or Word)

This is the simplest approach and works for both pages and blog posts. You do not need any special permissions — just the ability to view the content.

For PDF:

  1. Open the blog post you want to export.
  2. Click the ⋯ (More actions) menu in the top right.
  3. Select Export to PDF.

For Word:

  1. Open the blog post.
  2. Click the ⋯ (More actions) menu, then select Export to Word.

Only published content is exported. If there are unpublished drafts or edits sitting in the editor, they are ignored. This is actually useful — you can create exports even while people are still editing the post.

This path is good when your output needs to be readable by a human in the next five minutes: an approval archive, an email attachment, a compliance handoff, or a content review outside Confluence. It is not the best path when you need structured data or when your target system cares about metadata more than layout.

Hidden Limitations of Single Exports

While this seems straightforward, there are technical quirks you need to know before handing these files to a stakeholder:

  • PDF layouts vs. stylesheets: When you export a single blog to PDF, Confluence applies your custom PDF stylesheets but completely ignores PDF layout customizations. Custom headers, footers, and title pages will not render on a single-page export. You have to use the space-level multi-page export to force layout rules.
  • The @page tag issue: Confluence's newer PDF rendering engine does not support @page tags in the stylesheet. If your CSS relies on these tags for pagination or margins, Confluence silently falls back to an older, less reliable export engine. No warning, no error — just different output.
  • The Word image limit: The Word export only includes the first 50 attached images. This is Atlassian's hard limit to prevent out-of-memory errors affecting the entire Confluence site. If your blog post is a highly visual post-mortem or technical guide, anything beyond 50 images is permanently dropped from the output file.
  • Word compatibility: The exported .doc file is intended for Microsoft Word specifically. Atlassian states it is not compatible with LibreOffice, Google Docs, or OpenOffice. Plan accordingly if your team does not use Microsoft Office.
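
If you can fetch a post's storage-format body (see the REST API section of this guide), you can check in advance whether the Word image limit will bite. The sketch below counts `<ac:image>` elements, which is how storage format marks embedded images. It assumes every such element counts toward the limit, which may not match Atlassian's exact accounting, so treat the result as a warning signal rather than a guarantee.

```python
import re

WORD_IMAGE_LIMIT = 50  # the Word export limit described above


def count_images(storage_body):
    """Count <ac:image> elements in a storage-format body string."""
    return len(re.findall(r"<ac:image\b", storage_body))


def exceeds_word_limit(storage_body, limit=WORD_IMAGE_LIMIT):
    """Return True when the post embeds more images than Word export keeps."""
    return count_images(storage_body) > limit


# Hypothetical body with 51 embedded images.
one_image = '<ac:image><ri:attachment ri:filename="shot.png" /></ac:image>'
body = "<p>Post-mortem</p>" + one_image * 51
print(count_images(body), exceeds_word_limit(body))
```
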
Info

Data Center caveat: If you are on Confluence Data Center rather than Cloud, Atlassian's published docs describe exporting blog posts to PDF, but Word export for blog posts is only documented for Cloud. Atlassian ended Confluence Server support on February 15, 2024, so if you are working from old Server-era habits, check your platform before promising stakeholders a Word file.

Method 2: Export an Entire Space (PDF, HTML, CSV, XML)

Space-level exports let you pull out large volumes of content in one shot. If you are a space admin, you can export an entire space or a group of selected content.

Step-by-Step: Space Export in Confluence Cloud

  1. Next to your space's name in the sidebar, select More actions (•••), then Space settings.
  2. Open the General menu and select Export space.
  3. Select your format: PDF, CSV, HTML, or XML.
  4. For PDF, choose between exporting all pages or selecting specific ones.
  5. For XML or HTML, select the format and click Next to choose between exporting everything you can view or selecting specific content items.
  6. Click Export space and download when complete.
Info

A common stumble: many users try to export from the space landing page and get confused when nothing happens. You need to use the Space Settings menu — accessed via the ••• icon next to the space name in the sidebar — not the top-level space page.

Choosing the Right Format

Each format serves a different purpose, and — critically — each one has different behavior with blog posts:

  • PDF: Good for creating printable user manuals from technical documentation. Does NOT include blog posts in the space export. Comments are also excluded.
  • HTML: Useful for converting your space into a static website or a read-only documentation portal. Does NOT include blog posts. Page comments are also currently excluded.
  • CSV: Dumps everything you have permission to view, including blogs, attachments, and comments. Best option for extracting raw data to parse into a spreadsheet, a BI tool, or a migration script. If you are a site admin, the CSV export includes all content regardless of your view permissions.
  • XML: A full XML export of the space includes blog posts, pages, comments, and attachments. This is the standard format for Confluence Data Center imports. The catch: XML output is Confluence's internal storage format, which is machine-readable XML, not something you would hand to a stakeholder or read over coffee.

The Blog Post Gotcha

This is the part that trips everyone up and the reason most people land on this article: Cloud space PDF and HTML exports do not include blog posts. This has been a known limitation for years and there is no sign Atlassian plans to fix it.

If your goal is specifically "export all blog posts from a space," PDF and HTML are the wrong tools. They are page-oriented exports. For bulk blog work, your realistic options are:

  1. CSV or XML space export — Includes blog posts, but in formats designed for machine processing or re-import, not human reading.
  2. REST API — Full control over exactly what gets extracted and in what format. Covered in detail in Method 3 below.
  3. One-by-one PDF/Word export — You can export individual blog posts from each blog post page. If you only have a handful, this works. For 50+ posts, it is miserable.
  4. FlyingPDF action URL (Server/Data Center only) — You can examine the browser's developer console while exporting a single blog post to PDF, grab all the blog pageId values, and automate calls to the export URL. This is a hack, not an officially supported workflow.

Timeout and Performance Issues

Massive spaces frequently time out during HTML or PDF generation. Atlassian says a typical export of around 100 pages should generally complete within a few minutes, but recommends exporting smaller subsets if bulk exports are slow or error out.

Warning

Timeout risks: If your space contains thousands of macros or heavy attachments, the export job may fail silently or return a corrupted zip file. You will not get an error message until you try to open the archive.
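
Because a bad archive is only discovered when someone opens it, it is worth validating the download right away. The following is a minimal sketch using Python's standard `zipfile` module; the function name is hypothetical, and it only checks archive integrity, not whether every page you expected is present.

```python
import zipfile


def check_export_archive(path):
    """Return (ok, detail) for a downloaded export archive.

    ok is False when the file is not a zip at all or when any member
    fails its CRC check.
    """
    if not zipfile.is_zipfile(path):
        return False, "not a valid zip archive"
    try:
        with zipfile.ZipFile(path) as archive:
            # testzip() reads every member and returns the first bad name, or None.
            bad_member = archive.testzip()
            if bad_member is not None:
                return False, f"corrupt member: {bad_member}"
            return True, f"{len(archive.namelist())} entries verified"
    except zipfile.BadZipFile as exc:
        return False, f"unreadable archive: {exc}"
```

Calling `check_export_archive("export.zip")` (the filename is a placeholder) right after the download gives you a chance to re-run the export before anyone depends on the file.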

Method 3: Extract Blog Posts via the REST API

When native exports fail — usually because of format limitations, macro timeouts, or the PDF/HTML blog exclusion — the REST API is your best option. This is the only reliable method for large-scale migrations, continuous data syncs, or any scenario where you need fine-grained control over the output.

Setting Up API Authentication

For Confluence Cloud, Atlassian requires an API token instead of your account password.

  1. Go to https://id.atlassian.com/manage-profile/security/api-tokens
  2. Click Create API token
  3. Give it a name like "Confluence Export"
  4. Copy and store the token securely
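
Confluence Cloud accepts the token through HTTP Basic authentication, with your account email as the username and the token as the password. Here is a minimal sketch of building that header with only the standard library. The environment variable names are assumptions chosen for illustration; the point is to keep the token out of your source files.

```python
import base64
import os


def basic_auth_header(email, api_token):
    """Build the HTTP Basic Authorization header for email + API token."""
    raw = f"{email}:{api_token}".encode("utf-8")
    return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}


# Hypothetical variable names; fall back to placeholders when unset.
email = os.environ.get("CONFLUENCE_EMAIL", "your-email@example.com")
token = os.environ.get("CONFLUENCE_API_TOKEN", "your-api-token")
headers = basic_auth_header(email, token)
```

Libraries such as `requests` do the same encoding for you when you pass `auth=(email, token)`, so this helper is mainly useful with lower-level HTTP clients.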

Key API Endpoints

Get blog posts in a space (v1):

GET /wiki/rest/api/content?type=blogpost&spaceKey={KEY}&expand=body.storage&limit=25

Get blog posts in a space (v2):

GET /wiki/api/v2/spaces/{spaceId}/blogposts?body-format=storage&limit=100

Get a single page or blog post by content ID:

GET /wiki/rest/api/content/{pageId}?expand=body.view

Get attachments for a content item:

GET /wiki/rest/api/content/{id}/child/attachment

One mildly annoying detail about Confluence's API surface: blog posts and comments are documented in REST v2, while attachment operations and binary downloads are still documented in REST v1. Plan for that mismatch before you tell anyone this will be a "simple export script."
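
The two API versions also paginate differently: v1 uses `start` and `limit` offsets, while v2 is cursor-based. The sketch below follows a `_links.next` path until it disappears, which is how v2 responses are understood to signal further pages; verify that against the current API reference. `fetch_json` is a hypothetical callable that takes a URL and returns parsed JSON, so the example runs here against canned data instead of a live site.

```python
from urllib.parse import urljoin


def iter_results(fetch_json, first_url):
    """Yield every item across pages by following `_links.next`."""
    url = first_url
    while url:
        page = fetch_json(url)
        yield from page.get("results", [])
        next_link = page.get("_links", {}).get("next")
        # The next link is assumed to be a site-relative path; resolve it
        # against the current URL to get an absolute one.
        url = urljoin(url, next_link) if next_link else None


# Canned responses standing in for a real site (hypothetical host and IDs).
first = "https://x.example/wiki/api/v2/spaces/1/blogposts?limit=2"
fake_pages = {
    first: {
        "results": [{"id": "1"}, {"id": "2"}],
        "_links": {"next": "/wiki/api/v2/spaces/1/blogposts?cursor=abc"},
    },
    "https://x.example/wiki/api/v2/spaces/1/blogposts?cursor=abc": {
        "results": [{"id": "3"}],
        "_links": {},
    },
}
print([post["id"] for post in iter_results(fake_pages.__getitem__, first)])
```

In a real script, `fetch_json` would wrap an authenticated GET request plus whatever retry logic you need.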

Storage Format vs. View Format

The expand or body-format parameter controls what format you get back:

  • body.storage — Returns Atlassian's proprietary XHTML-based storage format, including raw macro definitions like <ac:structured-macro>. Use this if you are writing a script to translate Confluence content into another system's native format (like Notion blocks or SharePoint web parts).
  • body.view — Returns the fully rendered HTML as a user would see it in the browser. Use this for archival or static content that does not need to be editable.
  • export_view — Available via the async content body conversion API. Closer to what users see in printed output. Atlassian documents that the conversion returns an asyncId, and completed results are available for polling for up to five minutes.
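
One practical use of storage format is taking inventory of the macros you will have to translate before a migration. The sketch below counts macro names with a regular expression. That is good enough for a first survey, though a real converter should use a proper XML parser. The sample body is invented for illustration.

```python
import re
from collections import Counter

# Matches the ac:name attribute on <ac:structured-macro> opening tags.
MACRO_NAME = re.compile(r'<ac:structured-macro\b[^>]*\bac:name="([^"]+)"')


def macro_inventory(storage_body):
    """Count how often each macro name appears in a storage-format body."""
    return Counter(MACRO_NAME.findall(storage_body))


sample_body = (
    '<ac:structured-macro ac:name="code" ac:schema-version="1">'
    "<ac:plain-text-body><![CDATA[print(1)]]></ac:plain-text-body>"
    "</ac:structured-macro>"
    '<ac:structured-macro ac:name="jira" />'
    '<ac:structured-macro ac:macro-id="abc" ac:name="code" />'
)
print(macro_inventory(sample_body))
```

Running this across every exported post tells you which macros appear most often, and therefore which translations are worth building first.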

Python Script: Export All Blog Posts as HTML Files

Here is a practical script that exports every blog post in a given space as individual HTML files, with proper pagination and filename sanitization:

import html
import os
import re

import requests

# === CONFIG ===
BASE_URL = "https://your-domain.atlassian.net/wiki"
EMAIL = "your-email@example.com"
API_TOKEN = "your-api-token"
SPACE_KEY = "ENG"
EXPORT_DIR = "confluence_blog_export"

# === SETUP ===
auth = (EMAIL, API_TOKEN)
os.makedirs(EXPORT_DIR, exist_ok=True)


def sanitize_filename(name):
    # Drop characters that are unsafe in filenames and cap the length.
    return re.sub(r'[^\w\s-]', '', name).strip()[:100]


def export_blog_posts():
    start = 0
    limit = 25
    total_exported = 0

    while True:
        # Let requests build and URL-encode the query string.
        params = {
            "type": "blogpost",
            "spaceKey": SPACE_KEY,
            "expand": "body.view,history",
            "start": start,
            "limit": limit,
        }
        resp = requests.get(f"{BASE_URL}/rest/api/content",
                            params=params, auth=auth, timeout=30)
        resp.raise_for_status()
        data = resp.json()

        results = data.get("results", [])
        if not results:
            break

        for post in results:
            title = post["title"]
            body = post["body"]["view"]["value"]
            created = post["history"]["createdDate"][:10]

            # Include the content ID so two same-day posts with the same
            # title do not overwrite each other.
            filename = f"{created}_{sanitize_filename(title)}_{post['id']}.html"
            filepath = os.path.join(EXPORT_DIR, filename)

            with open(filepath, "w", encoding="utf-8") as f:
                # Escape the title; the body is already rendered HTML.
                f.write(f"<h1>{html.escape(title)}</h1>\n{body}")

            print(f"Exported: {filename}")
            total_exported += 1

        # Advance by the number of items actually returned, in case the
        # server caps the page size below the requested limit.
        start += len(results)

    print(f"\nDone. Exported {total_exported} blog posts.")


if __name__ == "__main__":
    export_blog_posts()
Tip

Want Markdown instead of HTML? Pipe the output through a library like markdownify or html2text. This is especially useful if you are migrating blog content to GitHub Pages, Hugo, or another Markdown-based CMS.

Including Comments in Your Export

If comments matter for your use case — and they often do during migrations or compliance archives — you need separate API calls. The v2 API exposes dedicated endpoints for each comment type:

GET /wiki/api/v2/blogposts/{blogpostId}/footer-comments?body-format=storage&limit=100
GET /wiki/api/v2/blogposts/{blogpostId}/inline-comments?body-format=storage&limit=100

Footer comments and inline comments are separate resources in Confluence's data model. The built-in CSV export includes comments by default, but PDF drops them and HTML has gaps. If you need full fidelity, the API is the only reliable path.
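
As a rough sketch, the two endpoints above can be queried in turn and merged into one list, tagging each comment with where it came from. `fetch_json` is a hypothetical callable (URL in, parsed JSON out), and the `comment_kind` key is added by this example rather than by the API. For brevity it reads only the first page of each endpoint; long threads need pagination as well.

```python
def comment_urls(base_url, blogpost_id, body_format="storage", limit=100):
    """Build the footer and inline comment URLs for one blog post."""
    return {
        kind: (f"{base_url}/api/v2/blogposts/{blogpost_id}/{kind}-comments"
               f"?body-format={body_format}&limit={limit}")
        for kind in ("footer", "inline")
    }


def collect_comments(fetch_json, base_url, blogpost_id):
    """Fetch both comment types and tag each item with its kind."""
    comments = []
    for kind, url in comment_urls(base_url, blogpost_id).items():
        for item in fetch_json(url).get("results", []):
            comments.append({**item, "comment_kind": kind})
    return comments


# Canned responses standing in for a real site (hypothetical host and IDs).
base = "https://x.example/wiki"
fake = {
    comment_urls(base, 42)["footer"]: {"results": [{"id": "c1"}]},
    comment_urls(base, 42)["inline"]: {"results": [{"id": "c2"}, {"id": "c3"}]},
}
print(collect_comments(fake.__getitem__, base, 42))
```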

Handling Attachments Programmatically

The API response for blog body content does not include actual image files — it only includes HTML <img> tags referencing the Atlassian CDN. If you are exporting data to store offline or move to another platform, you need to:

  1. Parse the HTML to find every image URL.
  2. Authenticate against the /rest/api/content/{id}/child/attachment endpoint.
  3. Download each binary file.
  4. Rewrite the image source in your exported document to point to the local file.

Attachment download is still documented through the v1 API, which returns a redirect to the binary file. This is exactly where DIY migration scripts start looking fine in a demo and then fall apart in QA.
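
Steps 1 and 4 above can be sketched with the standard library alone. The function below rewrites every `<img src>` to a local path and returns a map of remote URLs that still need downloading (steps 2 and 3). It is a simplification under stated assumptions: it uses a regular expression instead of an HTML parser, derives the local name from the last URL path segment, and does not handle two attachments that share a filename. The `attachments` directory name is arbitrary.

```python
import html
import posixpath
import re
from urllib.parse import unquote, urlparse

# Captures the text before the src value, the value itself, and the closing quote.
IMG_SRC = re.compile(r'(<img\b[^>]*?\bsrc=")([^"]+)(")', re.IGNORECASE)


def localize_images(page_html, local_dir="attachments"):
    """Return (rewritten_html, {remote_url: local_path})."""
    downloads = {}

    def swap(match):
        # Attribute values are HTML-escaped (&amp;), so unescape before use.
        remote = html.unescape(match.group(2))
        name = posixpath.basename(unquote(urlparse(remote).path)) or "image"
        local = f"{local_dir}/{name}"
        downloads[remote] = local
        return f"{match.group(1)}{local}{match.group(3)}"

    return IMG_SRC.sub(swap, page_html), downloads


sample_html = (
    '<p><img class="x" src="https://example.atlassian.net/wiki/download/'
    'attachments/123/diagram.png?version=2&amp;api=v2" alt="d"></p>'
)
rewritten, downloads = localize_images(sample_html)
print(rewritten)
print(downloads)
```

Keeping the download map separate from the rewrite lets you retry failed downloads without re-processing the HTML.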

Batch-Converting HTML to PDF

Once you have HTML files from the script above, batch-convert them to PDF:

# Using wkhtmltopdf
for file in confluence_blog_export/*.html; do
  wkhtmltopdf "$file" "${file%.html}.pdf"
done

You can also use weasyprint or Puppeteer for more control over rendering, fonts, and page layout.

Exporting a Single Page as PDF via API (Data Center Only)

Since there is no official Cloud API for exporting pages as PDFs directly, Data Center and Server users can use the FlyingPDF export action as a workaround:

curl -u user:password -H "X-Atlassian-Token: no-check" \
  "https://your-instance.com/wiki/spaces/flyingpdf/pdfpageexport.action?pageId=PAGE_ID"

The response returns a 302 redirect with the download URL in the Location header. Make a second request to that URL to get the actual PDF. For Confluence Cloud, fetch the HTML via the REST API and convert it client-side with one of the tools above.

Method 4: Full Site Backup

When you need the entire Confluence instance — not just one space — use Backup Manager.

In Confluence Cloud, admins can go to Settings → Data management → Backup manager, optionally include attachments, and create a backup. Atlassian stores only one backup file at a time, the download link is available for 14 days, and the maximum supported size is 30 GB plus 800 GB of attachments.

Backup Manager includes pages, blog posts, whiteboards, user and group settings, team calendar data, and attachments if you opt in. The ugly limitation: Atlassian says databases currently cannot export their content through Backup Manager.

Use Backup Manager for:

  • Whole-instance backup
  • Tenant-to-tenant migration staging
  • Disaster recovery copies
  • Large consolidation projects

Do not use it when the real request is "send me the blog posts from this one space by Friday." For that, Backup Manager is too wide, too heavy, and not selective enough.

Marketplace Apps for Better Exports

If you need branded, polished exports — customer-facing manuals, product docs, or compliance reports — the built-in export often is not enough. This is where Marketplace apps earn their keep.

Scroll PDF Exporter (by K15t)

Exports single pages or entire page trees as PDF documents with full customization options. Starts at USD 5.00 per month for up to 10 users on Confluence Cloud.

Why it is worth considering:

  • Define your brand once — logos, fonts, colors — and apply it across export templates. Create different templates for different content types.
  • Visual template editor with no CSS hacking required.
  • K15t also offers Scroll Word Exporter and Scroll HTML Exporter as separate apps.

Some teams automatically export selected Confluence Cloud pages as PDFs using Scroll PDF Exporter, then sync those files to external storage like Microsoft SharePoint, AWS EFS, Azure Storage, or Google Cloud Filestore.

Comala Document Management

When using this app, you can choose to only export pages that have been published through Comala's review process. This ensures only reviewed and completed pages appear in your exports — useful for regulated industries where you need an audit trail from draft through approval to export.

Customizing the Default PDF Export

The default Confluence PDF export looks functional. Not great. If you are exporting documentation for external consumption, you will want to customize it.

You can add a title page, table of contents, and customized headers and footers to the PDF output. For more advanced customizations, you can apply CSS modifications. These customizations are space-specific, and you need Space Administrator permission to apply them.

In Confluence Cloud, go to Space Settings → Look and Feel → PDF Stylesheet to add custom CSS. Be aware that the newest version of the export feature improves rendering but does not support two things: @page rules that use margin boxes such as @top-right and @top-left in the PDF stylesheet, and HTML in the header, footer, or title configured in space or site settings. If your content uses either customization, Confluence silently falls back to an older version of the export engine.

Permissions and Security Policies

Permission issues are the number one reason exports silently fail or return incomplete results.

Export Permissions by Type

  • Single page/blog export — Only requires view permission on that specific content.
  • Space export (PDF, HTML) — Requires the "Export Space" permission. Only space admins get this by default.
  • Space export (XML) — Requires space admin. Note: the XML export includes all pages in the space, including those you do not have permission to view. This is by design.
  • CSV export — Only content visible to you is exported, unless you are a site admin, in which case all content is exported regardless of permissions.

Data Security Policies Can Block Exports

This is a recent change that catches people off guard. Atlassian's data security policies allow organizations to use rules to control how users, apps, and external people interact with content. Atlassian has extended the data export rule to also block downloading of files attached to Confluence and Jira. After this rule takes effect, users will no longer find a download button in attachment lists, macros, and file previews.

The data export rule requires Atlassian Guard Standard. If your org has this enabled and exports are blocked, talk to your Atlassian admin.

However — and this matters — Atlassian explicitly states this control does not stop Marketplace apps or custom apps from exporting content, and it does not block browser print-save actions. The security model and the UI restrictions are related but not identical.

To check: Go to admin.atlassian.com → Security → Data security policies, select a policy, and verify whether exporting data is blocked.

Warning

Planning an XML import into another instance? For best results, Atlassian recommends having a Site Admin perform the export so that email addresses are always included. Import into Confluence 6.15.4 or later so that email address matching is available. This avoids content being attributed to "unknown user" after the import.

Common Export Mistakes

These are the mistakes that waste the most time. We see them repeatedly across the migrations we handle:

  1. Using Cloud space PDF or HTML when you need blog posts. Neither format includes them. This is the single most common error.
  2. Assuming comments come along everywhere. They do not. PDF drops them, HTML has gaps, and inline vs. footer comments are separate API resources.
  3. Treating Word export as a migration format. It is fine for human review, not suitable for structured re-import into another system.
  4. Forgetting admin security policies. Atlassian Guard can block exports from the page, blog, space settings, and even via URL or API. If export is blocked, users may lose the menu option entirely or get a generic error with no explanation.
  5. Relying on a single export for migration. Native exports are point-in-time snapshots. If someone updates a blog post five minutes after your export finishes, your data is already stale. Dynamic macros like Jira issue lists, third-party diagram plugins, or complex table macros often render as broken text blocks in PDF and Word exports.
  6. Thinking the security block is absolute. Atlassian explicitly says the data export control does not stop Marketplace apps from exporting content and does not block browser print-save. If security is your concern, audit your Marketplace apps too.

Quick Decision Guide

| Your Goal | Best Method |
|---|---|
| Share one blog post with a non-Confluence user | Single blog → PDF or Word |
| Create a printable user manual | Space export → PDF (Custom) |
| Archive a space for backup | Space export → XML |
| Move a space to another Cloud instance | Space export → CSV |
| Move a space to Data Center | Space export → XML |
| Turn docs into a static site | Space export → HTML |
| Bulk export blog posts in readable format | REST API → HTML → Convert to PDF |
| Branded, professional documents | Scroll PDF Exporter (Marketplace) |
| Full instance backup | Backup Manager |
| Full migration to another platform | REST API + custom scripts or migration service |

Migrating Confluence Blogs to Other Platforms

Extracting the data is only step one. If you are exporting blogs to migrate them into Notion, you have to convert Atlassian's XHTML storage format into Markdown, then map that Markdown into Notion's block-based API. If you are moving to SharePoint, you have to convert the blogs into SharePoint News posts using the Microsoft Graph API.

Neither of these paths is a simple 1:1 mapping. Confluence macros, nested content, and proprietary markup all need to be translated — and the hard part is rarely getting the raw text out. The hard part is reconstructing comments, author history, dates, attachments, labels, permissions, and content-type quirks in the target system. That is why teams often move from "we just need an export" to "we need a migration plan" in about two meetings.

If you are planning a larger move, check out our Notion to Confluence migration guide to understand the architectural differences, or our guide on Atlassian Data Center end of life for context on where the Atlassian ecosystem is headed.

When to Call in Engineering Help

Exporting from Confluence is straightforward for one-off needs — grabbing a PDF for a client, backing up a space, archiving old content. But things get complicated fast when you are moving content to a completely different platform, migrating between Confluence instances with large volumes, preserving metadata and relationships during the move, or handling attachments, inline images, and macro-generated content at scale.

When you hit the limits of native CSVs and API rate limits, data extraction stops being a quick afternoon task and turns into a sprint-consuming engineering project.

At ClonePartner, we do not rely on standard Atlassian export buttons. We build custom extraction engines that pull Confluence data — pages, blogs, attachments, and nested macros — directly via the API. We parse the proprietary storage format and map it to your target system. We handle the pagination, the API rate limits, the macro translations, and the attachment downloads so your engineering team does not have to.

Not because clicking Export is hard. Because getting the data out cleanly, mapping it into the target system, and preserving history without breaking business workflows is the part that bites.

Frequently Asked Questions

Can you export blog posts from Confluence as PDF?
You can export individual blog posts to PDF one at a time from the ••• > Export menu. However, blog posts are excluded from the space-level PDF export. For bulk export, use the REST API to fetch blog post HTML and batch-convert it to PDF with a tool like wkhtmltopdf or weasyprint.

Does a Confluence space export include blog posts?
It depends on the format. In Confluence Cloud, space-level PDF and HTML exports do NOT include blog posts. CSV and XML space exports do include them. If you need bulk blog posts in a readable format, the REST API is the most reliable approach.

How do I bulk export all blog posts from a Confluence space?
Use the CSV or XML export via Space Settings for machine-readable output, or query the Confluence REST API (GET /wiki/rest/api/content?type=blogpost&spaceKey={KEY}&expand=body.view) to extract them programmatically as HTML files you can then convert to any format.

What format should I use to export a Confluence space?
Use PDF for printable documentation (pages only), CSV to import into another Cloud instance, HTML for static websites (pages only), and XML to import into Confluence Data Center. XML is the most complete built-in option: it includes pages, blog posts, comments, and attachments.

Why can't I find the export option in Confluence?
You need to access export from Space Settings (via the ••• menu next to the space name in the sidebar), not from the space landing page. You also need the Export Space permission. If your organization uses Atlassian Guard, a data security policy may be blocking exports entirely.
