---
title: "SuccessFactors to Greenhouse Migration: The CTO's Guide"
slug: successfactors-to-greenhouse-migration-the-ctos-guide
date: 2026-04-23
author: Raaj
categories: [Migration Guide, Greenhouse, SuccessFactors]
excerpt: "A technical guide to migrating from SAP SuccessFactors Recruiting to Greenhouse — covering OData extraction, Harvest v3 rate limits, entity mapping, and edge cases for CTOs."
tldr: "Beyond small active-candidate imports, SuccessFactors to Greenhouse is an API-first ETL project focused on preserving candidate, application, attachment, and stage relationships across incompatible data models."
canonical: https://clonepartner.com/blog/successfactors-to-greenhouse-migration-the-ctos-guide/
---

# SuccessFactors to Greenhouse Migration: The CTO's Guide


Migrating from SAP SuccessFactors Recruiting to Greenhouse is a data-model translation problem. SuccessFactors stores recruiting data inside an HRIS architecture — `Candidate`, `JobApplication`, `JobRequisition`, and `JobOffer` entities live within a deeply nested OData v2 framework designed for enterprise HR. Greenhouse separates concerns into distinct **Candidate**, **Application**, and **Job** objects connected through integer IDs and a strict relational hierarchy.

If you need a fast decision: **Greenhouse's native bulk import recommends a maximum of 8,000 candidates per batch and strips out relational data like interview history and scorecards.** API-based extraction via SuccessFactors OData v2, combined with the Greenhouse Harvest API for loading, is the only path that preserves full-fidelity historical candidate data at enterprise scale. Middleware tools like Zapier or Workato are designed for forward-syncing new records, not backfilling 100k+ historical candidates with attachments. ([support.greenhouse.io](https://support.greenhouse.io/hc/en-us/articles/360053674012-Bulk-import-candidates-from-spreadsheet))

This guide covers the real API constraints on both sides, entity-by-entity mapping, every viable migration method with trade-offs, and the edge cases that cause silent data loss.

For a broader HRIS migration framework, see [The Ultimate HRIS Data Migration Checklist](https://clonepartner.com/blog/hris-data-migration-checklist/). For patterns specific to extracting data from SuccessFactors, see [SuccessFactors to Workday Migration: The CTO's Technical Guide](https://clonepartner.com/blog/successfactors-to-workday-migration-the-ctos-technical-guide/). For ATS migration pitfalls, check [5 "Gotchas" in ATS Migration](https://clonepartner.com/blog/ats-migration-gotchas/).

> [!WARNING]
> Greenhouse Harvest API v1 and v2 will be deprecated and unavailable after **August 31, 2026**. Build your migration pipeline against Harvest v3 (OAuth 2.0) from day one. Do not build on v1/v2 only to rewrite months later. ([support.greenhouse.io](https://support.greenhouse.io/hc/en-us/articles/360029266032-Harvest-API-overview))

## Why Companies Migrate from SuccessFactors to Greenhouse

The migration drivers typically fall into three categories:

- **Dedicated ATS vs. HRIS recruiting module.** SuccessFactors Recruiting is a module within an HRIS suite. It's powerful for enterprise HR but imposes overhead for teams that only need a focused applicant tracking system. Greenhouse is purpose-built for structured hiring — scorecards, interview kits, approval workflows, and offer management are first-class features, not bolt-ons.
- **Structured hiring enforcement.** Greenhouse forces hiring managers to define scorecards and interview plans before a job goes live. SuccessFactors supports configurable workflows but doesn't enforce evaluation consistency with the same rigor.
- **Integration ecosystem.** Greenhouse's open API and partner network give recruiting teams flexibility to connect sourcing tools, assessments, and HRIS systems without SAP's middleware stack.

## Data Model Differences: SuccessFactors Recruiting vs. Greenhouse

This is where migrations break. The two systems model recruiting data in fundamentally different ways.

**SuccessFactors Recruiting** uses OData v2 entities inside the broader HXM Suite:

- `Candidate` — the person record, which can exist before any application
- `JobApplication` — the candidacy linking a Candidate to a JobRequisition, with status tracking and custom fields
- `JobRequisition` — the job posting with template-driven fields, department/location lookups, and approval workflows
- `JobOffer` — offer details attached to a JobApplication
- `Attachment` — resumes and documents accessible via the OData Attachment entity

**Greenhouse** separates concerns more cleanly:

- `Candidate` — the person (name, email, phone, tags, custom fields)
- `Application` — the candidacy tying a Candidate to a Job (with stage, status, rejection reason)
- `Job` — the position with departments, offices, hiring team, interview plan
- `Offer` — linked to an Application
- `Scorecard` — structured evaluation tied to an interview stage
- `Activity Feed` — notes, emails, and events on the Candidate timeline

### The Core Mapping Challenge

| SuccessFactors Entity | Greenhouse Equivalent | Translation Notes |
|---|---|---|
| `Candidate` | `Candidate` | Direct 1:1 mapping. Deduplicate by email before load. |
| `JobApplication` | `Application` | Each SF application becomes a GH Application linked to a Candidate and Job. Custom fields may require Enterprise tier. |
| `JobRequisition` | `Job` | Map department, location, and hiring manager. GH Jobs have a fixed interview plan structure. |
| `JobOffer` | `Offer` | Offer fields differ significantly. Custom offer fields in SF may not have GH equivalents. |
| `Attachment` (resume) | `Candidate Attachment` | Must be base64-encoded for API upload. GH hosts docs on AWS S3 with signed URLs that expire in 7 days. ([developers.greenhouse.io](https://developers.greenhouse.io/harvest.html)) |
| Interview notes/comments | `Notes` on Activity Feed | No structured scorecard equivalent — import as notes unless rebuilding scorecards from scratch. |
| Screening question responses | `Custom Fields` or `Notes` | GH has no native screening question import path. |

Two field-level details catch teams late:

1. **SuccessFactors OData only exposes fields configured in the active RCM template.** An old mapping spreadsheet can be wrong even when the API is technically working. Always query the OData metadata for your specific instance before building field mappings. ([help.sap.com](https://help.sap.com/docs/successfactors-platform/sap-successfactors-api-reference-guide-odata-v2/candidate-and-candidatebackground))
2. **Greenhouse custom fields should be mapped using immutable field keys** via `keyed_custom_fields`, not display names. Names can change; keys are stable. ([developers.greenhouse.io](https://developers.greenhouse.io/harvest.html))

Keep separate immutable source keys for candidate, application, requisition/job, attachment, and user throughout the pipeline. If you flatten candidate and application into one row too early, you lose the one-to-many relationships that matter during validation and rollback.

## Migration Approaches: CSV vs. API vs. Middleware vs. Managed Service

### 1. Native CSV Bulk Import

**How it works:** Export candidate data from SuccessFactors (via Integration Center reports or Admin Center exports), format it into Greenhouse's bulk import spreadsheet template, map every source job to a Greenhouse job, and upload through the Greenhouse UI alongside a `.zip` of resumes.

**When to use:** Small datasets (<5,000 candidates), no requirement for historical interview data, minimal custom fields.

**Constraints:**
- Greenhouse recommends a maximum of **8,000 candidates per batch** ([support.greenhouse.io](https://support.greenhouse.io/hc/en-us/articles/360053674012-Bulk-import-candidates-from-spreadsheet))
- Resume .zip file capped at **5 GB**
- Milestone placement is limited to Application, Assessment, Face to Face, and Offer
- Relational data (which candidate applied to which job, at which stage) is flattened
- Candidates linked to unmapped jobs are silently skipped
- GDPR/CCPA consent emails can be automatically triggered for imported candidates — disable this or use Greenhouse's "container job" method for historical imports
- Interview history cannot be backdated (scorecards can be, but not the interview event itself)

> [!NOTE]
> Greenhouse's support docs are inconsistent on bulk-import tier availability. One article lists the feature for Core, Plus, and Pro, while another says Plus/Pro only. Verify entitlement in your tenant before designing your cutover around the UI importer. ([support.greenhouse.io](https://support.greenhouse.io/hc/en-us/articles/360053674012-Bulk-import-candidates-from-spreadsheet))

**Complexity:** Low | **Scalability:** Small datasets only | **Data fidelity:** Low

For a deeper analysis of CSV-based migrations, see [Using CSVs for SaaS Data Migrations: Pros and Cons](https://clonepartner.com/blog/csv-saas-data-migration/).

### 2. API-Based Migration (SuccessFactors OData → Greenhouse Harvest API)

**How it works:** Extract data programmatically via SuccessFactors OData v2 API (Candidate, JobApplication, JobRequisition, Attachment entities), transform in a staging layer, and load via Greenhouse Harvest API v3 POST endpoints.

**When to use:** >5,000 candidates, need to preserve application history, resumes, and custom fields.

**Constraints:**
- SuccessFactors OData API is commonly cited at **40 requests per second**, though SAP's current documentation says limits vary by API and server conditions — `429 Too Many Requests` plus `Retry-After` is the authoritative signal ([rizing.com](https://rizing.com/human-capital-management/h1-2022-sap-successfactors-release-analysis-integrations/))
- OData API returns a **maximum of 1,000 records per page** — implement server-side pagination
- SuccessFactors batch operations are limited to **180 requests per $batch call**
- Greenhouse Harvest API v1/v2 enforces **50 requests per 10-second window**; v3 uses a **30-second fixed window**
- All Greenhouse write operations require the `On-Behalf-Of` header with a valid Greenhouse user ID
- Attachments must be **base64-encoded** or provided as a direct download URL

**Complexity:** High | **Scalability:** Enterprise-grade | **Data fidelity:** High

### 3. Custom ETL Pipeline

**How it works:** Build a dedicated extract → stage → transform → load pipeline using Python, Node.js, or an orchestration tool like Airflow. Includes rate-limit handling, error logging, attachment processing, field mapping logic, and idempotent loaders with a staging database and deterministic crosswalk tables.

**When to use:** Enterprise migrations with complex custom fields, large attachment volumes, strict compliance requirements, or when you need repeatable delta runs and phased cutovers.

**This approach preserves the highest data fidelity.** Every other method makes compromises.

**Complexity:** High | **Scalability:** Enterprise-grade | **Data fidelity:** Highest

### 4. Middleware / iPaaS (Zapier, Workato, Make)

**How it works:** Configure triggers and actions between SuccessFactors and Greenhouse using a visual integration builder.

**When to use:** Ongoing forward-sync of new candidates or requisitions **after** the historical backfill is complete. Not suitable for historical migrations.

**Why it fails for migrations:**
- These tools are designed for **event-driven, real-time sync** — not bulk historical extraction
- SuccessFactors connectors in most iPaaS platforms don't expose the full Recruiting entity set
- No built-in support for paginated bulk extraction of 100k+ records
- Attachment handling (download from SF, base64-encode, upload to GH) is not natively supported
- Rate limit management is abstracted away, reducing your control over retry logic

Zapier's Greenhouse app exposes event-driven triggers and candidate create/update actions. Workato documents a Greenhouse-to-SuccessFactors new-hire sync. That's where iPaaS fits: operational automation after the backfill, not a 100k-record historical migration. ([zapier.com](https://zapier.com/apps/greenhouse/integrations))

**Complexity:** Low–Medium | **Scalability:** Low for bulk; Medium for ongoing | **Data fidelity:** Low for historical data

### 5. Managed Migration Service

**How it works:** A specialist team handles the full pipeline — data audit, field mapping, ETL scripting, rate-limit management, validation, and UAT support.

**When to use:** Your engineering team doesn't have ATS API experience, you can't afford a failed migration, or you need the migration completed in days rather than sprints.

**Complexity:** Low (for your team) | **Scalability:** Enterprise-grade | **Data fidelity:** Highest

### Comparison Table

| Method | Best For | Data Fidelity | Scalability | Attachments | Complexity |
|---|---|---|---|---|---|
| CSV Bulk Import | <8k active candidates | Low | Small datasets | Via .zip only | Low |
| API-Based (DIY) | Historical backfill | High | Enterprise | Yes (base64) | High |
| Custom ETL Pipeline | Audit + repeatability | Highest | Enterprise | Yes | High |
| Middleware (Zapier/Workato) | Forward sync only | Low | Real-time only | No | Low–Medium |
| Managed Service | Low internal bandwidth | Highest | Enterprise | Yes | Low (yours) |

Unified API platforms (Merge.dev, Truto) normalize data structures across ATS systems, but they're designed for building product integrations, not executing one-time migrations. They often strip platform-specific custom fields that don't fit the unified schema.

### Recommendations by Scenario

- **Small business, active candidates only:** Use native CSV import. Accept milestone-only stage placement and the loss of interview history.
- **Enterprise with history, attachments, and custom fields:** Use an API-led ETL pipeline or a managed migration service.
- **Ongoing coexistence:** Backfill with ETL first, then add a narrow iPaaS sync layer for deltas.
- **Low engineering bandwidth, real deadline:** Use a managed service. A one-time migration script is rarely an asset worth maintaining.

## Navigating API Rate Limits and Extraction Bottlenecks

The technical bottleneck of this migration is not network throughput — it is vendor API rate limits.

### SuccessFactors Side

SuccessFactors OData API rate limiting was formally introduced in the H1 2022 release. The commonly cited limits:

- **OData APIs:** ~40 requests per second (SAP says limits vary by API and server conditions — code to the `Retry-After` header, not a hard-coded constant from an old slide deck)
- **SFAPIs (legacy):** 20 requests per second
- **HTTP 429 response** with a `Retry-After` header (commonly 300 seconds)
- **$batch operations:** Maximum 180 requests per batch call
- **Page size:** Maximum 1,000 records per response
- **MDF data exports via Admin Center:** Limited to 500 MB per file; larger datasets require SFTP export (up to 1 GB)

SAP recommends a **maximum of 10 concurrent requests/threads per client** and advises against multithreading for read operations and batch calls. Enable **session reuse** on the SuccessFactors side — each OData request creates a server-side login session, which is resource-intensive. Reusing sessions avoids repeated authentication overhead.

```python
import time

import requests

# One shared session: each fresh SuccessFactors login creates a
# resource-intensive server-side session, so reuse the connection
# (and session cookie) across requests.
SESSION = requests.Session()

def extract_with_rate_limit(url, headers, max_retries=5):
    for attempt in range(max_retries):
        response = SESSION.get(url, headers=headers)
        if response.status_code == 200:
            return response.json()
        elif response.status_code == 429:
            # SuccessFactors commonly sends Retry-After: 300
            retry_after = int(response.headers.get('Retry-After', 300))
            print(f"Rate limited. Waiting {retry_after}s (attempt {attempt+1})")
            time.sleep(retry_after)
        else:
            response.raise_for_status()
    raise RuntimeError(f"Max retries exceeded for {url}")
```

> [!NOTE]
> SuccessFactors Recruiting OData entities like `JobRequisition` and `JobApplication` are template-driven. The available fields depend on your RCM template configuration and may not appear in the OData API Dictionary if there are misconfigurations in RCM templates or Recruiting Settings. Verify entity availability before building your pipeline.

### Greenhouse Side

Greenhouse Harvest API rate limits differ by version:

- **v1/v2:** 50 requests per 10-second rolling window
- **v3:** Fixed 30-second window (exact limit returned in `X-RateLimit-Limit` header)
- **HTTP 429** response with `Retry-After` header and `X-RateLimit-Reset` timestamp
- **Pagination:** v1/v2 uses RFC-5988 Link headers (page-based, up to 500 per page). v3 uses cursor-based pagination with `per_page` up to 500.
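Pagination differs enough between versions that it is worth isolating behind one helper. A minimal sketch of a version-agnostic page walker, assuming next-page links arrive as RFC-5988 `Link` headers (which `requests` parses into `response.links`) — confirm the exact v3 cursor mechanics against the Harvest docs before relying on this:

```python
def fetch_all_pages(first_url, get, per_page=500):
    """Walk paginated results by following the server's `next` links.
    `get` is any callable with a requests-like signature (e.g. a
    requests.Session().get wrapped by your rate-limit controller)."""
    url, params = first_url, {"per_page": per_page}
    while url:
        response = get(url, params=params)
        response.raise_for_status()
        yield from response.json()
        # requests parses RFC-5988 Link headers into response.links
        url = response.links.get("next", {}).get("url")
        params = None  # the next-page URL already carries paging state
```

Injecting `get` keeps the pagination logic testable and lets the same loop serve both page-based (v1/v2) and cursor-based (v3) responses.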

**Throughput math for loading:** At 50 requests per 10 seconds, creating a candidate + attaching a resume + creating an application = 3 API calls per record. That's ~16 candidates per 10-second window, or ~5,760 candidates per hour under ideal conditions. A 50,000-candidate migration takes ~8.7 hours of pure API time — before retries, errors, and attachment uploads. Scale linearly: 100,000 candidates is ~17 hours minimum.
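The arithmetic above generalizes into a small capacity-planning helper; the defaults mirror the v1/v2 window, so swap in whatever your `X-RateLimit-Limit` header actually reports:

```python
import math

def load_hours(candidates, calls_per_record=3, limit=50, window_s=10):
    """Back-of-envelope load time under a fixed rate-limit window.
    Ignores retries, errors, and attachment transfer time, so treat
    the result as a floor, not an estimate."""
    records_per_window = limit // calls_per_record   # e.g. 50 // 3 = 16
    windows = math.ceil(candidates / records_per_window)
    return windows * window_s / 3600
```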

> [!WARNING]
> Custom integrations not enrolled in the Greenhouse Partner Program may be subject to stricter rate limits. Confirm your limits by checking the `X-RateLimit-Limit` response header on your first API call.

## Handling Attachments, Resumes, and Custom Fields

### Attachments

SuccessFactors stores attachments accessible via the OData `Attachment` entity. Resumes and cover letters linked to `JobApplication` records can be fetched with:

```
GET /odata/v2/Attachment?$filter=module eq 'RECRUITING'
```

On the Greenhouse side, attachments must be uploaded as **base64-encoded content** via the Harvest API `POST /v1/candidates/{id}/attachments` endpoint, or provided as a direct download URL. Greenhouse hosts all documents on AWS S3 and provides **signed, temporary URLs that expire in 7 days** — do not rely on these URLs for future access. ([developers.greenhouse.io](https://developers.greenhouse.io/harvest.html))

**Key risk:** download file bytes at the same time you extract attachment metadata. If you pull metadata from SuccessFactors one day and try to fetch the files later, any time-limited URLs may have expired. The same rule applies on the Greenhouse side: download, then upload fresh. Never chain off expiring URLs.
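A sketch of the download-then-upload pass, with the payload construction kept pure so it is testable. The payload field names (`filename`, `type`, `content`) follow the Harvest attachment docs, but verify them against your tenant before building on this:

```python
import base64

import requests

def build_attachment_payload(file_bytes, filename, attachment_type="resume"):
    """Base64-encode file content into a Harvest-style attachment payload."""
    return {
        "filename": filename,
        "type": attachment_type,  # e.g. resume | cover_letter | other
        "content": base64.b64encode(file_bytes).decode("ascii"),
    }

def migrate_attachment(sf_url, sf_headers, gh_candidate_id, gh_headers, filename):
    """Fetch the file from SuccessFactors and re-upload it to Greenhouse
    in one pass, so no expiring URL is ever stored between steps."""
    file_bytes = requests.get(sf_url, headers=sf_headers).content
    response = requests.post(
        f"https://harvest.greenhouse.io/v1/candidates/{gh_candidate_id}/attachments",
        json=build_attachment_payload(file_bytes, filename),
        headers=gh_headers,  # must include On-Behalf-Of
    )
    response.raise_for_status()
    return response.json()
```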

### Custom Fields

> [!WARNING]
> Custom fields on the Greenhouse **Application** object are gated by subscription tier. Greenhouse's own support articles conflict on exactly which tiers support Application custom fields. Verify in your tenant before finalizing field mappings — if your tier doesn't support them, the API will silently drop the data. ([support.greenhouse.io](https://support.greenhouse.io/hc/en-us/articles/360001421452-System-Default-Fields-vs-Custom-Fields))

SuccessFactors Recruiting uses template-driven fields on `JobRequisition` and `JobApplication` entities. The available fields are defined per RCM template, not globally. Before mapping:

1. Query the OData metadata for your specific instance to discover available fields
2. Identify which fields are standard vs. custom
3. Map custom SF fields to Greenhouse **Candidate** custom fields (available on all tiers) or **Application** custom fields (tier-dependent)
4. Handle picklist values — SF picklists won't automatically match GH dropdown options
5. Use Greenhouse's immutable `keyed_custom_fields` for mapping, not display names
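A mapping sketch that enforces step 5 structurally. The SF field names and GH keys below are hypothetical placeholders — pull the real keys from your tenant's custom-field configuration rather than hard-coding them from a spreadsheet:

```python
# Hypothetical mapping — replace with keys discovered from your tenant.
FIELD_KEY_MAP = {
    "sourceChannel": "source_channel",   # SF field -> GH immutable key
    "noticePeriod": "notice_period",
}

def map_custom_fields(sf_fields):
    """Translate SF custom fields into a keyed dict for Greenhouse,
    collecting unmapped field names instead of dropping them silently."""
    mapped, unmapped = {}, []
    for sf_key, value in sf_fields.items():
        gh_key = FIELD_KEY_MAP.get(sf_key)
        if gh_key is None:
            unmapped.append(sf_key)
        elif value not in (None, ""):
            mapped[gh_key] = {"value": value}
    return mapped, unmapped
```

Logging `unmapped` per record is what surfaces template-driven fields your mapping spreadsheet missed.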

SuccessFactors adds another wrinkle: nonreportable `JobApplication` custom fields are stored as CLOB data, and Recruiting OData APIs do not allow updating a property to `null`. ([help.sap.com](https://help.sap.com/docs/successfactors-platform/sap-successfactors-api-reference-guide-odata-v2/candidate-and-candidatebackground))

### Screening Question Responses

SuccessFactors stores screening question responses via `JobApplicationQuestionResponse` entities linked to `JobReqScreeningQuestion`. Greenhouse has no native screening question import path. Your options:

- Import as **custom candidate fields** (if the data is simple key-value)
- Import as **notes** on the candidate's activity feed (preserves the data but loses structure)
- Discard (if the data is no longer relevant)
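If you take the notes route, a small flattener (field names here reflect an assumed SF response shape, not a documented one) keeps the Q&A readable in the activity feed:

```python
def screening_responses_to_note(responses):
    """Flatten screening Q&A pairs into one note body, labeled with
    its origin so recruiters know why the note exists."""
    lines = ["Screening responses (migrated from SuccessFactors):"]
    for r in responses:
        lines.append(f"Q: {r['question']}")
        lines.append(f"A: {r.get('answer') or '(no answer)'}")
    return "\n".join(lines)
```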

## Pre-Migration Planning

### Data Audit Checklist

Before extracting anything, inventory what you have:

- [ ] **Candidates** — total count, active vs. inactive, duplicate rate (query by email)
- [ ] **Applications** — count per status (active, hired, rejected), distribution across requisitions
- [ ] **Job Requisitions** — open, closed, draft. Map departments and locations to GH structure
- [ ] **Offers** — which offer fields are populated? Any custom offer fields?
- [ ] **Attachments** — total file count and aggregate size. Resume vs. cover letter vs. other
- [ ] **Custom fields** — list all custom fields on Candidate, Application, and Requisition entities
- [ ] **Screening questions** — are responses stored? Do they need to migrate?
- [ ] **Interview notes/comments** — format and volume
- [ ] **Downstream integrations** — webhooks, HRIS links, job board connections

Audit the **live** templates, not a stale field inventory. SuccessFactors only exposes configured fields in OData metadata. Greenhouse's support docs are inconsistent around application custom-field availability, so confirm features in the target tenant before you finalize the mapping. ([help.sap.com](https://help.sap.com/docs/successfactors-platform/sap-successfactors-api-reference-guide-odata-v2/candidate-and-candidatebackground))

### Define Migration Scope

Not everything needs to migrate. Common exclusions:

- Draft or cancelled requisitions with zero applications
- Candidates rejected >3 years ago (check GDPR/CCPA retention policies)
- Internal-only notes that reference other HRIS data
- Test candidates or sandbox data

### Migration Strategy

| Strategy | When to Use | Risk Level |
|---|---|---|
| **Big bang** | Small dataset (<10k candidates), clean data, low integration complexity | Medium |
| **Phased by department** | Multiple hiring teams, need validation gates between waves | Low |
| **Phased by status** | Migrate active candidates first, historical second | Low |
| **Incremental** | Ongoing sync needed during transition period | High (requires delta tracking) |

A "big bang" cutover over a weekend is standard for ATS migrations because it prevents split-brain scenarios where recruiters work in two systems at once. For larger datasets, phased approaches reduce risk at the cost of a longer coexistence period.

## Step-by-Step Migration Process

### Step 1: Extract from SuccessFactors

Use the OData v2 API with OAuth 2.0 authentication to extract Recruiting entities. HTTP Basic Auth for SuccessFactors APIs has been deprecated. The Integration Center can also produce scheduled CSV exports via SFTP for supplementary data. ([help.sap.com](https://help.sap.com/docs/successfactors-platform/sap-successfactors-api-reference-guide-odata-v2/authentication-using-oauth-2-0))

```python
BASE_URL = "https://{your-instance}.successfactors.com/odata/v2"

def extract_candidates(headers, page_size=1000):
    url = f"{BASE_URL}/Candidate?$top={page_size}&$select=candidateId,firstName,lastName,cellPhone,email&$expand=jobsApplied"
    all_candidates = []
    while url:
        data = extract_with_rate_limit(url, headers)
        all_candidates.extend(data.get('d', {}).get('results', []))
        url = data.get('d', {}).get('__next', None)
    return all_candidates
```

Follow the server's pagination links (`__next`) instead of reconstructing URLs. SAP documents snapshot-based pagination (`paging=snapshot`) and custom page sizing but warns not to mix `paging=snapshot` with `$top` and `$skip`. ([help.sap.com](https://help.sap.com/docs/SAP_SUCCESSFACTORS_PLATFORM/d599f15995d348a1b45ba5603e2aba9b/93ef8631b93b4d58be235b047dae2b57.html?locale=en-US))

### Step 2: Transform Data

The transformation layer handles:

- **Deduplication:** Merge duplicate Candidate records by email address
- **Field mapping:** Convert SF field names and data types to GH equivalents
- **Picklist normalization:** Map SF picklist values to GH dropdown options or create new ones
- **Relationship reconstruction:** Build the Candidate → Application → Job graph
- **Attachment processing:** Download files, base64-encode for GH upload
- **Data cleaning:** Strip invalid characters, normalize phone numbers, validate email formats

Store raw payloads and source IDs in a staging layer before any transformation. Preserve a crosswalk for `source_candidate_id`, `source_application_id`, and `source_job_req_id`.
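A minimal crosswalk implementation using SQLite, which is usually enough for a one-time migration; the schema is a sketch, not a prescription:

```python
import sqlite3

def init_crosswalk(db_path=":memory:"):
    """Crosswalk mapping immutable source IDs to Greenhouse IDs. The
    loader consults this table before creating anything, which is what
    makes reruns idempotent."""
    conn = sqlite3.connect(db_path)
    conn.execute("""
        CREATE TABLE IF NOT EXISTS crosswalk (
            entity    TEXT NOT NULL,   -- candidate | application | job
            source_id TEXT NOT NULL,   -- SuccessFactors ID
            target_id TEXT,            -- Greenhouse ID once loaded
            PRIMARY KEY (entity, source_id)
        )""")
    return conn

def record_mapping(conn, entity, source_id, target_id):
    conn.execute("INSERT OR REPLACE INTO crosswalk VALUES (?, ?, ?)",
                 (entity, source_id, target_id))
    conn.commit()

def lookup(conn, entity, source_id):
    row = conn.execute(
        "SELECT target_id FROM crosswalk WHERE entity=? AND source_id=?",
        (entity, source_id)).fetchone()
    return row[0] if row else None
```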

### Step 3: Load into Greenhouse

Load order matters. Greenhouse enforces referential integrity:

1. **Create Jobs** — `POST /v3/jobs` or configure manually. If creating via API, Greenhouse requires an existing job to be used as the template for the new job. ([support.greenhouse.io](https://support.greenhouse.io/hc/en-us/articles/360013450271-Template-jobs-and-custom-HRIS-integrations))
2. **Create Candidates** — `POST /v3/candidates` with first_name, last_name, email, phone, custom fields
3. **Create Applications** — `POST /v3/candidates/{candidate_id}/applications` linking to a job_id
4. **Upload Attachments** — `POST /v3/candidates/{candidate_id}/attachments` with base64 content
5. **Add Notes** — `POST /v3/candidates/{candidate_id}/activity_feed/notes` for interview comments
6. **Move Applications to correct stage** — `PUT /v3/applications/{id}/advance` or `PUT /v3/applications/{id}/move`
7. **Reject historical applications** — `POST /v3/applications/{id}/reject` with rejection reason

```python
import time

import requests

def create_greenhouse_candidate(candidate, headers, on_behalf_of, max_retries=5):
    payload = {
        "first_name": candidate["firstName"],
        "last_name": candidate["lastName"],
        "email_addresses": [{"value": candidate["email"], "type": "personal"}],
        "phone_numbers": [{"value": candidate.get("cellPhone", ""), "type": "mobile"}],
        "tags": ["sf-migration"]
    }
    # Copy headers so the caller's dict isn't mutated between candidates
    headers = {**headers, "On-Behalf-Of": str(on_behalf_of)}
    for attempt in range(max_retries):
        response = requests.post(
            "https://harvest.greenhouse.io/v3/candidates",
            json=payload,
            headers=headers
        )
        if response.status_code != 429:
            break
        # Bounded retries instead of unbounded recursion on persistent 429s
        time.sleep(int(response.headers.get('Retry-After', 10)))
    response.raise_for_status()
    # Re-fetch: create responses can be truncated
    candidate_id = response.json().get("id")
    return requests.get(
        f"https://harvest.greenhouse.io/v3/candidates/{candidate_id}",
        headers=headers
    ).json()
```

> [!NOTE]
> Greenhouse's `POST: Add Candidate` and `POST: Add Candidate Application` can return truncated responses before the full record is available. Re-fetch created records by ID before chaining dependent operations like application creation or attachment upload.

### Step 4: Validate Data

Validation happens at multiple levels:

| Check | Method | Pass Criteria |
|---|---|---|
| Candidate count | Compare SF export total vs. GH `GET /candidates` count | ±0 variance |
| Application count | Compare SF JobApplication count vs. GH Application count | ±0 variance |
| Email accuracy | Sample 100 records, verify email in GH matches SF | 100% match |
| Attachment presence | Sample 50 candidates with resumes, verify in GH UI | 100% downloadable |
| Stage accuracy | Sample 50 active applications, verify current stage | 100% match |
| Custom field values | Sample 50 records per custom field | 100% match |
| Job assignment | Verify every Application links to correct Job | 100% match |

Validate in the **Greenhouse UI**, not just via API responses — what the API returns and what the recruiter sees can differ.
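The count checks in the table reduce to a small comparison you can rerun after every load wave; the numbers come from your extract manifest on one side and paginated Greenhouse list calls on the other:

```python
def validate_counts(sf_counts, gh_counts):
    """Compare per-entity record counts and return only the
    discrepancies, so an empty dict means the wave passed."""
    report = {}
    for entity, expected in sf_counts.items():
        actual = gh_counts.get(entity, 0)
        if actual != expected:
            report[entity] = {"expected": expected, "actual": actual,
                              "delta": actual - expected}
    return report
```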

### Step 5: UAT and Delta Cutover

1. Run a pilot migration with 100–200 candidates from one department
2. Have the recruiting team verify data in the Greenhouse UI
3. Document every discrepancy
4. Run a second pilot at 1,000+ candidates across multiple jobs
5. Get sign-off from recruiting ops before the production run
6. After approval, run a delta extract for records changed since the baseline
7. Switch users to Greenhouse

### Error Handling Architecture

Your loader should:

- Respect `Retry-After` and `X-RateLimit-Remaining` headers on every call
- Use a central rate-limit controller (token bucket or queue)
- Store request and response metadata per object for audit
- Keep dead-letter queues for records that fail after max retries
- Be idempotent on rerun — if a run fails halfway, restarting should not create duplicates
- Re-fetch created Greenhouse objects before chaining dependent calls
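A token-bucket controller covering the first two bullets, tuned for the v1/v2 window by default — replace the constants with whatever your `X-RateLimit-Limit` header reports at runtime:

```python
import time

class TokenBucket:
    """Central rate-limit controller. Defaults mirror Harvest v1/v2's
    50 requests per 10 seconds; adjust from X-RateLimit-* headers."""
    def __init__(self, capacity=50, window_seconds=10.0):
        self.capacity = capacity
        self.rate = capacity / window_seconds  # tokens replenished per second
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def _refill(self):
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now

    def acquire(self):
        """Block until one request token is available, then consume it."""
        self._refill()
        if self.tokens < 1:
            time.sleep((1 - self.tokens) / self.rate)
            self._refill()
        self.tokens -= 1
```

Call `acquire()` before every Harvest request; because all loader threads share one bucket, bursts stay under the window limit regardless of how the work is parallelized.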

## Edge Cases and Challenges

### Duplicate Records

SuccessFactors may have duplicate Candidate records with slightly different emails (john@company.com vs. j.doe@company.com). Greenhouse's auto-merge feature matches on email during bulk import — but not during API-based creation. You must deduplicate before loading via the API. Be careful: auto-merge is useful until it merges something you intended to keep separate for validation. ([support.greenhouse.io](https://support.greenhouse.io/hc/en-us/articles/360053674012-Bulk-import-candidates-from-spreadsheet))

### Multi-Application Candidates

A single Candidate in SuccessFactors can have multiple `JobApplication` records across different `JobRequisition` entries. This maps cleanly to Greenhouse's model (one Candidate, many Applications), but you must maintain `candidateId` as a consistent key throughout the pipeline to avoid creating duplicate Candidate records in GH.
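A get-or-create guard keyed on the SF `candidateId` makes this mechanical; `create_fn` stands in for your actual Harvest create call (hypothetical signature):

```python
def get_or_create_candidate(sf_candidate, crosswalk, create_fn):
    """Idempotent load keyed on the SF candidateId: a candidate with
    five applications triggers exactly one create, and the other four
    applications reuse the cached Greenhouse id."""
    sf_id = sf_candidate["candidateId"]
    gh_id = crosswalk.get(sf_id)
    if gh_id is None:
        gh_id = create_fn(sf_candidate)
        crosswalk[sf_id] = gh_id
    return gh_id
```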

### Interview History

Greenhouse does not allow backdating interview events — only scorecards can be backdated. If you need historical interview data in Greenhouse, it must be imported as notes on the activity feed, not as native scheduled interviews.

### GDPR Consent Auto-Emails

When importing candidates into Greenhouse jobs configured for GDPR compliance, Greenhouse automatically emails consent requests. For bulk historical imports, either disable this setting or use Greenhouse's recommended "container job" method — create a single historical job and import all legacy candidates there.

### Missing Fields in OData

If a SuccessFactors field is not configured in the active RCM template, it won't appear in metadata or queries. An old mapping spreadsheet can reference fields that simply don't exist in the API response. ([help.sap.com](https://help.sap.com/docs/successfactors-platform/sap-successfactors-api-reference-guide-odata-v2/candidate-and-candidatebackground))

### Job Mapping Misses in Bulk Import

In CSV bulk import, candidates linked to unmapped jobs are silently skipped — they're not flagged as errors. ([support.greenhouse.io](https://support.greenhouse.io/hc/en-us/articles/360053674012-Bulk-import-candidates-from-spreadsheet))

### Truncated API Create Responses

Greenhouse create endpoints can return before the full record is available. If you chain dependent calls (create candidate → create application using the returned ID), re-fetch the created record first to avoid working with incomplete data.

## Limitations and Constraints

### Greenhouse vs. SuccessFactors

- **No true custom objects.** SuccessFactors' MDF framework supports custom objects. Greenhouse has custom fields on Candidate and Application objects, but no user-defined entity types. If SF uses deep background-element structures or template-specific subforms, that data must be flattened into custom fields, serialized into notes, or preserved in an external archive.
- **Application custom fields are tier-gated.** Scripts built for a lower tier will silently miss this data.
- **No structured screening questions on import.** SF's prescreening data has no 1:1 target.
- **No interview backdating.** Only scorecards can be backdated.
- **Attachment URLs are ephemeral.** Documents retrieved from Greenhouse are served via signed AWS S3 URLs that expire in 7 days.

### SuccessFactors Extraction Constraints

- **MDF data exports via Admin Center are limited to 500 MB per file.** Larger datasets require SFTP export (up to 1 GB).
- **Recruiting OData entities are template-dependent.** The available fields on `JobRequisition` and `JobApplication` depend on your RCM template configuration.
- **OAuth 2.0 is required.** HTTP Basic Auth for SuccessFactors APIs was deprecated. ([help.sap.com](https://help.sap.com/docs/successfactors-platform/sap-successfactors-api-reference-guide-odata-v2/restricting-odata-api-access-through-basic-authentication))
- **Nonreportable application fields are CLOB data.** OData APIs do not allow setting these properties to `null`.
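
Concretely, the OAuth requirement means your extractor starts by exchanging a SAML assertion for a bearer token. A sketch of the form fields for that token call, based on SAP's documented SAML 2.0 bearer flow; verify parameter names and the assertion source (`/oauth/idp` or your own IdP) against your instance:

```python
def token_request(client_id: str, company_id: str, saml_assertion: str) -> dict:
    """Form fields for the SuccessFactors token call
    (POST {api_host}/oauth/token). Parameter names follow SAP's
    SAML 2.0 bearer flow; confirm against your tenant's config."""
    return {
        "client_id": client_id,
        "company_id": company_id,
        "grant_type": "urn:ietf:params:oauth:grant-type:saml2-bearer",
        "assertion": saml_assertion,
    }
```

Tokens expire, so build refresh into the extraction loop rather than fetching one token per run.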

## Rollback Planning

Tag all migrated records with a consistent label (e.g., `sf-migration`). If rollback is needed, filter by tag and bulk-delete via the Harvest API. Export your existing Greenhouse data before the migration as well — protect what you already have.

Preserve raw source extracts, ID crosswalk tables, and a precise cutover timestamp so you can replay deltas or prove exactly what moved.
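
Selecting the rollback set is then a tag filter. A minimal sketch, assuming candidate records shaped like Harvest responses with a `tags` list:

```python
def rollback_ids(candidates: list, tag: str = "sf-migration") -> list:
    """IDs of migrated candidates, selected by the migration tag.
    Feed these to Harvest delete calls if rollback is needed."""
    return [c["id"] for c in candidates if tag in c.get("tags", [])]
```

Keep the returned ID list in your run logs too; it doubles as the manifest of exactly what the migration created.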

## Post-Migration Tasks

- **Rebuild interview plans and scorecards.** These are configured per Job in Greenhouse and cannot be migrated — they must be set up manually or via API.
- **Configure offer approval workflows.** Greenhouse's offer system differs from SuccessFactors.
- **Reconnect integrations.** Job board postings, background check providers, and HRIS sync. (You may need to integrate Greenhouse back into SuccessFactors for employee onboarding.)
- **Train recruiting teams.** Greenhouse's structured hiring model differs from SuccessFactors. Budget 1–2 weeks for team onboarding.
- **Monitor for data inconsistencies.** Run validation checks weekly for the first month. Watch for duplicates, missing attachments, and stage mismatches.
- **Keep SuccessFactors accessible in read-only form** until full business sign-off.

## Best Practices

- **Back up everything before migration.** Export your full SuccessFactors Recruiting dataset and your existing Greenhouse data.
- **Run at least two test migrations.** First: 100 candidates. Second: 1,000+ candidates across multiple jobs. Include ugly records — not just happy-path data.
- **Build for Harvest v3 from day one.** v1/v2 are deprecated August 31, 2026.
- **Tag all migrated records** with a consistent identifier for traceability and rollback.
- **Disable GDPR consent auto-emails** during bulk import to avoid sending consent requests to historical candidates.
- **Download attachments synchronously** with candidate extraction — URLs on both sides can expire.
- **Validate in the Greenhouse UI**, not just via API responses.
- **Preserve source IDs.** Store `candidateId` and `applicationId` from SuccessFactors as custom fields in Greenhouse for audit and troubleshooting.
- **Use `keyed_custom_fields`.** Field labels are for humans; keys are for migrations. ([developers.greenhouse.io](https://developers.greenhouse.io/harvest.html))
- **Code to response headers, not hard-coded constants.** Respect `Retry-After` and `X-RateLimit-Remaining` on every call.
- **Make your loader idempotent.** If a run fails halfway, you need to restart without creating duplicates.
- **Document every compromise.** If stage history or custom-object depth cannot be recreated natively, archive it deliberately and tell stakeholders before go-live.
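
The header-driven backoff from the list above can be a small pure function, which keeps it testable without hitting the API. A sketch, assuming the `Retry-After` and `X-RateLimit-Remaining` headers the article describes:

```python
def backoff_seconds(headers: dict, default: float = 2.0) -> float:
    """Wait time before the next request, driven by response headers
    instead of hard-coded constants."""
    retry_after = headers.get("Retry-After")
    if retry_after is not None:
        try:
            return max(float(retry_after), 0.0)
        except ValueError:
            pass  # HTTP-date form of Retry-After would need date parsing
    if headers.get("X-RateLimit-Remaining") == "0":
        return default  # budget exhausted but no explicit wait given
    return 0.0
```

Call it after every response and sleep for the returned value; the same helper works for both the SuccessFactors and Greenhouse sides of the pipeline.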

## Sample Data Mapping Table

| SuccessFactors Field | Entity | Greenhouse Field | Object | Notes |
|---|---|---|---|---|
| `candidateId` | Candidate | Custom field `source_candidate_id` | Candidate | Immutable reconciliation key |
| `firstName` | Candidate | `first_name` | Candidate | Direct map |
| `lastName` | Candidate | `last_name` | Candidate | Direct map |
| `email` | Candidate | `email_addresses[].value` | Candidate | Normalize format, deduplicate |
| `cellPhone` | Candidate | `phone_numbers[].value` | Candidate | Strip formatting |
| `jobReqId` | JobRequisition | `job_id` | Job | Map via lookup table; jobs must exist first |
| `jobTitle` | JobRequisition | `name` | Job | Direct map |
| `department` | JobRequisition | `departments[].name` | Job | Must match GH department list |
| `location` | JobRequisition | `offices[].name` | Job | Must match GH office list |
| `applicationId` | JobApplication | Custom field `source_application_id` | Application | Needed for rollback and audit |
| `applicationStatus` | JobApplication | `status` | Application | Map to GH status (active/rejected/hired) |
| `offerAmount` | JobOffer | Custom fields | Offer | GH offer fields are limited |
| Resume file | Attachment | `attachments[].filename` | Candidate Attachment | Base64-encode; download fresh |
| Interview notes | Comments | `notes[].body` | Activity Feed Note | Loses structure; becomes plain text |

These mappings depend on the live template configuration in SuccessFactors and the enabled custom-field features in Greenhouse. Confirm both before building. ([help.sap.com](https://help.sap.com/docs/successfactors-platform/sap-successfactors-api-reference-guide-odata-v2/candidate-and-candidatebackground))
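
The candidate rows of the table translate to a small transform function. A sketch only: the custom-field key and the exact `custom_fields` payload shape vary by tenant and Harvest version, so confirm both before relying on it:

```python
def to_greenhouse_candidate(sf: dict) -> dict:
    """Translate one SuccessFactors candidate record into a Greenhouse
    Harvest candidate payload, following the mapping table above.
    The custom-field key `source_candidate_id` is an assumption;
    match whatever key exists in your tenant."""
    phone = "".join(
        ch for ch in sf.get("cellPhone", "") if ch.isdigit() or ch == "+"
    )
    return {
        "first_name": sf["firstName"],
        "last_name": sf["lastName"],
        "email_addresses": [
            {"value": sf["email"].strip().lower(), "type": "personal"}
        ],
        "phone_numbers": [{"value": phone, "type": "mobile"}],
        "custom_fields": [
            {"name_key": "source_candidate_id", "value": str(sf["candidateId"])},
        ],
    }
```

Keeping the transform as one pure function per entity makes the test migrations above trivial to assert against.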

## When to Use a Managed Migration Service

**Build in-house when:**
- You have a dedicated engineer with ATS API experience
- The dataset is small (<5,000 candidates) with minimal custom fields
- You have 4–6 weeks of engineering bandwidth

**Use a managed service when:**
- Your dataset exceeds 10,000 candidates with attachments
- You have complex custom fields or template-driven data in SuccessFactors
- Your engineering team doesn't have ATS API experience
- The migration needs to complete in days, not sprints
- You can't afford silent data loss (broken relationships, missing resumes, dropped custom fields)

The hidden cost of DIY migrations isn't the initial script — it's the 3–4 iterations of debugging edge cases, handling API throttling, processing failed attachment uploads, and reconciling record counts that don't match. We've seen teams spend 6–8 engineering weeks on what they estimated at 2. Greenhouse's own docs describe active-candidate migration as a manual process — a signal that the product alone does not remove the execution burden. ([support.greenhouse.io](https://support.greenhouse.io/hc/en-us/articles/360040034991-Active-candidate-migration))

At ClonePartner, we've run [1,200+ migrations](https://clonepartner.com/blog/blog/how-we-run-migrations-at-clonepartner/) including complex HRIS-to-ATS moves. Our pipelines handle SuccessFactors OData extraction with built-in rate-limit management, Greenhouse Harvest v3 loading with attachment processing, and automated validation — typically completing enterprise migrations in days. We treat ATS migrations as relationship-preservation projects, not row-copy exercises. If you're evaluating build vs. buy, see our broader analysis in [In-House vs. Outsourced Data Migration](https://clonepartner.com/blog/blog/in-house-vs-outsourced-data-migration/).

## What to Do Next

Start with the data audit. Query your SuccessFactors instance to understand your Recruiting entity structure — the OData metadata endpoint will show you exactly which fields are available on your `JobRequisition` and `JobApplication` templates. Count your candidates, applications, and attachments. That inventory determines whether you need a CSV import, an API-based pipeline, or expert help.

> Need help migrating from SuccessFactors to Greenhouse? Our engineers will review your data model and map out a migration plan — no obligation.
>
> [Talk to us](https://cal.com/clonepartner/meet?duration=30&utm_source=blog&utm_medium=button&utm_campaign=demo_bookings&utm_content=cta_click&utm_term=demo_button_click)

## Frequently asked questions

### What are the API rate limits for SuccessFactors OData and Greenhouse Harvest APIs?

SuccessFactors OData APIs are commonly cited at 40 requests per second, though SAP says limits vary by API — code to the `Retry-After` header. Greenhouse Harvest API v1/v2 allows 50 requests per 10-second window; v3 uses a 30-second fixed window. Both return HTTP 429 when exceeded.

### How many candidates can I bulk import into Greenhouse at once?

Greenhouse recommends a maximum of 8,000 candidates per bulk import batch. The resume .zip file has a 5 GB size limit. The bulk import tool strips relational data like interview history and scorecards, and candidates linked to unmapped jobs are silently skipped.

### Is Greenhouse Harvest API v1 being deprecated?

Yes. Greenhouse Harvest API v1 and v2 will be deprecated and unavailable after August 31, 2026. All new integrations should use Harvest v3, which requires OAuth 2.0 authentication and uses cursor-based pagination.

### Can I migrate interview history from SuccessFactors to Greenhouse?

Partially. Greenhouse does not allow backdating interview events — only scorecards can be backdated. Interview notes and comments from SuccessFactors must be imported as notes on the Greenhouse activity feed, losing their structured format.

### Do Greenhouse Application custom fields work on all subscription tiers?

No. Application custom fields are gated by subscription tier. Greenhouse's own support articles conflict on exactly which tiers support them. Verify in your actual tenant before finalizing field mappings — if your tier doesn't support them, the API will silently drop the data.
