Cloud Import Overview

This guide explains how to use Cloud Import, which batch-imports large volumes of data into Mixpanel from a cloud bucket. The API lets you set up a connector for import, check its status, fetch a list of all connectors, and delete specific connectors.

How to Use Cloud Import

1. Prepare the data in Mixpanel format

  • Data must be in NDJSON format: one event per line, including all of its properties
  • Ideally, data should be split into multiple files
    • Suggested file size for optimal performance: 500MB-1GB per file
    • There is a hard limit of 2GB per file (the job will be cancelled if the bucket contains any file of 2GB or larger)
  • No restriction on file names
  • For more information on how to format data for Cloud Import, please refer to the Cloud Import Data Preparation Guide
  • Supported data formats are:
  • Example event data is shown below. More examples can be found in Sample Data
{"event": "Ring Acquired", "properties": {"distinct_id": "Sauron", "time": 1601251200, "location": "Mount Doom"}}
{"event": "Ring Acquired", "properties": {"distinct_id": "Isildur", "time": 1601431446, "location": "Mount Doom", "previous_ring_bearer": "Sauron"}}
{"event": "Ring Acquired", "properties": {"distinct_id": "Sméagol", "time": 1601559543, "location": "Gladden Fields", "previous_ring_bearer": "Déagol"}}
{"event": "Ring Acquired", "properties": {"distinct_id": "Bilbo", "time": 1601661989, "location": "Misty Mountains", "previous_ring_bearer": "Sméagol"}}
{"event": "Ring Acquired", "properties": {"distinct_id": "Frodo", "time": 1601720421, "location": "Shire", "previous_ring_bearer": "Bilbo"}}
{"event": "Ring Acquired", "properties": {"distinct_id": "Sméagol", "time": 1601813532, "location": "Mount Doom", "previous_ring_bearer": "Frodo"}}

2. Upload the data into a bucket, and configure permissions

3. Create a Mixpanel project & generate service account

Create a Service Account in your project settings. Service Accounts used for Cloud Import need the Admin project role. Once the account is created, be sure to copy the Username and Secret. You can create Service Accounts with your own role level or below, and Service Accounts do not expire. Please refer to the Service Account documentation for more information.

Create New Service Account

Settings for Service Accounts for Cloud Import

4. Create a Connector for Import

To create a connector for import, POST a JSON blob to the /connectors/ endpoint as follows:

📘

Notice for EU customers

Please use https://eu.mixpanel.com/api/app/projects/{PROJECT_ID}/connectors/

You can also use the Connectors API Reference page to help build the cURL command and to find more details on the supported parameters.

curl -X POST https://mixpanel.com/api/app/projects/123456789/connectors/ --user "Samoyed.mp-service-account:a1mfy35GlmSbxGZbRuEl3CYXqYr71XCc" --data '{"connector_type":"gcsImport",  "connector_properties": { "gcs_bucket": "cloud-import-mp-demo", "gcs_prefix": "events_dir/signup","gcs_region": "us-east1"}, "category_properties": { "format": "mixpanel_event", "compression": "none"}}'

{
    "status": "ok",
    "results": {
        "connector_id": "232f40cc-7czc-4da3-b9c3-d8c1d14142c6",
        "label": "",
        "connector_type": "gcsImport",
        "connector_properties": {
            "gcs_bucket": "cloud-import-mp-demo",
            "gcs_prefix": "events_dir/signup",
            "gcs_region": "us-east1"
        },
        "category_properties": {
            "format": "mixpanel_event",
            "compression": "none"
        },
        "status": "active",
        "created_at": "2020-09-17T02:10:33.799738Z",
        "created_by": "[email protected]"
    }
}
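
For scripted setups, the cURL call above could be mirrored in Python roughly as shown here. This is a sketch using only the standard library: the function names and the `eu` flag are invented for illustration, while the URLs, the payload fields, and the use of the service account Username and Secret as basic-auth credentials follow the example above.

```python
import base64
import json
import urllib.request

US_BASE = "https://mixpanel.com/api/app"
EU_BASE = "https://eu.mixpanel.com/api/app"  # EU customers use this host


def build_create_connector_request(project_id, username, secret,
                                   bucket, prefix, region, eu=False):
    """Build (but do not send) the POST request that creates a GCS import connector."""
    base = EU_BASE if eu else US_BASE
    url = f"{base}/projects/{project_id}/connectors/"
    payload = {
        "connector_type": "gcsImport",
        "connector_properties": {
            "gcs_bucket": bucket,
            "gcs_prefix": prefix,
            "gcs_region": region,
        },
        "category_properties": {"format": "mixpanel_event", "compression": "none"},
    }
    # Equivalent of curl's --user "username:secret" (HTTP basic auth).
    token = base64.b64encode(f"{username}:{secret}".encode("utf-8")).decode("ascii")
    # No Content-Type is set, so urllib falls back to the same default that
    # curl --data uses (application/x-www-form-urlencoded).
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Authorization": f"Basic {token}"},
        method="POST",
    )


def create_connector(request):
    """Send the request and return the connector_id from the response."""
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["results"]["connector_id"]
```

Keeping request construction separate from sending makes it easy to inspect the URL and payload before anything is created in the project.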

📘

Please save the connector_id from the response

If you forget it, don't worry: you can still retrieve it by fetching the list of all connectors :)

5. Check on the status of the job

For a given connector, you can check its run history and details using the /connectors/{connector_id}/history endpoint.

For a completed job, the response reports the number of files and events processed. For a failed job, it gives the reason why the job failed. Please refer to the Connectors API Reference for more details.

Statuses:

  • success: all items have been imported and are available for viewing in the project. A report on import statistics is provided
  • queued: request has been scheduled and is waiting to be run
  • running: data is currently being imported
  • failed: the import could not be processed due to a non-retryable error. Please refer to the error message in the response for more details

progress is a float value, with a range of [0,1]
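
One way to act on these statuses is a small polling helper like the sketch below. It is illustrative only: the function names, the polling interval, and the injected `fetch_latest_run` callable are assumptions, and the code accepts both "success" (as listed above) and "succeeded" (as it appears in the sample response) because the guide uses both spellings.

```python
import time

PENDING_STATUSES = {"queued", "running"}
# Both spellings are accepted; which one the API returns is not confirmed here.
TERMINAL_STATUSES = {"success", "succeeded", "failed"}


def describe_run(status, progress):
    """Turn a status and a progress value in [0, 1] into a short message."""
    if not 0.0 <= progress <= 1.0:
        raise ValueError(f"progress must be within [0, 1], got {progress}")
    percent = round(progress * 100)
    if status in PENDING_STATUSES:
        return f"{status}: {percent}% done, check again later"
    if status == "failed":
        return "failed: see the error message in the response"
    if status in TERMINAL_STATUSES:
        return "succeeded: import statistics are available"
    return f"unrecognised status {status!r}"


def wait_for_run(fetch_latest_run, poll_seconds=30.0, max_polls=120, sleep=time.sleep):
    """Call fetch_latest_run() until the run reaches a terminal status.

    fetch_latest_run is a caller-supplied function returning a dict with at
    least a "status" key, for example the newest entry of the history response.
    """
    for _ in range(max_polls):
        run = fetch_latest_run()
        if run["status"] in TERMINAL_STATUSES:
            return run
        sleep(poll_seconds)
    raise TimeoutError("run did not reach a terminal status in time")
```

Passing the fetch function in as an argument keeps the polling logic independent of how the history endpoint is actually called.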

curl -X GET https://mixpanel.com/api/app/projects/123456789/connectors/232f40cc-7ccc-4da3-b9c3-d8c1d14142c6/history --user "Samoyed.mp-service-account:a1mfy35GlmSbxGZbRuEl3CYXqYr71XCc"

{
    "status": "ok",
    "results": {
        "url": null,
        "has_more": false,
        "data": [{
            "id": "",
            "start_time": "2020-09-16 07:00:00 +0000 UTC",
            "end_time": "2020-09-17 02:17:49.831793 +0000 UTC",
            "status": "succeeded",
            "progress": 1,
            "error": {
                "message": "",
                "code": ""
            },
            "connector_properties": {
                "num_events_imported": 13,
                "num_events_processed": 13,
                "num_events_dropped": 0,
                "num_files_imported": 4,
                "size_imported": 1610
            }
        }]
    }
}
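
A response shaped like the one above can be reduced to the few numbers that usually matter. The sketch below assumes the exact field layout of this sample response; the `summarize_history` name and the keys of the summary dictionaries are made up for illustration.

```python
def summarize_history(response):
    """Extract status and import statistics for each run in a history response."""
    summaries = []
    for run in response["results"]["data"]:
        stats = run.get("connector_properties") or {}
        error = run.get("error") or {}
        summaries.append({
            "status": run["status"],
            "progress": run["progress"],
            "events_imported": stats.get("num_events_imported"),
            "events_dropped": stats.get("num_events_dropped"),
            "files_imported": stats.get("num_files_imported"),
            # An empty message, as in a successful run, is reported as None.
            "error_message": error.get("message") or None,
        })
    return summaries
```

Comparing `events_imported` with `events_dropped` for each run is a quick way to spot imports that finished but silently discarded malformed events.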

6. Verify imported data in Mixpanel project

Go to your Mixpanel reports. Note that data imported via Cloud Import will not be displayed under LiveView.

The easiest way to check the data is to view All Events in an Insights report. Use a bar chart with a date range wide enough to cover the imported data.

Imported Events

Example of Imported Profile

