Amazon Web Services

To set up a raw export pipeline to an S3 bucket from Mixpanel, you must configure S3 to receive the exported data, then create a pipeline to export the data.

This document summarizes the steps to edit S3 bucket permissions so that the bucket accepts the Mixpanel export. Consult the AWS documentation for any AWS-specific tasks, such as creating an S3 bucket and editing permissions.

To prepare S3 for the incoming data you must:

  1. Create an S3 bucket.
  2. Give Mixpanel the required permissions to write to the bucket.

S3 Bucket Permissions

Mixpanel supports a wide range of configurations to secure and manage your data on S3. To access resources, the pipeline uses AWS cross-account roles.

This section highlights the permissions you must give Mixpanel depending on the configuration of the target S3 bucket.

Data Modification Policy

All exports from Mixpanel to AWS require that you create a new data modification policy, or add the following permissions to an existing data modification policy.

Replace <BUCKET_NAME> with your bucket name before inserting this JSON:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "SomeSidYouChoose",
            "Effect": "Allow",
            "Action": [
                "s3:PutObject",
                "s3:GetObject",
                "s3:ListBucket",
                "s3:DeleteObject"
            ],
            "Resource": [
                "arn:aws:s3:::<BUCKET_NAME>",
                "arn:aws:s3:::<BUCKET_NAME>/*"
            ]
        }
    ]
}
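
If you prefer to generate the policy rather than edit the JSON by hand, the substitution can be sketched in Python as below. This is only an illustration: the function name `data_modification_policy` and the default Sid are assumptions made for this example, not part of Mixpanel's or AWS's tooling. The actions and resource ARNs are copied from the policy above.

```python
import json


def data_modification_policy(bucket_name: str, sid: str = "MixpanelExportAccess") -> dict:
    """Build the data modification policy shown above for one bucket.

    The function name and default Sid are hypothetical; pick any Sid you like.
    """
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": sid,
                "Effect": "Allow",
                "Action": [
                    "s3:PutObject",
                    "s3:GetObject",
                    "s3:ListBucket",
                    "s3:DeleteObject",
                ],
                # The bucket itself, plus every object inside it.
                "Resource": [bucket_arn, f"{bucket_arn}/*"],
            }
        ],
    }


if __name__ == "__main__":
    # Prints JSON that can be pasted into the IAM policy editor.
    print(json.dumps(data_modification_policy("example-s3-export"), indent=4))
```

Generating the document this way keeps the two Resource entries in sync, which is the part most easily mistyped when editing by hand.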

Server-Side Encryption

Mixpanel always sends data to your S3 bucket on a TLS encrypted connection. To secure your data at rest on S3, you can enable Server-Side Encryption (SSE).

There are two options when using SSE: Encryption with Amazon S3-Managed Keys (SSE-S3) and Encryption with AWS KMS-Managed Keys (SSE-KMS).

Encryption with Amazon S3-Managed Keys (SSE-S3)

This setting on your bucket encrypts data at rest using the AES-256 algorithm that uses keys managed by S3.

If you are using this type of SSE, you only need to configure your pipeline by passing the s3_encryption=aes parameter when calling the Mixpanel Data Warehouse Export API. See AWS S3 and Glue Parameters.

Encryption with AWS KMS-Managed Keys (SSE-KMS)

You have a choice of keys if you use the Key Management Service (KMS).

For S3 buckets, you can pick a default key named aws/s3. If you opt to use the default key you don’t need any further configuration on AWS, and only need to configure your pipeline by passing s3_encryption=kms when calling the Mixpanel Data Warehouse Export API.

If you choose to use your own custom keys for encrypting the contents of your bucket, you will need to allow Mixpanel to use the key to encrypt the data properly as it is written to your bucket.

To achieve this, create an IAM policy that gives permission to Mixpanel to use the KMS key. Use the following JSON snippet and replace <KEY_ARN> with your custom key’s ARN:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "SomeSidYouChooseAgain",
            "Effect": "Allow",
            "Action": [
                "kms:Decrypt",
                "kms:Encrypt",
                "kms:GenerateDataKey",
                "kms:ReEncryptTo",
                "kms:GenerateDataKeyWithoutPlaintext",
                "kms:DescribeKey",
                "kms:ReEncryptFrom"
            ],
            "Resource": "<KEY_ARN>"
        }
    ]
}

You must configure your pipeline by passing s3_encryption=kms and s3_kms_key_id=<KEY_ARN> when calling the Mixpanel Data Warehouse Export API.
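
The three encryption cases described in this section (SSE-S3, SSE-KMS with the default aws/s3 key, and SSE-KMS with a custom key) map to pipeline parameters roughly as in the sketch below. The helper name `encryption_params` and its arguments are hypothetical. Only the parameter names `s3_encryption` and `s3_kms_key_id` and the values `aes` and `kms` come from this document.

```python
from typing import Dict, Optional


def encryption_params(mode: str, kms_key_arn: Optional[str] = None) -> Dict[str, str]:
    """Return the encryption-related pipeline parameters for one SSE setup.

    mode        -- "aes" for SSE-S3, "kms" for SSE-KMS
    kms_key_arn -- ARN of a custom KMS key; omit it to use the default aws/s3 key
    """
    if mode == "aes":
        if kms_key_arn is not None:
            raise ValueError("a KMS key ARN only applies when mode is 'kms'")
        return {"s3_encryption": "aes"}
    if mode == "kms":
        params = {"s3_encryption": "kms"}
        if kms_key_arn is not None:
            # Custom key: the role also needs the KMS policy shown above.
            params["s3_kms_key_id"] = kms_key_arn
        return params
    raise ValueError(f"unsupported encryption mode: {mode!r}")


if __name__ == "__main__":
    print(encryption_params("aes"))
    print(encryption_params("kms"))
    print(encryption_params("kms", "<KEY_ARN>"))
```

The returned dictionary can be merged into the rest of the parameters you send when creating the pipeline.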

S3 Access Role

After creating the policies in the sections above, create a cross-account IAM role and attach the policies to it.

  • Go to the AWS IAM service on the console.
  • Click Roles in the sidebar.
  • Click Create Role.
  • Select Other AWS Accounts on the trust policy page and enter "485438090326" for the account ID.
  • In the Permissions page, find and select the policies you created above.
  • In the Review page, enter a name and description for the role and click Save.

Next, limit the trust relationship to the Mixpanel export user to ensure only Mixpanel has the ability to assume this specific role.

  • Navigate to the AWS IAM service in the console.
  • Click Roles in the sidebar.
  • Find and click the role you just created.
  • Navigate to the Trust Relationships tab.
  • Click Edit trust relationship.
  • Replace the contents with the following JSON:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::485438090326:user/mixpanel-export"
      },
      "Action": "sts:AssumeRole",
      "Condition": {}
    }
  ]
}
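
To confirm that the trust relationship was narrowed as described, you can compare the role's trust policy document against the expected principal. The following is a minimal sketch that assumes you have already obtained the trust policy as a Python dictionary (for example, by copying it from the console). The function name `is_limited_to_mixpanel` is made up for this example, and the check covers only the principal, not the other fields.

```python
MIXPANEL_EXPORT_PRINCIPAL = "arn:aws:iam::485438090326:user/mixpanel-export"


def is_limited_to_mixpanel(trust_policy: dict) -> bool:
    """Return True only if every statement names the Mixpanel export user alone."""
    statements = trust_policy.get("Statement", [])
    if isinstance(statements, dict):
        # IAM also accepts a single statement object instead of a list.
        statements = [statements]
    if not statements:
        return False
    for statement in statements:
        principal = statement.get("Principal")
        # Reject wildcard principals ("*") and any non-AWS principal types.
        if not isinstance(principal, dict) or set(principal) != {"AWS"}:
            return False
        aws = principal["AWS"]
        principals = [aws] if isinstance(aws, str) else list(aws)
        if principals != [MIXPANEL_EXPORT_PRINCIPAL]:
            return False
    return True


if __name__ == "__main__":
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": MIXPANEL_EXPORT_PRINCIPAL},
                "Action": "sts:AssumeRole",
                "Condition": {},
            }
        ],
    }
    print(is_limited_to_mixpanel(policy))
```

A role that still trusts the whole account (a principal ending in `:root`) fails this check, which is the state the role is in before the edit described above.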

Use The Data Pipelines API

After permissions have been granted, use the Data Pipelines API to create the pipeline.

curl https://data.mixpanel.com/api/2.0/nessie/pipeline/create \
-u API_SECRET: \
-d type="s3-raw" \
-d from_date="2019-08-10" \
-d s3_bucket="example-s3-export" \
-d s3_region="us-west-2" \
-d s3_prefix="test" \
-d s3_role="arn:aws:iam::<account-id>:role/example-s3-role" \
-d schema_type="multischema" \
-d s3_encryption="aes" \
-d data_format="json"
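
The same request can be prepared from Python with only the standard library. This sketch mirrors the curl call above under two assumptions: that `-u API_SECRET:` means HTTP Basic authentication with the API secret as the user name and an empty password, and that the `-d` fields are sent as a form-encoded POST body, which is how curl treats those flags. The helper name `build_create_request` is hypothetical, and the code only builds the request without sending it.

```python
import base64
import urllib.parse
import urllib.request

PIPELINE_CREATE_URL = "https://data.mixpanel.com/api/2.0/nessie/pipeline/create"


def build_create_request(api_secret: str, params: dict) -> urllib.request.Request:
    """Build (but do not send) the pipeline-create request shown in the curl example."""
    body = urllib.parse.urlencode(params).encode("ascii")
    # Basic auth: secret as user name, empty password, like `curl -u API_SECRET:`.
    token = base64.b64encode(f"{api_secret}:".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        PIPELINE_CREATE_URL,
        data=body,
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


if __name__ == "__main__":
    request = build_create_request(
        "API_SECRET",
        {
            "type": "s3-raw",
            "from_date": "2019-08-10",
            "s3_bucket": "example-s3-export",
            "s3_region": "us-west-2",
            "s3_prefix": "test",
            "s3_role": "arn:aws:iam::<account-id>:role/example-s3-role",
            "schema_type": "multischema",
            "s3_encryption": "aes",
            "data_format": "json",
        },
    )
    print(request.get_method(), request.full_url)
    print(request.data.decode("ascii"))
```

To actually create the pipeline, pass the request to `urllib.request.urlopen` after replacing the placeholder secret and role ARN with real values.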

Response

Upon success, the pipeline will create the resulting JSON or Parquet file at this location:
example-s3-export/test/<project_id>/2019/08/10/full_day/
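
If downstream jobs need to locate a given day's files, the location can be derived from the pipeline parameters. The layout in this sketch (bucket, prefix, project ID, year/month/day, then `full_day/`) is inferred solely from the single example path above, so treat it as an assumption and verify it against your own bucket. The function name `export_location` is hypothetical.

```python
from datetime import date


def export_location(bucket: str, prefix: str, project_id: str, day: date) -> str:
    """Return the expected location of one day's export, per the example above."""
    # Zero-padded year/month/day segments, e.g. 2019/08/10.
    return f"{bucket}/{prefix}/{project_id}/{day:%Y/%m/%d}/full_day/"


if __name__ == "__main__":
    print(export_location("example-s3-export", "test", "<project_id>", date(2019, 8, 10)))
```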

Updated 6 months ago

