Data Discrepancy FAQ

A data discrepancy occurs when what you see in Mixpanel differs from what you have in your object storage or data warehouse. This page answers common questions about data discrepancies.

What are the different types of data discrepancy?

  1. Some of the events you wanted to export are not showing up in the destination
  2. Some of the properties are being excluded from the exported data
  3. The number of events in Mixpanel doesn't match the number of events in the destination

Why are some of the events not being exported to the destination?

This normally happens when an implementation problem generates a very large number of distinct event names (e.g., you are tracking events named eventName-uuid to Mixpanel), which causes the export process to exceed a limit in the target destination, such as the maximum number of tables. In these cases, we try to identify the offending patterns and exclude them from the export process. We always try to communicate this to customers through their Customer Success Managers.

Why are some of the properties not being exported to the destination?

This normally happens when an implementation problem generates a very large number of distinct property names (e.g., you are tracking events with properties like newproduct{timestamp} to Mixpanel), which causes the export process to exceed a limit in the target destination, such as the maximum number of columns in a table.
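Both problems above come from names that embed a dynamic value. As a rough way to spot them, the sketch below collapses UUIDs and long digit runs into placeholders and reports templates that expand into many distinct names. The helper names (toTemplate, findExplodingTemplates), the regular expressions, and the threshold are illustrative assumptions, not part of Mixpanel's export process.

```javascript
// Hypothetical helpers for spotting dynamically generated event or property names.
// The patterns below are assumptions about what "dynamic" looks like in your data.
const UUID_RE = /[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}/gi;
const LONG_DIGITS_RE = /\d{6,}/g; // e.g. epoch timestamps or long numeric IDs

// Replace dynamic parts of a name with placeholders.
function toTemplate(name) {
  return name.replace(UUID_RE, '{uuid}').replace(LONG_DIGITS_RE, '{number}');
}

// Return the templates that expand into more than `threshold` distinct names.
function findExplodingTemplates(names, threshold) {
  const byTemplate = new Map();
  for (const name of new Set(names)) {
    const template = toTemplate(name);
    if (template === name) continue; // nothing dynamic detected in this name
    byTemplate.set(template, (byTemplate.get(template) || 0) + 1);
  }
  return [...byTemplate.entries()]
    .filter(([, count]) => count > threshold)
    .map(([template, distinctNames]) => ({ template, distinctNames }));
}
```

You could feed this the list of event names (or property names) from your project and review any template it reports before the number of tables or columns in the destination becomes a problem.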

Why doesn't the number of events in Mixpanel match the number of events exported to my destination?

This can happen for several reasons:

  • Data Sync is not enabled or not supported for your pipeline.
  • You are not counting the number of events in Mixpanel correctly.
  • You are not counting the number of events in your destination correctly.

If none of these apply, there is a small chance that a regression in our data export stack is causing the mismatch. In that case, please contact support so we can investigate and resolve the issue as soon as possible.
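Before contacting support, it can help to narrow down which days actually differ. The following is a minimal sketch, assuming you have already collected per-day event counts from Mixpanel and from your destination as plain objects keyed by date; the function name diffDailyCounts and the input shape are hypothetical.

```javascript
// Hypothetical helper: compare per-day event counts from two sources.
// Both inputs are assumed to be objects like { '2016-01-04': 100 }.
function diffDailyCounts(mixpanelCounts, destinationCounts) {
  const days = new Set([
    ...Object.keys(mixpanelCounts),
    ...Object.keys(destinationCounts),
  ]);
  const mismatches = [];
  for (const day of [...days].sort()) {
    const inMixpanel = mixpanelCounts[day] || 0;       // missing day counts as 0
    const inDestination = destinationCounts[day] || 0; // missing day counts as 0
    if (inMixpanel !== inDestination) {
      mismatches.push({
        day,
        inMixpanel,
        inDestination,
        delta: inDestination - inMixpanel,
      });
    }
  }
  return mismatches;
}
```

An empty result means the two sources agree for every day in the inputs; otherwise the list of days and deltas is useful detail to include in a support request.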

How do I count the number of events in Mixpanel correctly?

The number of events shown in the Mixpanel UI depends on factors such as sampling and which events are hidden in Lexicon. In particular, custom or merged events in Lexicon will not be exported. To get the right number of events exported from Mixpanel, you can run the following JQL query:

function main() {
  return Events({
    from_date: "2016-01-04",
    to_date: "2016-01-04"
  }).reduce(mixpanel.reducer.count());
}

You will need to adjust from_date and to_date to your specific date range.

Note: from_date and to_date are in your Mixpanel project's timezone. This is important as you will get the number of events in that timezone and should adjust your data warehouse queries to reflect that as well.
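To illustrate why the timezone matters, here is a minimal sketch that assigns events to calendar days in a given timezone. It assumes event times are available as UTC epoch seconds (check your destination's schema, since this may not hold for your pipeline); the function names projectDay and countByProjectDay are hypothetical.

```javascript
// Hypothetical helper: calendar day (YYYY-MM-DD) of a UTC epoch timestamp
// as seen in the given IANA timezone, e.g. 'America/Los_Angeles'.
function projectDay(epochSeconds, timeZone) {
  const parts = new Intl.DateTimeFormat('en-US', {
    timeZone,
    year: 'numeric',
    month: '2-digit',
    day: '2-digit',
  }).formatToParts(new Date(epochSeconds * 1000));
  const get = (type) => parts.find((p) => p.type === type).value;
  return `${get('year')}-${get('month')}-${get('day')}`;
}

// Count events per day in the project's timezone.
function countByProjectDay(epochSecondsList, timeZone) {
  const counts = {};
  for (const t of epochSecondsList) {
    const day = projectDay(t, timeZone);
    counts[day] = (counts[day] || 0) + 1;
  }
  return counts;
}
```

An event at 2016-01-04 07:59:59 UTC falls on January 4th in UTC but on January 3rd in America/Los_Angeles, so a count for the same day can differ depending on which timezone the query uses.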

To get the number of events for specific event names, you can use the following JQL query:

function main() {
  // Count all signups and purchases by users with email addresses
  // from Yahoo or Gmail between January 1st and January 2nd,
  // grouped by event name
  return Events({
    from_date: '2016-01-01',
    to_date: '2016-01-02',
    event_selectors: [
        {event: 'signup', label: 'Signup'},
        {event: 'purchase', selector: '"yahoo" in properties["$email"]',
            label: 'Purchase (Yahoo)'},
        {event: 'purchase', selector: '"gmail" in properties["$email"]',
            label: 'Purchase (Gmail)'}
    ]
  }).groupBy(['name'], mixpanel.reducer.count());
}

See Write JQL to get more information about writing JQL queries.

How do I count the number of events in data warehouses correctly?

We use a different partitioning method for each data warehouse, so the right query to count events in your Mixpanel project's timezone may differ from what you expect. Consult the following links to get the right query.

