Augmenting Collected Events

Upon entering the CDP, all collected events are transformed in real time to add information or to remove sensitive user information before they are persisted. The following sections describe the data we augment events with.

Metadata

The following event metadata is derived and available for all events.

| Metadata | Location on the tracking event | Example |
| --- | --- | --- |
| Timestamp (readable) | meta.timestamp | 2022-07-07T16:09:17.809Z |
| Timestamp in milliseconds (epoch unix format) | meta.timestampMillis | 1657210157809 |
| Is a valid event | meta.isValidEvent | true |
| User ID | meta.userId | {"id": "1pCookie", "id": "c6ac2829223e182cc225b2278a2e2622"} |
| Request made by a bot/crawler | meta.fromBot | false |
| Request made by device with a blacklisted user-agent | meta.isBlacklisted | false |
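
The two timestamp fields describe the same instant in different formats. As a minimal sketch, the readable value can be reproduced from the millisecond value in Python. The helper name `millis_to_readable` is hypothetical and is not part of the CDP.

```python
from datetime import datetime, timezone


def millis_to_readable(timestamp_millis: int) -> str:
    """Format Unix epoch milliseconds as an ISO 8601 UTC string with millisecond precision."""
    seconds, millis = divmod(timestamp_millis, 1000)
    # Build the datetime from whole seconds to avoid floating-point rounding.
    moment = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return f"{moment:%Y-%m-%dT%H:%M:%S}.{millis:03d}Z"


# Value taken from the meta.timestampMillis example above.
print(millis_to_readable(1657210157809))  # 2022-07-07T16:09:17.809Z
```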

IP Address

For requests sent directly from the user's device through one of our SDKs, we use the IP address of the device that executed the HTTP request. For server-to-server events, or events received through one of the supported webhooks, the IP address has to be explicitly stated as part of the request. Some third-party webhook integrations might not support sending the IP address. Refer to the documentation of each webhook for information on how to send IP addresses.

Once extracted, the IP address is persisted in the event under the field meta.ip.

IP Anonymization

IP addresses are automatically anonymized by default.

Anonymization occurs as early as possible, before any logging or persistence takes place, and the full IP address is never stored. The principle is simple: every IP address found on the request has its last octet set to 0. For example, 123.22.22.14 becomes 123.22.22.0.
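
The last-octet rule can be sketched in a few lines of Python. This is an illustration only, not the CDP's actual implementation: the function name `anonymize_ipv4` is hypothetical, and the sketch covers IPv4 addresses only, since IPv6 handling is not described here.

```python
import ipaddress


def anonymize_ipv4(ip: str) -> str:
    """Return the IPv4 address with its last octet set to 0."""
    address = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    masked = int(address) & 0xFFFFFF00   # clear the lowest 8 bits (the last octet)
    return str(ipaddress.IPv4Address(masked))


print(anonymize_ipv4("123.22.22.14"))  # 123.22.22.0
```

Masking the integer form rather than splitting the string on dots means malformed input is rejected before anything is returned.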

Geolocation

The IP address from the user tracking event is analyzed so that we can derive geolocation information. The information obtained through this process is purposely imprecise in order to avoid tracing the address or location of a particular user.

| Derived Information | Location on the tracking event | Example |
| --- | --- | --- |
| Country | meta.country | United States of America |
| City | meta.city | New York |
| Location (latitude and longitude) | meta.location | [40.7128, 74.0060] |
| Autonomous System Number (ASN) | meta.asn | COMCAST-7922 |
| Postal Code | meta.postalCode | 32073 |

In order to provide this functionality the CDP uses GeoLite2 databases created by MaxMind, available from https://www.maxmind.com.

Entrypoint

The meta.entrypoint field can be used to identify how the event was collected on the CDP.

Under meta.entrypoint.type we label the channel through which the event reached the CDP:

  • tag - the event was collected directly by either the JavaScript tag or the Android/iOS SDK.
  • server - the event entered the CDP through the server-to-server endpoint.
  • templated - the event entered the system through one of the templated webhooks.
  • source - similar to templated but currently it only applies to the integration with Segment.

Entrypoints of type templated and source can also contain an additional field, name, which states the third-party integration from which the event originated.
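
As a rough sketch of how a consumer might read these fields, the Python helper below builds a short label from meta.entrypoint. It assumes the event's meta object is available as a plain dictionary. The function name `describe_entrypoint`, the "unknown" fallback, and the sample name value "segment" are illustrative assumptions, not documented CDP behavior.

```python
def describe_entrypoint(meta: dict) -> str:
    """Build a short label such as "tag" or "source (segment)" from meta.entrypoint."""
    entrypoint = meta.get("entrypoint") or {}
    kind = entrypoint.get("type")
    name = entrypoint.get("name")
    # Only templated and source entrypoints are expected to carry a name.
    if kind in ("templated", "source") and name:
        return f"{kind} ({name})"
    return kind or "unknown"  # "unknown" is a hypothetical fallback


print(describe_entrypoint({"entrypoint": {"type": "tag"}}))
print(describe_entrypoint({"entrypoint": {"type": "source", "name": "segment"}}))
```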

Device Information

We use the value of the User-Agent HTTP header to determine the user's device. As with the IP address, the User-Agent is taken directly from events sent using one of our SDKs, and has to be explicitly sent if the request entered through a server-to-server endpoint or through a third-party webhook.

We store the following information under meta.user-device:

  • user-agent-family
  • user-agent-major
  • user-agent-minor
  • os-family
  • os-major
  • os-minor
  • device-family
  • device-type

Interaction Type

We classify each event type according to an interaction type. This information is stored under meta.interactionType.

The interaction type can either be:

  • passive - events with type activationRequest, matchRequest, and cookieSyncRequest;
  • outbound - events with type adView, emailDelivery, and emailSend;
  • active - all the other event types.
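
The classification above amounts to a lookup with a default. A minimal Python sketch follows. The function and constant names are hypothetical, and the event type `someOtherEvent` is a placeholder standing in for any type not listed above.

```python
PASSIVE_EVENT_TYPES = frozenset({"activationRequest", "matchRequest", "cookieSyncRequest"})
OUTBOUND_EVENT_TYPES = frozenset({"adView", "emailDelivery", "emailSend"})


def interaction_type(event_type: str) -> str:
    """Map an event type to the value stored under meta.interactionType."""
    if event_type in PASSIVE_EVENT_TYPES:
        return "passive"
    if event_type in OUTBOUND_EVENT_TYPES:
        return "outbound"
    return "active"  # every other event type


print(interaction_type("matchRequest"))    # passive
print(interaction_type("emailSend"))       # outbound
print(interaction_type("someOtherEvent"))  # active
```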

Product Information

When an event contains one or more products, the received product details are complemented with the product information stored in the CDP, if a product feed is available. More information on how to enable and set up this feature can be found in the Offline Imports section.

Currency Conversion

We convert every field that contains a currency to the default system currency. We support a variety of currencies and the conversion rates used are updated daily. All fields under meta.data should have the currency converted to the system default.

We only convert the currency for fields that are part of our events schema. Custom fields are not converted. The unaltered payload with the original currency can be found under meta.rawData.

Origin

We add metadata about how the event originated under meta.origin. This field covers the factors that might have led to the event. Its sub-fields are derived either from event metadata or from the UTM parameters present in the referrer URL.

Sub-fields present under meta.origin:

  • source - Where the user came from. Possible values:
    • direct - The user accessed the website directly;
    • email - The event is related to an email event;
    • none - The event occurred while the user was navigating the website, after they had already entered through some other channel;
    • google - The user accessed the website through one of Google's channels (such as AdWords and DoubleClick);
    • Other - Any other value, derived from the UTM parameters or from the web page the user was on before being directed to your website;
  • medium - The type of traffic or tool used to get to the website;
  • campaign - Ad campaign that originated this event;
  • keyword - Any keyword related to a possible ad that originated this event;
  • content - Used to differentiate the content of a possible ad that originated this event.
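
To illustrate the UTM-based derivation, the Python sketch below extracts the five sub-fields from a URL's query string. It is not the CDP's actual logic. The mapping from UTM parameter names to sub-fields (in particular `utm_term` to keyword) is an assumption based on common UTM conventions, the function name `origin_from_utm` is hypothetical, and the fixed values such as direct, email, and none are not handled.

```python
from urllib.parse import parse_qs, urlparse

# Assumed mapping from standard UTM parameters to meta.origin sub-fields.
UTM_TO_ORIGIN_FIELD = {
    "utm_source": "source",
    "utm_medium": "medium",
    "utm_campaign": "campaign",
    "utm_term": "keyword",
    "utm_content": "content",
}


def origin_from_utm(url: str) -> dict:
    """Collect the UTM parameters found in the URL, keyed by origin sub-field."""
    query = parse_qs(urlparse(url).query)
    origin = {}
    for utm_param, field in UTM_TO_ORIGIN_FIELD.items():
        values = query.get(utm_param)
        if values:
            origin[field] = values[0]  # keep the first value if a parameter repeats
    return origin


print(origin_from_utm("https://example.com/?utm_source=google&utm_medium=cpc&utm_campaign=summer"))
# {'source': 'google', 'medium': 'cpc', 'campaign': 'summer'}
```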