AOQL

AOQL is the language developed and used by Velocidi to query event databases. Its main objective is to provide a set of operations common to all data storage backends in use, such as relational databases and Elasticsearch.

The language is a subset of SQL and its semantics try to mimic the latter's behavior on a standard relational database. It is intended to be suitable both for "table scanning" queries (e.g. selections, filters) and aggregation queries (with GROUP BY). It most notably differs from SQL due to the absence of subqueries – AOQL intends to be targeted at data retrieval with a low server-side computational cost.

Using AOQL#

AOQL can be used through gRPC operations exposed by the Event Storage module. The endpoint in which each service resides is client-dependent. The Event Storage gRPC API documentation is available here.

AOQL Relations#

Through the Event Storage module, the Velocidi Private CDP provides querying access to three AOQL relations. Each relation exposes an event dataset, whose schema is not necessarily the same as the real schema used by their backend. The representation of a full record ( SELECT * ) may be different from the representation of individual projections, both for efficiency and convenience purposes.

Match Events#

The relation providing access to match events is named MatchEvents and can be accessed using the matchQuerySource gRPC operation. Its star (full record) projection is a MatchEvent.

Columns#

  • requestId (String) : Optional. the identifier of the request that triggered the match event
  • providerId (String) : the identifier of the match provider;
  • userIds (Set[UserId]) : the set of matched user IDs;
  • attributes (UserAttributes) : the attributes of the user after the match operation;
  • timestamp (Long) : the timestamp of the match, in milliseconds.

Examples#

Return a stream of distinct providers of matches in January 2018:

SELECT DISTINCT providerId FROM MatchEvents WHERE timestamp >= 1514764800000 AND timestamp < 1517443200000

Return an all-time count of events per match provider:

SELECT providerId, COUNT(*) FROM MatchEvents GROUP BY providerId

Activation Events#

The relation providing access to activation events is named ActivationEvents and can be accessed using the activationQuerySource gRPC operation. Its star (full record) projection is an ActivationEvent.

Columns#

  • requestId (String) : Optional. the identifier of the request that triggered the activation event
  • receiverId (String) : the identifier of the data receiver;
  • userId (UserId) : the activated user ID;
  • linkedUserIds (Set[UserId]) : the user IDs linked with the activated one at the time of the activation;
  • activatedAttributes (UserData) : the activated attributes before the application of any mapping rule;
  • timestamp (Long) : the timestamp of the activation, in milliseconds.

Examples#

Return a stream of distinct user IDs of matches in January 2018:

SELECT DISTINCT userId FROM ActivationEvents WHERE timestamp >= 1514764800000 AND timestamp < 1517443200000

Return an all-time count of events per data receiver:

SELECT receiverId, COUNT(*) FROM ActivationEvents GROUP BY receiverId

CookieSync Events#

The relation providing access to cookiesync events is named CookieSyncEvents and can be accessed using the cookieSyncQuerySource gRPC operation. Its star (full record) projection is a CookieSyncEvent.

Columns#

  • requestId (String) : Optional. The identifier of the request that triggered the Cookie Sync event;
  • cookieSyncId (String) : the identifier of the cookie sync configuration;
  • primaryUserId (String) : the primary user id used when syncing;
  • syncedUserId (SyncedUserId) : the id to which the primary user id was synced to;
  • link (Boolean) : whether the request signaled the intent to link the two user ids internally;
  • timestamp (Long) : the timestamp of the cookie sync, in milliseconds.

Examples#

Return a stream of distinct primary user ids used in syncing in January 2018:

SELECT DISTINCT primaryUserId FROM CookieSyncEvents WHERE timestamp >= 1514764800000 AND timestamp < 1517443200000

Return an all-time count of cookie syncs per cookie sync ID:

SELECT cookieSyncId, COUNT(*) FROM CookieSyncEvents GROUP BY cookieSyncId

Tracking Events#

The relation providing access to tracking events is named TrackingEvents and can be accessed using the trackingQuerySource gRPC operation. Its star (full record) projection is a JSON object (the exact type depends on the programming language).

The JSON object contains all the information of an HTTP request, as well as metadata externally collected or extracted from the event itself. The "Columns" section below contains some of the most common fields.

Columns#

  • meta.timestamp (Long) : the timestamp of the event, in milliseconds;
  • meta.user (UserId) : the tracked user ID;
  • meta.type (String) : the type of the tracking event;
  • meta.clientId (String) : the client ID specified in the request;
  • meta.siteId (String) : the site ID specified in the request;
  • meta.country (String) : the country where the request was made, if available;
  • meta.cookie.sticky (Boolean) : whether the browser has cookies enabled or not, as detected by the server;
  • meta.user–device.os–family (String) : the operating system provided in the User–Agentheader, as detected by the server;
  • uri.query.<name> (String) : the value of the query param name, if available;
  • headers.<name> (String) : the value of the HTTP header name, if available.

Examples#

Return a stream of distinct tracked user IDs in January 2018:

SELECT DISTINCT meta.user FROM TrackingEvents WHERE meta.timestamp >= 1514764800000 AND meta.timestamp < 1517443200000

Return an all-time count of tracking events per client ID:

SELECT meta.clientId, COUNT(*) FROM TrackingEvents GROUP BY meta.clientId

Error Events#

The relation providing access to error events is named ErrorEvents and can be accessed using the errorQuerySource gRPC operation. Its star (full record) projection is an ErrorEvent.

Columns#

  • entityId (String) : the id of the entity that caused the error;
  • entityType (String) : the type of the entity that caused the error;
  • errorType (ErrorType) : type of error;
  • payload (Json) : complementary data for the error, in JSON;
  • timestamp (Long) : the timestamp of the error occurrence, in milliseconds.

Examples#

Return a stream of distinct type of errors in January 2018:

SELECT DISTINCT errorType FROM ErrorEvents WHERE timestamp >= 1514764800000 AND timestamp < 1517443200000

Return an all-time count of error events per type of entity that caused the error:

SELECT entityType, COUNT(*) FROM ErrorEvents GROUP BY entityType

Extended* relations#

AOQL provides an extra Extended* relation for each of the relations described above:

  • ExtendedMatchEvents;
  • ExtendedActivationEvents;
  • ExtendedCookieSyncEvents.
  • ExtendedTrackingEvents.
  • ExtendedErrorEvents.

These extended relations augment the original relations with an internal ID. The star projection becomes:

  • id
  • event

where event encapsulates all the original event columns. Thus, when using these relations, one should reference the event columns by correctly namespacing them. For example:

SELECT receiverId, COUNT(*) FROM ActivationEvents GROUP BY receiverId

should become:

SELECT event.receiverId, COUNT(*) FROM ExtendedActivationEvents GROUP BY event.receiverId

when using the ActivationEvents extended version.

These extra relations allows one to inspect the internal event IDs. These IDs can be used on subsequent queries to match against a specific event. For example:

Return the latest match:

SELECT * FROM ExtendedMatchEvents ORDER BY event.timestamp DESC limit 1

Use a given ID to match against a specific match event:

SELECT * FROM ExtendedMatchEvents WHERE id='2030'
Last updated on