Indicator

Indicators compute datasets on one or several data sources in order to evaluate their data quality.

Table: indicator

| Attribute | Type | Description |
| --- | --- | --- |
| id | Integer | Unique identifier of the indicator, used as a primary key. |
| name | Text | Name of the indicator, must be unique. |
| description | Text | Description of the indicator. |
| execution_order | Integer | Numeric value to indicate in which order the indicator should be executed within a batch. |
| flag_active | Boolean | Boolean value to indicate if the indicator is active or inactive. Inactive indicators are not computed when a batch is executed. Default value is False. |
| created_date | Timestamp | Record created date. |
| updated_date | Timestamp | Record last updated date. |
| created_by_id | Integer | Foreign key of the user table, to indicate which user created the record. |
| updated_by_id | Integer | Foreign key of the user table, to indicate which user updated the record. |
| user_group_id | Integer | Foreign key of the user_group table, to indicate which user group the record belongs to. |
| indicator_type_id | Integer | Foreign key of the indicator_type table, to indicate the type of the indicator. |
| indicator_group_id | Integer | Foreign key of the indicator_group table, to indicate which group the indicator belongs to. |
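
The roles of flag_active and execution_order can be illustrated with a minimal sketch. It assumes indicators are held as plain dictionaries mirroring a few columns of the indicator table; the function name `indicators_to_run` and the sample indicator names are hypothetical and not part of MobyDQ.

```python
def indicators_to_run(indicators):
    """Return the active indicators of a batch, sorted by execution_order."""
    # Inactive indicators are skipped; flag_active defaults to False.
    active = [i for i in indicators if i.get("flag_active", False)]
    return sorted(active, key=lambda i: i["execution_order"])


# Hypothetical rows of the indicator table.
indicators = [
    {"id": 1, "name": "orders_completeness", "execution_order": 2, "flag_active": True},
    {"id": 2, "name": "orders_freshness", "execution_order": 1, "flag_active": True},
    {"id": 3, "name": "legacy_check", "execution_order": 0},  # no flag set, treated as inactive
]

run_order = [i["name"] for i in indicators_to_run(indicators)]
print(run_order)
```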

Indicator Group

Indicator groups define collections of indicators to be computed in the same batch.

Table: indicator_group

| Attribute | Type | Description |
| --- | --- | --- |
| id | Integer | Unique identifier of the group, used as a primary key. |
| name | Text | Name of the group, must be unique. |
| created_date | Timestamp | Record created date. |
| updated_date | Timestamp | Record last updated date. |
| created_by_id | Integer | Foreign key of the user table, to indicate which user created the record. |
| updated_by_id | Integer | Foreign key of the user table, to indicate which user updated the record. |
| user_group_id | Integer | Foreign key of the user_group table, to indicate which user group the record belongs to. |

Indicator Type

Indicator types describe the type of indicators MobyDQ can compute. The indicator_type table stores the classes and methods used for the computation. Supported types of indicators are:

  • Completeness
  • Freshness
  • Latency
  • Validity

Completeness

A completeness indicator connects to two different data sources, a source and a target. It computes a dataset on each of the data sources and compares their results. For each measure of each record in both datasets, MobyDQ computes the difference in percentage as follows:

(Measure from target request - Measure from source request) / Measure from source request.

It compares this result with the alert operator and threshold defined in the indicator parameters and triggers an alert if the condition is met. The comparison with the alert threshold is done in absolute value.
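
The computation above can be sketched as follows. This is a minimal illustration, not MobyDQ's actual code: the function name `completeness_deltas`, the record layout and the sample values are assumptions. The sketch returns the ratio exactly as the formula is written; whether the result is scaled by 100 is not stated here.

```python
def completeness_deltas(source_record, target_record, measures):
    """Return {measure: (target - source) / source} for one pair of records."""
    deltas = {}
    for measure in measures:
        source_value = source_record[measure]
        target_value = target_record[measure]
        # A source value of 0 raises ZeroDivisionError; the formula does not cover that case.
        deltas[measure] = (target_value - source_value) / source_value
    return deltas


# Hypothetical records returned by the source request and the target request.
source_record = {"country": "FR", "nb_orders": 200}
target_record = {"country": "FR", "nb_orders": 150}
deltas = completeness_deltas(source_record, target_record, ["nb_orders"])

# Assumed parameters: alert operator ">=" and alert threshold 0.2.
# The comparison uses the absolute value of the difference.
alerts = {measure: abs(delta) >= 0.2 for measure, delta in deltas.items()}
print(deltas, alerts)
```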

Example of Completeness Indicator: To be documented

Freshness

A freshness indicator connects to a single target data source. It computes its last updated timestamp and compares it to the current timestamp. For each record in the dataset, MobyDQ computes the difference in minutes as follows:

Current Timestamp - Last updated timestamp from target request.

It compares this result with the alert operator and threshold defined in the indicator parameters and triggers an alert if the condition is met. The value of the Measure indicator parameter must be set to ['last_update'].
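
A minimal sketch of the freshness computation, assuming the target request returns a timezone-aware `last_update` timestamp. The function name `freshness_minutes` and the sample timestamps are hypothetical.

```python
from datetime import datetime, timezone


def freshness_minutes(last_update, now=None):
    """Current timestamp minus last updated timestamp, in minutes."""
    if now is None:
        now = datetime.now(timezone.utc)
    return (now - last_update).total_seconds() / 60


# Fixed timestamps are used here so the result is reproducible.
last_update = datetime(2024, 1, 1, 8, 0, tzinfo=timezone.utc)
now = datetime(2024, 1, 1, 9, 30, tzinfo=timezone.utc)
delta = freshness_minutes(last_update, now)

# Assumed parameters: alert operator ">" and alert threshold 60 (minutes).
alert = delta > 60
print(delta, alert)
```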

Example of Freshness Indicator: To be documented

Latency

A latency indicator connects to two different data sources, a source and a target. It computes the last updated timestamp on each of them and compares their results. For each record in both datasets, MobyDQ computes the difference in minutes as follows:

Last updated timestamp from source request - Last updated timestamp from target request.

It compares this result with the alert operator and threshold defined in the indicator parameters and triggers an alert if the condition is met. The value of the Measure indicator parameter must be set to ['last_update'].
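
A minimal sketch of the latency computation, assuming each request returns one `last_update` timestamp per record and that records are matched on a shared key. The function name `latency_minutes`, the keys and the timestamps are hypothetical.

```python
from datetime import datetime


def latency_minutes(source_last_update, target_last_update):
    """Source last updated timestamp minus target last updated timestamp, in minutes."""
    return (source_last_update - target_last_update).total_seconds() / 60


# Hypothetical results of the source request and the target request, keyed by record.
source = {
    "orders": datetime(2024, 1, 1, 10, 0),
    "customers": datetime(2024, 1, 1, 10, 0),
}
target = {
    "orders": datetime(2024, 1, 1, 9, 15),
    "customers": datetime(2024, 1, 1, 10, 0),
}

# Only records present in both datasets are compared.
latencies = {
    key: latency_minutes(source[key], target[key])
    for key in source
    if key in target
}
print(latencies)
```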

Example of Latency Indicator: To be documented

Validity

A validity indicator connects to a single target data source and computes a dataset on it. For each record, it compares the measure values with the alert operator and threshold defined in the indicator parameters and triggers an alert if the condition is met.
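
The check described above can be sketched as follows. The mapping from alert operator text to Python comparison functions, the function name `validity_alerts` and the sample records are assumptions made for illustration.

```python
import operator

# Assumed mapping between the alert operator text and a Python comparison.
OPERATORS = {
    "==": operator.eq,
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
    "<>": operator.ne,
}


def validity_alerts(records, measures, alert_operator, alert_threshold):
    """Return (record, measure) pairs whose value meets the alert condition."""
    compare = OPERATORS[alert_operator]
    alerts = []
    for record in records:
        for measure in measures:
            if compare(record[measure], alert_threshold):
                alerts.append((record, measure))
    return alerts


# Hypothetical dataset returned by the target request.
records = [
    {"country": "FR", "nb_invalid_emails": 0},
    {"country": "DE", "nb_invalid_emails": 12},
]
alerts = validity_alerts(records, ["nb_invalid_emails"], ">", 0)
print(alerts)
```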

Example of Validity Indicator: To be documented

Table: indicator_type

| Attribute | Type | Description |
| --- | --- | --- |
| id | Integer | Unique identifier of the type of indicator, used as a primary key. |
| name | Text | Type of indicator, must be unique. |
| module | Text | Python module (file) used to compute this indicator type. |
| class | Text | Python class used to compute this indicator type. |
| method | Text | Python method used to compute this indicator type. |
| created_date | Timestamp | Record created date. |
| updated_date | Timestamp | Record last updated date. |
| created_by_id | Integer | Foreign key of the user table, to indicate which user created the record. |
| updated_by_id | Integer | Foreign key of the user table, to indicate which user updated the record. |

List of Indicator Type Values

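| id | name | module | class | method |
| --- | --- | --- | --- | --- |
| 1 | Completeness | completeness | Completeness | execute |
| 2 | Freshness | freshness | Freshness | execute |
| 3 | Latency | latency | Latency | execute |
| 4 | Validity | validity | Validity | execute |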
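
One way the module, class and method columns could be used is dynamic dispatch with `importlib`. The sketch below is an assumption about the mechanism, not MobyDQ's actual code: the function name `run_indicator_type` is hypothetical, and it assumes the class can be instantiated without arguments. Because the `completeness` module is not available here, the standard library's `json.JSONEncoder.encode` stands in for a row such as (completeness, Completeness, execute).

```python
import importlib


def run_indicator_type(module_name, class_name, method_name, *args, **kwargs):
    """Resolve module, class and method by name, then call the method."""
    module = importlib.import_module(module_name)
    cls = getattr(module, class_name)
    method = getattr(cls(), method_name)  # assumes a no-argument constructor
    return method(*args, **kwargs)


# Stand-in for an indicator_type row: module "json", class "JSONEncoder", method "encode".
result = run_indicator_type("json", "JSONEncoder", "encode", [1, 2, 3])
print(result)
```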

Parameter

Parameters used to compute indicators.

Table: parameter

| Attribute | Type | Description |
| --- | --- | --- |
| id | Integer | Unique identifier of the indicator parameter, used as a primary key. |
| value | Text | Indicator parameter value. |
| created_date | Timestamp | Record created date. |
| updated_date | Timestamp | Record last updated date. |
| created_by_id | Integer | Foreign key of the user table, to indicate which user created the record. |
| updated_by_id | Integer | Foreign key of the user table, to indicate which user updated the record. |
| user_group_id | Integer | Foreign key of the user_group table, to indicate which user group the record belongs to. |
| parameter_type_id | Integer | Type of parameter. The combination of parameter_type_id and indicator_id must be unique. |
| indicator_id | Integer | Foreign key of the indicator table, to indicate which indicator the parameter belongs to. The combination of parameter_type_id and indicator_id must be unique. |

Parameter Type

Parameter types describe the types of parameters that MobyDQ can use to compute indicators.

Table: parameter_type

| Attribute | Type | Description |
| --- | --- | --- |
| id | Integer | Unique identifier of the parameter type, used as a primary key. |
| name | Text | Type of indicator parameter, must be unique. |
| description | Text | Description of the parameter type. |
| created_date | Timestamp | Record created date. |
| updated_date | Timestamp | Record last updated date. |
| created_by_id | Integer | Foreign key of the user table, to indicate which user created the record. |
| updated_by_id | Integer | Foreign key of the user table, to indicate which user updated the record. |

List of Parameter Type Values

| id | name | description |
| --- | --- | --- |
| 1 | Alert operator | Operator used to compare the results of the indicator with the alert threshold. Example: ==, >, >=, <, <=, <> |
| 2 | Alert threshold | Numeric value used to evaluate the results of the indicator and determine if an alert must be sent. |
| 3 | Distribution list | List of e-mail addresses to which alerts must be sent. Example: ['email_1', 'email_2', 'email_3'] |
| 4 | Dimension | List of values to indicate dimensions in the results of the indicator. Example: ['dimension_1', 'dimension_2', 'dimension_3'] |
| 5 | Measure | List of values to indicate measures in the results of the indicator. Example: ['measure_1', 'measure_2', 'measure_3'] |
| 6 | Source | Name of the data source which serves as a reference to evaluate the quality of the data. |
| 7 | Source request | SQL query used to compute the indicator on the source system. |
| 8 | Target | Name of the data source on which to evaluate the quality of the data. |
| 9 | Target request | SQL query used to compute the indicator on the target system. |
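
Since parameter values are stored as text, list-valued parameters such as Distribution list, Dimension and Measure need to be parsed before use. The sketch below assumes the stored text follows Python list literal syntax, as the examples above suggest, and parses it with `ast.literal_eval`. The function name `parse_list_parameter` is hypothetical, and how MobyDQ itself parses these values is not stated here.

```python
import ast


def parse_list_parameter(value):
    """Parse a text parameter such as "['measure_1', 'measure_2']" into a list of strings."""
    parsed = ast.literal_eval(value)  # evaluates literals only, never arbitrary code
    if not isinstance(parsed, list) or not all(isinstance(v, str) for v in parsed):
        raise ValueError(f"Expected a list of strings, got: {value!r}")
    return parsed


measures = parse_list_parameter("['last_update']")
dimensions = parse_list_parameter("['dimension_1', 'dimension_2']")
print(measures, dimensions)
```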

Matrix of Parameter Types per Indicator Type

| Parameter Type | Completeness | Freshness | Latency | Validity |
| --- | --- | --- | --- | --- |
| Alert operator | Mandatory | Mandatory | Mandatory | Mandatory |
| Alert threshold | Mandatory | Mandatory | Mandatory | Mandatory |
| Distribution list | Mandatory | Mandatory | Mandatory | Mandatory |
| Dimension | Optional | Optional | Optional | Optional |
| Measure | Mandatory | Mandatory | Mandatory | Mandatory |
| Source | Mandatory | N/A | Mandatory | N/A |
| Source request | Mandatory | N/A | Mandatory | N/A |
| Target | Mandatory | Mandatory | Mandatory | Mandatory |
| Target request | Mandatory | Mandatory | Mandatory | Mandatory |
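
The matrix above can be turned into a simple validation step that reports which mandatory parameters are missing for an indicator. This is a sketch under stated assumptions: the function name `missing_parameters` and the constant names are hypothetical, and the matrix is encoded by hand from the table.

```python
# Parameter types that are mandatory for every indicator type.
COMMON_MANDATORY = {
    "Alert operator",
    "Alert threshold",
    "Distribution list",
    "Measure",
    "Target",
    "Target request",
}
# Parameter types that are mandatory only for indicators comparing a source and a target.
SOURCE_MANDATORY = {"Source", "Source request"}
TWO_SOURCE_TYPES = {"Completeness", "Latency"}


def missing_parameters(indicator_type, provided):
    """Return the sorted list of mandatory parameter types missing from `provided`."""
    required = set(COMMON_MANDATORY)
    if indicator_type in TWO_SOURCE_TYPES:
        required |= SOURCE_MANDATORY
    return sorted(required - set(provided))


provided = [
    "Alert operator",
    "Alert threshold",
    "Distribution list",
    "Measure",
    "Target",
    "Target request",
]
freshness_missing = missing_parameters("Freshness", provided)
completeness_missing = missing_parameters("Completeness", provided)
print(freshness_missing, completeness_missing)
```

Dimension is optional for every indicator type, so it never appears in the result.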