Indicator
Indicators compute datasets on one or several data sources in order to evaluate their data quality.
Table: indicator
Attribute | Type | Description |
---|---|---|
id | Integer | Unique identifier of the indicator, used as a primary key. |
name | Text | Name of the indicator, must be unique. |
description | Text | Description of the indicator. |
execution_order | Integer | Numeric value to indicate in which order the indicator should be executed within a batch. |
flag_active | Boolean | Boolean value to indicate if the indicator is active or inactive. Inactive indicators are not computed when a batch is executed. Default value is False. |
created_date | Timestamp | Record created date. |
updated_date | Timestamp | Record last updated date. |
created_by_id | Integer | Foreign key of the user table, to indicate which user created the record. |
updated_by_id | Integer | Foreign key of the user table, to indicate which user updated the record. |
user_group_id | Integer | Foreign key of the user_group table, to indicate to which user group the record belongs to. |
indicator_type_id | Integer | Foreign key of the indicator_type table, to indicate which type is the indicator. |
indicator_group_id | Integer | Foreign key of the indicator_group table, to indicate to which group belongs the indicator. |
Indicator Group
Indicator groups define collections of indicators to be computed in a same batch.
Table: indicator_group
Attribute | Type | Description |
---|---|---|
id | Integer | Unique identifier of the group, used as a primary key. |
name | Text | Name of the group, must be unique. |
created_date | Timestamp | Record created date. |
updated_date | Timestamp | Record last updated date. |
created_by_id | Integer | Foreign key of the user table, to indicate which user created the record. |
updated_by_id | Integer | Foreign key of the user table, to indicate which user updated the record. |
user_group_id | Integer | Foreign key of the user_group table, to indicate to which user group the record belongs to. |
Indicator Type
Indicator types describe the type of indicators MobyDQ can compute. The indicator_type
table stores the classes and methods used for the computation. Supported types of indicators are:
- Completeness
- Freshness
- Latency
- Validity
Completeness
A completeness indicator connects to two different data sources, a source and a target. It computes a dataset on each of the data sources and compares their results. For each measure of each record in both datasets, MobyDQ computes the difference in percentage as follow:
(Measure from target request - Measure from source request) / Measure from source request
.
It compares this result with the alert operator and threshold defined in the indicator parameters and triggers an alert if the condition is met. The comparison with the alert threshold is done in absolute value.
Example of Completeness Indicator: To be documented
Freshness
A freshness indicator connects to one single target data source. It computes its last updated timestamp and compares it to the current timestamp. For each record in the dataset, MobyDQ computes the difference in minutes as follow:
Current Timestamp - Last updated timestamp from target request
.
It compares this result with the alert operator and threshold defined in the indicator parameters and triggers an alert if the condition is met. The value of the Measure
indicator parameter must be set to ['last_update']
.
Example of Freshness Indicator: To be documented
Latency
A latency indicator connects to two different data sources, a source and a target. It computes the last updated timestamp on each of them and compares their results. For each record in both datasets, MobyDQ computes the difference in minutes as follow:
Last updated timestamp from source request - Last updated timestamp from target request
.
It compares this result with the alert operator and threshold defined in the indicator parameters and triggers an alert if the condition is met. The value of the Measure
indicator parameter must be set to ['last_update']
.
Example of Latency Indicator: To be documented
Validity
A validity indicator connects to one single target data source and computes a dataset on it. For each record it compares the measures values with the alert operator and threshold defined in the indicator parameters and triggers an alert if the condition is met.
Example of Validity Indicator: To be documented
Table: indicator_type
Attribute | Type | Description |
---|---|---|
id | Integer | Unique identifier of the type of indicator, used as a primary key. |
name | Text | Type of indicator, must be unique. |
module | Text | Python module (file) used to compute this indicator type. |
class | Text | Python class used to compute this indicator type. |
method | Text | Python method used to compute this indicator type. |
created_date | Timestamp | Record created date. |
updated_date | Timestamp | Record last updated date. |
created_by_id | Integer | Foreign key of the user table, to indicate which user created the record. |
updated_by_id | Integer | Foreign key of the user table, to indicate which user updated the record. |
List of Indicator Type Values
id | name | module | class | method |
---|---|---|---|---|
1 | Completeness | completeness | Completeness | execute |
2 | Freshness | freshness | Freshness | execute |
3 | Latency | latency | Latency | execute |
4 | Validity | validity | Validity | execute |
Parameter
Parameters used to compute indicators.
Table: parameter
Attribute | Type | Description |
---|---|---|
id | Integer | Unique identifier of the indicator parameter, used as a primary key. |
value | Text | Indicator parameter value. |
created_date | Timestamp | Record created date. |
updated_date | Timestamp | Record last updated date. |
created_by_id | Integer | Foreign key of the user table, to indicate which user created the record. |
updated_by_id | Integer | Foreign key of the user table, to indicate which user updated the record. |
user_group_id | Integer | Foreign key of the user_group table, to indicate to which user group the record belongs to. |
parameter_type_id | Integer | Type of parameter, the combination of parameter_type_id and indicator_id must be unique. |
indicator_id | Integer | Foreign key of the indicator table, to indicate to which indicator belongs the parameter. The combination of parameter_type_id and indicator_id must be unique. |
Parameter Type
Parameter types describe the types of parameters can be used by MobyDQ to compute indicators.
Table: parameter_type
Attribute | Type | Description |
---|---|---|
id | Integer | Unique identifier of the parameter type, used as a primary key. |
name | Text | Type of indicator parameter, must be unique. |
description | Text | Description of the parameter type. |
created_date | Timestamp | Record created date. |
updated_date | Timestamp | Record last updated date. |
created_by_id | Integer | Foreign key of the user table, to indicate which user created the record. |
updated_by_id | Integer | Foreign key of the user table, to indicate which user updated the record. |
List of Parameter Type Values
id | name | description |
---|---|---|
1 | Alert operator | Operator used to compare the results of the indicator with the alert threshold. Example: ==, >, >=, <, <=, <> |
2 | Alert threshold | Numeric value used to evaluate the results of the indicator and determine if an alert must be sent. |
3 | Distribution list | List of e-mail addresses to which alerts must be sent. Example: ['email_1', 'email_2', 'email_3'] |
4 | Dimension | List of values to indicate dimensions in the results of the indicator. Example: ['dimension_1', 'dimension_2', 'dimension_3'] |
5 | Measure | List of values to indicate measures in the results of the indicator. Example: ['measure_1', 'measure_2', 'measure_3'] |
6 | Source | Name of the data source which serves as a reference to evaluate the quality of the data. |
7 | Source request | SQL query used to compute the indicator on the source system. |
8 | Target | Name of the data source on which to evaluate the quality of the data. |
9 | Target request | SQL query used to compute the indicator on the target system. |
Matrix of Parameter Types per Indicator Type
Parameter Type | Completeness | Freshness | Latency | Validity |
---|---|---|---|---|
Alert operator | Mandatory | Mandatory | Mandatory | Mandatory |
Alert threshold | Mandatory | Mandatory | Mandatory | Mandatory |
Distribution list | Mandatory | Mandatory | Mandatory | Mandatory |
Dimension | Optional | Optional | Optional | Optional |
Measure | Mandatory | Mandatory | Mandatory | Mandatory |
Source | Mandatory | N/A | Mandatory | N/A |
Source request | Mandatory | N/A | Mandatory | N/A |
Target | Mandatory | Mandatory | Mandatory | Mandatory |
Target request | Mandatory | Mandatory | Mandatory | Mandatory |