Asseco Aggregation Engine
Data Governance Policy Template

To ensure that the results provided by an engine or engine complex are always available and correct, the input data must be monitored for availability and reviewed for completeness and accuracy. The following is a selection of policies an organization can tailor and adopt to ensure that this takes place.
Continuous Availability of Inputs

It is desirable that any input to the engine complex is continuously available, i.e., that the end of validity of one entry is the same as the start of validity of the following entry, for all generations of the element in question. This enables the engine to provide results based on data in the past, even if the calculations needed were not implemented (or indeed, imagined) at the time. A policy that ensures the continuous availability of input elements will also improve the general performance of results calculations, since fetching remote inputs can be time-consuming.

Automated Sources

Automated data sources must be configured to provide continuously valid data even in the face of temporary outages in the provider or the engine itself. Use appropriate soft validity settings for pull sources, such as database queries. Push sources are usually considered valid until a new entry is successfully uploaded.

Manual Inputs

To ensure that manual data is kept up to date, warning reports must be constructed that highlight manually sourced elements about to go out of validity. Use the CACHE_METADATA function to determine the authorization groups required, and the expiry time, for all relevant manual entries.

It is possible (with the proper authorization) to "fill in the blanks" in a timeline using the manual sourcing option, by setting a valid-from and a valid-to date for the input manually. To fill in backdated data for an automated source, temporarily convert it to a manual source, enter the backdated entry, and convert it back. Audit reports will then show that the source data was entered in the present, and by whom.

The Data Governance Policy must specify which manual inputs to monitor, and what actions to take when manual inputs become invalid.
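The logic of such a warning report can be sketched in ordinary code. The following is a minimal sketch, assuming the expiry metadata for each manual entry has already been retrieved (e.g., via CACHE_METADATA); the entry structure and the `expiring_entries` helper are illustrative assumptions, not part of AAE itself.

```python
from datetime import datetime, timedelta

# Hypothetical shape of the metadata returned for a manual entry:
# element name, required authorization group, and expiry time.
ENTRIES = [
    {"element": "FX_RATES",  "auth_group": "treasury", "expires": datetime(2011, 10, 5)},
    {"element": "HEADCOUNT", "auth_group": "hr",       "expires": datetime(2012, 1, 1)},
]

def expiring_entries(entries, now, horizon=timedelta(days=7)):
    """Return entries whose validity ends within `horizon` of `now`,
    so the responsible authorization group can be warned to re-enter them."""
    return [e for e in entries if e["expires"] <= now + horizon]

# On 2011-10-01, only FX_RATES is about to go out of validity.
warnings = expiring_entries(ENTRIES, datetime(2011, 10, 1))
```

A real warning report would additionally route each warning to the authorization group recorded in the metadata, so that the person able to re-enter the value is the one notified.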
Automated Data Validation

For data sources where the data quality is not assured, engine validation mechanisms must ensure that calculations are only based on valid data. In addition, the engine must be able to alert human users when input data fails validation, so that remedial action can be taken as soon as possible.

It is best practice to perform validation on a separate engine from the one performing calculations and reporting. This has two benefits:

1. The consuming engine can ensure that there is a continuous availability timeline for an element, even if a particular data source is temporarily invalid.
2. The validation rules are typically organization-specific, and the engines implementing them are not subject to externally sourced upgrades in the same manner as, for instance, a Solvency II SCR calculation or QRT report.

In the validation engine, create a single element for each source to be validated, with an appropriately named input and output. Use the "Create pipeline element" tool to create validation elements and reports for all sources. Define the specific validation rules in the created validation element, and determine which failure mode a specific validation failure should trigger.

The Data Governance Policy must specify the general validation policy, including failure modes for failed validations.
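The behaviour of a validation element can be sketched as follows. This is a minimal sketch under stated assumptions: the rule representation, the `FailureMode` names, and the `validate` helper are hypothetical illustrations of the rule/failure-mode idea, not AAE APIs. Note how the REJECT mode falls back to the last valid value, which is what lets the consuming engine keep a continuous timeline.

```python
from enum import Enum

class FailureMode(Enum):
    REJECT = "reject"  # keep the last valid value; do not publish the new one
    WARN = "warn"      # publish the new value, but flag it for human attention

def validate(value, rules, mode, last_valid):
    """Apply each (name, predicate) rule to `value`.
    Returns (output_value, list_of_failed_rule_names)."""
    failed = [name for name, predicate in rules if not predicate(value)]
    if failed and mode is FailureMode.REJECT:
        return last_valid, failed  # fall back, preserving a continuous timeline
    return value, failed           # valid input, or WARN mode: pass through

# Example rules for a market-data input (illustrative only).
rules = [
    ("positive",  lambda v: v > 0),
    ("plausible", lambda v: v < 1000),
]

out, failed = validate(-5.0, rules, FailureMode.REJECT, last_valid=98.6)
# -5.0 fails the "positive" rule, so the last valid value 98.6 is kept.
```

The list of failed rule names is what would drive the alerting report to human users.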
Human Approval

A human approval mechanism can ensure that input data is properly sanitized before use, and that results calculations and reports are sanity-checked before distribution.

Input Approval

Approval at the input data level ensures that errors in input data that were not (or could not be) detected by the automated validation mechanisms are caught, preventing calculations from being performed on invalid data. This improves the data quality of the final results, since no versions of the final report will be based on invalid inputs.

Input approval is, like input validation, best handled in a dedicated engine, for the same reasons. After creating the validation element, use the "Create pipeline element" tool to create approval elements for each source to be approved.

Observe that all approvals are, by their nature, manual inputs, and should thus be covered by warning reports as described under Continuous Availability of Inputs, to ensure that elements are reapproved as quickly as possible after changing. Note also that it is generally impossible to ensure a continuous timeline of an approved source within an engine, since there will always be an unavoidable gap in validity from the time an element is changed until it is reapproved. For this reason, input approvals should be used with caution, and alternate means of creating identical, but unapproved, reports using the inputs should be provided.

The Data Governance Policy must specify which inputs require approval, which roles are authorized to approve them, and what actions to take if an input is not approved.

Results Approval

Approval at the results level ensures that the final results or reports are read and agreed upon by a human reviewer before they are broadcast to the final audience (the public, supervisory agencies or the company board). Results approval can be added to the reporting engine, or managed in a dedicated engine like input approval.
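The unavoidable validity gap described under input approval above can be illustrated with a short sketch. The interval representation and the `timeline_gaps` helper are hypothetical, but the point they make is the document's: between the moment an element changes and the moment it is reapproved, no approved entry is valid.

```python
from datetime import datetime

def timeline_gaps(intervals):
    """Given (valid_from, valid_to) pairs sorted by valid_from, return
    the gaps where no approved entry is valid."""
    gaps = []
    for (_, prev_to), (next_from, _) in zip(intervals, intervals[1:]):
        if next_from > prev_to:
            gaps.append((prev_to, next_from))
    return gaps

# An element changed at 09:00 but was only reapproved at 11:30,
# leaving a 2.5-hour hole in the approved timeline.
approved = [
    (datetime(2011, 10, 1, 0, 0),   datetime(2011, 10, 3, 9, 0)),
    (datetime(2011, 10, 3, 11, 30), datetime(2011, 10, 10, 0, 0)),
]
gaps = timeline_gaps(approved)
```

It is exactly these holes that the recommended unapproved parallel reports are meant to cover.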
Results approvals are otherwise defined and configured exactly like input approvals. The Data Governance Policy must specify which results require approval, which roles are authorized to approve them, and what actions to take if a result is not approved.

Tiered Approvals

Tiered approvals can be used in scenarios where different groups of line employees are responsible for preparing and approving subsections of a report, and a department head is responsible for collating and approving a summary or compilation. Tiered approvals are configured like other approvals, by adding approvals to elements that already have approvals. The configurator should be careful to ensure that the chained approval manual sources require different authorizations, and that there is separation of duties between the roles.

When describing tiered approvals in the Data Governance Policy, it is usually best to group all the approvals together, listing the different sub-approvals after the top-level approval.
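The separation-of-duties requirement for tiered approvals can be expressed as a simple check: no authorization group may appear at more than one tier of the chain. The chain representation and the helper below are illustrative assumptions, not AAE features.

```python
def separation_of_duties_ok(approval_chain):
    """Check that no authorization group appears twice in a chain of
    tiered approvals, so no single role can approve two tiers."""
    groups = [a["auth_group"] for a in approval_chain]
    return len(groups) == len(set(groups))

# Top-level approval first, sub-approvals after, as the policy text recommends.
chain = [
    {"element": "Summary approval",   "auth_group": "department_head"},
    {"element": "Section A approval", "auth_group": "claims_team"},
    {"element": "Section B approval", "auth_group": "underwriting_team"},
]
ok = separation_of_duties_ok(chain)  # True: all tiers use distinct groups
```

A configurator could run a check like this against the configured authorization groups before signing off on a tiered-approval setup.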
Periodic Results

Organizations may want to prepare, publish and archive reports on a schedule, using inputs from specific dates. Archive reports are best managed on a separate engine, where an authorized user can create new elements and results.

To create an archive report, use the "Create new element" tool, naming both a result and a (manual) source. The archived result name should be the same as the live result name, suffixed by an appropriate identifier; for example, a quarter-end copy of the result "Balance Sheet" could be named "Balance Sheet 2011Q3". Change the manual source to be an AAE source, fetching from the live result, and use the "Create pipeline element" tool to create an intermediate copy. Change the formula in the pipeline element from LOOKUP('elementname_2011q3') to LOOKUPAT('elementname_2011q3', TIME('2011-10-01')). Change the domain of the formula (e.g., from A1 to A1:XX999) and copy the styling and labels manually from the source, if those should be included.

Note that archive reports can be created at any time after the fact, provided the live result existed (or could be calculated) for the time specified. Archive reports created using this process do not have an expiry time, since they depend only on immutable data in the past; in practice, they are valid for 100 years.

The Data Governance Policy must specify which results to archive and publish, the frequency of the process, and who is responsible for this procedure.
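The naming convention and the LOOKUP-to-LOOKUPAT rewrite above are mechanical, so an organization archiving many results each quarter may want to generate them. The following sketch derives both from a live result name, a period label and an as-of date; the `archive_formulas` helper and the lower-case/underscore element-id convention are assumptions for illustration, patterned on the 'elementname_2011q3' example in the text.

```python
def archive_formulas(live_result, period, as_of):
    """Derive the archive element name and the pipeline-element formula
    rewrite for a period-end archive copy of a live result."""
    archived_name = f"{live_result} {period}"
    # Assumed element-id convention, matching the 'elementname_2011q3' example.
    element_id = f"{live_result.lower().replace(' ', '_')}_{period.lower()}"
    live_formula = f"LOOKUP('{element_id}')"
    # Period-end snapshot: pin the lookup to a fixed point in time.
    archive_formula = f"LOOKUPAT('{element_id}', TIME('{as_of}'))"
    return archived_name, live_formula, archive_formula

name, before, after = archive_formulas("Balance Sheet", "2011Q3", "2011-10-01")
# name   == "Balance Sheet 2011Q3"
# before == "LOOKUP('balance_sheet_2011q3')"
# after  == "LOOKUPAT('balance_sheet_2011q3', TIME('2011-10-01'))"
```

The generated `after` string is what would be pasted into the pipeline element in place of the plain LOOKUP formula.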