Developing a multi-level system to assess and signal the quality of processed data: the Destination Canada case

DESTINATION MANAGEMENT ORGANIZATION

Destination Canada is a Crown corporation wholly owned by the Government of Canada and founded under the Canadian Tourism Commission Act. It is responsible for promoting Canada as a destination by attracting tourists and investment, and for balancing the benefits across local territories. The DMO's activities are supported by extensive research, which underpins evidence-based marketing and provides important information to key stakeholders, including the private sector, government, and communities.

DESTINATION CONTEXT AND NEEDS

In 2022 Destination Canada launched the Canadian Tourism Data Collective, a centralized national platform for the collection and sharing of tourism data. Powered by advanced analytics systems and AI, the platform delivers a wide range of accessible data products, including dashboards, reports and publications, maps, and interactive tools.

To conduct in-depth research on the tourism scenario of Canada, a large and multifaceted country, the Data Collective needs to rely on several data sources, including:

  • Official statistics. The platform collects more than 100 public datasets, provided by National Statistics Offices.
  • Private partnership data sources. The project can count on about 30 industry partners that actively collaborate by sharing relevant data in exchange for non-monetary benefits (e.g. logo visibility).
  • Paid private data sources. To gain a more in-depth view of the tourism sector, it is also necessary to integrate paid datasets from about 15 partners that share data under specific contractual agreements.

Given the variety of sources and formats, the project needed to be supported from the very beginning by a system to monitor data quality and report any issues.

IMPLEMENTED SOLUTION

The Data Collective adopts a structured system to keep data quality under control and to report any issues. The assessment approach is based on four key elements:

  1. Overview. The monitoring system rests on a shared understanding of key data quality variables, the most prominent of which are:
     • Completeness: the degree to which the dataset meets the expected boundaries (e.g. geographic coverage, time coverage);
     • Volume: datasets should contain enough observations to support statistically significant results;
     • Consistency: data must be internally coherent and formatted the same way throughout;
     • Freshness: data should always be up to date;
     • Validity: compliance with specific data processing formats.
  2. Context-specific standards. As the Canada case highlights, there are no “one-size-fits-all” rules for assessing the quality of a dataset: tailor-made monitoring measures need to be developed for different data sources, based on their specific characteristics. To improve the chances of receiving high-quality data, predefined quality expectations are built directly into contractual agreements with data partners.
  3. Alerting system. The platform is designed to automatically alert affected users to possible data quality issues at different levels. The solution implemented by the Data Collective is a three-level alerting system: level 1 indicates a small issue (e.g. a slightly uncommon level of outliers), level 2 indicates a mid-sized problem (e.g. a significantly unusual number of outliers), and level 3 is highly critical (e.g. the dataset is seriously incomplete). Data marked with a level 3 warning is blocked from entering the BI platform, so that the data analysis teams cannot use it to develop public dashboards until the quality issue has been handled and solved.
  4. Issue management. Level 3 alerts require careful handling. If the identified problem results from the dataset not complying with predefined rules and standards, efforts should be made to solve the issue together with the data provider.
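
The three-level alerting logic described above can be sketched in Python. This is a minimal illustration, not the Data Collective's actual implementation: the class and function names, the threshold values, and the choice of region coverage and outlier share as the only inputs are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QualityExpectations:
    """Hypothetical per-source expectations, e.g. as agreed in a contract."""
    expected_regions: frozenset
    min_region_coverage: float = 0.80   # assumed: below this, "seriously incomplete"
    minor_outlier_share: float = 0.02   # assumed: "slightly uncommon" outlier level
    major_outlier_share: float = 0.10   # assumed: "significantly unusual" outlier level


@dataclass(frozen=True)
class DatasetSummary:
    """Hypothetical summary statistics computed for an incoming dataset."""
    regions_present: frozenset
    outlier_share: float  # fraction of records flagged as outliers, 0.0 to 1.0


def alert_level(summary: DatasetSummary, expected: QualityExpectations) -> int:
    """Return 0 (no alert) or an alert level from 1 (minor) to 3 (critical)."""
    if not expected.expected_regions:
        raise ValueError("expected_regions must not be empty")

    # Completeness: share of the expected regions actually covered.
    covered = summary.regions_present & expected.expected_regions
    coverage = len(covered) / len(expected.expected_regions)

    if coverage < expected.min_region_coverage:
        return 3  # seriously incomplete dataset
    if summary.outlier_share >= expected.major_outlier_share:
        return 2  # significantly unusual number of outliers
    if summary.outlier_share >= expected.minor_outlier_share:
        return 1  # slightly uncommon level of outliers
    return 0


def may_enter_bi_platform(level: int) -> bool:
    """Level 3 data is held back until the issue is resolved."""
    return level < 3
```

With these assumed thresholds, a dataset covering all expected regions with 3% outliers would receive a level 1 alert and still reach the BI platform, while a dataset covering one expected region out of four would receive a level 3 alert and be held back.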

MAIN BENEFITS

A formalized data quality management system that is continuously applied and revised protects the reliability and reputation of the organization and its data products.
In addition, including quality monitoring in the general data intelligence project from the start made it easier to secure adequate resources for its implementation and maintenance; negotiating dedicated public funds for a retrospective implementation would have been more difficult.

MAIN CHALLENGES

  • Contracts. Signing agreements with private data providers is a challenging part of the process, as industry players tend to have very restrictive data sharing policies. Contracting is further complicated by being the stage at which data quality expectations must be defined and agreed upon.
  • Post-agreement data retention. For the long-term maintenance of data products, a key challenge is to negotiate with partners the right to use previously shared data even after the contract expiration date.
  • Representation issues. Public and official data sources raise a problem of data granularity, since not all data is shared at province-level detail. The representation challenge concerns, on one side, the different provinces and territories (13 in total) and, on the other, remote, rural, and under-represented areas.

FUTURE PERSPECTIVES

In less than two years of activity, the Data Collective Project has proven effective in strengthening Destination Canada’s primary role as a research center for the national tourism industry. The constant effort to integrate new data sources and set high quality standards paves the way for further development; for example, two new data products will be launched in 2024 with the aim of facilitating valuable connections between local destination stakeholders, potential investors, and tourists.
