7 Challenges of Mastering Clinical Data Registries: Complex Data Provenance

This series outlines the Seven Informatics Challenges for Clinical Data Registries, the questions you should ask when addressing research data management, and how our RexStudy platform is engineered to ensure your research teams generate high-quality, reliable, and statistically sound data.

Building clinical data registries (CDRs) that support acquisition, curation, and dissemination of clinical research data poses a number of unique challenges. For example, a center-level CDR system needs to accumulate data across multiple studies, time points, and data types. At the same time, it needs to support research operations workflows that are heterogeneous across projects. CDRs must also operate within a complex ecology of data sources, consumers, and governance, and that ecology creates a number of informatics challenges for anyone delivering a CDR.

Informatics Challenge #4: Complex Data Provenance

Knowing how data was collected, by whom, from whom, and under what conditions is critical for downstream use of research data. Systems must capture provenance information as part of the operational workflow and store it alongside the data it describes, so that the context is available whenever the data is reused.
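To make this concrete, here is a minimal Python sketch of one way to keep provenance attached to each collected value and export both together. Every name in it (Provenance, Observation, record_observation, export_row, and the individual fields) is hypothetical and chosen for illustration; it does not describe RexStudy's data model or API.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
from typing import Any, Optional


@dataclass(frozen=True)
class Provenance:
    """Hypothetical context captured at the moment of data collection."""
    collected_by: str        # who recorded the value
    subject_id: str          # from whom it was collected
    instrument: str          # which form or assay produced it
    instrument_version: str  # which revision of that instrument
    method: str              # under what conditions (e.g. phone, in person)
    collected_at: datetime   # when it was collected


@dataclass(frozen=True)
class Observation:
    """A single research data value that always carries its provenance."""
    name: str
    value: Any
    provenance: Provenance


def record_observation(name: str, value: Any, *, collected_by: str,
                       subject_id: str, instrument: str,
                       instrument_version: str, method: str,
                       collected_at: Optional[datetime] = None) -> Observation:
    """Create the value and its provenance in one step of the workflow."""
    when = collected_at or datetime.now(timezone.utc)
    prov = Provenance(collected_by, subject_id, instrument,
                      instrument_version, method, when)
    return Observation(name, value, prov)


def export_row(obs: Observation) -> dict:
    """Flatten the value and its provenance into one row for export."""
    row = {"name": obs.name, "value": obs.value}
    for key, val in asdict(obs.provenance).items():
        row[f"prov_{key}"] = val
    # Serialize the timestamp so the row is plain text and numbers.
    row["prov_collected_at"] = obs.provenance.collected_at.isoformat()
    return row
```

Because the provenance is created in the same call as the value, it cannot be skipped or filled in later from memory, and a single export function answers the question of whether data and context can be output together.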

Questions you should ask before building your CDR

  • What mechanisms will be available in the system to allow users to track complex provenance?

  • Is there sufficient structure and granularity in the data models to represent complex contexts for data elements?

  • If not, can it be added?

  • How is provenance data connected with research data?

  • Can the system output both in a unified way?

  • Does the data provenance strategy integrate with the schema volatility strategy?

  • How is provenance preserved when instruments and protocols change over time?

How RexStudy maintains Data Provenance and Context

RexStudy is a unified system that preserves provenance and context for all data across research operations workflows and data collection methods. Instrument definitions are easily configurable, supporting the high levels of metadata variety and volatility found in research, and they allow a systematic approach to versioning and to merging data when an instrument version changes. Research staff can configure workflows to match research operation protocols, which lets the system accommodate the inherent variability in research while keeping the user experience simple.
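As a generic illustration of versioning and merging (not a description of how RexStudy implements it), the Python sketch below maps a record collected under an older instrument version onto a newer field layout while keeping a note of the version it was originally collected under. InstrumentVersion, renamed_from, upgrade, and the record layout are all hypothetical names assumed for this example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class InstrumentVersion:
    """Hypothetical definition of one revision of a data collection form."""
    name: str
    version: int
    fields: tuple
    # Maps a field name in this version to its name in the prior version.
    renamed_from: dict = field(default_factory=dict)


def upgrade(record: dict, new: InstrumentVersion) -> dict:
    """Present an older record in the layout of `new`, keeping provenance.

    Fields that did not exist when the record was collected are set to
    None rather than guessed, and the original version is preserved.
    """
    values = {}
    for name in new.fields:
        source = new.renamed_from.get(name, name)
        values[name] = record["values"].get(source)
    return {
        "values": values,
        "collected_under": record["collected_under"],  # never overwritten
        "presented_as": (new.name, new.version),
    }


# Version 2 renames "smoker" to "tobacco_use" and adds "vaping".
intake_v2 = InstrumentVersion(
    name="intake",
    version=2,
    fields=("tobacco_use", "height_cm", "vaping"),
    renamed_from={"tobacco_use": "smoker"},
)

old_record = {
    "values": {"smoker": "yes", "height_cm": 170},
    "collected_under": ("intake", 1),
}

merged = upgrade(old_record, intake_v2)
```

The key design choice is that merging changes only how a record is presented. The "collected_under" entry still says which instrument version actually produced the data, so a later analyst can tell a question that was never asked from one that was asked and left blank.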

Don’t miss the first three parts of this series:

Part 1: Metadata Variety

Part 2: Schema Volatility

Part 3: Workflow Variability
