Most studies require multiple forms to capture all necessary participant information: a medical history form, a family history form, a demographic information form, a previous diagnoses form, etc. If any of the forms capture the same data—which is likely—there’s a potential for data discrepancies.

As researchers, our natural inclination is to capture as much information as possible in each form, even though more is not always better. When designing a form, we think, ‘Well, it can’t hurt to ask for the full name again, and maybe the birth date. Let’s add the previous diagnoses in there as well.’ What you’re left with is a study with 8-10 forms that all capture much of the same data.

Is this a problem? Well, it does introduce complexity in data entry and analysis for very little (if any) benefit.

When you capture duplicate data in different forms and the fields aren’t filled in identically, you’re left wondering which field contains the correct information. You’ll then need to create data quality checks to verify the consistency of these fields. For example, let’s pretend you collect birth date on two separate forms. Best-case scenario? The data quality check alerts you that the two birth dates differ. You’ll be forced to check the paper form or contact the participant again to determine which birth date is correct. Worst-case scenario? You don’t have data quality checks in place, so you’re analyzing the data based on a potentially incorrect birth date (ouch).
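To make the birth date example concrete, here is a minimal sketch of such a data quality check in Python. The form names, participant IDs, and the `find_discrepancies` helper are all hypothetical and made up for illustration; a real study would run an equivalent check inside its data capture system or analysis pipeline.

```python
from datetime import date

# Hypothetical records keyed by participant ID. Form and field names are
# illustrative assumptions, not taken from any particular system.
demographics = {
    "P001": {"birth_date": date(1980, 5, 17)},
    "P002": {"birth_date": date(1975, 11, 2)},
}
medical_history = {
    "P001": {"birth_date": date(1980, 5, 17)},
    "P002": {"birth_date": date(1975, 2, 11)},  # day and month swapped at entry
}


def find_discrepancies(form_a, form_b, field):
    """Return (participant_id, value_a, value_b) for every mismatch in `field`."""
    mismatches = []
    # Only participants present on both forms can be compared.
    for participant_id in sorted(form_a.keys() & form_b.keys()):
        value_a = form_a[participant_id].get(field)
        value_b = form_b[participant_id].get(field)
        if value_a != value_b:
            mismatches.append((participant_id, value_a, value_b))
    return mismatches


discrepancies = find_discrepancies(demographics, medical_history, "birth_date")
for participant_id, value_a, value_b in discrepancies:
    print(f"{participant_id}: demographics={value_a}, medical_history={value_b}")
```

Note that the check can only tell you the two values disagree. It cannot tell you which one is right, which is why you still end up going back to the paper form or the participant.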


The best way to avoid redundant data fields is to design forms that capture unique data. For example, a family contact form should only capture family contact information and should not contain medical history details. In the same manner, you don’t need to add contact information in a diagnoses or family history form. By designing forms that capture unique data, you’ll always know exactly where to find the data you need.
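One way to spot redundancy before data collection starts is to list the fields on each form and look for any that appear more than once. The sketch below assumes your form definitions can be written out as simple lists of field names and that the same field carries the same name everywhere; the forms, fields, and the `find_duplicate_fields` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical form definitions: form name -> list of field names.
forms = {
    "demographics": ["full_name", "birth_date", "sex"],
    "family_contact": ["contact_name", "contact_phone"],
    "medical_history": ["full_name", "birth_date", "previous_diagnoses"],
    "previous_diagnoses": ["previous_diagnoses", "diagnosis_date"],
}


def find_duplicate_fields(forms):
    """Map each field that appears on more than one form to those forms."""
    forms_by_field = defaultdict(list)
    for form_name, fields in forms.items():
        for field in fields:
            forms_by_field[field].append(form_name)
    # Keep only the fields collected in two or more places.
    return {
        field: names for field, names in forms_by_field.items() if len(names) > 1
    }


duplicates = find_duplicate_fields(forms)
for field, form_names in duplicates.items():
    print(f"{field} is collected on: {', '.join(form_names)}")
```

Each field it reports is a candidate for removal from all but one form, or, if the duplicate truly has to stay, for a consistency check.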

When adding a duplicate field is unavoidable, you can take precautions to identify discrepant data. The data quality checks I previously mentioned are a great way to confirm that duplicate fields agree with each other and to flag the ones that don’t.

Remember to always keep data analysis in mind during form design. If you avoid collecting redundant data, you’ll save yourself major headaches during data analysis.

