1. Executive Summary

MILRD used the funds to integrate our Metagenomics + Microbial Surveillance VTP into courses at high-need high schools, enrolling 297 students.

2. Demographics

Impact Assessment

To understand the impact of this project, we conducted a pre/post self-efficacy survey with 170 students from the project who completed the Metagenomics + Microbial Surveillance VTP.

Methods

This study assessed the Metagenomics + Microbial Surveillance VTP as an intervention to increase knowledge of: genomics data format/structure, metagenomics sample processing & analysis, Linux/Bash terminal use, and applications of bioinformatics tools.

The study used a within-subjects (pre-post) design in which we assessed the relevant dependent measures immediately before and immediately after workshop participation. There was no control group.

Students were asked to rate their knowledge level for each of the following statements on a 7-point, whole-number scale from 0 (None) to 6 (Expert):

  • I understand how genomics data are structured/formatted
  • I understand how metagenomics data are collected and processed
  • I understand how metagenomics data are analyzed
  • I understand how to use the Linux/Bash terminal
  • I understand how bioinformatics tools can be used to answer a scientific question

Independent Variable: Time of Assessment (pre, post)

Dependent Measures: knowledge of (1) genomics data format/structure, (2) metagenomics sample processing, (3) metagenomics analysis, (4) Linux/Bash terminal use, and (5) applications of bioinformatics tools.
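The pre-post comparison for each dependent measure can be sketched as a paired-samples t-test on the per-student differences. The block below is a minimal illustration, not the analysis script used for this report: the function name `paired_t` and the five-student ratings are hypothetical, and only the Python standard library is used.

```python
import math
import statistics


def paired_t(pre, post):
    """Return (mean difference, t statistic, degrees of freedom)
    for a paired-samples t-test on matched pre/post ratings."""
    if len(pre) != len(post):
        raise ValueError("pre and post must contain one rating per student")
    # Each student serves as their own control: analyze post minus pre.
    diffs = [after - before for before, after in zip(pre, post)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD (n - 1 denominator)
    t = mean_diff / (sd_diff / math.sqrt(n))
    return mean_diff, t, n - 1


# Hypothetical whole-number ratings for five students (not study data).
pre = [0, 1, 0, 2, 1]
post = [3, 4, 2, 5, 3]

mean_diff, t, df = paired_t(pre, post)
print(f"Mdiff = {mean_diff:.2f}, t({df}) = {t:.2f}")
```

With n paired responses the test has n - 1 degrees of freedom. The p value comes from the t distribution with that many degrees of freedom; a statistics package such as SciPy (`scipy.stats.ttest_rel`) reports it directly.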

Results

Overall, students reported substantial increases in knowledge across all assessed categories following VTP completion: (a.) genomics data format knowledge (Cohen’s d = 2.67), (b.) metagenomics data collection/processing knowledge (Cohen’s d = 4.15), (c.) metagenomics analysis knowledge (Cohen’s d = 3.28), (d.) Linux/Bash terminal knowledge (Cohen’s d = 2.37), (e.) R/RStudio knowledge (Cohen’s d = 3.29), (f.) bioinformatics application knowledge (Cohen’s d = 2.21).

Please note: because students answered the survey with whole numbers, some of the lines connecting pre/post responses lie on top of each other, so fewer than 20 distinct lines may be visible.

a. Genomics data format

A paired-samples t-test showed that, as hypothesized, participants reported lower levels of genomics data format knowledge before completing the intervention (M = 0.550, SD = 0.999) than after (M = 3.55, SD = 1.23), Mdiff = 3.00, t(20) = 8.45, p < .0001. The effect size is very large (Cohen’s d = 2.67).

b. Metagenomics data collection

A paired-samples t-test showed that, as hypothesized, participants reported lower levels of metagenomics data collection/processing knowledge before completing the intervention (M = 0.450, SD = 0.605) than after (M = 3.85, SD = 0.988), Mdiff = 3.40, t(20) = 13.1, p < .0001. The effect size is very large (Cohen’s d = 4.15).

c. Metagenomics analysis

A paired-samples t-test showed that, as hypothesized, participants reported lower levels of metagenomics analysis knowledge before completing the intervention (M = 0.400, SD = 0.598) than after (M = 3.65, SD = 1.27), Mdiff = 3.25, t(20) = 10.4, p < .0001. The effect size is very large (Cohen’s d = 3.28).

d. Linux/Bash terminal

A paired-samples t-test showed that, as hypothesized, participants reported lower levels of Linux/Bash terminal knowledge before completing the intervention (M = 0.450, SD = 1.28) than after (M = 3.65, SD = 1.42), Mdiff = 3.20, t(20) = 7.48, p < .0001. The effect size is very large (Cohen’s d = 2.37).

e. R/RStudio

A paired-samples t-test showed that, as hypothesized, participants reported lower levels of R/RStudio knowledge before completing the intervention (M = 0.350, SD = 0.812) than after (M = 3.80, SD = 1.24), Mdiff = 3.45, t(20) = 10.4, p < .0001. The effect size is very large (Cohen’s d = 3.29).

f. Bioinformatics applications

A paired-samples t-test showed that, as hypothesized, participants reported lower levels of bioinformatics application knowledge before completing the intervention (M = 1.25, SD = 1.25) than after (M = 3.95, SD = 1.19), Mdiff = 2.70, t(20) = 6.99, p < .0001. The effect size is very large (Cohen’s d = 2.21).
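The effect sizes above can be approximately reproduced from the reported means and standard deviations. The report does not state which Cohen’s d formula was used; the sketch below assumes the mean difference is divided by the square root of the average of the pre and post variances, which matches the reported values to within rounding. The function name `cohens_d_avg_sd` is hypothetical.

```python
import math


def cohens_d_avg_sd(m_pre, sd_pre, m_post, sd_post):
    """Cohen's d using the root mean of the pre and post variances
    as the standardizer (an assumed formula, not confirmed by the report)."""
    pooled_sd = math.sqrt((sd_pre ** 2 + sd_post ** 2) / 2)
    return (m_post - m_pre) / pooled_sd


# (M pre, SD pre, M post, SD post, reported d), copied from sections a-f.
reported = {
    "a. Genomics data format": (0.550, 0.999, 3.55, 1.23, 2.67),
    "b. Metagenomics data collection": (0.450, 0.605, 3.85, 0.988, 4.15),
    "c. Metagenomics analysis": (0.400, 0.598, 3.65, 1.27, 3.28),
    "d. Linux/Bash terminal": (0.450, 1.28, 3.65, 1.42, 2.37),
    "e. R/RStudio": (0.350, 0.8123, 3.80, 1.24, 3.29),
    "f. Bioinformatics applications": (1.25, 1.25, 3.95, 1.19, 2.21),
}

for name, (m_pre, sd_pre, m_post, sd_post, d_reported) in reported.items():
    d = cohens_d_avg_sd(m_pre, sd_pre, m_post, sd_post)
    print(f"{name}: recomputed d = {d:.2f} (reported {d_reported})")
```

Small discrepancies in the second decimal place are expected, because the inputs here are the rounded summary statistics rather than the raw responses.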