Impact of the choice of reference genome on the ability of the core genome SNV methodology to distinguish strains of Salmonella enterica serovar Heidelberg
Salmonella enterica serovar Heidelberg (S. Heidelberg) is one of the top serovars causing human salmonellosis. The core genome single nucleotide variant pipeline (cgSNV) is one of several whole genome based sequence typing methods used for the laboratory investigation of foodborne pathogens. SNV detection using this method requires a reference genome. The purpose of this study was to investigate the impact of the choice of the reference genome on the cgSNV-informed phylogenetic clustering and inferred isolate relationships. We found that using a draft or closed genome of S. Heidelberg as reference did not impact the ability of the cgSNV methodology to differentiate among 145 S. Heidelberg isolates involved in foodborne outbreaks. We also found that using a distantly related genome such as S. Dublin as choice of reference led to a loss in resolution since some sporadic isolates were found to cluster together with outbreak isolates. In addition, the genetic distances between outbreak isolates as well as between outbreak and sporadic isolates were overall reduced when S. Dublin was used as the reference genome as opposed to S. Heidelberg.