00% 27.98 +/- 3.40 892.61 +/- 204.62
Thermobaculum 2 1 1 0 100.00% 56.02 +/- 11.51 1550.79 +/- 673.39
Thermotogae 11 0 6 5 54.55% 40.19 +/- 6.51 1976.74 +/- 160.46
Verrucomicrobia 4 3 1 0 100.00% 55.24 +/- 8.47 3664.91 +/- 1649.61
Total 1173 696 269 208 82.27%
* Average GC content and standard deviations (SD) were calculated according to the different strains in the phylum. $Average length was calculated
by averaging the complete genome length in the phylum.

The acquisition of foreign DNA may modify compositional bias, and a change in GC content is a predominant outcome of this process. Another outcome of foreign DNA insertion is the appearance of GIs, which may change the virulence or function of the host strain (Figure 1D). In this study, we calculated GC content deviations for all the bacterial genomes. Then, we searched each genomic sequence for GIs by identifying the genomic segments whose GC content differed significantly from the genome mean (i.e., by more than three standard deviations). Across all of the genomes analyzed, 20,541 GIs were detected according to these criteria, with lengths from 2 to 80 kb, depending on the size of the sliding window used.

3.2 GIs are located next to sGCSs

Bacterial genomes exhibit strong sGCS
signals, which is easy to understand because the genomes of different strains often share one replicon (Figure 2 AB). For a better comparison, we aligned all the genomes at the ori and calculated relative genomic positions by dividing each position by the length of the corresponding genome. sGCSs and pGIs were then plotted according to their relative genomic positions. When aligned at the origin and marked with relative distances, the genomes showed an overrepresentation of sGCSs at the 1/3, 1/2, and 3/4 marks (Figure 2 AB). Furthermore, we found
that aside from their special distribution (Figure 2 A), sGCSs are closely correlated with GIs. These GIs are thought to have arisen from lateral gene transfer (LGT) events between different species rather than from vertical inheritance, because of their distinct genomic features. Based on the correlation between sGCSs and GIs, we suspect that sGCS regions are hotspots for horizontal DNA transfer in bacterial genomes.

Figure 2 Distribution of GIs, sGCSs, and PAIs in the genome. (A) Scatter plot of the positions of GIs vs. sGCSs. For each genome, we coupled the positions of sGCSs and GIs. (B) Distribution of sGCSs, GIs, and PAIs in the genome. (C) Frequency of Ds along the genome with different sGCSs groups. (D) Gene classification according to COG functions in GIs (red) and all of the genomes.
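
The GI screen and the position normalization described in the text can be illustrated with a minimal sketch. It is not the pipeline used in this study: the function names, the default window and step sizes, the use of non-overlapping windows, the choice of the population standard deviation over window GC values, and the merging of adjacent flagged windows are all assumptions made for illustration. Only the three-standard-deviation threshold and the division of positions by genome length come from the text.

```python
from statistics import mean, pstdev


def gc_fraction(seq):
    """Fraction of G and C bases in a sequence (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def candidate_islands(genome, window=2000, step=2000, k=3.0):
    """Return (start, end) spans whose window GC content deviates from the
    mean window GC content by more than k standard deviations.

    Assumption: the mean and SD are taken over the window values, which
    approximates the genome-wide mean when windows tile the genome.
    """
    windows = [
        (start, gc_fraction(genome[start:start + window]))
        for start in range(0, len(genome) - window + 1, step)
    ]
    values = [gc for _, gc in windows]
    if len(values) < 2:
        return []
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # perfectly uniform composition: nothing to flag

    flagged = [(s, s + window) for s, gc in windows if abs(gc - mu) > k * sigma]

    # Merge flagged windows that touch or overlap into single candidate spans
    # (an illustrative choice, not taken from the text).
    merged = []
    for start, end in flagged:
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def relative_position(pos, genome_length, ori=0):
    """Position measured from the origin and scaled to [0, 1) by genome length,
    treating the chromosome as circular."""
    return ((pos - ori) % genome_length) / genome_length
```

With such a sketch, the span coordinates returned by `candidate_islands` could be passed through `relative_position` so that islands from genomes of different lengths fall on a common 0 to 1 axis, which is the form of comparison described for Figure 2. Smaller windows would resolve shorter candidates at the cost of noisier GC estimates, which is consistent with the reported dependence of GI length on the sliding-window size.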