Supplementary Material for Karen D. Crow, Peter F. Stadler, Vincent J. Lynch, Chris Amemiya, and Günter P. Wagner. 2005. The fish specific Hox cluster duplication is coincident with the origin of teleosts. Molecular Biology and Evolution, in press.

HoxD4 amino acid sequence analysis
Günter P. Wagner
Date document finished: 1/20/05

Objective: analyze the fish HoxD4 genes to clarify whether the fugu/zebrafish HoxD4a genes are orthologous. New sequences are added one by one, and each analysis is restricted to the species most critical for determining the orthology of the new sequences.

1) Orthology of zebrafish HoxD4 and euteleost HoxD4a:

Approach: use the complete amino acid sequences of exon 1 of the zebrafish and euteleost genes to establish orthology of zebrafish HoxD4 and euteleost HoxD4a.

Sequences: shark (Hfr) HoxD4, coelacanth (Lme) HoxD4, zebrafish HoxD4, pufferfish (Sne) HoxD4a/b, medaka (Ola) HoxD4a/b, fugu (Tru) HoxD4b (8 sequences).

The ClustalW amino acid alignment is good, with few corrections necessary. For the analysis I used the file D4ab_aaAligNoGap.phy. The original alignment has 150 positions; the no-gap alignment has 127 amino acid positions. The analysis was done with the protein sequence algorithms of PHYLIP 3.6 (NJ, MP, ML).

Results: In all analyses the zebrafish HoxD4 gene groups with the a-clade of euteleosts. The bootstrap support is reasonable (77/70/57 for NJ/MP/ML). Hence it is most likely that the zebrafish HoxD4 gene is orthologous to HoxD4a. This is consistent with the HoxD9 result of Prohaska and Stadler (2004).
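[Figure: NJ, MP, and ML trees for the D4ab_aaAligNoGap.phy alignment. Only the bootstrap values survive in this text version; tree topologies and branch assignments are not reproduced.]
NJ: 97, 77, 93, 57
MP: 70, 49, 52, 97
ML: 92, 51, 57, 79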
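The reduction from 150 aligned positions to 127 ungapped positions in section 1 presumably corresponds to discarding every alignment column that contains a gap in at least one sequence. The following is a minimal Python sketch of that step, not the procedure actually used to produce D4ab_aaAligNoGap.phy. The function name `strip_gap_columns`, the gap characters, and the toy sequences are assumptions made for illustration only.

```python
def strip_gap_columns(alignment, gap_chars="-."):
    """Keep only the columns in which no sequence has a gap character.

    `alignment` maps a sequence name to its aligned sequence; all
    sequences must have the same length.
    """
    names = list(alignment)
    seqs = [alignment[name] for name in names]
    if len({len(seq) for seq in seqs}) != 1:
        raise ValueError("all aligned sequences must have the same length")

    # zip(*seqs) walks the alignment column by column.
    keep = [
        i
        for i, column in enumerate(zip(*seqs))
        if not any(residue in gap_chars for residue in column)
    ]
    return {name: "".join(alignment[name][i] for i in keep) for name in names}


# Toy alignment (hypothetical sequences, not the HoxD4 data).
toy = {
    "seqA": "MSS-YLV",
    "seqB": "MSSTYL-",
    "seqC": "MTSTYLV",
}
stripped = strip_gap_columns(toy)
for name, seq in stripped.items():
    print(name, seq)
```

Columns 4 and 7 of the toy alignment each contain a gap, so the stripped alignment keeps five of the seven positions. Removing gapped columns this way trades alignment length for the guarantee that every retained position is present in all sequences.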
2) Identity of elopomorph genes:

For both elopomorph species, tarpon and eel, we obtained two paralogs. The orthology of these genes was investigated with an alignment of the known zebrafish and euteleost genes and the four elopomorph sequences. The file is D4ab_aaAroMat copy.phy. In all analyses the four elopomorph sequences form a clade nested in the HoxD4a clade, with support values of 97/86/74 (NJ/MP/ML). Hence the elopomorph paralogs are probably the result of a gene duplication that occurred after the split of the zebrafish and elopomorph lineages but before the split of the eel and tarpon lineages.

[Figure: NJ, MP, and ML trees for the D4ab_aaAroMat copy.phy alignment. Only the elopomorph tip labels and bootstrap values survive in this text version, listed in extraction order; tree topologies are not reproduced.]
NJ: 84, Aro3D4a, 66, 64, 97, MatD4a, MatD4, 31, 47, AroD4n1, 68, 86
MP: 81, MatD4a, 39, 86, AroD4n1, 86, Aro3D4an, 67, 78, MatD4, 79, 91
ML: 50, MatD4a, 51, 54, AroD4n1, 74, Aro3D4an, 49, 52, MatD4, 66, 74
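The support values quoted in these sections are bootstrap percentages: alignment columns are resampled with replacement, the analysis is repeated on each pseudo-replicate, and the percentage of replicates recovering a grouping is reported. The Python sketch below illustrates only that resampling logic and is not PHYLIP's implementation. The helper `groups_with` is a hypothetical stand-in that compares simple p-distances instead of building NJ, MP, or ML trees, and the toy sequences are invented.

```python
import random


def bootstrap_replicate(alignment, rng):
    """Resample alignment columns with replacement (same length as the input)."""
    names = list(alignment)
    n_cols = len(alignment[names[0]])
    picks = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {name: "".join(alignment[name][i] for i in picks) for name in names}


def p_distance(a, b):
    """Fraction of positions at which two aligned sequences differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def groups_with(alignment, query, candidate, alternative):
    """Hypothetical stand-in for a tree-based test (assumption, not PHYLIP):
    True if `query` is strictly closer to `candidate` than to `alternative`."""
    q = alignment[query]
    return p_distance(q, alignment[candidate]) < p_distance(q, alignment[alternative])


def support_percent(alignment, grouping_test, n_reps=100, seed=0):
    """Percentage of bootstrap replicates in which `grouping_test` holds."""
    rng = random.Random(seed)
    hits = sum(
        grouping_test(bootstrap_replicate(alignment, rng)) for _ in range(n_reps)
    )
    return 100.0 * hits / n_reps


# Toy ungapped alignment (invented sequences, not the HoxD4 data).
toy = {
    "query": "MSSYLVNSKA",
    "cand": "MSSYLMNSKA",
    "alt": "MTAYIVDSRA",
}
support = support_percent(
    toy, lambda aln: groups_with(aln, "query", "cand", "alt"), n_reps=200, seed=1
)
print(f"support for query+cand: {support:.1f}%")
```

In a real analysis the grouping test would build a tree from each replicate and check whether the clade of interest is present, which is why one alignment yields different support values under NJ, MP, and ML.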
3) The orthology of Hiodon paralogs:

Two genes were found in the goldeye (Hiodon) and compared to the known HoxD4a/b sequences and the outgroups, shark and coelacanth (d4ab_aahalnogap copy.phy). The trees support an affiliation of one Hiodon sequence with the HoxD4a clade and of the other with the HoxD4b clade, suggesting orthology. In this analysis the most significant observation is the association of the HalD4b gene with the euteleost HoxD4b genes, because the association of the other paralog with the HoxD4a clade could in principle be artifactual, as the clustering of the coelacanth sequence shows.

[Figure: NJ, MP, and ML trees for the d4ab_aahalnogap copy.phy alignment. Only the Hiodon tip labels and bootstrap values survive in this text version, listed in extraction order; tree topologies are not reproduced.]
NJ: HalD4b, 86, 99, 86, HalD4a, 33, 80, 55
MP: 99, 99, HalD4a, 52, 50, HalD4b, 99
ML: HalD4b, 50, 86, HalD4a, 40, 27, 45, 43
4) The orthology of Amia HoxD4 gene:

One sequence was recovered from bowfin (D4ab_aaAcaLme copy.phy). This sequence is consistently placed inside the HoxD4a clade, but this result is certainly artifactual: even if the duplication occurred before the split of the Amia and teleost lineages, one would predict the Amia sequence to diverge before the zebrafish sequence, because the monophyly of teleosts is not in question. In summary, these data cannot constrain the duplication event downwards. On the other hand, they also do not provide evidence for a duplication date different from that suggested by the HoxA11 and HoxB5 data.

[Figure: NJ, MP, and ML trees for the D4ab_aaAcaLme copy.phy alignment. Only some tip labels and the bootstrap values survive in this text version, listed in extraction order; tree topologies are not reproduced.]
NJ: 74, AcaD4, 74, 50, 79, 95
MP: 60, 78, AcaD4, 50, 39, OlaD4, 98
ML: 50, 59, AcaD4, 51, 65, 90
Conclusions: The zebrafish HoxD4 gene is orthologous to HoxD4a. The two paralogs of Hiodon are orthologous to HoxD4a and HoxD4b respectively, i.e. the duplication occurred before the most recent common ancestor of teleosts. The two HoxD4 paralogs recovered from each elopomorph species in this study arose through one additional duplication event in the stem of the elopomorph clade. The present analysis cannot decide whether the duplication occurred after the most recent common ancestor of Amia and teleosts, but it also does not contain any evidence against a duplication along with HoxA11 and HoxB5.