The HMG box is a novel type of DNA-binding domain found in a diverse group of proteins. The HMG box superfamily comprises, among others, the High Mobility Group proteins HMG1 and HMG2, the nucleolar transcription factor UBF, the lymphoid transcription factors TCF-1 and LEF-1, the fungal mating-type genes mat-Mc and MATA1, and the mammalian sex-determining gene SRY. The superfamily dates back at least 1,000 million years, as its members appear in animals, plants and yeast. Alignment of all known HMG boxes defined an unusually loose consensus sequence. We constructed phylogenetic trees connecting the members of the HMG box superfamily in order to understand their evolution. This analysis led us to distinguish two subfamilies: one comprising proteins with a single sequence-specific HMG box, the other encompassing relatively non-sequence-specific DNA-binding proteins with multiple HMG boxes. By studying the extent of diversification of the superfamily, we found that the rate of evolution differed greatly among the various groups of HMG box-containing factors. Comparison of the evolution of the two boxes of ABF2 and of mtTF1 implied different diversification models for these two proteins. Finally, we provide a tree for the highly complex group of SRY-like ('Sox') genes, clustering at least 40 different loci that rapidly diverged in various animal lineages.