Description
ABSTRACT
This paper seeks the maximum plausible extent of shared synteny in the chromosomes of a given pair of test species, that is, the extent that could be encountered if a large number of orthologs pertinent to that pair were searched exhaustively. The proposed solution proceeds in two steps. First, a sample space containing a limited number of orthologs of the test species is searched, the relevant sparse synteny data (for example, as seen in an Oxford grid) are gathered, and the associated entropy details are ascertained. Second, these entropy details are judiciously extended, subject to the constraint posed by the fixed number of chromosomes in each test species, to elucidate the maximum synteny that could be encountered when a large ensemble of ortholog pairs is searched exhaustively. This maximum is expressed through appropriately defined metrics, which specify the maximum extent of plausible shared synteny as a function of the number of ortholog pairs compared, together with the associated upper and lower stochastic bounds. The analysis covers the following pairs of test species: mouse versus human, medaka versus human, and medaka versus zebrafish. The proposed approach is new and computationally feasible, and it is found to be consistent with the underlying stochastic basis of Shannon information, entropy-theoretics and Schur-convexity.
Key words: Chromosomal synteny, synteny correlation, conserved shared-synteny, statistical association, Oxford-grid, Shannon’s entropy, persisting mutual-information, Schur-convexity
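As a rough illustration of the first step described in the abstract, the sketch below computes Shannon entropies and the mutual information from an Oxford grid of ortholog-pair counts. It is only a minimal sketch under stated assumptions: the function name `oxford_grid_entropy`, the toy count matrix, and the use of plain mutual information are all hypothetical choices made here for illustration, and they are not the metrics or stochastic bounds derived in the paper.

```python
import math


def oxford_grid_entropy(grid):
    """Entropy summary of an Oxford grid (hypothetical helper, not the paper's metric).

    grid[i][j] is assumed to hold the number of ortholog pairs found on
    chromosome i of species A and chromosome j of species B.
    """
    total = sum(sum(row) for row in grid)
    if total == 0:
        raise ValueError("grid contains no ortholog pairs")

    # Marginal and joint probabilities estimated from the counts.
    row_p = [sum(row) / total for row in grid]
    col_p = [sum(col) / total for col in zip(*grid)]
    joint_p = [count / total for row in grid for count in row]

    def shannon(probabilities):
        # Shannon entropy in bits; empty cells contribute nothing.
        return -sum(p * math.log2(p) for p in probabilities if p > 0)

    h_rows = shannon(row_p)
    h_cols = shannon(col_p)
    h_joint = shannon(joint_p)
    return {
        "H_rows": h_rows,
        "H_cols": h_cols,
        "H_joint": h_joint,
        # I(A;B) = H(A) + H(B) - H(A,B)
        "mutual_information": h_rows + h_cols - h_joint,
    }


# Toy 2 x 2 grid (made-up counts): every ortholog pair stays on matching chromosomes.
toy_grid = [[10, 0], [0, 10]]
result = oxford_grid_entropy(toy_grid)
print(result["mutual_information"])
```

For this toy grid the chromosome assignment in one species fully determines the assignment in the other, so the mutual information equals the marginal entropy of 1 bit; a grid with counts spread uniformly over all cells would give 0 bits.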