This section first describes a generic method for calculating Pr(Di|Hj) for any type of data that can be calculated for single positions in a sequence alignment, using chemical mapping as an example. We then extend this method to data that can be calculated for pairs of positions in an alignment, such as thermodynamic pairing probabilities and mutual information.
Per-position data, such as a pattern of light and dark bands on a gel derived from chemical mapping, indicates whether the base at a particular position is paired but says nothing about the identity of the partner. Any structure, by providing a list of base pairs, implicitly divides the positions into two groups: paired and unpaired. The better a structure picks out a set of particularly dark bands corresponding to paired (or unpaired, depending on the chemical) positions, the better the mapping data support it. We focus on unpaired positions because they are rarer than paired positions.
All structures pick out groups of 'paired' and 'unpaired' scores from the same population of per-position scores. Consequently, the mean $\mu$ and variance $\sigma^2$ of the population can be calculated directly. The best structure, then, is the structure whose unpaired positions have the most different mean score from the population mean. This can be assessed using a standard z-test, where SEM, the standard error of the mean of a randomly chosen sample of size n, is $\sigma/\sqrt{n}$. The most significant distance Dmax is that for which $(\bar{x} - \mu)/\mathrm{SEM}$ is maximized, where $\bar{x}$ is the mean score of the unpaired positions in a structure, picking out the best structure Hmax.
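The selection of Hmax by z-test can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the toy scores, and the two candidate structures are hypothetical, and it assumes that high scores indicate unpaired positions (for a chemical where low scores indicate unpaired positions, the sign of z would be flipped) and that every structure has at least one unpaired position.

```python
import math
from statistics import fmean, pstdev

def unpaired_positions(length, pairs):
    """Positions (0-based) that do not appear in any base pair."""
    paired = {i for pair in pairs for i in pair}
    return [i for i in range(length) if i not in paired]

def z_score(scores, sample_idx):
    """(sample mean - population mean) / SEM, with SEM = sigma / sqrt(n)."""
    mu = fmean(scores)          # population mean over all positions
    sigma = pstdev(scores)      # population standard deviation
    sem = sigma / math.sqrt(len(sample_idx))
    sample_mean = fmean([scores[i] for i in sample_idx])
    return (sample_mean - mu) / sem

def best_structure(scores, structures):
    """Return the name of the structure with the largest z, plus all z values."""
    z = {name: z_score(scores, unpaired_positions(len(scores), pairs))
         for name, pairs in structures.items()}
    return max(z, key=z.get), z

# Toy per-position scores (hypothetical): high = strongly modified = unpaired.
scores = [0.9, 0.8, 0.1, 0.2, 0.1, 0.9, 0.1, 0.2]
structures = {
    "A": [(2, 7), (3, 6)],  # leaves positions 0, 1, 4, 5 unpaired
    "B": [(0, 7), (1, 6)],  # leaves positions 2, 3, 4, 5 unpaired
}
best, z = best_structure(scores, structures)
print(best, z)
```

In this toy case structure A leaves the high-scoring positions unpaired, so its unpaired mean lies above the population mean and it is selected as Hmax.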
Having found the structure Hmax with the best support, we need to find the conditional probability of each structure given the data. By hypothesis, the population of scores at unpaired positions differs from the overall population. To find the probability of the observed scores for each structure, we thus use a standard 2-sample t test to compare the scores of the unpaired positions in the best structure and in the structure currently under consideration:
$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_1^2/n_1 + s_2^2/n_2}}$$
... where $\bar{x}_1$ and $\bar{x}_2$ are the two sample means, $n_1$ and $n_2$ are the two sample sizes, and $s_1$ and $s_2$ are the two sample standard deviations.
The t test gives the probability that we would see a mean as bad (depending on whether the best score is high or low) as that observed for each structure if the scores for its unpaired bases were drawn from the same population as the scores for the best structure's unpaired bases. This is the same as finding Pr(Di|Hj) for each structure.
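As a rough sketch of this comparison, the code below computes a two-sample t statistic and a one-sided tail probability using only the standard library. Several choices here are assumptions rather than details given in the text: it uses the unequal-variance (Welch) form of the statistic with the Welch-Satterthwaite degrees of freedom, it takes the one-sided upper tail as the probability, and it integrates the t density numerically with Simpson's rule. The function names and sample values are hypothetical.

```python
import math
from statistics import fmean, variance

def welch_t(sample1, sample2):
    """t statistic and approximate degrees of freedom (assumed Welch form)."""
    n1, n2 = len(sample1), len(sample2)
    v1 = variance(sample1) / n1   # s1^2 / n1
    v2 = variance(sample2) / n2   # s2^2 / n2
    t = (fmean(sample1) - fmean(sample2)) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def upper_tail(t, df, steps=2000):
    """P(T >= t), integrating the density on [0, |t|] with Simpson's rule."""
    a = abs(t)
    h = a / steps                 # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    area = total * h / 3
    return 0.5 - area if t >= 0 else 0.5 + area

# Hypothetical unpaired-position scores for Hmax and for one candidate.
best_scores = [0.9, 0.8, 0.9, 0.7]
cand_scores = [0.2, 0.9, 0.1, 0.3]
t, df = welch_t(best_scores, cand_scores)
p = upper_tail(t, df)
print(t, df, p)
```

A small tail probability means the candidate's unpaired scores are unlikely to come from the same population as those of Hmax. Both samples need at least two scores and non-zero combined variance for the statistic to be defined.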
In the particular case of chemical mapping, the chemical typically cleaves the sequence at particular bases that are paired or unpaired, or modifies certain bases in a way that prevents primer extension. When the RNA is treated, reverse-transcribed, amplified, end-labeled, and run on a sequencing gel, the intensity of the bands at each position in the sequence indicates the extent to which that position was modified. Because the efficiency of cleavage differs for different nucleotides, we calculate the scores for each type of base separately and combine the results using the general method for combining different types of data.
For chemical mapping, this method allows far more flexibility than does fixing particular bases as paired or unpaired after examining the sequencing gel, because it allows uncertainty in the assignment of paired/unpaired states. This is important, because the results of chemical mapping can be influenced by several factors other than the secondary structure (including 'docking' between loops and helices, and inflexibility in unpaired regions at close-packed helix junctions).
Several methods, notably mutual information, provide data about pairs of positions in an alignment rather than about individual positions. Our strategy here is similar to that with per-position data, except that instead of scores for N positions in the alignment, there are scores for each of the N(N - 1)/2 possible combinations of two positions. We need to determine whether the pairs that make up a particular structure have surprisingly high or low scores. Specifically, the question is whether the mean of the sample of all possible pairs that corresponds to the actual pairs in a structure differs from the expected mean of a set of the same number of pairs picked at random.
Even if every position in a sequence were paired, there could be only N/2 pairs, so the number of actual pairs can only be a small fraction of the number of possible pairs. Again, we calculate the population mean and standard deviation for the set of all possible pairs, and calculate Hmax as the structure whose set of pair scores would be most surprising as a random sample from the set of possible scores using the z-test. Each structure is then compared to Hmax via a 2-sample t-test, treating the scores from that structure and the scores from the best structure as the two samples. The result is the probability of getting a mean score as bad as that found in each structure if the true distribution were the population from which the scores in the best structure were sampled, which is Pr(Di|Hj).
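For pairwise data, the population is the set of scores for all N(N - 1)/2 position pairs and the sample is the set of pairs in a structure. A minimal sketch, with hypothetical names and toy scores, and assuming high scores support pairing:

```python
import math
from itertools import combinations
from statistics import fmean, pstdev

def pair_z_score(pair_scores, structure_pairs):
    """z-score of a structure's pairs against the population of all pairs."""
    population = list(pair_scores.values())
    mu, sigma = fmean(population), pstdev(population)
    # Look up each pair with the smaller index first, as the keys are stored.
    sample = [pair_scores[tuple(sorted(p))] for p in structure_pairs]
    sem = sigma / math.sqrt(len(sample))
    return (fmean(sample) - mu) / sem

# Toy scores for all pairs of N = 6 positions: background 0.1, two high pairs.
N = 6
pair_scores = {pair: 0.1 for pair in combinations(range(N), 2)}
pair_scores[(0, 5)] = 0.9
pair_scores[(1, 4)] = 0.9

z_good = pair_z_score(pair_scores, [(0, 5), (1, 4)])
z_poor = pair_z_score(pair_scores, [(0, 1), (2, 3)])
print(len(pair_scores), z_good, z_poor)
```

With N = 6 there are 15 possible pairs, and the structure that contains the two high-scoring pairs receives the larger z.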
The RNAsubopt program in the Vienna package provides, for a single sequence, the probability that each base pairs with each other base in the ensemble of all possible structures. By averaging these probabilities across the set of sequences, we get an idea of how frequently the positions pair across the alignment. The best-supported structure has pairs with a particularly high average probability.
For any two columns in the alignment, we can ask how often the bases in the two columns could participate in a base pair (giving a static estimate of pairing, contrasted with the dynamic estimate based on the sequence variations presented in the next section). Here we calculate, for each pair of columns, the fraction of sequences in which the two bases are the potential pairs (GC), (CG), (AU), (UA), (GU), or (UG) rather than some other combination. The best-supported structure picks out pairs of positions that have relatively few mismatches across the alignment, i.e. a high probability of being paired.
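The per-column-pair fraction described above might be computed along these lines. The function name and the toy alignment are hypothetical, and the sketch assumes that sequences with a gap or any other non-pairing combination in the two columns count as mismatches in the denominator.

```python
# Ordered base combinations treated as potential pairs (from the text).
CANONICAL = {("G", "C"), ("C", "G"), ("A", "U"), ("U", "A"), ("G", "U"), ("U", "G")}

def pairing_fraction(alignment, i, j):
    """Fraction of sequences whose bases in columns i and j could pair."""
    hits = sum((seq[i], seq[j]) in CANONICAL for seq in alignment)
    return hits / len(alignment)

# Toy alignment (hypothetical): columns 0/6 and 1/5 always pair, 2/4 rarely.
alignment = ["GGCAUCC", "GACAUUC", "GGAAUCC"]
print(pairing_fraction(alignment, 0, 6))
print(pairing_fraction(alignment, 1, 5))
print(pairing_fraction(alignment, 2, 4))
```

Columns 1 and 5 pair in every sequence despite the G-to-A change in the second sequence, because the partner changes from C to U as well.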
If two columns are paired, it should be possible to predict the base in one column from the base in the other. The mutual information between two positions in an alignment is defined as the difference between the uncertainty about the two columns taken individually and the uncertainty about the two columns taken together.
In this case, uncertainty has a specific technical meaning from information theory: one bit of uncertainty is the same as the uncertainty about a fair coin toss (or any experiment that has two equiprobable outcomes). The information conveyed by a particular position p in a sequence alignment is the uncertainty H about which character c is present at that position p in any one of the sequences chosen at random:
$$H_p = -\sum_{c} f_{c,p} \log_2 f_{c,p}$$
... where $f_{c,p}$ is the frequency of character c at position p.
As an example of mutual information, suppose position p in an alignment could be A or C, and position q could be G or U. If p and q are not paired with each other, then the state of p should be independent of the state of q: in other words, if q is G, p should have equal chances of being A or C, leading to a low mutual information score. In contrast, if p and q are paired, we might expect p to be A whenever q is U, and to be C whenever q is G, leading to a high mutual information score. Thus the formula for the mutual information Mp,q between two positions p and q is:
$$M_{p,q} = H_p + H_q - H_{p,q}$$
... where $H_{p,q}$ is calculated as for $H_p$, but with 16 possible characters (the ordered pairs of bases) instead of 4, excluding gaps. The best-supported structure thus picks out pairs of positions with surprisingly high mutual information scores.
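The A/C and G/U example can be worked through numerically with a short sketch. The function names and toy alignments are hypothetical, and one detail is an assumption: sequences with a gap in either column are dropped before all three uncertainties are computed.

```python
import math
from collections import Counter

def entropy(symbols):
    """Shannon uncertainty in bits of the observed symbol frequencies."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def mutual_information(alignment, p, q, gap="-"):
    """Uncertainty of columns p and q taken singly minus taken together."""
    # Assumption: skip sequences that have a gap in either column.
    cols = [(s[p], s[q]) for s in alignment if s[p] != gap and s[q] != gap]
    h_p = entropy(a for a, _ in cols)
    h_q = entropy(b for _, b in cols)
    h_pq = entropy(cols)          # joint uncertainty over ordered base pairs
    return h_p + h_q - h_pq

# p is A or C, q is G or U, as in the example in the text.
covarying = ["AU", "CG", "AU", "CG"]      # p predicts q exactly
independent = ["AG", "AU", "CG", "CU"]    # p says nothing about q
mi_cov = mutual_information(covarying, 0, 1)
mi_ind = mutual_information(independent, 0, 1)
print(mi_cov, mi_ind)
```

In the covarying alignment each column carries one bit of uncertainty and so does the pair of columns, giving one bit of mutual information. In the independent alignment the joint uncertainty is two bits, so the mutual information is zero.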