David R. Bickel, PhD
More Recent Research and Software
(The page you are now viewing is no longer updated or maintained.)
Updated citations and hyperlinks
Research interests  Papers and software  

Foundations of statistics 


Statistical genomics 


Older statistical methodology & applications 

“...there is a strong element of intellectual arrogance present in all of us,
not excluding myself. We tend to think that we are "God's gift to the world," and that we are highly original in posing problems and suggesting solutions.”
— O. Kempthorne, p. 451, discussion on Lindley, D. V. (1971) The estimation
of many parameters (with discussion). In: Foundations of Statistical Inference,
eds. V. P. Godambe and D. A. Sprott, Toronto: Holt, Rinehart & Winston, pp. 435–455.
Evidence in Genome-Wide Association Studies
Y. Yang and D. R. Bickel, “Minimum description length and empirical Bayes methods of identifying SNPs associated with disease,” Technical Report, Ottawa Institute of Systems Biology, COBRA Preprint Series, Article 74, available at biostats.bepress.com/cobra/ps/art74 (2010). Full preprint  Software
Methods for Levels of Differential Gene Expression
Z. Montazeri, C. M. Yanofsky, and D. R. Bickel [the first two authors contributed equally], “Shrinkage estimation of effect sizes as an alternative to hypothesis testing followed by estimation in high-dimensional biology: Applications to differential gene expression,” Statistical Applications in Genetics and Molecular Biology 9 (1) 23 (2010). Article  Software  Supplementary material  Draft
C. M. Yanofsky and D. R. Bickel, “Validation of differential gene expression algorithms: Application comparing fold change estimation to hypothesis testing,” BMC Bioinformatics 11, 63 (2010). Article  Draft
D. R. Bickel, “Correcting the estimated level of differential expression for gene selection bias: Application to a microarray study,” Statistical Applications in Genetics and Molecular Biology 7 (1) 10, (2008). Article
D. R. Bickel, “Degrees of differential gene expression: Detecting biologically significant expression differences and estimating their magnitudes,” Bioinformatics 20, 682–688 (2004). Abstract and main article  Supplementary material  Software (Statomics)
application of a likelihood method for quantifying the strength of evidence
Gene Network Reconstruction and Inference of Gene Coexpression
D. R. Bickel, Z. Montazeri, P.C. Hsieh, M. Beatty, S. J. Lawit, and N. J. Bate, “Gene network reconstruction from transcriptional dynamics under kinetic model uncertainty: A case for the second derivative,” Bioinformatics 25, 772–779 (2009). Open access (PDF)  Supplement & software  Data
D. R. Bickel, “Probabilities of spurious connections in gene networks: Application to expression time series,” Bioinformatics 21, 1121–1128 (2005). Abstract and main article  Uncorrected version  Supplementary material (corrected)  Scalable Fig. 3 (corrected)  Software (Statomics 0.4 fixed bug)
D. R. Bickel, “Robust cluster analysis of microarray gene expression data with the number of clusters determined biologically,” Bioinformatics 19, 818–824 (2003). Abstract and article  Software (PLATO)  Full preprint
Small-Scale Estimators of the Local False Discovery Rate
D. R. Bickel, “Minimum description length methods of medium-scale simultaneous inference,” Technical Report, Ottawa Institute of Systems Biology, arXiv:1009.5981 (2010). Full preprint
Z. Yang, Z. Li, and D. R. Bickel, “Empirical Bayes estimation of posterior probabilities of enrichment,” Technical Report, Ottawa Institute of Systems Biology, arXiv:1201.0153 (2011). Full preprint  2010 seed  Software
D. R. Bickel, “Simple estimators of false discovery rates given as few as one or two p-values without strong parametric assumptions,” Technical Report, Ottawa Institute of Systems Biology, arXiv:1106.4490 (2011). Full preprint
D. R. Bickel, “Small-scale inference: Empirical Bayes and confidence methods for as few as a single comparison,” Technical Report, Ottawa Institute of Systems Biology, arXiv:1104.0341 (2011). Full preprint
Multiple Hypothesis Testing and Applications to Differential Gene Expression
D. R. Bickel, “Estimating the null distribution to adjust observed confidence levels for genome-scale screening,” Biometrics 67, 363–370 (2011). Abstract and article  French abstract  Supplementary material  Full preprint
M. Guo, S. Yang, M. Rupe, B. Hu, D. R. Bickel, L. Arthur, and O. Smith, “Genome-wide allele-specific expression analysis using Massively Parallel Signature Sequencing (MPSS) reveals cis- and trans-effects on gene expression in maize hybrid meristem tissue,” Plant Molecular Biology 66, 551–563 (2008).
D. R. Bickel, “Error-rate and decision-theoretic methods of multiple testing: Which genes have high objective probabilities of differential expression?” Statistical Applications in Genetics and Molecular Biology 3 (1) 8, (2004). Article  S-PLUS software  R software  Full preprint [earlier version cited as 'Bickel, D. R. (2003), "Selecting an optimal rejection region for multiple testing: A decision-theoretic alternative to FDR control, with an application to microarrays," Tech. rep., Medical College of Georgia']
D. R. Bickel, “On 'Strong control, conservative point estimation and simultaneous conservative consistency of false discovery rates': Does a large number of tests obviate confidence intervals of the FDR?” Technical Report, Medical College of Georgia, arXiv:q-bio.GN/0404032 (2004). Technical report  Software (Statomics)
D. R. Bickel, “Reliably determining which genes have a high posterior probability of differential expression: A microarray application of decision-theoretic multiple testing,” Technical Report, Medical College of Georgia, arXiv:q-bio.QM/0402048 (2004). Technical report  S-PLUS software  R software
D. R. Bickel, “Microarray gene expression analysis: Data transformation and multiple-comparison bootstrapping,” Computing Science and Statistics 34, 383–400, Interface Foundation of North America (Proceedings of the 34th Symposium on the Interface, Montréal, Québec, Canada, April 17–20, 2002). Abstract  Full article  Software (BioinfoStat)
D. R. Bickel, “Conservative identification of differentially expressed genes using cDNA or oligonucleotide microarrays: Inference about values of high probability density,” 2002 Proceedings of the American Statistical Association, Biometrics Section [CD-ROM], American Statistical Association: Alexandria, VA (2002). Abstract  Full article
Strength of Statistical Evidence
D. R. Bickel, “A predictive approach to measuring the strength of statistical evidence for single and multiple comparisons,” Canadian Journal of Statistics 39, 610–631 (2011). Full article  Revised preprint  2010 draft
D. R. Bickel, “The strength of statistical evidence for composite hypotheses: Inference to the best explanation,” Statistica Sinica 22, 1147–1198 (2012). Full article  2010 version
D. R. Bickel, “Measuring support for a hypothesis about a random parameter without estimating its unknown prior,” Technical Report, Ottawa Institute of Systems Biology, arXiv:1101.0305 (2011). Full preprint
more likelihood paradigm papers
Game-theoretic Strategies for Sets of Posteriors
D. R. Bickel, “Game-theoretic probability combination with applications to resolving conflicts between statistical methods,” International Journal of Approximate Reasoning 53, 880–891 (2012). Full article  2011 preprint
D. R. Bickel, “Controlling the degree of caution in statistical inference with the Bayesian and frequentist approaches as opposite extremes,” Electronic Journal of Statistics 6, 686–709 (2012). Full article (open access)  2011 preprint
D. R. Bickel, “Blending Bayesian and frequentist methods according to the precision of prior information with applications to hypothesis testing,” Technical Report, Ottawa Institute of Systems Biology, available from http://goo.gl/kCVUs (2012). 2012 preprint  2011 preprint
more "imprecise probability" papers
Confidence Posterior Distributions
D. R. Bickel, “Coherent frequentism: A decision theory based on confidence sets,” Communications in Statistics – Theory and Methods 41, 1478–1496 (2012). Full article (open access)  2009 preprint
D. R. Bickel, “Empirical Bayes interval estimates that are conditionally equal to unadjusted confidence intervals or to default prior credibility intervals,” Statistical Applications in Genetics and Molecular Biology 11 (3), art. 7 (2012). Full article  2010 preprint
D. R. Bickel, “A prior-free framework of coherent inference and its derivation of simple shrinkage estimators,” Technical Report, Ottawa Institute of Systems Biology, available from http://goo.gl/aUSLr (2012). 2012 preprint
D. R. Bickel, “A frequentist framework of inductive reasoning,” Technical Report, Ottawa Institute of Systems Biology, arXiv:math.ST/0602377 (2009). Christmas revision
more confidence distribution papers
Robust Estimation of the Mode and Skewness
D. R. Bickel, “Robust estimators of the mode and skewness of continuous data,” Computational Statistics and Data Analysis 39, 153–163 (2002). Abstract  Full preprint
D. R. Bickel, “Robust and efficient estimation of the mode of continuous data: The mode as a viable measure of central tendency,” Journal of Statistical Computation and Simulation 73, 899–912 (2003); peer-reviewed preprint: InterStat, November 2001, http://interstat.stat.vt.edu/interstat/articles/2001/abstracts/n01001.htmlssi. Abstract  Full article
Fractal Stochastic Process Models of DNA Evolution
The following papers describe fractal models of evolution and their compatibility with DNA data.
D. R. Bickel and B. J. West, “Multiplicative and fractal processes in DNA evolution,” Fractals 6, 211–217 (1998). This paper provides an introduction to molecular evolution and its assumptions. Abstract
B. J. West and D. R. Bickel, “Fractional-difference stochastic model of evolutionary substitutions in DNA sequences,” Physics Letters A 256, 188–196 (1999). This paper also discusses assumptions made in models of DNA evolution. Abstract
D. R. Bickel, “Implications of fluctuations in substitution rates: Impact on the uncertainty of branch lengths and on relative-rate tests,” Journal of Molecular Evolution 50, 381–390 (2000). Abstract
D. R. Bickel and B. J. West, “Molecular evolution modeled as a fractal Poisson process in agreement with mammalian sequence comparisons,” Molecular Biology and Evolution 15, 967–977 (1998). Abstract
D. R. Bickel and B. J. West, “Molecular evolution modeled as a fractal renewal point process in agreement with the dispersion of substitutions in mammalian genes,” Journal of Molecular Evolution 47, 551–556 (1998). Abstract
B. J. West and D. R. Bickel, “Molecular evolution modeled as a fractal stochastic process,” Physica A 249, 544–552 (1998). Abstract
Stochastic Intermittency Publications
D. R. Bickel, “Smoothing before estimating uncertainty, scaling, and intermittency: Application to short heart rate signals,” Fractals 11, 245–252 (2003). Abstract  Full preprint
D. R. Bickel and D. Lai, “Asymptotic distribution of time-series intermittency estimates: applications to economic and clinical data,” Computational Statistics and Data Analysis 37, 419–431 (2001). Abstract  Full preprint
D. R. Bickel, “Estimating the intermittency of point processes with applications to human activity and viral DNA,” Physica A 265, 634–648 (1999). Abstract
D. R. Bickel, “Simple estimation of intermittency in multifractal stochastic processes: Biomedical applications,” Physics Letters A 262, 251–256 (1999). Abstract
D. R. Bickel, “Rest quantified by a fractal dimension of movement events: A biomedical application of intermittency estimation,” Fractals 8, 1–6 (2000). Abstract  Full article
D. R. Bickel, “Generalized entropy and multifractality of time-series: Relationship between order and intermittency,” Chaos, Solitons & Fractals 13, 491–497 (2002). Abstract
D. R. Bickel, M. T. Verklan, and J. Moon, “Detection of anomalous diffusion using confidence intervals of the scaling exponent with application to preterm neonatal heart rate variability,” Physical Review E 58, 6440–6448 (1998). Abstract  Full article
M. T. Verklan, D. R. Bickel, and J. Moon, “Heart rate variability of preterm neonates quantified by energy entropy,” Nursing and Health Sciences 1, 103–111 (1999). Abstract
D. R. Bickel, comment on "Sequential Monte Carlo for Bayesian Computation" (P. Del Moral, A. Doucet, A. Jasra) in Bayesian Statistics 8 (Oxford Science Publications, 2007, p. 140), available as arXiv:math.ST/0606557 (2006).  Abstract: Is there a class of static inference problems for which the backward-kernel approach is better suited than a mixture transition kernel that automatically adapts to the target distribution?
C. A. Ordonez, D. R. Bickel, V. C. Venezia, F. D. McDaniel, S. E. Matteson, and M. I. Molina, “Electronic ion energy loss calculations on the basis of the binary encounter approximation,” Journal of Nuclear Materials 264, 133–140 (1999). Abstract  This article includes a description of the accept-reject Monte Carlo simulations of D. R. Bickel, The Stopping Power of Amorphous and Channelled Silicon at all Energies as Computed with the Binary Encounter Approximation, MA thesis, University of North Texas, Denton, Texas (1994).
The two gene expression sections include work applying empirical Bayes (decision-theoretic) and BIC methodology.
Selected Abstracts
D. R. Bickel, “Robust and efficient estimation of the mode of continuous data: The mode as a viable measure of central tendency,” Journal of Statistical Computation and Simulation; peer-reviewed preprint: InterStat, November 2001, http://interstat.stat.vt.edu/interstat/articles/2001/abstracts/n01001.htmlssi.
Although a natural measure of the central tendency of a sample of continuous data is its mode (the most probable value), the mean and median are the most popular measures of location due to their simplicity and ease of estimation. The median is often used instead of the mean for asymmetric data because it is closer to the mode and is insensitive to extreme values in the sample. However, the mode itself can be reliably estimated by first transforming the data into approximately normal data by raising the values to a real power, and then estimating the mean and standard deviation of the transformed data. With this method, two estimators of the mode of the original data are proposed: a simple estimator based on estimating the mean by the sample mean and the standard deviation by the sample standard deviation, and a more robust estimator based on estimating the mean by the median and the standard deviation by the standardized median absolute deviation. Both of these mode estimators were tested using simulated data drawn from normal (symmetric), lognormal (asymmetric), and Pareto (very asymmetric) distributions. The latter two distributions were chosen to test the generality of the method since they are not power transforms of the normal distribution. Each of the proposed estimators of the mode has a much lower variance than the mean and median for the two asymmetric distributions. When outliers were added to the simulations, the more robust of the two proposed mode estimators had a lower bias and variance than the median for the asymmetric distributions, especially when the level of contamination approached the 50% breakdown point. It is concluded that the mode is often a more reliable measure of location than the mean or median for asymmetric data. The proposed estimators also performed well relative to previous estimators of the mode. 
While different estimators are better under different conditions, the proposed robust estimator is reliable for a wide variety of distributions and contamination levels.
D. R. Bickel, “Microarray gene expression analysis: Data transformation and multiple-comparison bootstrapping,” Computing Science and Statistics 34, 383–400, Interface Foundation of North America (Proceedings of the 34th Symposium on the Interface, Montréal, Québec, Canada, April 17–20, 2002).
A simple transform function is proposed to preprocess the intensity of gene expression, where the intensity can be that of a colored dye for cDNA microarrays or a gauge of probe matching for oligonucleotide arrays. A new measure of skewness is introduced to show that the transform function effectively reduces the asymmetry of intensity values for Affymetrix data of Golub et al. (1999). This transform approaches a logarithmic transform for large intensities, but approaches a linear transform for small intensities, so that the effect of spurious ratios of small intensities is avoided. When the intensity is the average difference (AD) score, the suggested transform function preserves the stochastic nature of AD values rather than resetting negative values to an arbitrary positive value. A conservative estimator of the fold-change based on this transform is proposed. After the B-cell ALL and AML data of Golub et al. (1999) was transformed, a nonparametric bootstrapping method found that the number of genes considered differentially expressed is 48 when controlling the family-wise error rate at the 5% level and 572 when controlling the false-discovery rate at the 1% level.
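The abstract above does not state the transform itself. As a purely illustrative sketch, the inverse hyperbolic sine is one function with the limiting behavior described (approximately linear near zero, approximately logarithmic for large values, and defined for negative inputs). It is an assumption chosen for illustration and may differ from the function proposed in the paper; the name `illustrative_transform` is hypothetical.

```python
import math


def illustrative_transform(intensity):
    """Assumed stand-in, NOT the paper's transform: inverse hyperbolic sine.

    asinh(x) is close to x for small |x| and close to sign(x) * ln(2|x|)
    for large |x|, so negative intensities need no truncation.
    """
    return math.asinh(intensity)


# Compare the transform with its two limiting forms.
for x in (0.001, 0.1, 10.0, 1000.0):
    print(x, illustrative_transform(x), "linear:", x, "log:", math.log(2 * x))

# A negative average-difference score is mapped to a negative value.
print(illustrative_transform(-50.0))
```

Any function with the same two limits would serve the purpose described in the abstract; the choice here only makes the limits easy to check numerically.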
D. R. Bickel, “Conservative identification of differentially expressed genes using cDNA or oligonucleotide microarrays: Inference about values of high probability density,” 2002 Proceedings of the Joint Statistical Meetings of the American Statistical Association (Biometrics Section).
Many methods of identifying differential expression in genes depend on testing the null hypothesis of equal mean expression for each gene across two groups, even though a difference in the mean does not imply any difference in the distribution center. This can lead to many genes considered differentially expressed that might only differ in the tails of their expression distributions. A more conservative approach is to specifically test whether distributions differ in a parameter of location that does not depend on the tails. This can be accomplished by bootstrapping outlier-rejecting estimators of location parameters. Genes identified as differentially expressed can then be used in classification. In distinguishing microarrays from patients with different types of leukemia, the expression values of many more genes were found to differ in their means than were found to differ in their central values. The data was preprocessed using a transform that approaches a logarithmic transform for large intensities, but approaches a linear transform for small intensities, so that the effect of spurious ratios of small intensities was avoided; negative AD values were not arbitrarily truncated.
D. R. Bickel, “Robust cluster analysis of microarray gene expression data with the number of clusters determined biologically,” Bioinformatics.
Motivation: The success of each method of cluster analysis depends on how well its underlying model describes the patterns of expression. Outlier-resistant and distribution-insensitive clustering of genes are robust against violations of model assumptions. Results: A measure of dissimilarity that combines advantages of the Euclidean distance and the correlation coefficient is introduced. The measure can be made robust using a rank order correlation coefficient. A robust graphical method of summarizing the results of cluster analysis and a biological method of determining the number of clusters are also presented. These methods are applied to the data of DeRisi et al. (1997), showing that rank-based methods perform better than log-based methods.
Full article  Software (PLATO)
D. R. Bickel, “Robust estimators of the mode and skewness of continuous data,” Computational Statistics and Data Analysis 39, 153–163 (2002).
Measures of location based on the shortest half sample, including the shorth and the location of the least median of squares, are more robust than the median to outliers, but less robust to contamination near the location. Although such measures can estimate the mode, the proposed estimator of the mode, based on densest half ranges, has a much lower bias while having similar robustness. Like the median, this mode estimator has the highest breakdown point possible: the estimator has meaning if less than half the sample consists of outliers. The mode is more robust than the median in that the mode estimates are unaffected by outliers, whereas the median is influenced by each outlier. Robustness in this sense is quantified by the rejection point, the largest absolute value that is not rejected, which is low for the mode but infinite for the median. Even though the median is changed less by contamination near the location than is the mode, outliers generally pose more of a problem to estimation than contamination near the location, so the mode is more robust for data that may have a large number of outliers. A robust estimator of skewness is based on this mode estimator. Copyright (c) 2002 Elsevier Science B.V. All rights reserved.
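The idea of a mode estimate "based on densest half ranges" can be sketched as follows, under stated assumptions. The abstract gives no algorithmic detail, so the iteration below (repeatedly keep the points inside the densest interval whose width is half the current range), the leftmost tie-breaking rule, and the final averaging of the last one or two points are all assumptions of this sketch; the function name `half_range_mode` is hypothetical.

```python
def half_range_mode(values):
    """Sketch of a mode estimate from repeatedly taking the densest half range.

    Assumptions (not from the abstract): ties go to the leftmost window, and
    the mean of the last one or two remaining points is returned.
    """
    xs = sorted(values)
    if not xs:
        raise ValueError("need at least one value")
    while len(xs) > 2:
        width = (xs[-1] - xs[0]) / 2  # half of the current range
        if width == 0:  # all remaining values are equal
            break
        best_start, best_end, best_count = 0, 0, 0
        end = 0
        for start in range(len(xs)):
            end = max(end, start)
            # Extend the window while points stay within `width` of xs[start].
            while end + 1 < len(xs) and xs[end + 1] - xs[start] <= width:
                end += 1
            if end - start + 1 > best_count:
                best_start, best_end = start, end
                best_count = end - start + 1
        xs = xs[best_start:best_end + 1]  # keep only the densest half range
    return sum(xs) / len(xs)


# A tight cluster near 10 to 12 plus two distant values.
sample = [0, 10, 11, 12, 100]
print(half_range_mode(sample))
```

Because each step discards every point outside the densest half range, distant values are dropped early and do not influence the result, which is the sense of robustness the abstract attributes to the mode.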
D. R. Bickel, “Smoothing before estimating uncertainty, scaling, and intermittency: Application to short heart rate signals,” Fractals
Three aspects of time series are uncertainty (dispersion at a given time scale), scaling (timescale dependence), and intermittency (inclination to change dynamics). Simple measures of dispersion are the mean absolute deviation and the standard deviation; scaling exponents describe how dispersions change with the time scale. Intermittency has been defined as a difference between two scaling exponents. After taking a moving average, these measures give descriptive information, even for short heart rate records. For this data, dispersion and intermittency perform better than scaling exponents.
D. R. Bickel and D. Lai, “Asymptotic distribution of time-series intermittency estimates: applications to economic and clinical data,” Computational Statistics and Data Analysis 37, 419–431 (2001).
The intermittency of a time series can be defined as its normalized difference in scaling parameters. We establish the central limit theorem for the estimates of intermittency under the null hypothesis of a random walk. Simulations of random walks indicate that the distribution of intermittency estimates is slightly negatively skewed and positively biased, but that the skewness and bias approach zero as the length n of the random walks increases. We provide a formula by which the sample variance of the intermittency estimates of these simulations can be used to approximate the standard error of the intermittency for any large n. These results can be used to test whether the intermittency estimate of an observed long time series is significantly greater than zero, the intermittency of a random walk. This test reveals that the intermittency estimates of the S&P 500 index and of the heart rate of a human adult are significantly positive. The hypothesis testing proposed in this paper can also be applied to other observed time series to determine whether their intermittency estimates are sufficiently high for the series to be considered intermittent, or whether their estimates are small enough to be consistent with a random walk. Copyright (c) 2001 Elsevier Science B.V. All rights reserved.
D. R. Bickel, “Rest quantified by a fractal dimension of movement events: A biomedical application of intermittency estimation,” Fractals 8, 1–6 (2000).
The intermittency of a time-series, i.e. the extent to which it departs from slowly-varying, unifractal dynamics, can often be quantified by simple scale-free statistics. For fractal point processes, singular measures, and certain other models that describe physical and biological phenomena, the correlation codimension, C_{2}, quantifies intermittency. C_{2} of human activity during the night quantifies restfulness in that it is negatively correlated with the average activity level. However, C_{2} of activity appears to be more sensitive to the use of steroids than the average activity level. Copyright (c) 2000 World Scientific Publishing Company.
D. R. Bickel, “Generalized entropy and multifractality of time-series: Relationship between order and intermittency,” Chaos, Solitons & Fractals 13, 491–497 (2002).
The intermittency of a time-series is its tendency to have large departures from its characteristic dynamics. The quantification of intermittency has applications to the study of physical, biological, and economic phenomena. Intermittency has been quantified by multifractality, the extent to which generalized Hurst exponents differ. As an alternative descriptor of intermittent processes, we present a nonextensive measure of order, based on the Tsallis entropy of a sequence of symbols corresponding to the time-series. Like multifractality, nonextensive order increases with intermittency. Nonextensive order has the advantage that it does not assume scaling in the time-series, whereas a scaling region has to be identified in order to estimate multifractality. However, unlike multifractality, nonextensive order requires the selection of parameters used to generate subsequences of symbols from the time-series. Both nonextensive order and multifractality can distinguish time-series that have different levels of intermittency. In distinguishing simulated point processes of D=0.1 from those of D=0.5, nonextensive order and multifractality performed about equally well and nonextensive order performed better than its extensive counterpart. Multifractality more accurately distinguished processes with D=0.5 from those of D=0.9. Which statistic better describes a time-series depends on the specific application. Copyright (c) 2002 Elsevier Science B.V. All rights reserved.
B. J. West and D. R. Bickel, “Fractional-difference stochastic model of evolutionary substitutions in DNA sequences,” Physics Letters A 256, 188–196 (1999).
The number of molecular substitutions occurring in a DNA sequence over a given time is described by a fractional-difference random walk model. This is an empirically motivated stochastic model of molecular evolution and does not address the detailed evolutionary mechanisms that lead to the substitution of nucleotides. This fractal stochastic process yields a Fano factor, the ratio of the variance to the mean in the number of molecular substitutions, that increases as a power law in time. This prediction agrees with the observed statistics across 49 different genes in mammals. The fractional-difference model of molecular evolution is episodic and can be made consistent with the punctuated equilibrium model of macroevolution. Copyright (c) 1999 Elsevier Science B.V. All rights reserved.
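The Fano factor named above (also called the index of dispersion in the related abstracts) is the variance of a set of event counts divided by their mean. A minimal sketch of that computation follows; the helper names `fano_factor` and `window_counts`, the use of the population variance, and the toy event times are assumptions made for illustration and are not taken from the paper.

```python
from statistics import fmean, pvariance


def fano_factor(counts):
    """Ratio of the (population) variance to the mean of event counts."""
    mean = fmean(counts)
    if mean == 0:
        raise ValueError("Fano factor is undefined when the mean count is zero")
    return pvariance(counts) / mean


def window_counts(event_times, window, t_end):
    """Count events in consecutive non-overlapping windows on [0, t_end)."""
    n_windows = int(t_end // window)
    counts = [0] * n_windows
    for t in event_times:
        k = int(t // window)
        if 0 <= k < n_windows:
            counts[k] += 1
    return counts


# Hypothetical event times: a burst followed by a long quiet stretch.
events = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 5.5, 7.9]
for window in (1.0, 2.0, 4.0):
    counts = window_counts(events, window, t_end=8.0)
    print(window, counts, fano_factor(counts))
```

For a homogeneous Poisson process the ratio is expected to be near 1 at every window length, so a ratio that grows with the window length is the kind of overdispersion the abstract describes.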
D. R. Bickel, “Implications of fluctuations in substitution rates: Impact on the uncertainty of branch lengths and on relative-rate tests,” Journal of Molecular Evolution 50, 381–390 (2000).
Many tests of the lineage-dependence of substitution rates, computations of the error of evolutionary distances, and simulations of molecular evolution assume that the rate of evolution is constant in time within each lineage descended from a common ancestor. However, estimates of the index of dispersion of numbers of mammalian substitutions suggest that the rate has time-dependent variations consistent with a fractal-Gaussian-rate Poisson process, which assumes common descent without assuming rate constancy. While this model does not affect certain relative-rate tests, it substantially increases the uncertainty of branch lengths. Thus, fluctuations in the rate of substitution cannot be neglected in calculations that rely on evolutionary distances, such as the confidence intervals of divergence times and certain phylogenetic reconstructions. The fractal-Gaussian-rate Poisson process is compared and contrasted with previous models of molecular evolution, including other Poisson processes, the fractal renewal process, a Lévy-stable process, a fractional-difference process, and a log-Brownian process. The fractal models are more compatible with mammalian data than the nonfractal models considered, and they may also be better supported by Darwinian theory. Although the fractal-Gaussian-rate Poisson process has not been proven to have better agreement with data or theory than the other fractal models, its Gaussian nature simplifies the exploration of its impact on evolutionary distance errors and relative-rate tests. Copyright (c) 2000 Springer-Verlag New York Inc.
D. R. Bickel and B. J. West, “Molecular evolution modeled as a fractal Poisson process in agreement with mammalian sequence comparisons,” Molecular Biology and Evolution 15, 967–977 (1998).
The fractal doubly-stochastic Poisson process (FDSPP) model of molecular evolution, like other doubly-stochastic Poisson models, agrees with the high estimates for the index of dispersion found from sequence comparisons. Unlike certain previous models, the FDSPP also predicts a positive geometric correlation found between the index of dispersion and the mean number of substitutions. Such a relationship is statistically proven herein using comparisons between 49 mammalian genes. There is no characteristic rate associated with molecular evolution according to this model, but there is a scaling relationship in rates according to a fractal dimension of evolution. The FDSPP is a suitable replacement for the homogeneous Poisson process in tests of the lineage-dependence of rates and in estimating confidence intervals for divergence times. As opposed to other fractal models, this model can be interpreted in terms of Darwinian selection and drift. Copyright (c) 1998 Society for Molecular Biology and Evolution.
D. R. Bickel and B. J. West, “Molecular evolution modeled as a fractal renewal point process in agreement with the dispersion of substitutions in mammalian genes,” Journal of Molecular Evolution 47, 551–556 (1998).
A fractal renewal point process (FRPP) is used to model molecular evolution in agreement with the relationship between the variance and mean numbers of nonsynonymous and synonymous substitutions in mammals. Like other episodic models such as the doubly-stochastic Poisson process, this model accounts for the large variances observed in amino acid substitution rates, but unlike certain other episodic models, it also accounts for the increase in the index of dispersion with the mean number of substitutions in Ohta's (1995) data. We find that this correlation is significant for nonsynonymous substitutions at the 1% level and for synonymous substitutions at the 10% level, even after removing lineage effects and when using Bulmer's (1989) unbiased estimator of the index of dispersion. This model is simpler than most other overdispersed models of evolution in the sense that it is fully specified by a single interevent probability distribution. Interpretations in terms of chaotic dynamics and in terms of chance and selection are discussed. Copyright (c) 1998 Springer-Verlag New York Inc.
B. J. West and D. R. Bickel, “Molecular evolution modeled as a fractal stochastic process,” Physica A 249, 544–552 (1998).
Modeling the rate of nucleotide substitutions in DNA as a dichotomous stochastic process with an inverse power-law correlation function describes evolution by a fractal stochastic process (FSP). This FSP model agrees with recent findings on the relationship between the variance and mean number of synonymous and nonsynonymous substitutions in 49 different genes in mammals, that being a power-law increase in the ratio of the variance to the mean, the index of dispersion, with the number of substitutions in a protein. The probability of a given number of substitutions occurring in a time t is determined by a fractional diffusion equation whose solution is a truncated Lévy distribution implying that evolution is a Lévy process in time and yields the same functional behavior for the variance in the number of substitutions as the FSP model. In addition to obtaining these relationships, the FSP model implies lognormal statistics for the index of dispersion as a function of the mean number of substitutions in a protein, which is confirmed in the regression of the FSP model to data. Lognormal statistics suggest that molecular evolution can be viewed as a multiplicative stochastic process, rather than the linear additive process of Darwinian selection and drift. Copyright (c) 1998 Elsevier Science B.V. All rights reserved.
D. R. Bickel, “Estimating the intermittency of point processes with applications to human activity and viral DNA,” Physica A 265, 634-648 (1999).
The intermittency of a point process is the extent to which the number of events in a time window has pronounced departures from typical values. Combining point process and multifractal formalisms indicates that the correlation codimension can be used to quantify intermittency. The correlation codimension is easily estimated and is simply related to other second-order scaling exponents, such as those of the Fano factor and spectral density. The correlation codimensions are derived for various uncorrelated, fractal, and fractal-rate point processes. In addition, the estimation of intermittency as the correlation codimension of experimental events is illustrated with applications to experimental data. Human activity during bed rest is highly intermittent, while other human activity and viral DNA composition are nonintermittent. Copyright (c) 1999 Elsevier Science B.V. All rights reserved.
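As a rough illustration of one of the second-order quantities named in this abstract, the Fano factor of a point process is the variance of the event counts in windows of a given length divided by their mean. The sketch below is not the paper's estimator of the correlation codimension, which the abstract does not spell out; the function name `fano_factor`, the simulated Poisson events, and the window lengths are all assumptions made for illustration.

```python
import random
import statistics

def fano_factor(event_times, window, t_end):
    """Variance-to-mean ratio of event counts in consecutive windows."""
    n_windows = int(t_end // window)
    counts = [0] * n_windows
    for t in event_times:
        k = int(t // window)
        if k < n_windows:  # ignore events past the last full window
            counts[k] += 1
    return statistics.variance(counts) / statistics.fmean(counts)

# Hypothetical test data: a homogeneous Poisson process with unit rate,
# simulated by summing exponential inter-event intervals.
rng = random.Random(1)
t_end = 10000.0
t, events = 0.0, []
while t < t_end:
    t += rng.expovariate(1.0)
    events.append(t)

# Fano factor at several window lengths (arbitrary time units).
fano = {w: fano_factor(events, w, t_end) for w in (1.0, 10.0, 100.0)}
for w, f in fano.items():
    print(f"window {w:6.1f}: Fano factor {f:.3f}")
```

For an uncorrelated (Poisson) process the Fano factor should stay near 1 at every window length, whereas a fractal point process would show power-law growth with window length; the exponent of that growth is the kind of scaling exponent the abstract refers to.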
D. R. Bickel, “Simple estimation of intermittency in multifractal stochastic processes: Biomedical applications,” Physics Letters A 262, 251-256 (1999).
A number of physical and biological phenomena are intermittent in the sense that they tend to have large departures from their typical dynamics. The intermittency of a multifractal can be qualified and quantified by differential or nondifferential multifractality, the extent to which the generalized Hurst exponents differ. Multifractality is related to the generalized dimension of a singular measure, but also applies to other signals, including noises, walks, anomalous diffusion, and point processes. Multifractality has uses in data-model and data-data comparisons; e.g., the multifractality of the heart rate reveals the inadequacy of unifractal models and distinguishes healthy subjects from those with heart failure. In addition, the multifractality of human activity quantifies restfulness at night. Copyright (c) 1999 Elsevier Science B.V. All rights reserved.
D. R. Bickel and B. J. West, “Multiplicative and fractal processes in DNA evolution,” Fractals 6, 211-217 (1998).
Darwin's theory of evolution by natural selection revolutionized science in the nineteenth century. Not only providing a new paradigm for biology, the theory formed the basis for analogous interpretations of complex systems studied by other disciplines, such as sociology and psychology. With the subsequent linking of macroscopic phenomena to microscopic processes, the Darwinian interpretation was adapted to patterns observed in molecular evolution by assuming that natural selection operates fundamentally at the level of DNA. Thus, patterns of molecular evolution have important implications in many fields of science. Although the evolution rate of a given gene seems to be of approximately the same order of magnitude in all species, genes appear to differ in rate from one another by orders of magnitude, a fact which standard theory does not adequately explain. An understanding of the statistics of rates across different genes may shed light on this problem. The evolution rates of mammalian DNA, based on recent estimates of numbers of nonsynonymous substitutions in 49 genes of humans, rodents, and artiodactyls, are studied. We find that the rate variations are better described by lognormal statistics, as would be the case for a multiplicative process, than by Gaussian statistics, which would correspond to a linear, additive process. Thus, we introduce a multiplicative evolution statistical hypothesis (MESH), in which the theoretical explanation of these statistics requires the evolution of different substitution rates in different genes to be a multiplicative process in that each rate results from the interaction of a number of interdependent contingency processes. Lognormal statistics lend support to fractal process models of DNA substitutions, including anomalous diffusion processes and fractal stochastic point processes, such as the fractal renewal process and the fractal doubly stochastic Poisson process. 
The realization of a fractal process is a random self-similar time series with a power-law autocorrelation function, spectral density, and Fano factor over many time scales. Copyright (c) 1998 World Scientific Publishing Company.
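The contrast drawn in this abstract between multiplicative (lognormal) and additive (Gaussian) statistics can be illustrated with a small simulation. This is only a sketch under invented assumptions: the uniform factor distribution, the number of factors, and the names `multiplicative_rate` and `skewness` are hypothetical and are not taken from the paper or its data.

```python
import math
import random
import statistics

def skewness(xs):
    """Sample skewness: mean cubed deviation in units of the standard deviation."""
    m = statistics.fmean(xs)
    s = statistics.pstdev(xs)
    return statistics.fmean([((x - m) / s) ** 3 for x in xs])

rng = random.Random(2)

def multiplicative_rate(n_factors=30):
    """A rate built as a product of independent positive factors (hypothetical)."""
    rate = 1.0
    for _ in range(n_factors):
        rate *= rng.uniform(0.5, 1.5)
    return rate

rates = [multiplicative_rate() for _ in range(5000)]

# The raw rates are strongly right-skewed, while their logarithms are
# close to symmetric, as expected for an approximately lognormal variable.
skew_raw = skewness(rates)
skew_log = skewness([math.log(r) for r in rates])
print(f"skewness of rates:      {skew_raw:.2f}")
print(f"skewness of log(rates): {skew_log:.2f}")
```

Because the logarithm of a product is a sum, the central limit theorem makes the log-rates roughly Gaussian, which is why a product of many factors tends toward a lognormal distribution spanning orders of magnitude.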
D. R. Bickel, M. T. Verklan, and J. Moon, “Detection of anomalous diffusion using confidence intervals of the scaling exponent with application to preterm neonatal heart rate variability,” Physical Review E 58, 6440-6448 (1998).
The scaling exponent of the root mean square (rms) displacement quantifies the roughness of fractal or multifractal time series; it is equivalent to other second-order measures of scaling, such as the power-law exponents of the spectral density and autocorrelation function. For self-similar time series, the rms scaling exponent equals the Hurst parameter, which is related to the fractal dimension. A scaling exponent of 0.5 implies that the process is normal diffusion, which is equivalent to an uncorrelated random walk; otherwise, the process can be modeled as anomalous diffusion. Higher exponents indicate that the increments of the signal have positive correlations, while exponents below 0.5 imply that they have negative correlations. Scaling exponent estimates of successive segments of the increments of a signal are used to test the null hypothesis that the signal is normal diffusion, with the alternative hypothesis that the diffusion is anomalous. Dispersional analysis, a simple technique which does not require long signals, is used to estimate the scaling exponent from the slope of the linear regression of the logarithm of the standard deviation of binned data points on the logarithm of the number of points per bin. Computing the standard error of the scaling exponent using successive segments of the signal is superior to previous methods of obtaining the standard error, such as that based on the sum of squared errors used in the regression; the regression error is more of a measure of the deviation from power-law scaling than of the uncertainty of the scaling exponent estimate. Applying this test to preterm neonate heart rate data, it is found that time intervals between heart beats can be modeled as anomalous diffusion with negatively correlated increments. 
This corresponds to power spectra between 1/f and 1/f^{2}, whereas healthy adults are usually reported to have 1/f spectra, suggesting that the immaturity of the neonatal nervous system affects the scaling properties of the heart rate. Copyright (c) 1998 The American Physical Society.
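The dispersional analysis and segment-based standard error described in this abstract might be sketched as follows. This is a minimal illustration rather than the authors' implementation: it assumes each bin holds the sum of the increments it contains, so that uncorrelated increments give a slope near 0.5, and the function names, bin sizes, segment count, and simulated signals are all hypothetical.

```python
import math
import random
import statistics

def dispersional_exponent(increments, bin_sizes=(1, 2, 4, 8, 16, 32)):
    """Slope of log(SD of bin sums) regressed on log(points per bin)."""
    xs, ys = [], []
    for m in bin_sizes:
        n_bins = len(increments) // m
        sums = [sum(increments[i * m:(i + 1) * m]) for i in range(n_bins)]
        xs.append(math.log(m))
        ys.append(math.log(statistics.stdev(sums)))
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx  # ordinary least-squares slope

def segment_estimate(increments, n_segments=8):
    """Mean exponent over successive segments and its standard error."""
    seg_len = len(increments) // n_segments
    estimates = [
        dispersional_exponent(increments[k * seg_len:(k + 1) * seg_len])
        for k in range(n_segments)
    ]
    mean = statistics.fmean(estimates)
    sem = statistics.stdev(estimates) / math.sqrt(n_segments)
    return mean, sem

rng = random.Random(0)
# Uncorrelated increments: normal diffusion, exponent near 0.5.
white = [rng.gauss(0.0, 1.0) for _ in range(8192)]
h_white, se_white = segment_estimate(white)
# Differenced white noise: negatively correlated increments, exponent below 0.5.
anti = [white[i + 1] - white[i] for i in range(len(white) - 1)]
h_anti, se_anti = segment_estimate(anti)

print(f"uncorrelated increments:   {h_white:.3f} +/- {se_white:.3f}")
print(f"anticorrelated increments: {h_anti:.3f} +/- {se_anti:.3f}")
```

Comparing the distance of the mean exponent from 0.5 with the segment-based standard error gives a simple test of normal against anomalous diffusion, in the spirit of the approach the abstract describes.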
M. T. Verklan, D. R. Bickel, and J. Moon, “Heart rate variability of preterm neonates quantified by energy entropy,” Nursing and Health Sciences 1, 103-111 (1999).
Identifying variables predictive of neurobehavioral sequelae is a key objective in the study of high-risk neonates. Examination of heart rate variability (HRV) characteristics may be a finer discriminator of the neonate's response to physiologic stressors than the mean heart rate. The energy entropy of the heartbeat tachogram, computed in four different domains, was used to quantify the HRV in 13 preterm neonates. The entropies of energies were computed from 1024 interbeat time intervals obtained once per week from 26 to 35 weeks postconceptional age (PCA). The energy entropy computed in three of the domains, like the standard deviation of intervals, distinguished between the 10 neonates that were measured at 35 weeks PCA with 100% specificity and 67% sensitivity, but did not distinguish between healthy and unhealthy neonates at earlier ages. The findings suggest that energy entropy may be a discerning measure of physiologic stress in the preterm infant, although future research is needed to refine the test and determine statistical significance. Copyright (c) 1999 The Japanese Urological Association.
C. A. Ordonez, D. R. Bickel, V. C. Venezia, F. D. McDaniel, S. E. Matteson, and M. I. Molina, “Electronic ion energy loss calculations on the basis of the binary encounter approximation,” Journal of Nuclear Materials 264, 133-140 (1999).
Electronic ion energy loss calculations on the basis of the binary encounter approximation are presented for protons in oxygen, nitrogen and silicon. Calculations using both an analytical approach as well as a Monte Carlo approach are found to agree well with experimental data even down to energies below the stopping cross section maximum. Energy loss calculations for protons in silicon under channeling conditions are included and predictions are made for channeling in the <110> direction at low energies (5 to 500 keV). Copyright (c) 1999 Elsevier Science B.V. All rights reserved.
Last modified September 4, 2013
This web site is not affiliated with the University of Ottawa.