(The page you are now viewing is no longer updated or maintained.)
Foundations of statistics
Older statistical methodology & applications
“...there is a strong element of intellectual arrogance present in all of us,
not excluding myself. We tend to think that we are "God's gift to the world," and that we are highly original in posing problems and suggesting solutions.”
— O. Kempthorne, p. 451, discussion on Lindley, D. V. (1971) The estimation of many parameters (with discussion). In: Foundations of Statistical Inference, eds. V. P. Godambe and D. A. Sprott, Toronto: Holt, Rinehart & Winston, pp. 435-455.
Y. Yang and D. R. Bickel, “Minimum description length and empirical Bayes methods of identifying SNPs associated with disease,” Technical Report, Ottawa Institute of Systems Biology, COBRA Preprint Series, Article 74, available at biostats.bepress.com/cobra/ps/art74 (2010). Full preprint | Software
Z. Montazeri, C. M. Yanofsky, and D. R. Bickel [the first two authors contributed equally], “Shrinkage estimation of effect sizes as an alternative to hypothesis testing followed by estimation in high-dimensional biology: Applications to differential gene expression,” Statistical Applications in Genetics and Molecular Biology 9 (1) 23 (2010). Article | Software | Supplementary material | Draft
C. M. Yanofsky and D. R. Bickel, “Validation of differential gene expression algorithms: Application comparing fold change estimation to hypothesis testing,” BMC Bioinformatics 11, 63 (2010). Article | Draft
D. R. Bickel, “Correcting the estimated level of differential expression for gene selection bias: Application to a microarray study,” Statistical Applications in Genetics and Molecular Biology 7 (1) 10, (2008). Article
D. R. Bickel, “Degrees of differential gene expression: Detecting biologically significant expression differences and estimating their magnitudes,” Bioinformatics 20, 682-688 (2004). Abstract and main article | Supplementary material | Software (Statomics)
D. R. Bickel, Z. Montazeri, P.-C. Hsieh, M. Beatty, S. J. Lawit, and N. J. Bate, “Gene network reconstruction from transcriptional dynamics under kinetic model uncertainty: A case for the second derivative,” Bioinformatics 25, 772-779 (2009). Open access (PDF) | Supplement & software | Data
D. R. Bickel, “Probabilities of spurious connections in gene networks: Application to expression time series,” Bioinformatics 21, 1121-1128 (2005). Abstract and main article | Uncorrected version | Supplementary material (corrected) | Scalable Fig. 3 (corrected) | Software (Statomics 0.4 fixed bug)
D. R. Bickel, “Robust cluster analysis of microarray gene expression data with the number of clusters determined biologically,” Bioinformatics 19, 818-824 (2003). Abstract and article | Software (PLATO) | Full preprint
D. R. Bickel, “Minimum description length methods of medium-scale simultaneous inference,” Technical Report, Ottawa Institute of Systems Biology, arXiv:1009.5981 (2010). Full preprint
Z. Yang, Z. Li, and D. R. Bickel, “Empirical Bayes estimation of posterior probabilities of enrichment,” Technical Report, Ottawa Institute of Systems Biology, arXiv:1201.0153 (2011). Full preprint | 2010 seed | Software
D. R. Bickel, “Simple estimators of false discovery rates given as few as one or two p-values without strong parametric assumptions,” Technical Report, Ottawa Institute of Systems Biology, arXiv:1106.4490 (2011). Full preprint
D. R. Bickel, “Small-scale inference: Empirical Bayes and confidence methods for as few as a single comparison,” Technical Report, Ottawa Institute of Systems Biology, arXiv:1104.0341 (2011). Full preprint
D. R. Bickel, “Estimating the null distribution to adjust observed confidence levels for genome-scale screening,” Biometrics 67, 363-370 (2011). Abstract and article | French abstract | Supplementary material | Full preprint
M. Guo, S. Yang, M. Rupe, B. Hu, D. R. Bickel, L. Arthur, and O. Smith, “Genome-wide allele-specific expression analysis using Massively Parallel Signature Sequencing (MPSS) reveals cis- and trans-effects on gene expression in maize hybrid meristem tissue,” Plant Molecular Biology 66, 551-563 (2008).
D. R. Bickel, “Error-rate and decision-theoretic methods of multiple testing: Which genes have high objective probabilities of differential expression?” Statistical Applications in Genetics and Molecular Biology 3 (1) 8, (2004). Article | S-PLUS software | R software | Full preprint [earlier version cited as 'Bickel, D. R. (2003), "Selecting an optimal rejection region for multiple testing: A decision-theoretic alternative to FDR control, with an application to microarrays," Tech. rep., Medical College of Georgia']
D. R. Bickel, “On 'Strong control, conservative point estimation and simultaneous conservative consistency of false discovery rates': Does a large number of tests obviate confidence intervals of the FDR?” Technical Report, Medical College of Georgia, arXiv:q-bio.GN/0404032 (2004). Technical report | Software (Statomics)
D. R. Bickel, “Reliably determining which genes have a high posterior probability of differential expression: A microarray application of decision-theoretic multiple testing,” Technical Report, Medical College of Georgia, arXiv:q-bio.QM/0402048 (2004). Technical report | S-PLUS software | R software
D. R. Bickel, “Microarray gene expression analysis: Data transformation and multiple-comparison bootstrapping,” Computing Science and Statistics 34, 383-400, Interface Foundation of North America (Proceedings of the 34th Symposium on the Interface, Montréal, Québec, Canada, April 17-20, 2002). Abstract | Full article | Software (BioinfoStat)
D. R. Bickel, “Conservative identification of differentially expressed genes using cDNA or oligonucleotide microarrays: Inference about values of high probability density,” 2002 Proceedings of the American Statistical Association, Biometrics Section [CD-ROM], American Statistical Association: Alexandria, VA (2002). Abstract | Full article
D. R. Bickel, “A predictive approach to measuring the strength of statistical evidence for single and multiple comparisons,” Canadian Journal of Statistics 39, 610–631 (2011). Full article | Revised preprint | 2010 draft
D. R. Bickel, “Measuring support for a hypothesis about a random parameter without estimating its unknown prior,” Technical Report, Ottawa Institute of Systems Biology, arXiv:1101.0305 (2011). Full preprint
D. R. Bickel, “Game-theoretic probability combination with applications to resolving conflicts between statistical methods,” International Journal of Approximate Reasoning 53, 880-891 (2012). Full article | 2011 preprint
D. R. Bickel, “Controlling the degree of caution in statistical inference with the Bayesian and frequentist approaches as opposite extremes,” Electronic Journal of Statistics 6, 686-709 (2012). Full article (open access) | 2011 preprint
D. R. Bickel, “Blending Bayesian and frequentist methods according to the precision of prior information with applications to hypothesis testing,” Technical Report, Ottawa Institute of Systems Biology, available from http://goo.gl/kCVUs (2012). 2012 preprint | 2011 preprint
D. R. Bickel, “Empirical Bayes interval estimates that are conditionally equal to unadjusted confidence intervals or to default prior credibility intervals,” Statistical Applications in Genetics and Molecular Biology 11 (3), art. 7 (2012). Full article | 2010 preprint
D. R. Bickel, “A prior-free framework of coherent inference and its derivation of simple shrinkage estimators,” Technical Report, Ottawa Institute of Systems Biology, available from http://goo.gl/aUSLr (2012). 2012 preprint
D. R. Bickel, “A frequentist framework of inductive reasoning,” Technical Report, Ottawa Institute of Systems Biology, arXiv:math.ST/0602377 (2009). Christmas revision | Full preprint | Mode estimation software
D. R. Bickel, “Robust and efficient estimation of the mode of continuous data: The mode as a viable measure of central tendency,” Journal of Statistical Computation and Simulation 73, 899-912 (2003); peer-reviewed preprint: InterStat, November 2001, http://interstat.stat.vt.edu/interstat/articles/2001/abstracts/n01001.html-ssi. Abstract | Full article
The following papers describe fractal models of evolution and their compatibility with DNA data.
D. R. Bickel and B. J. West, “Multiplicative and fractal processes in DNA evolution,” Fractals 6, 211-217 (1998). This paper provides an introduction to molecular evolution and its assumptions. Abstract
B. J. West and D. R. Bickel, “Fractional-difference stochastic model of evolutionary substitutions in DNA sequences,” Physics Letters A 256, 188-196 (1999). This paper also discusses assumptions made in models of DNA evolution. Abstract
D. R. Bickel, “Implications of fluctuations in substitution rates: Impact on the uncertainty of branch lengths and on relative-rate tests,” Journal of Molecular Evolution 50, 381-390 (2000). Abstract
D. R. Bickel and B. J. West, “Molecular evolution modeled as a fractal Poisson process in agreement with mammalian sequence comparisons,” Molecular Biology and Evolution 15, 967-977 (1998). Abstract
D. R. Bickel and B. J. West, “Molecular evolution modeled as a fractal renewal point process in agreement with the dispersion of substitutions in mammalian genes,” Journal of Molecular Evolution 47, 551-556 (1998). Abstract
B. J. West and D. R. Bickel, “Molecular evolution modeled as a fractal stochastic process,” Physica A 249, 544-552 (1998). Abstract
D. R. Bickel and D. Lai, “Asymptotic distribution of time-series intermittency estimates: applications to economic and clinical data,” Computational Statistics and Data Analysis 37, 419-431 (2001). Abstract | Full preprint
D. R. Bickel, “Estimating the intermittency of point processes with applications to human activity and viral DNA,” Physica A 265, 634-648 (1999). Abstract
D. R. Bickel, “Simple estimation of intermittency in multifractal stochastic processes: Biomedical applications,” Physics Letters A 262, 251-256 (1999). Abstract
D. R. Bickel, “Generalized entropy and multifractality of time-series: Relationship between order and intermittency,” Chaos, Solitons & Fractals 13, 491-497 (2002). Abstract
D. R. Bickel, M. T. Verklan, and J. Moon, “Detection of anomalous diffusion using confidence intervals of the scaling exponent with application to preterm neonatal heart rate variability,” Physical Review E 58, 6440-6448 (1998). Abstract | Full article
M. T. Verklan, D. R. Bickel, and J. Moon, “Heart rate variability of preterm neonates quantified by energy entropy,” Nursing and Health Sciences 1, 103-111 (1999). Abstract
D. R. Bickel, comment on "Sequential Monte Carlo for Bayesian Computation" (P. Del Moral, A. Doucet, A. Jasra) in Bayesian Statistics 8 (Oxford Science Publications, 2007, p. 140), available as arXiv:math.ST/0606557 (2006). | Abstract: Is there a class of static inference problems for which the backward-kernel approach is better suited than a mixture transition kernel that automatically adapts to the target distribution?
C. A. Ordonez, D. R. Bickel, V. C. Venezia, F. D. McDaniel, S. E. Matteson, and M. I. Molina, “Electronic ion energy loss calculations on the basis of the binary encounter approximation,” Journal of Nuclear Materials 264, 133-140 (1999). Abstract | This article includes a description of the accept-reject Monte Carlo simulations of D. R. Bickel, The Stopping Power of Amorphous and Channelled Silicon at all Energies as Computed with the Binary Encounter Approximation, MA thesis, University of North Texas, Denton, Texas (1994).
D. R. Bickel, “Robust and efficient estimation of the mode of continuous data: The mode as a viable measure of central tendency,” Journal of Statistical Computation and Simulation; peer-reviewed preprint: InterStat, November 2001, http://interstat.stat.vt.edu/interstat/articles/2001/abstracts/n01001.html-ssi.
Although a natural measure of the central tendency of a sample of continuous data is its mode (the most probable value), the mean and median are the most popular measures of location due to their simplicity and ease of estimation. The median is often used instead of the mean for asymmetric data because it is closer to the mode and is insensitive to extreme values in the sample. However, the mode itself can be reliably estimated by first transforming the data into approximately normal data by raising the values to a real power, and then estimating the mean and standard deviation of the transformed data. With this method, two estimators of the mode of the original data are proposed: a simple estimator based on estimating the mean by the sample mean and the standard deviation by the sample standard deviation, and a more robust estimator based on estimating the mean by the median and the standard deviation by the standardized median absolute deviation. Both of these mode estimators were tested using simulated data drawn from normal (symmetric), lognormal (asymmetric), and Pareto (very asymmetric) distributions. The latter two distributions were chosen to test the generality of the method since they are not power transforms of the normal distribution. Each of the proposed estimators of the mode has a much lower variance than the mean and median for the two asymmetric distributions. When outliers were added to the simulations, the more robust of the two proposed mode estimators had a lower bias and variance than the median for the asymmetric distributions, especially when the level of contamination approached the 50% breakdown point. It is concluded that the mode is often a more reliable measure of location than the mean or median for asymmetric data. The proposed estimators also performed well relative to previous estimators of the mode. 
While different estimators are better under different conditions, the proposed robust estimator is reliable for a wide variety of distributions and contamination levels.
D. R. Bickel, “Microarray gene expression analysis: Data transformation and multiple-comparison bootstrapping,” Computing Science and Statistics 34, 383-400, Interface Foundation of North America (Proceedings of the 34th Symposium on the Interface, Montréal, Québec, Canada, April 17-20, 2002).
A simple transform function is proposed to preprocess the intensity of gene expression, where the intensity can be that of a colored dye for cDNA microarrays or a gauge of probe matching for oligonucleotide arrays. A new measure of skewness is introduced to show that the transform function effectively reduces the asymmetry of intensity values for Affymetrix data of Golub et al. (1999). This transform approaches a logarithmic transform for large intensities, but approaches a linear transform for small intensities, so that the effect of spurious ratios of small intensities is avoided. When the intensity is the average difference (AD) score, the suggested transform function preserves the stochastic nature of AD values rather than resetting negative values to an arbitrary positive value. A conservative estimator of the fold-change based on this transform is proposed. After the B-cell ALL and AML data of Golub et al. (1999) was transformed, a nonparametric bootstrapping method found that the number of genes considered differentially expressed is 48 when controlling the family-wise error rate at the 5% level and 572 when controlling the false-discovery rate at the 1% level.
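The limiting behaviour described in this abstract (roughly linear for small intensities, roughly logarithmic for large ones) can be illustrated with a stand-in function. The sketch below uses the inverse hyperbolic sine, which is an assumption chosen only because it has both limits and accepts negative values; the abstract does not state the paper's actual transform, and the function name and the `scale` parameter are hypothetical.

```python
import math

def asinh_transform(intensity, scale=1.0):
    """Stand-in transform (an assumption, not the paper's function).

    asinh(x) behaves like x near zero and like log(2x) for large x,
    and it is defined for negative inputs such as negative AD scores.
    """
    return math.asinh(intensity / scale)

# Compare the stand-in with the linear and logarithmic limits.
for x in (-50.0, 0.001, 1.0, 1e3, 1e6):
    y = asinh_transform(x)
    log_limit = math.log(2.0 * x) if x > 0 else float("nan")
    print(f"x={x:>10}  transform={y:.6f}  linear={x:.6f}  log(2x)={log_limit:.6f}")
```

Because negative inputs map to negative outputs, nothing has to be reset to an arbitrary positive value before transforming.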
D. R. Bickel, “Conservative identification of differentially expressed genes using cDNA or oligonucleotide microarrays: Inference about values of high probability density,” 2002 Proceedings of the Joint Statistical Meetings of the American Statistical Association (Biometrics Section).
Many methods of identifying differential expression in genes depend on testing the null hypothesis of equal mean expression for each gene across two groups, even though a difference in the mean does not imply any difference in the distribution center. This can lead to many genes considered differentially expressed that might only differ in the tails of their expression distributions. A more conservative approach is to specifically test whether distributions differ in a parameter of location that does not depend on the tails. This can be accomplished by bootstrapping outlier-rejecting estimators of location parameters. Genes identified as differentially expressed can then be used in classification. In distinguishing microarrays from patients with different types of leukemia, the expression values of many more genes were found to differ in their means than were found to differ in their central values. The data was preprocessed using a transform that approaches a logarithmic transform for large intensities, but approaches a linear transform for small intensities, so that the effect of spurious ratios of small intensities was avoided; negative AD values were not arbitrarily truncated.
Motivation: The success of each method of cluster analysis depends on how well its underlying model describes the patterns of expression. Outlier-resistant and distribution-insensitive clustering of genes are robust against violations of model assumptions. Results: A measure of dissimilarity that combines advantages of the Euclidean distance and the correlation coefficient is introduced. The measure can be made robust using a rank order correlation coefficient. A robust graphical method of summarizing the results of cluster analysis and a biological method of determining the number of clusters are also presented. These methods are applied to the data of DeRisi et al. (1997), showing that rank-based methods perform better than log-based methods.
Measures of location based on the shortest half sample, including the shorth and the location of the least median of squares, are more robust than the median to outliers, but less robust to contamination near the location. Although such measures can estimate the mode, the proposed estimator of the mode, based on densest half ranges, has a much lower bias while having similar robustness. Like the median, this mode estimator has the highest breakdown point possible: the estimator has meaning if less than half the sample consists of outliers. The mode is more robust than the median in that the mode estimates are unaffected by outliers, whereas the median is influenced by each outlier. Robustness in this sense is quantified by the rejection point, the largest absolute value that is not rejected, which is low for the mode but infinite for the median. Even though the median is changed less by contamination near the location than is the mode, outliers generally pose more of a problem to estimation than contamination near the location, so the mode is more robust for data that may have a large number of outliers. A robust estimator of skewness is based on this mode estimator. Copyright (c) 2002 Elsevier Science B.V. All rights reserved.
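One plausible reading of "densest half ranges" can be sketched as follows: repeatedly keep the points that fall in the most populated window whose width is half the current range, until at most two points remain. This is an illustrative assumption, not the published algorithm; the function name, the tie rule (leftmost window), and the stopping rule are all hypothetical.

```python
import statistics

def half_range_mode(sample):
    """Sketch of a densest-half-range mode estimate (assumed reading)."""
    points = sorted(sample)
    while len(points) > 2:
        width = (points[-1] - points[0]) / 2.0
        if width == 0.0:  # all remaining values are identical
            break
        best = []
        for i, start in enumerate(points):
            # Points inside the window [start, start + width].
            window = [p for p in points[i:] if p <= start + width]
            if len(window) > len(best):  # ties: keep the leftmost window
                best = window
        points = best
    return statistics.fmean(points)

data = [0.0, 0.1, 0.12, 0.15, 0.2, 50.0, 100.0]
print("mean:  ", statistics.fmean(data))
print("median:", statistics.median(data))
print("mode:  ", half_range_mode(data))
```

In this toy sample, moving the largest value from 100 to 1e6 leaves the estimate unchanged, because that value is discarded in the first iteration either way. This is the sense of outlier rejection that the abstract describes.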
Three aspects of time series are uncertainty (dispersion at a given time scale), scaling (time-scale dependence), and intermittency (inclination to change dynamics). Simple measures of dispersion are the mean absolute deviation and the standard deviation; scaling exponents describe how dispersions change with the time scale. Intermittency has been defined as a difference between two scaling exponents. After taking a moving average, these measures give descriptive information, even for short heart rate records. For this data, dispersion and intermittency perform better than scaling exponents.
The intermittency of a time series can be defined as its normalized difference in scaling parameters. We establish the central limit theorem for the estimates of intermittency under the null hypothesis of a random walk. Simulations of random walks indicate that the distribution of intermittency estimates is slightly negatively skewed and positively biased, but that the skewness and bias approach zero as the length n of the random walks increases. We provide a formula by which the sample variance of the intermittency estimates of these simulations can be used to approximate the standard error of the intermittency for any large n. These results can be used to test whether the intermittency estimate of an observed long time series is significantly greater than zero, the intermittency of a random walk. This test reveals that the intermittency estimates of the S&P 500 index and of the heart rate of a human adult are significantly positive. The hypothesis testing proposed in this paper can also be applied to other observed time series to determine whether their intermittency estimates are sufficiently high for the series to be considered intermittent, or whether their estimates are small enough to be consistent with a random walk. Copyright (c) 2001 Elsevier Science B.V. All rights reserved.
The intermittency of a time-series, i.e. the extent to which it departs from slowly-varying, unifractal dynamics, can often be quantified by simple scale-free statistics. For fractal point processes, singular measures, and certain other models that describe physical and biological phenomena, the correlation co-dimension, C2, quantifies intermittency. C2 of human activity during the night quantifies restfulness in that it is negatively correlated with the average activity level. However, C2 of activity appears to be more sensitive to the use of steroids than the average activity level. Copyright (c) 2000 World Scientific Publishing Company.
The intermittency of a time-series is its tendency to have large departures from its characteristic dynamics. The quantification of intermittency has applications to the study of physical, biological, and economic phenomena. Intermittency has been quantified by multifractality, the extent to which generalized Hurst exponents differ. As an alternative descriptor of intermittent processes, we present a nonextensive measure of order, based on the Tsallis entropy of a sequence of symbols corresponding to the time-series. Like multifractality, nonextensive order increases with intermittency. Nonextensive order has the advantage that it does not assume scaling in the time-series, whereas a scaling region has to be identified in order to estimate multifractality. However, unlike multifractality, nonextensive order requires the selection of parameters used to generate subsequences of symbols from the time-series.
Both nonextensive order and multifractality can distinguish time-series that have different levels of intermittency. In distinguishing simulated point processes of D=0.1 from those of D=0.5, nonextensive order and multifractality performed about equally well and nonextensive order performed better than its extensive counterpart. Multifractality more accurately distinguished processes with D=0.5 from those of D=0.9. Which statistic better describes a time-series depends on the specific application. Copyright (c) 2002 Elsevier Science B.V. All rights reserved.
The number of molecular substitutions occurring in a DNA sequence over a given time is described by a fractional-difference random walk model. This is an empirically motivated stochastic model of molecular evolution and does not address the detailed evolutionary mechanisms that lead to the substitution of nucleotides. This fractal stochastic process yields a Fano factor, the ratio of the variance to the mean in the number of molecular substitutions, that increases as a power law in time. This prediction agrees with the observed statistics across 49 different genes in mammals. The fractional-difference model of molecular evolution is episodic and can be made consistent with the punctuated equilibrium model of macroevolution. Copyright (c) 1999 Elsevier Science B.V. All rights reserved.
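The Fano factor defined in this abstract (variance of the counts divided by their mean) is simple to compute. The sketch below shows it for a simulated homogeneous Poisson process, used here only as a baseline and not as the fractional-difference model of the paper. The helper names, the rate, and the window lengths are illustrative assumptions.

```python
import random
import statistics

def fano_factor(counts):
    """Ratio of the sample variance to the sample mean of event counts."""
    return statistics.variance(counts) / statistics.fmean(counts)

def poisson_counts(rate, window, n_windows, rng):
    """Event counts per window for a homogeneous Poisson process,
    simulated with exponentially distributed gaps between events."""
    counts = [0] * n_windows
    end = window * n_windows
    t = rng.expovariate(rate)
    while t < end:
        index = min(int(t // window), n_windows - 1)  # guard against rounding
        counts[index] += 1
        t += rng.expovariate(rate)
    return counts

rng = random.Random(1)
for window in (1.0, 4.0, 16.0):
    counts = poisson_counts(rate=5.0, window=window, n_windows=2000, rng=rng)
    print(f"window={window:>5}  Fano factor={fano_factor(counts):.3f}")
```

For a homogeneous Poisson process the ratio is expected to stay near 1 at every window length, which is the baseline against which a power-law increase in time would stand out.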
Many tests of the lineage-dependence of substitution rates, computations of the error of evolutionary distances, and simulations of molecular evolution assume that the rate of evolution is constant in time within each lineage descended from a common ancestor. However, estimates of the index of dispersion of numbers of mammalian substitutions suggest that the rate has time-dependent variations consistent with a fractal-Gaussian-rate Poisson process, which assumes common descent without assuming rate constancy. While this model does not affect certain relative-rate tests, it substantially increases the uncertainty of branch lengths. Thus, fluctuations in the rate of substitution cannot be neglected in calculations that rely on evolutionary distances, such as the confidence intervals of divergence times and certain phylogenetic reconstructions. The fractal-Gaussian-rate Poisson process is compared and contrasted with previous models of molecular evolution, including other Poisson processes, the fractal renewal process, a Lévy-stable process, a fractional-difference process, and a log-Brownian process. The fractal models are more compatible with mammalian data than the non-fractal models considered, and they may also be better supported by Darwinian theory. Although the fractal-Gaussian-rate Poisson process has not been proven to have better agreement with data or theory than the other fractal models, its Gaussian nature simplifies the exploration of its impact on evolutionary distance errors and relative-rate tests. Copyright (c) 2000 Springer-Verlag New York Inc.
The fractal doubly-stochastic Poisson process (FDSPP) model of molecular evolution, like other doubly-stochastic Poisson models, agrees with the high estimates for the index of dispersion found from sequence comparisons. Unlike certain previous models, the FDSPP also predicts a positive geometric correlation found between the index of dispersion and the mean number of substitutions. Such a relationship is statistically proven herein using comparisons between 49 mammalian genes. There is no characteristic rate associated with molecular evolution according to this model, but there is a scaling relationship in rates according to a fractal dimension of evolution. The FDSPP is a suitable replacement for the homogeneous Poisson process in tests of the lineage-dependence of rates and in estimating confidence intervals for divergence times. As opposed to other fractal models, this model can be interpreted in terms of Darwinian selection and drift. Copyright (c) 1998 Society for Molecular Biology and Evolution.
D. R. Bickel and B. J. West, “Molecular evolution modeled as a fractal renewal point process in agreement with the dispersion of substitutions in mammalian genes,” Journal of Molecular Evolution 47, 551-556 (1998).
A fractal renewal point process (FRPP) is used to model molecular evolution in agreement with the relationship between the variance and mean numbers of nonsynonymous and synonymous substitutions in mammals. Like other episodic models such as the doubly-stochastic Poisson process, this model accounts for the large variances observed in amino acid substitution rates, but unlike certain other episodic models, it also accounts for the increase in the index of dispersion with the mean number of substitutions in Ohta's (1995) data. We find that this correlation is significant for nonsynonymous substitutions at the 1% level and for synonymous substitutions at the 10% level, even after removing lineage effects and when using Bulmer's (1989) unbiased estimator of the index of dispersion. This model is simpler than most other overdispersed models of evolution in the sense that it is fully specified by a single interevent probability distribution. Interpretations in terms of chaotic dynamics and in terms of chance and selection are discussed. Copyright (c) 1998 Springer-Verlag New York Inc.
Modeling the rate of nucleotide substitutions in DNA as a dichotomous stochastic process with an inverse power-law correlation function describes evolution by a fractal stochastic process (FSP). This FSP model agrees with recent findings on the relationship between the variance and mean number of synonymous and nonsynonymous substitutions in 49 different genes in mammals, that being a power-law increase in the ratio of the variance to the mean, the index of dispersion, with the number of substitutions in a protein. The probability of a given number of substitutions occurring in a time t is determined by a fractional diffusion equation whose solution is a truncated Lévy distribution, implying that evolution is a Lévy process in time and yields the same functional behavior for the variance in the number of substitutions as the FSP model. In addition to obtaining these relationships, the FSP model implies lognormal statistics for the index of dispersion as a function of the mean number of substitutions in a protein, which is confirmed in the regression of the FSP model to data. Lognormal statistics suggest that molecular evolution can be viewed as a multiplicative stochastic process, rather than the linear additive process of Darwinian selection and drift. Copyright (c) 1998 Elsevier Science B.V. All rights reserved.
The intermittency of a point process is the extent to which the number of events in a time window has pronounced departures from typical values. Combining point process and multifractal formalisms indicates that the correlation codimension can be used to quantify intermittency. The correlation codimension is easily estimated and is simply related to other second order scaling exponents, such as those of the Fano factor and spectral density. The correlation codimensions are derived for various uncorrelated, fractal, and fractal-rate point processes. In addition, the estimation of intermittency as the correlation codimension of experimental events is illustrated with applications to experimental data. Human activity during bed rest is highly intermittent, while other human activity and viral DNA composition are non-intermittent. Copyright (c) 1999 Elsevier Science B.V. All rights reserved.
A number of physical and biological phenomena are intermittent in the sense that they tend to have large departures from their typical dynamics. The intermittency of a multifractal can be qualified and quantified by differential or nondifferential multifractality, the extent to which the generalized Hurst exponents differ. Multifractality is related to the generalized dimension of a singular measure, but also applies to other signals, including noises, walks, anomalous diffusion, and point processes. Multifractality has uses in data-model and data-data comparisons; e.g., the multifractality of the heart rate reveals the inadequacy of unifractal models and distinguishes healthy subjects from those with heart failure. In addition, the multifractality of human activity quantifies restfulness at night. Copyright (c) 1999 Elsevier Science B.V. All rights reserved.
Darwin's theory of evolution by natural selection revolutionized science in the nineteenth century. Not only providing a new paradigm for biology, the theory formed the basis for analogous interpretations of complex systems studied by other disciplines, such as sociology and psychology. With the subsequent linking of macroscopic phenomena to microscopic processes, the Darwinian interpretation was adapted to patterns observed in molecular evolution by assuming that natural selection operates fundamentally at the level of DNA. Thus, patterns of molecular evolution have important implications in many fields of science. Although the evolution rate of a given gene seems to be of approximately the same order of magnitude in all species, genes appear to differ in rate from one another by orders of magnitude, a fact which standard theory does not adequately explain. An understanding of the statistics of rates across different genes may shed light on this problem. The evolution rates of mammalian DNA, based on recent estimates of numbers of nonsynonymous substitutions in 49 genes of humans, rodents, and artiodactyls, are studied. We find that the rate variations are better described by lognormal statistics, as would be the case for a multiplicative process, than by Gaussian statistics, which would correspond to a linear, additive process. Thus, we introduce a multiplicative evolution statistical hypothesis (MESH), in which the theoretical explanation of these statistics requires the evolution of different substitution rates in different genes to be a multiplicative process in that each rate results from the interaction of a number of interdependent contingency processes. Lognormal statistics lend support to fractal process models of DNA substitutions, including anomalous diffusion processes and fractal stochastic point processes, such as the fractal renewal process and the fractal doubly-stochastic Poisson process. The realization of a fractal process is a random self-similar time series with a power-law autocorrelation function, spectral density, and Fano factor over many time scales. Copyright (c) 1998 World Scientific Publishing Company.
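The abstract above does not say how the lognormal and Gaussian descriptions were compared. One common approach, sketched here in Python purely as an assumption, is to compare the maximized log-likelihoods of the two fitted distributions. The rates below are simulated and hypothetical; they are not the substitution-rate data from the paper.

```python
import math
import random


def gaussian_loglik(values):
    """Maximized log-likelihood of a Gaussian fitted to the values."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n  # maximum-likelihood variance
    return -0.5 * n * (math.log(2.0 * math.pi * var) + 1.0)


def lognormal_loglik(values):
    """Maximized log-likelihood of a lognormal fitted to positive values."""
    logs = [math.log(v) for v in values]
    # Gaussian fit on the log scale plus the Jacobian term -sum(log v).
    return gaussian_loglik(logs) - sum(logs)


# Hypothetical rates spanning orders of magnitude (not the paper's data).
rng = random.Random(1)
rates = [rng.lognormvariate(0.0, 1.5) for _ in range(49)]

print("Gaussian: ", round(gaussian_loglik(rates), 1))
print("lognormal:", round(lognormal_loglik(rates), 1))
```

Because both models have two fitted parameters, the larger log-likelihood indicates the better description under this criterion; rates spread over orders of magnitude will typically favor the lognormal.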
D. R. Bickel, M. T. Verklan, and J. Moon, “Detection of anomalous diffusion using confidence intervals of the scaling exponent with application to preterm neonatal heart rate variability,” Physical Review E 58, 6440-6448 (1998).
The scaling exponent of the root mean square (rms) displacement quantifies the roughness of fractal or multifractal time series; it is equivalent to other second-order measures of scaling, such as the power-law exponents of the spectral density and autocorrelation function. For self-similar time series, the rms scaling exponent equals the Hurst parameter, which is related to the fractal dimension. A scaling exponent of 0.5 implies that the process is normal diffusion, which is equivalent to an uncorrelated random walk; otherwise, the process can be modeled as anomalous diffusion. Higher exponents indicate that the increments of the signal have positive correlations, while exponents below 0.5 imply that they have negative correlations. Scaling exponent estimates of successive segments of the increments of a signal are used to test the null hypothesis that the signal is normal diffusion, with the alternative hypothesis that the diffusion is anomalous. Dispersional analysis, a simple technique which does not require long signals, is used to estimate the scaling exponent from the slope of the linear regression of the logarithm of the standard deviation of binned data points on the logarithm of the number of points per bin. Computing the standard error of the scaling exponent using successive segments of the signal is superior to previous methods of obtaining the standard error, such as that based on the sum of squared errors used in the regression; the regression error is more of a measure of the deviation from power-law scaling than of the uncertainty of the scaling exponent estimate. Applied to preterm neonate heart rate data, this test indicates that time intervals between heartbeats can be modeled as anomalous diffusion with negatively correlated increments. This corresponds to power spectra between 1/f and 1/f², whereas healthy adults are usually reported to have 1/f spectra, suggesting that the immaturity of the neonatal nervous system affects the scaling properties of the heart rate. Copyright (c) 1998 The American Physical Society.
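The dispersional analysis and segment-based standard error described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it bins the increments by summing, so that the regression slope is itself the scaling exponent (with bin means the slope would be the exponent minus one), and the bin sizes, segment count, and simulated data are hypothetical choices.

```python
import math
import random
import statistics


def slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def scaling_exponent(increments, bin_sizes=(1, 2, 4, 8, 16)):
    """Slope of log SD of bin sums versus log number of points per bin."""
    log_size, log_sd = [], []
    for m in bin_sizes:
        n_bins = len(increments) // m
        bin_sums = [sum(increments[i * m:(i + 1) * m]) for i in range(n_bins)]
        log_size.append(math.log(m))
        log_sd.append(math.log(statistics.stdev(bin_sums)))
    return slope(log_size, log_sd)


def exponent_with_standard_error(increments, n_segments=8):
    """Mean exponent over successive segments and its standard error."""
    seg_len = len(increments) // n_segments
    estimates = [
        scaling_exponent(increments[k * seg_len:(k + 1) * seg_len])
        for k in range(n_segments)
    ]
    mean = statistics.fmean(estimates)
    std_err = statistics.stdev(estimates) / math.sqrt(n_segments)
    return mean, std_err


# Uncorrelated Gaussian increments (hypothetical data), i.e. normal diffusion.
rng = random.Random(0)
increments = [rng.gauss(0.0, 1.0) for _ in range(8192)]
mean, std_err = exponent_with_standard_error(increments)
print(round(mean, 3), round(std_err, 3))
```

For uncorrelated increments the mean exponent should lie close to 0.5. An estimate that differs from 0.5 by several standard errors would point toward anomalous diffusion, although the exact form of the confidence interval used in the paper is not given in the abstract.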
Identifying variables predictive of neurobehavioral sequelae is a key objective in the study of high-risk neonates. Examination of heart rate variability (HRV) characteristics may be a finer discriminator of the neonate's response to physiologic stressors than the mean heart rate. The energy entropy of the heartbeat tachogram, computed in four different domains, was used to quantify the HRV in 13 preterm neonates. The entropies of energies were computed from 1024 interbeat time intervals obtained once per week from 26 to 35 weeks post-conceptional age (PCA). The energy entropy computed in three of the domains, like the standard deviation of intervals, distinguished between the 10 neonates that were measured at 35 weeks PCA with 100% specificity and 67% sensitivity, but did not distinguish between healthy and unhealthy neonates at earlier ages. The findings suggest that energy entropy may be a discerning measure of physiologic stress in the preterm infant, although future research is needed to refine the test and determine statistical significance. Copyright (c) 1999 The Japanese Urological Association.
C. A. Ordonez, D. R. Bickel, V. C. Venezia, F. D. McDaniel, S. E. Matteson, and M. I. Molina, “Electronic ion energy loss calculations on the basis of the binary encounter approximation,” Journal of Nuclear Materials 264, 133-140 (1999).
Electronic ion energy loss calculations on the basis of the binary encounter approximation are presented for protons in oxygen, nitrogen and silicon. Calculations using both an analytical approach and a Monte Carlo approach are found to agree well with experimental data even down to energies below the stopping cross section maximum. Energy loss calculations for protons in silicon under channeling conditions are included and predictions are made for channeling in the <110> direction at low energies (5 to 500 keV). Copyright (c) 1999 Elsevier Science B.V. All rights reserved.
Last modified September 4, 2013
This web site is not affiliated with the University of Ottawa.