IN SILICO ANALYSIS TECHNIQUES (BIOINFORMATICS) - STATISTICS AND EPIDEMIOLOGY


  • statistics
  • epidemiology

  • statistics : a discipline devoted to the collection, analysis, and interpretation of numerical data using the theory of probability, concerned particularly with methods for drawing inferences about characteristics of a population from examination of a random sample.

    position parameters (in the normal distribution mean = mode = median)
    dispersion parameters
    shape parameters
    normal or gaussian distribution
  • mean (m) : in probability and statistics, the expected value (mathematical expectation) of a random variable, the limiting value to which the sample mean converges as the sample size is increased indefinitely (if the limit exists)
      • population mean : the mean of the probability distribution characterizing a specified population; for a finite population, the arithmetic mean of the population values
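    • arithmetic mean : the sum of n numbers divided by n, $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$; the mean m is its limit as the sample size grows, $m = \lim_{n \to +\infty} \frac{1}{n}\sum_{i=1}^{n} x_i$
    • geometric mean : the nth root of the product of n numbers, e.g., the geometric mean of [2, 8, 32] is $(2 \times 8 \times 32)^{1/3} = 8$.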
    • harmonic mean (m) : reciprocal of the mean of the reciprocals of the individual values in a given set; e.g., for the set [10, 40, 60] the harmonic mean is 1 / [ 1/3 ( 1/10 + 1/40 + 1/60 )] = 21.2.
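The three means defined above can be checked numerically. The following is a minimal Python sketch; the function names are illustrative choices, not part of the original page, and it assumes Python 3.8 or later for `math.prod`.

```python
import math
import statistics


def arithmetic_mean(values):
    """Sum of n numbers divided by n."""
    return sum(values) / len(values)


def geometric_mean(values):
    """nth root of the product of n positive numbers."""
    return math.prod(values) ** (1.0 / len(values))


def harmonic_mean(values):
    """Reciprocal of the mean of the reciprocals."""
    return len(values) / sum(1.0 / v for v in values)


# The two worked examples from the definitions above.
geo = geometric_mean([2, 8, 32])
har = harmonic_mean([10, 40, 60])
ari = arithmetic_mean([10, 40, 60])

print(round(geo, 6))  # geometric mean of [2, 8, 32]
print(round(har, 1))  # harmonic mean of [10, 40, 60]
print(ari)            # arithmetic mean of [10, 40, 60]
```

For positive values that are not all equal, the harmonic mean is smaller than the geometric mean, which is smaller than the arithmetic mean. The standard library also offers `statistics.geometric_mean` and `statistics.harmonic_mean`, which should agree with the hand-written versions.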
  • average absolute difference = $\frac{1}{n}\sum_{i=1}^{n} |x_i - \bar{x}|$
  • coefficient of variation (CV) : the standard deviation divided by the mean, sometimes multiplied by 100; a unitless quantity indicating the variability around the mean in relation to the size of the mean
  • deviance / average quadratic difference = $\frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2$
  • variance ($s^2$) = $\frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2$ : in statistics, a measure of the variation shown by a set of observations: the average of the squared deviations from the mean (with divisor n - 1 when estimated from a sample); it is the square of the standard deviation. For constants a and b:
    • $s^2(x + a) = s^2(x)$
    • $s^2(bx) = b^2 s^2(x)$
    => $s^2(a + bx) = b^2 s^2(x)$
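The shift and scale properties of the variance can be verified on a small data set. This is a sketch only; the data and the constants `a` and `b` are arbitrary illustrative values, and `statistics.variance` uses the n - 1 divisor.

```python
import statistics

x = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0]  # illustrative data
a, b = 10.0, 3.0                     # illustrative constants

var_x = statistics.variance(x)  # sample variance, divisor n - 1
var_shift = statistics.variance([xi + a for xi in x])
var_scaled = statistics.variance([b * xi for xi in x])
var_linear = statistics.variance([a + b * xi for xi in x])

print(var_x, var_shift)         # adding a constant leaves the variance unchanged
print(var_scaled, b**2 * var_x)  # scaling by b multiplies the variance by b squared
print(var_linear, b**2 * var_x)  # both effects combined
```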
  • range : an interval in which values sampled from a population, or the values in the population itself, are known to lie; as a measure of dispersion, the width of that interval, $x_{max} - x_{min}$
  • standard deviation (SD / s) = $\sqrt{\frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2}$ : in statistics, a measure of the amount by which each value deviates from the mean; equal to the square root of the variance, i.e., the square root of the average of the squared deviations from the mean. The divisor n - 1 gives the sample standard deviation; the population standard deviation uses the divisor n and the population mean. It is the most commonly used measure of dispersion of statistical data
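The dispersion measures listed so far can be computed side by side. The sketch below uses made-up data; variable names are illustrative. `statistics.stdev` uses the divisor n - 1 (sample) and `statistics.pstdev` uses the divisor n (population).

```python
import statistics

data = [4.0, 8.0, 15.0, 16.0, 23.0, 42.0]  # illustrative data
n = len(data)
mean = sum(data) / n

# Average absolute difference from the mean.
mean_abs_dev = sum(abs(x - mean) for x in data) / n

# Range as a single number: largest value minus smallest value.
value_range = max(data) - min(data)

# Standard deviation with divisor n - 1 (sample) and n (population).
sample_sd = statistics.stdev(data)
population_sd = statistics.pstdev(data)

# Coefficient of variation: unitless, sometimes reported as a percentage.
cv = sample_sd / mean
cv_percent = 100.0 * cv

print(mean, mean_abs_dev, value_range)
print(sample_sd, population_sd, cv_percent)
```

The sample value is always slightly larger than the population value on the same data, because the sum of squared deviations is divided by n - 1 instead of n.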
  • skewness : lack of symmetry of a probability distribution about the mean, or any measure of that lack of symmetry, e.g., $\frac{\sum_{i=1}^{n} (x_i - \bar{x})^3}{n s^3}$
    • > 0 : right tail
    • = 0 : symmetrical
    • < 0 : left tail
  • kurtosis : the degree of peakedness or flatness of a probability distribution, relative to the normal distribution with the same variance = $\frac{\sum_{i=1}^{n} (x_i - \bar{x})^4}{n s^4}$
    • > 3 : leptokurtic : pertaining to a probability distribution more heavily concentrated around the mean, i.e., having a sharper, narrower peak (and heavier tails) than the normal distribution with the same variance
    • = 3 : normal distribution
    • < 3 : platykurtic : pertaining to a probability distribution less concentrated about the mean, i.e., having a broader, flatter peak than the normal distribution with the same variance
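Skewness and kurtosis are commonly computed as the third and fourth standardized moments, i.e., the sum of (x - mean)^k divided by n times s^k for k = 3 and k = 4. The sketch below makes one assumption that the page does not state: s is taken with the divisor n (`statistics.pstdev`), the usual choice for moment coefficients. The function name and data are illustrative.

```python
import statistics


def standardized_moment(values, k):
    """Return sum((x - mean)**k) / (n * s**k).

    Assumption: s is the standard deviation with divisor n.
    k = 3 gives a skewness coefficient, k = 4 a kurtosis coefficient.
    """
    n = len(values)
    mean = sum(values) / n
    s = statistics.pstdev(values)
    return sum((x - mean) ** k for x in values) / (n * s**k)


symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]            # evenly spread, symmetric
right_tailed = [1.0, 1.0, 2.0, 2.0, 3.0, 10.0]   # one large value on the right

skew_sym = standardized_moment(symmetric, 3)
skew_right = standardized_moment(right_tailed, 3)
kurt_flat = standardized_moment(symmetric, 4)
kurt_heavy = standardized_moment(right_tailed, 4)

print(skew_sym)    # zero for symmetric data
print(skew_right)  # positive: longer right tail
print(kurt_flat)   # below 3: flatter than the normal distribution
print(kurt_heavy)  # above 3: heavier tails than the normal distribution
```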
  • skew distribution : a frequency distribution that is asymmetric.
  • mode : the most frequently occurring value or item in a distribution; when data are grouped, it is the midpoint of the grouping with the highest frequency. A distribution with 2 peaks is bimodal
  • median (Me) : any value that divides the probability distribution of a random variable in half, i.e., the probability of observing a value above the median and the probability of observing a value below the median are both less than or equal to one half. For a finite population or sample, the median is
    • the middle value of an odd number of values (arranged in ascending order)
    • any value between the 2 middle values of an even number of values; in the latter case it is conventional to use the average of the 2 middle values
    The median is the value for which the sum of |xi - Me| is minimal. Its value does not change when values of x < Me or x > Me are changed, provided each changed value stays on the same side of Me, and it can also be used for ordinal variables.
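Two properties of the median mentioned above, minimizing the sum of absolute deviations and insensitivity to extreme values, can be illustrated with a short sketch. The data and the helper name `sum_abs_dev` are illustrative; the grid search is only a demonstration, not an efficient way to find a median.

```python
import statistics


def sum_abs_dev(values, c):
    """Sum of absolute deviations of the values from a candidate centre c."""
    return sum(abs(x - c) for x in values)


data = [3, 1, 9, 7, 5]
me = statistics.median(data)  # middle value of the sorted data

# Search a grid of candidate centres 0.0, 0.1, ..., 10.0 for the minimizer.
candidates = [c / 10 for c in range(0, 101)]
best = min(candidates, key=lambda c: sum_abs_dev(data, c))

# Moving the largest value further out changes the mean but not the median.
stretched = [3, 1, 900, 7, 5]
me_stretched = statistics.median(stretched)

# Even number of values: conventional average of the two middle values.
me_even = statistics.median([1, 2, 3, 4])

print(me, best, me_stretched, me_even)
```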
  • quantile : any of the values that divide the range of an observed or theoretical probability distribution into a given number of equal, ordered parts. Each value divides the range into 2 specified parts, with the part below the value corresponding to a prescribed fraction p and the part above to 1 - p.
    • percentile : any one of the 99 values that divide the range of a probability distribution or sample into 100 intervals of equal probability or frequency, e.g., 45% of a population scores below the 45th percentile
    • quartile : any of the 3 values that divide the range of a probability distribution into 4 parts of equal probability; i.e., the 1st (Q1), 2nd (Q2), and 3rd (Q3) quartiles are the 25th, 50th, and 75th percentiles.
    • quintile : any of the 4 values that divide the range of a probability distribution into 5 parts of equal probability, i.e., the 1st, 2nd, 3rd, and 4th quintiles are the 20th, 40th, 60th, and 80th percentiles
  • interquartile range : the difference between the data values at the 75th and 25th percentiles (Q3 - Q1), encompassing the middle 50 percent of the data
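Quartiles, quintiles, percentiles, and the interquartile range can be sketched with `statistics.quantiles` (Python 3.8 or later). Sample quantiles depend on the interpolation convention; this example relies on the function's default `'exclusive'` method, and other tools may return slightly different values on small samples. The data are illustrative.

```python
import statistics

data = list(range(1, 100))  # the integers 1..99, illustrative data

quartiles = statistics.quantiles(data, n=4)      # 3 cut points: Q1, Q2, Q3
quintiles = statistics.quantiles(data, n=5)      # 4 cut points
percentiles = statistics.quantiles(data, n=100)  # 99 cut points

q1, q2, q3 = quartiles
iqr = q3 - q1  # interquartile range, spanning the middle half of the data

print(quartiles)
print(quintiles)
print(len(percentiles), percentiles[44])  # index 44 holds the 45th percentile
print(iqr, q2 == statistics.median(data))
```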
  • Epidemiology : Web resources : Bibliography :
    Copyright © 2001-2014 Daniele Focosi. All rights reserved.