Saturday, 26 November 2016

Missing data or missing values

In statistics, they «occur when no data value is stored for the variable in an observation [1].» They are «values of variables within data sets which are not known [2].»
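As a minimal sketch of the definition above, the snippet below flags observations for which no value is stored. The variable name `heights_cm`, the helper `is_missing`, and the choice of `None` and NaN as missing-value markers are assumptions made for illustration; they do not come from the cited sources.

```python
import math

# Hypothetical observations of one variable.
# None and NaN mark observations for which no value was recorded (an assumed convention).
heights_cm = [172.0, None, 165.5, float("nan"), 180.2]


def is_missing(value):
    """Return True when no usable value is stored for this observation."""
    return value is None or (isinstance(value, float) and math.isnan(value))


# Count the missing values and keep only the observed ones.
missing_count = sum(is_missing(v) for v in heights_cm)
observed = [v for v in heights_cm if not is_missing(v)]

print(f"{missing_count} of {len(heights_cm)} values are missing")
```

How the missing observations are then handled (dropped, imputed, modelled) is a separate decision that this sketch does not address.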
Bibliographic references:
[1] Missing data [Internet]. En.wikipedia.org. 2016 [cited 19 November 2016]. Available from: https://en.wikipedia.org/wiki/Missing_data.
[2] Statistics Glossary: M [Internet]. Statsoft.com. 2016 [cited 19 November 2016]. Available from: http://www.statsoft.com/textbook/statistics-glossary/m#Missing values.

Standard error (SE)

«These are the SEs for the descriptive statistics. The SE gives some idea about the variability possible in the statistic [1].» It «(...) is a measure of the variability of a statistic. It is an estimate of the standard deviation of a sampling distribution [2].» It «(...) is the standard deviation of the sampling distribution of a statistic, (...)» (Everitt BS, 2003, cited in [3]).
«The SE of the mean [SEM] (...) is the theoretical standard deviation of all sample means of size n drawn from a population and depends on both the population variance (sigma) and the sample size (n) (...) [4].» «The SEM can be seen to depict the relationship between the dispersion of individual observations around the population mean (the standard deviation), and the dispersion of sample means around the population mean (the SE). Different samples drawn from that same population would in general have different values of the sample mean, so there is a distribution of sampled means (with its own mean and variance). (...) As the sample size increases, the dispersion of the sample means clusters more closely around the population mean and the SE decreases [3].»
«The SE depends on three factors: the number of observations in the population (N), the number of observations in the sample (n), and the way that the random sample is chosen. If the population size is much larger than the sample size, then the sampling distribution has roughly the same SE, whether we sample with or without replacement. On the other hand, if the sample represents a significant fraction (say, 1/20) of the population size, the SE will be noticeably smaller, when we sample without replacement [2].»
«The SE of the proportion (...) is the standard deviation of the distribution of the sample proportion over repeated samples [4].»
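The quantities described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the method of any cited source: the function names are hypothetical, the SEM is estimated as s/√n from the sample standard deviation, the finite population correction √((N − n)/(N − 1)) is the usual textbook adjustment for sampling without replacement, and the SE of a proportion uses √(p(1 − p)/n).

```python
import math
import statistics


def sem(sample):
    """Estimated standard error of the mean: s / sqrt(n)."""
    return statistics.stdev(sample) / math.sqrt(len(sample))


def sem_without_replacement(sample, population_size):
    """SEM with the finite population correction sqrt((N - n) / (N - 1))."""
    n = len(sample)
    fpc = math.sqrt((population_size - n) / (population_size - 1))
    return sem(sample) * fpc


def se_proportion(p, n):
    """Standard error of a sample proportion p based on n observations."""
    return math.sqrt(p * (1 - p) / n)


sample = [4.0, 6.0, 8.0, 10.0]  # hypothetical measurements

print(sem(sample))                              # sampling with replacement, or N very large
print(sem_without_replacement(sample, 20))      # sample is 1/5 of the population: smaller SE
print(sem_without_replacement(sample, 10**6))   # N much larger than n: almost no change
print(se_proportion(0.5, 100))
```

With a population of 20 the correction shrinks the SE noticeably, while with a population of one million it is practically 1, which matches the remark that the sampling scheme matters only when the sample is a sizeable fraction of the population.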
Source: Steve's Favorites. Pinterest. Available at: https://www.pinterest.pt/pin/419819996504206121/. Accessed November 3, 2017.


Bibliographic references:
[1] Annotated SPSS Output: Descriptive statistics [Internet]. Ats.ucla.edu. 2016 [cited 20 November 2016]. Available from: http://www.ats.ucla.edu/stat/spss/output/descriptives.htm.
[2] Statistics Dictionary [Internet]. Stattrek.com. 2016 [cited 20 November 2016]. Available from: http://stattrek.com/statistics/dictionary.aspx.
[3] Standard error [Internet]. En.wikipedia.org. 2016 [cited 20 November 2016]. Available from: https://en.wikipedia.org/wiki/Standard_error#cite_note-1.
[4] Statistics Glossary: S [Internet]. Statsoft.com. 2016 [cited 20 November 2016]. Available from: http://www.statsoft.com/textbook/statistics-glossary/s#Standard.

Sunday, 13 November 2016

Confidence interval (CI)

It is «a range of values, calculated from the sample observations, that is believed, with a particular probability, to contain the true parameter value» [1]. It is used «to express the degree of uncertainty associated with a sample statistic. A confidence interval is an interval estimate combined with a probability statement.» One «might describe the interval estimate as a "95% confidence interval". This means that if» one «used the same sampling method to select different samples and computed an interval estimate for each sample,» one «would expect the true population parameter to fall within the interval estimates 95% of the time» [2]. «Precision is taken to be the narrowness of the confidence interval. (...) The interval estimate is an expression of the uncertainty surrounding the point estimate and derives mainly from sampling variation as well as measurement variation/error. In general, the degree of uncertainty is inversely related to the size of the study. On one hand, if a study is too small, the uncertainty may increase to a level considered to be undesirable or useless. On the other, as the study size increases, the degree of uncertainty decreases, and the interval estimate becomes narrower» [3]. «Confidence intervals are preferred to point estimates and to interval estimates, because only confidence intervals indicate the precision of the estimate and the uncertainty of the estimate» [2].
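A rough sketch of how such an interval can be computed for a mean is given below. It assumes a normal approximation (point estimate ± z × SE), which is only one of several ways to build a confidence interval; for small samples a t-based interval would be wider. The function name `mean_confidence_interval` and the data are hypothetical.

```python
import math
import statistics


def mean_confidence_interval(sample, confidence=0.95):
    """Normal-approximation confidence interval for the mean of `sample`."""
    n = len(sample)
    mean = statistics.fmean(sample)
    se = statistics.stdev(sample) / math.sqrt(n)
    # Two-sided critical value, about 1.96 for a 95% interval.
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean - z * se, mean + z * se


small_study = [4.0, 6.0, 8.0, 10.0]   # hypothetical data
large_study = small_study * 4         # same values, four times as many observations

low_s, high_s = mean_confidence_interval(small_study)
low_l, high_l = mean_confidence_interval(large_study)

print(f"n = {len(small_study)}: ({low_s:.2f}, {high_s:.2f})")
print(f"n = {len(large_study)}: ({low_l:.2f}, {high_l:.2f})")
```

Both intervals are centred on the same point estimate, but the one from the larger study is narrower, illustrating the inverse relation between study size and uncertainty described above.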
Bibliographic references:
[1] Everitt, B. and Skrondal, A. (2010). The Cambridge dictionary of statistics. 4th ed. Cambridge, UK: Cambridge University Press.
[2] Stattrek.com. (2016). Statistics Dictionary. [online] Available at: http://stattrek.com/statistics/dictionary.aspx [Accessed 13 Nov. 2016].
[3] Broeck, J. and Brestoff, J. (2013). Epidemiology: Principles and Practical Guidelines. 1st ed. Dordrecht: Springer.

Saturday, 12 November 2016

Standard deviation (σ or s)

«The standard deviation of a population is denoted by σ and the standard deviation of a sample, by s [1].» It is a measure of dispersion or absolute variability [2,3] in original units (unit of measurement equal to that of the mean). It is the positive square root of the variance (σ² or s² or Var[X]). It «is useful as a measure of variation within a given set of data» [4]. «A small value of the standard deviation indicates that most values are close to the sample mean. A large value of the standard deviation indicates many values are far from the sample mean. Other words which are used to convey the concept of the standard deviation are “spread” and “scale.”» It is «the “typical” spread of observations about the sample mean [2].» It «is a numerical value used to indicate how widely individuals in a group vary. If individual observations vary greatly from the group mean, the standard deviation is big; and vice versa [1].»
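To make the distinction between σ and s concrete, here is a small sketch using Python's standard `statistics` module. The data set is hypothetical and chosen so that the arithmetic is easy to follow; the population formula divides the sum of squared deviations by n, whereas the sample formula divides by n − 1.

```python
import statistics

# Hypothetical data: the mean is 5 and the squared deviations sum to 32.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

# Population standard deviation (sigma): sqrt(32 / 8) = 2.0
population_sd = statistics.pstdev(data)

# Sample standard deviation (s): sqrt(32 / 7), slightly larger than sigma
sample_sd = statistics.stdev(data)

# Each one is the positive square root of the corresponding variance.
population_var = statistics.pvariance(data)
sample_var = statistics.variance(data)

print(population_sd, sample_sd)
```

Both results are expressed in the same units as the data and the mean, which is what makes the standard deviation easier to interpret than the variance.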
Bibliographic references:
[1] Stattrek.com. (2016). Statistics Dictionary. [online] Available at: http://stattrek.com/statistics/dictionary.aspx [Accessed 13 Nov. 2016].
[2] Supino, P. and Borer, J. (2012). Principles of research methodology. 1st ed. New York, NY: Springer.
[3] Doi, S. and Williams, G. (2013). Methods of clinical epidemiology. 1st ed. Berlin: Springer.
[4] Daniel, W. and Cross, C. (2013). Biostatistics: a foundation for analysis in the health sciences. 10th ed. Danvers, Massachusetts, USA: John Wiley & Sons, Inc.