Timothée Bonnet
June 1st 2018
Probability of detecting an effect that really exists
= 1 - false negative rate (type II error rate)
Low power means: a real effect is likely to be missed (a non-significant result is hard to interpret)
High power means: a real effect is likely to be detected (a non-significant result carries real information)
BEFORE planning an experiment or data collection: to choose a sample size that gives a good chance of detecting the effect of interest
But also AFTER doing an analysis: to interpret a non-significant result (could a biologically important effect have been missed?)
Power depends on:
1. Sample size
2. Strength of the effect
3. Unexplained variation
If we know or assume the strength of the effect and the unexplained variation, we can estimate power for a given sample size with:
Computer simulations (or analytical formulas, as in the pwr package below)
See file tempPA
We assume we know the data variability. How many samples do we need to detect a difference of 0.5?
library(pwr)
p40 <- pwr.t.test(n = 40, d = 0.5/1)   # n per group; d = difference / standard deviation
p40$power
[1] 0.5981469
p143 <- pwr.t.test(n = 143, d = 0.5/1)
p143$power
[1] 0.9878887
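pwr can also solve for the sample size instead of the power. A minimal sketch, keeping the default two-sample t-test settings:
library(pwr)
# leave n unspecified and give a target power: pwr solves for the sample size per group
pwr.t.test(d = 0.5/1, power = 0.80)
With these assumptions it reports roughly 64 observations per group.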
Try several values between zero and one for the assumed effect size in the simulations (pwr is allowed)
Show how power varies with sample size and effect size
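A possible sketch of this exercise, using pwr directly rather than simulations (the grid of effect sizes and sample sizes below is arbitrary):
library(pwr)
effect_sizes <- seq(0.1, 1, by = 0.1)
sample_sizes <- c(20, 50, 100, 200)
# power for every combination of assumed effect size and per-group sample size
power_grid <- sapply(sample_sizes, function(n)
  sapply(effect_sizes, function(d) pwr.t.test(n = n, d = d)$power))
dimnames(power_grid) <- list(d = effect_sizes, n = sample_sizes)
round(power_grid, 2)
matplot(effect_sizes, power_grid, type = "l", lty = 1,
        xlab = "Assumed effect size d", ylab = "Power")
Power increases with both effect size and sample size; detecting small effects requires very large samples.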
You did the experiment with a sample size of 100 and did not find a significant result. Let's assume a value less than 0.7 is biologically unimportant. Can you conclude something?
What if a value of 0.1 is still biologically important?
Post-hoc power analysis
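One way to approach the exercise above (a sketch, assuming n = 100 per group and the effect sizes 0.7 and 0.1 mentioned in the questions):
library(pwr)
pwr.t.test(n = 100, d = 0.7)$power  # close to 1: a real effect of 0.7 would almost surely have been detected
pwr.t.test(n = 100, d = 0.1)$power  # low: a real effect of 0.1 could easily have been missed
So the non-significant result argues against effects of 0.7 or larger, but says very little if effects as small as 0.1 are still biologically important.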
We have assumed observations to be independent in our simulations
This corresponds to an assumption of linear models
What if they are not?
Does size differ between French people and Brazilian people?
Very few individuals are available here => we take 50 measurements of each person
t-test of the difference, p<0.0001…
In reality, the data contain no information because the measurements are perfectly dependent: the effective sample size is 2, and the question has 2 parameters. Power = 0
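A minimal sketch of the problem (made-up numbers: two individuals, 50 repeated measurements each):
set.seed(1)
french    <- rnorm(50, mean = 175, sd = 0.2)  # 50 measurements of ONE French person
brazilian <- rnorm(50, mean = 173, sd = 0.2)  # 50 measurements of ONE Brazilian person
t.test(french, brazilian)                     # tiny p-value, yet the effective sample size is only 2
The test treats the 100 numbers as independent, so it answers a question about these two individuals, not about the two populations.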
Two bird populations, one supplemented with food. Measure reproductive success within populations, test the effect of food on reproduction. Effective sample size?
2, for 2 parameters => no information, no power
300 mass measurements of 50 individuals. Does mass impact on lifetime reproductive success? Effective sample size?
50, for 2 parameters => some information / power
How do we account for non-independence?
package lme4
The previous data were generated by the code contained in CodeForOneSimulation1.R
Modify it to estimate the false positive rate (when diffgroup = 0), and the power (when diffgroup = 0.5) for lm() and lmer(), with non-duplicated and with duplicated data
(NB: you will probably need a for loop; look at the distribution of t-values instead of p-values)
Which model is more powerful? Which one should you use?
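CodeForOneSimulation1.R is not reproduced here; the following is only a hypothetical sketch of the kind of loop meant above (the function one_rep and its arguments n_id and n_dup are made up), assuming a two-group design in which each individual is measured several times:
library(lme4)

one_rep <- function(diffgroup = 0.5, n_id = 20, n_dup = 5) {
  id    <- factor(rep(1:(2 * n_id), each = n_dup))    # each individual measured n_dup times
  group <- rep(c(0, 1), each = n_id * n_dup)           # two groups of n_id individuals
  y <- diffgroup * group + rnorm(2 * n_id, sd = 1)[as.integer(id)] +  # individual effect
       rnorm(2 * n_id * n_dup, sd = 0.5)                              # measurement noise
  c(lm   = summary(lm(y ~ group))$coefficients["group", "t value"],
    lmer = summary(lmer(y ~ group + (1 | id)))$coefficients["group", "t value"])
}

set.seed(1)
tvals <- replicate(200, one_rep(diffgroup = 0.5))
rowMeans(abs(tvals) > 2)  # rough power, using |t| > 2 as the significance criterion
# rerun with diffgroup = 0 to estimate the false positive rate instead
Comparing these proportions for diffgroup = 0 (false positive rate) and diffgroup = 0.5 (power) answers the exercise; the non-duplicated case can be simulated the same way with a single measurement per individual and lm() alone.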
Does a defensive tissue (thorns) reduce herbivory on a plant species?
Data were collected in 5 locations to reach a large sample size
summary(lm(y~ x))
Call:
lm(formula = y ~ x)
Residuals:
Min 1Q Median 3Q Max
-1.73696 -0.59086 0.01651 0.50495 2.37736
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 3.91598 0.27035 14.485 <2e-16 ***
x 0.06498 0.06718 0.967 0.336
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.7554 on 98 degrees of freedom
Multiple R-squared: 0.009457, Adjusted R-squared: -0.0006511
F-statistic: 0.9356 on 1 and 98 DF, p-value: 0.3358
What is wrong?
Simpson's paradox
library(lme4)
summary(lmer(y~ x + (1|block)))
Linear mixed model fit by REML ['lmerMod']
Formula: y ~ x + (1 | block)
REML criterion at convergence: -6
Scaled residuals:
Min 1Q Median 3Q Max
-3.4740 -0.6403 0.0745 0.5321 2.9025
Random effects:
Groups Name Variance Std.Dev.
block (Intercept) 2.36630 1.5383
Residual 0.03789 0.1947
Number of obs: 100, groups: block, 5
Fixed effects:
Estimate Std. Error t value
(Intercept) 7.93789 0.70059 11.33
x -0.97592 0.03393 -28.76
Correlation of Fixed Effects:
(Intr)
x -0.187
Look at the simulation code in CodeForOneRepeat2.R
Modify the code to estimate power for lm() and lmer() in this context
(NB: you can use absolute values of t-values instead of p-values)
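Again, CodeForOneRepeat2.R is not shown here; this is only a hypothetical sketch (the function one_rep2 and its parameters are made up) generating data with the block structure seen above, where the among-block pattern opposes the within-block slope:
library(lme4)

one_rep2 <- function(slope = -1, n_block = 5, n_per = 20) {
  block <- factor(rep(1:n_block, each = n_per))
  bmean <- rnorm(n_block, mean = 5, sd = 2)            # block-level differences in x
  x <- bmean[as.integer(block)] + rnorm(n_block * n_per, sd = 0.5)
  y <- 2 * bmean[as.integer(block)] + slope * x +      # blocks raise y while x lowers it
       rnorm(n_block * n_per, sd = 0.2)
  c(lm   = abs(summary(lm(y ~ x))$coefficients["x", "t value"]),
    lmer = abs(summary(lmer(y ~ x + (1 | block)))$coefficients["x", "t value"]))
}

set.seed(2)
tvals <- replicate(200, one_rep2())
rowMeans(tvals > 2)  # proportion of runs in which each model gives |t| > 2 for x
The lm() slope is pulled toward the among-block pattern, while lmer() estimates the within-block slope, which is the effect the question is actually about.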
Unknown variation should be accounted for (mixed models)
But mixed models are interesting for other reasons…