Improving your project through
pre-registration
Dorothy V. M. Bishop
Professor of Developmental Neuropsychology
University of Oxford
@deevybee
The Reproducibility Crisis
“There is increasing concern about the
reliability of biomedical research, with recent
articles suggesting that up to 85% of
research funding is wasted.”
Bustin, S. A. (2015). The reproducibility of biomedical research: Sleepers awake! Biomolecular Detection and Quantification.
Ioannidis, J. P. A. (2005). Why most published research findings are false. PLoS Medicine, 2(8), e124. doi: 10.1371/journal.pmed.0020124
Four key factors leading to poor reproducibility:
• Publication bias
• Low power
• P-hacking
• HARKing
Thought experiment:
You have submitted a paper to Current Biology
evaluating effect of computer games on dyslexia
How likely is your paper to be accepted if you report:
• 20 participants; beneficial effect of intervention, p < .05
• 20 participants; group difference is non-significant
• 200 participants; group difference is non-significant
• http://deevybee.blogspot.co.uk/2013/03/high-impact-journals-where.html
Ample evidence that many journals – especially ‘high impact’ journals – prioritise newsworthiness over methodological quality and are reluctant to publish null results:
= PUBLICATION BIAS
Publication bias
Timeline: 1956 De Groot; 1975 Greenwald, prejudice against the null; 1979 Rosenthal, the “file drawer” problem
“As it is functioning in at least some areas of behavioral science research, the research-publication system may be regarded as a device for systematically generating and propagating anecdotal information.” (Greenwald, 1975)
Low power
Timeline: 1987 Newcombe: “Small studies continue to be carried out with little more than a blind hope of showing the desired effect. Nevertheless, papers based on such work are submitted for publication, especially if the results turn out to be statistically significant.”
POWER problem: journals are typically willing to publish a significant finding with a very small sample, even if they would not think of doing so for a null result.
P-hacking and HARKing
Historical timeline: in 1956 De Groot noted the failure to distinguish between hypothesis-testing and hypothesis-generating (exploratory) research -> misuse of statistical tests
De Groot, A. D. (1956). The meaning of “significance” for different types of research [translated and annotated by Eric-Jan Wagenmakers et al., 2014]. Acta Psychologica, 148, 188-194. doi: 10.1016/j.actpsy.2014.02.001
Here is a correlation matrix from a study that measured
various perceptual skills in relation to reading ability in
children. How should you report/interpret results?
N = 20 subjects
Created from script on: https://osf.io/skz3j/
* p < .05, ** p < .01
Key question: Did researcher specifically predict this
association?
• Probability that a specific prespecified pair of variables will be correlated at p
< .05 level when null is true = .05.
• Probability that at least one of 21 correlations will meet p < .05 when null is
true:
= 1 - .95^21 ≈ .66 (i.e. one minus the probability that NONE is significant)
• Bonferroni-corrected significance level = .05/21 = .002
• With N = 20 subjects, to reach .002 significance level, r = .61
If you didn’t predict this specific association and you report
uncorrected p-values, this is P-hacking
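The arithmetic above can be checked in a few lines (a sketch; the 21 pairwise correlations come from the 7 variables in the matrix):

```python
# Familywise false-positive risk for a correlation matrix of 7 variables
k = 7 * 6 // 2          # 21 distinct pairwise correlations
alpha = 0.05

# Chance that at least one of the 21 tests is "significant" under the null
familywise = 1 - (1 - alpha) ** k

# Bonferroni-corrected per-test threshold
bonferroni = alpha / k

print(f"familywise risk = {familywise:.2f}")    # ~0.66
print(f"Bonferroni alpha = {bonferroni:.4f}")   # ~0.0024
```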
P-hacking -> huge risk of false positives
You run a study investigating how a drug, X, affects
anxiety. You plot the results by age, and see this:
No significant effect of X on anxiety overall
[Figure: “Treatment effect by age” – scatterplot of symptom improvement (-1 to 1) against age (16-60 yr)]
But you notice that there is a treatment effect for
those aged over 36
How should you analyse/report this result?
• We tested whether X affects anxiety
• We tested whether X affects anxiety in people aged over 36 years
• We tested whether age affects the impact of X on anxiety
How should you analyse/report this result?
• We tested whether X affects anxiety – TRUE
• We tested whether X affects anxiety in people aged over 36 years – UNTRUE, and most would agree unacceptable
• We tested whether age affects the impact of X on anxiety – UNTRUE, but many would think acceptable, given the results
Improve your study with pre-registration
Close link between p-hacking and HARKing
You are HARKing if you had no prior predictions but, on seeing the results, you write up the paper as if you had planned to look at the effect of age on the drug effect.
This kind of thing is endemic in psychology.
• It is OK to say that this association was observed in exploratory analysis, and that it
suggests a new hypothesis that needs to be tested in a new sample.
• It is NOT OK to pretend that you predicted the association if you didn’t.
• And it is REALLY REALLY NOT OK to report only the data that support your new hypothesis
(e.g. dropping those aged below 36 from the analysis)
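A small simulation makes the inflation concrete. This is an illustrative sketch, not the study's analysis: the age cut-offs, sample size, and the rough t threshold of 2 are all made up for the example:

```python
import math
import random
import statistics

def significant(sample, crit=2.0):
    # Rough one-sample t test against zero (crit ~ 2 approximates p < .05)
    m = statistics.mean(sample)
    se = statistics.stdev(sample) / math.sqrt(len(sample))
    return abs(m / se) > crit

def hunting_rate(n=60, reps=2000, seed=7):
    # Null world: the drug has no effect at any age
    random.seed(seed)
    hits = 0
    for _ in range(reps):
        ages = [random.uniform(16, 60) for _ in range(n)]
        improvement = [random.gauss(0, 1) for _ in range(n)]
        # Test the overall effect, then hunt through post-hoc age cut-offs
        candidates = [improvement] + [
            [y for a, y in zip(ages, improvement) if a > cut]
            for cut in (24, 30, 36, 42, 48)
        ]
        if any(significant(g) for g in candidates if len(g) > 5):
            hits += 1
    return hits / reps
```

Because the hunted subgroups add extra chances to cross the threshold, the rate of “discoveries” in this null world runs well above the nominal 5%.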
HARKING
Capitalises on chance and produces huge risk of
false positives
Widespread in many fields – and even explicitly
encouraged by some influential people
Which Article Should You Write?
There are two possible articles you can write: (a) the article you planned to
write when you designed your study or (b) the article that makes the most sense
now that you have seen the results. They are rarely the same, and the correct
answer is (b).
On data analysis: “Examine them from every angle. Analyze the sexes separately.
Make up new composite indexes. If a datum suggests a new hypothesis, try to
find additional evidence for it elsewhere in the data. If you see dim traces of
interesting patterns, try to reorganize the data to bring them into bolder relief. If
there are participants you don’t like, or trials, observers, or interviewers who
gave you anomalous results, drop them (temporarily). Go on a fishing expedition
for something – anything – interesting.”
Writing the Empirical Journal Article
Daryl J. Bem
The Compleat Academic: A Practical Guide for the Beginning Social
Scientist, 2nd Edition. Washington, DC: American Psychological
Association, 2004.
“This book provides invaluable guidance that will help new academics plan,
play, and ultimately win the academic career game.”
Explicitly advises
HARKing!
HARKING seems innocuous but it fills the
literature with dross
One solution to protect against HARKing: suggested wording in the write-up to keep researchers honest:
‘We report how we determined our sample size, all data exclusions (if any), all manipulations, and all measures in the study.’
Simmons, J. P., Nelson, L. D., & Simonsohn, U. (2012). A 21 Word Solution. SPSP Dialogue, 26(2), Fall 2012.
A more comprehensive solution: Pre-registration
Classic publishing:
Plan study → Do study → Submit to journal → Respond to reviewer comments → Acceptance! → Publish paper

Registered reports:
Plan study → Submit to journal → Respond to reviewer comments → Acceptance! → Do study → Publish paper
Registered Reports solve the issues of:
• Publication bias: publication decision made on the
basis of quality of introduction/methods, before
results are known
• Low power: researchers required to have 90%
power
• P-hacking: analysis plan specified up-front
• HARKing: hypotheses specified up-front.
Unanticipated findings can be reported but clearly
demarcated as ‘exploratory’
Registered reports
But problematic for student projects because:
• Time-scale means delay before data collected
• Power requirements often hard to meet
Plan study → Submit to journal → Respond to reviewer comments → Acceptance! → Do study → Publish paper
An alternative, “pre-registration lite”: the Open Science Framework
Pre-registration on OSF
• Similar to regular publication route
• No guarantee of publication
• But reviewers are generally positive about preregistered papers, because preregistration prevents p-hacking and HARKing
• And there are benefits to having a well-worked-out plan – less stress when it comes to making sense of the data
Plan study → Submit plan to OSF → Check by OSF statistician → Do study → Submit to journal → Respond to reviewer comments → Acceptance! → Publish paper
Advantages of pre-registration through OSF
• Free methodological/statistical consulting:
http://cos.io/stats_consulting
• Have a date-stamped record of what was planned
which can be referred to when publishing
• Encourages open data and scripts
• Work has greater impact
• Errors get detected
• Improves reproducibility
• Possibility of winning $1000!
https://osf.io/jea94/
Even if you decide not to formally pre-register, this template is likely to be useful for planning your project.
I won’t go through the template here, but I will note points that crop up when trying to complete it.
What is your research question?
Points to consider
• What type of question? Yes/No, Why, How?
• How could it be improved? Is it too general, or too precise?
Hypotheses
• Can you formulate specific predictions?
• E.g. X will be bigger than Y
• X will be bigger than zero
• X will vary systematically with Y
• Are predictions directional?
Study information
What is your study type?
Are you systematically manipulating a variable to see its effect?
= Experiment
Or are you looking at relationships between variables that occur
naturally?
= Observational Study
Rationale for proposed sample size
OSF notes that this could include a power analysis, but also constraints such as time, money, or the availability of a particular group
Power analysis
Can use GPower
http://www.gpower.hhu.de/en.html
For more complex designs, simulate data with given effect size and repeatedly
run through analysis to see how often you can detect the effect of interest
(see Lazic, 2016)
N.B. Power analysis is not the only way to rationalise sample size, but it is the most common
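The simulation approach can be sketched in a few lines (the effect size, sample sizes, and the ~1.96 threshold are illustrative assumptions, not requirements from the lecture):

```python
import math
import random

def welch_t(a, b):
    # Welch t statistic for two independent samples
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    va = sum((x - ma) ** 2 for x in a) / (len(a) - 1)
    vb = sum((x - mb) ** 2 for x in b) / (len(b) - 1)
    return (ma - mb) / math.sqrt(va / len(a) + vb / len(b))

def power_sim(n_per_group, effect_size, reps=2000, crit=1.96, seed=1):
    # Proportion of simulated experiments in which |t| exceeds crit
    # (crit ~ 1.96 approximates two-tailed p < .05 for largish samples)
    random.seed(seed)
    hits = 0
    for _ in range(reps):
        control = [random.gauss(0, 1) for _ in range(n_per_group)]
        treated = [random.gauss(effect_size, 1) for _ in range(n_per_group)]
        if abs(welch_t(treated, control)) > crit:
            hits += 1
    return hits / reps
```

For a medium effect (d = 0.5), 20 per group gives power well under 50%, while around 100 per group is needed for roughly 90%.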
Making sense of null results:
1. Positive controls
When we talk of ‘control’ we usually mean ‘negative’ controls – where we compare the effect of X with a situation that is identical except for X – to isolate the specific effect.
A positive control aims to rule out trivial explanations for a null result – e.g. the manipulation didn’t work, or participants didn’t attend.
Examples:
You are comparing autistic and typically developing children on a false memory task:
– does the paradigm yield a false memory effect in the typically-developing group?
You are interested in whether there is suppression of the mu frequency band in EEG (regarded as ‘mirror neuron’ activity) when participants view hand gestures:
– is there mu suppression when participants perform hand gestures?
Making sense of null results:
2. Bayes factors
“Bayes factors provide a coherent approach to determining whether non-significant results support a null hypothesis over a theory, or whether the data are just insensitive.”
Designing an analysis script using simulated data
• Aim: create a process that is completely transparent and increase the
likelihood that your analysis can be replicated
• Analysis script reads in raw data as input, does all analysis and generates
Tables/Figures/Summary statistics as output
• Avoids common scenario where researcher cannot remember how results
were generated
• Scripts need to be extensively ’commented’ to explain what the code is doing
• Increasingly, researchers are using scripts written in R, Matlab or Python, but you can also create scripts easily in SPSS.
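A minimal sketch of such a script in Python (the ‘group’/‘score’ CSV layout and the analyse function are hypothetical, chosen just to illustrate the raw-data-in, results-out pattern):

```python
import csv
import math
import statistics

def analyse(path):
    """Read raw data, return per-group summary stats and a Welch t statistic.

    Expects a CSV with 'group' and 'score' columns (hypothetical layout).
    """
    groups = {}
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            groups.setdefault(row["group"], []).append(float(row["score"]))

    # Per-group summary statistics (the "Tables/Summary statistics" output)
    summary = {g: {"n": len(v),
                   "mean": statistics.mean(v),
                   "sd": statistics.stdev(v)}
               for g, v in sorted(groups.items())}

    # Welch t statistic comparing the two groups
    s1, s2 = summary.values()
    t = (s1["mean"] - s2["mean"]) / math.sqrt(
        s1["sd"] ** 2 / s1["n"] + s2["sd"] ** 2 / s2["n"])
    return summary, t
```

Because everything flows from the raw file, rerunning the script on a corrected or extended dataset regenerates every number in the write-up.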
Building a script in SPSS
We start by simulating some data (see last
week’s lecture)
Can be either random data (null hypothesis)
or data with an effect of interest added
Here are random numbers (Score) allocated
to groups 1 and 2
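The same simulation step can be sketched outside SPSS, e.g. in Python (the group sizes and the 0.5 SD effect are illustrative choices):

```python
import random
random.seed(1)

# Null-hypothesis data: 40 random scores alternately allocated to groups 1 and 2
null_data = [(1 + i % 2, random.gauss(0, 1)) for i in range(40)]

# Data with an effect of interest added: shift group 2 scores up by 0.5 SD
effect_data = [(g, s + (0.5 if g == 2 else 0.0)) for g, s in null_data]
```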
Building a script in SPSS
Now do whatever analysis steps you
usually do via the GUI
Here we have selected options for a t-test
But instead of hitting OK, we hit Paste
Building a script in SPSS
Hitting Paste opens a new
window with a script in it
This shows the code behind
the analysis done in the GUI
You can run the analysis by
pressing the big green
button
You can add to the script,
add comments, and save it.
Then you can rerun it any
time with a new dataset
Benefit: you have a
complete record of the
analysis
Free Coursera lectures
https://www.coursera.org/learn/statistical-inferences
Further suggestions for study
https://www.slideshare.net/deevybishop/bishop-reproducibility-references-nov2016
https://www.slideshare.net/deevybishop/what-is-the-reproducibility-crisis-in-science-
and-what-can-we-do-about-it
Experimental Design for
Laboratory Biologists :
Maximising Information and
Improving Reproducibility
Stanley E Lazic
The Seven Deadly Sins of Psychology
Chris Chambers
http://christophergandrud.github.io/RepResR-RStudio/
“Treat all of your research files as if someone who has not worked on
the project will, in the future, try to understand them.”
Going further: author/reviewer guidelines for Registered Reports in Nature Human Behaviour
https://images.nature.com/original/nature-cms/uploads/ckeditor/attachments/4127/RegisteredReportsGuidelines_NatureHumanBehaviour.pdf
These require either 95% power or a Bayesian equivalent.
For Bayes, they recommend https://osf.io/d4dcu/
Improve your study with pre-registration

More Related Content

PPTX
Alcohol dependent syndrome
PPTX
Alcohol: The Drink, The Addiction and the Solution
PPTX
Mental disorders ppt
PPTX
Management of epilepsy
PPTX
Organisation of data
PPTX
Diabetes Mellitus
PPTX
Hypertension
PPTX
Republic Act No. 11313 Safe Spaces Act (Bawal Bastos Law).pptx
Alcohol dependent syndrome
Alcohol: The Drink, The Addiction and the Solution
Mental disorders ppt
Management of epilepsy
Organisation of data
Diabetes Mellitus
Hypertension
Republic Act No. 11313 Safe Spaces Act (Bawal Bastos Law).pptx

What's hot (20)

PPTX
Beyond Fact Checking — Modelling Information Change in Scientific Communication
PDF
Digital Attribution Modeling Using Apache Spark-(Anny Chen and William Yan, A...
PPTX
Knowledge Graph Introduction
PDF
Continuous Data Replication into Cloud Storage with Oracle GoldenGate
PDF
Engineering data quality
PPTX
Scientific Research Paper Writing
PPTX
Writing and publishing a research article
PDF
Best Practices for Building and Deploying Data Pipelines in Apache Spark
PDF
How Graphs Enhance AI
PPTX
Critical appraisal of randomized clinical trials
PDF
Introduction to Knowledge Graphs: Data Summit 2020
PPT
Writing good scientific_papers_v2
PDF
Spark DataFrames and ML Pipelines
PDF
Data Curation and Debugging for Data Centric AI
PPTX
Academic writing and and publishing
PPTX
Review of Related LiteratureLessons.pptx
PPTX
How to write a biomedical research paper
PPTX
Critical appraisal of published article
PPT
How to-write-a-research-paper
Beyond Fact Checking — Modelling Information Change in Scientific Communication
Digital Attribution Modeling Using Apache Spark-(Anny Chen and William Yan, A...
Knowledge Graph Introduction
Continuous Data Replication into Cloud Storage with Oracle GoldenGate
Engineering data quality
Scientific Research Paper Writing
Writing and publishing a research article
Best Practices for Building and Deploying Data Pipelines in Apache Spark
How Graphs Enhance AI
Critical appraisal of randomized clinical trials
Introduction to Knowledge Graphs: Data Summit 2020
Writing good scientific_papers_v2
Spark DataFrames and ML Pipelines
Data Curation and Debugging for Data Centric AI
Academic writing and and publishing
Review of Related LiteratureLessons.pptx
How to write a biomedical research paper
Critical appraisal of published article
How to-write-a-research-paper
Ad

Similar to Improve your study with pre-registration (20)

PDF
تحليل البيانات وتفسير المعطيات
PDF
محاضرة د.سعاد
PPT
Replication Crisis in Psycholhhhyogy.ppt
PPTX
sience 2.0 : an illustration of good research practices in a real study
PPTX
Chap3_ business reaserch
PPT
What is research
PDF
What is the reproducibility crisis in science and what can we do about it?
PPT
A well-defined research question is the cornerstone of any successful investi...
PPTX
Lecture 2 Steps in Concluding Research.pptx
PPT
SAMPLE_AND_OTHER.ppt
PPTX
Does preregistration improve the interpretability and credibility of research...
PPT
What is research
PPTX
Research misconduct an introduction
PPTX
Hypothesis testing
PDF
Research Method for Business chapter 5
PDF
Scientific method
PDF
RM UNIT 1.pdf
PPTX
research methodology , Hypothesis.pptx
PDF
Roche_open_science_NIOO_KNAW_workshop_NL
تحليل البيانات وتفسير المعطيات
محاضرة د.سعاد
Replication Crisis in Psycholhhhyogy.ppt
sience 2.0 : an illustration of good research practices in a real study
Chap3_ business reaserch
What is research
What is the reproducibility crisis in science and what can we do about it?
A well-defined research question is the cornerstone of any successful investi...
Lecture 2 Steps in Concluding Research.pptx
SAMPLE_AND_OTHER.ppt
Does preregistration improve the interpretability and credibility of research...
What is research
Research misconduct an introduction
Hypothesis testing
Research Method for Business chapter 5
Scientific method
RM UNIT 1.pdf
research methodology , Hypothesis.pptx
Roche_open_science_NIOO_KNAW_workshop_NL
Ad

More from Dorothy Bishop (20)

PPTX
Exercise/fish oil intervention for dyslexia
PPTX
Open Research Practices in the Age of a Papermill Pandemic
PDF
Language-impaired preschoolers: A follow-up into adolescence.
PPTX
Journal club summary: Open Science save lives
PPTX
Short talk on 2 cognitive biases and reproducibility
PPTX
Otitis media with effusion: an illustration of ascertainment bias
PPTX
Insights from psychology on lack of reproducibility
PPTX
What are metrics good for? Reflections on REF and TEF
PPTX
Biomarkers for psychological phenotypes?
PPTX
Data simulation basics
PPTX
Simulating data to gain insights into power and p-hacking
PPTX
Talk on reproducibility in EEG research
PPTX
What is Developmental Language Disorder
PPTX
Developmental language disorder and auditory processing disorder: 
Same or di...
PDF
Fallibility in science: Responsible ways to handle mistakes
PPTX
Introduction to simulating data to improve your research
PPTX
Southampton: lecture on TEF
DOCX
Reading list: What’s wrong with our universities
DOCX
IJLCD Winter Lecture 2016-7 : References
DOCX
What's wrong with our Universities, and will the Teaching Excellence Framewor...
Exercise/fish oil intervention for dyslexia
Open Research Practices in the Age of a Papermill Pandemic
Language-impaired preschoolers: A follow-up into adolescence.
Journal club summary: Open Science save lives
Short talk on 2 cognitive biases and reproducibility
Otitis media with effusion: an illustration of ascertainment bias
Insights from psychology on lack of reproducibility
What are metrics good for? Reflections on REF and TEF
Biomarkers for psychological phenotypes?
Data simulation basics
Simulating data to gain insights into power and p-hacking
Talk on reproducibility in EEG research
What is Developmental Language Disorder
Developmental language disorder and auditory processing disorder: 
Same or di...
Fallibility in science: Responsible ways to handle mistakes
Introduction to simulating data to improve your research
Southampton: lecture on TEF
Reading list: What’s wrong with our universities
IJLCD Winter Lecture 2016-7 : References
What's wrong with our Universities, and will the Teaching Excellence Framewor...

Recently uploaded (20)

PDF
From Molecular Interactions to Solubility in Deep Eutectic Solvents: Explorin...
PDF
CHEM - GOC general organic chemistry.ppt
PPTX
Basic principles of chromatography techniques
PDF
CuO Nps photocatalysts 15156456551564161
PPTX
Preformulation.pptx Preformulation studies-Including all parameter
PDF
No dilute core produced in simulations of giant impacts on to Jupiter
PDF
Glycolysis by Rishikanta Usham, Dhanamanjuri University
PPTX
ELISA(Enzyme linked immunosorbent assay)
PDF
final prehhhejjehehhehehehebesentation.pdf
PPT
Chapter 6 Introductory course Biology Camp
PPTX
LIPID & AMINO ACID METABOLISM UNIT-III, B PHARM II SEMESTER
PDF
Sustainable Biology- Scopes, Principles of sustainiability, Sustainable Resou...
PDF
Telemedicine: Transforming Healthcare Delivery in Remote Areas (www.kiu.ac.ug)
PDF
Exploring PCR Techniques and Applications
PDF
cell_morphology_organelles_Physiology_ 07_02_2019.pdf
PPTX
Introduction to Immunology (Unit-1).pptx
PPTX
CELL DIVISION Biology meiosis and mitosis
PPTX
Targeted drug delivery system 1_44299_BP704T_03-12-2024.pptx
PPTX
Cells and Organs of the Immune System (Unit-2) - Majesh Sir.pptx
PPTX
Thyroid disorders presentation for MBBS.pptx
From Molecular Interactions to Solubility in Deep Eutectic Solvents: Explorin...
CHEM - GOC general organic chemistry.ppt
Basic principles of chromatography techniques
CuO Nps photocatalysts 15156456551564161
Preformulation.pptx Preformulation studies-Including all parameter
No dilute core produced in simulations of giant impacts on to Jupiter
Glycolysis by Rishikanta Usham, Dhanamanjuri University
ELISA(Enzyme linked immunosorbent assay)
final prehhhejjehehhehehehebesentation.pdf
Chapter 6 Introductory course Biology Camp
LIPID & AMINO ACID METABOLISM UNIT-III, B PHARM II SEMESTER
Sustainable Biology- Scopes, Principles of sustainiability, Sustainable Resou...
Telemedicine: Transforming Healthcare Delivery in Remote Areas (www.kiu.ac.ug)
Exploring PCR Techniques and Applications
cell_morphology_organelles_Physiology_ 07_02_2019.pdf
Introduction to Immunology (Unit-1).pptx
CELL DIVISION Biology meiosis and mitosis
Targeted drug delivery system 1_44299_BP704T_03-12-2024.pptx
Cells and Organs of the Immune System (Unit-2) - Majesh Sir.pptx
Thyroid disorders presentation for MBBS.pptx

Improve your study with pre-registration

  • 1. Improving your project through pre-registration Dorothy V. M. Bishop Professor of Developmental Neuropsychology University of Oxford @deevybee
  • 2. The Reproducibility Crisis “There is increasing concern about the reliability of biomedical research, with recent articles suggesting that up to 85% of research funding is wasted.” Bustin, S. A. (2015). The reproducibility of biomedical research: Sleepers awake! Biomolecular Detection and Quantification 2005. PLoS Medicine, 2(8), e124. doi: 10.1371/journal.pmed.0020124
  • 3. Four key factors leading to poor reproducibility P-hackingPublication bias Low power HARKing
  • 4. Thought experiment: You have submitted a paper to Current Biology evaluating effect of computer games on dyslexia How likely is your paper to be accepted if you report: • 20 participants; beneficial effect of intervention, p < .05 • 20 participants; group difference is non-significant • 200 participants; group difference is non-significant
  • 5. Thought experiment: You have submitted a paper to Current Biology evaluating effect of computer games on dyslexia How likely is your paper to be accepted if you report: • 20 participants; beneficial effect of intervention, p < .05 • 20 participants; group difference is non-significant • 200 participants; group difference is non-significant • http://deevybee.blogspot.co.uk/2013/03/high-impact-journals-where.html Ample evidence that many journals – especially ‘high impact’ journals prioritise newsworthiness over methodological quality: Reluctant to publish null results = PUBLICATION BIAS
  • 6. 1956 De Groot 1975 Greenwald The “file drawer” problem 1979 Rosenthal Prejudice against the null “As it is functioning in at least some areas of behavioral science research, the research- publication system may be regarded as a device for systematically generating and propagating anecdotal information.” Publication bias
  • 7. 1956 De Groot 1975 Greenwald 1987 Newcombe “Small studies continue to be carried out with little more than a blind hope of showing the desired effect. Nevertheless, papers based on such work are submitted for publication, especially if the results turn out to be statistically significant.” 1979 Rosenthal Low power POWER problem: Journals typically willing to publish a significant finding with a very small sample, even if they would not think of doing so for a null result
  • 9. 1956 De Groot Failure to distinguish between hypothesis-testing and hypothesis-generating (exploratory) research -> misuse of statistical tests Historical timeline: concerns about reproducibility Acta Psychologica 148, 2014, pp. 188-194 The meaning of “significance” for different types of research [translated and annotated by Eric-Jan Wagenmakers et al. //doi.org/10.1016/j.actpsy.2014.02.001
  • 10. Here is a correlation matrix from a study that measured various perceptual skills in relation to reading ability in children. How should you report/interpret results? N = 20 subjects Created from script on: https://osf.io/skz3j/ * p < .05, ** p < .01
  • 11. Key question: Did researcher specifically predict this association?
  • 12. • Probability that a specific prespecified pair of variables will be correlated at p < .05 level when null is true = .05. • Probability that at least one of 21 correlations will meet p < .05 when null is true: = 1 - .95^21 = .64 (i.e. one minus prob. that NONE is significant) • Bonferroni-corrected significance level = .05/21 = .002 • With N = 20 subjects, to reach .002 significance level, r = .61 If you didn’t predict this specific association and you report uncorrected p-values, this is P-hacking
  • 13. P-hacking -> huge risk of false positives
  • 14. You run a study investigating how a drug, X, affects anxiety. You plot the results by age, and see this: No significant effect of X on anxiety overall -1 -0.8 -0.6 -0.4 -0.2 0 0.2 0.4 0.6 0.8 1 16 20 24 28 32 36 40 44 48 52 56 60 Symptomimprovement Age (yr) Treatment effect by age
  • 15. But you notice that there is a treatment effect for those aged over 36 -1 -0.8 -0.6 -0.4 -0.2 0 0.2 0.4 0.6 0.8 1 16 20 24 28 32 36 40 44 48 52 56 60 Symptomimprovement Age (yr) Treatment effect by age
  • 16. How should you analyse/report this result? • We tested whether X affects anxiety • We tested whether X affects anxiety in people aged over 36 years • We tested whether age affects the impact of X on anxiety -1 -0.8 -0.6 -0.4 -0.2 0 0.2 0.4 0.6 0.8 1 16 20 24 28 32 36 40 44 48 52 56 60 Symptomimprovement Age (yr) Treatment effect by age
  • 17. How should you analyse/report this result? • We tested whether X affects anxiety -TRUE • We tested whether X affects anxiety in people aged over 36 years – UNTRUE, and most would agree unacceptable • We tested whether age affects the impact of X on anxiety – UNTRUE, but many would think acceptable, given results -1 -0.8 -0.6 -0.4 -0.2 0 0.2 0.4 0.6 0.8 1 16 20 24 28 32 36 40 44 48 52 56 60 Symptomimprovement Age (yr) Treatment effect by age
  • 19. Close link between p-hacking and HARKing You are HARKing if you have no prior predictions, but on seeing results you write up paper as if you planned to look at effect of age on drug effect. This kind of thing is endemic in psychology. • It is OK to say that this association was observed in exploratory analysis, and that it suggests a new hypothesis that needs to be tested in a new sample. • It is NOT OK to pretend that you predicted the association if you didn’t. • And it is REALLY REALLY NOT OK to report only the data that support your new hypothesis (e.g. dropping those aged below 36 from the analysis) -1 -0.5 0 0.5 1 16 20 24 28 32 36 40 44 48 52 56 60 Symptom improvement Age (yr) Treatment effect by age
  • 20. HARKING Capitalises on chance and produces huge risk of false positives Widespread in many fields – and even explicitly encouraged by some influential people
  • 21. Which Article Should You Write? There are two possible articles you can write: (a) the article you planned to write when you designed your study or (b) the article that makes the most sense now that you have seen the results. They are rarely the same, and the correct answer is (b). re Data Analysis: Examine them from every angle. Analyze the sexes separately. Make up new composite indexes. If a datum suggests a new hypothesis, try to find additional evidence for it elsewhere in the data. If you see dim traces of interesting patterns, try to reorganize the data to bring them into bolder relief. If there are participants you don’t like, or trials, observers, or interviewers who gave you anomalous results, drop them (temporarily). Go on a fishing expedition for something— anything —interesting. Writing the Empirical Journal Article Daryl J. Bem The Compleat Academic: A Practical Guide for the Beginning Social Scientist, 2nd Edition. Washington, DC: American Psychological Association, 2004. “This book provides invaluable guidance that will help new academics plan, play, and ultimately win the academic career game.” Explicitly advises HARKing!
  • 22. HARKING seems innocuous but it fills the literature with dross
  • 23. ‘We report how we determined our sample size, all data exclusions (if any), all manipulations, and all measures in the study.’ Simmons, J. P., Nelson, L. D., & Simonsohn, U. (2012). A 21 Word Solution. SPSP Dialogue,26, 2, Fall 2012 issue. One solution to protect against HARKing: Suggested wording in write-up to keep researchers honest
  • 24. A more comprehensive solution: Pre-registration
  • 25. Plan study Do study Submit to journal Respond to reviewer comments Publish paper Acceptance! Classic publishing
  • 26. Plan study Do study Submit to journal Respond to reviewer comments Publish paper Plan study Submit to journal Respond to reviewer comments Do study Publish paper Acceptance! Classic publishing Registered reports Acceptance!
  • 27. Registered reports solves issues of: • Publication bias: publication decision made on the basis of quality of introduction/methods, before results are known • Low power: researchers required to have 90% power • P-hacking: analysis plan specified up-front • HARKing: hypotheses specified up-front. Unanticipated findings can be reported but clearly demarcated as ‘exploratory’
  • 28. Registered reports But problematic for student projects because: • Time-scale means delay before data collected • Power requirements often hard to meet Plan study Submit to journal Respond to reviewer comments Do study Publish paper Acceptance!
  • 29. An alternative: Preregistration lite: Open Science Framework
  • 30. Pre-registration on OSF • Similar to regular publication route • No guarantee of publication • But reviewers generally positive about preregistered papers because prevents p-hacking or HARKing • And benefits of having well-worked out plan – less stress when it comes to making sense of data Plan study Submit plan to OSF Check by OSF statistician Do study Submit to journal Respond to reviewer comments Publish paper Acceptance!
• 31. Advantages of pre-registration through OSF • Free methodological/statistical consulting: http://cos.io/stats_consulting • A date-stamped record of what was planned, which can be referred to when publishing • Encourages open data and scripts • Work has greater impact • Errors get detected • Improves reproducibility • Possibility of winning $1000!
• 32. https://osf.io/jea94/ Even if you decide not to formally pre-register, this template is likely to be useful for planning your project. I won't go through the template here, but I will note points that crop up when trying to complete it
• 33. Study information. What is your research question? Points to consider: • What type of question is it? Yes/no, why, how? • How could it be improved? Is it too general or too precise? Hypotheses • Can you formulate specific predictions? E.g. X will be bigger than Y; X will be bigger than zero; X will vary systematically with Y • Are your predictions directional?
  • 34. What is your study type? Are you systematically manipulating a variable to see its effect? = Experiment Or are you looking at relationships between variables that occur naturally? = Observational Study
• 35. Sample size. Rationale for proposed sample size: OSF note this could include a power analysis, but also constraints such as time, money, or the availability of a particular group. Power analysis: you can use G*Power http://www.gpower.hhu.de/en.html For more complex designs, simulate data with a given effect size and repeatedly run it through the analysis to see how often you can detect the effect of interest (see Lazic, 2016). N.B. Power analysis is not the only way to rationalise sample size, but it is the most common
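The simulation approach mentioned above (simulate data with a given effect size, run the analysis repeatedly, count how often the effect is detected) can be sketched in a few lines of Python. This is an illustrative sketch, not from the slides: the function name `simulated_power`, the two-group design, and the effect size of 0.5 SD are all assumptions chosen for the example, and a normal approximation to the t distribution is used to keep the code self-contained.

```python
import random
import statistics
from statistics import NormalDist

def two_sample_t(x, y):
    """Pooled two-sample t statistic."""
    nx, ny = len(x), len(y)
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sp2 = ((nx - 1) * statistics.variance(x) +
           (ny - 1) * statistics.variance(y)) / (nx + ny - 2)
    return (mx - my) / (sp2 * (1 / nx + 1 / ny)) ** 0.5

def simulated_power(n_per_group, effect_size, n_sims=2000, alpha=0.05, seed=1):
    """Estimate power: simulate two groups whose means differ by
    `effect_size` SDs, test each simulated dataset, and count how often
    the result is significant. (Normal critical value used as an
    approximation to the t distribution; fine for a planning sketch.)"""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    hits = 0
    for _ in range(n_sims):
        control = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        treated = [rng.gauss(effect_size, 1.0) for _ in range(n_per_group)]
        if abs(two_sample_t(treated, control)) > z_crit:
            hits += 1
    return hits / n_sims

# Power for a medium effect (d = 0.5) with 20 vs 200 per group
print(simulated_power(20, 0.5))   # typically around 0.35: badly underpowered
print(simulated_power(200, 0.5))  # close to 1.0
```

This echoes the thought experiment earlier in the lecture: with 20 participants per group and a medium effect, a null result tells you very little.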
• 36. Making sense of null results: 1. Positive controls. When we talk of 'controls' we usually mean negative controls, where we compare the effect of X with a situation that is identical except for X, to isolate the specific effect. In a positive control, the aim is to rule out trivial explanations for a null result, e.g. the manipulation didn't work, or participants didn't attend. Examples: You are comparing autistic and typically developing children on a false memory task: does the paradigm yield a false memory effect in the typically developing group? You are interested in whether there is suppression of the mu frequency band in EEG (regarded as 'mirror neuron' activity) when participants view hand gestures: is there mu suppression when participants perform hand gestures?
  • 37. Making sense of null results: 2. Bayes factors ”Bayes factors provide a coherent approach to determining whether non- significant results support a null hypothesis over a theory, or whether the data are just insensitive.”
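To give a flavour of the idea on this slide, here is a minimal sketch of a Bayes factor for a two-group comparison, using the BIC approximation (BF01 ≈ exp((BIC_alt − BIC_null)/2)). This is an illustration only: the function name `bf01_bic` and the example data are invented, and a full Bayesian analysis with dedicated software would normally be preferred to this approximation.

```python
import math
import statistics

def bf01_bic(group_a, group_b):
    """Approximate Bayes factor in favour of the null (no group
    difference), via the BIC approximation
    BF01 ≈ exp((BIC_alt − BIC_null) / 2)."""
    data = group_a + group_b
    n = len(data)
    grand = statistics.fmean(data)
    rss_null = sum((x - grand) ** 2 for x in data)  # one-mean model
    ma, mb = statistics.fmean(group_a), statistics.fmean(group_b)
    rss_alt = (sum((x - ma) ** 2 for x in group_a) +
               sum((x - mb) ** 2 for x in group_b))  # two-means model
    bic_null = n * math.log(rss_null / n) + 1 * math.log(n)
    bic_alt = n * math.log(rss_alt / n) + 2 * math.log(n)
    return math.exp((bic_alt - bic_null) / 2)

# Two groups with essentially identical means: a BF01 above about 3
# supports the null, whereas a BF01 near 1 would mean the data are
# simply insensitive (exactly the distinction in the quote above).
a = [4.9, 5.1, 5.0, 4.8, 5.2, 5.0, 4.9, 5.1]
b = [5.0, 5.1, 4.9, 5.0, 5.2, 4.8, 5.0, 5.1]
print(round(bf01_bic(a, b), 1))  # > 3: some evidence for the null
```

The point of the sketch is the interpretation, not the arithmetic: unlike p > .05, the Bayes factor can positively favour the null hypothesis.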
• 38. Designing an analysis script using simulated data • Aim: create a process that is completely transparent and increase the likelihood that your analysis can be replicated • The analysis script reads in raw data as input, does all the analysis, and generates tables/figures/summary statistics as output • This avoids the common scenario where a researcher cannot remember how results were generated • Scripts need to be extensively 'commented' to explain what the code is doing • Increasingly, researchers are using scripts written in R, Matlab or Python, but you can also create scripts easily in SPSS
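The workflow in these bullets (simulate data, then run a script that takes the raw data file as its only input and produces the results) can be sketched in Python. The file name `raw_data.csv` and the column names are illustrative assumptions, not from the slides; the simulation step stands in for your real data file.

```python
import csv
import random
import statistics

# --- Step 1: simulate raw data (stands in for your real data file) ---
rng = random.Random(7)
with open("raw_data.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["participant", "group", "score"])
    for i in range(40):
        group = 1 if i < 20 else 2  # random scores allocated to groups 1 and 2
        writer.writerow([i + 1, group, round(rng.gauss(0, 1), 3)])

# --- Step 2: the analysis script proper ---
# It reads the raw data as its only input and prints summary statistics,
# so the whole route from data to results is recorded and re-runnable.
scores = {1: [], 2: []}
with open("raw_data.csv", newline="") as f:
    for row in csv.DictReader(f):
        scores[int(row["group"])].append(float(row["score"]))

for g, vals in sorted(scores.items()):
    print(f"Group {g}: n = {len(vals)}, "
          f"mean = {statistics.fmean(vals):.2f}, SD = {statistics.stdev(vals):.2f}")
```

Because the script contains every step from raw data to output, rerunning it with a new dataset regenerates the results exactly, which is the "complete record of the analysis" benefit described on the SPSS slides that follow.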
  • 39. Building a script in SPSS We start by simulating some data (see last week’s lecture) Can be either random data (null hypothesis) or data with an effect of interest added Here are random numbers (Score) allocated to groups 1 and 2
• 40. Building a script in SPSS. Now do whatever analysis steps you usually do via the GUI. Here we have selected options for a t-test. But instead of hitting OK, we hit Paste
• 41. Building a script in SPSS. Hitting Paste opens a new window with a script in it. This shows the code behind the analysis done in the GUI. You can run the analysis by pressing the big green button. You can add to the script, add comments, and save it. Then you can rerun it any time with a new dataset. Benefit: you have a complete record of the analysis
  • 44. http://christophergandrud.github.io/RepResR-RStudio/ “Treat all of your research files as if someone who has not worked on the project will, in the future, try to understand them.”
• 45. Going further: author/reviewer guidelines for Registered Reports in Nature Human Behaviour. https://images.nature.com/original/nature-cms/uploads/ckeditor/attachments/4127/RegisteredReportsGuidelines_NatureHumanBehaviour.pdf They require either 95% power or a Bayesian equivalent. For Bayes, they recommend https://osf.io/d4dcu/