Unit Author: Professor Nick J Fox

Learning objectives

Having successfully completed the work in this unit, you should be able to:

  • Choose which articles within your field to read, appraise and incorporate within your own research.
  • Describe the rules of evidence necessary to appraise a piece of literature for its validity and applicability.
  • Systematically apply these rules of evidence to a piece of published literature.

1. What is critical appraisal and why do we bother?

In its broadest sense, critical appraisal is the systematic application of rules of evidence to a piece of published literature to determine its validity and relevance. By the time you have reached this stage of the social research methods course, you should have the required knowledge and skills in social inquiry to critically appraise any piece of literature within your field.

This knowledge and these skills will enable you to identify the important and relevant evidence (within academic and professional research papers), and assess the findings they report for their applicability to your own research.

Critical appraisal is the means to judge what is known about a subject, and what still needs to be discovered.  It is therefore the means by which you can refine your research question.

But it also provides the materials for you to synthesise a literature review that can present the current state of knowledge in a field of study; a review which is more than just a description of the contents of one paper after another. In other words, it will enable you to critically analyse and interpret the significance of literature and inform your own research, enabling you to put your work in the wider context of scholarship.

For each of these reasons, critical appraisal is a key research skill, and in this unit, we will look at the criteria by which you can evaluate the quality of a piece of published literature such as a research paper.  Such quality assessment is particularly important because of the sheer volume of papers out there.  Given the emphasis on publication in academia, and the exponential increase in specialist academic journals (some of which are now only published online), there is a vast quantity of social scientific literature available.  It is estimated that if you read one paper a day, after a year, you would need 6000 years to catch up on what had been published since you began your review!

So unless you have some systematic way of appraising what is published, filtering it and incorporating it within your own canon of literature, you will quite simply founder in the tide of research papers.  To give you the skills you need, we will now look at the different elements of a paper, and how you can judge the quality of what has been written.

2. Stages in critically appraising a paper

2.1 Research process

Before starting, you need to decide whether a paper is worth reading in the first place. Start with the abstract (the summary at the start of a paper) and ask yourself the following questions:

  • What was the authors’ aim in doing this research?
  • What did the authors do practically?
  • What were the findings?
  • What do the findings mean?

If the answers to these questions are not readily apparent on reading the abstract and skimming through the paper then it may not be worth reading. However, if the results of this first filtering are good, read on.

2.2 Research question, aims and objectives

The first questions a critical reader needs to ask when reading a paper are ‘Have the authors defined a clear and important research question?’ and/or ‘Have they formulated a clear aim or specified objectives for their study?’

In some cases a hypothesis will be stated which the authors wish to test.

All of these are the appropriate first step in the research process as we have already seen. If none of these approaches are evident then it is unlikely that the effort involved in reading, appraising and making notes on this particular paper is worthwhile.

In summary, ask yourself:

  • What is the overarching aim of the study? (What research question would the study answer?)
  • Is this aim unambiguously defined?
  • Is it a question worth asking?
  • Are the objectives (what the research needed to do, in order to answer the question) clearly stated?

2.3 Design

A key question to ask here is:

  • Is the chosen design appropriate to answer the research question?

The appropriateness of a design is a matter of whether it is well suited to answering the research question (Morse, 1990: 134).

Consider, for example, this research question: ‘What are the relative levels of poverty among older people living in the city and in the country?’

An appropriate design for this particular question is one that can make a statistical comparison of samples of these two groups: a quantitative survey would probably be the most appropriate design.

If, however, when you read the paper you find the researchers used a small ethnographic design, you would be justified in concluding that – though interesting – the study design is inappropriate, as it does not provide the quantitative data needed to make a statistical comparison between the groups.

By contrast, an appropriate design for a study that wanted to explore people’s experiences or values would be qualitative, such as a design using in-depth interviews.

When reading a range of social research papers, you might ask critical questions such as:

  • Is a survey appropriate to assess how life has changed for the respondents over the past 10 years?
  • Is a qualitative design appropriate to explore the range of religious beliefs among recent migrants to the UK?
  • Is this case-study approach to race-track gambling appropriate to cover all the possible circumstances that may arise in this setting?

(You may like to look back at Unit 3 to consider which designs are appropriate to which research questions.)

2.4 Methods

Once you have established that the research design is appropriate, the next step is to look at the specific methods used.  Here we need two separate but related concepts, appropriateness and adequacy, to critically appraise those methods.  Let us look at each in turn.

  • Are the methods used appropriate?

This question concerns whether the methods being used have the potential to gather the correct data to answer the question.  It relates to the internal validity of the study, but also to the feasibility and ethics of research (see Units 4 and 10).

Consider again the research question: ‘What are the relative levels of poverty among older people living in the city and in the country?’

To answer this question, we will need data that gives insight into income and expenditure of these two groups.  The most accurate data could be gleaned from the bank accounts of these individuals, but access to these is neither feasible nor ethical.  So we must depend upon other indicators of poverty, probably using some kind of questionnaire.  We might ask respondents about their shopping habits, or whether they have difficulty paying for heating or transport.  Together these indicators may enable the study to assess poverty levels in the different locations.

Appropriateness is also an issue in the choice of data analysis methods.  In this example, the most appropriate analysis method will be a simple comparative statistical analysis (for instance, a t-test – see Unit 7).
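To make the idea of a simple comparative analysis concrete, here is a minimal Python sketch of a two-sample t-test.  It is an illustration only: the scores are invented, the function name `welch_t` is hypothetical, and the Welch variant (which does not assume equal variances in the two groups) is one reasonable choice rather than the form any particular study would necessarily use.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t statistic, approximate degrees of freedom) for Welch's t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error of each group mean (sample variance uses n - 1).
    se_sq_a = variance(sample_a) / n_a
    se_sq_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / sqrt(se_sq_a + se_sq_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (se_sq_a + se_sq_b) ** 2 / (
        se_sq_a ** 2 / (n_a - 1) + se_sq_b ** 2 / (n_b - 1)
    )
    return t, df


# Hypothetical poverty-indicator scores (higher = more deprived).
# These numbers are invented purely for illustration.
city = [4, 6, 5, 7, 3, 6, 5, 4]
country = [6, 7, 8, 5, 7, 9, 6, 8]

t_stat, dof = welch_t(city, country)
print(f"t = {t_stat:.2f}, df = {dof:.1f}")
```

The t statistic would then be compared with the critical value for the computed degrees of freedom (from tables or statistical software) to judge whether the difference between the two groups is statistically significant.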

The second question is:

  • Are the methods used adequate?

This question considers whether the methods chosen have the capacity – practically – to produce sufficient high-quality data to answer the question satisfactorily (Morse, 1990: 134).  It relates to both reliability and instrument validity (see Unit 4).

For example, if the survey we have decided to use to compare poverty levels among city and country dwellers uses a postal questionnaire, this might have problems getting reliable and valid data from those suffering highest levels of poverty, who may be less likely to respond to these sensitive questions.  A telephone or Internet survey might also be inadequate, as some potential respondents may not have access to these modes of communication.  We might conclude that a door-to-door structured interview is the only method of data collection that is adequate to gather high quality data.

The sorts of questions the critical reader of social research papers will ask concerning research methods will be:

  • Is a postal questionnaire appropriate to gather in-depth data on respondents’ early experiences of education?
  • Were audio-taped group interviews with families adequate to identify the interactions that went on within the family?
  • How reliable and valid was the data gathered during an ethnography of binge drinking by teenagers in city centre clubs?

2.5 Sampling

Sampling is a specific aspect of a study’s methods, but is worth considering separately.  As we saw in Unit 5, there are a number of different sampling methods that may be used, including simple random, systematic, stratified, non-random and theoretical.  These sampling methods are used by researchers for differing purposes, and when we critically appraise a study, we should ask:

  • Is the sample appropriate and adequate to answer the question?

The appropriateness of the sampling method refers to whether the right categories of people have been chosen to answer the research question, while the adequacy concerns whether the sample is sufficient in size and quality to provide the data needed (Morse, 1990: 127).

Once again, consider the research question: ‘What are the relative levels of poverty among older people living in the city and in the country?’

Given we have identified a survey as the appropriate design, then in terms of appropriateness, the study would need to define ‘city’ and ‘country’ and then select respondents who fall into one or other of these geographical categories.  It would also need to use an appropriate random sampling method, to avoid introducing biases into the samples that would prevent valid comparisons on the dependent variable (poverty) that the study is investigating.

To attain an adequate sample, the study would need to gather sufficient data to enable statistical comparisons.  This is partly a matter of sample size: the number of respondents needs to be adequate to provide both statistical significance and statistical power (see Unit 7).  But it is also about external validity: the researchers need to use a sampling frame that assures representativeness of the samples to the population from which they are drawn.  So, for example, if they simply stopped people they met in the street, you would be justified in concluding that the study did not have external validity, and was therefore inadequate.
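As a rough illustration of how sample size and power are linked, the sketch below estimates the number of respondents needed in each of two groups to detect a given difference in means.  It uses a common normal-approximation formula; the function name `n_per_group` and the default values (a two-sided significance level of 0.05 and power of 0.80) are assumptions made for the example, and a full power calculation based on the t distribution would give slightly larger numbers.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate respondents needed per group for a two-group comparison of means.

    effect_size is the standardised difference between the group means
    (the difference divided by the standard deviation).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_power = NormalDist().inv_cdf(power)          # desired statistical power
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


# Smaller effects need much larger samples to be detected reliably.
for d in (0.8, 0.5, 0.2):
    print(f"effect size {d}: about {n_per_group(d)} respondents per group")
```

When appraising a paper, a calculation of this kind helps you judge whether a reported 'no difference' finding might simply reflect a sample too small to detect the effect.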

The kinds of critical questions you will need to ask concerning sampling are:

  • Is it clear from this study of racism among recent migrants what kind of sampling method was chosen, and was it appropriate?
  • Was the sampling frame used in this study of social mobility representative, or could it be biased?
  • Is the sample large enough to provide this quantitative study of voting intentions of men and women with sufficient power to detect a ‘real’ difference?
  • Has the sample in this qualitative study been cast wide enough to cover the range of phenomena under study?

2.6 Results

The key issues here for the critical reader are the accessibility of the data presented, and the appropriateness of the analysis.

In terms of accessibility, data should be displayed in ways that enable judgements to be made.  For quantitative studies, there should be a good use of tables and figures (graphs, bar charts etc.), and numbers should be used in addition to percentages.

In terms of appropriateness of analysis methods, you need to look carefully at the framework used and what assumptions may have been made when analysing the findings. The aim here is to ensure that the conclusions drawn are appropriate to the data gathered and do not exaggerate the data’s significance.

For quantitative analysis some key questions to ask are:

  • Has an appropriate statistical test been used by the authors for the particular type of data which have been collected? For example, have tests designed for interval data been used on nominal data?
  • Have justifiable statistical inferences been made about a relationship between different variables?
  • Does the study have sufficient power to identify real differences?
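To illustrate the point about matching the test to the type of data: nominal (categorical) data are usually analysed with a test such as chi-square rather than a test designed for interval data.  The following Python sketch computes the chi-square statistic for a contingency table.  The function name `chi_square_statistic` and the counts are hypothetical, invented for the example.

```python
def chi_square_statistic(table):
    """Chi-square statistic for a contingency table given as a list of rows of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if the two variables were unrelated.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    return statistic


# Hypothetical counts, invented for illustration:
# rows = two groups of respondents, columns = answered 'yes' / 'no'.
observed_counts = [
    [30, 20],
    [20, 30],
]

chi_sq = chi_square_statistic(observed_counts)
# For a 2 x 2 table (1 degree of freedom) the 5% critical value is 3.841.
print(f"chi-square = {chi_sq:.2f}; exceeds 3.841: {chi_sq > 3.841}")
```

A critical reader would expect a test of this kind, not a t-test, to be reported where both variables are categories rather than measurements.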

For qualitative papers, different sorts of questions need to be asked:

  • Is the process of data analysis transparent, comprehensive and clearly written?
  • Have the authors used an appropriate method for sorting and coding their data?
  • Are there sufficient examples or quotes from transcripts given to justify the themes presented and the conclusions drawn?
  • Has the author described their own prejudices or ‘conceptual baggage’?
  • Have the raw data (for example, transcripts of interviews) been archived so that they can be re-analysed?

For both quantitative and qualitative studies, ask what data has not been presented.  All too often when reading a piece of work, the data which is not published is the very data which would enable a critical reader to make an assessment of the validity of the results.  Some data on non-responders to a questionnaire, for example, should be offered since a poor response rate can introduce bias. For example, only those with a criticism of a particular service might bother to return a questionnaire.  Key questions include:

  • Have any non-responders been followed up to see if they differ in significant ways from the responders?
  • Has the data been reweighted to compensate for the non-response and to make the findings more representative?
  • Finally, have any confounding variables been considered which might influence the results?
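Reweighting can be illustrated with a short Python sketch.  It shows one simple approach, in which each group of respondents is weighted by the ratio of its share of the population to its share of the achieved sample.  All names and figures here are hypothetical, and real studies may use more elaborate weighting schemes.

```python
def reweighting_factors(population_shares, respondent_counts):
    """Weight for each group = population share / share of the achieved sample."""
    total = sum(respondent_counts.values())
    return {
        group: population_shares[group] / (count / total)
        for group, count in respondent_counts.items()
    }


# Hypothetical figures, invented for illustration.
population_shares = {"under_40": 0.5, "40_plus": 0.5}   # known from census-type data
respondent_counts = {"under_40": 20, "40_plus": 80}     # who actually responded
satisfied = {"under_40": 0.40, "40_plus": 0.80}         # proportion satisfied, by group

weights = reweighting_factors(population_shares, respondent_counts)
total_respondents = sum(respondent_counts.values())

# Unweighted estimate: over-represents the group that responded most.
unweighted = sum(
    satisfied[g] * respondent_counts[g] for g in respondent_counts
) / total_respondents

# Weighted estimate: each respondent counts in proportion to the group's weight.
weighted = sum(
    weights[g] * respondent_counts[g] * satisfied[g] for g in respondent_counts
) / sum(weights[g] * respondent_counts[g] for g in respondent_counts)

print(f"unweighted: {unweighted:.2f}, weighted: {weighted:.2f}")
```

Note that reweighting of this kind corrects only for differences in response between the groups used for weighting; it cannot correct for non-responders who differ from responders within a group, which is why follow-up of non-responders also matters.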

2.7 Conclusions and recommendations

Finally, a critical reader needs to assess the quality of the conclusions of a study:

  • Do the conclusions and recommendations follow from the results?
  • Have any conclusions been drawn that are not supported by the data presented?
  • Indeed, are any results left ‘unspoken’, and if so is there any explanation of why these results have not been discussed?
  • Has there been any reflection on the limitations of the study or methodology and how significant are the conclusions in a practical and statistical sense?

Answers to these questions will enable the critical reader to decide whether the conclusions of a piece of published work are worthy of recognition as ‘knowledge’, or whether there are doubts about its validity and relevance.

By critically appraising a study in terms of these component parts, it is possible to come to a judgement about the quality of the study, from the way its research question was formulated, right through to the conclusions drawn.

The following SAQ will give you an opportunity to practise some of these skills by applying them to a published paper.

SAQ 12.1 Critical appraisal in practice

Please download the following open access paper, and read it quickly but critically, applying the lessons of critical appraisal provided earlier.

Wills, W., Meah, A., Dickinson, A.M. and Short, F. (2015) ‘I don’t think I ever had food poisoning’. A practice-based approach to understanding food-borne disease that originates in the home.  Appetite, 85: 118–125. http://www.sciencedirect.com/science/article/pii/S0195666314005443

[table id=10 /]


Critical appraisal enables you to assess the quality of research, rather than simply take it at face value.  We have dissected the different stages of the research process and asked critical questions of each stage.  When you first start appraising relevant literature it will be useful to use this kind of detailed checklist.  Once you are more familiar with the research process, it will become second nature, and you will quickly spot flaws in research papers.  Remember that research can never fully recapitulate the messy business of real life, so any research study can only ever be ‘good enough’.  Critical appraisal is the means to assess the ways in which a study and the events it aims to report diverge.

To complete this unit, now please undertake the following reflective exercise, for inclusion in your log-book.

Reflective Exercise 12.1

Please think about what you have learnt about the process of critical appraisal, and what you learnt from reading and appraising the paper in SAQ 12.1.

[table id=11 /]


References

Morse, J.M. (1990) Strategies for sampling.  In: Morse, J.M. (ed.) Qualitative Nursing Research. London: Sage.

Further reading

Greenhalgh, T. (2014) How to Read a Paper: The Basics of Evidence-Based Medicine.  Chichester: Wiley.

Webber, M. (2015) Applying Research Evidence in Social Work Practice.  London: Palgrave.

Answers to SAQ 12.1

1. Research question, aims and objectives

a) What is the research question, and has this been clearly stated?

    The research question was not explicitly stated.

b) Have the authors formulated a clear aim for their study?

    The aim was to explore kitchen practices, to discover what people do to avoid food-borne disease (FBD) and what they know about food safety and preventing FBD in the home.

c) Are the objectives (what the research needed to do, in order to answer the question) clearly stated?

    No, these were not stated, though the methods section described the methods that were used.

2. Design

a) What design was used and is it appropriate to answer the research question?

    The study used an ethnographic ‘practices approach’ to investigate the meanings and context of everyday kitchen life.

3. Methods

a) What are the methods used, and are they appropriate to the question?

    Observation, video, photography, interviews and participant diaries/scrap books.  These are standard ethnographic methods and can provide a rich picture of the practices of the observed sample.

b) Are the methods used adequate to collect the data needed to answer the question?

    Yes.  However, the data may not necessarily be generalisable to a wider population, as the study did not seek a proportionate sample.

4. Sampling

a) What was the sampling method and is it appropriate to answer the question?

    The researchers selected a sample of 20 households from a larger database, seeking a wide range of households (including people over 60 and pregnant women, who are at higher risk of FBD) with the aim of maximising variability.  This qualitative sample can provide in-depth information on the practices of food preparation and the meanings associated with them.

b) Is the sampling method adequate (e.g. have the right categories of people been chosen; is the sample sufficient in size and quality to provide the data needed)?

    It was stated that it was large enough, but this was not justified.  The sample was not representative of the wider population but included a wide range of household types.

5. Analysis

a) What methods of analysis were used, and are they appropriate to the data collected?

    It was a bottom-up or inductive approach, adopting some aspects of ‘grounded theory’.  The data were closely read to identify features and from these more general patterns were identified.  Four themes were developed and refined, though these themes – Where, How, With whom and Why – do not seem particularly ‘emergent’.  This kind of inductive analytical approach is an appropriate way to analyse the data.

6. Results

a) Are the findings displayed in ways that enable judgements to be made?

    The four themes are reported descriptively, with extracts from the data used to illustrate points.  The findings section did not move beyond description to a more analytical level, though the following discussion offered some more analytical reflections.

7. Conclusions and recommendations

a) Do the conclusions and recommendations follow from the results?

    The conclusions were generally speculative (for instance that over-60s might be at greater risk of FBD because of their kitchen practices or because food production processes have changed during their lifetime), with the claim that FBD is not a high priority for most people being underplayed.  There were no recommendations.

b) How significant are the conclusions (in a practical and/or statistical sense)?

    Not very significant, as the authors seem more concerned in the conclusions with their method than with the findings.  The potential significance for health promotion campaigns is not drawn out.  The paper could be used to inform a larger study, perhaps attempting some kind of intervention or action research.
