Extract Data

Overview

Data extraction is the process of systematically pulling the information you need from each included study and recording it in a standardised form. It bridges the gap between your screened, appraised set of studies and the synthesis you will conduct later. Consistent, thorough extraction is what makes synthesis possible: if you extract different information from different papers, you cannot meaningfully compare or combine them.

Extraction is not the same as reading for interest. You are not summarising papers freely; you are completing a pre-designed form that captures the same fields from every study in the same way.


Design Your Extraction Form

Your extraction form should have been designed as part of your protocol. Review it now against your actual included studies and refine it if necessary. Any change at this stage counts as a protocol amendment and should be documented.

Core Fields

Every extraction form for a business or management SLR should include the following fields as a minimum:

| Field | What to record |
|---|---|
| Study ID | A unique reference number you assign (e.g., S01, S02) for use in tables and in-text citation during synthesis |
| Author(s) | Last name and initials of all authors |
| Year | Year of publication |
| Title | Full title of the article |
| Journal/Source | Journal name, conference, or report series |
| Country/Region | Country where the study was conducted or the data originate |
| Study design/method | Qualitative, quantitative, mixed-methods; specify further (e.g., semi-structured interviews, survey, case study) |
| Sample | Size, type, and characteristics of the sample or dataset |
| Data collection period | When data were collected (may differ from publication year) |
| Key findings | A concise, accurate summary of findings relevant to your research question; use the authors' own language where possible |
| Theoretical framework | Any theory the study draws on (relevant for deductive synthesis) |
| Limitations noted by authors | As reported in the paper |
| Quality appraisal rating | Transfer the overall rating from earlier appraisal |
| Notes | Any observations relevant to synthesis (e.g., contradicts S04; uses unusual operationalisation) |

Additional Fields by Research Type

Depending on your topic, you may also need:

  • Quantitative studies: key statistical outcomes (effect sizes, correlation coefficients, significance levels), measurement instruments used
  • Qualitative studies: epistemological position (interpretivist, constructivist), method of analysis (thematic analysis, grounded theory, discourse analysis)
  • Conceptual or review papers: type of contribution (framework, typology, critique), scope of literature reviewed

Format of the Extraction Form

A spreadsheet (Excel or LibreOffice Calc) with one row per study and one column per field is the standard format and works well for most thesis-level reviews.

Advantages:

  • Easy to sort and filter by country, method, year, or quality rating
  • Column widths can accommodate varying amounts of text
  • Can be shared with a supervisor for review
  • Exports cleanly to a summary table for the thesis appendix

A blank version of the extraction form should be included as an appendix in your final thesis.
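
If you prefer to generate the blank form rather than type the headers by hand, the one-row-per-study, one-column-per-field layout can be sketched in a few lines of Python. This is only an illustration: the column names mirror the core fields listed earlier, and the function name `blank_form_csv` is made up for this example. Adjust the field list to match your own protocol.

```python
import csv
import io

# Column names taken from the core fields listed earlier; adapt to your protocol.
CORE_FIELDS = [
    "Study ID", "Author(s)", "Year", "Title", "Journal/Source",
    "Country/Region", "Study design/method", "Sample",
    "Data collection period", "Key findings", "Theoretical framework",
    "Limitations noted by authors", "Quality appraisal rating", "Notes",
]


def blank_form_csv(fields=CORE_FIELDS):
    """Return CSV text containing only the header row (one column per field)."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerow(fields)
    return buffer.getvalue()


# Print the header row; save this text as a .csv file to open it in a spreadsheet.
print(blank_form_csv())
```

The resulting text can be saved as a `.csv` file and opened in Excel or LibreOffice Calc, where each included study then occupies one row.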


Conducting the Extraction

Work through each included study in order of your Study ID. For each paper:

  1. Read the full text carefully, focusing on the abstract, introduction, methods, results, and discussion sections
  2. Complete every field in the extraction form; leave no field blank (use "not reported" where the paper does not provide the information rather than leaving the cell empty)
  3. Record findings in your own words, except for key definitions or theoretical statements where the authors' precise language matters; note any direct quotations with page numbers
  4. Note any information that is ambiguous, inconsistent between sections of the paper, or that raises a question for synthesis
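
The "leave no field blank" rule in step 2 is easy to check mechanically once the form is exported. The sketch below assumes each study is held as a Python dictionary keyed by field name; the function name and the two sample rows are invented for illustration. A cell containing "not reported" counts as completed, while an empty or whitespace-only cell is flagged.

```python
def find_blank_cells(rows, fields):
    """Return (study_id, field) pairs whose cell is missing or only whitespace."""
    blanks = []
    for row in rows:
        for field in fields:
            value = row.get(field)
            if value is None or not str(value).strip():
                blanks.append((row.get("Study ID", "?"), field))
    return blanks


# Two made-up rows, for illustration only.
rows = [
    {"Study ID": "S01", "Country/Region": "Germany", "Sample": "not reported"},
    {"Study ID": "S02", "Country/Region": "", "Sample": "212 survey responses"},
]

blanks = find_blank_cells(rows, ["Study ID", "Country/Region", "Sample"])
print(blanks)  # [('S02', 'Country/Region')]
```

Each flagged cell is a prompt to go back to the paper and either record the information or enter "not reported".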

Handling Ambiguity

You will encounter papers where the methodology is not clearly described, findings are presented inconsistently, or the research question shifts between the introduction and the discussion. Record what is actually in the paper, note the ambiguity explicitly, and do not interpret charitably to fill gaps. Gaps in reporting are themselves evidence of methodological weakness and belong in your quality appraisal record.


Pilot Extraction

Before extracting all included studies, conduct a pilot on three to five papers. The pilot tests whether your form captures the information you actually need for synthesis.

After the pilot, review the form: are any fields consistently empty or impossible to complete? Are any fields producing inconsistent entries between reviewers? Revise the form before proceeding, and document any changes as a protocol amendment.
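
Both review questions can be given a rough numerical answer. The following is a minimal sketch, assuming each reviewer's pilot entries are held as a list of dictionaries keyed by field name; the function names and sample entries are hypothetical. The comparison is an exact match after trimming and lower-casing, which is crude for free-text fields such as key findings, so treat any mismatch as a prompt to discuss rather than as a formal agreement statistic.

```python
def field_fill_report(rows, fields, placeholder="not reported"):
    """For each field, return (rows that are blank or hold the placeholder, total rows)."""
    report = {}
    for field in fields:
        missing = sum(
            1 for row in rows
            if str(row.get(field, "")).strip().lower() in ("", placeholder)
        )
        report[field] = (missing, len(rows))
    return report


def reviewer_disagreements(rows_a, rows_b, fields):
    """List (study_id, field) pairs where two reviewers' entries differ."""
    rows_b_by_id = {row["Study ID"]: row for row in rows_b}
    differences = []
    for row_a in rows_a:
        row_b = rows_b_by_id.get(row_a["Study ID"])
        if row_b is None:
            continue  # study not extracted by the second reviewer
        for field in fields:
            entry_a = str(row_a.get(field, "")).strip().lower()
            entry_b = str(row_b.get(field, "")).strip().lower()
            if entry_a != entry_b:
                differences.append((row_a["Study ID"], field))
    return differences


# Made-up pilot entries from two reviewers, for illustration only.
fields = ["Study design/method", "Data collection period"]
reviewer_a = [
    {"Study ID": "S01", "Study design/method": "survey", "Data collection period": "not reported"},
    {"Study ID": "S02", "Study design/method": "case study", "Data collection period": "not reported"},
]
reviewer_b = [
    {"Study ID": "S01", "Study design/method": "Survey", "Data collection period": "not reported"},
    {"Study ID": "S02", "Study design/method": "interviews", "Data collection period": "not reported"},
]

fill = field_fill_report(reviewer_a, fields)
diffs = reviewer_disagreements(reviewer_a, reviewer_b, fields)
print(fill)
print(diffs)
```

In this example "Data collection period" is never reported, which suggests the field may need rethinking, and the two reviewers disagree on the design of S02, which suggests the field needs clearer coding guidance.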


Maintaining an Audit Trail

Keep a running note of any decisions you make during extraction that go beyond straightforward form completion. Examples:

  • "S07 reports two separate studies in one paper; extracted as two separate rows"
  • "S12 uses 'SME' to mean firms with under 500 employees, which differs from the EU definition; noted in synthesis"
  • "S19 abstract reports significant results but body of paper presents a non-significant finding; used body of paper"

These notes protect you if your decisions are questioned during examination, and they support transparency if your review is ever published.
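
A plain text file or an extra spreadsheet tab is enough for this log. If you would rather keep it alongside your data in code, one possible shape is sketched below; the `Decision` class and `log_decision` function are illustrative names, not part of any standard tool, and the entries reuse two of the examples above.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Decision:
    """One audit-trail entry: which study, what was decided, and when."""
    study_id: str
    note: str
    logged_on: date = field(default_factory=date.today)


audit_trail = []


def log_decision(study_id, note):
    """Append a dated decision to the audit trail and return it."""
    entry = Decision(study_id, note)
    audit_trail.append(entry)
    return entry


log_decision("S07", "Reports two separate studies in one paper; extracted as two separate rows")
log_decision("S19", "Abstract reports significant results but body presents a non-significant finding; used body of paper")

for entry in audit_trail:
    print(f"{entry.logged_on.isoformat()}  {entry.study_id}  {entry.note}")
```

Recording the date with each note makes it possible to show later which studies were extracted before and after a given decision.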


Preparing for Synthesis

Before moving to the next step, review your completed extraction form as a whole:

  • Are there patterns in the countries, methods, or theoretical frameworks of included studies?
  • Are there clusters of studies addressing the same sub-question or using the same construct?
  • Are there contradictions between studies that will need to be addressed in synthesis?
  • Are there gaps in the evidence base that were not apparent before extraction?

A brief written memo at this stage, even half a page of notes, is a valuable precursor to synthesis. It helps you enter the next step with a sense of the landscape of the evidence rather than facing a blank page.
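
Simple frequency counts are a quick way to surface the patterns asked about above before you write the memo. The sketch below assumes the completed form has been loaded as a list of dictionaries keyed by field name; the `tally` helper and the three sample rows are invented for illustration. Sorting or filtering the spreadsheet by the same columns gives the same information.

```python
from collections import Counter


def tally(rows, field):
    """Count how often each value of one field occurs across the included studies."""
    return Counter(str(row.get(field, "")).strip() or "(blank)" for row in rows)


# Made-up rows, for illustration only.
rows = [
    {"Study ID": "S01", "Country/Region": "Germany", "Study design/method": "survey"},
    {"Study ID": "S02", "Country/Region": "Germany", "Study design/method": "case study"},
    {"Study ID": "S03", "Country/Region": "Kenya", "Study design/method": "survey"},
]

by_country = tally(rows, "Country/Region")
by_method = tally(rows, "Study design/method")
print(by_country.most_common())  # [('Germany', 2), ('Kenya', 1)]
print(by_method.most_common())   # [('survey', 2), ('case study', 1)]
```

Counts like these only describe the evidence base; interpreting what the clusters and gaps mean belongs to the synthesis step.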


Common Mistakes to Avoid

  • Extracting selectively. Record all findings relevant to your research question, including those that contradict your expectations. Selective extraction is a form of bias.
  • Leaving fields blank. An empty field is ambiguous: it may mean the information was not reported, or it may mean you forgot to check. Use "not reported" explicitly.
  • Conflating extraction with synthesis. The extraction form captures what each study says; synthesis is where you interpret and compare across studies. Do not begin drawing conclusions in the extraction form.
  • Using only the abstract. Abstracts routinely omit, simplify, or misrepresent the findings of the full paper. Always extract from the full text.
  • Not versioning the form. If you revise the extraction form after beginning extraction, save both versions and note when the change was made. Applying revised criteria retrospectively without documentation introduces inconsistency.