Evaluating Research – Process, Examples and Methods
Evaluating Research
Research evaluation is a systematic process used to assess the quality, relevance, credibility, and overall contribution of a research study. Effective evaluation allows researchers, policymakers, and practitioners to determine the reliability of findings, understand the study’s strengths and limitations, and make informed decisions based on evidence. Research evaluation is crucial across disciplines, ensuring that conclusions drawn from studies are valid, meaningful, and applicable.
Why Evaluate Research?
Evaluating research provides several benefits, including:
- Ensuring Credibility : Confirms the reliability and validity of research findings.
- Identifying Limitations : Highlights potential biases, methodological flaws, or gaps.
- Promoting Accountability : Helps allocate funding and resources to high-quality studies.
- Supporting Decision-Making : Enables stakeholders to make informed decisions based on rigorous evidence.
Process of Evaluating Research
The evaluation process typically involves several steps, from understanding the research context to assessing methodology, analyzing data quality, and interpreting findings. Below is a step-by-step guide for evaluating research.
Step 1: Understand the Research Context
- Identify the Purpose : Determine the study’s objectives and research questions.
- Contextual Relevance : Evaluate the study’s relevance to current knowledge, theory, or practice.
Example : For a study examining the effects of social media on mental health, assess whether the study addresses an important and timely issue in the field of psychology.
Step 2: Assess Research Design and Methodology
- Design Appropriateness : Determine if the research design is suitable for answering the research question (e.g., experimental, observational, qualitative, or quantitative).
- Sampling : Evaluate the sample size, sampling methods, and participant selection to ensure they are representative of the population being studied (a quick sample-size sketch follows this step's example).
- Variables and Measures : Review how variables were defined and measured, and ensure that the measures are valid and reliable.
Example : In an experimental study on cognitive performance, check if participants were randomly assigned to control and treatment groups to ensure the design minimizes bias.
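To make the sample-size check above concrete, here is a minimal sketch assuming a simple two-group comparison and conventional thresholds (a medium effect size, 5% significance, 80% power); the values are illustrative defaults rather than figures from any particular study.

```python
# Rough power/sample-size check an evaluator might run (illustrative values only).
from statsmodels.stats.power import TTestIndPower

n_per_group = TTestIndPower().solve_power(effect_size=0.5, alpha=0.05, power=0.8)
print(f"About {n_per_group:.0f} participants per group are needed")
# Roughly 64 per group; a study reporting far fewer may be underpowered.
```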
Step 3: Evaluate Data Collection and Analysis
- Data Collection Methods : Assess the tools, procedures, and sources used for data collection. Ensure they align with the research question and minimize bias.
- Statistical Analysis : Review the statistical methods used to analyze data. Check for appropriate use of tests, proper handling of variables, and accurate interpretation of results (a short worked example follows below).
- Ethics and Integrity : Consider whether data collection and analysis adhered to ethical guidelines, including participant consent, data confidentiality, and unbiased reporting.
Example : If a study uses surveys to collect data on job satisfaction, evaluate if the survey questions are clear, unbiased, and relevant to the research objectives.
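As a purely illustrative check on the statistical analysis, the sketch below re-runs a simple two-group comparison using Welch's t-test, which does not assume equal variances; the satisfaction scores are invented.

```python
# Re-checking a reported two-group comparison with invented data (illustration only).
from scipy import stats

treatment = [72, 75, 78, 71, 80, 77, 74]  # hypothetical satisfaction scores
control = [68, 70, 65, 72, 66, 69, 71]

t_stat, p_value = stats.ttest_ind(treatment, control, equal_var=False)  # Welch's t-test
print(f"Welch's t = {t_stat:.2f}, p = {p_value:.3f}")
```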
Step 4: Interpret Results and Findings
- Relevance of Findings : Determine whether the findings answer the research question and contribute meaningfully to the field.
- Consistency with Existing Knowledge : Check if the results align with or contradict previous research. If they contradict, consider potential explanations for the differences.
- Generalizability : Evaluate whether the findings are applicable to a broader population or specific to the study sample.
Example : For a study on the effects of a dietary supplement on athletic performance, assess whether the findings could be generalized to athletes of different ages, genders, or skill levels.
Step 5: Assess Limitations and Biases
- Identifying Limitations : Recognize any acknowledged limitations in the study, such as small sample size, selection bias, or short duration.
- Potential Biases : Consider potential sources of bias, including researcher bias, funding source bias, or publication bias.
- Impact on Validity : Evaluate how limitations and biases might impact the study’s internal and external validity.
Example : If a study on drug efficacy was funded by a pharmaceutical company, acknowledge the potential for funding bias and whether safeguards were in place to maintain objectivity.
Step 6: Conclude with Overall Quality and Contribution
- Summarize Strengths and Weaknesses : Provide an overview of the study’s strengths and limitations, focusing on aspects that affect the reliability and applicability of the findings.
- Contribution to the Field : Assess the overall contribution to knowledge, practice, or policy, and identify any recommendations for future research or application.
Example : Conclude by summarizing whether the study’s methodology and findings are robust and suggest areas for future research, such as longer follow-up periods or larger sample sizes.
Examples of Research Evaluation
Example 1: A workplace study on stress and productivity
- Purpose : To assess whether stress levels affect productivity.
- Evaluation Process : Review if the sample includes participants with varying stress levels, if the stress is accurately measured (e.g., cortisol levels), and if the analysis properly accounts for confounding variables like sleep or work environment.
- Conclusion : The study could be evaluated as robust if it uses valid measures and controlled conditions, with future research suggested on different population groups.
Example 2: An educational study of digital learning tools
- Purpose : To determine if digital learning tools improve student outcomes.
- Evaluation Process : Assess the appropriateness of the sample (students with similar baseline knowledge), methodology (controlled comparisons of digital vs. traditional methods), and results interpretation.
- Conclusion : Evaluate if findings are generalizable to broader educational contexts and whether technology access could be a limitation.
Example 3: A clinical trial of a new anxiety medication
- Purpose : To determine the efficacy of a new medication for treating anxiety.
- Evaluation Process : Review if participants were randomly assigned, if a placebo was used, and if double-blinding was implemented to minimize bias.
- Conclusion : If the study follows a strong experimental design, it could be deemed credible. Note potential side effects for further investigation.
Methods for Evaluating Research
Several methods are used to evaluate research, depending on the type of study, objectives, and evaluation criteria. Common methods include peer review, meta-analysis, systematic reviews, and quality assessment frameworks.
1. Peer Review
Definition : Peer review is a method in which experts in the field evaluate the study before publication. They assess the study’s quality, methodology, and contribution to the field.
Advantages :
- Increases the credibility of the research.
- Provides feedback on methodological rigor and relevance.
Example : Before publishing a study on environmental sustainability, experts in environmental science review its methods, findings, and implications.
2. Meta-Analysis
Definition : Meta-analysis is a statistical technique that combines results from multiple studies to draw broader conclusions. It focuses on studies with similar research questions or variables (a small numerical sketch of the pooling calculation follows the example below).
Advantages :
- Offers a comprehensive view of a topic by synthesizing findings from various studies.
- Identifies overall trends and potential effect sizes.
Example : Conducting a meta-analysis of studies on cognitive behavioral therapy to determine its effectiveness for treating depression across diverse populations.
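The sketch below illustrates the core pooling step with fixed-effect (inverse-variance) weighting, using made-up effect sizes and standard errors; real meta-analyses also assess heterogeneity and often use random-effects models.

```python
# Fixed-effect (inverse-variance) pooling of three hypothetical study results.
import math

studies = [(0.30, 0.10), (0.45, 0.15), (0.20, 0.08)]  # (effect size, standard error)

weights = [1 / se ** 2 for _, se in studies]  # inverse-variance weights
pooled = sum(w * es for (es, _), w in zip(studies, weights)) / sum(weights)
pooled_se = math.sqrt(1 / sum(weights))

print(f"Pooled effect: {pooled:.3f} (SE {pooled_se:.3f})")
print(f"95% CI: {pooled - 1.96 * pooled_se:.3f} to {pooled + 1.96 * pooled_se:.3f}")
```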
3. Systematic Review
Definition : A systematic review evaluates and synthesizes findings from multiple studies, providing a high-level summary of evidence on a particular topic.
Advantages :
- Follows a structured, transparent process for identifying and analyzing studies.
- Helps identify gaps in research, limitations, and consistencies.
Example : A systematic review of research on the impact of exercise on mental health, summarizing evidence on exercise frequency, intensity, and outcomes.
4. Quality Assessment Frameworks
Definition : Quality assessment frameworks are tools used to evaluate the rigor and validity of research studies, often using checklists or scales.
Examples of Quality Assessment Tools :
- CASP (Critical Appraisal Skills Programme) : Provides checklists for evaluating qualitative and quantitative research.
- GRADE (Grading of Recommendations Assessment, Development and Evaluation) : Assesses the quality of evidence and strength of recommendations.
- PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) : A guideline for systematic reviews, ensuring clarity and transparency in reporting.
Example : Using the CASP checklist to evaluate a qualitative study on patient satisfaction with healthcare services by assessing sampling, ethical considerations, and data validity.
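As a simplified illustration, checklist-style appraisal can be recorded as structured yes/no items and tallied consistently across reviewers; the items below are loose paraphrases for demonstration, not the official CASP wording or scoring guidance.

```python
# Hypothetical, simplified appraisal checklist (not the official CASP items).
checklist = {
    "Clear statement of research aims": True,
    "Appropriate qualitative methodology": True,
    "Recruitment strategy appropriate": False,
    "Ethical issues considered": True,
    "Data analysis sufficiently rigorous": True,
}

met = sum(checklist.values())
print(f"Criteria met: {met}/{len(checklist)}")
for item, ok in checklist.items():
    print(f"[{'x' if ok else ' '}] {item}")
```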
Conclusion
Evaluating research is a critical process that enables researchers, practitioners, and policymakers to determine the quality and applicability of study findings. By following a structured evaluation process and using established methods like peer review, meta-analysis, systematic review, and quality assessment frameworks, stakeholders can make informed decisions based on robust evidence. Effective research evaluation not only enhances the credibility of individual studies but also contributes to the advancement of knowledge across disciplines.
About the author
Muhammad Hassan
Researcher, Academic Writer, Web developer
The Federal Evaluation Toolkit
Evaluation 101
What is evaluation? How can it help me do my job better? Evaluation 101 provides resources to help you answer those questions and more. You will learn about program evaluation and why it is needed, along with some helpful frameworks that place evaluation in the broader evidence context. Other resources provide helpful overviews of specific types of evaluation you may encounter or be considering, including implementation, outcome, and impact evaluations, and rapid cycle approaches.
What is Evaluation?
Heard the term "evaluation" but still not quite sure what it means? These resources help you answer the question "What is evaluation?" and learn more about how evaluation fits into a broader evidence-building framework.
What is Program Evaluation? A Beginner's Guide
Program evaluation uses systematic data collection to help us understand whether programs, policies, or organizations are effective. This guide explains how program evaluation can contribute to improving program services. It provides a high-level, easy-to-read overview of program evaluation from start (planning and evaluation design) to finish (dissemination), and includes links to additional resources.
Types of Evaluation
What's the difference between an impact evaluation and an implementation evaluation? What does each type of evaluation tell us? Use these resources to learn more about the different types of evaluation, what they are, how they are used, and what types of evaluation questions they answer.
Common Framework for Research and Evaluation
The Administration for Children & Families Common Framework for Research and Evaluation (OPRE Report #2016-14). Office of Planning, Research, and Evaluation, U.S. Department of Health and Human Services. https://www.acf.hhs.gov/sites/default/files/documents/opre/acf_common_framework_for_research_and_evaluation_v02_a.pdf
Building evidence is not one-size-fits-all, and different questions require different methods and approaches. The Administration for Children & Families Common Framework for Research and Evaluation describes, in detail, six different types of research and evaluation approaches – foundational descriptive studies, exploratory descriptive studies, design and development studies, efficacy studies, effectiveness studies, and scale-up studies – and can help you understand which type of evaluation might be most useful for you and your information needs.
Formative Evaluation Toolkit
Formative evaluation toolkit: A step-by-step guide and resources for evaluating program implementation and early outcomes. Washington, DC: Children’s Bureau, Administration for Children and Families, U.S. Department of Health and Human Services.
Formative evaluation can help determine whether an intervention or program is being implemented as intended and producing the expected outputs and short-term outcomes. This toolkit outlines the steps involved in conducting a formative evaluation and includes multiple planning tools, references, and a glossary. Check out the overview to learn more about how this resource can help you.
Introduction to Randomized Evaluations
Randomized evaluations, also known as randomized controlled trials (RCTs), are one of the most rigorous evaluation methods used to conduct impact evaluations to determine the extent to which your program, policy, or initiative caused the outcomes you see. They use random assignment of people/organizations/communities affected by the program or policy to rule out other factors that might have caused the changes your program or policy was designed to achieve. This in-depth resource introduces randomized evaluations in a non-technical way, provides examples of RCTs in practice, describes when RCTs might be the right approach, and offers a thorough FAQ about RCTs.
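The toy simulation below illustrates the logic described above: random assignment tends to balance baseline characteristics across the two groups, so the difference in outcomes approximates the program's true effect; all numbers are simulated, not drawn from any real evaluation.

```python
# Toy simulation (all numbers invented): random assignment balances baseline
# characteristics, so the treatment-control outcome gap estimates the true
# effect, which is set to 5 here.
import random

random.seed(1)

people = [{"baseline": random.gauss(50, 10)} for _ in range(200)]
random.shuffle(people)                      # random assignment
treatment, control = people[:100], people[100:]

for person in treatment:
    person["outcome"] = person["baseline"] + 2 + 5 + random.gauss(0, 3)
for person in control:
    person["outcome"] = person["baseline"] + 2 + random.gauss(0, 3)

def mean(values):
    values = list(values)
    return sum(values) / len(values)

baseline_gap = mean(p["baseline"] for p in treatment) - mean(p["baseline"] for p in control)
effect_estimate = mean(p["outcome"] for p in treatment) - mean(p["outcome"] for p in control)
print(f"Baseline difference between groups: {baseline_gap:.2f}")  # close to zero
print(f"Estimated program effect: {effect_estimate:.2f}")         # close to 5
```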
Rapid Cycle Evaluation at a Glance
Rapid Cycle Evaluation at a Glance (OPRE #2020-152). Office of Planning, Research, and Evaluation, U.S. Department of Health and Human Services. https://www.acf.hhs.gov/opre/report/rapid-cycle-evaluation-glance
Rapid Cycle Evaluation (RCE) can be used to efficiently assess implementation and inform program improvement. This brief provides an introduction to RCE, describing what it is, how it compares to other methods, when and how to use it, and includes more in-depth resources. Use this brief to help you figure out whether RCE makes sense for your program.
Evaluating Sources: General Guidelines
Once you have an idea of the types of sources you need for your research, you can spend time evaluating individual sources. If a bibliographic citation seems promising, it’s a good idea to spend a bit more time with the source before you determine its credibility. Below are some questions to ask and things to consider as you read through a source.
Find Out What You Can about the Author
One of the first steps in evaluating a source is to locate more information about the author. Sometimes simply typing an author’s name into a search engine will give you an initial springboard for information. Finding the author’s educational background and areas of expertise will help determine whether the author has experience in what they’re writing about. You should also examine whether the author has other publications and if they are with well-known publishers or organizations.
Read the Introduction / Preface
Begin by reading the Introduction or the Preface—What does the author want to accomplish? Browse through the Table of Contents and the Index. This will give you an overview of the source. Is your topic covered in enough depth to be helpful? If you don't find your topic discussed, try searching for some synonyms in the Index.
If your source does not contain any of these elements, consider reading the first few paragraphs of the source and determining whether it includes enough information on your topic for it to be relevant.
Determine the Intended Audience
Consider the tone, style, vocabulary, level of information, and assumptions the author makes about the reader. Are they appropriate for your needs? Remember that scholarly sources often have a very particular audience in mind, and popular sources are written for a more general audience. However, some scholarly sources may be too dense for your particular research needs, so you may need to turn to sources with a more general audience in mind.
Determine whether the Information is Fact, Opinion, or Propaganda
Information can usually be divided into three categories: fact, opinion, and propaganda. Facts are objective, while opinions and propaganda are subjective. A fact is something that is known to be true. An opinion gives the thoughts of a particular individual or group. Propaganda is the (usually biased) spreading of information for a specific person, group, event, or cause. Propaganda often relies on slogans or emotionally-charged images to influence an audience. It can also involve the selective reporting of true information in order to deceive an audience.
- Fact: The Purdue OWL was launched in 1994.
- Opinion: The Purdue OWL is the best website for writing help.
- Propaganda: Some students have gone on to lives of crime after using sites that compete with the Purdue OWL. The Purdue OWL is clearly the only safe choice for student writers.
The last example above uses facts in a bad-faith way to take advantage of the audience's fear. Even if the individual claim is true, the way it is presented helps the author tell a much larger lie. In this case, the lie is that there is a link between the websites students visit for writing help and their later susceptibility to criminal lifestyles. Of course, there is no such link. Thus, when examining sources for possible propaganda, be aware that sometimes groups may deploy pieces of true information in deceptive ways.
Note also that the difference between an opinion and propaganda is that propaganda usually has a specific agenda attached—that is, the information in the propaganda is being spread for a certain reason or to accomplish a certain goal. If the source appears to represent an opinion, does the author offer legitimate reasons for adopting that stance? If the opinion feels one-sided, does the author acknowledge opposing viewpoints? An opinion-based source is not necessarily unreliable, but it’s important to know whether the author recognizes that their opinion is not the only opinion.
Identify the Language Used
Is the language objective or emotional? Objective language sticks to the facts, but emotional language relies on garnering an emotional response from the reader. Objective language is more commonly found in fact-based sources, while emotional language is more likely to be found in opinion-based sources and propaganda.
Evaluate the Evidence Listed
If you’re just starting your research, you might look for sources that include more general information. However, the deeper you get into your topic, the more comprehensive your research will need to be.
If you’re reading an opinion-based source, ask yourself whether there’s enough evidence to back up the opinions. If you’re reading a fact-based source, be sure that it doesn’t oversimplify the topic.
The more familiar you become with your topic, the easier it will be for you to evaluate the evidence in your sources.
Cross-Check the Information
When you verify the information in one source with information you find in another source, this is called cross-referencing or cross-checking. If the author lists specific dates or facts, can you find that same information somewhere else? Having information listed in more than one place increases its credibility.
Check the Timeliness of the Source
How timely is the source? Is the source twenty years out of date? Some information becomes dated when new research is available, but other older sources of information can still be useful and reliable fifty or a hundred years later. For example, if you are researching a scientific topic, you will want to be sure you have the most up-to-date information. However, if you are examining an historical event, you may want to find primary documents from the time of the event, thus requiring older sources.
Examine the List of References
Check for a list of references or other citations that look as if they will lead you to related material that would be good sources. If a source has a list of references, it often means that the source is well-researched and thorough.
As you continue to encounter more sources, evaluating them for credibility will become easier.
Changing how we evaluate research is difficult, but not impossible
Anna Hatch and Stephen Curry
Received 2020 May 7; Accepted 2020 Aug 6; Collection date 2020.
This article is distributed under the terms of the Creative Commons Attribution License , which permits unrestricted use and redistribution provided that the original author and source are credited.
The San Francisco Declaration on Research Assessment (DORA) was published in 2013 and described how funding agencies, institutions, publishers, organizations that supply metrics, and individual researchers could better evaluate the outputs of scientific research. Since then DORA has evolved into an active initiative that gives practical advice to institutions on new ways to assess and evaluate research. This article outlines a framework for driving institutional change that was developed at a meeting convened by DORA and the Howard Hughes Medical Institute. The framework has four broad goals: understanding the obstacles to changes in the way research is assessed; experimenting with different approaches; creating a shared vision when revising existing policies and practices; and communicating that vision on campus and beyond.
Research organism: None
Introduction
Declarations can inspire revolutionary change, but the high ideals inspiring the revolution must be harnessed to clear guidance and tangible goals to drive effective reform. When the San Francisco Declaration on Research Assessment (DORA) was published in 2013, it catalogued the problems caused by the use of journal-based indicators to evaluate the performance of individual researchers, and provided 18 recommendations to improve such evaluations. Since then, DORA has inspired many in the academic community to challenge long-standing research assessment practices, and over 150 universities and research institutions have signed the declaration and committed to reform.
But experience has taught us that this is not enough to change how research is assessed. Given the scale and complexity of the task, additional measures are called for. We have to support institutions in developing the processes and resources needed to implement responsible research assessment practices. That is why DORA has transformed itself from a website collecting signatures to a broader campaigning initiative that can provide practical guidance. This will help institutions to seize the opportunities created by the momentum now building across the research community to reshape how we evaluate research.
Systemic change requires fundamental shifts in policies, processes and power structures, as well as in deeply held norms and values. Those hoping to drive such change need to understand all the stakeholders in the system: in particular, how do they interact with and depend on each other, and how do they respond to internal and external pressures? To this end DORA and the Howard Hughes Medical Institute (HHMI) convened a meeting in October 2019 that brought together researchers, university administrators, librarians, funders, scientific societies, non-profits and other stakeholders to discuss these questions. Those taking part in the meeting ( https://sfdora.org/assessingresearch/agenda/ ) discussed emerging policies and practices in research assessment, and how they could be aligned with the academic missions of different institutions.
The discussion helped to identify what institutional change could look like, to surface new ideas, and to formulate practical guidance for research institutions looking to embrace reform. This guidance – summarized below – provides a framework for action that consists of four broad goals: i) understand obstacles that prevent change; ii) experiment with different ideas and approaches at all levels; iii) create a shared vision for research assessment when reviewing and revising policies and practices; iv) communicate that vision on campus and externally to other research institutions.
Understand obstacles that prevent change
Most academic reward systems rely on proxy measures of quality to assess researchers. This is problematic when there is an over-reliance on these proxy measures, particularly so if aggregate measures are used that mask the variations between individuals and individual outputs. Journal-based metrics and the H-index, alongside qualitative notions of publisher prestige and institutional reputation, present obstacles to change that have become deeply entrenched in academic evaluation. This has happened because such measures contain an appealing kernel of meaning (though the appeal only holds so long as one operates within the confines of the law of averages) and because they provide a convenient shortcut for busy evaluators. Additionally, the over-reliance on proxy measures that tend to be focused on research can discourage researchers from working on other activities that are also important to the mission of most research institutions, such as teaching, mentoring, and work that has societal impact.
The use of proxy measures also preserves biases against scholars who still feel the force of historical and geographical exclusion from the research community. Progress toward gender and race equality has been made in recent years, but the pace of change remains unacceptably slow. A recent study of basic science departments in US medical schools suggests that under current practices, a level of faculty diversity representative of the national population will not be achieved until 2080 ( Gibbs et al., 2016 ).
Rethinking research assessment therefore means addressing the privilege that exists in academia, and taking proper account of how luck and opportunity can influence decision-making more than personal characteristics such as talent, skill and tenacity. As a community, we need to take a hard look – without averting our gaze from the prejudices that attend questions of race, gender, sexuality, or disability – at what we really mean when we talk about 'success' and 'excellence' if we are to find answers congruent with our highest aspirations.
This is by no means easy. Many external and internal pressures stand in the way of meaningful change. For example, institutions have to wrestle with university rankings as part of research assessment reform, because stepping away from the surrogate, selective, and incomplete 'measures' of performance totted up by rankers poses a reputational threat. Grant funding, which is commonly seen as an essential signal of researcher success, is clearly crucial for many universities and research institutions: however, an overemphasis on grants in decisions about hiring, promotion and tenure incentivizes researchers to discount other important parts of their job. The huge mental health burden of hyper-competition is also a problem that can no longer be ignored ( Wellcome, 2020a ).
Experiment with different ideas and approaches at all levels
Culture change is often driven by the collective force of individual actions. These actions take many forms, but spring from a common desire to champion responsible research assessment practices. At the DORA/HHMI meeting Needhi Bhalla (University of California, Santa Cruz) advocated strategies that have been proven to increase equity in faculty hiring – including the use of diversity statements to assess whether a candidate is aligned with the department's equity mission – as part of a more holistic approach to researcher evaluation ( Bhalla, 2019 ). She also described how broadening the scope of desirable research interests in the job descriptions for faculty positions in chemistry at the University of Michigan resulted in a two-fold increase of applicants from underrepresented groups ( Stewart and Valian, 2018 ). As a further step, Bhalla's department now includes untenured assistant professors in tenure decisions: this provides such faculty with insights into the tenure process.
The actions of individual researchers, however exemplary, are dependent on career stage and position: commonly, those with more authority have more influence. As chair of the cell biology department at the University of Texas Southwestern Medical Center, Sandra Schmid used her position to revise their hiring procedure to focus on key research contributions, rather than publication or grant metrics, and to explore how the applicant's future plans might best be supported by the department. According to Schmid, the department's job searches were given real breadth and depth by the use of Skype interviews (which enhanced the shortlisting process by allowing more candidates to be interviewed) and by designating faculty advocates from across the department for each candidate ( Schmid, 2017 ). Another proposal for shifting the attention of evaluators from proxies to the content of an applicant's papers and other contributions is to instruct applicants for grants and jobs to remove journal names from CVs and publication lists ( Lobet, 2020 ).
The seeds planted by individual action must be encouraged to grow, so that discussions about research assessment can reach across the entire institution. This is rarely straightforward, given the size and organizational autonomy within modern universities, which is why some have set up working groups to review their research assessment policies and practices. At the Universitat Oberta de Catalunya (UOC) and Imperial College London, for example, the working groups produced action plans or recommendations that have been adopted by the university and are now being implemented ( UOC, 2019 ; Imperial College, 2020 ). University Medical Center (UMC) Utrecht has gone a step further: in addition to revising its processes and criteria for promotion and for internal evaluation of research programmes ( Benedictus et al., 2016 ), it is undertaking an in-depth evaluation of how the changes are impacting their researchers (see below).
To increase their chances of success these working groups need to ensure that women and other historically excluded groups have a voice. It is also important that the viewpoints of administrators, librarians, tenured and non-tenured faculty members, postdocs, and graduate students are all heard. This level of inclusion is important because when communities impacted by new practices are involved in their design, they are more likely to adopt them. But the more views there are around the table, the more difficult it can be to reach a consensus. Everyone brings their own frame-of-reference, their own ideas, and their own experiences. To help ensure that working groups do not become mired in minutiae, their objectives should be defined early in the process and should be simple, clear and realistic.
Create a shared vision
Aligning policies and practices with an institution’s mission
The re-examination of an institution's policies and procedures can reveal the real priorities that may be glossed over in aspirational mission statements. Although the journal impact factor (JIF) is widely discredited as a tool for research assessment, more than 40% of research-intensive universities in the United States and Canada explicitly mention the JIF in review, promotion, and tenure documents ( McKiernan et al., 2019 ). The number of institutions where the JIF is not mentioned in such documents, but is understood informally to be a performance criterion, is not known. A key task for working groups is therefore to review how well the institution's values, as expressed in its mission statement, are embedded in its hiring, promotion, and tenure practices. Diversity, equity, and inclusion are increasingly advertised as core values, but work in these areas is still often lumped into the service category, which is the least recognized type of academic contribution when it comes to promotion and tenure ( Schimanski and Alperin, 2018 ).
A complicating factor here is that while mission statements publicly signal organizational values, the commitments entailed by those statements are delivered by individuals, who are prone to unacknowledged biases, such as the perception gap between what people say they value and what they think others hold most dear. For example, when Meredith Niles and colleagues surveyed faculty at 55 institutions, they found that academics value readership most when selecting where to publish their work ( Niles et al., 2019 ). But when asked how their peers decide to publish, a disconnect was revealed: most faculty members believe their colleagues make choices based on the prestige of the journal or publisher. Similar perception gaps are likely to be found when other performance proxies (such as grant funding and student satisfaction) are considered.
Bridging perception gaps requires courage and honesty within any institution – to break with the metrics game and create evaluation processes that are visibly infused with the organization's core values. To give one example, HHMI tries to advance basic biomedical research for the benefit of humanity by setting evaluation criteria that are focused on quality and impact. To increase transparency, these criteria are now published ( HHMI, 2019 ). As one element of the review, HHMI asks Investigators to "choose five of their most significant articles and provide a brief statement for each that describes the significance and impact of that contribution." It is worth noting that both published and preprint articles can be included. This emphasis on a handful of papers helps focus the evaluation on the quality and impact of the Investigator's work.
Arguably, universities face a stiffer challenge here. Institutions striving to improve their research assessment practices will likely be casting anxious looks at what their competitors are up to. However, one of the hopeful lessons from the October meeting is that less courage should be required – and progress should be faster – if institutions come together to collaborate and establish a shared vision for the reform of research evaluation.
Finding conceptual clarity
Conceptual clarity in hiring, promotion, and tenure policies is another area for institutions to examine when aligning practices with values ( Hatch, 2019 ). Generic terms like 'world-class' or 'excellent' appear to provide standards for quality; however, they are so broad that they allow evaluators to apply their own definitions, creating room for bias. This is especially the case when, as is still likely, there is a lack of diversity in decision-making panels. The use of such descriptors can also perpetuate the Matthew Effect, a phenomenon in which resources accrue to those who are already well resourced. Moore et al., 2017 have critiqued the rhetoric of 'excellence' and propose instead focusing evaluation on more clearly defined concepts such as soundness and capacity-building. (See also Belcher and Palenberg, 2018 for a discussion of the many meanings of the words 'outputs', 'outcomes' and 'impacts' as applied to research in the field of international development).
Establishing standards
Institutions should also consider conceptual clarity when structuring the information requested from those applying for jobs, promotion, or funding. There have been some interesting innovations in recent years from institutions seeking to advance more holistic forms of researcher evaluation. UMC Utrecht, the Royal Society, the Dutch Research Council (NWO), and the Swiss National Science Foundation (SNSF) are also experimenting with structured narrative CV formats ( Benedictus et al., 2016 ; Gossink-Melenhorst, 2019 ; Royal Society, 2020 ; SNSF, 2020 ). These can be tailored to institutional needs and values. The concise but consistently formatted structuring of information in such CVs facilitates comparison between applicants and can provide a richer qualitative picture to complement the more quantitative aspects of academic contributions.
DORA worked with the Royal Society to collect feedback on its 'Resumé for Researchers' narrative CV format, where, for example, the author provides personal details (e.g., education, key qualifications and relevant positions), a personal statement, plus answers to the following four questions: how have you contributed to the generation of knowledge?; how have you contributed to the development of individuals?; how have you contributed to the wider research community?; how have you contributed to broader society? (The template also asks about career breaks and other factors "that might have affected your progression as a researcher"). The answers to these questions will obviously depend on the experience of the applicant but, as Athene Donald of Cambridge University has written: "The topics are broad enough that most people will be able to find something to say about each of them. Undoubtedly there is still plenty of scope for the cocky to hype their life story, but if they can only answer the first [question], and give no account of mentoring, outreach or conference organization, or can't explain why what they are doing is making a contribution to their peers or society, then they probably aren't 'excellent' after all" ( Donald, 2020 ).
It is too early to say if narrative CVs are having a significant impact, but according to the NWO their use has led to an increased consensus between external evaluators and to a more diverse group of researchers being selected for funding ( DORA, 2020 ).
Even though the imposition of structure promotes consistency, there is a confounding factor of reviewer subjectivity. At the meeting, participants identified a two-step strategy to reduce the impact of individual subjectivity on decision-making. First, evaluators should identify and agree on specific assessment criteria for all the desired capabilities. The faculty in the biology department at University of Richmond, for example, discuss the types of expertise, experience, and characteristics desired for a role before soliciting applications.
This lays the groundwork for the second step, which is to define the full range of performance standards for criteria to be used in the evaluation process. An example is the three-point rubric used by the Office for Faculty Equity and Welfare at University of California, Berkeley, which helps faculty to judge the commitment of applicants to advancing diversity, equity, and inclusion ( UC Berkeley, 2020 ). A strong applicant is one who "describes multiple activities in depth, with detailed information about both their role in the activities and the outcomes. Activities may span research, teaching and service, and could include applying their research skills or expertise to investigating diversity, equity and inclusion." A weaker candidate, on the other hand, is someone who provides "descriptions of activities that are brief, vague, or describe being involved only peripherally."
Recognizing collaborative contributions
Researcher evaluation is rightly preoccupied with the achievements of individuals, but increasingly, individual researchers are working within teams and collaborations. The average number of authors per paper has been increasing steadily since 1950 ( National Library of Medicine, 2020 ). Teamwork is essential to solve the most complex research and societal challenges, and is often mentioned as a core value in mission statements, but evaluating collaborative contributions and determining who did what remains challenging. In some disciplines, the order of authorship on a publication can signal how much an individual has contributed; but, as with other proxies, it is possible to end up relying more on assumptions than on information about actual contributions.
More robust approaches to the evaluation of team science are being introduced, with some aimed at behavior change. For example, the University of California Irvine has created guidance for researchers and evaluators on how team science should be described and assessed ( UC Irvine, 2019 ). In a separate development, led by a coalition of funders and universities, the Contributor Roles Taxonomy (CRediT) system ( https://credit.niso.org ), which provides more granular insight into individual contributions to published papers, is being adopted by many journal publishers. But new technological solutions are also needed. For scientific papers, it is envisioned that authorship credit may eventually be assigned at a figure level to identify who designed, performed, and analyzed specific experiments for a study. Rapid Science is also experimenting with an indicator to measure effective collaboration ( http://www.rapidscience.org/about/ ).
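As a hypothetical illustration of the kind of granular, machine-readable contribution record that a taxonomy like CRediT enables, the sketch below maps invented author names to a few of the standard role labels.

```python
# Invented example of recording author contributions with CRediT-style role labels.
contributions = {
    "A. Researcher": ["Conceptualization", "Methodology", "Writing - original draft"],
    "B. Analyst": ["Formal analysis", "Data curation", "Visualization"],
    "C. Student": ["Investigation", "Writing - review & editing"],
}

for author, roles in contributions.items():
    print(f"{author}: {', '.join(roles)}")
```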
Communicate the vision on campus and externally
Although many individual researchers feel constrained by an incentive system over which they have little control, at the institutional level and beyond they can be informed about and involved in the critical re-examination of research assessment. This is crucial if policy changes are to take root, and can happen in different ways, during and after the deliberations of the working groups described above. For example, University College London (UCL) held campus-wide and departmental-level consultations in drafting and reviewing new policies on the responsible use of bibliometrics, part of broader moves to embrace open scholarship ( UCL, 2018 ; Ayris, 2020 ). The working group at Imperial College London organized a symposium to foster a larger conversation within and beyond the university about implementing its commitment to DORA ( Imperial College, 2018 ).
Other institutions and departments have organized interactive workshops or invited speakers who advocate fresh thinking on research evaluation. UMC Utrecht, one of the most energetic reformers of research assessment, hosted a series of town hall meetings to collect faculty and staff input before formalizing its new policies. It is also working with social scientists from Leiden University to monitor how researchers at UMC are responding to the changes. Though the work is yet to be completed, they have identified three broad types of response: i) some researchers have embraced change and see the positive potential of aligning assessment criteria with real world impact and the diversity of academic responsibilities; ii) some would prefer to defend a status quo that re-affirms the value of more traditional metrics; iii) some are concerned about the uncertainty that attends the new norms for their assessment inside and outside UMC ( Benedictus et al., 2019 ). This research serves to maintain a dialogue about change within the institution and will help to refine the content and implementation of research assessment practices. However, the changes have already empowered PhD students at UMC to reshape their own evaluation by bringing a new emphasis on research competencies and professional development to the assessment of their performance ( Algra et al., 2020 ).
The Berlin Institute of Health (BIH) has executed a similarly deep dive into its research culture. In 2017, as part of efforts to improve its research and research assessment practices, it established the QUEST (Quality-Ethics-Open Science-Translation) Center and launched a programme of work that combined communication, new incentives and new tools to foster institutional culture change ( Strech et al., 2020 ). Moreover, a researcher applying for promotion at the Charité University Hospital, which is part of BIH, must answer questions about their contributions to science, reproducibility, open science, and team science, while applications for intramural funding are assessed on QUEST criteria that refer to robust research practices (such as strategies to reduce the risk of bias, and transparent reporting of methods and results). To help embed these practices, independent QUEST officers attend hiring commissions and funding reviewers are required to give structured written feedback. Although the impact of these changes is still being evaluated, lessons already learned include the importance of creating a positive narrative centered on improving the value of BIH research and of combining strong leadership and tangible support with bottom-up engagement by researchers, clinicians, technicians, administrators, and students across the institute ( Strech et al., 2020 ).
Regardless of format, transparency in the communication of policy and practice is critical. We encourage institutions and departments to publish information about their research assessment policies and practices so that research staff can see what is expected of them and, in turn, hold their institutions to account. While transparency increases accountability, it has been argued that it may stifle creativity, particularly if revised policies and criteria are perceived as overly prescriptive. Such risks can be mitigated by dialogue and consultation, and we would advise institutions to emphasize the spirit, rather than the letter, of any guidance they publish.
Universities should be encouraged to share new policies and practices with one another. Research assessment reform is an iterative process, and institutions can learn from the successes and failures of others. Workable solutions may well have to be accommodated within the traditions and idiosyncrasies of different institutions. DORA is curating a collection of new practices in research assessment that institutions can use as a resource (see sfdora.org/goodpractices ), and is always interested to receive new submissions. Based on feedback from the meeting, one of us (AH) and Ruth Schmidt (Illinois Institute of Technology) have written a briefing note that helps researchers make the case for reform to their university leaders and helps institutions experiment with different ideas and approaches by pointing to five design principles for reform ( Hatch and Schmidt, 2020 ).
Looking ahead
DORA is by no means the only organization grappling with the knotty problem of reforming research evaluation. The Wellcome Trust and the INORMS research evaluation group have both recently released guidance to help universities develop new policies and practices ( Wellcome, 2020b ; INORMS, 2020 ). Such developments are aligned with the momentum of the open research movement and the greater recognition by the academy of the need to address long-standing inequities and lack of diversity. Even with new tools, aligning research assessment policies and practices to an institution's values is going to take time. There is tension between the urgency of the situation and the need to listen to and understand the concerns of the community as new policies and practices are developed. Institutions and individuals will need to dedicate time and resources to establishing and maintaining new policies and practices if academia is to succeed in its oft-stated mission of making the world a better place. DORA and its partners are committed to supporting the academic community throughout this process.
DORA receives financial support from eLife, and an eLife employee (Stuart King) is a member of the DORA steering committee.
Acknowledgements
We thank the attendees at the meeting for robust and thoughtful discussions about ways to improve research assessment. We are extremely grateful to Boyana Konforti for her keen insights and feedback throughout the writing process. Thanks also go to Bodo Stern, Erika Shugart, and Caitlin Schrein for very helpful comments, and to Rinze Benedictus, Kasper Gossink, Hans de Jonge, Ndaja Gmelch, Miriam Kip, and Ulrich Dirnagl for sharing information about interventions to improve research assessment practices at their organizations.
Biographies
Anna Hatch is the program director at DORA, Rockville, United States
Stephen Curry is Assistant Provost (Equality, Diversity & Inclusion) and Professor of Structural Biology at Imperial College, London, UK. He is also chair of the DORA steering committee
Funding Statement
No external funding was received for this work.
Competing interests
No competing interests declared.
Contributor Information
Anna Hatch, Email: [email protected].
Stephen Curry, Email: [email protected].
References
- Algra A, Koopman I, Snoek R. How young researchers can re-shape the evaluation of their work. Nature Index. 2020. https://www.natureindex.com/news-blog/how-young-researchers-can-re-shape-research-evaluation-universities (accessed August 5, 2020).
- Ayris P. UCL statement on the importance of open science. 2020. https://www.ucl.ac.uk/research/strategy-and-policy/ucl-statement-importance-open-science (accessed July 21, 2020).
- Belcher B, Palenberg M. Outcomes and impacts of development interventions: toward conceptual clarity. American Journal of Evaluation. 2018;39:478–495. doi: 10.1177/1098214018765698.
- Benedictus R, Miedema F, Ferguson MW. Fewer numbers, better science. Nature. 2016;538:453–455. doi: 10.1038/538453a.
- Benedictus R, Dix G, Zuijderwijk J. The evaluative breach: How research staff deal with a challenge of evaluative norms in a Dutch biomedical research institute. 2019. https://wcrif.org/images/2019/ArchiveOtherSessions/day2/62.%20CC10%20-%20Jochem%20Zuijderwijk%20-%20WCRI%20Presentation%20HK%20v5%20-%20Clean.pdf (accessed July 21, 2020).
- Bhalla N. Strategies to improve equity in faculty hiring. Molecular Biology of the Cell. 2019;30:2744–2749. doi: 10.1091/mbc.E19-08-0476.
- Donald A. A holistic CV. 2020. http://occamstypewriter.org/athenedonald/2020/02/16/a-holistic-cv/ (accessed July 22, 2020).
- DORA. DORA's first funder discussion: updates from Swiss National Science Foundation, Wellcome Trust and the Dutch Research Council. 2020. https://sfdora.org/2020/04/14/doras-first-funder-discussion-updates-from-swiss-national-science-foundation-wellcome-trust-and-the-dutch-research-council/ (accessed July 21, 2020).
- Gibbs KD, Basson J, Xierali IM, Broniatowski DA. Decoupling of the minority PhD talent pool and assistant professor hiring in medical school basic science departments in the US. eLife. 2016;5:e21393. doi: 10.7554/eLife.21393.
- Gossink-Melenhorst K. Quality over quantity: How the Dutch Research Council is giving researchers the opportunity to showcase diverse types of talent. 2019. https://sfdora.org/2019/11/14/quality-over-quantity-how-the-dutch-research-council-is-giving-researchers-the-opportunity-to-showcase-diverse-types-of-talent/ (accessed July 21, 2020).
- Hatch A. To fix research assessment, swap slogans for definitions. Nature. 2019;576:9. doi: 10.1038/d41586-019-03696-w.
- Hatch A, Schmidt R. Rethinking research assessment: ideas for action. 2020. https://sfdora.org/2020/05/19/rethinking-research-assessment-ideas-for-action/ (accessed July 21, 2020).
- HHMI. Review of HHMI investigators. 2019. https://www.hhmi.org/programs/biomedical-research/investigator-program/review (accessed July 21, 2020).
- Imperial College. Mapping the future of research assessment at Imperial College London. 2018. https://www.youtube.com/watch?v=IpKyN-cXHL4 (accessed July 21, 2020).
- Imperial College. Research evaluation. 2020. http://www.imperial.ac.uk/research-and-innovation/about-imperial-research/research-evaluation/ (accessed July 21, 2020).
- INORMS. Research evaluation working group. 2020. https://inorms.net/activities/research-evaluation-working-group/ (accessed July 21, 2020).
- Lobet G. Fighting the impact factor one CV at a time. ecrLife. 2020. https://ecrlife.org/fighting-the-impact-factor-one-cv-at-a-time/ (accessed July 21, 2020).
- McKiernan EC, Schimanski LA, Muñoz Nieves C, Matthias L, Niles MT, Alperin JP. Use of the journal impact factor in academic review, promotion, and tenure evaluations. eLife. 2019;8:e47338. doi: 10.7554/eLife.47338.
- Moore S, Neylon C, Eve MP, O'Donnell DP, Pattinson D. "Excellence R Us": University research and the fetishisation of excellence. Palgrave Communications. 2017;3:1–13. doi: 10.1057/PALCOMMS.2016.105.
- National Library of Medicine. Number of authors per MEDLINE/PubMed citation. 2020. https://www.nlm.nih.gov/bsd/authors1.html (accessed July 21, 2020).
- Niles MT, Schimanski LA, McKiernan EC, Alperin JP. Why we publish where we do: faculty publishing values and their relationship to review promotion and tenure expectations. bioRxiv. 2019. doi: 10.1101/706622.
- Royal Society. Résumé for researchers. 2020. https://royalsociety.org/topics-policy/projects/research-culture/tools-for-support/resume-for-researchers/ (accessed July 21, 2020).
- Schimanski LA, Alperin JP. The evaluation of scholarship in academic promotion and tenure processes: past, present, and future. F1000Research. 2018;7:1605. doi: 10.12688/f1000research.16493.1.
- Schmid SL. Five years post-DORA: promoting best practices for research assessment. Molecular Biology of the Cell. 2017;28:2941–2944. doi: 10.1091/mbc.e17-08-0534.
- SNSF. SciCV – SNSF tests new format CV in biology and medicine. 2020. http://www.snf.ch/en/researchinFocus/newsroom/Pages/news-200131-scicv-snsf-tests-new-cv-format-in-biology-and-medicine.aspx (accessed July 21, 2020).
- Stewart AJ, Valian V. An Inclusive Academy: Achieving Diversity and Excellence. Cambridge, MA: The MIT Press; 2018.
- Strech D, Weissgerber T, Dirnagl U, QUEST Group. Improving the trustworthiness, usefulness, and ethics of biomedical research through an innovative and comprehensive institutional initiative. PLOS Biology. 2020;18:e3000576. doi: 10.1371/journal.pbio.3000576.
- UC Berkeley. Support for faculty search committees. 2020. https://ofew.berkeley.edu/recruitment/contributions-diversity/support-faculty-search-committees (accessed July 21, 2020).
- UC Irvine. Identifying faculty contributions to collaborative scholarship. 2019. https://ap.uci.edu/faculty/guidance/collaborativescholarship/ (accessed July 21, 2020).
- UCL. UCL bibliometrics policy. 2018. https://www.ucl.ac.uk/library/research-support/bibliometrics/ucl-bibliometrics-policy (accessed July 21, 2020).
- UOC. The UOC signs the San Francisco Declaration to encourage changes in research assessment. 2019. https://www.uoc.edu/portal/en/news/actualitat/2019/120-dora.html (accessed July 21, 2020).
- Wellcome. What researchers think about the culture they work in. 2020a. https://wellcome.ac.uk/reports/what-researchers-think-about-research-culture (accessed July 21, 2020).
- Wellcome. Guidance for research organisations on how to implement the principles of the San Francisco Declaration on Research Assessment. 2020b. https://wellcome.ac.uk/how-we-work/open-research/guidance-research-organisations-how-implement-dora-principles (accessed July 21, 2020).
A new framework for developing and evaluating complex interventions: update of Medical Research Council guidance
Kathryn Skivington, Lynsay Matthews, Sharon Anne Simpson, Peter Craig, Janis Baird, Jane M Blazeby, Kathleen Anne Boyd, Neil Craig, David P French, Emma McIntosh, Mark Petticrew, Jo Rycroft-Malone, Martin White, Laurence Moore
Correspondence to: K Skivington [email protected]
Accepted 2021 Aug 9; Collection date 2021.
This is an Open Access article distributed in accordance with the terms of the Creative Commons Attribution (CC BY 4.0) license, which permits others to distribute, remix, adapt and build upon this work, for commercial use, provided the original work is properly cited. See: http://creativecommons.org/licenses/by/4.0/ .
The UK Medical Research Council’s widely used guidance for developing and evaluating complex interventions has been replaced by a new framework, commissioned jointly by the Medical Research Council and the National Institute for Health Research, which takes account of recent developments in theory and methods and the need to maximise the efficiency, use, and impact of research.
Complex interventions are commonly used in the health and social care services, public health practice, and other areas of social and economic policy that have consequences for health. Such interventions are delivered and evaluated at different levels, from the individual to the societal. Examples include a new surgical procedure, the redesign of a healthcare programme, and a change in welfare policy. The UK Medical Research Council (MRC) published a framework for researchers and research funders on developing and evaluating complex interventions in 2000 and revised guidance in 2006. 1 2 3 Although these documents continue to be widely used and are now accompanied by a range of more detailed guidance on specific aspects of the research process, 4 5 6 7 8 several important conceptual, methodological, and theoretical developments have taken place since 2006. These developments have been included in a new framework commissioned by the National Institute for Health Research (NIHR) and the MRC. 9 The framework aims to help researchers work with other stakeholders to identify the key questions about complex interventions, and to design and conduct research with a diversity of perspectives and appropriate choice of methods.
Summary points
- Complex intervention research can take an efficacy, effectiveness, theory based, and/or systems perspective, the choice of which is based on what is known already and what further evidence would add most to knowledge
- Complex intervention research goes beyond asking whether an intervention works in the sense of achieving its intended outcome, to asking a broader range of questions (eg, identifying what other impact it has, assessing its value relative to the resources required to deliver it, theorising how it works, taking account of how it interacts with the context in which it is implemented, how it contributes to system change, and how the evidence can be used to support real world decision making)
- A trade-off exists between precise unbiased answers to narrow questions and more uncertain answers to broader, more complex questions; researchers should answer the questions that are most useful to decision makers rather than those that can be answered with greater certainty
- Complex intervention research can be considered in terms of phases, although these phases are not necessarily sequential: development or identification of an intervention, assessment of feasibility of the intervention and evaluation design, evaluation of the intervention, and impactful implementation
At each phase, six core elements should be considered to answer the following questions:
- How does the intervention interact with its context?
- What is the underpinning programme theory?
- How can diverse stakeholder perspectives be included in the research?
- What are the key uncertainties?
- How can the intervention be refined?
- What are the comparative resource and outcome consequences of the intervention?
The answers to these questions should be used to decide whether the research should proceed to the next phase, return to a previous phase, repeat a phase, or stop
Development of the Framework for Developing and Evaluating Complex Interventions
The updated Framework for Developing and Evaluating Complex Interventions is the culmination of a process that included four stages:
- A gap analysis to identify developments in the methods and practice since the previous framework was published
- A full day expert workshop in May 2018, attended by 36 participants, to discuss the topics identified in the gap analysis
- An open consultation on a draft of the framework in April 2019, in which we sought written feedback from stakeholders by advertising via social media, email lists, and other networks (52 detailed responses were received from stakeholders internationally)
- A redraft using findings from the previous stages, followed by a final expert review
We also sought stakeholder views at various interactive workshops throughout the development of the framework: at the annual meetings of the Society for Social Medicine and Population Health (2018), the UK Society for Behavioural Medicine (2017, 2018), and internationally at the International Congress of Behavioural Medicine (2018). The entire process was overseen by a scientific advisory group representing the range of relevant NIHR programmes and MRC population health investments. The framework was reviewed by the MRC-NIHR Methodology Research Programme Advisory Group and then approved by the MRC Population Health Sciences Group in March 2020 before undergoing further external peer and editorial review through the NIHR Journals Library peer review process. More detailed information and the methods used to develop this new framework are described elsewhere. 9 This article introduces the framework and summarises the main messages for producers and users of evidence.
What are complex interventions?
An intervention might be considered complex because of properties of the intervention itself, such as the number of components involved; the range of behaviours targeted; expertise and skills required by those delivering and receiving the intervention; the number of groups, settings, or levels targeted; or the permitted level of flexibility of the intervention or its components. For example, the Links Worker Programme was an intervention in primary care in Glasgow, Scotland, that aimed to link people with community resources to help them “live well” in their communities. It targeted individual, primary care (general practitioner (GP) surgery), and community levels. The intervention was flexible in that it could differ between primary care GP surgeries. In addition, the Link Workers did not support just one specific health or wellbeing issue: bereavement, substance use, employment, and learning difficulties were all included. 10 11 The complexity of this intervention had implications for many aspects of its evaluation, such as the choice of appropriate outcomes and processes to assess.
Flexibility in intervention delivery and adherence might be permitted to allow for variation in how, where, and by whom interventions are delivered and received. Standardisation of interventions could relate more to the underlying process and functions of the intervention than to the specific form of the components delivered. 12 For example, in surgical trials, protocols can be designed with flexibility for intervention delivery. 13 Interventions require a theoretical deconstruction into components and then agreement about permissible and prohibited variation in the delivery of those components. This approach allows implementation of a complex intervention to vary across different contexts yet maintain the integrity of the core intervention components. Drawing on this approach in the ROMIO pilot trial, core components of minimally invasive oesophagectomy were agreed and subsequently monitored during main trial delivery using photography. 14
Complexity might also arise through interactions between the intervention and its context, by which we mean “any feature of the circumstances in which an intervention is conceived, developed, implemented and evaluated.” 6 15 16 17 Much of the criticism of and extensions to the existing framework and guidance have focused on the need for greater attention on understanding how and under what circumstances interventions bring about change. 7 15 18 The importance of interactions between the intervention and its context emphasises the value of identifying mechanisms of change, where mechanisms are the causal links between intervention components and outcomes; and contextual factors, which determine and shape whether and how outcomes are generated. 19
Thus, attention is given not only to the design of the intervention itself but also to the conditions needed to realise its mechanisms of change and/or the resources required to support intervention reach and impact in real world implementation. For example, in a cluster randomised trial of ASSIST (a peer led, smoking prevention intervention), researchers found that the intervention worked particularly well in cohesive communities that were served by one secondary school where peer supporters were in regular contact with their peers—a key contextual factor consistent with diffusion of innovation theory, which underpinned the intervention design. 20 A process evaluation conducted alongside a trial of robot assisted surgery identified key contextual factors to support effective implementation of this procedure, including engaging staff at different levels and surgeons who would not be using robot assisted surgery, whole team training, and an operating theatre of suitable size. 21
With this framing, complex interventions can helpfully be considered as events in systems. 16 Thinking about systems helps us understand the interaction between an intervention and the context in which it is implemented in a dynamic way. 22 Systems can be thought of as complex and adaptive, 23 characterised by properties such as emergence, feedback, adaptation, and self-organisation ( table 1 ).
Table 1: Properties and examples of complex adaptive systems
For complex intervention research to be most useful to decision makers, it should take into account the complexity that arises both from the intervention’s components and from its interaction with the context in which it is being implemented.
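The feedback property in particular can be made concrete with a toy model. The sketch below is not part of the framework and all numbers are hypothetical; it simply shows how a small direct intervention effect on smoking uptake compounds over time when prevalence itself feeds back into the social norm that drives uptake.

```python
# Toy sketch of the "feedback" property of a complex adaptive system.
# Not from the framework; every parameter value here is hypothetical.

def simulate(years=10, prevalence=0.25, direct_effect=0.0, norm_feedback=0.5):
    """Yearly smoking prevalence when uptake depends on current prevalence."""
    history = [prevalence]
    for _ in range(years):
        # Uptake falls with the direct intervention effect and, via the social
        # norm feedback, with any fall in prevalence itself.
        norm = 1 + norm_feedback * (prevalence - 0.25) / 0.25
        uptake = max(0.0, 0.05 * norm - direct_effect)
        cessation = 0.04 * prevalence
        prevalence = min(1.0, max(0.0, prevalence + uptake * (1 - prevalence) - cessation))
        history.append(prevalence)
    return history

with_intervention = simulate(direct_effect=0.01)
without_intervention = simulate(direct_effect=0.0)
print(f"Year 10 prevalence with intervention:    {with_intervention[-1]:.3f}")
print(f"Year 10 prevalence without intervention: {without_intervention[-1]:.3f}")
```

Because lower prevalence weakens the norm that drives next year's uptake, the gap between the two scenarios grows beyond what the direct effect alone would produce, which is the kind of dynamic a purely static effect estimate would miss.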
Research perspectives
The previous framework and guidance were based on a paradigm in which the salient question was to identify whether an intervention was effective. Complex intervention research driven primarily by this question could fail to deliver interventions that are implementable, cost effective, transferable, and scalable in real world conditions. To deliver solutions for real world practice, complex intervention research requires strong and early engagement with patients, practitioners, and policy makers, shifting the focus from the “binary question of effectiveness” 26 to whether and how the intervention will be acceptable, implementable, cost effective, scalable, and transferable across contexts. In line with a broader conception of complexity, the scope of complex intervention research needs to include the development, identification, and evaluation of whole system interventions and the assessment of how interventions contribute to system change. 22 27 The new framework therefore takes a pluralistic approach and identifies four perspectives that can be used to guide the design and conduct of complex intervention research: efficacy, effectiveness, theory based, and systems ( table 2 ).
Although each research perspective prompts different types of research question, they should be thought of as overlapping rather than mutually exclusive. For example, theory based and systems perspectives to evaluation can be used in conjunction, 33 while an effectiveness evaluation can draw on a theory based or systems perspective through an embedded process evaluation to explore how and under what circumstances outcomes are achieved. 34 35 36
Most complex health intervention research so far has taken an efficacy or effectiveness perspective and for some research questions these perspectives will continue to be the most appropriate. However, some questions equally relevant to the needs of decision makers cannot be answered by research restricted to an efficacy or effectiveness perspective. A wider range and combination of research perspectives and methods, which answer questions beyond efficacy and effectiveness, need to be used by researchers and supported by funders. Doing so will help to improve the extent to which key questions for decision makers can be answered by complex intervention research. Example questions include:
- Will this effective intervention reproduce the effects found in the trial when implemented here?
- Is the intervention cost effective?
- What are the most important things we need to do that will collectively improve health outcomes?
- In the absence of evidence from randomised trials, and where such a trial is not feasible, what does the existing evidence suggest is the best option now, and how can this be evaluated?
- What wider changes will occur as a result of this intervention?
- How are the intervention effects mediated by different settings and contexts?
Phases and core elements of complex intervention research
The framework divides complex intervention research into four phases: development or identification of the intervention, feasibility, evaluation, and implementation ( fig 1 ). A research programme might begin at any phase, depending on the key uncertainties about the intervention in question. Repeating phases is preferable to automatic progression if uncertainties remain unresolved. Each phase has a common set of core elements—considering context, developing and refining programme theory, engaging stakeholders, identifying key uncertainties, refining the intervention, and economic considerations. These elements should be considered early and continually revisited throughout the research process, and especially before moving between phases (for example, between feasibility testing and evaluation).
Framework for developing and evaluating complex interventions. Key terms used in the figure:
- Context: any feature of the circumstances in which an intervention is conceived, developed, evaluated, and implemented
- Programme theory: describes how an intervention is expected to lead to its effects and under what conditions; the programme theory should be tested and refined at all stages and used to guide the identification of uncertainties and research questions
- Stakeholders: those who are targeted by the intervention or policy, involved in its development or delivery, or more broadly those whose personal or professional interests are affected (that is, who have a stake in the topic); this includes patients and members of the public as well as those linked in a professional capacity
- Uncertainties: the key uncertainties that exist, given what is already known and what the programme theory, research team, and stakeholders identify as being most important to discover; these judgments inform the framing of research questions, which in turn govern the choice of research perspective
- Refinement: the process of fine tuning or making changes to the intervention once a preliminary version (prototype) has been developed
- Economic considerations: determining the comparative resource and outcome consequences of the interventions for those people and organisations affected
Core elements
Context
The effects of a complex intervention might often be highly dependent on context, such that an intervention that is effective in some settings could be ineffective or even harmful elsewhere. 6 As the examples in table 1 show, interventions can modify the contexts in which they are implemented, by eliciting responses from other agents, or by changing behavioural norms or exposure to risk, so that their effects will also vary over time. Context can be considered as both dynamic and multi-dimensional. Key dimensions include physical, spatial, organisational, social, cultural, political, or economic features of the healthcare, health system, or public health contexts in which interventions are implemented. For example, the evaluation of the Breastfeeding In Groups intervention found that the context of the different localities (eg, staff morale and suitable premises) influenced policy implementation and was an explanatory factor in why breastfeeding rates increased in some intervention localities and declined in others. 37
Programme theory
Programme theory describes how an intervention is expected to lead to its effects and under what conditions. It articulates the key components of the intervention and how they interact, the mechanisms of the intervention, the features of the context that are expected to influence those mechanisms, and how those mechanisms might influence the context. 38 Programme theory can be used to promote shared understanding of the intervention among diverse stakeholders, and to identify key uncertainties and research questions. Where an intervention (such as a policy) is developed by others, researchers still need to theorise the intervention before attempting to evaluate it. 39 Best practice is to develop programme theory at the beginning of the research project with involvement of diverse stakeholders, based on evidence and theory from relevant fields, and to refine it during successive phases. The EPOCH trial tested a large scale quality improvement programme aimed at improving 90 day survival rates for patients undergoing emergency abdominal surgery; it included a well articulated programme theory at the outset, which supported the tailoring of programme delivery to local contexts. 40 The development, implementation, and post-study reflection of the programme theory resulted in suggested improvements for future implementation of the quality improvement programme.
A refined programme theory is an important evaluation outcome and is the principal aim where a theory based perspective is taken. Improved programme theory will help inform the transferability of interventions across settings and help produce evidence and understanding that is useful to decision makers. In addition to fully articulating the programme theory in text, it can be helpful to provide a visual representation, for example a logic model, 41 42 43 realist matrix, 44 or system map, 45 with the choice depending on which is most appropriate for the research perspective and research questions. Although useful, any single visual representation is unlikely to articulate the programme theory sufficiently on its own, so the theory should always be fully described in the text of publications, reports, and funding applications.
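Such a representation can also be held in a simple data structure so that hypothesised causal paths can be listed and checked against the written programme theory. The sketch below is not part of the framework; the nodes and edges are invented for an ASSIST-like example.

```python
# Illustrative only: encode a simple logic model as a directed graph and list
# the hypothesised paths from an intervention component to the outcome.
# Nodes and edges are invented, not taken from any published programme theory.
import networkx as nx

logic_model = nx.DiGraph()
logic_model.add_edges_from([
    ("peer supporter training", "peer conversations about smoking"),
    ("peer supporter training", "peer supporters' own attitudes"),
    ("peer supporters' own attitudes", "reduced smoking uptake"),
    ("peer conversations about smoking", "changed social norms"),
    ("changed social norms", "reduced smoking uptake"),
    ("school engagement (contextual factor)", "peer conversations about smoking"),
])

for path in nx.all_simple_paths(logic_model, "peer supporter training", "reduced smoking uptake"):
    print(" -> ".join(path))
```

Listing the paths in this way makes it easy to ask, for each hypothesised mechanism, which data the evaluation will collect to test it.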
Stakeholders
Stakeholders include those individuals who are targeted by the intervention or policy, those involved in its development or delivery, or those whose personal or professional interests are affected (that is, all those who have a stake in the topic). Patients and the public are key stakeholders. Meaningful engagement with appropriate stakeholders at each phase of the research is needed to maximise the potential of developing or identifying an intervention that is likely to have positive impacts on health and to enhance the prospects of achieving changes in policy or practice. For example, patient and public involvement 46 activities in the PARADES programme, which evaluated approaches to reduce harm and improve outcomes for people with bipolar disorder, were wide ranging and central to the project. 47 Involving service users with lived experience of bipolar disorder had many benefits: it not only enhanced the intervention but also improved the evaluation and dissemination methods. Service users involved in the study also had positive outcomes, including more settled employment and progression to further education. Broad thinking and consultation are needed to identify a diverse range of appropriate stakeholders.
The purpose of stakeholder engagement will differ depending on the context and phase of the research, but engagement is essential for prioritising research questions, co-developing programme theory, choosing the most useful research perspective, and overcoming practical obstacles to evaluation and implementation. Researchers should nevertheless be mindful of conflicts of interest among stakeholders and use transparent methods to record potential conflicts of interest. Research should not only elicit stakeholder priorities, but also consider why they are priorities. Careful consideration is needed of the appropriateness of stakeholders and of the methods used to identify and engage them. 46 48
Key uncertainties
Many questions could be answered at each phase of the research process. The design and conduct of research need to engage pragmatically with the multiple uncertainties involved and offer a flexible and emergent approach to exploring them. 15 Therefore, researchers should spend time developing the programme theory, clearly identifying the remaining uncertainties, given what is already known and what the research team and stakeholders identify as being most important to determine. Judgments about the key uncertainties inform the framing of research questions, which in turn govern the choice of research perspective.
Efficacy trials of relatively uncomplicated interventions in tightly controlled conditions, where research questions are answered with great certainty, will always be important, but translation of the evidence into the diverse settings of everyday practice is often highly problematic. 27 For intervention research in healthcare and public health settings to take on more challenging evaluation questions, greater priority should be given to mixed methods, theory based, or systems evaluation that is sensitive to complexity and that emphasises implementation, context, and system fit. This approach could help improve understanding and identify important implications for decision makers, albeit with caveats, assumptions, and limitations. 22 Rather than maintaining the established tendency to prioritise strong research designs that answer some questions with certainty but are unsuited to resolving many important evaluation questions, this more inclusive, deliberative process could place greater value on equivocal findings that nevertheless inform important decisions where evidence is sparse.
Intervention refinement
Within each phase of complex intervention research and on transition from one phase to another, the intervention might need to be refined, on the basis of data collected or development of programme theory. 4 The feasibility and acceptability of interventions can be improved by engaging potential intervention users to inform refinements. For example, an online physical activity planner for people with diabetes mellitus was found to be difficult to use, resulting in the tool providing incorrect personalised advice. To improve usability and the advice given, several iterations of the planner were developed on the basis of interviews and observations. This iterative process led to the refined planner demonstrating greater feasibility and accuracy. 49
Refinements should be guided by the programme theory, with acceptable boundaries agreed and specified at the beginning of each research phase, and with transparent reporting of the rationale for change. Scope for refinement might also be limited by the policy or practice context. Refinement will be rare in the evaluation phase of efficacy and effectiveness research, where interventions will ideally not change or evolve within the course of the study. Between the phases of research, however, and within systems and theory based evaluation studies, refining the intervention in response to accumulated data, or as an adaptive and variable response to context and system change, is likely to be a desirable feature of the intervention and a key focus of the research.
Economic considerations
Economic evaluation—the comparative analysis of alternative courses of action in terms of both costs (resource use) and consequences (outcomes, effects)—should be a core component of all phases of intervention research. Early engagement of economic expertise will help identify the scope of costs and benefits to assess in order to answer questions that matter most to decision makers. 50 Broad ranging approaches such as cost benefit analysis or cost consequence analysis, which seek to capture the full range of health and non-health costs and benefits across different sectors, 51 will often be more suitable for an economic evaluation of a complex intervention than narrower approaches such as cost effectiveness or cost utility analysis. For example, evaluation of the New Orleans Intervention Model for infants entering foster care in Glasgow included short and long term economic analysis from multiple perspectives (the UK’s health service and personal social services, public sector, and wider societal perspectives); and used a range of frameworks, including cost utility and cost consequence analysis, to capture changes in the intersectoral costs and outcomes associated with child maltreatment. 52 53 The use of multiple economic evaluation frameworks provides decision makers with a comprehensive, multi-perspective guide to the cost effectiveness of the New Orleans Intervention Model.
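The basic comparative calculations behind such analyses can be shown in a short sketch. The figures below are hypothetical and do not come from the New Orleans evaluation; they only illustrate how an incremental cost effectiveness ratio (ICER) and a net monetary benefit are computed from incremental costs and outcomes.

```python
# Hypothetical figures only: the core calculations of a cost-effectiveness
# comparison between an intervention and usual care.

cost_intervention, cost_control = 12_000.0, 9_500.0    # cost per participant (£)
qaly_intervention, qaly_control = 1.90, 1.75            # quality adjusted life years

incremental_cost = cost_intervention - cost_control
incremental_effect = qaly_intervention - qaly_control

icer = incremental_cost / incremental_effect             # £ per QALY gained
print(f"ICER: £{icer:,.0f} per QALY gained")

# Net monetary benefit at a chosen willingness-to-pay threshold per QALY.
threshold = 20_000.0
net_monetary_benefit = threshold * incremental_effect - incremental_cost
print(f"Incremental net monetary benefit at £20,000/QALY: £{net_monetary_benefit:,.0f}")
```

A cost consequence analysis would instead tabulate the full range of costs and outcomes across sectors without collapsing them into a single ratio, leaving the weighing of consequences to decision makers.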
Developing or identifying a complex intervention
Development refers to the whole process of designing and planning an intervention, from initial conception through to feasibility, pilot, or evaluation study. Guidance on intervention development has recently been produced through the INDEX study. 4 Here, however, we highlight that complex intervention research does not always begin with new or researcher led interventions. For example:
A key source of intervention development might be an intervention that has been developed elsewhere and has the possibility of being adapted to a new context. Adaptation of existing interventions could include adapting to a new population, to a new setting, 54 55 or to target other outcomes (eg, a smoking prevention intervention being adapted to tackle substance misuse and sexual health). 20 56 57 A well developed programme theory can help identify what features of the antecedent intervention(s) need to be adapted for different applications, and the key mechanisms that should be retained even if delivered slightly differently. 54 58
Policy or practice led interventions are an important focus of evaluation research. Again, uncovering the implicit theoretical basis of an intervention and developing a programme theory is essential to identifying key uncertainties and working out how the intervention might be evaluated. This step is important, even if rollout has begun, because it supports the identification of mechanisms of change, important contextual factors, and relevant outcome measures. For example, researchers evaluating the UK soft drinks industry levy developed a bounded conceptual system map to articulate their understanding (drawing on stakeholder views and document review) of how the intervention was expected to work. This system map guided the evaluation design and helped identify data sources to support evaluation. 45 Another example is a recent analysis of the implicit theory of the NHS diabetes prevention programme, involving analysis of documentation by NHS England and four providers, showing that there was no explicit theoretical basis for the programme, and no logic model showing how the intervention was expected to work. This meant that the justification for the inclusion of intervention components was unclear. 59
Intervention identification and intervention development represent two distinct pathways of evidence generation, 60 but in both cases, the key considerations in this phase relate to the core elements described above.
Feasibility
A feasibility study should be designed to assess predefined progression criteria that relate to the evaluation design (eg, reducing uncertainty around recruitment, data collection, retention, outcomes, and analysis) or the intervention itself (eg, around optimal content and delivery, acceptability, adherence, likelihood of cost effectiveness, or capacity of providers to deliver the intervention). If the programme theory suggests that contextual or implementation factors might influence the acceptability, effectiveness, or cost effectiveness of the intervention, these questions should be considered.
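One way to make progression criteria explicit is to predefine traffic light thresholds and check observed feasibility data against them. The sketch below is illustrative only; the criteria, thresholds, and observed values are invented.

```python
# Illustrative traffic-light progression criteria for a feasibility study.
# All criteria and numbers are invented for the example.

criteria = {
    # name: (red_below, green_at_or_above)
    "recruitment rate (participants per site per month)": (2.0, 4.0),
    "retention at follow up": (0.60, 0.80),
    "intervention adherence": (0.50, 0.70),
}

observed = {
    "recruitment rate (participants per site per month)": 3.1,
    "retention at follow up": 0.83,
    "intervention adherence": 0.55,
}

def rating(value, red_below, green_at_or_above):
    if value < red_below:
        return "RED: stop or redesign"
    if value >= green_at_or_above:
        return "GREEN: proceed"
    return "AMBER: refine the intervention or study procedures, then reassess"

for name, (red, green) in criteria.items():
    print(f"{name}: {observed[name]} -> {rating(observed[name], red, green)}")
```

Amber results point to the parts of the intervention or evaluation design that need refinement before, or instead of, progression to a full evaluation.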
Although feasibility testing has sometimes been overlooked or rushed in the past, its value is now widely accepted and key terms and concepts are well defined. 61 62 Before initiating a feasibility study, researchers should consider conducting an evaluability assessment to determine whether and how an intervention can usefully be evaluated. Evaluability assessment involves collaboration with stakeholders to reach agreement on the expected outcomes of the intervention, the data that could be collected to assess processes and outcomes, and the options for designing the evaluation. 63 The end result is a recommendation on whether an evaluation is feasible, whether it can be carried out at a reasonable cost, and by which methods. 64
Economic modelling can be undertaken at the feasibility stage to assess the likelihood that the expected benefits of the intervention justify the costs (including the cost of further research), and to help decision makers decide whether proceeding to a full scale evaluation is worthwhile. 65 Depending on the results of the feasibility study, further work might be required to progressively refine the intervention before embarking on a full scale evaluation.
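A simple way to illustrate such early modelling is to propagate uncertainty in the incremental cost and effect and report the probability that the intervention is cost effective at a given threshold. The sketch below uses entirely hypothetical distributions and is not a substitute for a formal decision model or value of information analysis.

```python
# Hypothetical early economic model: Monte Carlo propagation of uncertainty in
# incremental cost and effect to the probability of cost effectiveness.
import numpy as np

rng = np.random.default_rng(42)
n = 10_000
incremental_cost = rng.normal(2_500, 1_000, n)   # £ per participant (assumed)
incremental_qaly = rng.normal(0.15, 0.10, n)     # QALYs per participant (assumed)
threshold = 20_000                                # £ per QALY

net_benefit = threshold * incremental_qaly - incremental_cost
probability_cost_effective = (net_benefit > 0).mean()
print(f"Probability cost effective at £20,000/QALY: {probability_cost_effective:.2f}")
```

If that probability is far from 0 or 1, the expected value of reducing the uncertainty, and hence of a full scale evaluation, is likely to be higher.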
Evaluation
The new framework defines evaluation as going beyond asking whether an intervention works (in the sense of achieving its intended outcome), to a broader range of questions including identifying what other impact it has, theorising how it works, taking account of how it interacts with the context in which it is implemented, how it contributes to system change, and how the evidence can be used to support decision making in the real world. This implies a shift from an exclusive focus on obtaining unbiased estimates of effectiveness 66 towards prioritising the usefulness of information for decision making, both in selecting the optimal research perspective and in prioritising answerable research questions.
A crucial aspect of evaluation design is the choice of outcome measures or evidence of change. Evaluators should work with stakeholders to assess which outcomes are most important, and how to deal with multiple outcomes in the analysis with due consideration of statistical power and transparent reporting. A sharp distinction between one primary outcome and several secondary outcomes is not necessarily appropriate, particularly where the programme theory identifies impacts across a range of domains. Where needed to support the research questions, prespecified subgroup analyses should be carried out and reported. Even where such analyses are underpowered, they should be included in the protocol because they might be useful for subsequent meta-analyses, or for developing hypotheses for testing in further research. Outcome measures could capture changes to a system rather than changes in individuals. Examples include changes in relationships within an organisation, the introduction of policies, changes in social norms, or normalisation of practice. Such system level outcomes include how changing the dynamics of one part of a system alters behaviours in other parts, such as the potential for displacement of smoking into the home after a public smoking ban.
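Where several prespecified outcomes are analysed, the multiplicity can be handled transparently with a standard adjustment. The sketch below, using invented p values, shows a Holm step-down adjustment, which controls the family-wise error rate while being less conservative than a plain Bonferroni correction; it is only one of several options and should be prespecified in the analysis plan.

```python
# Illustrative Holm step-down adjustment for several prespecified outcomes.
# The outcomes and p values are invented.

def holm_adjust(p_values):
    """Return Holm-adjusted p values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[i])
        running_max = max(running_max, candidate)
        adjusted[i] = running_max
    return adjusted

raw = {"quality of life": 0.012, "hospital admissions": 0.030, "wellbeing": 0.200}
for name, adj in zip(raw, holm_adjust(list(raw.values()))):
    print(f"{name}: raw p = {raw[name]:.3f}, Holm-adjusted p = {adj:.3f}")
```

The same principle applies whether the outcomes are individual or system level; what matters is that the family of outcomes and the adjustment rule are stated in advance.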
A helpful illustration of the use of system level outcomes is the evaluation of the Delaware Young Health Program—an initiative to improve the health and wellbeing of young people in Delaware, USA. The intervention aimed to change underlying system dynamics, structures, and conditions, so the evaluation identified systems oriented research questions and methods. Three systems science methods were used: group model building and viable systems model assessment to identify underlying patterns and structures; and social network analysis to evaluate change in relationships over time. 67
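As a rough illustration of such system level measurement, the sketch below compares a partnership network before and after an initiative using network density and degree centrality. The organisations and ties are invented and are not data from the Delaware evaluation.

```python
# Illustrative social network comparison for a system level outcome.
# Organisations and ties are invented, not data from any real evaluation.
import networkx as nx

before = nx.Graph([
    ("youth services", "schools"),
    ("schools", "health department"),
])
after = nx.Graph([
    ("youth services", "schools"),
    ("schools", "health department"),
    ("health department", "youth services"),
    ("community groups", "schools"),
    ("community groups", "health department"),
])

for label, network in [("before", before), ("after", after)]:
    centrality = nx.degree_centrality(network)
    most_connected = max(centrality, key=centrality.get)
    print(f"{label}: density = {nx.density(network):.2f}, most connected = {most_connected}")
```

Repeated measurement of such network properties over time can show whether relationships in the system are changing in the direction the programme theory predicts.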
Researchers have many study designs to choose from, and different designs are optimally suited to consider different research questions and different circumstances. 68 Extensions to standard designs of randomised controlled trials (including adaptive designs, SMART trials (sequential multiple assignment randomised trials), n-of-1 trials, and hybrid effectiveness-implementation designs) are important areas of methods development to improve the efficiency of complex intervention research. 69 70 71 72 Non-randomised designs and modelling approaches might work best if a randomised design is not practical, for example, in natural experiments or systems evaluations. 5 73 74 A purely quantitative approach, using an experimental design with no additional elements such as a process evaluation, is rarely adequate for complex intervention research, where qualitative and mixed methods designs might be necessary to answer questions beyond effectiveness. In many evaluations, the nature of the intervention, the programme theory, or the priorities of stakeholders could lead to a greater focus on improving theories about how to intervene. In this view, effect estimates are inherently context bound, so that average effects are not a useful guide to decision makers working in different contexts. Contextualised understandings of how an intervention induces change might be more useful, as well as details on the most important enablers and constraints on its delivery across a range of settings. 7
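Simulation can also support design choices, for example when estimating power for a cluster randomised trial in which outcomes are correlated within clusters. The sketch below uses assumed parameters (standardised effect size, intracluster correlation, cluster size) and a simple cluster-level analysis; it is illustrative rather than a recommended method for any particular trial.

```python
# Illustrative simulation-based power estimate for a cluster randomised trial,
# analysed at the cluster level. All design parameters are assumed values.
import numpy as np
from scipy import stats

def simulated_power(n_clusters_per_arm=12, cluster_size=30, effect=0.3,
                    icc=0.05, n_sims=2000, alpha=0.05, seed=1):
    rng = np.random.default_rng(seed)
    between_sd = np.sqrt(icc)        # total outcome variance fixed at 1
    within_sd = np.sqrt(1 - icc)
    rejections = 0
    for _ in range(n_sims):
        def arm_cluster_means(shift):
            cluster_effects = rng.normal(shift, between_sd, n_clusters_per_arm)
            return np.array([
                rng.normal(mu, within_sd, cluster_size).mean()
                for mu in cluster_effects
            ])
        control = arm_cluster_means(0.0)
        treatment = arm_cluster_means(effect)
        p_value = stats.ttest_ind(treatment, control).pvalue
        rejections += p_value < alpha
    return rejections / n_sims

print(f"Estimated power: {simulated_power():.2f}")
```

Varying the intracluster correlation or the number of clusters in such a simulation makes the trade-offs between design options explicit before resources are committed.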
Process evaluation can answer questions around fidelity and quality of implementation (eg, what is implemented and how?), mechanisms of change (eg, how does the delivered intervention produce change?), and context (eg, how does context affect implementation and outcomes?). 7 Process evaluation can help determine why an intervention fails unexpectedly or has unanticipated consequences, or why it works and how it can be optimised. Such findings can facilitate further development of the intervention programme theory. 75 In a theory based or systems evaluation, there is not necessarily such a clear distinction between process and outcome evaluation as there is in an effectiveness study. 76 These perspectives could prioritise theory building over evidence production and use case study or simulation methods to understand how outcomes or system behaviour are generated through intervention. 74 77
Implementation
Early consideration of implementation increases the potential of developing an intervention that can be widely adopted and maintained in real world settings. Implementation questions should be anticipated in the intervention programme theory, and considered throughout the phases of intervention development, feasibility testing, process, and outcome evaluation. Alongside implementation specific outcomes (such as reach or uptake of services), attention to the components of the implementation strategy, and contextual factors that support or hinder the achievement of impacts, are key. Some flexibility in intervention implementation might support intervention transferability into different contexts (an important aspect of long term implementation 78 ), provided that the key functions of the programme are maintained, and that the adaptations made are clearly understood. 8
In the ASSIST study, 20 a school based, peer led intervention for smoking prevention, researchers considered implementation at each phase. The intervention was developed to cause minimal disruption to school resources; the feasibility study led to refinements that improved acceptability and reach among male students; and in the evaluation (a cluster randomised controlled trial), the intervention was delivered as closely as possible to real world implementation. Drawing on the process evaluation, the implementation package included an intervention manual that identified critical components, as well as components that could be adapted or dropped to allow flexible delivery while preserving the key mechanisms of change; a training manual for the trainers; and ongoing quality assurance built into rollout for the longer term.
In a natural experimental study, evaluation takes place during or after the implementation of the intervention in a real world context. Highly pragmatic effectiveness trials or specific hybrid effectiveness-implementation designs also combine effectiveness and implementation outcomes in one study, with the aim of reducing time for translation of research on effectiveness into routine practice. 72 79 80
Implementation questions should be included in economic considerations during the early stages of intervention and study development. How the results of economic analyses are reported and presented to decision makers can affect whether and how they act on the results. 81 A key consideration is how to deal with interventions across different sectors, where those paying for interventions and those receiving the benefits of them could differ, reducing the incentive to implement an intervention, even if shown to be beneficial and cost effective. Early engagement with appropriate stakeholders will help frame appropriate research questions and could anticipate any implementation challenges that might arise. 82
Conclusions
One of the motivations for developing this new framework was to answer calls for a change in research priorities, towards allocating greater effort and funding to research that can have the optimum impact on healthcare or population health outcomes. The framework challenges the view that unbiased estimates of effectiveness are the cardinal goal of evaluation. It asserts that improving theories and understanding how interventions contribute to change, including how they interact with their context and wider dynamic systems, is an equally important goal. For some complex intervention research problems, an efficacy or effectiveness perspective will be the optimal approach, and a randomised controlled trial will provide the best design to achieve an unbiased estimate. For others, alternative perspectives and designs might work better, or might be the only way to generate new knowledge to reduce decision maker uncertainty.
What is important for the future is that the scope of intervention research is not constrained by an unduly limited set of perspectives and approaches that might be less risky to commission and more likely to produce a clear and unbiased answer to a specific question. A bolder approach is needed—to include methods and perspectives where experience is still quite limited, but where we, supported by our workshop participants and respondents to our consultations, believe there is an urgent need to make progress. This endeavour will involve mainstreaming new methods that are not yet widely used, as well as undertaking methodological innovation and development. The deliberative and flexible approach that we encourage is intended to reduce research waste, 83 maximise usefulness for decision makers, and increase the efficiency with which complex intervention research generates knowledge that contributes to health improvement.
Monitoring the use of the framework and evaluating its acceptability and impact is important but has been lacking in the past. We encourage research funders and journal editors to support the diversity of research perspectives and methods that are advocated here and to seek evidence that the core elements are attended to in research design and conduct. We have developed a checklist to support the preparation of funding applications, research protocols, and journal publications. 9 This checklist offers one way to monitor impact of the guidance on researchers, funders, and journal editors.
We recommend that the guidance is continually updated, and future updates continue to adopt a broad, pluralist perspective. Given its wider scope, and the range of detailed guidance that is now available on specific methods and topics, we believe that the framework is best seen as meta-guidance. Further editions should be published in a fluid, web based format, and more frequently updated to incorporate new material, further case studies, and additional links to other new resources.
Acknowledgments
We thank the experts who provided input at the workshop, those who responded to the consultation, and those who provided advice and review throughout the process. The many people involved are acknowledged in the full framework document. 9 Parts of this manuscript have been reproduced (some with edits and formatting changes), with permission, from that longer framework document.
Contributors: All authors made a substantial contribution to all stages of the development of the framework—they contributed to its development, drafting, and final approval. KS and LMa led the writing of the framework, and KS wrote the first draft of this paper. PC, SAS, and LMo provided critical insights to the development of the framework and contributed to writing both the framework and this paper. KS, LMa, SAS, PC, and LMo facilitated the expert workshop, KS and LMa developed the gap analysis and led the analysis of the consultation. KAB, NC, and EM contributed the economic components to the framework. The scientific advisory group (JB, JMB, DPF, MP, JR-M, and MW) provided feedback and edits on drafts of the framework, with particular attention to process evaluation (JB), clinical research (JMB), implementation (JR-M, DPF), systems perspective (MP), theory based perspective (JR-M), and population health (MW). LMo is senior author. KS and LMo are the guarantors of this work and accept the full responsibility for the finished article. The corresponding author attests that all listed authors meet authorship criteria and that no others meeting authorship criteria have been omitted.
Funding: The work was funded by the National Institute for Health Research (Department of Health and Social Care 73514) and Medical Research Council (MRC). Additional time on the study was funded by grants from the MRC for KS (MC_UU_12017/11, MC_UU_00022/3), LMa, SAS, and LMo (MC_UU_12017/14, MC_UU_00022/1); PC (MC_UU_12017/15, MC_UU_00022/2); and MW (MC_UU_12015/6 and MC_UU_00006/7). Additional time on the study was also funded by grants from the Chief Scientist Office of the Scottish Government Health Directorates for KS (SPHSU11 and SPHSU18); LMa, SAS, and LMo (SPHSU14 and SPHSU16); and PC (SPHSU13 and SPHSU15). KS and SAS were also supported by an MRC Strategic Award (MC_PC_13027). JMB received funding from the NIHR Biomedical Research Centre at University Hospitals Bristol NHS Foundation Trust and the University of Bristol and by the MRC ConDuCT-II Hub (Collaboration and innovation for Difficult and Complex randomised controlled Trials In Invasive procedures - MR/K025643/1). DF is funded in part by the NIHR Manchester Biomedical Research Centre (IS-BRC-1215-20007) and NIHR Applied Research Collaboration - Greater Manchester (NIHR200174). MP is funded in part as director of the NIHR’s Public Health Policy Research Unit. This project was overseen by a scientific advisory group that comprised representatives of NIHR research programmes, of the MRC/NIHR Methodology Research Programme Panel, of key MRC population health research investments, and authors of the 2006 guidance. A prospectively agreed protocol, outlining the workplan, was agreed with MRC and NIHR, and signed off by the scientific advisory group. The framework was reviewed and approved by the MRC/NIHR Methodology Research Programme Advisory Group and MRC Population Health Sciences Group and completed NIHR HTA Monograph editorial and peer review processes.
Competing interests: All authors have completed the ICMJE uniform disclosure form at http://www.icmje.org/coi_disclosure.pdf and declare: support from the NIHR, MRC, and the funders listed above for the submitted work; KS has project grant funding from the Scottish Government Chief Scientist Office; SAS is a former member of the NIHR Health Technology Assessment Clinical Evaluation and Trials Programme Panel (November 2016 - November 2020) and member of the Chief Scientist Office Health HIPS Committee (since 2018) and NIHR Policy Research Programme (since November 2019), and has project grant funding from the Economic and Social Research Council, MRC, and NIHR; LMo is a former member of the MRC-NIHR Methodology Research Programme Panel (2015-19) and MRC Population Health Sciences Group (2015-20); JB is a member of the NIHR Public Health Research Funding Committee (since May 2019), and a core member (since 2016) and vice chairperson (since 2018) of a public health advisory committee of the National Institute for Health and Care Excellence; JMB is a former member of the NIHR Clinical Trials Unit Standing Advisory Committee (2015-19); DPF is a former member of the NIHR Public Health Research programme research funding board (2015-2019), the MRC-NIHR Methodology Research Programme panel member (2014-2018), and is a panel member of the Research Excellence Framework 2021, subpanel 2 (public health, health services, and primary care; November 2020 - February 2022), and has grant funding from the European Commission, NIHR, MRC, Natural Environment Research Council, Prevent Breast Cancer, Breast Cancer Now, Greater Sport, Manchester University NHS Foundation Trust, Christie Hospital NHS Trust, and BXS GP; EM is a member of the NIHR Public Health Research funding board; MP has grant funding from the MRC, UK Prevention Research Partnership, and NIHR; JR-M is programme director and chairperson of the NIHR’s Health Services Delivery Research Programme (since 2014) and member of the NIHR Strategy Board (since 2014); MW received a salary as director of the NIHR PHR Programme (2014-20), has grant funding from NIHR, and is a former member of the MRC’s Population Health Sciences Strategic Committee (July 2014 to June 2020). There are no other relationships or activities that could appear to have influenced the submitted work.
Patient and public involvement: This project was methodological; views of patients and the public were included at the open consultation stage of the update. The open consultation, involving access to an initial draft, was promoted to our networks via email and digital channels, such as our unit Twitter account ( @theSPHSU ). We received five responses from people who identified as service users (rather than researchers or professionals in a relevant capacity). Their input included helpful feedback on the main complexity diagram, the different research perspectives, the challenge of moving interventions between different contexts and overall readability and accessibility of the document. Several respondents also highlighted useful signposts to include for readers. Various dissemination events are planned, but as this project is methodological we will not specifically disseminate to patients and the public beyond the planned dissemination activities.
Provenance and peer review: Not commissioned; externally peer reviewed.
- 1. Craig P, Dieppe P, Macintyre S, Michie S, Nazareth I, Petticrew M, Medical Research Council Guidance . Developing and evaluating complex interventions: the new Medical Research Council guidance. BMJ 2008;337:a1655. 10.1136/bmj.a1655. [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
- 2. Craig P, Dieppe P, Macintyre S, et al. Developing and evaluating complex interventions: new guidance. Medical Research Council, 2006. [ Google Scholar ]
- 3. Campbell M, Fitzpatrick R, Haines A, et al. Framework for design and evaluation of complex interventions to improve health. BMJ 2000;321:694-6. 10.1136/bmj.321.7262.694. [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
- 4. O’Cathain A, Croot L, Duncan E, et al. Guidance on how to develop complex interventions to improve health and healthcare. BMJ Open 2019;9:e029954. 10.1136/bmjopen-2019-029954 [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
- 5. Craig P, Cooper C, Gunnell D, et al. Using natural experiments to evaluate population health interventions: new Medical Research Council guidance. J Epidemiol Community Health 2012;66:1182-6. 10.1136/jech-2011-200375 [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
- 6. Craig P, Ruggiero ED, Frohlich KL, et al. Taking account of context in population health intervention research: guidance for producers, users and funders of research. NIHR Journals Library, 2018 10.3310/CIHR-NIHR-01 . [ DOI ] [ Google Scholar ]
- 7. Moore GF, Audrey S, Barker M, et al. Process evaluation of complex interventions: Medical Research Council guidance. BMJ 2015;350:h1258. 10.1136/bmj.h1258 [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
- 8. Moore G, Campbell M, Copeland L, et al. Adapting interventions to new contexts-the ADAPT guidance. BMJ 2021;374:n1679. 10.1136/bmj.n1679 [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
- 9. Skivington K, Matthews L, Simpson SA, et al. Framework for the development and evaluation of complex interventions: gap analysis, workshop and consultation-informed update. Health Technol Assess 2021. [forthcoming]. [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
- 10. Chng NR, Hawkins K, Fitzpatrick B, et al. Implementing social prescribing in primary care in areas of high socioeconomic deprivation: process evaluation of the ‘Deep End’ community links worker programme. Br J Gen Pract 2021;1153:BJGP.2020.1153. 10.3399/BJGP.2020.1153 [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
- 11. Mercer SW, Fitzpatrick B, Grant L, et al. Effectiveness of Community-Links Practitioners in Areas of High Socioeconomic Deprivation. Ann Fam Med 2019;17:518-25. 10.1370/afm.2429 [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
- 12. Hawe P, Shiell A, Riley T. Complex interventions: how “out of control” can a randomised controlled trial be? BMJ 2004;328:1561-3. 10.1136/bmj.328.7455.1561 [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
- 13. Blencowe NS, Mills N, Cook JA, et al. Standardizing and monitoring the delivery of surgical interventions in randomized clinical trials. Br J Surg 2016;103:1377-84. 10.1002/bjs.10254 [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
- 14. Blencowe NS, Skilton A, Gaunt D, et al. ROMIO Study team . Protocol for developing quality assurance measures to use in surgical trials: an example from the ROMIO study. BMJ Open 2019;9:e026209. 10.1136/bmjopen-2018-026209 [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
- 15. Greenhalgh T, Papoutsi C. Studying complexity in health services research: desperately seeking an overdue paradigm shift. BMC Med 2018;16:95. 10.1186/s12916-018-1089-4 [ DOI ] [ PMC free article ] [ PubMed ] [ Google Scholar ]
- 16. Hawe P, Shiell A, Riley T. Theorising interventions as events in systems. Am J Community Psychol 2009;43:267-76. 10.1007/s10464-009-9229-9 [ DOI ] [ PubMed ] [ Google Scholar ]
Understanding Effectiveness Evaluation: Definition, Benefits, and Best Practices
This section explains what effectiveness evaluation is, why it matters, and how it helps assess program performance and inform decision-making. It also covers the main types of effectiveness evaluation, best practices for conducting one, common challenges and limitations, and future directions for research and practice.
Table of Contents
- What is Effectiveness Evaluation and Why is it Important?
- Purpose and Goals of Effectiveness Evaluation
- Benefits of Effectiveness Evaluation
- Types of Effectiveness Evaluation
- Best Practices for Conducting Effectiveness Evaluation
- Examples of Effectiveness Evaluation in Practice
- Examples of Effectiveness Evaluation Questions
- Challenges and Limitations of Effectiveness Evaluation
- Future Directions for Effectiveness Evaluation Research and Practice
Effectiveness evaluation is a process of assessing the extent to which a program, policy, or intervention achieves its intended objectives or goals. It involves measuring the outcomes and impacts of a program to determine whether it is producing the desired results and meeting the needs of the target population.
Effectiveness evaluation is important for several reasons. First, it provides information on whether a program is working or not, and if it is not working, why it is not working. This information can be used to make necessary changes to the program to improve its effectiveness.
Second, effectiveness evaluation helps to demonstrate accountability and transparency to stakeholders such as funders, policymakers, and the public. It shows that the program is being monitored and evaluated regularly to ensure that it is achieving its intended outcomes.
Third, effectiveness evaluation provides evidence of the impact of the program, which can be used to inform future decision-making and resource allocation. It helps to identify best practices and areas for improvement that can be applied to similar programs in the future.
Evaluating program effectiveness is therefore an essential component of program management, and it plays a significant role in ensuring that programs are impactful, efficient, and effective in meeting their goals.
The primary purpose of effectiveness evaluation is to measure the outcomes and impacts of a program, policy, or intervention and to determine whether it is producing the desired results and meeting the needs of the target population.
In addition to measuring outcomes, effectiveness evaluation also aims to identify the strengths and weaknesses of the program. This includes determining what is working well and what needs improvement. By identifying these strengths and weaknesses, effectiveness evaluation provides valuable information that can be used to make necessary changes to the program to improve its effectiveness.
Another goal of effectiveness evaluation is to inform decision-making and resource allocation. By providing evidence of the impact of the program, effectiveness evaluation helps stakeholders to make informed decisions about the allocation of resources and to identify best practices that can be applied to similar programs in the future.
In short, effectiveness evaluation exists to ensure that programs are impactful, efficient, and effective in meeting their objectives. The information it provides about program performance can be used to improve the program, to inform decision-making, and to guide the allocation of resources.
There are several types of effectiveness evaluation that can be used to assess program performance.
- The first type is outcome evaluation, which assesses whether a program is achieving its intended outcomes. This involves measuring changes in knowledge, attitudes, behaviors, and other indicators of program success. Outcome evaluation is often used to determine whether a program is achieving its intended impact on the target population.
- The second type is impact evaluation, which assesses the overall impact of a program on the target population. Impact evaluation looks at both intended and unintended outcomes of a program and examines how the program has affected the lives of those it serves.
- The third type is cost-effectiveness evaluation, which assesses the cost-effectiveness of a program. This involves comparing the costs of implementing a program to the outcomes and impacts it produces. Cost-effectiveness evaluation is often used to determine whether a program is efficient and whether it is achieving its intended outcomes at a reasonable cost.
- The fourth type is process evaluation, which assesses how well a program is being implemented. This involves examining how the program is being delivered and whether it is being delivered as intended. Process evaluation is often used to identify areas where program implementation can be improved and to ensure that the program is being delivered consistently across different settings.
These types of effectiveness evaluation can be combined to provide a more complete picture of how well a program is working. Using several evaluation approaches gives organizations a more nuanced understanding of their programs’ effectiveness and of where they can be strengthened. The short sketch below illustrates one of these components, a basic cost-effectiveness comparison.
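To make the cost-effectiveness component concrete, here is a minimal sketch that compares two options by average cost per outcome and by the incremental cost-effectiveness ratio (ICER). The option names, costs, and outcome counts are invented for illustration and are not drawn from any real evaluation.

```python
# Minimal cost-effectiveness sketch with illustrative (made-up) figures.
# Each option is described by its total cost and the outcome it produced,
# for example the number of participants who achieved the target outcome.

options = {
    "usual_practice": {"cost": 50_000.0, "outcome": 100},  # hypothetical
    "new_program":    {"cost": 90_000.0, "outcome": 160},  # hypothetical
}

# Average cost per unit of outcome for each option.
for name, o in options.items():
    print(f"{name}: {o['cost'] / o['outcome']:.2f} per outcome achieved")

# Incremental cost-effectiveness ratio (ICER): the extra cost of the new
# program divided by the extra outcome it delivers over usual practice.
extra_cost = options["new_program"]["cost"] - options["usual_practice"]["cost"]
extra_outcome = options["new_program"]["outcome"] - options["usual_practice"]["outcome"]
print(f"ICER: {extra_cost / extra_outcome:.2f} per additional outcome")
```

Whether an ICER like this represents good value is a separate judgment that depends on how much the decision-maker is willing to pay for each additional unit of outcome.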
Effectiveness evaluation is a critical process that requires careful planning and execution to ensure that it produces reliable and useful results. To conduct an effective evaluation, it is important to clearly define program objectives, choose appropriate evaluation methods, collect high-quality data, use a comparison group, and involve stakeholders in the evaluation process.
Clear definition of program objectives helps to ensure that the evaluation measures the right outcomes and that the results are relevant to the program’s goals. Choosing appropriate evaluation methods requires careful consideration of the type of program being evaluated and the most appropriate method for measuring its outcomes.
Collecting high-quality data is essential to the accuracy and reliability of the evaluation results. Using validated instruments and well-designed surveys can help to ensure that the data collected is of high quality and that it is collected consistently across different settings.
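As one illustration of checking data quality, the sketch below computes Cronbach’s alpha, a widely used measure of the internal consistency of a multi-item scale, for a small set of made-up survey responses. The response data and the common 0.7 rule of thumb are assumptions used only for illustration.

```python
import numpy as np

# Rows are respondents, columns are items of a job-satisfaction scale scored 1-5
# (made-up data for illustration).
responses = np.array([
    [4, 5, 4, 4],
    [3, 3, 2, 3],
    [5, 5, 4, 5],
    [2, 1, 2, 2],
    [4, 4, 5, 4],
])

def cronbach_alpha(items: np.ndarray) -> float:
    """alpha = k/(k-1) * (1 - sum of item variances / variance of total score)."""
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)
    total_variance = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

print(f"Cronbach's alpha = {cronbach_alpha(responses):.2f}")
# Values above roughly 0.7 are often treated as acceptable internal consistency.
```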
Using a comparison group can help to determine whether the program is responsible for any observed changes in outcomes. This can help to ensure that the results of the evaluation are reliable and valid.
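The sketch below illustrates the basic logic of a comparison group: it contrasts hypothetical outcome scores for program participants and a comparison group using a difference in means and a two-sample t-test from scipy. The scores are invented, and in practice the credibility of such a comparison also depends on how the two groups were formed.

```python
import numpy as np
from scipy import stats

# Hypothetical post-program outcome scores (for example, test scores).
program_group = np.array([72, 75, 78, 74, 80, 77, 73, 79])
comparison_group = np.array([70, 68, 74, 71, 69, 72, 70, 73])

diff = program_group.mean() - comparison_group.mean()
t_stat, p_value = stats.ttest_ind(program_group, comparison_group, equal_var=False)

print(f"Difference in means: {diff:.1f} points")
print(f"Welch t-test: t = {t_stat:.2f}, p = {p_value:.3f}")
# A small p-value suggests the difference is unlikely to be chance alone,
# but it does not by itself show that the program caused the difference.
```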
Finally, involving stakeholders in the evaluation process can help to ensure that the evaluation is relevant and useful. Stakeholders can provide feedback on the evaluation design, help to interpret the results and use the results to make program improvements.
In general, adhering to these best practices is essential for producing reliable and useful results that can be used to improve program performance and to make informed decisions about resource allocation.
Effectiveness evaluation can be used in a variety of settings to assess program performance and inform decision-making. Here are some examples of effectiveness evaluation in practice:
- Healthcare: Effectiveness evaluation is commonly used in healthcare to assess the impact of medical interventions, treatments, and public health programs. For example, effectiveness evaluation may be used to determine the impact of a new medication on patient outcomes, the effectiveness of a disease prevention program, or the impact of a public health campaign to promote healthy behaviors.
- Education: Effectiveness evaluation is also used in education to assess the impact of educational programs, teaching strategies, and interventions. For example, effectiveness evaluation may be used to determine the impact of a new teaching method on student learning outcomes or to assess the effectiveness of a literacy intervention program.
- Social services: Effectiveness evaluation is also used in social services to assess the impact of programs designed to improve the well-being of individuals and communities. For example, effectiveness evaluation may be used to assess the impact of a job training program on employment outcomes or the effectiveness of a community-based program to reduce substance abuse.
- Environmental programs: Effectiveness evaluation is also used in environmental programs to assess the impact of policies and interventions designed to protect the environment and promote sustainability. For example, effectiveness evaluation may be used to assess the impact of a new recycling program on waste reduction or to assess the effectiveness of a policy to reduce carbon emissions.
- International Development: Effectiveness evaluation is also used in international development to assess the impact of aid and development programs in low-income countries. For example, effectiveness evaluation may be used to assess the impact of a health program aimed at reducing the prevalence of a specific disease or the effectiveness of a program aimed at increasing access to education.
- Non-profit sector: Effectiveness evaluation is also used in the non-profit sector to assess the impact of programs aimed at addressing social issues. For example, effectiveness evaluation may be used to assess the impact of a program aimed at reducing poverty or homelessness or to assess the effectiveness of a program aimed at increasing access to healthcare for low-income individuals.
- Government programs: Effectiveness evaluation is also used by governments to assess the impact of public policies and programs. For example, effectiveness evaluation may be used to assess the impact of a policy aimed at reducing crime or to assess the effectiveness of a program aimed at increasing access to affordable housing.
- Business sector: Effectiveness evaluation is also used in the business sector to assess the impact of programs aimed at improving business operations, customer satisfaction, and employee engagement. For example, effectiveness evaluation may be used to assess the impact of a new customer service training program on customer satisfaction or to assess the effectiveness of a program aimed at increasing employee retention.
- Technology sector: Effectiveness evaluation is also used in the technology sector to assess the impact of software, apps, and other technological tools. For example, effectiveness evaluation may be used to assess the impact of a new app designed to improve mental health outcomes or to assess the effectiveness of a software program aimed at improving productivity in the workplace.
- Public health: Effectiveness evaluation is widely used in public health to evaluate the effectiveness of various public health interventions, including vaccination programs, health education campaigns, and disease control initiatives. For example, effectiveness evaluation may be used to assess the impact of a vaccination program on disease prevalence or to assess the effectiveness of a health education campaign aimed at reducing tobacco use.
In conclusion, effectiveness evaluation is a versatile tool that can be applied to a wide variety of programs and interventions across fields such as healthcare, education, social services, international development, the non-profit sector, government, business, technology, and public health. Applied well, it helps ensure that programs and interventions achieve the outcomes they were designed to achieve and improve the lives of individuals and communities.
The specific questions that an effectiveness evaluation seeks to answer will depend on the goals and objectives of the program or intervention being evaluated. However, some common questions that an effectiveness evaluation might seek to answer include:
- To what extent has the program achieved its intended outcomes and objectives?
- What evidence is there that the program has had an impact on the target population?
- Has the program had any unintended consequences or negative impacts?
- What factors have contributed to the program’s success or failure?
- To what extent have program resources been used effectively and efficiently?
- What is the long-term impact of the program on the target population and the broader community?
- How does the program compare to other similar programs in terms of effectiveness and efficiency?
- Are there any opportunities for improvement or modifications that could increase the program’s effectiveness?
- Has the program been implemented as planned and if not, how has this affected program outcomes?
- How sustainable is the program in terms of its long-term viability and ongoing impact?
These questions are not exhaustive, and the specific questions asked in an effectiveness evaluation will depend on the unique characteristics of the program or intervention being evaluated. A common way to make them answerable is to map each question to concrete indicators and data sources, as sketched below.
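A simple way to do this is an evaluation matrix that links each question to indicators, data sources, and methods. Below is a minimal sketch; the specific indicators and data sources are illustrative placeholders rather than recommendations for any particular program.

```python
# A minimal evaluation matrix: each entry links an evaluation question to the
# indicators, data sources, and methods that could answer it (illustrative only).
evaluation_matrix = [
    {
        "question": "To what extent has the program achieved its intended outcomes?",
        "indicators": ["% of participants meeting the target outcome"],
        "data_sources": ["baseline and follow-up surveys"],
        "methods": ["pre/post comparison against a comparison group"],
    },
    {
        "question": "Have program resources been used effectively and efficiently?",
        "indicators": ["cost per participant", "cost per outcome achieved"],
        "data_sources": ["program budget records", "monitoring data"],
        "methods": ["cost-effectiveness analysis"],
    },
]

for row in evaluation_matrix:
    print(row["question"])
    for key in ("indicators", "data_sources", "methods"):
        print(f"  {key}: {', '.join(row[key])}")
```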
Effectiveness evaluation is a valuable tool for assessing program performance, but it is not without its challenges and limitations. One of the main challenges is measuring program outcomes, particularly for programs with complex and long-term goals. Some outcomes are inherently difficult to measure, and measuring them accurately can require a significant amount of time and resources.
Another challenge of effectiveness evaluation is limited resources. Conducting an effectiveness evaluation can be expensive and time-consuming, particularly for small organizations with limited resources. This can make it difficult for these organizations to conduct evaluations regularly.
In addition, programs can be affected by external factors that are beyond their control, such as changes in economic conditions or new policies. This can make it difficult to determine whether any observed changes in outcomes are the result of the program or of external factors.
Selecting an appropriate comparison group can also be challenging, particularly for programs that serve a diverse population. Choosing an appropriate comparison group can require a significant amount of expertise and resources.
Finally, programs may have multiple components, and it can be difficult to attribute specific outcomes to a particular component of the program. This can make it difficult to determine the effectiveness of individual program components.
Although effectiveness evaluation is a valuable tool for assessing program performance, it is important to be aware of these challenges and limitations. Organizations that understand them, and work to address them, are better placed to produce evaluations that are reliable and useful.
As programs and interventions continue to evolve and become more complex, there is a need for continued research and innovation in effectiveness evaluation. Future directions for effectiveness evaluation research and practice should focus on incorporating new data sources, emphasizing stakeholder engagement, advancing evaluation methods, addressing equity and diversity, and emphasizing implementation science.
Incorporating new data sources, such as social media data and other forms of digital data, can help to improve effectiveness evaluation and provide more accurate and relevant results. Emphasizing stakeholder engagement is crucial for ensuring that evaluations are relevant, useful, and actionable. This includes involving program staff, participants, and funders in the evaluation process.
Advancing evaluation methods is important for improving the accuracy and relevance of effectiveness evaluation. This includes exploring new approaches such as randomized controlled trials, quasi-experimental designs, and mixed-methods approaches.
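As a small, concrete example of one building block of those designs, the sketch below randomly assigns a set of hypothetical participant IDs to treatment and control groups. The IDs and the fixed random seed are illustrative; real trials typically use more elaborate procedures such as stratified or blocked randomization.

```python
import random

# Hypothetical participant IDs.
participants = [f"P{i:03d}" for i in range(1, 21)]

random.seed(42)  # fixed seed so the assignment can be reproduced
random.shuffle(participants)

half = len(participants) // 2
treatment = sorted(participants[:half])
control = sorted(participants[half:])

print("Treatment:", treatment)
print("Control:  ", control)
```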
Addressing equity and diversity in program evaluation is also crucial. This includes ensuring that evaluation methods are culturally responsive and that evaluations are inclusive of diverse populations.
Finally, emphasizing implementation science can help to improve program implementation and effectiveness. Implementation science focuses on understanding how programs are implemented and how implementation affects program outcomes.
Overall, future research and practice in effectiveness evaluation should aim to improve the accuracy, relevance, and inclusivity of evaluations, ensuring that programs and interventions achieve the outcomes they were designed for and genuinely improve the lives of people and communities.
References
- Centers for Disease Control and Prevention (CDC): The CDC provides resources on program evaluation, including guidance on conducting effectiveness evaluations. You can find their website here: https://www.cdc.gov/evaluation/index.htm
- The Campbell Collaboration: This organization is a global network of researchers who conduct systematic reviews of the effectiveness of interventions in education, social welfare, and crime and justice. Their website has a searchable database of research reports: https://www.campbellcollaboration.org/
- The United States Department of Health and Human Services (HHS): The HHS provides resources on program evaluation, including a guide to conducting evaluations of federal programs. You can find their website here: https://aspe.hhs.gov/topics/evaluation
How to Evaluate a Study
Not all studies should be treated equally. Below are a few key factors to consider when evaluating a study’s conclusions.
- Has the study been reviewed by other experts? Peer review, the process by which a study is sent to other researchers in a particular field for their notes and thoughts, is essential in evaluating a study’s findings. Since most consumers and members of the media are not trained to evaluate a study’s design and a researcher’s findings, studies that pass muster with other researchers and are accepted for publication in prestigious journals are generally more trustworthy.
- Do other experts agree? Have other experts spoken out against the study’s findings? Who are these other experts and are their criticisms valid?
- Are there reasons to doubt the findings? One of the most important things to keep in mind when reviewing studies is that correlation does not prove causation. For instance, just because there is an association between eating blueberries and weighing less does not mean that eating blueberries will make you lose weight. Researchers should look for other explanations for their findings, such as “confounding variables”: factors linked to both the exposure and the outcome. In this instance, they should consider that people who tend to eat blueberries also tend to exercise more and consume fewer calories overall (illustrated in the sketch after this list).
- How do the conclusions fit with other studies? It’s rare that a single study is enough to overturn the preponderance of research offering a different conclusion. Though studies that buck the established notion are not necessarily wrong, they should be scrutinized closely to ensure that their findings are accurate.
- How big was the study? Sample size matters. The more patients or subjects involved in a study, the more likely it is that the study’s conclusions aren’t merely due to random chance and are, in fact, statistically significant.
- Are there any major flaws in the study’s design? This is one of the most difficult steps if you aren’t an expert in a particular field, but there are ways to look for bias. For example, was the study a “double-blind” experiment, or were the researchers aware of which subjects were in the control group?
- Have the researchers identified any flaws or limitations with their research? Often buried in the conclusion, researchers acknowledge limitations or alternative explanations for their results. Because the universities, government agencies, or other organizations that funded and promoted the study often want to highlight the boldest conclusion possible, these caveats can be overlooked. However, they are important when considering how much weight the study’s conclusions really deserve.
- Have the findings been replicated? With growing headlines of academic fraud and leading journals forced to retract articles based on artificial results, replication of results is increasingly important to judge the merit of a study’s findings. If other researchers can replicate an experiment and come to a similar conclusion, it’s much easier to trust those results than those that have only been peer reviewed.
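To illustrate the confounding point from the blueberry example above, here is a minimal simulation in which exercise drives both blueberry consumption and lower weight. The raw correlation suggests a link between blueberries and weight, but adjusting for exercise in a simple least-squares regression shows the direct blueberry effect is essentially zero. All numbers in the simulation are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000

# Simulated data: exercise (hours/week) drives BOTH blueberry intake and weight.
exercise = rng.normal(3, 1, n)
blueberries = 0.8 * exercise + rng.normal(0, 1, n)   # servings per week
weight = 80 - 2.0 * exercise + rng.normal(0, 3, n)   # kg; no direct blueberry effect

# Naive view: blueberries and weight appear to be related.
print("corr(blueberries, weight):", round(np.corrcoef(blueberries, weight)[0, 1], 2))

# Adjusted view: regress weight on blueberries AND exercise.
X = np.column_stack([np.ones(n), blueberries, exercise])
coef, *_ = np.linalg.lstsq(X, weight, rcond=None)
print("blueberry coefficient after adjusting for exercise:", round(coef[1], 2))  # near 0
print("exercise coefficient:", round(coef[2], 2))                                # near -2
```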