PAASP partner became TÜV-certified ‘Qualitätsbeauftragter, -manager and -auditor’
PAASP partner Björn Gerlach completed a three-month course focusing on classical quality management systems such as ISO 9001 and EFQM. After passing written and oral examinations, Björn became a TÜV Rheinland-certified quality manager (“Qualitätsbeauftragter, -manager and -auditor”).
This knowledge will not only contribute to PAASP’s mission of enhancing quality in biomedical research but will also help to implement a quality system within PAASP itself. One of the medium-term goals for PAASP is the certification of all internal procedures according to the ISO 9001 quality management system.
TÜV Rheinland AG is a globally active, independent testing service provider headquartered in Cologne, Germany. It operates as a technical testing organization for safety, efficiency and quality.
Meeting Report - St. Petersburg
June 1, 2016 - Anton Bespalov participated in the meeting "Current issues in the preclinical and clinical research on novel drugs, biomedical cell products and medical devices" in St. Petersburg. This meeting, held for the fifth time, is a unique opportunity for academic and industrial researchers and organizations (including CROs and SMEs) to meet regulators and funding agencies and to discuss real-life issues in an informal manner. Anton Bespalov presented a plenary lecture entitled "Low-burden quality management system for nonclinical drug discovery research".
Meeting Report - Washington, DC
June 5, 2017 - Anton Bespalov took part in a meeting in Washington, DC (USA) hosted by Francis Collins, Director of the National Institutes of Health (NIH), and Nora Volkow, Director of the National Institute on Drug Abuse. The meeting was aimed at discussing the role of science in addressing the opioid crisis (LINK) and was attended by representatives from pharma and biotech companies, SMEs, academic scientists, consultants in related fields of drug discovery research, as well as officials from FDA/CDER. The goal was to identify efforts that need to be prioritized in order to facilitate the development of medications for opioid use disorders and overdose prevention/reversal. It was emphasized that the aim was to come up with an action plan (i.e. this was not a conventional academic/scientific meeting). The NIH Opioid Research Initiative suggested three main directions for developing an actionable plan: i) pain management, ii) opioid addiction treatment, and iii) overdose reversal.
PAASP’s position is that this field of drug development is not much different from others where failures to develop novel and effective medications have complex reasons, including poor quality of preclinical data and inappropriate use and interpretation of preclinical data. In support of this opinion, several meeting attendees confirmed (in private conversations) that too many novel drugs and mechanisms are reported at every scientific conference to have miraculous anti-addictive and/or analgesic efficacy in preclinical settings. This high rate of positive results can only be explained either by false positive findings due to insufficient research rigor or by the inappropriateness of the models used.
Meeting Report - PAASP at the 5th World Conference on Research Integrity (WCRI)
Björn Gerlach attended the 5th WCRI, which was held in Amsterdam from May 28th to May 31st. With over 800 participants, it is the biggest conference of its kind, addressing all aspects of research integrity, e.g. meta-research, mechanisms to improve reproducibility, and how to deal with research misconduct and fraud.
Many speakers with different backgrounds presented their opinions, thoughts and action items. For example, Robert-Jan Smits from the European Commission stated that research integrity is a responsibility for all of us and endorsed the European Code of Conduct for Research Integrity, which was published at the beginning of this year. Publishers also presented their ideas and measures for improving quality in research: Bernd Pulverer, Editor-in-Chief of The EMBO Journal, stated that the peer review process is not broken but needs to be improved. For that purpose, he presented a new workflow that includes purely technical reviews of research articles in addition to the classical peer review already conducted. Sowmya Swaminathan from Springer Nature in San Francisco formulated a clear role for journals in promoting reproducibility. She mentioned several ways to act against the reproducibility crisis: advocacy, new policies, shifting incentives, providing an infrastructure (e.g. registered reports), installing better reporting standards, improving statistical methodology and providing discipline-specific standards. Keith Wollmann presented the “STAR Methods” concept developed at Cell Press: guidelines allowing for structured and detailed reporting of the methods section.
Brian Nosek from the Center for Open Science (Charlottesville, USA) pointed out in a very entertaining way how important it is to introduce new concepts in a form researchers can relate to, e.g. quality badges to promote an open research culture, registered reports (Design => Collect & Analyze => Report => Publish) and the “Pre-registration Challenge” ($1,000 for 1,000 researchers who pre-register their studies). Notably, 52 journals already offer the preregistration of studies.
The importance of and need for training programs were also emphasized and discussed by Patricia Valdez from the NIH. A very specific example was presented by Rebecca Davies from the University of Minnesota, St. Paul, USA: Rebecca has established a detailed Research Quality Assurance course within the MD/PhD and PhD training program. Twelve students have started this teaching module, which is run as a pilot and will extend over three years.
Overall, it was a very interesting conference, with many talks and presentations running in parallel sessions and a highly enthusiastic community consisting mainly of academics, politicians and editors. Hardly any attendees with an industry background were spotted, which is noteworthy since industry also played an important part in revealing the “reproducibility crisis”. The next WCRI will be held in Hong Kong in 2019, and it will be interesting to see which of the ideas, strategies and measures discussed this year can be implemented to support the research community in addressing current research integrity issues.
Nature is taking a step towards greater reproducibility in life-sciences research
This month, Nature started to publish, alongside every life-sciences manuscript, a new reporting-summary document to which authors are now expected to add details of experimental design, reagents and analysis. This summary document is essentially the checklist that has accompanied Nature life-sciences manuscript submissions since 2013; until now, however, the checklist was not made available to readers. We believe that our discussions with the Nature journal editors helped them come to the decision to introduce these new summary documents.
This new summary document is certainly an important step towards improving the quality of reporting in life-sciences articles. Yet it is worth noting that, even with this summary document, it will still be difficult to identify which experiments were run in an exploratory mode and which were confirmatory studies. One of the key distinctions between these two study modes, the pre-specification of hypotheses, endpoints and data analysis, is not covered by the summary document.
A Guide for the Design of Pre-clinical Studies on Sex Differences in Metabolism
In metabolic animal models, researchers often use male rodents because they exhibit metabolic disease phenotypes more robustly than females. Thus, females are underrepresented in these studies because of an acknowledged sex difference. In this perspective article, F. Mauvais-Jarvis and colleagues discuss the experimental design and interpretation of studies addressing the mechanisms of sex differences in metabolic homeostasis and disease, using animal models and cells. The authors also highlight current limitations in research tools and attitudes that threaten to delay progress in studies of sex differences in basic animal research.
The ABCs of finding a good antibody: How to find a good antibody, validate it, and publish meaningful data
Finding an antibody that works for a specific application can be a difficult task. Hundreds of vendors offer millions of antibodies, but the quality of these products and available validation information varies greatly. In addition, several studies have called into question the reliability of published data as the primary metric for assessing antibody quality. In this article, P. Acharya and colleagues discuss the antibody quality problem and provide best practice guidelines for selecting and validating an antibody, as well as for publishing data generated using antibodies.
The reproducibility of research and the misinterpretation of P values
Despite decades of warnings, many areas of science still insist on labelling a result of P < 0.05 as “significant”. In this article, D. Colquhoun discusses whether the uncritical use of P values and other questionable research practices, such as uncorrected multiple comparisons, lack of randomisation and P-hacking, account for a substantial part of the reproducibility problem. He concludes that science is endangered by statistical misunderstanding, and by people who impose perverse incentives on scientists.
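Colquhoun’s central point about false positive risk can be illustrated with a short simulation (ours, not from the article; the prior probability of a real effect, the sample size and the effect size below are arbitrary assumptions): when only a minority of tested hypotheses are true, far more than 5% of the results labelled “significant” at P < 0.05 are false positives.

import numpy as np
from scipy import stats

# Illustrative simulation of the false positive risk (all parameters are assumptions)
rng = np.random.default_rng(0)
n_experiments = 10_000   # simulated two-group experiments
prior_real = 0.10        # assume only 10% of tested hypotheses reflect a real effect
n_per_group = 16         # samples per group
effect_size = 1.0        # true difference (in SD units) when the effect is real

false_pos = true_pos = 0
for _ in range(n_experiments):
    real = rng.random() < prior_real
    group_a = rng.normal(0.0, 1.0, n_per_group)
    group_b = rng.normal(effect_size if real else 0.0, 1.0, n_per_group)
    if stats.ttest_ind(group_a, group_b).pvalue < 0.05:
        true_pos += real
        false_pos += not real

print(f"Fraction of 'significant' results that are false positives: "
      f"{false_pos / (false_pos + true_pos):.2f}")

Under these assumed numbers, roughly a third of all “significant” results are false positives, far above the 5% that many readers associate with P < 0.05.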
The impact factor of an open access journal does not contribute to an article’s citations
In this article, SK Chua et al. analyze the phenomenon that citations of papers are positively influenced by the journal’s impact factor (IF). The authors argue that, for non-open access (non-OA) journals, this influence may be due to the fact that high-IF journals are more often purchased by libraries, and are therefore more often available to researchers, than low-IF journals. This positive influence has not, however, been shown specifically for papers published in open access (OA) journals, which are universally accessible and do not need library purchase. The article therefore addresses the question of whether the IF influences citations in OA journals too. By analyzing 203 randomized controlled trials (102 OA and 101 non-OA) published in January 2011, the authors found that, to attract more citations, it is better to publish in an OA journal than in a non-OA journal. If one wishes to publish in a non-OA journal, on the other hand, it is better to choose one with a high IF.
Further commentaries, articles and blog posts worth reading:
PAASP archive of publications
PAASP maintains a list of previous publications and a collection of articles on our website.
We will be regularly updating the Literature and Guidelines sections where you can find references for some of the most valuable publications related to the growing fields of GRP, Data Integrity and Reproducibility:
- Consequences of poor data quality
- Study design
- Reproducibility and data robustness
- Data analysis and statistics
- Data quality tools
- Therapeutic indication
- Reporting bias
- Peer Review and Impact Factor
- Various commentaries
A commentary based on two articles, by Jiménez et al. and Eglen et al., about the need to publish the source code of software developed and used in research labs
The generation of non-reproducible research data causes a huge scientific and social problem, wasting many billions of US dollars and other resources every year. The root cause of this problem is without doubt multifactorial, and the whole research community is needed to fix it: researchers, publishers, funding agencies, industry, and research and governmental institutions. The question remains: where to start? Two recently published articles (Eglen et al. 2017 and Jiménez et al. 2017) directly address researchers, funding agencies and journal editorial bodies with an almost neglected but nevertheless important piece of the reproducibility puzzle: the development and distribution of research software.
Ninety percent of researchers acknowledge that software is important for their research, and many researchers even develop their own software to solve a specific problem they are facing in the lab. This self-developed software then becomes a cornerstone for the generation and/or analysis of research data and thus also plays a crucial part in reproducing the scientific results. However, quite often these important software tools are not published alongside the data, which makes it impossible for other scientists to understand how specific data sets were generated and how they can be reproduced.
Eglen et al. 2017 and Jiménez et al. 2017 both emphasize how important it is for peers to understand the source code of these individual applications and the need to make it available. This is already advocated in the computational sciences, as stated by Eglen et al. in their introduction: “[…] the scholarship is not the article, the scholarship is the complete software […]”.
Why is it so difficult to publish software code in the natural sciences?
Jiménez et al., the authors of the article “Four simple recommendations to encourage best practices in research software”, provide the following recommendations for creating open source software (OSS):
- Make the source code accessible from day one in a publicly available, version-controlled repository. Only this allows for trust in the software and gives the community the opportunity to understand, judge and enhance it. This can easily be achieved with, for example, GitHub or Bitbucket.
- Provide software metadata to make the context of the software easier to understand. The authors argue that it is important to provide additional information together with the source code; this might include the source code location, contributors, licence, version, etc. (a minimal illustration is sketched after this list).
- Adopt a licence and comply with the licences of third-party dependencies. The adopted licence should clarify how the software can be used, modified and distributed by anybody else.
- Define clear and transparent contribution, governance and communication processes. It is up to the developers whether they want the wider community to contribute or not; in any case, these processes should be clarified upfront and made transparent.
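As a minimal illustration of the metadata recommendation (second item above), the sketch below writes a small machine-readable metadata file next to the code; the tool name, URL and field names are hypothetical, and in practice one would follow an established schema such as a codemeta or CITATION file.

import json

# Hypothetical, minimal metadata record for a lab-written analysis tool
metadata = {
    "name": "spike-sorter",                                        # invented tool name
    "version": "0.3.1",
    "repository": "https://github.com/example-lab/spike-sorter",   # placeholder URL
    "license": "MIT",
    "contributors": ["A. Researcher", "B. Developer"],
    "description": "Clusters extracellular recordings into single units.",
    "dependencies": ["numpy>=1.24", "scipy>=1.10"],
}

# Keep the metadata file under version control together with the source code
with open("metadata.json", "w") as fh:
    json.dump(metadata, fh, indent=2)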
The authors conclude that these recommendations aim to encourage the adoption of best practice and to help create better software for better research. In contrast to most previously published recommendations, which target software developers themselves, the authors address a wider audience: essentially everybody who is producing, investing in or funding research software development. The recommendations were discussed at several workshops and meetings to get feedback from multiple stakeholders.
The article by Eglen and colleagues published in Nature Neuroscience is more specific, promoting standard practices for sharing computer code and programs in neuroscience. The authors also see a huge advantage if developers get into the habit of sharing as much information as possible, not only to help others but also to help themselves. Two main reasons are given for this: A) the code will not be lost when a colleague leaves the lab, and B) people tend to write higher-quality code when they know it will become publicly available. Some of the points made by Eglen et al. overlap with the four recommendations mentioned above, but they provide further details: for example, adopting a licence, using a version-controlled system (such as GitHub) and providing additional information about the software (in a README file) were also mentioned by Jiménez et al. In addition, Eglen et al. recommend creating a stable URL (such as a DOI) and complying with Daniel Kahneman’s “reproducibility etiquette”. Furthermore, it would be favorable to publish all experimental data collected alongside the software. Finally, testing the software is a critical step that is often neglected by researchers; the authors therefore recommend including test suites to demonstrate that the software produces the correct results (a minimal example is sketched below).
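As a minimal illustration of the test-suite recommendation, the sketch below pairs a hypothetical lab-written analysis function with a small pytest suite; the function, file names and checks are invented for illustration only.

# analysis.py - a hypothetical, lab-written helper function
def normalise(values):
    """Scale a series of measurements so that they sum to 1."""
    total = sum(values)
    if total == 0:
        raise ValueError("cannot normalise an all-zero series")
    return [v / total for v in values]

# test_analysis.py - a minimal pytest suite documenting the expected behaviour
import pytest
from analysis import normalise

def test_normalised_values_sum_to_one():
    assert sum(normalise([2.0, 3.0, 5.0])) == pytest.approx(1.0)

def test_all_zero_input_is_rejected():
    with pytest.raises(ValueError):
        normalise([0.0, 0.0, 0.0])

Running pytest on such a suite gives collaborators (and reviewers) a quick, repeatable way to check that the code still produces the intended results after any change.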
Without any doubt, the discussed issue is a very important piece of the puzzle, and these recommendations are crucial steps towards more transparency and a better understanding of how research data were generated. The critical question, however, is how to implement these recommendations. For researchers, computer software is quite often just a tool for their real aim: producing novel data. It might therefore not be easy to convince them to put more effort and resources into developing and distributing these “side tools” unless this is made obligatory or strongly incentivized by funders, publishers and/or policymakers.
We will continue presenting Case Study publications that are particularly interesting from the Good Research Practice perspective. We hope that these cases can be useful in training programs and will help younger scientists to learn about the basics of study design, planning and analysis. We invite our readers to share examples that can be used for such educational purposes.
This month we would like to turn again to the subject of sample size. We will be using examples from a paper published by Bradley and colleagues in the Journal of Clinical Investigation in December 2016 (link). This paper reported on the therapeutic potential of M1 muscarinic positive allosteric modulators to slow prion-related neurodegeneration and restore memory loss in mice. We are not challenging the conclusions made in this paper and do not mean to question the value of this kind of research. We use this paper solely to illustrate a phenomenon that seems to be common in many papers combining multiple research methods.
In this paper, hippocampal-dependent learning and memory were assessed using one of the most frequently used tasks, fear conditioning. When fear conditioning was compared in M1 knockout and wild-type mice, each group contained 8 mice. Additional control experiments compared pain thresholds and locomotor activity using what appear to be the same groups of 8 mice each (as shown in Figure 1 of the article). Along with the behavioral data, this figure presents immunohistochemical evidence of M1 receptor activation as a result of the fear conditioning training (“representative” images, no quantification and no indication as to whether this analysis was done in more than one mouse). So far so good: this looks rather common and may reflect our unfortunate habit of leaving some technical details unreported (e.g. a power analysis based on previous studies that could justify N=8, or a quantification of the IHC studies, since “a nice picture says more than 1000 words”).
In Figure 3, all studies were done using wild-type animals that were either prion-infected or served as controls. In an experiment combining fear conditioning and prion infection, the authors “randomly” used 19 mice per group and observed impaired fear conditioning (panel 3C). It is interesting, though, that in this case the control experiments were conducted using only N=6 animals (elevated plus maze and pain thresholds, panels 3D and 3E), and it is not clear to the reader whether these are the same animals or separate groups. The same applies to the results on M1 receptor Bmax (panel 3G), where the sample size is even lower (N=3). Again, there may be reasons not stated in the manuscript that justify the sample sizes and explain the allocation of animals to the different experiments, as well as the reasons for subjecting prion-infected and control animals to non-comparable training and testing conditions (Figure 3B).
The reason for choosing this paper as an example, and for specifically pointing out the sample sizes in Figure 3C vs. 3D-E-G, is that, if all these data came from the same two groups of animals (prion-infected and controls), the small-sample experiments may not provide accurate estimates of what is likely to happen in a larger population. In other words, if the sample size for the fear conditioning study had been N=6 rather than N=19, the difference between prion-infected and control mice could have looked less convincing. And, vice versa, if the sample size for the plus maze had been larger (N=19 instead of N=6), differences in the number of open-arm visits between prion-infected and control mice could have led to the conclusion that the former are more likely to display anxiety-related behavior.
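This argument can be made concrete with a small simulation (ours, not taken from the paper; the true group difference, variability and group sizes below are purely illustrative): the same underlying effect is estimated far less precisely with N=6 than with N=19 per group, so a real difference can easily look absent, and a null difference can easily look real, in the smaller samples.

import numpy as np

# Illustrative simulation: how precisely is a fixed group difference estimated
# with N=6 versus N=19 subjects per group? (All numbers are assumptions.)
rng = np.random.default_rng(1)
true_diff, sd = 0.5, 1.0   # hypothetical difference between groups, in SD units
n_sims = 5_000             # number of simulated experiments per sample size

for n in (6, 19):
    observed = []
    for _ in range(n_sims):
        control = rng.normal(0.0, sd, n)
        infected = rng.normal(true_diff, sd, n)
        observed.append(infected.mean() - control.mean())
    lo, hi = np.percentile(observed, [2.5, 97.5])
    print(f"N={n:2d} per group: 95% of observed differences fall between "
          f"{lo:+.2f} and {hi:+.2f} (true difference {true_diff:+.2f})")

With N=6, the observed difference scatters so widely around the true value that individual experiments can suggest anything from no effect to a very large one, which is exactly why pre-specified sample size justification matters.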
The case discussed above illustrates the need for a particularly detailed and transparent description of the study design (including the flow of experiments, total numbers of subjects and allocation to individual experiments) and for a sample size justification in research where:
- multiple methods are combined (e.g. in vivo and ex vivo)
- certain methods are applied only to randomly (?) selected subsets of subjects or samples, and
- the expected (hypothesized) outcome is a mixture of positive and negative results, such as the learning impairment in the absence of pain-sensitivity changes in the paper discussed here.
To foster interest in statistics and numerical research, Tyler Vigen, a consultant at The Boston Consulting Group, wrote a program that attempts to automatically find things that correlate. More than 4,000 correlations have been found so far. Some of the gems include:
- The divorce rate in Maine versus per capita consumption of margarine,
- Worldwide non-commercial space launches versus sociology doctorates awarded (US),
- Marriage rate in Alabama versus whole milk consumption per capita, and
- Honey produced in bee colonies versus labor political action committees.
Many things correlate with cheese consumption ;-)
The phenomenon effectively illustrated by T. Vigen is known in the biomedical area as the multiple testing problem: when many questions are asked of the same data, some of them will come up as false positives purely by chance. For large studies and data sets, the right approach is therefore sometimes to step back, consider issues associated with the design of these studies, and develop a (pre-specified) statistical analysis strategy that takes the large number of questions at issue into account.
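The same mechanism can be demonstrated with a few lines of code (ours; all numbers are arbitrary): screening many unrelated data series against a single target variable makes roughly 5% of them appear “significantly” correlated with it at P < 0.05.

import numpy as np
from scipy import stats

# Illustrative simulation of the multiple testing problem (all numbers are arbitrary)
rng = np.random.default_rng(42)
n_points = 12        # e.g. 12 annual observations per series
n_series = 1_000     # number of unrelated candidate series screened
target = rng.normal(size=n_points)   # a made-up target trend, e.g. "cheese consumption"

spurious = 0
for _ in range(n_series):
    candidate = rng.normal(size=n_points)   # pure noise, unrelated by construction
    _, p_value = stats.pearsonr(target, candidate)
    if p_value < 0.05:
        spurious += 1

print(f"{spurious} of {n_series} unrelated series correlate 'significantly' with the target")

Screen enough candidates and some striking-looking correlations are guaranteed, which is why a pre-specified analysis plan and corrections for multiple testing matter.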
- June 28, 2017 - Webinar (11 AM - 12 PM EST)
Seminar by Carlijn Hooijmans presenting the Grading of Recommendations Assessment, Development and Evaluation (GRADE) concept for preclinical research
- July 3-5, 2017 - Heidelberg, Germany
ECNP Workshop for Young Scientists “How to Make Preclinical Research Robust”
- August 7-11, 2017 - Nijmegen, Netherlands
Summer School: Improving quality of preclinical animal studies using the systematic review methodology
- August 14-18, 2017 - Nijmegen, Netherlands
Summer School: Molecules, Mice and Math: A statistical toolbox for the lab
- August 28 - September 2, 2017 - VU University Amsterdam, The Netherlands
6th Postgraduate ONWAR Course in Behavioral Neuroscience: In vivo Phenotyping of Mutant Rodents: Integrating Neural Activity, Neurochemistry, Heart Rate & Behavior
- September 4, 2017 - Paris, France
ECNP Preclinical Data Forum network meeting during the 30th annual ECNP Congress
- October 11-13, 2017 - Bethesda, Maryland, US
ASA Symposium on Statistical Inferences
Please let us know if you are aware of additional or upcoming meetings covering relevant aspects of Good Research Practice and Data Integrity in the biomedical research area.
Get involved!
We at PAASP would like to invite you to join our projects and together help to improve the quality of biomedical research.
Please reach out to us to learn more about potential opportunities to collaborate on different activities, ranging from the development of new quality standards for pre-clinical research to the training of students and scientists in Good Scientific Practices.