"Green Flags": Heuristics for Biomedical Results

By Sarah Constantin

This is just a summary of some of the rough (and idiosyncratic) ideas I use in evaluating opportunities in biomedical research.  LRI's funding priorities often hit several of these heuristics at once.


  • Big effect sizes.

Large, dramatic, qualitative changes are hard to fake, and thus less likely to be the result of statistically manipulating ambiguous or unimpressive data.


  • Improvements in mortality or function.

Proxy metrics and biomarkers are doubtful and based on uncertain theories of biological mechanism.  It’s harder to fool yourself when you’re directly measuring survival or health.


  • Nonstandard animal models; wild animals; comparative biology.

Laboratory mice and other standard experimental models are atypical in lots of ways; more diverse animal evidence adds robustness.  


  • Accidental discoveries or evidence from historical records.

These are less likely to be artifacts of confirmation bias.


  • Old publications (pre-1980).

The boom in federal and industry funding for biomedical research in recent decades probably inflated the number of low-quality and unreplicable publications; earlier publications are less likely to be fake (though, of course, more likely to have since been disconfirmed).


  • Research from evolutionary biologists and practicing physicians.

Biology is still more an observational science than a theoretical one. Ideas that come from people with strong roots in “natural history” -- observing humans or animals with their five senses -- have a certain extra weight.


  • Non-drug treatment modalities.

Electrical stimulation, parabiosis, infrared light, bacterial therapies, growing livers in lymph nodes, all that “mad science” stuff.  Mainly, these have extra credibility because of their distance from the traditional pharmaceutical industry, and their closer connection to engineering (which is closer to the tinkerer/empiricist heart of true natural science).


  • Effects that emerge clearly from multiple distinct model-free empirical comparisons.

If you run a large-scale GWAS, drug screen, expression profile, or other comparison of many potential variables, and a few factors pop out dramatically from several of these methods, those factors probably actually matter.
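The intersection logic behind this heuristic can be sketched with a toy example (the gene names and screen contents below are placeholders chosen for illustration, not actual findings):

```python
from collections import Counter

# Hypothetical top-ranked factors from three independent, model-free screens.
gwas_hits = {"FOXO3", "APOE", "IGF1R", "KL", "TP53"}
drug_screen_hits = {"IGF1R", "MTOR", "FOXO3", "SIRT1"}
expression_hits = {"FOXO3", "IGF1R", "CDKN2A", "KL"}

screens = [gwas_hits, drug_screen_hits, expression_hits]

# Count how many independent screens flag each factor.
counts = Counter(factor for screen in screens for factor in screen)

# Factors that pop out of every screen are the ones most likely to matter.
robust = sorted(f for f, n in counts.items() if n == len(screens))
print(robust)  # ['FOXO3', 'IGF1R']
```

The point of the sketch is that each screen alone is noisy, but a factor surviving the intersection of several unrelated comparison methods is unlikely to be an artifact of any one of them.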


  • Biologics that don’t depend on correctly identifying the active ingredient.

We sometimes observe that young or healthy cells or organs, or medicinal plants, or bacteria, seem to contain some substance that produces a desirable response.  In such cases, we can be more confident that the whole living material contains the “active ingredient” than that any single extracted molecule is it.  Before there was hydrocortisone, there were adrenal gland extracts. This is the genuine insight behind the “holistic” heuristic.


  • Use in non-medical communities of practice.

Folk wisdom is often wrong, but use in traditional medicine (or pre-modern lifestyles) is a weak independent consideration in favor of a treatment. In a similar vein, if athletes or soldiers use a performance-enhancing treatment, that’s a consideration in favor of its effectiveness.  A stronger, though less independent, point in its favor is use in veterinary medicine.


Of course, traditional criteria for research evaluation also matter: adequately powered, randomized, blinded, and replicated experiments; track records of excellence in researchers, research institutions, and journals; well-understood and well-validated mechanistic explanations.  The above considerations are just the contrarian additions to the traditional criteria.