Herophilus Publishes a General Method for Detecting Relevant Signals in Machine Learning Analysis of Complex Biological Datasets


Herophilus, a leading biotechnology company developing neurotherapies to cure complex brain diseases, today announced the publication of research describing a new statistical method to identify and analyze the effects of potentially confounding variables on machine learning models for complex biological datasets.

The ability of machine learning (ML) to extract scientific insights from high-dimensional datasets is often limited by confounding variables that bias models. Determining the influence of confounders is particularly difficult for complex bioscience datasets, which tend to be organized in nested hierarchies that prohibit the use of traditional methods such as linear regression to correct for the effects of nuisance variables. Although tools exist to mitigate known confounders, scientists lack a general method for identifying variables in a set of potential confounders that require debiasing.

In “Hierarchical confounder discovery in the experiment–machine learning cycle”, published in Cell Patterns, the authors define the “Rank-to-Group” (RTG) score, a new nonparametric statistical method for scoring the effect of a potential confounder. The RTG score is robust to outlier noise and can identify the source of a confounding effect even in nonlinear structures. The method is applicable to both raw data and results from ML models.
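This release does not give the formula for the RTG score, so the sketch below is only an illustration of how a rank-based, nonparametric grouping score can work, not the published definition. It assumes Euclidean distance and a hypothetical function name, `rank_to_group_score`. For each sample it ranks all other samples by distance and measures how often a sample from the same group (for example, the same batch or donor) is ranked closer than a sample from a different group. A mean near 0.5 suggests the grouping variable leaves little trace in the data, while a mean near 1.0 suggests a strong potential confounder.

```python
import math


def rank_to_group_score(points, groups):
    """Illustrative rank-based grouping score (NOT the published RTG formula).

    points: list of equal-length numeric tuples (raw features or model outputs).
    groups: list of group labels, one per point (the candidate confounder).
    Returns the mean, over samples, of the fraction of (same-group,
    other-group) pairs in which the same-group sample is ranked closer.
    Distance ties are broken by input order; this sketch does not correct for them.
    """
    per_sample = []
    n = len(points)
    for i in range(n):
        # Rank every other sample by distance to sample i, nearest first.
        others = [j for j in range(n) if j != i]
        others.sort(key=lambda j: math.dist(points[i], points[j]))

        same_total = sum(1 for j in others if groups[j] == groups[i])
        diff_total = len(others) - same_total
        if same_total == 0 or diff_total == 0:
            continue  # the per-sample score is undefined without both kinds

        diff_seen = 0   # other-group samples already passed in the ranking
        concordant = 0  # pairs where the same-group sample is ranked closer
        for j in others:
            if groups[j] == groups[i]:
                concordant += diff_total - diff_seen
            else:
                diff_seen += 1
        per_sample.append(concordant / (same_total * diff_total))

    if not per_sample:
        raise ValueError("need at least one sample with both same-group and other-group neighbours")
    return sum(per_sample) / len(per_sample)


# Two well-separated clusters that coincide with the group label.
clustered_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
clustered_groups = ["A", "A", "A", "B", "B", "B"]
clustered_score = rank_to_group_score(clustered_points, clustered_groups)

# Points on a line with alternating labels: neighbours tend to differ in group.
alternating_points = [(0,), (1,), (2,), (3,)]
alternating_groups = ["A", "B", "A", "B"]
alternating_score = rank_to_group_score(alternating_points, alternating_groups)
```

Because the score uses only the ordering of distances, a single extreme outlier changes ranks by at most a few positions, which is the general reason rank-based statistics tolerate outlier noise. The same function can be applied to raw measurements or to embeddings produced by an ML model, with the group label swapped for each candidate confounder in turn.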

“RTG scoring is a broadly useful tool for analyzing high-dimensional datasets with complex, potentially nested sources of bias that standard bias identification methods cannot resolve. This approach enables a virtuous cycle of experimental design, data collection and model building for the reduction of bias in the data and thus strengthens the use of machine learning in discovery science,” said Sean Escola, M.D., Ph.D., co-founder of Herophilus.

“Herophilus is focused on the discovery and development of curative therapies for diseases of the brain, but we maintain a serious commitment to advancing the tools of basic scientific research for the benefit of all,” said Saul Kato, Ph.D., co-founder and CEO of Herophilus. “The next wave of machine learning research is moving beyond strict model performance to considerations of reliability, interpretability, and bias. RTG scoring has become part of our everyday use of ML to do interpretable science, and we felt it was worth sharing with the community.”

About Herophilus

Herophilus is a San Francisco-based neurotherapy company focused on curing complex brain diseases. The Company’s large-scale discovery platform combines brain organoid science, systems neuroscience approaches, robotic automation and advanced machine learning techniques to generate multimodal “deep phenotypes” that are mined to identify novel therapeutic targets and treatments for disorders including neurodevelopmental, psychiatric and neurodegenerative diseases. For more information, visit www.herophilus.com

See the source version on businesswire.com: https://www.businesswire.com/news/home/20220228005373/en/

CONTACT: Thermal for Herophilus

Joanne Lin




SOURCE: Herophilus

Copyright BusinessWire 2022.

PUBLISHED: 02/28/2022 12:00 PM / DISC: 02/28/2022 12:02 PM

