AI-Driven Research and Trait-Aware Representations of Medical Diagnostics
Abstract
This seminar is devoted to two separate projects:
As AI promises to accelerate scientific discovery, it remains unclear whether AI systems can perform fully autonomous research and whether they can do so while adhering to key scientific values, such as transparency, traceability and verifiability. Mimicking human scientific practices, we built data-to-paper, an automation platform that guides interacting LLM agents through a complete stepwise research process, from annotated data to comprehensive research papers, while programmatically back-tracing information flow, resulting in “data-chained” manuscripts. When tested on diverse datasets, the platform autonomously produced correct papers in 80-90% of runs for simple datasets and research goals, whereas human intervention became critical for more complex tasks. Our work demonstrates the potential for AI-driven acceleration of scientific discovery in data-driven research and beyond, while “data-chaining” sets a new standard of verifiability and traceability for the coming era of AI-driven science.
Electronic health records offer significant potential for uncovering trait-specific patterns and advancing personalized medicine. Various methods, mainly borrowed from natural language processing, have been proposed to represent International Classification of Diseases (ICD) codes, yet these approaches often yield representations that are difficult to quantify and primarily serve predictive tasks. Here, we present TAR, a method for Trait-Aware Representations of ICD codes implemented as a modified skip-gram model, and demonstrate its ability to explore sex-specific differences in medical diagnostics, a topic often overlooked in prior literature. In collaboration with Maccabi Healthcare Services, we trained TAR on 25 years of data from 1.3 million patients, revealing sex-dependent dynamics of aging. In addition, the learned embeddings encode the concept of sex as a distinct direction. Overall, TAR reveals sex-specific diagnostic differences and is readily extendable to other phenotypic traits, offering a versatile tool for broader biomedical discovery.
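To illustrate what it can mean for embeddings to encode a trait as a distinct direction, the following is a minimal sketch on synthetic data, not the TAR implementation. It assumes, hypothetically, that each code has one vector per trait value; the abstract does not specify how TAR arranges its embeddings. The function name trait_direction, the array names, and all numeric settings are illustrative assumptions.

```python
import numpy as np


def trait_direction(emb_a: np.ndarray, emb_b: np.ndarray) -> np.ndarray:
    """Unit vector along the mean difference between paired embeddings.

    Row i of emb_a and row i of emb_b are assumed (hypothetically) to be
    the vectors of the same code under two values of a trait.
    """
    diff = (emb_a - emb_b).mean(axis=0)
    return diff / np.linalg.norm(diff)


# Synthetic stand-in for learned embeddings (no real patient data).
rng = np.random.default_rng(0)
n_codes, dim = 50, 8
base = rng.normal(size=(n_codes, dim))

# Assumption for the toy data: the two trait-specific vectors of each code
# differ by a shared shift along one axis, plus small independent noise.
true_axis = np.zeros(dim)
true_axis[0] = 1.0
emb_female = base + 0.5 * true_axis + 0.05 * rng.normal(size=(n_codes, dim))
emb_male = base - 0.5 * true_axis + 0.05 * rng.normal(size=(n_codes, dim))

direction = trait_direction(emb_female, emb_male)

# Cosine similarity between the recovered direction and the planted axis.
cosine = float(direction @ true_axis)

# Projection of each code's paired difference onto the recovered direction.
gap = (emb_female - emb_male) @ direction
```

In this toy setting the recovered direction aligns closely with the planted axis, and every paired difference has a positive projection onto it. With real embeddings, whether a single direction captures a trait is an empirical question.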