I am a postdoctoral fellow at the National Institutes of Health (NIH), supervised by Dr. Zhiyong Lu.
I received my Ph.D. in Computer Science from Zhejiang University in June 2024, supervised by Professors Huajun Chen, Ningyu Zhang, and Xiaohui Fan.
I am a biomedical AI researcher developing computational methods and benchmark resources for biomedical reasoning and data-driven biological discovery. My earlier work centered on molecular learning (knowledge graph-enhanced molecular representation, biomolecular instruction tuning, and molecular generation with feedback), bridging graph neural networks and large language models for chemistry and biology. At the National Library of Medicine (NLM) at NIH, I now build biomedical large language models for single-cell reasoning and clinical applications. I currently lead the TrialGPT 2.0 project, NLM/NIH's AI-powered patient-to-clinical-trial matching system.
Selected Research
Cell-o1: Training LLMs to Solve Single-Cell Reasoning Puzzles with Reinforcement Learning
Yin Fang, Qiao Jin, Guangzhi Xiong, Bowen Jin, Xianrui Zhong, Siru Ouyang, Aidong Zhang, Jiawei Han, Zhiyong Lu
Bioinformatics, 2026
Medical LLMs
Med-V1: Small Language Models for Zero-shot and Scalable Biomedical Evidence Attribution
Qiao Jin, Yin Fang*, Lauren He, Yifan Yang, Guangzhi Xiong, Zhizheng Wang, Nicholas Wan, Joey Chan, Donald C. Comeau, Robert Leaman, Charalampos S. Floudas, Aidong Zhang, Michael F. Chiang, Yifan Peng, Zhiyong Lu
arXiv, 2026
Molecular AI
Knowledge Graph-enhanced Molecular Contrastive Learning with Functional Prompt
Yin Fang, Qiang Zhang, Ningyu Zhang, Zhuo Chen, Xiang Zhuang, Xin Shao, Xiaohui Fan, Huajun Chen
Nature Machine Intelligence, 2023
Molecular LLMs
Mol-Instructions: A Large-Scale Biomolecular Instruction Dataset for Large Language Models
Yin Fang, Xiaozhuan Liang, Ningyu Zhang, Kangwei Liu, Rui Huang, Zhuo Chen, Xiaohui Fan, Huajun Chen
ICLR, 2024
Molecular AI
Domain-Agnostic Molecular Generation with Self-feedback
Yin Fang, Ningyu Zhang, Zhuo Chen, Lingbing Guo, Xiaohui Fan, Huajun Chen
ICLR, 2024
Molecular LLMs
The Future of Molecular Studies through the Lens of Large Language Models
Jinlu Zhang*, Yin Fang*, Xin Shao, Huajun Chen, Ningyu Zhang, Xiaohui Fan
Journal of Chemical Information and Modeling, 2024
Single-cell Omics
De Novo Analysis of Bulk RNA-seq Data at Spatially Resolved Single-cell Resolution
Jie Liao*, Jingyang Qian*, Yin Fang*, Zhuo Chen*, Xiang Zhuang*, Ningyu Zhang, Xin Shao, Yining Hu, Penghui Yang, Junyun Cheng, Yang Hu, Lingqi Yu, Haihong Yang, Jinlu Zhang, Xiaoyan Lu, Li Shao, Dan Wu, Yue Gao, Huajun Chen, Xiaohui Fan
Nature Communications, 2022
Molecular AI
Molecular Contrastive Learning with Chemical Element Knowledge Graph
Yin Fang, Qiang Zhang, Haihong Yang, Xiang Zhuang, Shumin Deng, Wen Zhang, Ming Qin, Zhuo Chen, Xiaohui Fan, Huajun Chen
AAAI, 2022
Molecular AI
Knowledge-informed Molecular Learning: A Survey on Paradigm Transfer
Yin Fang, Zhuo Chen, Xiaohui Fan, Ningyu Zhang, Huajun Chen
KSEM, 2024
* Equal contribution. Full list: Google Scholar