I am a postdoctoral fellow at the National Institutes of Health (NIH), supervised by Dr. Zhiyong Lu.

I received my Ph.D. in Computer Science from Zhejiang University in June 2024, supervised by Professors Huajun Chen, Ningyu Zhang, and Xiaohui Fan.

I am a biomedical AI researcher developing computational methods and benchmark resources for biomedical reasoning and data-driven biological discovery. My earlier work centered on molecular learning (knowledge graph-enhanced molecular representation, biomolecular instruction tuning, and molecular generation with feedback), bridging graph neural networks and large language models for chemistry and biology. My current research at the National Library of Medicine (NLM), NIH, builds biomedical large language models for single-cell reasoning and clinical applications. I now lead the TrialGPT 2.0 project, NLM/NIH's AI-powered patient-to-clinical-trial matching system.

Selected Research

Biomedical LLMs [Figure: first page of the Cell-o1 paper]

Cell-o1: Training LLMs to Solve Single-Cell Reasoning Puzzles with Reinforcement Learning

Yin Fang, Qiao Jin, Guangzhi Xiong, Bowen Jin, Xianrui Zhong, Siru Ouyang, Aidong Zhang, Jiawei Han, Zhiyong Lu

Bioinformatics, 2026
Medical LLMs [Figure: first page of the Med-V1 paper]

Med-V1: Small Language Models for Zero-shot and Scalable Biomedical Evidence Attribution

Qiao Jin, Yin Fang*, Lauren He, Yifan Yang, Guangzhi Xiong, Zhizheng Wang, Nicholas Wan, Joey Chan, Donald C. Comeau, Robert Leaman, Charalampos S. Floudas, Aidong Zhang, Michael F. Chiang, Yifan Peng, Zhiyong Lu

arXiv, 2026
Molecular AI [Figure: first page of the KANO paper]

Knowledge Graph-enhanced Molecular Contrastive Learning with Functional Prompt

Yin Fang, Qiang Zhang, Ningyu Zhang, Zhuo Chen, Xiang Zhuang, Xin Shao, Xiaohui Fan, Huajun Chen

Nature Machine Intelligence, 2023
Molecular LLMs [Figure: first page of the Mol-Instructions paper]

Mol-Instructions: A Large-Scale Biomolecular Instruction Dataset for Large Language Models

Yin Fang, Xiaozhuan Liang, Ningyu Zhang, Kangwei Liu, Rui Huang, Zhuo Chen, Xiaohui Fan, Huajun Chen

ICLR, 2024
Molecular AI [Figure: first page of the MolGen paper]

Domain-Agnostic Molecular Generation with Self-feedback

Yin Fang, Ningyu Zhang, Zhuo Chen, Lingbing Guo, Xiaohui Fan, Huajun Chen

ICLR, 2024
Molecular LLMs [Figure: first page of the molecular LLM review paper]

The Future of Molecular Studies through the Lens of Large Language Models

Jinlu Zhang*, Yin Fang*, Xin Shao, Huajun Chen, Ningyu Zhang, Xiaohui Fan

Journal of Chemical Information and Modeling, 2024
Single-cell Omics [Figure: first page of the Bulk2Space paper]

De Novo Analysis of Bulk RNA-seq Data at Spatially Resolved Single-cell Resolution

Jie Liao*, Jingyang Qian*, Yin Fang*, Zhuo Chen*, Xiang Zhuang*, Ningyu Zhang, Xin Shao, Yining Hu, Penghui Yang, Junyun Cheng, Yang Hu, Lingqi Yu, Haihong Yang, Jinlu Zhang, Xiaoyan Lu, Li Shao, Dan Wu, Yue Gao, Huajun Chen, Xiaohui Fan

Nature Communications, 2022
Molecular AI [Figure: first page of the KCL paper]

Molecular Contrastive Learning with Chemical Element Knowledge Graph

Yin Fang, Qiang Zhang, Haihong Yang, Xiang Zhuang, Shumin Deng, Wen Zhang, Ming Qin, Zhuo Chen, Xiaohui Fan, Huajun Chen

AAAI, 2022
Molecular AI [Figure: first page of the knowledge-informed molecular learning survey]

Knowledge-informed Molecular Learning: A Survey on Paradigm Transfer

Yin Fang, Zhuo Chen, Xiaohui Fan, Ningyu Zhang, Huajun Chen

KSEM, 2024