Scientific Literature Mining
We are redefining how AI interacts with scientific literature, moving from paper retrieval to autonomous scientific discovery through a five-level cognitive hierarchy.
Our Vision
AI's capability in scientific literature mining follows a progressive path of cognitive depth.
Level 1: Retrieval from massive literature pools (e.g., PaperScout).
Level 2: Parsing PDFs into structured, machine-readable data (e.g., ChemTable).
Level 3: Understanding methods, tasks, and contributions (e.g., ScholarSum).
Level 4: Reasoning and multi-paper synthesis (e.g., PaperArena, Mind2Report).
Level 5: Autonomous discovery and research generation (Future Vision).
Research Projects
Our work spans the entire hierarchy, providing tools and benchmarks for the next generation of scientific AI.
PaperScout
An autonomous LLM-based agent that reformulates academic paper search as a multi-turn decision-making process, dynamically deciding when and how to invoke search and citation expansion tools.
Project Repo →
ChemTable
A large-scale benchmark for multimodal LLMs to recognize and understand complex chemical tables, combining symbolic chemical formulas, table structures, and visual molecule diagrams.
Project Repo →
ScholarSum
Advancing scientific summarization through knowledge graph reasoning and reflective refinement, distilling complex research into structured, high-fidelity summaries.
OpenReview →
PaperArena
An evaluation benchmark for tool-augmented agentic reasoning on scientific literature, requiring agents to integrate information across multiple papers with diverse tools.
Project Repo →
Mind2Report
A cognitive deep research agent that emulates commercial analysts to synthesize expert-level reports from massive web sources through intent-driven search and iterative synthesis.
Project Repo →
Future Vision
Our ultimate goal: building agents that not only understand literature but also autonomously discover laws, generate new hypotheses, and propose novel research directions.
Visionary Research
Timeline
Jun. 2025
ChemTable
A large-scale benchmark designed to test MLLMs on real-world chemical tables, addressing structural complexity and domain-specific semantics.
Oct. 2025
ScholarSum
A novel Student-Teacher framework for scientific summarization, leveraging knowledge graph reasoning and reflective refinement to ensure structural coherence.
Oct. 2025
PaperArena
The first benchmark for tool-augmented agentic reasoning on scientific literature, evaluating cross-paper integration and multi-tool orchestration.
Jan. 2026
PaperScout
An adaptive agentic framework for academic paper search, optimized with Proximal Sequence Policy Optimization (PSPO) for multi-turn interactions.
Jan. 2026
Mind2Report
A cognitive deep research agent that emulates commercial analysts to synthesize expert-level reports via intent-driven search and iterative synthesis.
The explosion of scientific literature has exceeded human cognitive limits. We aim to build AI systems that can autonomously navigate, understand, and synthesize scientific knowledge, ultimately accelerating the pace of human discovery.
Moving beyond keyword matching to deep semantic understanding and multi-hop reasoning across heterogeneous scientific documents.
Integrating text, tables, figures, and equations to form a holistic understanding of scientific breakthroughs.
Discovering novel connections between disparate research areas to propose and validate new scientific hypotheses.