
Dongwei Jiang
Speech researcher in the past, NLP researcher now
Second-year master's student
CLSP, Johns Hopkins University
Research Interest
I am broadly interested in reasoning. In the realm of reasoning, I’ve worked on complicated problems like:
- Theorem proving and logical reasoning, using the Lean theorem prover to support the reasoning process [1]
- Decompositional entailment, formulating a consistent and theoretically grounded approach to annotating decompositional entailment datasets [2]
However, one puzzling limitation in LLM reasoning is that while these models can solve “superhuman” problems in specific domains, they often fail at simple tasks. This observation led me to question whether LLMs’ problem-solving abilities truly demonstrate superior reasoning capabilities or simply reflect domain-specific overspecialization. As a result, my focus has turned to more general, System 2-like reasoning. My work in this area includes:
- Building a general-purpose verifier through rationale extraction from unlabelled data to provide process supervision during reasoning [3]
- Investigating the effectiveness of CoT prompting across 100+ papers and 20 datasets, and finding that CoT mainly benefits math and symbolic reasoning tasks [4]
I’m also interested in the self-improvement capability of LLMs. If we begin with the “end” (superintelligence/AGI) in mind, relying on human input won’t get us there. We need to teach models to interact with the environment and self-improve. Specifically, I’ve worked on:
- Understanding what prevents LLMs from self-improving effectively [5]
- Probing the limits of self-improvement
I believe these two research directions are deeply interconnected and can synergistically enhance each other. Strong reasoning capabilities are essential for effective self-improvement, as models need to logically analyze and discriminate between good and bad generations to provide meaningful feedback. Conversely, self-improvement mechanisms are crucial for advancing reasoning capabilities, as complex logical problems often require multiple attempts and refinements to reach the correct solution. This bidirectional relationship suggests that advancing either area could create positive feedback loops that benefit both capabilities.
In addition, my research has frequently drawn inspiration from cognitive science concepts, including cognitive load, system 2 reasoning, and zone of proximal development. This connection seems natural, given that LLMs are fundamentally trained to emulate human cognitive patterns. I would love to explore this intersection more deeply in future research.
More About Me
In my past life, I spent six years working in industry on speech processing and speech foundation models. Recently, my focus has shifted to LLMs. To that end, I’m currently studying at JHU as a master’s student, working with Professors Daniel Khashabi and Benjamin Van Durme. I’ve also worked with Professor Shay Cohen from Edinburgh and Professor Greg Durrett from UT Austin.
My industry career has been marked by a series of devastating external events that fundamentally disrupted the companies where I worked. At DiDi, I was developing speech processing systems when the company was hit by severe regulatory action from the Chinese government, forcing its delisting from the New York Stock Exchange and creating massive operational uncertainty. At YuanFuDao, my work was abruptly affected when the Double Reduction Policy essentially crippled the core business model of educational technology companies throughout China. Later, at Shopee, I was advancing speech technologies when the combination of a global economic downturn and US-China tensions triggered an 80% stock price collapse and extensive layoffs throughout the company. These successive disruptions forced my transitions between roles, as each company faced existential challenges that made continuing my technical work there untenable.
In my free time, I sometimes play Civ 6 or Hearthstone. I also run and go bouldering every other day (well, more like every three or four days, but who’s counting?). I’ve noticed there’s something puzzle-like about all these activities, whether it’s planning civilizations, crafting the perfect deck, or figuring out a tricky climbing route, which probably explains why I enjoy them alongside my research work.
Selected Publications
- NAACL