Taming the Wild Hypothesis: Introducing Controllable Abductive Reasoning in Knowledge Graphs
By Jiaxin Bai
In the world of Artificial Intelligence, abductive reasoning is the digital equivalent of Sherlock Holmes at work: observing a set of facts (or “clues”) and inferring the most plausible explanation for them.
When applied to Knowledge Graphs (KGs), abductive reasoning has massive potential. It can help doctors diagnose complex diseases based on symptom clusters or help scientists discover new celestial bodies based on orbital anomalies.
But there is a problem.
Knowledge Graphs are vast. A single observation can trigger thousands of plausible explanations. Most of these are technically “correct” but practically useless. If a doctor asks for a diagnosis, they don’t want a hypothesis about the patient’s insurance policy—even if the KG contains that link.
In our latest paper, “Controllable Logical Hypothesis Generation for Abductive Reasoning in Knowledge Graphs” (CtrlHGen), we introduce a new task and framework to solve this. We allow users to control what the AI focuses on and how complex its reasoning should be.
The Problem: Too Many Answers, No Control
Existing methods for abductive reasoning (like AbductiveKGR) are powerful, but they lack a steering wheel.
Imagine you observe three NBA players. You want to know what connects them.
- Without control, the model might tell you: “They all played in the Western Conference in 2015.” (True, but trivial).
- What you actually wanted: “They all won MVP awards and played for coaches who were former players.” (Complex and specific).
To make abductive reasoning useful in the real world, we identified two necessary types of control:
- Semantic Control: Directing the reasoning toward a specific domain (e.g., “Explain these symptoms focusing on treatment,” not pathology).
- Structural Control: Adjusting the granularity (e.g., “Give me a simple 1-hop explanation” vs. “Give me a complex multi-hop logical deduction”).
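To make the two control types concrete, here is a minimal sketch of what a controlled abductive query might carry. The class and field names are illustrative assumptions for exposition, not CtrlHGen's actual interface:

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional

@dataclass(frozen=True)
class ControlledQuery:
    """One controllable abductive query.
    All names here are illustrative, not the paper's real API."""
    observations: FrozenSet[str]          # the entities to be explained
    semantic_focus: Optional[str] = None  # semantic control: relation/domain to use
    max_hops: int = 1                     # structural control: hypothesis depth
    allow_intersection: bool = False      # structural control: may the hypothesis use AND

# "Explain these three diseases, focusing on treatment, up to 2 hops."
query = ControlledQuery(
    observations=frozenset({"Lupus", "APS", "Sjogren"}),
    semantic_focus="treated_by",
    max_hops=2,
)
```

The point is simply that the user's intent becomes part of the model's input, rather than being left implicit.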
The Challenges of Controlling Logic
Building CtrlHGen wasn’t as simple as adding a prompt. When generating logical hypotheses on KGs, we faced two major technical hurdles:
1. Hypothesis Space Collapse
As logical explanations get longer and more complex, the number of valid hypotheses drops off a cliff. We call this Hypothesis Space Collapse. If we force the model to generate a long, complex structure and satisfy a specific semantic constraint, the intersection of valid answers becomes vanishingly small. The model struggles to learn because it rarely sees “winning” examples.
2. Hypothesis Oversensitivity
Logical reasoning is brittle. In natural language generation, if you swap a word, the sentence usually still makes sense. In logical hypothesis generation, swapping a single operator (e.g., changing AND to OR) or a single entity can change the conclusion entirely. This makes Reinforcement Learning (RL) unstable because the reward signal fluctuates wildly.
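A tiny toy example (with made-up player sets) shows how flipping a single operator craters a strict set-based reward:

```python
# Two one-hop branches of a hypothesis, as toy answer sets.
plays_guard = {"Curry", "Paul", "Westbrook"}
won_mvp = {"Curry", "Westbrook", "Jokic"}

and_answers = plays_guard & won_mvp  # AND -> intersection: {"Curry", "Westbrook"}
or_answers = plays_guard | won_mvp   # OR  -> union: four players

def jaccard(a, b):
    """Strict set similarity: |A ∩ B| / |A ∪ B|."""
    return len(a & b) / len(a | b)

gold = and_answers
# The correct operator scores a perfect 1.0 ...
assert jaccard(and_answers, gold) == 1.0
# ... while one flipped token (AND -> OR) halves the reward.
assert jaccard(or_answers, gold) == 0.5
```

A reward that swings this hard on a one-token change gives RL a very noisy training signal.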
Our Solution: The CtrlHGen Framework
To tackle these challenges, we proposed a two-stage training paradigm (Supervised Learning + Reinforcement Learning) with two key innovations:
Innovation A: Sub-Logical Decomposition
To fix the “Space Collapse,” we developed a data augmentation strategy. Instead of trying to teach the model complex logic all at once, we decompose complex hypotheses into simpler sub-logics.
By leveraging the semantic patterns in these simpler components, the model learns to construct complex logical structures step-by-step. It’s like teaching a student to write sentences before asking them to write a dissertation.
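The decomposition itself is easy to picture. Here is a minimal sketch using a toy nested-tuple encoding of hypotheses; the encoding and helper names are ours, not the paper's actual representation:

```python
# A hypothesis is a nested tuple: ("proj", relation, body) projects along a
# relation, "V?" is the free variable, and ("and", a, b) intersects branches.
def sub_logics(h):
    """Enumerate every strict sub-hypothesis of h, children before parents."""
    if h == "V?":
        return []
    op = h[0]
    if op == "proj":
        _, rel, body = h
        return sub_logics(body) + ([body] if body != "V?" else [])
    if op == "and":
        _, left, right = h
        return sub_logics(left) + sub_logics(right) + [left, right]
    raise ValueError(f"unknown operator: {op}")

# "Players who won the MVP award AND played for a coach who was a player":
hypothesis = ("and",
              ("proj", "won_award", "V?"),
              ("proj", "coached_by", ("proj", "former_player", "V?")))
parts = sub_logics(hypothesis)  # three simpler sub-logics to train on
```

Each sub-logic is a well-formed hypothesis in its own right, so every complex training example also yields several simple ones the model actually gets to "win" on.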
Innovation B: Smoothed Semantic Rewards
To fix “Oversensitivity,” we moved beyond the standard Jaccard index (which is very strict). We incorporated Dice and Overlap coefficients into our reward function.
- Jaccard demands a perfect match.
- Dice/Overlap provide “partial credit,” smoothing out the gradients.
We also added a Condition-Adherence Reward, which specifically rewards the model for obeying the user’s constraints (e.g., “Did you actually use the ‘treatment’ relation I asked for?”).
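Jaccard, Dice, and Overlap are all simple set statistics, so the smoothed reward is easy to sketch. The blending weights and the adherence term below are illustrative assumptions, not the paper's exact formulation:

```python
def jaccard(pred, gold):
    """Strict: |A ∩ B| / |A ∪ B| -- demands a near-perfect match."""
    return len(pred & gold) / len(pred | gold) if (pred or gold) else 1.0

def dice(pred, gold):
    """2|A ∩ B| / (|A| + |B|) -- gives more partial credit than Jaccard."""
    return 2 * len(pred & gold) / (len(pred) + len(gold)) if (pred or gold) else 1.0

def overlap(pred, gold):
    """|A ∩ B| / min(|A|, |B|) -- rewards containing the smaller set."""
    return len(pred & gold) / min(len(pred), len(gold)) if (pred and gold) else 0.0

def smoothed_reward(pred, gold, used_rels, required_rels,
                    w=(0.4, 0.3, 0.3), adherence_weight=0.5):
    """Blend of the three set similarities plus a condition-adherence
    bonus for using the requested relations (weights are illustrative)."""
    semantic = (w[0] * jaccard(pred, gold)
                + w[1] * dice(pred, gold)
                + w[2] * overlap(pred, gold))
    adherence = (len(required_rels & used_rels) / len(required_rels)
                 if required_rels else 1.0)
    return semantic + adherence_weight * adherence
```

On a near-miss such as `pred = {1, 2, 3}` vs `gold = {2, 3, 4}`, Jaccard gives 0.5 while Dice and Overlap both give about 0.67, so the blended signal degrades more gently than Jaccard alone.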
Real-World Example: Clinical Diagnosis
Let’s look at a case from our paper involving three diseases: Systemic Lupus Erythematosus, Antiphospholipid Syndrome, and Sjögren’s Syndrome.
(Insert Figure 1a from the paper here)
With CtrlHGen, we can generate different hypotheses for the same observation based on user intent:
- Control (Pathology): The model deduces these are all chronic, autoimmune diseases producing autoantibodies.
- Control (Treatment): The model hypothesizes that all three can be treated with hydroxychloroquine.
- Control (Susceptibility): The model identifies that these diseases occur more frequently in women carrying the HLA-DR3 allele.
All three are true. But only one is useful depending on whether you are a pathologist, a doctor prescribing meds, or a geneticist.
Results
We tested CtrlHGen on three benchmark datasets: FB15k-237, WN18RR, and DBpedia50.
The results were clear:
- High Adherence: The model respects user controls (semantic and structural) with high accuracy.
- Better Reasoning: Even when constrained, our model achieved superior semantic similarity scores compared to baselines. The “Sub-logical Decomposition” strategy significantly boosted performance on complex reasoning tasks.
Why This Matters
As we move toward more autonomous AI agents, the ability to reason abductively—to look at the world and guess why things are happening—is critical. But an agent that guesses randomly is dangerous.
By introducing controllability, we are moving abductive reasoning from a theoretical curiosity to a practical tool for domain-specific applications like healthcare, fault detection, and scientific research.
Want to dive deeper?
- 📄 Read the full paper: arXiv:2505.20948
Authors: Yisen Gao, Jiaxin Bai, Tianshi Zheng, Qingyun Sun, Ziwei Zhang, Jianxin Li, Yangqiu Song, Xingcheng Fu.