Automating the Grounding Layer: A Survey of Operational Knowledge for Enterprise Agents

Enterprise AI demos are easy. Enterprise AI deployment is not.

Most teams can wire up an LLM to some tools and ship a pilot. The hard part shows up later: the agent does not know what entities in your CRM mean, which API calls are valid under which policy, what changed in last week’s schema migration, or whether a proposed action violates a business rule written three years ago in a PDF nobody reads anymore.

We wrote a survey about that missing middle layer—operational grounding—and how LLMs can help construct and maintain it. The paper is titled Scaling Enterprise Agent Deployment: A Survey of LLM-Based Operational Grounding Construction. We also released a companion awesome list to track the growing literature: awesome-operational-grounding.

What is operational grounding?

Think of it as the structured layer that sits between raw enterprise data and agent behavior. We formalize it as \(\mathcal{O}=\langle V,C,A,G,M\rangle\):

Vocabulary (V) — what things are: schemas, taxonomies, DB/API type mappings
Constraints (C) — what states are valid: SHACL shapes, OWL axioms, business rules
Actions (A) — what the agent can do: PDDL domains, BPMN workflows, OpenAPI tool contracts
Governance (G) — what it is allowed to do: RBAC/ABAC, privacy policies, compliance rules
Maintenance (M) — how the layer evolves: versioning, provenance, temporal validity
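To make the tuple concrete, here is a minimal Python sketch of how the five components might be held together in one object. It is an illustration, not the survey's formalization: the class name, the field shapes, and the is_allowed helper are all assumptions, and the Invoice example data is made up.

```python
from dataclasses import dataclass, field


@dataclass
class OperationalGrounding:
    """Hypothetical container for O = <V, C, A, G, M>; all field shapes are assumptions."""
    vocabulary: dict[str, dict[str, str]] = field(default_factory=dict)  # V: entity -> {field: type}
    constraints: list[str] = field(default_factory=list)                 # C: rules over valid states
    actions: dict[str, dict[str, str]] = field(default_factory=dict)     # A: tool -> {param: type}
    governance: dict[str, set[str]] = field(default_factory=dict)        # G: role -> permitted tools
    maintenance: dict[str, str] = field(default_factory=dict)            # M: artifact -> version tag

    def is_allowed(self, role: str, action: str) -> bool:
        """True only if the action is defined in A and G grants it to the role."""
        return action in self.actions and action in self.governance.get(role, set())


# Made-up example data, for illustration only.
layer = OperationalGrounding(
    vocabulary={"Invoice": {"id": "string", "amount": "number"}},
    constraints=["Invoice.amount >= 0"],
    actions={"refund_invoice": {"id": "string"}},
    governance={"support_agent": {"refund_invoice"}},
    maintenance={"vocabulary": "v1"},
)
print(layer.is_allowed("support_agent", "refund_invoice"))
```

The point of keeping the components in one structure is that a question like "may this role call this tool?" touches two of them at once (A and G), which is why they have to stay consistent with each other.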

RAG gives agents facts. Operational grounding gives them operating instructions—linked artifacts that must stay consistent with each other as the enterprise changes.

We call the research problem Automated Operational Grounding Construction (AOGC): using LLMs to build, validate, or update these artifacts from heterogeneous sources—schema dumps, API docs, SOPs, policy documents, release notes.

The insight that surprised us

When we coded the literature systematically, a clear pattern emerged. Most published work (25 of 32 core methods) sits at Tier 2: fixed multi-step pipelines with external validators. The remaining 7 methods sit at Tier 1 and handle isolated subtasks, such as extracting a single OWL axiom or generating a BPMN fragment.

Tier 3—construction agents that autonomously decide what grounding artifacts are missing and maintain the linked layer over time—has zero core methods in the current corpus. That is the gap between research demos and production deployment.
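For readers who have not seen the Tier 2 pattern, a fixed pipeline with an external validator can be sketched roughly as follows. This is a generic draft-validate-retry loop, not a method from the survey. The function names are hypothetical, stub_draft stands in for an LLM call, and stub_validate stands in for an external checker such as a SHACL engine.

```python
def run_fixed_pipeline(source_text, draft, validate, max_rounds=3):
    """Draft an artifact, check it externally, and retry with the errors as feedback."""
    feedback = None
    for _ in range(max_rounds):
        artifact = draft(source_text, feedback)
        errors = validate(artifact)
        if not errors:
            return artifact
        feedback = "; ".join(errors)
    return None  # no valid artifact within the round budget


def stub_draft(source_text, feedback):
    # Stand-in for an LLM call: the first draft omits the entity, the retry adds it.
    if feedback is None:
        return {"rule": "amount >= 0"}
    return {"entity": "Invoice", "rule": "amount >= 0"}


def stub_validate(artifact):
    # Stand-in for an external validator: returns a list of error messages.
    return [] if "entity" in artifact else ["missing 'entity'"]


result = run_fixed_pipeline("Refunds cannot be negative.", stub_draft, stub_validate)
print(result)
```

The steps and their order are fixed by the programmer. Nothing in the loop decides which artifact to build next or notices that an existing one has gone stale, and that is the part Tier 3 would have to add.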

Another takeaway: success in this space should not be measured by extraction F1 alone. What matters is cross-artifact fidelity and downstream agent behavior. A beautifully extracted ontology that contradicts your OpenAPI spec is worse than useless—it gives the agent false confidence.
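A cross-artifact check can be very simple in principle. The sketch below compares a vocabulary against tool contracts and reports mismatches. Both artifacts are simplified to plain dictionaries rather than real OWL or OpenAPI documents, and the function name, dictionary layout, and example data are assumptions made for illustration.

```python
def find_conflicts(vocabulary, tool_contracts):
    """Return readable conflicts between vocabulary fields and tool parameters."""
    conflicts = []
    for tool, contract in tool_contracts.items():
        entity = contract["entity"]
        fields = vocabulary.get(entity)
        if fields is None:
            conflicts.append(f"{tool}: unknown entity {entity!r}")
            continue
        for param, param_type in contract["params"].items():
            if param not in fields:
                conflicts.append(f"{tool}: {entity}.{param} is not in the vocabulary")
            elif fields[param] != param_type:
                conflicts.append(
                    f"{tool}: {entity}.{param} is {fields[param]} in the vocabulary "
                    f"but {param_type} in the contract"
                )
    return conflicts


# Made-up example: the two artifacts disagree on the type of Invoice.amount.
vocabulary = {"Invoice": {"id": "string", "amount": "number"}}
tool_contracts = {
    "refund_invoice": {"entity": "Invoice", "params": {"id": "string", "amount": "string"}},
}
conflicts = find_conflicts(vocabulary, tool_contracts)
print(conflicts)
# prints ['refund_invoice: Invoice.amount is number in the vocabulary but string in the contract']
```

Each artifact here could score perfectly on its own extraction benchmark and the pair would still be wrong together, which is the failure that extraction F1 does not capture.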

Where the field is heading

The survey connects AOGC to adjacent threads we have been working on: schema induction from text (AutoSchemaKG), task-aligned graph construction (AutoGraph-R1), and the autonomy framing from our scientific discovery survey. It also distinguishes operational grounding from instance-level KG population and from general ontology engineering.

Industry signals point the same direction: MCP standardizing tool surfaces, forward-deployed engineering teams embedding with users, and growing demand for auditable, policy-constrained agent actions. The research community is building the pieces; the open problem is linking them into a maintainable whole.

Paper: Scaling Enterprise Agent Deployment (ResearchGate)

Curated resources: HKBU-KnowComp/awesome-operational-grounding

Contributions to the awesome list are welcome, especially Tier 3 work we may have missed.