Security leaders want machines that can read adversaries the way analysts do. There is clear business value in AI-powered automation engines that can parse threat reports, extract the behaviors that matter, and pinpoint where organizations need to improve their defenses.
The rise of general-purpose LLMs makes this feel within reach. If an AI can summarize complex text, maybe it can read CTI and tell SOC analysts what to defend, too.
Unfortunately, that isn’t the case. Threat comprehension is too demanding for general-purpose LLMs. Even with larger context windows and more parameters, the fundamental architecture these systems use can’t reliably generate the results security leaders are looking for.
What Threat Comprehension Actually Requires
Threat comprehension is the disciplined act of turning unstructured reporting into operational intelligence that defenders can act on. It goes far beyond extracting tactics or summarizing malware behavior. It requires identifying the exact techniques an adversary used, the environmental conditions that made them possible, and the defensive signals that would reveal them in practice.
Effective threat comprehension also depends on strict alignment to shared frameworks. MITRE ATT&CK, D3FEND, and related models give organizations a common language for describing behavior, identifying defensive gaps, and measuring control effectiveness.
This structure matters. Each procedure must capture commands, file paths, parameters, privileges, software dependencies, and execution context in a repeatable format analysts can validate, and machines can compare at scale.
Without this level of precision and schema enforcement, downstream activities break. Coverage mapping, confidence scoring, prioritization, and CTEM workflows all rely on consistent, structured threat objects. General-purpose LLMs aren’t designed to maintain structure in that way.
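In concrete terms, a procedure-level threat object might look like the following sketch. The field names and values here are illustrative assumptions, not Tidal Cyber's actual schema:

```python
from dataclasses import dataclass, field

@dataclass
class Procedure:
    """Illustrative procedure-level threat object (field names are
    assumptions for illustration, not a real vendor schema)."""
    technique_id: str            # ATT&CK technique or sub-technique ID
    command: str                 # exact command line observed in the report
    parameters: list[str] = field(default_factory=list)
    file_paths: list[str] = field(default_factory=list)
    privileges: str = ""         # privilege level required (e.g. "SYSTEM")
    execution_context: str = ""  # host/process context the behavior ran in
    source_sentence: str = ""    # provenance: sentence it was extracted from

proc = Procedure(
    technique_id="T1059.001",
    command="powershell.exe -enc <base64>",
    privileges="user",
    source_sentence="The actor executed an encoded PowerShell command...",
)
```

Because every field is explicit and typed, analysts can validate objects like this one, and machines can compare thousands of them at scale.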
Where General-Purpose LLMs Break Down
General-purpose LLMs excel at producing fluent text, but they falter the moment precision, consistency, and structured interpretation become mandatory. Their underlying training and architecture work against the requirements of threat comprehension in four principal ways:
- Hallucinations. Threat reports can be vague or incomplete, and LLMs tend to “fill” information gaps on their own. They can invent ATT&CK technique IDs, mislabel malware families, or create relationships that were never present in the source material. In security workloads, these are not benign errors. They corrupt downstream coverage data and mislead defenders about what adversaries actually do.
- Schema drift. Long CTI documents require hundreds of extractions across dozens of object types. Generic LLMs drift from required fields, drop parameters, mis-format JSON, or mix vocabulary sources. They are designed to treat every prompt as an isolated task rather than as a contribution to a persistent, structured model of context.
- Improvised labels. ATT&CK and D3FEND mappings must be exact; even small inconsistencies break the ability to compare behaviors or measure coverage. LLMs tend to rename, reformat, or invent labels, producing objects that look correct but fail validation when put to the test.
- Broken provenance. Analysts need to trace every extracted procedure back to exact source sentences. Most LLMs cannot reliably track these references, making their outputs difficult to audit, trust, or operationalize at scale.
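These failure modes are exactly what strict validation catches. Below is a minimal sketch of the kind of checks involved, assuming a hypothetical object layout and a stand-in vocabulary (the real ATT&CK catalog contains hundreds of techniques):

```python
import re

# Stand-in for a full controlled vocabulary; these are real ATT&CK IDs,
# but the set itself is illustrative, not complete.
ALLOWED_TECHNIQUES = {"T1059.001", "T1566.002", "T1027"}
ID_PATTERN = re.compile(r"^T\d{4}(\.\d{3})?$")

def validate(obj: dict) -> list[str]:
    """Return validation errors for one extracted threat object."""
    errors = []
    tid = obj.get("technique_id", "")
    if not ID_PATTERN.match(tid):
        errors.append(f"malformed technique ID: {tid!r}")
    elif tid not in ALLOWED_TECHNIQUES:
        errors.append(f"technique ID not in controlled vocabulary: {tid!r}")
    if not obj.get("source_sentence"):
        errors.append("missing provenance: no source sentence")
    return errors

# A hallucinated or drifted object fails fast instead of silently
# polluting downstream coverage data:
print(validate({"technique_id": "T1059.01"}))
```

A drifted ID like `T1059.01` never reaches coverage math; it is rejected at the schema boundary, which is the guarantee general-purpose LLMs cannot make on their own.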
Why Structured Context and Controlled Vocabularies Matter
Analysts rely on models like MITRE ATT&CK and D3FEND because they provide a shared language for describing how adversaries operate and how defenders can counter them. These vocabularies standardize techniques, defensive controls, data sources, and relationships. Without them, analysts can't compare behaviors, measure coverage, or track gaps across tooling and environments.
This structure collapses if the underlying data is inconsistent. A single misaligned technique ID or an improvised label breaks the ability to correlate detections, optimize coverage, or evaluate defensive posture.
Engineering teams depend on precise, repeatable mappings to avoid false assumptions about what their stack can or cannot detect. Without strict schema enforcement, organizations end up with mismatched objects, duplicate behaviors, and coverage calculations they cannot trust.
Controlled vocabularies also make threat objects machine actionable. They allow procedures to feed cleanly into coverage maps, confidence scoring models, content validation, and broader CTEM workflows. If an extraction engine fails to preserve exact identifiers, parameters, and relationships, its outputs require painstaking manual revision before downstream systems can use them.
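A toy example shows why exact identifiers matter: coverage math is essentially set intersection over technique IDs, so a single drifted label silently erases a match. The IDs and numbers below are illustrative:

```python
# Adversary behaviors extracted from CTI vs. what the stack claims to cover.
observed = {"T1059.001", "T1547.001", "T1027"}
detected = {"T1059.001", "T1547.001"}

coverage = len(observed & detected) / len(observed)
print(f"coverage: {coverage:.0%}")  # 67%

# One improvised label ("t1059.1" instead of "T1059.001") zeroes a match
# without raising any error:
drifted = {"t1059.1", "T1547.001"}
print(f"coverage with drifted labels: "
      f"{len(observed & drifted) / len(observed):.0%}")  # 33%
```

The second calculation is wrong in a way no one notices: nothing fails, the number is simply lower than reality, which is precisely the kind of untrustworthy coverage figure described above.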
Inside NARC: A Supervised Extraction Model
Natural Attack Reading and Comprehension (NARC) is an industry-first AI engine purpose-built by Tidal Cyber to solve a single problem: converting unstructured threat reporting into precise, reusable, ATT&CK-aligned procedures correlated with groups, campaigns, and software.
Its internal architecture reflects that purpose. Here's a close look at the steps NARC follows to comprehend a report and extract procedural intelligence:
- Ingest and Segment the Report: NARC identifies and structures procedure-level details in CTI, DFIR, and other report formats.
- Procedure Extraction: Each extracted procedure includes commands, parameters, privileges, file paths, execution context, and environmental preconditions.
- Structured Threat Objects: NARC captures only what is explicit in the source text, such as commands, tools, parameters, and other stated attributes, rather than inferring details that aren't there.
- Map Everything to MITRE ATT&CK: NARC maps extracted objects to techniques or sub-techniques and links them to related entities. This creates a consistent vocabulary for detection, validation, and measurement.
- Build and Enrich the Knowledge Graph: A persistent knowledge graph shows how procedures, software, and actor entities correlate. NARC clusters similar procedures, tracks Procedure Sightings across reports, and reveals recurring behavior patterns across campaigns.
- Human Validation Closes the Loop: Analysts review extractions, correct mappings, and refine relationships. Their feedback trains the engine over time, ensuring accuracy while enabling scale beyond manual analysis.
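The steps above can be sketched as a pipeline skeleton. Every function here is a deliberately trivial stand-in for one stage; the real NARC internals are not public and are assumed to be far richer:

```python
import re

def segment(text: str) -> list[str]:
    """Stage 1 (stub): split a report into candidate sentences."""
    return [s.strip() for s in re.split(r"(?<=\.)\s+", text) if s.strip()]

def extract(sentence: str) -> dict:
    """Stage 2 (stub): pull explicit attributes only; a real extractor
    captures parameters, paths, privileges, and context as well."""
    return {"source_sentence": sentence,
            "commands": [w for w in sentence.split() if w.endswith(".exe")]}

def map_to_attack(proc: dict) -> dict:
    """Stage 4 (stub): attach a technique ID from a controlled vocabulary.
    The one-entry vocab below is purely illustrative."""
    vocab = {"powershell.exe": "T1059.001"}
    proc["technique_id"] = next(
        (vocab[c] for c in proc["commands"] if c in vocab), None)
    return proc

report = ("The actor ran powershell.exe with an encoded payload. "
          "Persistence followed.")
graph = [map_to_attack(extract(s)) for s in segment(report)]
print(graph[0]["technique_id"])  # T1059.001
```

Note that each object keeps its `source_sentence`, so even this toy output preserves the provenance chain that human validators need in the final step.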
From Specialized Models to Operational Defense
NARC’s structured outputs flow directly into Tidal Cyber’s defensive analytics engine, enabling workflows that generic LLMs cannot support. Coverage Maps use procedure-level attributes to show what a stack can detect or block, based on real adversary behavior details rather than theoretical technique support. This precision exposes meaningful gaps, eliminates guesswork, and gives detection engineers and architects a reliable view of defensive readiness.
The same dataset powers the Confidence Score and defensive stack optimization. Confidence Scores summarize mapped coverage against procedure-level behavior, showing teams where tools overlap, where configurations underperform, and where investments produce real value. This closes the loop between adversary behavior and defensive performance, turning threat intelligence into measurable operational improvement.
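One plausible way to ground such a score in procedure-level evidence, shown purely as an illustrative formula rather than Tidal Cyber's actual scoring model, is to weight each technique by how often its procedures are sighted across reports:

```python
def confidence_score(procedures: list[dict], detected_ids: set[str]) -> float:
    """Sighting-weighted coverage: frequently observed behaviors count
    for more than rare ones. Illustrative formula, not a vendor model."""
    total = sum(p["sightings"] for p in procedures)
    covered = sum(p["sightings"] for p in procedures
                  if p["technique_id"] in detected_ids)
    return covered / total if total else 0.0

# Hypothetical sighting counts across a corpus of reports:
procs = [
    {"technique_id": "T1059.001", "sightings": 8},
    {"technique_id": "T1027",     "sightings": 2},
]
print(confidence_score(procs, {"T1059.001"}))  # 0.8
```

Under this weighting, covering the behavior adversaries actually repeat matters more than covering rarely sighted techniques, which is the intuition behind prioritizing real adversary behavior over theoretical technique support.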
Conclusion
Tidal Cyber is the first true Threat-Led Defense platform built to flip the traditional defensive model by putting real adversary behavior at the center of your defense strategy.
By mapping techniques, sub-techniques, and procedures to ATT&CK, we reveal exactly where you’re exposed and how attackers actually operate. It’s a level of precision you’ve never had before, empowering your security team to proactively reduce risk and optimize high-impact security investments.
Threat-Led Defense is Tidal Cyber’s unique implementation of Threat-Informed Defense, enhanced with procedure-level granularity to make CTI more relevant and actionable.
