A DFD gives your entire team ; developers, architects, security engineers, and product managers ; a shared visual map of how data moves through your system. It exposes trust boundaries, entry points, data stores, and the exact paths an attacker could exploit. Without it, your threat modeling is essentially guesswork.

According to OWASP, DFDs are the most widely used diagramming technique in threat modeling, and they form the foundation of industry-standard methodologies like STRIDE, PASTA, and LINDDUN. Microsoft, NIST, and ISO 27001 all reference DFDs as a core component of secure system design.

In this guide, you’ll learn:

What a threat modeling DFD is and how it differs from other diagrams
The 5 core DFD elements and their symbols
DFD levels (Level 0, 1, and 2) and when to use each
A step-by-step process for creating a DFD from scratch
How to apply STRIDE and PASTA to your DFD
Real-world DFD examples across multiple industries
The best DFD tools available today
How to integrate DFDs into Agile and DevSecOps pipelines
DFD threat modeling for cloud, microservices, and IoT
Regulatory compliance requirements (ISO 27001, GDPR, CRA)
Limitations of DFDs and best practices

Whether you’re a security engineer building your first threat model or a DevSecOps professional looking to scale the practice across your organization, this guide covers everything you need.

Certified Threat Modeling Professional

Learn STRIDE, PASTA, VAST & RTMP frameworks in one certification.

What Is a Threat Modeling Data Flow Diagram?

A Data Flow Diagram (DFD) in the context of threat modeling is a structured visual representation that shows how data moves through a system, from external actors, through processes, into data stores, and back out again, while explicitly marking the trust boundaries where security controls must be enforced.

Unlike a general system architecture diagram, a DFD is purpose-built for security analysis. It doesn’t show infrastructure details like server specs, network topology, or deployment configurations. Instead, it focuses on data movement and trust relationships, the two things that matter most when identifying threats.

The goal of a threat modeling DFD is simple: make the invisible visible. It forces your team to answer critical questions:

Where does sensitive data enter the system?
Who or what can access it at each stage?
Where are the trust boundaries between components?
What happens to data when it crosses those boundaries?

DFD vs. Architecture Diagram: What’s the Difference?

This is one of the most common points of confusion for teams new to threat modeling.

Aspect	Data Flow Diagram (DFD)	Architecture Diagram
Purpose	Security analysis and threat identification	System design and infrastructure planning
Focus	Data movement and trust boundaries	Components, services, and infrastructure
Audience	Security engineers, developers, architects	Developers, DevOps, infrastructure teams
Detail level	Abstracted ; shows logical flows	Detailed ; shows physical/logical infrastructure
Symbols	5 standardized elements	Varies (AWS icons, UML, etc.)
Used in	Threat modeling, security reviews	System design, capacity planning

An architecture diagram might show that your application runs on AWS EC2 behind an Application Load Balancer with an RDS PostgreSQL database. A DFD, on the other hand, shows that a user (external entity) sends credentials to an authentication process, which reads from a user database (data store), with a trust boundary separating the public internet from the internal application zone.

Both are useful. But for threat modeling, the DFD is the right tool.

DFD vs. Process Flow Diagram (PFD): Which Should You Use?

A Process Flow Diagram (PFD) focuses on the sequence of operations or business processes ; it answers “what happens step by step?” A DFD, by contrast, focuses on data movement and trust ; it answers “where does data go and who can touch it?”

For threat modeling, DFDs are the standard. However, some advanced methodologies like VAST (Visual, Agile, and Simple Threat modeling) use PFDs alongside DFDs to capture both operational and data-centric threat perspectives. For most teams, starting with DFDs is the right approach.

Why DFDs Are the Industry Standard for Threat Modeling

DFDs have become the de facto standard for threat modeling for several reasons:

Simplicity: Only 5 symbol types means anyone on the team can read and contribute to a DFD ; no specialized training required.
Universality: DFDs are methodology-agnostic. They work with STRIDE, PASTA, LINDDUN, VAST, and OCTAVE.
Threat surface visibility: By explicitly marking trust boundaries, DFDs make attack surfaces immediately visible.
Hierarchical decomposition: DFDs can be created at multiple levels of abstraction, from a high-level context diagram down to individual process flows.

Regulatory acceptance: ISO 27001, NIST SP 800-154, and the EU Cyber Resilience Act all recognize DFD-based threat modeling as a valid security practice.

The 5 Core Elements of a DFD (With Symbols)

Every DFD ; regardless of the tool you use or the methodology you follow ; is built from exactly five types of components. This standardization is what makes DFDs so powerful: once you know these five elements, you can read and create any DFD.

1. External Entities (Interactors) ; The Rectangle

Symbol: A solid rectangle (sometimes labeled with a name)

What it represents: Any person, system, or organization that exists outside the scope of your system but interacts with it. External entities are the sources and destinations of data ; they initiate requests, receive responses, and represent the “edges” of your system boundary.

Examples:

End users (customers, employees, administrators)
Third-party APIs (payment gateways, OAuth providers, analytics services)
External systems (partner integrations, legacy systems, IoT devices)
Regulatory bodies or auditing systems

Security significance: External entities are inherently untrusted. Every data flow from an external entity into your system is a potential attack vector. When you see a rectangle in a DFD, your threat modeling mindset should immediately ask: “What could a malicious actor do if they controlled this entity?”

2. Processes ; The Circle or Rounded Rectangle

Symbol: A circle (Yourdon notation) or a rounded rectangle (Microsoft notation)

What it represents: Any component within your system that receives, transforms, or transmits data. Processes are the active components ; they do something with data. This could be a web application, a microservice, an API endpoint, a background job, or any piece of code that processes information.

Examples:

Web application frontend
Authentication service
Payment processing module
Data validation logic
API gateway
Background job scheduler

Security significance: Processes are where most application-layer vulnerabilities live. SQL injection, authentication bypass, business logic flaws, and privilege escalation all occur within processes. In STRIDE, processes are susceptible to Spoofing, Tampering, Repudiation, Information Disclosure, Denial of Service, and Elevation of Privilege ; all six threat categories.

3. Data Stores ; The Double Line or Database Symbol

Symbol: Two parallel horizontal lines (Yourdon notation) or a database cylinder (common in tools like draw.io)

What it represents: Any location where data is stored at rest within your system. Data stores are passive ; they don’t process data, they simply hold it. The key security question for every data store is: “Who can read from or write to this store, and is that access properly controlled?”

Examples:

Relational databases (PostgreSQL, MySQL, SQL Server)
NoSQL databases (MongoDB, DynamoDB, Redis)
File systems and blob storage (S3, Azure Blob, local disk)
Log aggregators (Elasticsearch, Splunk)
Message queues (Kafka, RabbitMQ, SQS)
Caches (Redis, Memcached)
Session stores

Security significance: Data stores are high-value targets because they hold sensitive data at rest. Threats include unauthorized read access (information disclosure), unauthorized write access (tampering), and data destruction (denial of service). Encryption at rest, access controls, and audit logging are the primary countermeasures.

4. Data Flows ; The Arrow

Symbol: A directed arrow (→) showing the direction of data movement; bidirectional flows use a double-headed arrow (↔)

What it represents: The movement of data between any two components in the DFD. Every arrow represents a communication channel ; an HTTP request, a database query, a message queue event, a file read/write, or any other form of data exchange.

Best practices for labeling data flows:

Include the protocol (HTTPS, TLS 1.3, gRPC, AMQP)
Note the data type (user credentials, PII, payment data, session tokens)
Indicate authentication method if applicable (JWT, API key, mTLS)
Mark whether the channel is encrypted or unencrypted

Examples:

User → Web App: HTTPS POST /login (username, password)
Web App → Auth DB: SQL SELECT (encrypted connection)
Payment Service → Payment Gateway: HTTPS/TLS 1.3 (card data, PCI DSS scope)

Security significance: Data flows crossing trust boundaries are the highest-risk elements in a DFD. Unencrypted flows, flows carrying sensitive data, and flows that cross from untrusted to trusted zones all warrant careful threat analysis. STRIDE threats like Tampering (MITM attacks) and Information Disclosure (eavesdropping) are primarily associated with data flows.

5. Trust Boundaries ; The Dotted Line

Symbol: A dashed or dotted line, often forming a rectangle or zone around components

What it represents: The boundary between two zones of different trust levels. When data crosses a trust boundary, the security assumptions change. Trust boundaries are arguably the most important element in a DFD because they define where security controls must be enforced.

Common trust boundaries:

Internet (untrusted) vs. internal network (trusted)
Public DMZ vs. private application tier
Application tier vs. database tier
User space vs. kernel space (in OS-level models)
On-premises vs. cloud environment
Your organization’s systems vs. third-party services
Different microservices with different privilege levels

Security significance: Every data flow that crosses a trust boundary is a potential attack vector. Trust boundaries are where authentication, authorization, input validation, encryption, and logging controls must be applied. If a DFD has no trust boundaries, it’s incomplete, and your threat model will miss the most critical attack surfaces.

DFD Levels Explained: Level 0, Level 1, and Level 2

One of the most powerful features of DFDs is their hierarchical nature. You don’t have to capture every detail of your system in a single diagram. Instead, you build DFDs at progressively increasing levels of detail, starting with a high-level overview and drilling down into specific subsystems as needed.

Level 0 – The Context Diagram (Big Picture View)

A Level 0 DFD, also called a Context Diagram, shows the entire system as a single process (one circle or rounded rectangle) surrounded by all external entities that interact with it. It’s the 30,000-foot view.

What it shows:

The system as a single black box
All external actors (users, third-party systems, external services)
The high-level data flows between external actors and the system
The overall system boundary

When to use it:

At the start of a threat modeling session to establish scope
When communicating with non-technical stakeholders
To identify all external dependencies and integration points
As the first step before decomposing into Level 1

Example ; E-commerce Platform Context Diagram:

External entities: Customer, Payment Gateway, Shipping Provider, Admin User, Email Service
System: E-commerce Platform (single process)
Data flows: Order requests, payment data, shipping confirmations, admin commands, email notifications

Level 1 – System Decomposition

A Level 1 DFD breaks the single system process from Level 0 into its major subsystems or functional components. This is where you start to see the internal structure of the system and where trust boundaries between components become visible.

What it shows:

Major application components (frontend, backend, authentication service, payment module, etc.)
Data stores (databases, caches, file storage)
Data flows between internal components and external entities
Trust boundaries between major zones (internet, DMZ, internal network, database tier)

When to use it:

For most threat modeling sessions ; this is the “sweet spot” level
When you need to identify threats at the component level
When applying STRIDE-per-element methodology
For security architecture reviews

Example ; E-commerce Platform Level 1:

Processes: Web Frontend, Order Management Service, Payment Processing Service, User Authentication Service, Inventory Service
Data Stores: User Database, Order Database, Product Catalog, Session Store
Trust Boundaries: Internet zone, Application zone, Database zone

Level 2 – Process-Level Detail

A Level 2 DFD decomposes individual processes from Level 1 into their internal sub-processes. This is the most detailed level and is typically only created for high-risk or complex components.

What it shows:

Internal logic of a specific process (e.g., the Authentication Service broken into: Input Validation → Credential Lookup → Token Generation → Session Creation)
Fine-grained data flows within a component
Internal data stores (in-memory caches, temporary files)
Micro-level trust boundaries

When to use it:

For high-risk components that warrant deep analysis (authentication, payment processing, admin functions)
When a Level 1 threat analysis reveals a component with many potential threats
For compliance purposes where detailed documentation is required
When performing code-level security reviews

When to Stop: Choosing the Right Level of Abstraction

A common mistake in threat modeling is going too deep too quickly. Here’s a practical guide:

Scenario	Recommended Level
New system, first threat model	Level 0 → Level 1
Existing system, security review	Level 1
High-risk component deep dive	Level 2
Compliance audit documentation	Level 1 + Level 2 for critical components
Agile sprint threat modeling	Level 1 (focused on the feature being built)
Executive/stakeholder communication	Level 0

Rule of thumb: Start at Level 0 to establish scope, build Level 1 for your primary threat analysis, and only go to Level 2 for components where Level 1 reveals significant risk or complexity.

Real-World DFD Examples for Threat Modeling

Theory is useful, but seeing DFDs applied to real systems is where the learning crystallizes. Here are four detailed examples across different domains.

Example 1 – Web Application Authentication Flow (OAuth2)

System: A SaaS application using OAuth2 for authentication with an external identity provider (e.g., Google, Okta, Auth0)

DFD Components:

External Entities: User (Browser), OAuth2 Provider (Google/Okta)
Processes: Frontend SPA, Backend API, Token Validation Service
Data Stores: Session Store (Redis), User Profile Database
Trust Boundaries:
- Internet Zone: User, OAuth2 Provider
- Application Zone: Frontend SPA, Backend API, Token Validation Service
- Data Zone: Session Store, User Profile DB

Key Data Flows:

User → Frontend SPA: HTTPS | Browser request
Frontend SPA → OAuth2 Provider: HTTPS | Authorization request (redirect)
OAuth2 Provider → Frontend SPA: HTTPS | Authorization code
Frontend SPA → Backend API: HTTPS | POST /auth/callback {code}
Backend API → OAuth2 Provider: HTTPS | Token exchange {code, client_secret}
OAuth2 Provider → Backend API: HTTPS | {access_token, id_token, refresh_token}
Backend API → Session Store: TLS | Store session {session_id, user_id, token}
Backend API → User Profile DB: TLS/SQL | Upsert user profile

Key Threats Identified:

Authorization Code Interception (T: Tampering) ; Attacker intercepts the authorization code via redirect URI manipulation → Mitigation: PKCE (Proof Key for Code Exchange)
Token Leakage (T: Information Disclosure) ; Access tokens stored in localStorage accessible via XSS → Mitigation: Store tokens in httpOnly cookies
Session Fixation (T: Elevation of Privilege) ; Attacker fixes session ID before authentication → Mitigation: Regenerate session ID after successful login

OAuth2 Provider Impersonation (T: Spoofing) ; Attacker creates fake OAuth2

DFD Threat Modeling Tools You Should Know

The right tool can dramatically accelerate your threat modeling process. Here’s a comprehensive overview of the most widely used DFD threat modeling tools.

Microsoft Threat Modeling Tool (TMT)

Best for: Teams working in Microsoft/Azure ecosystems; organizations following SDL (Security Development Lifecycle)

Key features:

Free, desktop-based tool (Windows only)
Built-in STRIDE threat generation ; automatically suggests threats based on your DFD elements
Pre-built stencils for common components (web apps, databases, Azure services)
Generates threat reports in HTML and CSV format
Supports custom threat templates

Limitations: Windows-only, no collaboration features, limited to STRIDE methodology, UI feels dated

Download: Microsoft Threat Modeling Tool

OWASP Threat Dragon

Best for: Open-source advocates, teams wanting a free cross-platform tool, OWASP methodology followers

Key features:

Free and open-source (MIT license)
Available as a web application and desktop app (Windows, macOS, Linux)
Supports STRIDE threat modeling
GitHub integration for storing threat models as code
Clean, modern UI
JSON-based threat model format (version controllable)

Limitations: Less automated threat generation than Microsoft TMT, smaller stencil library

Access: OWASP Threat Dragon

Threat Modeling as Code: pytm and Threagile

For teams that want to integrate threat modeling directly into their development workflow, Threat Modeling as Code tools allow you to define your DFD and threat model in code ; making it version-controllable, reviewable, and automatable.

pytm (Python Threat Modeling)

Define your DFD in Python code
Automatically generates DFD diagrams (using Graphviz) and threat reports
Supports STRIDE threat library
Integrates with CI/CD pipelines
Open-source (MIT license)
GitHub: OWASP pytm

from pytm import TM, Server, Datastore, Dataflow, Boundary, Actor

tm = TM(“E-commerce Threat Model”)
internet = Boundary(“Internet”)
app_zone = Boundary(“Application Zone”)

user = Actor(“Customer”, inBoundary=internet)
web_app = Server(“Web Application”, inBoundary=app_zone)
db = Datastore(“Order Database”, inBoundary=app_zone)

Dataflow(user, web_app, “HTTPS POST /order”)
Dataflow(web_app, db, “SQL INSERT order”)

tm.process()

Threagile

- YAML-based threat modeling for cloud-native architectures
- Generates DFD diagrams, threat reports, and risk tracking
- Built-in support for cloud components (AWS, Azure, GCP)
- Docker-based execution
- Open-source

GitHub: Threagile

Real-World Example: The E-commerce Adventure

For instance, an online store where the user provides the credit card details and later, when the shipment is made, the payment processor will verify the details. A DFD of such a case will lay bare threats such as credit card skimming and order tampering. By analyzing these risks, the store can implement encryption, fraud detection systems, and secure data storage. Example of a Threat Modeling Data Flow diagram.

Also Read, Threat Modeling Best Practices

DFD Threat Modeling in Agile and DevSecOps Pipelines

One of the most common objections to threat modeling is: “We don’t have time for it in our sprints.” This objection usually comes from teams that think of threat modeling as a heavyweight, waterfall-era activity: a multi-day workshop that produces a 50-page document.

Modern DevSecOps threat modeling is different. It’s lightweight, iterative, and integrated directly into the development workflow.

The cost of finding a threat at design time vs. production:

Design time: ~$80 to fix (architecture change)
Development: ~$240 to fix (code change + testing)
Testing: ~$960 to fix (code change + regression testing + re-testing)
Production: ~$7,600+ to fix (emergency patch + incident response + potential breach costs)

DFD Threat Modeling for Cloud, Microservices, and IoT

Modern architectures present unique challenges for DFD-based threat modeling. The principles remain the same, but the application requires adaptation.

Challenges of DFDs in Cloud-Native Architectures

Cloud-native systems introduce several complexities that traditional DFD approaches don’t fully address:

Shared Responsibility Model: In cloud environments, the trust boundary between your organization and the cloud provider is critical. AWS manages the hypervisor; you manage the OS and above. This boundary must be explicitly represented in your DFD.
Ephemeral Infrastructure: Containers and serverless functions spin up and down dynamically. Your DFD needs to represent logical components, not specific instances.
Managed Services: When you use AWS RDS, Azure Cosmos DB, or Google Cloud Pub/Sub, you’re trusting the cloud provider to secure the underlying infrastructure. This is a trust boundary that must be documented.
IAM Complexity: Cloud IAM roles, service accounts, and cross-account access create complex trust relationships that are difficult to represent in traditional DFDs.

Limitations of DFDs in Threat Modeling

DFDs are powerful, but they’re not perfect. Understanding their limitations helps you use them more effectively and know when to supplement them with other techniques.

False Positives and False Negatives in DFD-Based Threat Models

False positives occur when your STRIDE analysis identifies threats that aren’t actually exploitable in your specific context. For example, flagging “SQL injection” as a threat on a data flow that only carries pre-validated, non-user-supplied data.

False negatives are more dangerous ; they occur when real threats are missed because the DFD doesn’t capture enough detail. Common causes:

DFD is too high-level and misses important sub-processes
Trust boundaries are incorrectly drawn
External entities are missing (forgotten integrations, internal admin tools)
Data flows are unlabeled, hiding the sensitivity of transmitted data

Mitigation: Validate your DFD against the actual system architecture. Have developers review the DFD for accuracy. Use multiple threat modeling techniques (DFD + attack trees, or DFD + PASTA) for high-risk systems.

Conclusion

DFDs are dynamic documents that should evolve with your system. Keep them updated, share them with your team, and utilize them as a guiding compass through the ever-changing security landscape.

So, armed with your metaphorical markers, start mapping with DFDs. Illuminate potential vulnerabilities, build secure systems, sleep soundly knowing your data flows safely, and impress your fellow security nerds along the way. Upskill in Threat Modeling

The Certified Threat Modeling Professional (CTMP) course provides hands-on training through browser-based labs, 24/7 instructor support, and the best learning resources to upskill in Threat Modeling.

Start your journey mastering threat modeling today with Practical DevSecOps!

Misbah Thevarmannil

Security Research Writer

Misbah Thevarmannil is a content engineer who thrives at the intersection of creativity and technical writing expertise. She scripts articles on DevSecOps and Cybersecurity that are technically sound, clear, and concise to readers. With a knack for translating complex DevSecOps concepts into engaging narratives, she empowers developers and security professionals alike.

New Course

Certified AI Security Professional (CAISP)

The Certified AI Security Professional course offers an in-depth exploration of the risks associated with the AI sup…

Learn more

DevSecOps Live

Addressing the Top 10 Kubernetes Risks

Given the growth and adoption of Kubernetes, a number of projects have been published in the OWASP…

Learn more

All blogs

All blogs

Start your journey today and upgrade your security career

Gain advanced security skills through our certification courses. Upskill today and get certified to become the top 1% of cybersecurity engineers in the industry.

DevSecOps Courses

Certified DevSecOps Professional (CDP)Best Seller

Emerging Tech Security Courses

Certified AI Security Professional (CAISP)Best Seller

Certified MCP Security Expert (CMCPSE)Coming Soon

Application Security Courses

Certified Security Champion (CSC)New Course

Save on Bundle

In this blog

Share article:

Threat Modeling Data Flow Diagrams

Certified Threat Modeling Professional

What Is a Threat Modeling Data Flow Diagram?

DFD vs. Architecture Diagram: What’s the Difference?

DFD vs. Process Flow Diagram (PFD): Which Should You Use?

Why DFDs Are the Industry Standard for Threat Modeling

The 5 Core Elements of a DFD (With Symbols)

1. External Entities (Interactors) ; The Rectangle

2. Processes ; The Circle or Rounded Rectangle

3. Data Stores ; The Double Line or Database Symbol

4. Data Flows ; The Arrow

5. Trust Boundaries ; The Dotted Line

DFD Levels Explained: Level 0, Level 1, and Level 2

Level 0 – The Context Diagram (Big Picture View)

Level 1 – System Decomposition

Level 2 – Process-Level Detail

When to Stop: Choosing the Right Level of Abstraction

Example 1 – Web Application Authentication Flow (OAuth2)

DFD Threat Modeling Tools You Should Know

Microsoft Threat Modeling Tool (TMT)

OWASP Threat Dragon

Threat Modeling as Code: pytm and Threagile

Real-World Example: The E-commerce Adventure

DFD Threat Modeling in Agile and DevSecOps Pipelines

DFD Threat Modeling for Cloud, Microservices, and IoT

Challenges of DFDs in Cloud-Native Architectures

Limitations of DFDs in Threat Modeling

False Positives and False Negatives in DFD-Based Threat Models

Conclusion

Misbah Thevarmannil

Related articles

Start your journey today and upgrade your security career

Courses Learning Path