Correlating NTFS $LogFile and $UsnJrnl: A DFIR Practitioner’s Guide to Transactional Analysis
As Windows forensics practitioners, we rely on a constellation of artifacts to build a narrative. While we often treat artifacts like the $MFT, Shellbags, and LNK files as primary sources, we risk missing the underlying transactional truth of filesystem activity. Two NTFS metadata files, $LogFile and $UsnJrnl, provide this truth at different levels of abstraction. Analyzing them in isolation is useful; correlating them is a force multiplier, especially in complex cases involving rapid file changes, anti-forensics, or data destruction.
This post is for analysts who are already comfortable with NTFS fundamentals. We will skip the basics and dive directly into a reproducible workflow for fusing these two powerful data sources to build a high-fidelity timeline of filesystem events.
1) Why cover $LogFile and $UsnJrnl together#
At first glance, these artifacts seem redundant. Both log file system changes. However, their purpose and perspective are fundamentally different, and this difference is what makes their correlation so powerful.
$LogFile: The Ground Truth of Metadata Commits. The $LogFile is the NTFS transactional journal. It records low-level metadata operations to ensure filesystem integrity. Before NTFS modifies the $MFT or other metadata structures, it first writes a log record describing the intended change (a redo record) and often the information needed to reverse it (an undo record). This is the ground truth of how a change was committed to the disk. It’s about transactionality, not auditing. Its entries are ordered by Log Sequence Numbers (LSNs), providing an immutable sequence of events.
$UsnJrnl: The Auditable Summary of What Changed. The Update Sequence Number (USN) Journal, or $UsnJrnl, is a higher-level change log designed for applications (like indexing services or backup software) to track changes to files and directories. It records what happened to an object—it was created, deleted, renamed, or its data was overwritten. These events are summarized by Reason flags. It is an audit trail, not a transactional log.
Correlating them allows us to move from assertion to confirmation. The $UsnJrnl might assert a file was renamed, but the $LogFile can confirm the exact metadata transaction that modified the $FILENAME attribute in the parent directory’s index, often providing both the old and new names within a single transactional record. This fusion helps us verify event timing, reconstruct activity when one artifact is damaged or has rolled over, and confidently defeat techniques like timestomping.
Comparison with Alternative Timeline Approaches#
Before diving into our correlation methodology, it’s worth contextualizing this approach against other timeline analysis techniques commonly used in DFIR:
Traditional Supertimeline Approach (Plaso/log2timeline): These tools excel at aggregating timestamps from diverse sources (registry, logs, browsers, etc.) but typically treat NTFS artifacts as separate, independent sources. While comprehensive, they miss the transactional relationships between $LogFile and $UsnJrnl that our correlation exposes.
MFT-Only Analysis: Many practitioners rely heavily on $MFT timestamps (SI/FN attributes) for file system timeline analysis. However, these can be manipulated by timestomping. Our correlation approach provides a validation mechanism against such anti-forensics techniques.
Individual Artifact Analysis: Tools like MFTECmd or USN Journal parsers provide excellent single-artifact timelines. Our approach builds upon these foundations by creating cross-artifact relationships that reveal the complete story of filesystem transactions.
Event Log Correlation: Windows Event Logs provide high-level system activity but lack the granular filesystem detail. Our method fills the gap between high-level system events and low-level disk operations.
The correlation approach we present here is uniquely positioned to provide transaction-level proof of filesystem activity while maintaining the auditability that individual artifact analysis offers.
2) Quick primers#
This is not an exhaustive breakdown, but a refresher on the concepts critical for correlation.
$LogFile#
- Structure: A circular log file consisting of restart areas and log records. When the log fills, it wraps around, overwriting the oldest records.
- Records: Primarily redo (the new state of metadata) and undo (the old state). The most forensically relevant records detail operations against the $MFT, such as InitializeFileRecordSegment (file create), SetStandardInformation (timestamp changes), and updates to directory indexes ($INDEX_ROOT, $INDEX_ALLOCATION) for renames, creations, and deletions.
- Focus: It tracks changes to NTFS metadata structures, not the content of a file’s $DATA attribute. It tells you a file’s size changed, but not what data was written.
$UsnJrnl:$J#
- Structure: Stored in the $Extend\$UsnJrnl directory as an alternate data stream named $J. It is a log of variable-length USN_RECORD_V2 or USN_RECORD_V3 entries.
- Records: Each record is keyed by the File Reference Number (FRN) of the object that changed. It contains a timestamp, the FRN, the parent FRN, a USN, and a bitmask of Reason flags.
- Reason Flags: These flags describe the change, such as USN_REASON_FILE_CREATE, USN_REASON_FILE_DELETE, USN_REASON_RENAME_OLD_NAME, USN_REASON_RENAME_NEW_NAME, USN_REASON_DATA_OVERWRITE, and USN_REASON_CLOSE. Multiple reasons can be combined in a single record.
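If your parser exports only the raw bitmask, you can decode it yourself. The sketch below is a minimal DuckDB example using a handful of the documented USN_REASON_* constant values; the reason_mask column name is an assumption about your parser’s output.
-- Minimal sketch: decode a raw USN reason bitmask into readable flags.
-- Flag values are documented USN_REASON_* constants; reason_mask is an assumed column name.
SELECT
    reason_mask,
    CONCAT_WS(' | ',
        CASE WHEN reason_mask & 1          != 0 THEN 'DATA_OVERWRITE' END,    -- 0x00000001
        CASE WHEN reason_mask & 256        != 0 THEN 'FILE_CREATE' END,       -- 0x00000100
        CASE WHEN reason_mask & 512        != 0 THEN 'FILE_DELETE' END,       -- 0x00000200
        CASE WHEN reason_mask & 4096       != 0 THEN 'RENAME_OLD_NAME' END,   -- 0x00001000
        CASE WHEN reason_mask & 8192       != 0 THEN 'RENAME_NEW_NAME' END,   -- 0x00002000
        CASE WHEN reason_mask & 32768      != 0 THEN 'BASIC_INFO_CHANGE' END, -- 0x00008000
        CASE WHEN reason_mask & 2147483648 != 0 THEN 'CLOSE' END              -- 0x80000000
    ) AS reason_text
FROM read_csv_auto('usn.csv');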
A quick note on time: all timestamps discussed here are 64-bit Windows FILETIME values, representing 100-nanosecond intervals since January 1, 1601. Always work in UTC.
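If a parser hands you the raw 64-bit value instead of a formatted date, the conversion is mechanical. A minimal DuckDB sketch, assuming a raw_filetime integer column:
-- Convert raw FILETIME (100-ns ticks since 1601-01-01) to a UTC timestamp.
-- 11,644,473,600 seconds separate the 1601 and 1970 (Unix) epochs.
SELECT
    raw_filetime,
    make_timestamp((raw_filetime // 10) - 11644473600000000) AS event_time_utc
FROM read_csv_auto('usn.csv');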
3) Acquisition & persistence#
Both artifacts reside on the root of an NTFS volume.
- Locations: C:\$LogFile and C:\$Extend\$UsnJrnl (the data is in the $J ADS).
- Rotation: Both are circular buffers. Their size is configurable, but the default $LogFile is 64 MB, and $UsnJrnl is typically 32 MB. On a high-churn system (e.g., a build server), their history may only span hours or minutes. On a client workstation, it could be days or weeks.
- Journal State: The USN Journal can be disabled (fsutil usn deletejournal /d C:), which wipes the $J stream. It can also be reset during certain system events like major Windows upgrades or aggressive chkdsk operations. The $LogFile is essential for filesystem operation and cannot be disabled while the volume is mounted.
Practical Acquisition Checklist#
Acquiring these files from a live system requires bypassing OS locks.
- Use a raw acquisition tool: Tools like KAPE, FTK Imager, or Arsenal Image Mounter can parse the raw disk and extract locked metadata files. This is the preferred method for dead-box forensics.
- Leverage Volume Shadow Copies (VSS): On a live system, VSS provides a clean, point-in-time snapshot of the volume. You can mount a VSS and copy the files out directly without fighting locks. This is the soundest approach for live collection.
- Capture the full volume context: Do not just grab the two files. At a minimum, also acquire the $MFT from the same image or VSS. The $MFT is needed to resolve FRNs to full paths and check their allocation status.
- Document source and hashes: Record the source volume, snapshot time (if applicable), and compute cryptographic hashes of all collected files ($MFT, $LogFile, $UsnJrnl:$J) immediately upon acquisition.
- Consider TRIM: On SSDs, be aware that data for deleted files, including potentially older metadata records, may have been subject to TRIM commands. Acquiring from a live VSS is often safer than a post-shutdown raw image.
4) Correlation workflow and strategy#
The goal is a unified timeline of file system events that leverages the strengths of both artifacts while compensating for their individual limitations.
Core Strategy#
Our primary join key is the MFT File Reference Number (FRN). We will use timestamps as a secondary, fuzzy join condition and for final ordering. The correlation process transforms two separate audit trails into a single, cross-validated timeline.
Step-by-Step Correlation Process#
- Parse Each Artifact: Use dedicated tools to parse $LogFile and $UsnJrnl into an intermediate format like CSV or SQLite. Ensure your parser extracts the minimum required fields.
- Normalize Data: Convert all timestamps to UTC. Decode Reason flag bitmasks into human-readable text.
- Join on FRN: The core of the workflow is joining the two datasets on the FRN. Because timestamps can have minor variations and not all events appear in both logs, we use a time window (e.g., +/- 2 seconds) to associate related records.
- Enrich with $MFT: Use a parsed $MFT from the same point in time to resolve FRNs and Parent FRNs to their canonical file paths. Be aware these paths are volatile; a file’s name and location can change over its lifetime (see the join-and-enrich sketch after this list).
- Validate and Filter: Apply detection logic to identify high-value events and filter noise.
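A minimal sketch of steps 3 and 4 in DuckDB, assuming usn.csv and logfile.csv already carry the normalized field names used throughout this post (frn, timestamp, ts, reason_text, op) and that mft.csv exposes EntryNumber, ParentPath, FileName, and InUse columns; adjust the names to whatever your parsers actually emit:
-- Join $UsnJrnl and $LogFile records on FRN within a +/- 2 second window,
-- then enrich with the parsed $MFT to recover paths and allocation status.
SELECT
    u.timestamp,
    u.frn,
    u.reason_text,
    l.op,
    m.ParentPath || '\' || m.FileName AS resolved_path,
    m.InUse AS mft_record_allocated
FROM read_csv_auto('usn.csv') u
LEFT JOIN read_csv_auto('logfile.csv') l
    ON l.frn = u.frn
   AND ABS(DATEDIFF('second', u.timestamp, l.ts)) <= 2
LEFT JOIN read_csv_auto('mft.csv') m
    ON m.EntryNumber = u.frn   -- assumes frn is the entry number; mask off sequence bits if your parser emits the full 64-bit reference
ORDER BY u.timestamp;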
Minimum Required Fields#
From $LogFile:
- ts (UTC timestamp of the transaction commit)
- lsn (Log Sequence Number for strict ordering)
- op (Parsed operation, e.g., UpdateFileName, DeleteFileRecord)
- frn (Target MFT Reference Number)
- parent_frn (Parent MFT Reference Number, when applicable)
- old_name (For renames)
- new_name (For renames)
From $UsnJrnl:
- timestamp (UTC timestamp of the event)
- usn (Update Sequence Number)
- frn (Target MFT Reference Number)
- parent_frn (Parent MFT Reference Number)
- reason_mask (Integer bitmask of reasons)
- reason_text (Decoded reason flags, e.g., “RENAME_NEW_NAME | CLOSE”)
- filename (The name of the file/dir at the time of the record)
- attributes (File attributes like Hidden, System, etc.)
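Because every parser names its columns differently, it helps to normalize both exports into exactly this field set before correlating. A sketch of that normalization step in DuckDB; the quoted source column names are placeholders, so substitute whatever headers your parsers actually produce:
-- Map parser-specific CSV headers onto the minimum field set above.
CREATE OR REPLACE VIEW usn AS
SELECT
    "Update Timestamp"::TIMESTAMP AS "timestamp",
    "Update Sequence Number"      AS usn,
    "Entry Number"                AS frn,
    "Parent Entry Number"         AS parent_frn,
    "Update Reasons"              AS reason_text,
    "Name"                        AS filename
FROM read_csv_auto('usn_raw.csv');

CREATE OR REPLACE VIEW logfile AS
SELECT
    "CommitTime"::TIMESTAMP AS ts,
    "LSN"                   AS lsn,
    "Operation"             AS op,
    "EntryNumber"           AS frn,
    "OldName"               AS old_name,
    "NewName"               AS new_name
FROM read_csv_auto('logfile_raw.csv');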
Example Correlated Timeline#
Consider the lifecycle of a malicious payload, payload.tmp, created in C:\Users\user\AppData\Local\Temp.
Time (UTC) | FRN-Seq | Path (best guess) | USN Reason(s) | LogFile Op | Details
---------------------------|-----------|----------------------------------------|--------------------------------------|-------------------------|---------------------------------
2025-08-24 10:30:01.1234567 | 55123-5 | C:\...\Temp\payload.tmp | FILE_CREATE | InitializeFileRecordSegment |
2025-08-24 10:30:01.5678901 | 55123-5 | C:\...\Temp\payload.tmp | DATA_OVERWRITE, CLOSE | UpdateStandardInformation | Size updated
2025-08-24 10:31:15.2233445 | 55123-5 | C:\...\Temp\payload.tmp | RENAME_OLD_NAME | | Old Name: payload.tmp
2025-08-24 10:31:15.2233445 | 55123-5 | C:\...\Temp\svchost.exe | RENAME_NEW_NAME | UpdateFileName | Old: payload.tmp, New: svchost.exe
2025-08-24 10:31:15.9876543 | 55123-5 | C:\...\Temp\svchost.exe | CLOSE | |
2025-08-24 10:35:45.0101010 | 55123-5 | C:\...\Temp\svchost.exe | FILE_DELETE | DeleteFileRecord | File flags marked as deleted
2025-08-24 10:35:45.0101010 | 49887-12 | C:\...\Temp | | UpdateUsnJournal | Parent directory USN updated
This unified view clearly shows the FILE_CREATE in $UsnJrnl aligning with the InitializeFileRecordSegment from $LogFile. The rename is captured as a pair of RENAME_OLD/NEW_NAME events in $UsnJrnl and a single, definitive UpdateFileName transaction in $LogFile, confirming both names.
5) DFIR playbooks / case studies#
Ransomware Rename Storms#
Ransomware often encrypts a file and renames it by appending an extension (e.g., .txt -> .txt.locked). This generates a massive volume of RENAME_NEW_NAME events in $UsnJrnl in a short period. The $LogFile provides the transactional proof. By correlating, you can definitively link the original filename (RENAME_OLD_NAME) to the new encrypted filename (RENAME_NEW_NAME) and establish a precise timeline of the encryption process, even if the malware attempts to clear other logs.
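As a sketch of how to recover the original-to-encrypted name mapping from $UsnJrnl alone (field names as in the normalized schema above), you can pair each RENAME_OLD_NAME record with the next RENAME_NEW_NAME record for the same FRN:
-- Pair RENAME_OLD_NAME / RENAME_NEW_NAME records per FRN to map original -> encrypted names.
WITH renames AS (
    SELECT
        frn,
        usn,
        timestamp,
        filename,
        CONTAINS(reason_text, 'RENAME_OLD_NAME') AS is_old,
        LEAD(filename)    OVER (PARTITION BY frn ORDER BY usn) AS next_name,
        LEAD(reason_text) OVER (PARTITION BY frn ORDER BY usn) AS next_reason
    FROM read_csv_auto('usn.csv')
    WHERE CONTAINS(reason_text, 'RENAME_OLD_NAME')
       OR CONTAINS(reason_text, 'RENAME_NEW_NAME')
)
SELECT timestamp, frn, filename AS original_name, next_name AS encrypted_name
FROM renames
WHERE is_old AND CONTAINS(next_reason, 'RENAME_NEW_NAME')
ORDER BY timestamp;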
Mass Deletions#
An attacker or malicious insider might delete a large directory structure. The $UsnJrnl will show a flurry of FILE_DELETE reasons. The $LogFile provides deeper context. It will show the transactions that remove the $FILENAME attribute from the parent directory’s B-tree and deallocate the file record segment in the $MFT. This helps differentiate a true deletion from a directory being moved to another location (which would generate rename/move operations instead of deletions).
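A sketch for surfacing deletion bursts in DuckDB (thresholds are illustrative and should be tuned to the environment):
-- Minutes with unusually high FILE_DELETE volume, grouped by parent directory.
SELECT
    DATE_TRUNC('minute', timestamp) AS minute_bucket,
    parent_frn,
    COUNT(*) AS delete_count
FROM read_csv_auto('usn.csv')
WHERE CONTAINS(reason_text, 'FILE_DELETE')
GROUP BY minute_bucket, parent_frn
HAVING COUNT(*) > 100   -- illustrative threshold
ORDER BY delete_count DESC;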
Living-off-the-Land Staging#
Adversaries often stage tools and data in temporary directories. This involves creating files, writing to them in small chunks, and renaming them to look legitimate. The $UsnJrnl will show a sequence of FILE_CREATE, DATA_OVERWRITE, and RENAME_* events. Correlating with $LogFile can confirm these operations at the metadata level and help build a granular timeline of the staging process, identifying the exact sequence of tool drops and modifications.
Challenging Timestomping#
This is a classic use case. An attacker uses a tool to modify the $STANDARD_INFORMATION (SI) timestamps of a file to blend in. The $MFT will reflect these forged timestamps. The $UsnJrnl may log a BASIC_INFO_CHANGE event with a timestamp of when the change occurred, but the SI timestamps themselves are now unreliable. However, the $LogFile transaction that committed the timestomp (SetStandardInformation or UpdateStandardInformation) is recorded with an authentic LSN and commit time. You can pivot from the suspicious $UsnJrnl record to the corresponding $LogFile transaction to prove when the timestamps were altered, invalidating the attacker’s anti-forensics attempt.
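A sketch of that pivot in DuckDB, using the normalized fields from earlier; the $LogFile operation names vary by parser, so the two shown here are examples rather than canonical values:
-- Pivot from BASIC_INFO_CHANGE records to the $LogFile transactions that committed them.
SELECT
    u.timestamp AS change_observed_utc,
    u.frn,
    u.filename,
    l.lsn,
    l.ts AS logfile_commit_utc,
    l.op
FROM read_csv_auto('usn.csv') u
JOIN read_csv_auto('logfile.csv') l
  ON l.frn = u.frn
 AND ABS(DATEDIFF('second', u.timestamp, l.ts)) <= 2
WHERE CONTAINS(u.reason_text, 'BASIC_INFO_CHANGE')
  AND l.op IN ('SetStandardInformation', 'UpdateStandardInformation')   -- parser-dependent names
ORDER BY u.timestamp;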
6) Gotchas & limits#
This technique is powerful, but not infallible. Maintain professional skepticism.
Pitfall: FRN Reuse#
When a file is deleted, its MFT record (and thus its FRN) can be reallocated to a new file. The sequence number component of the FRN increments each time an MFT record is reused, indicating that the same record slot has been allocated to a different file. A naive join on FRN alone could incorrectly associate $LogFile records for an old file with $UsnJrnl records for a new file that inherited the same FRN base number.
Safeguards:
- Pay close attention to the FRN’s sequence number (e.g., 55123-5 -> 55123-6). A sequence increment on the same FRN base indicates the MFT record has been reallocated to a new file.
- Look for a FILE_DELETE event in $UsnJrnl followed by a FILE_CREATE for the same FRN base but different sequence number. This is a strong indicator of reuse (see the sequence-change sketch after this list).
- Enrich with the $MFT to check if the FRN is currently allocated and to which file path.
- Scrutinize temporal gaps: An FRN-sequence pair that shows activity, goes dormant for a significant period (e.g., weeks or months), and then reappears with a completely different filename and parent path may warrant skepticism, even without a sequence number change.
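A sketch of the sequence-change check referenced above, assuming your USN parser emits the entry number (frn) and sequence number (frn_seq) as separate columns:
-- Flag FRN bases whose sequence number changes within the journal window (record reuse).
WITH seq_history AS (
    SELECT
        frn,
        frn_seq,
        timestamp,
        filename,
        LAG(frn_seq) OVER (PARTITION BY frn ORDER BY usn) AS prev_seq
    FROM read_csv_auto('usn.csv')
)
SELECT frn, prev_seq, frn_seq, timestamp, filename
FROM seq_history
WHERE prev_seq IS NOT NULL AND frn_seq != prev_seq
ORDER BY timestamp;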
Other Critical Limitations#
- Finite History: Both artifacts are circular and can be overwritten quickly on busy systems. You are always working with a limited window of historical data.
- $UsnJrnl State: The USN journal can be disabled or deleted by an administrator or attacker, leaving a significant visibility gap. Its absence is, in itself, a finding.
- $LogFile Complexity: Parsing $LogFile is non-trivial. The richness of the output depends heavily on your tool’s ability to interpret different redo/undo record types and reconstruct context like filenames.
- Path Volatility: Reconstructing full paths from FRNs depends on having a point-in-time $MFT snapshot. Paths can change, and parent directories can be renamed or moved, complicating historical path analysis.
7) Modern tooling and correlation stack#
The forensic analysis landscape has evolved toward lightweight, scriptable workflows that can handle large datasets efficiently. Our approach leverages this evolution by combining specialized parsers with modern analytical databases.
Parsing Tools#
The first step is to convert the raw, binary artifact files into a structured format like CSV.
For $UsnJrnl:$J:
- Eric Zimmerman’s MFTECmd: A go-to tool in the field. While its primary purpose is parsing the $MFT, it can also parse the $UsnJrnl:$J stream from the same volume (point it at the $J file and supply the $MFT for path resolution). It can output to CSV, making it a perfect fit for this workflow.
- ANJP (Another UsnJrnl Parser): A dedicated USN Journal parser that is fast and produces easy-to-use output.
For $LogFile:
- TZWorks logfile: A commercial, command-line tool that is highly regarded for its detailed parsing of $LogFile records into a delimited format.
- KAPE’s LogFileParser module: The Kroll Artifact Parser and Extractor (KAPE) has a built-in module (LogFileParser.exe) that can process $LogFile and produce CSV output. This is an excellent option that integrates well into automated collection and processing pipelines.
The SQL-on-CSV Approach: Why This Works#
You might wonder why we recommend using SQL, a database language, to analyze simple CSV files. The answer lies in performance, flexibility, and scalability that traditional forensic tools often lack.
Traditional Challenges:
- Loading multi-gigabyte CSVs into spreadsheet applications is impractical and often crashes the application
- Custom scripting requires significant development time and is difficult to share or reproduce
- GUI-based forensic suites often lack the flexibility for complex correlation logic
- Memory constraints limit the size of datasets that can be processed
The SQL Solution:
Modern analytical databases like DuckDB solve these problems elegantly. DuckDB is a command-line analytical database that can execute complex SQL queries directly on CSV files without requiring a slow import process. It treats the CSVs as if they were database tables, enabling:
- Immediate Analysis: No time spent on database setup or imports
- Familiar Syntax: SQL is widely known and provides powerful joining, filtering, and aggregation capabilities
- Performance: DuckDB is optimized for analytical workloads and can handle datasets much larger than available RAM
- Reproducibility: SQL queries can be easily shared, version-controlled, and reproduced across different systems
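For example, from the DuckDB CLI you can inspect and count a multi-gigabyte export in place, with no import step:
-- Run inside the duckdb shell from the directory holding your exports.
DESCRIBE SELECT * FROM read_csv_auto('usn.csv');                      -- inspect the inferred schema
SELECT COUNT(*) AS total_usn_records FROM read_csv_auto('usn.csv');   -- sanity-check record volume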
Alternative: SQLite
SQLite provides a similar capability with a slightly different workflow. You would typically use its command-line interface to .import the CSV files into tables in a temporary database, then run queries against those tables. Both approaches achieve the same goal.
This “Parse to CSV → Query with SQL” model has become a cornerstone of modern, scalable digital forensics, allowing analysts to focus on the investigative questions rather than technical hurdles.
Performance Considerations for Large Datasets#
When dealing with very large datasets (multi-GB CSVs from high-activity systems), consider these optimization strategies:
Indexing Strategy:
-- Create indexes on frequently joined columns
CREATE INDEX idx_usn_frn ON usn_data(frn);
CREATE INDEX idx_usn_timestamp ON usn_data(timestamp);
CREATE INDEX idx_log_frn ON log_data(frn);
CREATE INDEX idx_log_timestamp ON log_data(ts);
Note: While DuckDB is highly optimized for querying CSVs directly, these indexing strategies are most applicable when, for very large or recurring investigations, you choose to import the CSV data into a persistent database like SQLite or PostgreSQL to maximize query performance.
Partitioning by Time: For datasets spanning weeks or months, partition your analysis by time windows:
-- Analyze one day at a time to reduce memory usage
SELECT * FROM correlation_view
WHERE timestamp BETWEEN '2025-08-24 00:00:00' AND '2025-08-24 23:59:59';
Selective Loading: Load only the columns you need for your specific analysis:
-- Load only essential columns to reduce memory footprint
SELECT frn, timestamp, reason_text, filename
FROM read_csv('usn.csv', columns={'frn': 'BIGINT', 'timestamp': 'TIMESTAMP', 'reason_text': 'VARCHAR', 'filename': 'VARCHAR'});
Streaming Analysis: For extremely large datasets, consider processing in chunks and writing results to intermediate tables rather than holding everything in memory.
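One way to sketch this in DuckDB is to materialize the correlation once and export it, so subsequent passes read a compact intermediate file instead of re-scanning the raw CSVs:
-- Materialize the correlated timeline once, then export it for later passes.
CREATE TABLE correlated AS
SELECT u.*, l.op, l.old_name, l.new_name
FROM read_csv_auto('usn.csv') u
LEFT JOIN read_csv_auto('logfile.csv') l
  ON l.frn = u.frn
 AND ABS(DATEDIFF('second', u.timestamp, l.ts)) <= 5;

COPY correlated TO 'correlated.parquet' (FORMAT PARQUET);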
8) Validation, detection, and visualization#
Once your data is correlated, the challenge shifts from technical implementation to analytical insight. This section provides practical techniques for finding meaningful patterns in your unified timeline.
Advanced Detection Queries#
These queries use DuckDB syntax and assume you have parsed your artifacts into usn.csv and logfile.csv.
Note: Queries involving aggregations across the entire dataset (like FRN Reuse and Gap Analysis) can be resource-intensive. It is often effective to first filter your data to a relevant time window before running these broader analytical queries.
1. Find Ransomware-like Rename Bursts
This query identifies suspicious volume spikes in file renames that could indicate ransomware encryption activity.
WITH rename_activity AS (
SELECT
DATE_TRUNC('minute', timestamp) AS minute_bucket,
COUNT(*) AS rename_count,
COUNT(DISTINCT parent_frn) AS affected_directories,
ARRAY_AGG(DISTINCT filename) AS sample_filenames
FROM read_csv_auto('usn.csv')
WHERE CONTAINS(reason_text, 'RENAME_NEW_NAME')
GROUP BY minute_bucket
)
SELECT
minute_bucket,
rename_count,
affected_directories,
sample_filenames[1:5] AS first_five_samples -- Show first 5 filenames
FROM rename_activity
WHERE rename_count > 50 -- Adjust threshold for your environment
AND affected_directories > 5 -- Multiple directories affected
ORDER BY rename_count DESC;
2. Correlate Specific Rename Operations with Transactional Proof
This query finds rename events in $UsnJrnl and correlates them with corresponding $LogFile transactions to provide definitive proof of the operation.
SELECT
u.timestamp AS event_time,
u.frn || '-' || u.frn_seq AS frn_reference,
u.filename AS current_name,
u.reason_text AS usn_reasons,
l.op AS transaction_type,
l.old_name AS original_name,
l.new_name AS renamed_to,
ABS(DATEDIFF('second', u.timestamp, l.ts)) AS time_delta_seconds
FROM read_csv_auto('usn.csv') u
INNER JOIN read_csv_auto('logfile.csv') l
ON l.frn = u.frn
AND ABS(DATEDIFF('second', u.timestamp, l.ts)) <= 5 -- 5-second correlation window
WHERE CONTAINS(u.reason_text, 'RENAME_NEW_NAME')
AND l.op IN ('UpdateFileName', 'UpdateFileNameInIndex')
ORDER BY u.timestamp;
3. Detect Potential FRN Reuse Scenarios
This query helps identify cases where an FRN may have been reused, which could lead to false correlations.
WITH frn_lifecycle AS (
SELECT
frn,
MIN(timestamp) AS first_seen,
MAX(timestamp) AS last_seen,
COUNT(*) AS event_count,
ARRAY_AGG(DISTINCT reason_text) AS all_reasons,
ARRAY_AGG(DISTINCT filename) AS all_filenames,
BOOL_OR(CONTAINS(reason_text, 'FILE_DELETE')) AS saw_delete,
BOOL_OR(CONTAINS(reason_text, 'FILE_CREATE')) AS saw_create
FROM read_csv_auto('usn.csv')
GROUP BY frn
)
SELECT
frn,
first_seen,
last_seen,
DATEDIFF('hour', first_seen, last_seen) AS lifespan_hours,
event_count,
ARRAY_LENGTH(all_filenames) AS unique_filenames,
all_filenames
FROM frn_lifecycle
WHERE ARRAY_LENGTH(all_filenames) > 1 -- FRN associated with multiple filenames
AND saw_delete AND saw_create -- Both create and delete reasons present
ORDER BY unique_filenames DESC, lifespan_hours ASC;
4. Timeline Gap Analysis
Identify periods where one artifact has activity but the other doesn’t, which could indicate selective log tampering or artifact corruption.
WITH hourly_activity AS (
SELECT
DATE_TRUNC('hour', timestamp) AS hour_bucket,
'USN' AS source,
COUNT(*) AS event_count
FROM read_csv_auto('usn.csv')
GROUP BY hour_bucket
UNION ALL
SELECT
DATE_TRUNC('hour', ts) AS hour_bucket,
'LOG' AS source,
COUNT(*) AS event_count
FROM read_csv_auto('logfile.csv')
GROUP BY hour_bucket
),
pivoted_activity AS (
SELECT
hour_bucket,
SUM(CASE WHEN source = 'USN' THEN event_count ELSE 0 END) AS usn_events,
SUM(CASE WHEN source = 'LOG' THEN event_count ELSE 0 END) AS log_events
FROM hourly_activity
GROUP BY hour_bucket
)
SELECT
hour_bucket,
usn_events,
log_events,
ABS(usn_events - log_events) AS activity_delta,
CASE
WHEN usn_events > 0 AND log_events = 0 THEN 'USN_ONLY'
WHEN log_events > 0 AND usn_events = 0 THEN 'LOG_ONLY'
WHEN ABS(usn_events - log_events) > 100 THEN 'SIGNIFICANT_MISMATCH'
ELSE 'NORMAL'
END AS anomaly_type
FROM pivoted_activity
WHERE anomaly_type != 'NORMAL'
ORDER BY hour_bucket;
Regex Patterns for Suspicious Activity#
When filtering filenames and paths in your analysis, these regex patterns can help identify potentially malicious activity:
Suspicious Extensions:
\.(bat|cmd|ps1|exe|dll|scr|com|pif|vbs|js|jar|tmp|dat|bin)$
Temporary/Staging Locations:
\\(temp|tmp|appdata\\local\\temp|windows\\temp|programdata)\\.*\.(exe|dll|bat|ps1)
Obfuscated Filenames:
^[a-f0-9]{8,}\.exe$|^[A-Z0-9]{8,}\.tmp$|^[0-9]+\.(exe|dll)$
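A sketch of applying these patterns in DuckDB with regexp_matches(), which performs a partial match (the 'i' option makes it case-insensitive):
-- Flag records whose filename matches any of the suspicious patterns above.
SELECT timestamp, filename, reason_text
FROM read_csv_auto('usn.csv')
WHERE regexp_matches(filename, '\.(bat|cmd|ps1|exe|dll|scr|com|pif|vbs|js|jar)$', 'i')
   OR regexp_matches(filename, '^[a-f0-9]{8,}\.exe$|^[0-9]+\.(exe|dll)$', 'i')
ORDER BY timestamp;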
False Positive Management#
When using these patterns, be aware of common false positives:
- System Updates: Windows Update and software installers create many temporary files with suspicious extensions
- Development Environments: Build processes generate numerous temporary executables
- Legitimate Software: Many applications use temporary files during normal operation
To reduce false positives:
-- Exclude known good paths and processes
WHERE regexp_matches(filename, '\.(exe|dll|bat)$', 'i')
AND NOT regexp_matches(parent_path, '\\(windows\\softwaredistribution|program files|programdata\\microsoft)', 'i')
AND NOT regexp_matches(filename, '^(setup|install|update|patch)', 'i')
AND event_count > 1 -- Focus on files with multiple operations
Visualization Concepts#
Timeline Density Plots: Create heat maps showing filesystem activity density over time. High-density periods often correlate with significant system events or malicious activity.
FRN Lifecycle Diagrams:
For suspicious files, create Sankey diagrams showing the flow: Create → Write → Rename → Execute → Delete. This visualization clearly shows the complete lifecycle and helps identify patterns.
Correlation Confidence Matrices:
Plot the time delta between correlated $UsnJrnl and $LogFile events. Tight correlations (< 1 second) indicate high confidence, while loose correlations may indicate system load or clock skew issues.
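A sketch of the underlying data for such a plot, assuming you exported your correlated timeline (including its time_delta column) to a hypothetical correlated.csv:
-- Histogram of USN-to-LogFile time deltas; tight buckets indicate high-confidence correlation.
SELECT
    time_delta,
    COUNT(*) AS pair_count
FROM read_csv_auto('correlated.csv')
GROUP BY time_delta
ORDER BY time_delta;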
Directory Tree Activity: Visualize activity at the directory level to identify targeted locations. Mass encryption or data exfiltration often shows distinct patterns when viewed hierarchically.
LAB: Hands-On Correlation Exercise#
You can generate your own test data to validate this workflow and understand the correlation patterns firsthand.
1. Generate Test Events#
Open a cmd prompt on a test machine and create a sequence of filesystem events that will generate clear patterns in both artifacts:
C:\> cd %TEMP%
C:\Users\test\AppData\Local\Temp> mkdir dfir_test
C:\Users\test\AppData\Local\Temp> cd dfir_test
C:\Users\test\AppData\Local\Temp\dfir_test> echo "malicious content" > file1.txt
C:\Users\test\AppData\Local\Temp\dfir_test> ren file1.txt file2.log
C:\Users\test\AppData\Local\Temp\dfir_test> ren file2.log totally_legit.dll
C:\Users\test\AppData\Local\Temp\dfir_test> echo "more data" >> totally_legit.dll
C:\Users\test\AppData\Local\Temp\dfir_test> copy totally_legit.dll backup.dll
C:\Users\test\AppData\Local\Temp\dfir_test> del totally_legit.dll
C:\Users\test\AppData\Local\Temp\dfir_test> ren backup.dll final.exe
C:\Users\test\AppData\Local\Temp\dfir_test> del final.exe
2. Acquire Artifacts#
Use a tool that can access live, locked files:
- KAPE: Use the NTFS target to collect all NTFS metadata
- FTK Imager: Create a logical image or use the “Export Files” feature
- VSS Method: Mount a recent Volume Shadow Copy and copy files directly
Collect: C:\$LogFile, C:\$Extend\$UsnJrnl:$J, and C:\$MFT
3. Parse to Structured Data#
# Parse the $MFT to CSV
MFTECmd.exe -f "C:\path\to\acquired\$MFT" --csv "C:\output"
# Parse the $UsnJrnl:$J separately, supplying the $MFT so parent paths resolve
MFTECmd.exe -f "C:\path\to\acquired\$J" -m "C:\path\to\acquired\$MFT" --csv "C:\output"
# For standalone USN Journal parsing, consider ANJP:
# ANJP.exe -f "C:\path\to\$UsnJrnl_$J" -o "C:\output\usn.csv"
# Parse LogFile (using KAPE's LogFileParser as example; switch names vary by
# version, so check the tool's help output for the exact syntax)
LogFileParser.exe -f "C:\$LogFile" -o "C:\output\logfile.csv"
4. Execute Correlation Analysis#
Use this comprehensive DuckDB query to analyze your test data:
-- Load and correlate the data
WITH correlated_events AS (
SELECT
COALESCE(u.frn, l.frn) AS frn,
COALESCE(u.timestamp, l.ts) AS event_time,
u.timestamp AS usn_timestamp,
l.ts AS log_timestamp,
u.reason_text,
l.op AS log_operation,
u.filename AS usn_filename,
l.old_name AS log_old_name,
l.new_name AS log_new_name,
u.parent_frn,
ABS(DATEDIFF('second', COALESCE(u.timestamp, l.ts), COALESCE(l.ts, u.timestamp))) AS time_delta
FROM read_csv('usn.csv', header=true, auto_detect=true) u
FULL OUTER JOIN read_csv('logfile.csv', header=true, auto_detect=true) l
ON l.frn = u.frn
AND ABS(DATEDIFF('second', u.timestamp, l.ts)) <= 5 -- 5-second correlation window
)
SELECT
frn,
event_time,
reason_text,
log_operation,
usn_filename,
log_old_name,
log_new_name,
time_delta,
CASE
WHEN reason_text IS NOT NULL AND log_operation IS NOT NULL THEN 'CORRELATED'
WHEN reason_text IS NOT NULL THEN 'USN_ONLY'
WHEN log_operation IS NOT NULL THEN 'LOG_ONLY'
ELSE 'UNKNOWN'
END AS correlation_status
FROM correlated_events
WHERE frn IN (
-- Find FRNs from your test directory
SELECT DISTINCT frn FROM read_csv('usn.csv', header=true, auto_detect=true)
WHERE regexp_matches(filename, '(file1|file2|totally_legit|backup|final)\.(txt|log|dll|exe)')
)
ORDER BY event_time;
5. Analysis and Validation#
Your correlated results should show clear patterns that demonstrate the power of this approach:
Expected Correlation Patterns:
- File Creation: FILE_CREATE in USN Journal should correlate with InitializeFileRecordSegment in LogFile
- Renames: Each RENAME_OLD_NAME/RENAME_NEW_NAME pair should correlate with UpdateFileName transactions
- Deletions: FILE_DELETE should align with DeleteFileRecord or similar metadata operations
- Data Writes: DATA_OVERWRITE events should correlate with UpdateStandardInformation (for size changes)
Validation Questions:
- Are all your test file operations captured in both artifacts?
- Do the timestamps align within your correlation window?
- Can you trace the complete lifecycle of each test file through both artifacts?
- Are there any events that appear in only one artifact? Why might that be?
Key Takeaways#
- Complementary Views: $LogFile is the how (low-level transactions), $UsnJrnl is the what (high-level audit).
- FRN is the Pivot: The MFT File Reference Number is the common key that links these two disparate logs, but always consider sequence numbers to avoid reuse issues.
- Immutable Sequence: The LSN order in $LogFile provides a ground-truth sequence of events that can be used to validate timestamps and defeat timestomping.
- Context is King: Always analyze these artifacts with a corresponding $MFT to resolve FRNs to paths and understand file allocation status.
- Beware of FRN Reuse: Always check for signs that an FRN has been reallocated to a new file to avoid drawing incorrect conclusions.
- Scale Considerations: Modern SQL-on-CSV approaches enable analysis of massive datasets that would be impossible with traditional tools.
- Validation is Critical: Cross-correlation provides confidence, but always validate findings with additional artifacts and context.
Final Checklist: Correlating $LogFile + $UsnJrnl in the Field#
Pre-Analysis#
- Acquire: Collect $LogFile, $UsnJrnl:$J, and $MFT from the same source (image or VSS).
- Document: Record acquisition time, source volume, and compute cryptographic hashes of all artifacts.
- Environment Setup: Install DuckDB or SQLite and ensure parsing tools are available and current.
Parsing and Preparation#
- Parse: Convert $LogFile and $UsnJrnl into structured CSV format using tools like MFTECmd and LogFileParser.
- Normalize: Convert all timestamps to UTC and ensure consistent field naming.
- Validate: Check CSV structure, verify expected column headers, and spot-check sample records.
Analysis Workflow#
- Initial Correlation: Perform basic FRN-based join with time window to create unified timeline.
- FRN Reuse Check: Query for potential FRN reuse scenarios using lifecycle analysis.
- Detection Queries: Apply behavioral detection logic to identify high-value events:
- Rename Storms: Look for mass rename activities indicating ransomware
- Deletion Patterns: Identify mass deletions or targeted file removal
- Timestomp Detection: Find evidence of timestamp manipulation
- Staging Activities: Detect file creation/rename patterns in temporary locations
Validation and Reporting#
- Cross-Verification: Validate key findings with additional artifacts (registry, logs, memory).
- Timeline Construction: Build comprehensive timeline with confidence indicators.
- Documentation: Record methodology, tools used, and any limitations or assumptions.
- Artifact Preservation: Retain all intermediate data (CSV files, SQL queries) for reproducibility.
Quality Assurance#
- Peer Review: Have findings reviewed by another analyst when possible.
- Alternative Explanations: Consider legitimate explanations for detected patterns.
- Confidence Assessment: Rate the confidence level of each finding based on correlation strength.
Advanced Considerations#
Memory and Performance Optimization#
For enterprise-scale investigations, consider partitioning analysis by time windows or implementing distributed processing using tools like Apache Spark with SQL interfaces.
Integration with Other Artifacts#
This correlation approach can be extended to include other time-based artifacts like Windows Event Logs, Prefetch files, or browser history to create comprehensive system timelines.
Automation Opportunities#
The SQL-based approach lends itself well to automation. Consider developing reusable query templates and integrating them into your organization’s standard forensic workflows.
Legal and Reporting Considerations#
The technical depth of this correlation method may require additional explanation in reports for non-technical audiences. Always be prepared to explain your methodology and the significance of cross-artifact validation.
This correlation methodology represents an evolution in NTFS forensics, moving beyond single-artifact analysis to provide transaction-level proof of filesystem activity. By combining the audit trail of $UsnJrnl with the transactional integrity of $LogFile, we achieve a level of forensic confidence that is difficult for adversaries to undermine and invaluable for complex investigations.