Windows File Collection Via Copy Utilities
The following analytic detects the use of Windows command-line copy utilities, such as xcopy, to systematically collect files from user directories and consolidate them into a centralized location on the system. This activity is often indicative of malicious behavior, as threat actors frequently use such commands to gather sensitive information, including documents with .doc, .docx, and .pdf extensions. The detection focuses on identifying recursive copy operations targeting user folders, such as Documents, Desktop, or other directories that commonly store personal or organizational files. Malware that performs this behavior typically attempts to evade detection by using legitimate Windows utilities, executing commands through cmd.exe or other scripting hosts, and writing the collected files to directories like C:\ProgramData or temporary storage locations. Once collected, the information may be staged for exfiltration, used for lateral movement, or leveraged for further compromise of the environment. By monitoring for these types of file collection patterns, security teams can identify suspicious activity early, differentiate between normal administrative tasks and potentially malicious scripts, and prevent sensitive data from being exfiltrated. This analytic is particularly relevant for environments where confidential documents are present and attackers may attempt to harvest them using built-in Windows tools.
MITRE ATT&CK
Detection Query
| tstats `security_content_summariesonly` count min(_time) as firstTime max(_time) as lastTime
from datamodel=Endpoint.Processes where
(
Processes.process_name IN ("copy.exe", "xcopy.exe")
OR
Processes.original_file_name IN ("copy.exe", "xcopy.exe")
)
Processes.process IN (
"*.7z*",
"*.bmp*",
"*.db*",
"*.doc*",
"*.gif*",
"*.gz*",
"*.jpg*",
"*.log*",
"*.pdf*",
"*.png*",
"*.ppt*",
"*.rar*",
"*.rtf*",
"*.tar*",
"*.txt*",
"*.xls*",
"*.zip*"
)
by Processes.action Processes.dest Processes.original_file_name
Processes.parent_process Processes.parent_process_exec Processes.parent_process_guid
Processes.parent_process_id Processes.parent_process_name Processes.parent_process_path
Processes.process Processes.process_exec Processes.process_guid Processes.process_hash
Processes.process_id Processes.process_integrity_level Processes.process_name Processes.process_path
Processes.user Processes.user_id Processes.vendor_product
| `drop_dm_object_name(Processes)`
| `security_content_ctime(firstTime)`
| `security_content_ctime(lastTime)`
| `windows_file_collection_via_copy_utilities_filter`
Author
Teoderick Contreras, Splunk
Created
2026-03-10
Data Sources
References
Tags
Raw Content
name: Windows File Collection Via Copy Utilities
id: dbdd556d-9da8-4c42-9980-8a3ffe25a758
version: 4
date: '2026-03-10'
author: Teoderick Contreras, Splunk
status: production
type: Anomaly
description: The following analytic detects the use of Windows command-line copy utilities, such as xcopy, to systematically collect files from user directories and consolidate them into a centralized location on the system. This activity is often indicative of malicious behavior, as threat actors frequently use such commands to gather sensitive information, including documents with .doc, .docx, and .pdf extensions. The detection focuses on identifying recursive copy operations targeting user folders, such as Documents, Desktop, or other directories that commonly store personal or organizational files. Malware that performs this behavior typically attempts to evade detection by using legitimate Windows utilities, executing commands through cmd.exe or other scripting hosts, and writing the collected files to directories like C:\ProgramData or temporary storage locations. Once collected, the information may be staged for exfiltration, used for lateral movement, or leveraged for further compromise of the environment. By monitoring for these types of file collection patterns, security teams can identify suspicious activity early, differentiate between normal administrative tasks and potentially malicious scripts, and prevent sensitive data from being exfiltrated. This analytic is particularly relevant for environments where confidential documents are present and attackers may attempt to harvest them using built-in Windows tools.
data_source:
- Sysmon EventID 1
- Windows Event Log Security 4688
- CrowdStrike ProcessRollup2
search: |
| tstats `security_content_summariesonly` count min(_time) as firstTime max(_time) as lastTime
from datamodel=Endpoint.Processes where
(
Processes.process_name IN ("copy.exe", "xcopy.exe")
OR
Processes.original_file_name IN ("copy.exe", "xcopy.exe")
)
Processes.process IN (
"*.7z*",
"*.bmp*",
"*.db*",
"*.doc*",
"*.gif*",
"*.gz*",
"*.jpg*",
"*.log*",
"*.pdf*",
"*.png*",
"*.ppt*",
"*.rar*",
"*.rtf*",
"*.tar*",
"*.txt*",
"*.xls*",
"*.zip*"
)
by Processes.action Processes.dest Processes.original_file_name
Processes.parent_process Processes.parent_process_exec Processes.parent_process_guid
Processes.parent_process_id Processes.parent_process_name Processes.parent_process_path
Processes.process Processes.process_exec Processes.process_guid Processes.process_hash
Processes.process_id Processes.process_integrity_level Processes.process_name Processes.process_path
Processes.user Processes.user_id Processes.vendor_product
| `drop_dm_object_name(Processes)`
| `security_content_ctime(firstTime)`
| `security_content_ctime(lastTime)`
| `windows_file_collection_via_copy_utilities_filter`
how_to_implement: The detection is based on data that originates from Endpoint Detection and Response (EDR) agents. These agents are designed to provide security-related telemetry from the endpoints where the agent is installed. To implement this search, you must ingest logs that contain the process GUID, process name, and parent process. Additionally, you must ingest complete command-line executions. These logs must be processed using the appropriate Splunk Technology Add-ons that are specific to the EDR product. The logs must also be mapped to the `Processes` node of the `Endpoint` data model. Use the Splunk Common Information Model (CIM) to normalize the field names and speed up the data modeling process.
known_false_positives: Administrators may execute this command for testing or auditing.
references:
- https://cert.gov.ua/article/6284730
drilldown_searches:
- name: View the detection results for - "$user$" and "$dest$"
search: '%original_detection_search% | search user = "$user$" dest = "$dest$"'
earliest_offset: $info_min_time$
latest_offset: $info_max_time$
- name: View risk events for the last 7 days for - "$user$" and "$dest$"
search: '| from datamodel Risk.All_Risk | search normalized_risk_object IN ("$user$", "$dest$") starthoursago=168 | stats count min(_time) as firstTime max(_time) as lastTime values(search_name) as "Search Name" values(risk_message) as "Risk Message" values(analyticstories) as "Analytic Stories" values(annotations._all) as "Annotations" values(annotations.mitre_attack.mitre_tactic) as "ATT&CK Tactics" by normalized_risk_object | `security_content_ctime(firstTime)` | `security_content_ctime(lastTime)`'
earliest_offset: $info_min_time$
latest_offset: $info_max_time$
rba:
message: An instance of $parent_process_name$ spawning $process_name$ was identified on endpoint $dest$ by user $user$ attempting to collect documents..
risk_objects:
- field: user
type: user
score: 20
- field: dest
type: system
score: 20
threat_objects:
- field: parent_process_name
type: parent_process_name
- field: process_name
type: process_name
tags:
analytic_story:
- LAMEHUG
asset_type: Endpoint
mitre_attack_id:
- T1119
product:
- Splunk Enterprise
- Splunk Enterprise Security
- Splunk Cloud
security_domain: endpoint
tests:
- name: True Positive Test
attack_data:
- data: https://media.githubusercontent.com/media/splunk/attack_data/master/datasets/malware/lamehug/T1119/doc_collection/xcopy_event.log
source: XmlWinEventLog:Microsoft-Windows-Sysmon/Operational
sourcetype: XmlWinEventLog