splunk_escu
Anomaly

Windows File Collection Via Copy Utilities

The following analytic detects the use of Windows command-line copy utilities, such as xcopy, to collect files from user directories and consolidate them in a single location on the system. Threat actors frequently use such commands to gather sensitive information, including documents with .doc, .docx, and .pdf extensions. The search matches copy and xcopy command lines that reference common document, image, archive, and log file extensions, which is typical of recursive copy operations against user folders such as Documents or Desktop. Malware that performs this behavior typically tries to evade detection by using legitimate Windows utilities, executing commands through cmd.exe or other scripting hosts, and writing the collected files to directories such as C:\ProgramData or temporary storage locations. Once collected, the files may be staged for exfiltration, used for lateral movement, or leveraged for further compromise of the environment. Monitoring for these collection patterns helps security teams identify suspicious activity early and distinguish routine administrative tasks from potentially malicious scripts. This analytic is most relevant in environments where confidential documents are present and attackers may attempt to harvest them using built-in Windows tools.
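
As a triage aid for the behavior described above, the following Python sketch flags whether a matched command line looks like a recursive copy from a user folder into a staging location. These checks are hypothetical and are not part of the detection logic. The function name `triage_flags`, the regular expressions, and the list of staging paths are assumptions chosen for illustration.

```python
import re

# Hypothetical heuristics for analyst triage; the detection query does not apply them.
# xcopy's /S and /E switches copy subdirectories (/E also includes empty ones).
RECURSIVE_SWITCH = re.compile(r"(?<!\S)/[se]\b", re.IGNORECASE)
# Any path component named "Users", e.g. C:\Users\<name>\Documents.
USER_DIR = re.compile(r"\\users\\", re.IGNORECASE)
# Assumed staging locations: ProgramData and common temp directories.
STAGING_DIR = re.compile(r"\\programdata\\|\\temp\\|\\tmp\\", re.IGNORECASE)


def triage_flags(command_line):
    """Return which of the three heuristics fire for one command line."""
    return {
        "recursive": bool(RECURSIVE_SWITCH.search(command_line)),
        "user_dir": bool(USER_DIR.search(command_line)),
        "staging_dir": bool(STAGING_DIR.search(command_line)),
    }


if __name__ == "__main__":
    # Hypothetical command line, not taken from real telemetry.
    sample = r'xcopy "C:\Users\alice\Documents\*.pdf" "C:\ProgramData\info\" /s /y'
    print(triage_flags(sample))
```

A result with all three flags set would be a stronger indicator than an extension match alone, although none of the flags proves malicious intent.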

Detection Query

| tstats `security_content_summariesonly` count min(_time) as firstTime max(_time) as lastTime
from datamodel=Endpoint.Processes where
(
  Processes.process_name IN ("copy.exe", "xcopy.exe")
  OR
  Processes.original_file_name IN ("copy.exe", "xcopy.exe")
)
Processes.process IN (
  "*.7z*",
  "*.bmp*",
  "*.db*",
  "*.doc*",
  "*.gif*",
  "*.gz*",
  "*.jpg*",
  "*.log*",
  "*.pdf*",
  "*.png*",
  "*.ppt*",
  "*.rar*",
  "*.rtf*",
  "*.tar*",
  "*.txt*",
  "*.xls*",
  "*.zip*"
)
by Processes.action Processes.dest Processes.original_file_name
Processes.parent_process Processes.parent_process_exec Processes.parent_process_guid
Processes.parent_process_id Processes.parent_process_name Processes.parent_process_path
Processes.process Processes.process_exec Processes.process_guid Processes.process_hash
Processes.process_id Processes.process_integrity_level Processes.process_name Processes.process_path
Processes.user Processes.user_id Processes.vendor_product
| `drop_dm_object_name(Processes)`
| `security_content_ctime(firstTime)`
| `security_content_ctime(lastTime)`
| `windows_file_collection_via_copy_utilities_filter`
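
To make the matching logic concrete, the following Python sketch approximates the query's where clause outside Splunk. It is an illustration, not the Splunk implementation. The function name `matches_detection` and the sample command lines are hypothetical, and the sketch assumes that the wildcard matching is case-insensitive.

```python
from fnmatch import fnmatchcase

# Values copied from the detection query.
UTILITY_NAMES = {"copy.exe", "xcopy.exe"}
EXTENSION_PATTERNS = [
    "*.7z*", "*.bmp*", "*.db*", "*.doc*", "*.gif*", "*.gz*", "*.jpg*",
    "*.log*", "*.pdf*", "*.png*", "*.ppt*", "*.rar*", "*.rtf*", "*.tar*",
    "*.txt*", "*.xls*", "*.zip*",
]


def matches_detection(process_name, original_file_name, process):
    """Approximate the where clause for a single process event.

    Assumption: matching is case-insensitive, so all inputs are lowercased.
    """
    names = {(process_name or "").lower(), (original_file_name or "").lower()}
    # First condition: process_name OR original_file_name is a copy utility.
    if not names & UTILITY_NAMES:
        return False
    # Second condition (implicit AND): the command line contains a listed extension.
    command_line = (process or "").lower()
    return any(fnmatchcase(command_line, pattern) for pattern in EXTENSION_PATTERNS)


if __name__ == "__main__":
    # Hypothetical events: (process_name, original_file_name, process).
    samples = [
        ("xcopy.exe", "XCOPY.EXE", r'xcopy "C:\Users\alice\Documents\*.docx" "C:\ProgramData\info\" /s /y'),
        ("svc.exe", "XCOPY.EXE", r"svc.exe C:\Users\bob\Desktop\report.pdf C:\Temp\out"),
        ("xcopy.exe", None, r"xcopy C:\src C:\dst /e"),
    ]
    for name, original, cmd in samples:
        print(matches_detection(name, original, cmd), cmd)
```

Because the patterns are substring matches, a command line that copies a file such as `app.dbg` would also satisfy `*.db*` under these assumptions. The `windows_file_collection_via_copy_utilities_filter` macro can be used to tune out such cases.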

Author

Teoderick Contreras, Splunk

Data Sources

Sysmon EventID 1
Windows Event Log Security 4688
CrowdStrike ProcessRollup2

References

https://cert.gov.ua/article/6284730

Raw Content

name: Windows File Collection Via Copy Utilities
id: dbdd556d-9da8-4c42-9980-8a3ffe25a758
version: 6
creation_date: '2021-05-07'
modification_date: '2026-05-13'
author: Teoderick Contreras, Splunk
status: production
type: Anomaly
description: The following analytic detects the use of Windows command-line copy utilities, such as xcopy, to collect files from user directories and consolidate them in a single location on the system. Threat actors frequently use such commands to gather sensitive information, including documents with .doc, .docx, and .pdf extensions. The search matches copy and xcopy command lines that reference common document, image, archive, and log file extensions, which is typical of recursive copy operations against user folders such as Documents or Desktop. Malware that performs this behavior typically tries to evade detection by using legitimate Windows utilities, executing commands through cmd.exe or other scripting hosts, and writing the collected files to directories such as C:\ProgramData or temporary storage locations. Once collected, the files may be staged for exfiltration, used for lateral movement, or leveraged for further compromise of the environment. Monitoring for these collection patterns helps security teams identify suspicious activity early and distinguish routine administrative tasks from potentially malicious scripts. This analytic is most relevant in environments where confidential documents are present and attackers may attempt to harvest them using built-in Windows tools.
data_source:
    - Sysmon EventID 1
    - Windows Event Log Security 4688
    - CrowdStrike ProcessRollup2
search: |
    | tstats `security_content_summariesonly` count min(_time) as firstTime max(_time) as lastTime
    from datamodel=Endpoint.Processes where
    (
      Processes.process_name IN ("copy.exe", "xcopy.exe")
      OR
      Processes.original_file_name IN ("copy.exe", "xcopy.exe")
    )
    Processes.process IN (
      "*.7z*",
      "*.bmp*",
      "*.db*",
      "*.doc*",
      "*.gif*",
      "*.gz*",
      "*.jpg*",
      "*.log*",
      "*.pdf*",
      "*.png*",
      "*.ppt*",
      "*.rar*",
      "*.rtf*",
      "*.tar*",
      "*.txt*",
      "*.xls*",
      "*.zip*"
    )
    by Processes.action Processes.dest Processes.original_file_name
    Processes.parent_process Processes.parent_process_exec Processes.parent_process_guid
    Processes.parent_process_id Processes.parent_process_name Processes.parent_process_path
    Processes.process Processes.process_exec Processes.process_guid Processes.process_hash
    Processes.process_id Processes.process_integrity_level Processes.process_name Processes.process_path
    Processes.user Processes.user_id Processes.vendor_product
    | `drop_dm_object_name(Processes)`
    | `security_content_ctime(firstTime)`
    | `security_content_ctime(lastTime)`
    | `windows_file_collection_via_copy_utilities_filter`
how_to_implement: The detection is based on data that originates from Endpoint Detection and Response (EDR) agents. These agents are designed to provide security-related telemetry from the endpoints where the agent is installed. To implement this search, you must ingest logs that contain the process GUID, process name, and parent process. Additionally, you must ingest complete command-line executions. These logs must be processed using the appropriate Splunk Technology Add-ons that are specific to the EDR product. The logs must also be mapped to the `Processes` node of the `Endpoint` data model. Use the Splunk Common Information Model (CIM) to normalize the field names and speed up the data modeling process.
known_false_positives: Administrators may execute this command for testing or auditing.
references:
    - https://cert.gov.ua/article/6284730
drilldown_searches:
    - name: View the detection results for - "$user$" and "$dest$"
      search: '%original_detection_search% | search  user = "$user$" dest = "$dest$"'
      earliest_offset: $info_min_time$
      latest_offset: $info_max_time$
    - name: View risk events for the last 7 days for - "$user$" and "$dest$"
      search: '| from datamodel Risk.All_Risk | search normalized_risk_object IN ("$user$", "$dest$") | stats count min(_time) as firstTime max(_time) as lastTime values(search_name) as "Search Name" values(risk_message) as "Risk Message" values(analyticstories) as "Analytic Stories" values(annotations._all) as "Annotations" values(annotations.mitre_attack.mitre_tactic) as "ATT&CK Tactics" by normalized_risk_object | `security_content_ctime(firstTime)` | `security_content_ctime(lastTime)`'
      earliest_offset: 7d
      latest_offset: "0"
intermediate_findings:
    entities:
        - field: user
          type: user
          score: 20
          message: An instance of $parent_process_name$ spawning $process_name$ was identified on endpoint $dest$ by user $user$ attempting to collect documents.
        - field: dest
          type: system
          score: 20
          message: An instance of $parent_process_name$ spawning $process_name$ was identified on endpoint $dest$ by user $user$ attempting to collect documents.
threat_objects:
    - field: parent_process_name
      type: parent_process_name
    - field: process_name
      type: process_name
analytic_story:
    - LAMEHUG
asset_type: Endpoint
mitre_attack_id:
    - T1119
product:
    - Splunk Enterprise
    - Splunk Enterprise Security
    - Splunk Cloud
category: endpoint
security_domain: endpoint
tests:
    - name: True Positive Test
      attack_data:
        - data: https://media.githubusercontent.com/media/splunk/attack_data/master/datasets/malware/lamehug/T1119/doc_collection/xcopy_event.log
          source: XmlWinEventLog:Microsoft-Windows-Sysmon/Operational
          sourcetype: XmlWinEventLog
      test_type: unit