GTIG Reports First Autonomous GenAI-Powered Malware in the Wild: PromptSpy Uses Gemini API for Real-Time Device Control; Russia-Nexus CANFAIL and LONGSTREAM Deploy LLM-Generated Decoy Code
Date: 2026-05-25
Tags: malware, nation-state
Executive Summary
Google Threat Intelligence Group (GTIG) published a report on May 11, 2026, documenting the shift from experimental to industrial-scale use of generative AI in adversarial workflows. The report describes previously unreported capabilities in PromptSpy, an Android backdoor that uses the Gemini API to navigate victim devices autonomously in real time, including biometric data capture and anti-uninstall overlays. Separately, GTIG confirmed that the Russia-nexus malware families CANFAIL and LONGSTREAM use LLM-generated decoy code to obfuscate malicious payloads in operations targeting Ukrainian organizations. Defenders should hunt for Gemini API calls from non-standard applications and implement detection for LLM-characteristic code patterns in malware samples.
Campaign Summary
| Field | Detail |
|---|---|
| Campaign / Malware | PromptSpy (Android), CANFAIL (Russia-nexus), LONGSTREAM (Russia-nexus) |
| Actor / Attribution | PromptSpy: Unknown (samples from Hong Kong and Argentina). CANFAIL/LONGSTREAM: Russia-nexus (high confidence, per GTIG) |
| Target | PromptSpy: Android users. CANFAIL/LONGSTREAM: Ukrainian organizations |
| Vector | PromptSpy: sideloaded APK via dedicated website. CANFAIL/LONGSTREAM: targeted intrusion operations |
| Status | Active |
| First Observed | PromptSpy: 2026-01-13 (VT upload). CANFAIL/LONGSTREAM: ongoing since at least 2025 |
Detailed Findings
PromptSpy: Autonomous GenAI-Powered Android Backdoor
ESET researchers identified PromptSpy in February 2026 and described it as the first known Android malware to abuse generative AI in its execution flow. According to ESET researcher Lukas Stefanko, PromptSpy uses Google's Gemini to interpret on-screen elements and generate step-by-step instructions for UI manipulation, which it uses to maintain persistence.
GTIG's May 11 report revealed additional capabilities that go significantly beyond the initial ESET findings. According to GTIG, PromptSpy contains an autonomous agent module called GeminiAutomationAgent. The module uses the Accessibility API to serialize the device's visible user interface hierarchy into XML and sends it to the gemini-2.5-flash-lite model. Gemini returns structured JSON responses containing action types and spatial coordinates, which PromptSpy parses to simulate physical gestures: clicks, swipes, and navigation. The model interprets the device state and generates commands in real time without human supervision.
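For defenders, the traffic pattern described above suggests a simple hunt: list the applications that contact the Gemini API host but are not expected to. The following is a minimal sketch, assuming per-app network logs are available as CSV with the columns `device_id`, `app_package`, and `dest_host`. That log schema, the allowlist, and every package name are hypothetical and would need to be mapped onto whatever an MDM or proxy actually exports.

```python
import csv
import io

GEMINI_API_HOST = "generativelanguage.googleapis.com"

# Hypothetical allowlist of packages expected to call the Gemini API here.
APPROVED_PACKAGES = {"com.example.corp.assistant"}


def find_unexpected_gemini_callers(log_text, approved=APPROVED_PACKAGES):
    """Return unique (device_id, app_package) pairs that contacted the
    Gemini API host from a package outside the allowlist, in first-seen order."""
    hits = []
    seen = set()
    for row in csv.DictReader(io.StringIO(log_text)):
        # Normalize case and strip a trailing dot from fully qualified names.
        host = row["dest_host"].strip().lower().rstrip(".")
        if host != GEMINI_API_HOST:
            continue
        key = (row["device_id"], row["app_package"])
        if row["app_package"] not in approved and key not in seen:
            seen.add(key)
            hits.append(key)
    return hits


# Assumed log layout, for illustration only.
SAMPLE_LOG = """device_id,app_package,dest_host
dev-01,com.example.corp.assistant,generativelanguage.googleapis.com
dev-02,com.example.unknown.updater,generativelanguage.googleapis.com
dev-02,com.example.unknown.updater,generativelanguage.googleapis.com
dev-03,com.example.mail,mail.example.com
"""

if __name__ == "__main__":
    for device, package in find_unexpected_gemini_callers(SAMPLE_LOG):
        print(f"{device}: unexpected Gemini API caller {package}")
```

A hit is only a lead for review, since legitimate applications also embed the Gemini API; the allowlist is what keeps the output manageable.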
According to Google, PromptSpy can capture victim biometric data to replay authentication gestures and regain access to compromised devices. If a victim attempts to uninstall the malware, it identifies the on-screen coordinates of the uninstall button and renders an invisible overlay that intercepts touch events, making the button appear unresponsive. Its command-and-control infrastructure, including Gemini API keys and VNC relay servers, can be updated dynamically at runtime.
ESET noted that PromptSpy has not yet appeared in its wider telemetry, suggesting it may still be a proof of concept, and confirmed that it has not been observed on Google Play. Google stated it has disabled the assets associated with this activity.
CANFAIL and LONGSTREAM: LLM-Generated Code Obfuscation
GTIG confirmed that Russia-nexus threat actors targeting Ukrainian organizations are deploying two malware families that use LLM-generated decoy code to obfuscate their malicious functionality.
According to Help Net Security, CANFAIL contains LLM-authored comments that explicitly describe blocks of code as "unused filler," indicating the threat actor specifically requested the model generate large volumes of inert code for obfuscation purposes.
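Comments of this kind lend themselves to a simple static check. The sketch below scans full-line comments for filler-style wording. Only the phrase "unused filler" comes from the reporting; the other phrases and the set of comment markers are illustrative assumptions, since the sources do not state which language CANFAIL is written in.

```python
import re

# Only "unused filler" is quoted in the reporting; the rest are assumed variants.
FILLER_PHRASES = ("unused filler", "filler code", "placeholder code", "dummy code")

# Full-line comments using common markers: #, //, --, ' and REM.
COMMENT_RE = re.compile(r"^\s*(#|//|--|'|rem\b)\s*(?P<text>.*)$", re.IGNORECASE)


def find_filler_comments(source):
    """Return (line_number, stripped_line) for comments that label code as filler."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        match = COMMENT_RE.match(line)
        if match and any(p in match.group("text").lower() for p in FILLER_PHRASES):
            hits.append((lineno, line.strip()))
    return hits


# Made-up sample, not taken from any real malware.
SAMPLE = """\
# Initialize configuration
x = 1
# The following block is unused filler and has no effect
y = x + 1
// Filler code: loop does nothing
"""

if __name__ == "__main__":
    for lineno, text in find_filler_comments(SAMPLE):
        print(f"line {lineno}: {text}")
```

The check is deliberately narrow: it ignores trailing comments and block comments, and an actor can defeat it by stripping comments, so it works best as one signal among several.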
According to GTIG, LONGSTREAM contains 32 separate instances of code querying the system's daylight saving time status, a repetitive and functionally irrelevant pattern designed to make the script appear benign to analysts. This padding technique exploits the tendency of automated analysis tools and human reviewers to treat a large volume of ordinary-looking code as a sign of legitimacy.
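Repetition on this scale is straightforward to measure. The sketch below counts daylight saving time lookups and reports statements that recur many times in a script. The identifier list is an assumption covering common ways to query DST status (the .NET `IsDaylightSavingTime` method, the C `tm_isdst` field, the Win32 `DaylightBias` field, and the WMI `DaylightInEffect` property); the reporting does not say which call LONGSTREAM uses, and the sample script is invented.

```python
import re
from collections import Counter

# Assumed DST-related identifiers; not taken from the LONGSTREAM samples.
DST_PATTERN = re.compile(
    r"IsDaylightSavingTime|DaylightBias|DaylightInEffect|tm_isdst|\bisdst\b",
    re.IGNORECASE,
)


def count_dst_queries(source):
    """Count every occurrence of a DST-related identifier in the source."""
    return len(DST_PATTERN.findall(source))


def repeated_statements(source, min_repeats=10):
    """Return {statement: count} for non-blank lines repeated at least
    min_repeats times, after collapsing runs of whitespace."""
    counts = Counter(
        " ".join(line.split()) for line in source.splitlines() if line.strip()
    )
    return {stmt: n for stmt, n in counts.items() if n >= min_repeats}


# Invented script: 32 identical DST lookups followed by one other statement.
SAMPLE = "\n".join(
    ["$x = (Get-Date).IsDaylightSavingTime()"] * 32 + ["Write-Output $x"]
)

if __name__ == "__main__":
    print("DST queries:", count_dst_queries(SAMPLE))
    for stmt, n in repeated_statements(SAMPLE).items():
        print(f"{n}x {stmt}")
```

The default threshold of 10 is arbitrary and should be tuned against a corpus of benign scripts, because generated or unrolled code can repeat lines legitimately.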
APT45 AI-Industrialized Vulnerability Research
GTIG also documented North Korean state actor APT45 sending thousands of recursive prompts to AI models to systematically analyze CVEs and validate proof-of-concept exploits. According to the Korea JoongAng Daily, this pattern represents efforts to automate vulnerability analysis and attack-code testing at industrial scale.
MITRE ATT&CK Mapping
| Technique | ID | Context |
|---|---|---|
| Input Injection (Mobile) | T1516 | PromptSpy abuses the Accessibility API to serialize the UI hierarchy for Gemini and to inject simulated gestures |
| Input Capture | T1056 | PromptSpy captures biometric data for authentication replay |
| Obfuscated Files or Information | T1027 | CANFAIL and LONGSTREAM use LLM-generated junk code for obfuscation |
| Application Layer Protocol: Web Protocols | T1071.001 | PromptSpy communicates with Gemini API over HTTPS |
| Impair Defenses: Disable or Modify Tools | T1562.001 | PromptSpy renders invisible overlay to prevent uninstallation |
| Develop Capabilities: Exploits | T1587.004 | APT45 uses AI to automate CVE analysis and PoC validation |
IOCs
Domains
No domain IOCs published by source
Full URL Paths
No URL IOCs published by source
Splunk Format
No IOCs available for Splunk query
File Hashes
No hash IOCs published by source
Detection Recommendations
- Monitor for outbound API calls to generativelanguage.googleapis.com from non-standard applications or processes, particularly on Android endpoints managed via MDM, and flag the same traffic from servers or endpoints where it is not expected.
- Alert on Accessibility Service registrations by applications not on the organization's approved app list.
- Hunt for Android APKs that declare an Accessibility Service, request the INTERNET permission, and contain VNC-related functionality.
- For CANFAIL/LONGSTREAM, implement static analysis rules that flag source code containing LLM-characteristic patterns: educational docstrings in operational code, comment blocks explicitly labeled as filler, and heavy repetition of functionally irrelevant system calls such as repeated daylight saving time queries.
- Monitor for scripts containing 10 or more identical system information queries that serve no functional purpose.
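The APK hunt recommended above can be approximated with a manifest check. This is a minimal sketch, assuming the APK's `AndroidManifest.xml` has already been decoded to plain XML (for example with apktool). It flags manifests that request `android.permission.INTERNET` and declare a service handling the `android.accessibilityservice.AccessibilityService` intent action. The sample manifest and its package and service names are hypothetical, and the check does not cover the VNC-related functionality, which would require code analysis.

```python
import xml.etree.ElementTree as ET

# Namespace prefix ElementTree uses for android:* attributes.
ANDROID_NS = "{http://schemas.android.com/apk/res/android}"
ACCESSIBILITY_ACTION = "android.accessibilityservice.AccessibilityService"
INTERNET_PERMISSION = "android.permission.INTERNET"


def manifest_flags(manifest_xml):
    """Extract the two signals of interest from a decoded manifest string."""
    root = ET.fromstring(manifest_xml)
    permissions = {el.get(ANDROID_NS + "name") for el in root.iter("uses-permission")}
    accessibility_services = [
        svc.get(ANDROID_NS + "name")
        for svc in root.iter("service")
        if any(
            action.get(ANDROID_NS + "name") == ACCESSIBILITY_ACTION
            for action in svc.iter("action")
        )
    ]
    return {
        "internet": INTERNET_PERMISSION in permissions,
        "accessibility_services": accessibility_services,
    }


def is_suspicious(manifest_xml):
    """True when the manifest combines network access with an accessibility service."""
    flags = manifest_flags(manifest_xml)
    return flags["internet"] and bool(flags["accessibility_services"])


# Hypothetical manifest for illustration; not from a PromptSpy sample.
SAMPLE_MANIFEST = """
<manifest xmlns:android="http://schemas.android.com/apk/res/android" package="com.example.sample">
  <uses-permission android:name="android.permission.INTERNET"/>
  <application>
    <service android:name=".HelperService"
             android:permission="android.permission.BIND_ACCESSIBILITY_SERVICE">
      <intent-filter>
        <action android:name="android.accessibilityservice.AccessibilityService"/>
      </intent-filter>
    </service>
  </application>
</manifest>
"""

if __name__ == "__main__":
    print(manifest_flags(SAMPLE_MANIFEST))
    print("suspicious:", is_suspicious(SAMPLE_MANIFEST))
```

Many legitimate applications, such as password managers and screen readers, match the same combination, so the result is a triage signal to pair with an approved-app list rather than a verdict.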
References
- [Google Cloud Blog / GTIG] Adversaries Leverage AI for Vulnerability Exploitation, Augmented Operations, and Initial Access (2026-05-11) — https://cloud.google.com/blog/topics/threat-intelligence/ai-vulnerability-exploitation-initial-access
- [ESET] PromptSpy ushers in the era of Android threats using GenAI (2026-02-19) — https://www.welivesecurity.com/en/eset-research/promptspy-ushers-in-era-android-threats-using-genai/
- [Help Net Security] Google researchers uncover criminal zero-day exploit likely built with AI (2026-05-11) — https://www.helpnetsecurity.com/2026/05/11/google-ai-vulnerability-exploitation/
- [The Next Web] Google identifies first AI-developed zero-day exploit and thwarts planned mass exploitation event (2026-05-11) — https://thenextweb.com/news/google-ai-zero-day-exploit-cybersecurity-arms-race
- [Korea JoongAng Daily] Rise of AI raises fears of North Korean hacking capabilities (2026-05-14) — https://koreajoongangdaily.joins.com/news/2026-05-14/national/northKorea/Rise-of-AI-raises-fears-of-North-Korean-hacking-capabilities/2592651
- [Kaspersky / Securelist] Disclosing new PebbleDash-based tools by Kimsuky (2026-05-14) — https://securelist.com/kimsuky-appleseed-pebbledash-campaigns/119785/