
What is a deepfake?
For most of human history, the phrase "seeing is believing" held true. Photographs, audio recordings, and video footage served as gold-standard evidence — in courtrooms, newsrooms, military intelligence, and public discourse. That era is ending. Deepfakes — synthetic media generated or manipulated by artificial intelligence — have shattered the assumption that audiovisual content reliably represents reality. A video of a military commander issuing illegal orders, an audio recording of a politician making racist remarks, or a photograph of troops committing atrocities can now be fabricated with consumer-grade technology.
For military and defense professionals, deepfakes represent an existential threat to information integrity. Adversaries can manufacture evidence of war crimes, create false flag operations, impersonate commanders, and sow confusion during critical decision-making windows. Understanding deepfakes — how they are made, how to detect them, and how to defend against their weaponization — is essential for cognitive warfare readiness.
Deepfakes are synthetic media created using deep learning — a subset of artificial intelligence based on neural networks. The term combines "deep learning" with "fake." Deepfakes can:
Swap faces: Replace one person’s face with another’s in existing video
Lip-sync: Alter mouth movements to match new audio (dubbing)
Generate entirely synthetic people: Create realistic human faces that do not exist
Synthesize voice: Clone a person’s voice from limited samples
Generate full-body movement: Create synthetic video of a person performing actions they never performed
The technology has advanced rapidly. Early deepfakes (2017-2019) were detectable by visible artifacts — unnatural blinking, inconsistent lighting, blurring around face edges. Contemporary deepfakes (2024+) are often indistinguishable from authentic recordings to the naked eye.
Generative adversarial networks (GANs) are the most common deepfake architecture. Two neural networks compete:
Generator: Creates synthetic images or videos
Discriminator: Attempts to distinguish real from synthetic
The generator improves until the discriminator cannot reliably tell real from fake. This adversarial process produces increasingly realistic output.
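To make the adversarial dynamic concrete, here is a minimal training-loop sketch, assuming PyTorch and toy fully connected networks on placeholder data; production deepfake generators are far larger convolutional or diffusion-based models, so treat this only as an illustration of the generator-versus-discriminator loop described above.

```python
# Minimal GAN training loop sketch (PyTorch assumed).
import torch
import torch.nn as nn

LATENT, IMG = 64, 784  # toy sizes: 64-dim noise -> flattened 28x28 "image"

generator = nn.Sequential(nn.Linear(LATENT, 256), nn.ReLU(), nn.Linear(256, IMG), nn.Tanh())
discriminator = nn.Sequential(nn.Linear(IMG, 256), nn.LeakyReLU(0.2), nn.Linear(256, 1))

g_opt = torch.optim.Adam(generator.parameters(), lr=2e-4)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

real_batch = torch.rand(32, IMG)  # stand-in for a batch of real images

for step in range(100):
    # 1) Discriminator step: label real samples 1, synthetic samples 0
    fake_batch = generator(torch.randn(32, LATENT)).detach()
    d_loss = bce(discriminator(real_batch), torch.ones(32, 1)) + \
             bce(discriminator(fake_batch), torch.zeros(32, 1))
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()

    # 2) Generator step: try to make the discriminator call fakes "real"
    fake_batch = generator(torch.randn(32, LATENT))
    g_loss = bce(discriminator(fake_batch), torch.ones(32, 1))
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()
```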
Face-swapping deepfakes typically use autoencoders. A shared encoder learns to compress facial features from multiple people. Individual decoders then reconstruct specific faces. By swapping decoders, the system places one person's expressions onto another's face.
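The following sketch shows the shared-encoder, per-identity-decoder idea, again assuming PyTorch; the layer sizes and the `swap_a_to_b` helper are illustrative assumptions, not any particular tool's API.

```python
# Face-swap autoencoder sketch (PyTorch assumed): one shared encoder,
# one decoder per identity. Training reconstructs each person's own face;
# at inference, routing person A through decoder B performs the swap.
import torch.nn as nn

IMG = 64 * 64 * 3  # toy flattened face crop

encoder   = nn.Sequential(nn.Linear(IMG, 512), nn.ReLU(), nn.Linear(512, 128))
decoder_a = nn.Sequential(nn.Linear(128, 512), nn.ReLU(), nn.Linear(512, IMG), nn.Sigmoid())
decoder_b = nn.Sequential(nn.Linear(128, 512), nn.ReLU(), nn.Linear(512, IMG), nn.Sigmoid())

def reconstruct(face, decoder):
    # Training objective: decoder_a reconstructs A's faces, decoder_b reconstructs B's
    return decoder(encoder(face))

def swap_a_to_b(face_a):
    # Inference-time swap: A's expression and pose, rendered as B's face
    return decoder_b(encoder(face_a))
```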
Recent advances in diffusion models (Stable Diffusion, DALL-E, Midjourney) have made high-quality synthetic image generation accessible to consumers. Video diffusion models (Sora, Runway Gen-2) are rapidly closing the quality gap with still-image generation.
Text-to-speech models (ElevenLabs, Resemble AI, VALL-E) can clone a voice from as little as three seconds of sample audio. These models capture not only pitch and tone but also emotional inflection, accent, and speech patterns.
Face swapping is the most common deepfake type: one person's face is mapped onto another's body while preserving facial expressions, gaze, and mouth movements.
Military threat: A deepfake of an enemy commander ordering troops to fire on civilians could justify retaliation. A deepfake of a friendly commander issuing illegal orders could undermine trust in legitimate command.
In lip-sync deepfakes, the original video of a person speaking is preserved, but the mouth movements are altered to match new audio. The person appears to say words they never spoke.
Military threat: A deepfake of a political leader announcing surrender, capitulation, or controversial policy could trigger real-world consequences before verification is possible.
Face reenactment (sometimes called puppeteering) is full facial expression transfer: the movements of a source actor are mapped onto a target person's face, making the target appear to express emotions, react to events, or deliver speeches they never performed.
Synthetic identities are AI-generated faces and personas that do not correspond to any real human. Adversaries use synthetic personas as fake journalists, analysts, witnesses, or social media influencers.
Military threat: Synthetic personas can penetrate intelligence-sharing networks, build relationships with real analysts, and extract information or spread disinformation.
Voice cloning produces synthetic audio of a specific person speaking words they never said.
Military threat: A voice-cloned phone call appearing to be from a commander ordering troop movements. A fake audio recording of a defense official admitting to illegal operations.
Full synthesis uses generative AI to create entire scenes — people, environments, actions — that never occurred. This is not manipulation of existing footage but creation from scratch.
Deepfakes make disinformation more vivid, emotional, and shareable. A fake video of a politician engaging in corrupt behavior is more impactful than a text-based lie. The video format exploits trust in visual evidence.
A deepfake released days before an election — a candidate making offensive remarks, admitting to fraud, or conspiring with foreign powers — could shift outcomes. Even if debunked after the election, the damage is done.
Deepfakes need not be convincing to be effective. The mere possibility that any video could be fake creates a "liar's dividend": authentic compromising footage can be dismissed as AI-generated. Adversaries can flood the information space with so many deepfakes that audiences cannot distinguish real from fake and retreat into cynicism and distrust of all media.
Deepfakes can support traditional military deception (MILDEC):
Fake video of troop movements misleading adversaries about attack locations
Synthetic audio of enemy commanders revealing operational plans
Fabricated evidence of atrocities to shape international opinion
Deepfakes enable sophisticated social engineering:
Video call impersonating a superior officer requesting sensitive information
Voice clone calling a family member to facilitate a ransom scam
Synthetic persona building long-term relationships with intelligence analysts
Deepfakes can create compromising material of individuals who never engaged in compromising behavior. A fabricated video of an official with a prostitute or accepting a bribe becomes a coercive tool.
| Method | What It Detects | Limitations |
|---|---|---|
| Inconsistent blinking | Early deepfakes (less common in modern systems) | Modern GANs generate realistic blinking |
| Lighting and shadow inconsistencies | Mismatched illumination between face and environment | Improving rapidly |
| Unnatural eye gaze | Deepfakes often lack realistic saccades and gaze direction | Newer models address this |
| Lip-sync artifacts | Mismatch between phonemes and mouth shapes | High-quality systems minimize these artifacts |
| Temporal inconsistencies | Artifacts that appear only on specific frames | Detection requires computational resources |
| Spectral artifacts | Manipulation leaves traces in frequency domain | Requires specialized analysis |
| Digital watermarking / cryptographic signatures | Authenticated vs. unauthenticated content | Requires cameras with signing capability |
Professional deepfake detection uses:
Convolutional neural networks (CNNs) trained to detect manipulation artifacts
Frequency domain analysis (artifacts invisible to the human eye appear in Fourier transforms; see the sketch after this list)
Biological signal detection (heartbeat, pulse, micro-expressions)
Multi-modal consistency (does audio match video? Do lip movements match phonemes?)
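As a rough illustration of the frequency-domain idea, the NumPy sketch below measures how much of a frame's spectral energy sits at high spatial frequencies, where GAN upsampling artifacts often concentrate. The cutoff value and any decision threshold are assumptions for illustration, not a validated detector.

```python
# Frequency-domain screening sketch (NumPy assumed).
import numpy as np

def high_freq_energy_ratio(gray_image: np.ndarray, cutoff: float = 0.75) -> float:
    """Fraction of spectral energy at normalized radius > cutoff."""
    spectrum = np.abs(np.fft.fftshift(np.fft.fft2(gray_image))) ** 2
    h, w = spectrum.shape
    yy, xx = np.ogrid[:h, :w]
    radius = np.hypot(yy - h / 2, xx - w / 2) / (min(h, w) / 2)
    return float(spectrum[radius > cutoff].sum() / spectrum.sum())

# Usage idea: build a baseline distribution of this ratio from known-authentic
# footage of the same camera and codec, then flag frames that deviate sharply
# for deeper forensic review.
```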
While technical detection is increasingly necessary, humans can spot certain red flags:
Unnatural eye contact or gaze that never shifts
Inconsistent skin texture (waxy, overly smooth, inconsistent with lighting)
Strange or absent blinking (blinking too frequently, too rarely, or asymmetrically)
Lip movements that don’t match audio (especially on specific phonemes like "b," "m," "p")
Background artifacts (blurring, warping, inconsistencies around face edges)
Emotional incongruence (facial expression doesn’t match content of speech)
Critical warning: As deepfake quality improves, human detection becomes unreliable. In high-stakes contexts, technical verification is essential.
Cryptographic Content Authentication:
Cameras and recording devices cryptographically sign media at capture
Any manipulation breaks the signature
Requires hardware-level implementation and key management infrastructure
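A minimal sketch of the sign-at-capture idea follows, assuming Python's `cryptography` package and an Ed25519 device key; in a real deployment the private key lives in the camera's secure hardware and public keys are distributed through a managed PKI, but the sketch shows why any post-capture edit breaks verification.

```python
# Minimal content-signing sketch (`cryptography` package assumed).
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
from cryptography.exceptions import InvalidSignature
import hashlib

device_key = Ed25519PrivateKey.generate()   # would live inside the capture device
public_key = device_key.public_key()        # published for verifiers

def sign_at_capture(media_bytes: bytes) -> bytes:
    digest = hashlib.sha256(media_bytes).digest()
    return device_key.sign(digest)          # signature travels with the file

def verify(media_bytes: bytes, signature: bytes) -> bool:
    digest = hashlib.sha256(media_bytes).digest()
    try:
        public_key.verify(signature, digest)
        return True
    except InvalidSignature:                # any pixel-level edit changes the digest
        return False
```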
Blockchain Provenance:
Recording metadata (time, location, device) recorded on immutable ledger
Coalition for Content Provenance and Authenticity (C2PA) standard
Enables verification of media chain of custody
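The sketch below illustrates the tamper-evident chain-of-custody idea with a simple hash chain. It is not the actual C2PA manifest format, and the field names are hypothetical; it only shows why editing any earlier record invalidates everything after it.

```python
# Simplified provenance-chain sketch (illustrative fields, not C2PA).
import hashlib, json, time

def add_record(chain: list, content_hash: str, device_id: str) -> list:
    prev = chain[-1]["record_hash"] if chain else "0" * 64
    record = {
        "timestamp": time.time(),
        "device_id": device_id,
        "content_hash": content_hash,
        "prev_hash": prev,
    }
    record["record_hash"] = hashlib.sha256(
        json.dumps(record, sort_keys=True).encode()
    ).hexdigest()
    chain.append(record)
    return chain

def chain_is_intact(chain: list) -> bool:
    # Recompute every link; any edited record breaks all later hashes.
    prev = "0" * 64
    for rec in chain:
        body = {k: v for k, v in rec.items() if k != "record_hash"}
        recomputed = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
        if rec["prev_hash"] != prev or recomputed != rec["record_hash"]:
            return False
        prev = rec["record_hash"]
    return True
```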
Forensic Watermarking:
Imperceptible watermarks embedded in authentic media
Manipulation destroys or alters watermark
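A toy least-significant-bit watermark, assuming NumPy, shows why manipulation or re-encoding disturbs an embedded mark; operational forensic watermarks use far more robust, spread-spectrum or perceptual-domain schemes than this deliberately simple example.

```python
# Toy LSB watermark sketch (NumPy assumed).
import numpy as np

def embed(image: np.ndarray, bits: np.ndarray) -> np.ndarray:
    flat = image.flatten()
    flat[: bits.size] = (flat[: bits.size] & 0xFE) | bits   # overwrite least-significant bits
    return flat.reshape(image.shape)

def extract(image: np.ndarray, n_bits: int) -> np.ndarray:
    return image.flatten()[:n_bits] & 1

# Example: a uint8 image and a 64-bit mark survive a lossless round trip,
# but any re-compression or pixel edit in the marked region corrupts the bits.
img = np.random.randint(0, 256, (32, 32), dtype=np.uint8)
mark = np.random.randint(0, 2, 64, dtype=np.uint8)
assert np.array_equal(extract(embed(img, mark), 64), mark)
```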
AI Detection Systems:
Continuous investment in detection models (adversarial competition between generation and detection)
Government and military detection capabilities must outpace commercial generation
Verification Protocols:
No time-sensitive decisions based solely on unverified audiovisual media
Multi-channel verification (e.g., confirm video through independent communication path)
Trusted authentication services for media verification
Pre-bunking and Inoculation:
Train personnel and public on deepfake existence and limitations of human detection
Communicate that absence of visible artifacts does not guarantee authenticity
Establish trusted verification channels before crises
Rapid Response:
Pre-planned protocols for suspected deepfake incidents
Fast attribution: is this content authentic or synthetic?
Public communication: rapid debunking with evidence
Legal and Policy Frameworks:
Criminalization of malicious deepfake creation and distribution
Platform requirements for deepfake labeling and detection
International norms against deepfake use in warfare and elections
Shortly after Russia’s invasion of Ukraine, a deepfake video circulated showing Ukrainian President Volodymyr Zelenskyy telling Ukrainian soldiers to lay down their arms and surrender. The deepfake was relatively low-quality (poor lip-sync, unnatural head positioning) and was quickly debunked.
Impact: Limited. The deepfake was detected and publicly refuted within hours. However, it demonstrated adversary intent and the speed at which deepfakes could be deployed during a crisis.
Lessons: Rapid debunking works when trusted communication channels exist. Zelenskyy’s authentic video response within hours reinforced his credibility.
A deepfake creator (Chris Ume) produced a series of highly realistic TikTok videos of "Tom Cruise" performing magic tricks, telling jokes, and acting strangely. The deepfakes were so convincing that many viewers believed they were authentic, despite Cruise not having a TikTok account.
Impact: Demonstrated that consumer-grade deepfakes could achieve near-perfect realism. Raised public awareness of deepfake technology.
Lessons: Public inoculation through exposure to high-quality examples reduces surprise and vulnerability.
A voice clone of Joe Biden, then the sitting president and a primary candidate, was used to make robocalls urging voters not to vote in the New Hampshire Democratic primary. The calls were traced to a Texas-based telecom company.
Impact: First documented use of AI voice cloning in a U.S. election. Demonstrated low-cost, scalable election interference.
Lessons: Voice cloning is accessible and difficult to trace. Legal frameworks for election-related synthetic media are inadequate.
Live video calls can now be deepfaked in real time. An adversary could impersonate a commander, intelligence officer, or family member during a live video call — not just a recorded message.
AI systems that generate deepfakes, distribute them through bot networks, adapt based on engagement metrics, and produce follow-up content — all without human intervention. The disinformation campaign runs itself.
As detection improves, generation improves. This is an adversarial arms race. No permanent detection solution exists. Defenders must continuously update detection models and prioritize authentication over detection.
The long-term risk is not successful deepfakes but the erosion of trust in all media. When any video can be dismissed as "probably AI," authentic evidence loses power. Adversaries need not create convincing deepfakes — only the possibility that authentic footage could be fake.
Deepfakes and audiovisual manipulation represent a fundamental shift in the information environment. The assumption that "seeing is believing" — reliable for centuries — is now invalid. Adversaries can fabricate evidence, impersonate leaders, and sow confusion at unprecedented scale and speed.
Defense requires a multi-layered approach: cryptographic authentication at capture, forensic detection after distribution, operational verification protocols, public inoculation through awareness, and legal frameworks for accountability. No single countermeasure is sufficient.
For military and defense professionals, the deepfake threat is not hypothetical. Adversaries are already deploying synthetic media in conflict zones and election campaigns. The question is not whether deepfakes will be used against your forces or your nation — but when. Preparation begins now.
