Pre-release Preview
In an era where file uploads are ubiquitous — from profile pictures to document attachments — the security implications of accepting user-submitted media are often overlooked. Malicious actors have long exploited image parsers, audio decoders, and document processors to deliver payloads hidden within seemingly innocent files. Enter Disarm, a Java library designed to neutralize these threats by re-encoding media files into safe, predictable formats.
At its core, Disarm takes a simple but effective approach: rather than trying to detect malware (a cat-and-mouse game), it transforms files. By re-encoding images, audio, video, and text through trusted processing pipelines, any embedded exploits, steganographic payloads, or malformed data structures are stripped away. What comes out the other side is a clean file that behaves exactly as expected.
The library handles four media categories: images, audio, video, and text.
Every file is validated against a strict whitelist before processing. If the detected MIME type isn't explicitly permitted, the file is rejected outright. No parsing of unknown formats means no parser exploits.
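As a rough illustration of the idea (not Disarm's actual API), here is a minimal sketch that detects a type from a file's leading magic bytes and checks it against a whitelist. The class name `MimeWhitelist`, its methods, and the particular set of permitted types are all assumptions made for this example.

```java
import java.util.Optional;
import java.util.Set;

/** Hypothetical sketch of whitelist validation; not taken from the Disarm codebase. */
final class MimeWhitelist {
    // Assumed whitelist for illustration; the real permitted set is not stated here.
    private static final Set<String> ALLOWED =
            Set.of("image/png", "image/jpeg", "image/gif");

    /** Detects a MIME type from well-known magic bytes; empty if unrecognized. */
    static Optional<String> detect(byte[] head) {
        if (startsWith(head, 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A)) {
            return Optional.of("image/png");
        }
        if (startsWith(head, 0xFF, 0xD8, 0xFF)) {
            return Optional.of("image/jpeg");
        }
        if (startsWith(head, 'G', 'I', 'F', '8')) {
            return Optional.of("image/gif");
        }
        if (startsWith(head, '%', 'P', 'D', 'F', '-')) {
            return Optional.of("application/pdf");
        }
        return Optional.empty();
    }

    /** True only when the type is both recognized and explicitly permitted. */
    static boolean isAllowed(byte[] head) {
        return detect(head).filter(ALLOWED::contains).isPresent();
    }

    private static boolean startsWith(byte[] data, int... magic) {
        if (data == null || data.length < magic.length) {
            return false;
        }
        for (int i = 0; i < magic.length; i++) {
            // Mask to compare as unsigned, since Java bytes are signed.
            if ((data[i] & 0xFF) != magic[i]) {
                return false;
            }
        }
        return true;
    }
}
```

In this sketch a PDF is recognized but still rejected, because recognition alone is not permission: only types in the whitelist pass, and unrecognized input never reaches a parser.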
Configurable limits prevent resource exhaustion attacks. Each format type has its own ceiling, with a global upper bound as a safety net.
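One way to picture per-format ceilings with a global safety net is the sketch below. The `SizeLimits` class, its method names, and the byte values used with it are hypothetical and chosen only to show the capping logic.

```java
import java.util.Map;

/** Hypothetical sketch of per-type size limits capped by a global bound. */
final class SizeLimits {
    private final long globalMaxBytes;
    private final Map<String, Long> perTypeMaxBytes;

    SizeLimits(long globalMaxBytes, Map<String, Long> perTypeMaxBytes) {
        this.globalMaxBytes = globalMaxBytes;
        this.perTypeMaxBytes = Map.copyOf(perTypeMaxBytes); // defensive, immutable copy
    }

    /** Effective ceiling: the per-type limit, never above the global bound. */
    long limitFor(String mimeType) {
        long perType = perTypeMaxBytes.getOrDefault(mimeType, globalMaxBytes);
        return Math.min(perType, globalMaxBytes);
    }

    /** True when the size is non-negative and within the effective ceiling. */
    boolean accepts(String mimeType, long sizeBytes) {
        return sizeBytes >= 0 && sizeBytes <= limitFor(mimeType);
    }
}
```

With a global bound of 50 MiB, for example, a type configured at 10 MiB is limited to 10 MiB, while a type misconfigured at 500 MiB is still capped at 50 MiB. Checking the size before any decoding starts keeps oversized input away from the expensive part of the pipeline.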
Images are decoded and re-encoded pixel-by-pixel. Audio is transcoded through a full decode-encode cycle. This process inherently destroys any malicious payloads that rely on parser quirks or format ambiguities.
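To make the image case concrete, here is a minimal sketch using the JDK's standard `javax.imageio.ImageIO`. It is an assumption about how such a step could look, not Disarm's implementation; the class name `ImageReencoder` and the choice of PNG as the output format are illustrative.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import javax.imageio.ImageIO;

/** Hypothetical sketch of decode-then-re-encode for images. */
final class ImageReencoder {
    /** Decodes untrusted image bytes and writes a fresh PNG from the pixels alone. */
    static byte[] reencodeAsPng(byte[] untrusted) throws IOException {
        BufferedImage decoded = ImageIO.read(new ByteArrayInputStream(untrusted));
        if (decoded == null) {
            // ImageIO.read returns null when no registered reader accepts the input.
            throw new IOException("input is not a decodable image");
        }
        int w = decoded.getWidth();
        int h = decoded.getHeight();

        // Copy pixel values into a brand-new raster; nothing else is carried over.
        BufferedImage clean = new BufferedImage(w, h, BufferedImage.TYPE_INT_ARGB);
        int[] pixels = decoded.getRGB(0, 0, w, h, null, 0, w);
        clean.setRGB(0, 0, w, h, pixels, 0, w);

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (!ImageIO.write(clean, "png", out)) {
            throw new IOException("no PNG writer available");
        }
        return out.toByteArray();
    }
}
```

Because the output is built only from decoded pixel values, bytes that sit outside the pixel data, such as data appended after the end of the image, have no path into the result. A production version would also need to bound image dimensions before decoding, which this sketch leaves out.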
Re-encoding naturally discards EXIF data, ID3 tags, and other metadata fields that could contain tracking information or exploit code.
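The library follows a clean pipeline architecture in which each file passes through whitelist validation, size-limit checks, and re-encoding before it is returned.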
Configuration is centralized, making it straightforward to adjust limits, add new formats to the whitelist, or modify processing parameters.
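The sketch below shows one way a single configuration object could drive a sequence of stages. Everything in it (`PipelineSketch`, `Config`, `Upload`, `RejectedException`, and the stage order) is hypothetical and only illustrates the shape of such a design; it assumes Java 16 or later for records.

```java
import java.util.List;
import java.util.Set;
import java.util.function.UnaryOperator;

/** Hypothetical sketch of a config-driven pipeline; names are illustrative only. */
final class PipelineSketch {
    /** Central configuration: the one place where the whitelist and limit live. */
    record Config(Set<String> allowedTypes, long maxBytes) {}

    /** A file moving through the pipeline: detected type plus raw bytes. */
    record Upload(String mimeType, byte[] data) {}

    /** Thrown by any stage that refuses the input. */
    static final class RejectedException extends RuntimeException {
        RejectedException(String reason) {
            super(reason);
        }
    }

    /** Builds the ordered stages from the configuration. */
    static List<UnaryOperator<Upload>> stages(Config cfg) {
        return List.of(
            u -> {
                if (!cfg.allowedTypes().contains(u.mimeType())) {
                    throw new RejectedException("type not whitelisted: " + u.mimeType());
                }
                return u;
            },
            u -> {
                if (u.data().length > cfg.maxBytes()) {
                    throw new RejectedException("file exceeds size limit");
                }
                return u;
            },
            // Placeholder for a format-specific re-encoder; here it only copies the bytes.
            u -> new Upload(u.mimeType(), u.data().clone())
        );
    }

    /** Runs every stage in order; the first rejection aborts the run. */
    static Upload run(Config cfg, Upload input) {
        Upload current = input;
        for (UnaryOperator<Upload> stage : stages(cfg)) {
            current = stage.apply(current);
        }
        return current;
    }
}
```

Since every stage reads from the same `Config`, permitting a new format or tightening a limit is a change in one place rather than in each stage.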
The project is in active development. Image and audio processing are fully functional with comprehensive format support. Video processing is being implemented, and the text sanitization module handles encoding validation and basic XSS prevention.
The codebase prioritizes security and performance — every input is treated as potentially hostile, with validation at multiple layers.
Future plans include:
Disarm is an open-source project aimed at making file upload handling safer by default. Stay tuned for the initial release.