N0153.DEV

Project @ 2026-01-18T03:11:27.468237Z

Pre-release Preview

In an era where file uploads are ubiquitous — from profile pictures to document attachments — the security implications of accepting user-submitted media are often overlooked. Malicious actors have long exploited image parsers, audio decoders, and document processors to deliver payloads hidden within seemingly innocent files. Enter Disarm, a Java library designed to neutralize these threats by re-encoding media files into safe, predictable formats.

What Does It Do?

At its core, Disarm takes a simple but effective approach: rather than trying to detect malware (a cat-and-mouse game), it transforms files. Images, audio, video, and text are re-encoded through trusted processing pipelines, so only the decoded content is carried over. Embedded exploits, payloads tucked into metadata or trailing bytes, and malformed data structures are dropped along the way. What comes out the other side is a freshly written file in a known, predictable format.

The library handles four media categories:

Images — Re-encoded using OpenCV with optional watermarking support
Audio — Processed through FFmpeg (via the JAVE library) supporting MP3, OGG, FLAC, WAV, and more
Video — Container and codec normalization (in development)
Text — Encoding validation, HTML entity escaping, and dangerous script removal
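For the text category, the encoding validation and HTML entity escaping steps can be illustrated with a minimal sketch that uses only the Java standard library. The class and method names (`TextEscaper`, `decodeStrictUtf8`, `escapeHtml`) are hypothetical and are not taken from Disarm's code; the sketch also assumes UTF-8 as the expected encoding.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not Disarm's actual API.
final class TextEscaper {
    private TextEscaper() {}

    /** Decodes bytes as UTF-8 and fails on malformed input instead of silently replacing it. */
    static String decodeStrictUtf8(byte[] bytes) throws CharacterCodingException {
        return StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(bytes))
                .toString();
    }

    /** Replaces the five characters that are significant in HTML with entities. */
    static String escapeHtml(String input) {
        StringBuilder out = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                case '\'': out.append("&#39;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }
}
```

Escaping every significant character is deliberately blunt: the text is rendered as text, so a `<script>` tag shows up on the page as literal characters rather than being executed.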

Key Features

Format Whitelisting

Every file is validated against a strict whitelist before processing. If the detected MIME type isn't explicitly permitted, the file is rejected outright. No parsing of unknown formats means no parser exploits.
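As a rough illustration of the idea, the sketch below sniffs a few well-known magic numbers and accepts a file only if the detected type is in an allowed set. The class name `FormatWhitelist`, the choice of formats, and the magic-byte approach are assumptions made for the example; how Disarm actually detects MIME types is not described here.

```java
import java.util.Optional;
import java.util.Set;

// Hypothetical sketch of detect-then-whitelist, not Disarm's actual API.
final class FormatWhitelist {
    // Assumed whitelist for the example: GIF is detectable but deliberately not permitted.
    private static final Set<String> ALLOWED = Set.of("image/png", "image/jpeg");

    private FormatWhitelist() {}

    /** Identifies a format from its leading bytes, or returns empty if it is unknown. */
    static Optional<String> detect(byte[] head) {
        if (startsWith(head, 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A)) {
            return Optional.of("image/png");
        }
        if (startsWith(head, 0xFF, 0xD8, 0xFF)) {
            return Optional.of("image/jpeg");
        }
        if (startsWith(head, 0x47, 0x49, 0x46, 0x38)) { // "GIF8"
            return Optional.of("image/gif");
        }
        return Optional.empty();
    }

    /** Unknown formats and known-but-unlisted formats are both rejected. */
    static boolean isAllowed(byte[] head) {
        return detect(head).map(ALLOWED::contains).orElse(false);
    }

    private static boolean startsWith(byte[] data, int... magic) {
        if (data.length < magic.length) {
            return false;
        }
        for (int i = 0; i < magic.length; i++) {
            if ((data[i] & 0xFF) != magic[i]) {
                return false;
            }
        }
        return true;
    }
}
```

The important property is the default: anything that is not positively identified and listed is refused, so adding a format is an explicit decision rather than an accident.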

Size Constraints

Configurable limits prevent resource exhaustion attacks. Each format type has its own ceiling, with a global upper bound as a safety net.
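One way to combine per-format ceilings with a global bound is sketched below. The limit values, the class name `SizeLimits`, and the idea of reading at most one byte past the limit (so the check does not depend on a client-declared length) are illustrative assumptions, not Disarm's documented behavior.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Map;

// Hypothetical sketch; the numbers below are placeholders, not Disarm's defaults.
final class SizeLimits {
    static final long GLOBAL_MAX_BYTES = 50L * 1024 * 1024;

    private static final Map<String, Long> PER_TYPE = Map.of(
            "image/png", 10L * 1024 * 1024,
            "audio/mpeg", 30L * 1024 * 1024);

    private SizeLimits() {}

    /** A per-type ceiling never exceeds the global bound; unlisted types get the global bound. */
    static long limitFor(String mimeType) {
        long perType = PER_TYPE.getOrDefault(mimeType, GLOBAL_MAX_BYTES);
        return Math.min(perType, GLOBAL_MAX_BYTES);
    }

    static boolean withinLimit(String mimeType, long sizeInBytes) {
        return sizeInBytes >= 0 && sizeInBytes <= limitFor(mimeType);
    }

    /** Reads at most limit + 1 bytes, so an oversized upload is detected without buffering all of it. */
    static byte[] readBounded(InputStream in, String mimeType) throws IOException {
        long limit = limitFor(mimeType);
        int cap = (int) Math.min(limit + 1, Integer.MAX_VALUE);
        byte[] data = in.readNBytes(cap);
        if (data.length > limit) {
            throw new IOException("upload exceeds " + limit + " bytes for " + mimeType);
        }
        return data;
    }
}
```

Clamping with `Math.min` means a misconfigured per-type value cannot raise the effective limit above the global safety net.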

Defensive Re-encoding

Images are decoded and re-encoded pixel-by-pixel. Audio is transcoded through a full decode-encode cycle. This process inherently destroys any malicious payloads that rely on parser quirks or format ambiguities.
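To make the decode and re-encode cycle concrete, here is a minimal sketch for images. It uses the JDK's `javax.imageio` rather than OpenCV so that it runs without native dependencies, and it always writes PNG; both choices, along with the class name `ImageReencoder`, are assumptions for illustration and differ from the OpenCV-based pipeline described above.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import javax.imageio.ImageIO;

// Hypothetical sketch using ImageIO as a stand-in for the OpenCV pipeline.
final class ImageReencoder {
    private ImageReencoder() {}

    /** Decodes untrusted bytes to raw pixels, then writes a brand-new PNG from those pixels only. */
    static byte[] reencodeAsPng(byte[] untrusted) throws IOException {
        BufferedImage decoded = ImageIO.read(new ByteArrayInputStream(untrusted));
        if (decoded == null) {
            throw new IOException("input is not a decodable image");
        }
        int width = decoded.getWidth();
        int height = decoded.getHeight();

        // Copy pixel values onto a fresh canvas; metadata chunks, comments,
        // and any bytes appended after the image data are not carried over.
        BufferedImage clean = new BufferedImage(width, height, BufferedImage.TYPE_INT_ARGB);
        int[] pixels = decoded.getRGB(0, 0, width, height, null, 0, width);
        clean.setRGB(0, 0, width, height, pixels, 0, width);

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (!ImageIO.write(clean, "png", out)) {
            throw new IOException("no PNG writer available");
        }
        return out.toByteArray();
    }
}
```

Note that the decoder still has to parse the hostile input once, which is why this step belongs after whitelisting and size checks rather than in place of them.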

Metadata Stripping

Re-encoding discards EXIF data, ID3 tags, and other metadata fields that could contain tracking information or exploit code, provided the encoder is not configured to copy them across to the output.

Architecture Highlights

The library follows a clean pipeline architecture:

Detection — MIME type extraction and format identification
Validation — Whitelist checking and size limit enforcement
Processing — Format-specific handlers for each media type
Output — Timestamped filenames with normalized extensions

Configuration is centralized, making it straightforward to adjust limits, add new formats to the whitelist, or modify processing parameters.
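The four stages and the centralized configuration could be wired together roughly as follows. Everything in this sketch is hypothetical: the `SanitizePipeline`, `Config`, and `Result` names, the single `maxBytes` limit, and the epoch-millisecond filename scheme are assumptions chosen to keep the example short, and records require Java 16 or later.

```java
import java.time.Instant;
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.function.Function;
import java.util.function.UnaryOperator;

// Hypothetical sketch of a detect / validate / process / output pipeline.
final class SanitizePipeline {
    /** One place for the whitelist, the size ceiling, and the normalized extensions. */
    record Config(Set<String> allowedTypes, long maxBytes, Map<String, String> extensions) {}

    record Result(String fileName, byte[] content) {}

    private final Config config;
    private final Function<byte[], Optional<String>> detector;
    private final Map<String, UnaryOperator<byte[]>> handlers;

    SanitizePipeline(Config config,
                     Function<byte[], Optional<String>> detector,
                     Map<String, UnaryOperator<byte[]>> handlers) {
        this.config = config;
        this.detector = detector;
        this.handlers = Map.copyOf(handlers);
    }

    Result run(byte[] input, Instant now) {
        // 1. Detection: identify the format from the content itself.
        String type = detector.apply(input)
                .orElseThrow(() -> new IllegalArgumentException("unrecognized format"));

        // 2. Validation: whitelist check, then size limit.
        if (!config.allowedTypes().contains(type)) {
            throw new IllegalArgumentException("type not whitelisted: " + type);
        }
        if (input.length > config.maxBytes()) {
            throw new IllegalArgumentException("file exceeds " + config.maxBytes() + " bytes");
        }

        // 3. Processing: hand off to the format-specific handler.
        UnaryOperator<byte[]> handler = handlers.get(type);
        String extension = config.extensions().get(type);
        if (handler == null || extension == null) {
            throw new IllegalStateException("no handler or extension configured for " + type);
        }
        byte[] clean = handler.apply(input);

        // 4. Output: timestamped name with a normalized extension.
        return new Result(now.toEpochMilli() + "." + extension, clean);
    }
}
```

Passing the clock value in as a parameter keeps the output naming deterministic in tests, and deriving the filename from the timestamp and detected type means the uploader's original filename never reaches storage.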

Use Cases

Web Applications — Sanitize user uploads before storage
Email Gateways — Clean attachments in transit
Content Management Systems — Process media before publishing
API Endpoints — Validate and transform incoming files

Current Status

The project is in active development. Image and audio processing are fully functional with comprehensive format support. Video processing is being implemented, and the text sanitization module handles encoding validation and basic XSS prevention.

The codebase prioritizes security and performance — every input is treated as potentially hostile, with validation at multiple layers.

Looking Ahead

Future plans include:

Complete video re-encoding pipeline
Additional output format options
Performance benchmarking and optimization
Expanded text sanitization rules

Disarm is an open-source project aimed at making file upload handling safer by default. Stay tuned for the initial release.