N0153.DEV

Project @ 2026-06-02T22:32:58.294976Z

The Problem It Solves

Every application that accepts user-uploaded files faces the same uncomfortable question: what exactly is in that file? A JPEG upload could carry embedded malware. A text file could contain script injection payloads, malicious Unicode, or dangerous URL schemes. An audio clip could be crafted with a bitrate designed to trigger a parsing bug in your decoder. The naive approach — checking the file extension and moving on — is plainly not enough.

Disarm is a Java library that takes a different approach: instead of trying to detect every known threat, it destroys the threat surface entirely by re-encoding every file from scratch. The output is always a clean, standards-compliant file with known-good properties. Whatever was hiding in the input doesn't survive the trip.

How It Works

Disarm's architecture is built around a layered defense model. Before a single byte of a file is handed to a codec, it passes through a multi-stage validation pipeline:

Path safety — traversal sequences (../, ./) and symlinks are rejected outright.
MIME whitelist — only explicitly permitted formats are accepted, checked against an immutable FormatRegistry.
File size enforcement — per-format size limits prevent resource exhaustion before any processing begins.
Codec whitelisting — audio and video codec strings are checked against a known-good list per container format.
Bitrate and sample rate bounds — values are validated against format-specific maximums (e.g. MP3 caps at 320 kbps, WAV at 4,608 kbps).
Duration limits — audio and video tracks are capped at a configurable maximum (default 5 minutes) before any decoding begins.

Only after all six gates pass does the actual re-encoding happen.
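
To make the gate idea concrete, here is a minimal sketch of how a few of these checks could be written with nothing but the JDK. It is not Disarm's code: the class name UploadGateSketch, its method names, and the size limits are assumptions made for illustration. Only the two bitrate caps (320 kbps for MP3, 4,608 kbps for WAV) come from the list above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.Path;
import java.util.Map;

// Hypothetical sketch of pre-decode gates; names and size limits are
// assumptions, not Disarm's actual classes or values.
final class UploadGateSketch {

    // Illustrative per-format size limits in bytes (assumed values).
    private static final Map<String, Long> MAX_BYTES = Map.of(
            "image/png", 10L * 1024 * 1024,
            "audio/mpeg", 20L * 1024 * 1024,
            "text/plain", 1024L * 1024);

    // Bitrate caps in kbps; these two figures are the ones quoted in the list above.
    private static final Map<String, Integer> MAX_KBPS = Map.of(
            "mp3", 320,
            "wav", 4608);

    private UploadGateSketch() { }

    /** Path gate: reject "." and ".." segments and a symlink as the final component. */
    static boolean isPathSafe(Path path) {
        for (Path segment : path) {
            String name = segment.toString();
            if (name.equals("..") || name.equals(".")) {
                return false;
            }
        }
        return !Files.isSymbolicLink(path);
    }

    /** MIME and size gates: unknown types are rejected, known types must fit their limit. */
    static boolean isAllowed(String mimeType, long sizeInBytes) {
        if (mimeType == null) {
            return false; // Map.of(...) maps throw on null keys, so check first
        }
        Long limit = MAX_BYTES.get(mimeType);
        return limit != null && sizeInBytes >= 0 && sizeInBytes <= limit;
    }

    /** Bitrate gate: the value must be positive and within the per-format cap. */
    static boolean isBitrateAllowed(String format, int kbps) {
        Integer cap = format == null ? null : MAX_KBPS.get(format);
        return cap != null && kbps > 0 && kbps <= cap;
    }

    /** Runs the file-level gates in order and stops at the first failure. */
    static boolean passesFileGates(Path path) throws IOException {
        if (!isPathSafe(path) || !Files.isRegularFile(path, LinkOption.NOFOLLOW_LINKS)) {
            return false;
        }
        String mime = Files.probeContentType(path); // may return null
        return isAllowed(mime, Files.size(path));
    }
}
```

Each gate is a cheap check that needs no decoding, so a file that fails any of them is rejected before a codec ever parses it.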

What It Handles

Disarm covers four media categories, each with its own sanitization logic:

Images

Images are decoded via OpenCV and re-scaled to configured maximum dimensions (default 512×512 px), preserving aspect ratio. The output is always a clean raster with no embedded metadata or hidden payloads. As a bonus, Disarm supports optional watermarking: a logo image can be overlaid using one of five placement modes (the four corners, i.e. top-left, top-right, bottom-left, and bottom-right, or a randomized position), with configurable transparency.
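
The scaling step can be illustrated with a short sketch. Disarm itself uses OpenCV; the version below swaps in java.awt from the JDK so that it stays self-contained. The class ImageFitSketch and its methods are hypothetical, and the choice never to upscale small images is an assumption rather than documented Disarm behavior.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;

// Hypothetical sketch using java.awt instead of OpenCV.
final class ImageFitSketch {

    private ImageFitSketch() { }

    /** Returns {width, height} that fits inside the maximum box, keeping aspect ratio. */
    static int[] fitWithin(int width, int height, int maxWidth, int maxHeight) {
        if (width <= 0 || height <= 0 || maxWidth <= 0 || maxHeight <= 0) {
            throw new IllegalArgumentException("dimensions must be positive");
        }
        // Use the tighter of the two ratios; cap at 1.0 so small images are not enlarged (assumption).
        double scale = Math.min(1.0,
                Math.min((double) maxWidth / width, (double) maxHeight / height));
        int w = Math.max(1, (int) Math.round(width * scale));
        int h = Math.max(1, (int) Math.round(height * scale));
        return new int[] {w, h};
    }

    /** Draws the decoded pixels into a brand-new raster; only pixel data is carried over. */
    static BufferedImage redraw(BufferedImage source, int maxWidth, int maxHeight) {
        int[] size = fitWithin(source.getWidth(), source.getHeight(), maxWidth, maxHeight);
        BufferedImage clean = new BufferedImage(size[0], size[1], BufferedImage.TYPE_INT_RGB);
        Graphics2D g = clean.createGraphics();
        try {
            g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                    RenderingHints.VALUE_INTERPOLATION_BILINEAR);
            g.drawImage(source, 0, 0, size[0], size[1], null);
        } finally {
            g.dispose();
        }
        return clean;
    }
}
```

Because the output raster is allocated fresh and filled only with pixels, anything stored outside the pixel data (EXIF blocks, trailing bytes, ancillary chunks) has no path into the result.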

Audio

Audio re-encoding runs through JAVE (an FFmpeg wrapper) and supports MP3, OGG, FLAC, WAV, AU, and AIF. If FFmpeg fails for any reason, Disarm automatically falls back to a native javax.sound.sampled implementation for WAV, AIF, and AU, so an FFmpeg failure does not halt processing of those three formats. All output audio is standardized to 44.1 kHz, 2 channels, and 128 kbps (except FLAC).
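
The fallback behavior can be sketched as a small wrapper around two encoders. This is only an illustration of the pattern: the Encoder interface, the AudioFallbackSketch class, and the returned labels are hypothetical, and the real JAVE and javax.sound.sampled calls are left out so the control flow stays visible.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch of "try FFmpeg first, fall back for WAV/AIF/AU".
final class AudioFallbackSketch {

    /** Stand-in for either the FFmpeg-backed or the native re-encoder. */
    @FunctionalInterface
    interface Encoder {
        void encode(Path input, Path output) throws IOException;
    }

    // Formats the article lists as having a native fallback.
    private static final Set<String> NATIVE_FALLBACK = Set.of("wav", "aif", "au");

    private AudioFallbackSketch() { }

    /** Returns which path produced the output; rethrows if no fallback applies. */
    static String reencode(String extension, Path input, Path output,
                           Encoder primary, Encoder fallback) throws IOException {
        try {
            primary.encode(input, output);
            return "primary";
        } catch (IOException | RuntimeException primaryFailure) {
            if (!NATIVE_FALLBACK.contains(extension.toLowerCase(Locale.ROOT))) {
                throw primaryFailure; // no native implementation for this format
            }
            fallback.encode(input, output);
            return "fallback";
        }
    }
}
```

In this sketch, formats without a native implementation (MP3, OGG, FLAC) surface the original failure instead of being silently skipped.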

Video

Video sanitization uses JAVE/FFmpeg to re-encode MP4, WEBM, MKV, and MOV files, validating both the video and embedded audio tracks independently. Bitrate, codec, sample rate, frame rate, and duration are all checked before processing begins.

Text

Text sanitization is more nuanced than it might look. Disarm:

Detects and handles BOM markers for UTF-8, UTF-16 BE/LE, and UTF-32 BE
Validates that file bytes are valid UTF-8 or ASCII
HTML-escapes the five dangerous characters (&, <, >, ", ')
Strips <script> tags, control characters (U+0000–U+001F, U+007F–U+009F), and dangerous URL schemes (javascript:, data:, vbscript:)
Applies NFKC Unicode normalization and removes zero-width characters (U+200B–U+200D, U+FEFF)
Always writes output as UTF-8, regardless of input encoding

The result is a clean, normalized, safely-escaped UTF-8 text file.
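
The steps above can be approximated with standard JDK classes. The sketch below is one possible arrangement, not Disarm's implementation: the class TextSanitizeSketch is hypothetical, it handles only the UTF-8 BOM, and the order of operations (normalize, strip, then escape last) is an assumption. The character ranges are the ones listed above. Note that the listed control range includes tab and line breaks; the article does not say whether Disarm preserves them.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.text.Normalizer;

// Hypothetical sketch of the text pipeline; step order is an assumption.
final class TextSanitizeSketch {

    private TextSanitizeSketch() { }

    /** Skips a UTF-8 BOM if present, then decodes strictly; malformed bytes throw. */
    static String decodeUtf8(byte[] bytes) throws CharacterCodingException {
        int offset = 0;
        if (bytes.length >= 3 && (bytes[0] & 0xFF) == 0xEF
                && (bytes[1] & 0xFF) == 0xBB && (bytes[2] & 0xFF) == 0xBF) {
            offset = 3;
        }
        return StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(bytes, offset, bytes.length - offset))
                .toString();
    }

    static String sanitize(String input) {
        // Normalize first so compatibility forms (e.g. fullwidth '<') become plain ASCII.
        String text = Normalizer.normalize(input, Normalizer.Form.NFKC);
        text = text.replaceAll("[\\u200B-\\u200D\\uFEFF]", "");          // zero-width characters
        text = text.replaceAll("[\\u0000-\\u001F\\u007F-\\u009F]", "");  // control characters
        // Repeat until stable so a removal cannot splice a new tag or scheme together.
        String previous;
        do {
            previous = text;
            text = text.replaceAll("(?i)</?script\\b[^>]*>", "");
            text = text.replaceAll("(?i)\\b(javascript|vbscript|data):", "");
        } while (!text.equals(previous));
        return escapeHtml(text); // escape last, after all stripping is done
    }

    /** Escapes the five characters named above. */
    static String escapeHtml(String text) {
        StringBuilder out = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                case '\'': out.append("&#39;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }
}
```

Escaping runs last on purpose. If it ran first, a literal script tag would already be turned into entities, and the tag-stripping pattern would no longer match it.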

Clean API Design

Disarm exposes a straightforward fluent configuration API:

DisarmConfig config = DisarmConfig.builder()
    .setGeneralOutputPath("output/")
    .setAudioMaxDuration(600_000)   // 10 minutes in ms
    .setImgMaxWidth(1024)
    .setImgMaxHeight(1024)
    .setKeepOriginal(false)
    .build();

The DisarmConfig object is fully immutable once built: the set-prefixed methods shown above belong to the builder, and the built object exposes no setters and allows no mutation after construction. The runtime state that flows through a processing pipeline lives in a separate DisarmState object, keeping configuration and execution cleanly separated.
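
For readers unfamiliar with the pattern, the following sketch shows one common way to get an immutable configuration object out of a mutable builder. SketchConfig is a hypothetical stand-in, not the real DisarmConfig. The duration and image-size defaults are the ones stated earlier in this article (5 minutes, 512×512); the default output path and the validation rule are assumptions.

```java
// Hypothetical stand-in for an immutable config with a fluent builder.
final class SketchConfig {

    private final String outputPath;
    private final long audioMaxDurationMs;
    private final int imgMaxWidth;
    private final int imgMaxHeight;

    // Only the builder can construct instances; all fields are final.
    private SketchConfig(Builder b) {
        this.outputPath = b.outputPath;
        this.audioMaxDurationMs = b.audioMaxDurationMs;
        this.imgMaxWidth = b.imgMaxWidth;
        this.imgMaxHeight = b.imgMaxHeight;
    }

    static Builder builder() {
        return new Builder();
    }

    // Read-only accessors: the built object has no setters.
    String outputPath() { return outputPath; }
    long audioMaxDurationMs() { return audioMaxDurationMs; }
    int imgMaxWidth() { return imgMaxWidth; }
    int imgMaxHeight() { return imgMaxHeight; }

    static final class Builder {
        private String outputPath = "output/";     // assumed default
        private long audioMaxDurationMs = 300_000; // 5 minutes
        private int imgMaxWidth = 512;
        private int imgMaxHeight = 512;

        Builder setGeneralOutputPath(String path) { this.outputPath = path; return this; }
        Builder setAudioMaxDuration(long millis) { this.audioMaxDurationMs = millis; return this; }
        Builder setImgMaxWidth(int px) { this.imgMaxWidth = px; return this; }
        Builder setImgMaxHeight(int px) { this.imgMaxHeight = px; return this; }

        SketchConfig build() {
            // Validate once here so every built config is known to be sane (assumed rule).
            if (outputPath == null || audioMaxDurationMs <= 0 || imgMaxWidth <= 0 || imgMaxHeight <= 0) {
                throw new IllegalStateException("invalid configuration");
            }
            return new SketchConfig(this);
        }
    }
}
```

Changing the builder after build() has no effect on an object that was already built, because the values are copied into final fields at construction time.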

Disarming a file is a single call:

// state: the DisarmState for this run; config: the DisarmConfig built above
App app = new App(state, config);
long ms = app.fileDisarm(Path.of("uploads/suspicious.mp3"));

Bulk processing, optional logo watermarking, and delete-on-completion are all first-class features of the same API.

Security Philosophy

A few design choices are worth highlighting:

Error messages are intentionally vague. When validation fails, Disarm returns generic rejection messages rather than explaining precisely which check failed. This is deliberate — detailed error feedback gives an attacker a roadmap for crafting inputs that inch closer to passing.

Format restrictions are compile-time constants. The FormatRegistry is a final class with no setters. Changing the whitelist requires recompiling the library, not editing a config file. This prevents runtime misconfiguration.

Re-encoding, not inspection. Disarm doesn't maintain a database of threat signatures, and it doesn't try to detect malware. It decodes the input and builds a fresh file from the decoded content. Threats that no signature database knows about yet become largely irrelevant when the output is always produced from scratch.
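
The first two choices can be pictured together in a few lines. This is a sketch under assumptions, not the real FormatRegistry: the class name, the MIME strings in the set, and the rejection text are all invented for illustration.

```java
import java.util.Set;

// Hypothetical sketch: a compile-time whitelist plus a deliberately generic rejection.
final class FormatRegistrySketch {

    // Set.of returns an unmodifiable set, so the whitelist cannot change at runtime.
    static final Set<String> ALLOWED_MIME_TYPES = Set.of(
            "image/png", "image/jpeg", "audio/mpeg", "video/mp4", "text/plain");

    // One message for every failure; it does not reveal which check rejected the file.
    static final String GENERIC_REJECTION = "File rejected.";

    private FormatRegistrySketch() { } // no instances, no setters

    static final class RejectedFileException extends Exception {
        RejectedFileException() {
            super(GENERIC_REJECTION);
        }
    }

    /** Throws the same generic exception for unknown, disallowed, or missing types. */
    static void requireAllowed(String mimeType) throws RejectedFileException {
        if (mimeType == null || !ALLOWED_MIME_TYPES.contains(mimeType)) {
            throw new RejectedFileException();
        }
    }
}
```

A caller sees the same message whether the type was missing, misspelled, or simply not on the list, which is the property the first design choice asks for.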

CLI Support

Disarm ships with a PicoCLI-based command-line interface for direct use without embedding it in an application:

java -jar disarm-1.0-SNAPSHOT-uber.jar -o output/ -l logo.png -do uploads/file.mp4

Options:

-o / --output — output directory
-l / --logo — optional watermark image (PNG only)
-do / --delete-original — remove the source file after processing

The uber-jar bundles all dependencies, so deployment is a single file copy.

Stack

Image processing — OpenCV 4.9.0 (openpnp)
Audio/video re-encoding — JAVE 3.5.0 (FFmpeg wrapper)
CLI parsing — PicoCLI
Logging — Log4j2
Testing — JUnit 5.11.0
Build — Maven, Java 21

Current State and Roadmap

Disarm v0.1 is functional across all four media categories. Known limitations being tracked for future releases:

UTF-32 LE BOM detection is not yet implemented
MIME type detection currently relies on Files.probeContentType() — a magic-bytes fallback is planned for v0.2
Absolute path checking is intentionally disabled for local use; it will be re-enabled alongside networking features
Thread safety is not guaranteed — DisarmState is designed for single-file sequential processing
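
As a rough idea of what the planned magic-bytes fallback could look like, the sketch below matches a few well-known file signatures against the first bytes of a file. It is speculative: MagicBytesSketch and its sniff method are hypothetical names, the returned MIME strings are illustrative, and nothing here reflects the design Disarm will actually ship in v0.2.

```java
import java.util.Optional;

// Hypothetical sketch of signature-based type detection.
final class MagicBytesSketch {

    private MagicBytesSketch() { }

    /** Returns a MIME type guess from leading bytes, or empty if no signature matches. */
    static Optional<String> sniff(byte[] head) {
        if (matches(head, 0, 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A)) {
            return Optional.of("image/png");   // PNG signature
        }
        if (matches(head, 0, 0xFF, 0xD8, 0xFF)) {
            return Optional.of("image/jpeg");  // JPEG start-of-image marker
        }
        if (matches(head, 0, 0x66, 0x4C, 0x61, 0x43)) {
            return Optional.of("audio/flac");  // "fLaC"
        }
        if (matches(head, 0, 0x4F, 0x67, 0x67, 0x53)) {
            return Optional.of("audio/ogg");   // "OggS"
        }
        if (matches(head, 0, 0x52, 0x49, 0x46, 0x46)
                && matches(head, 8, 0x57, 0x41, 0x56, 0x45)) {
            return Optional.of("audio/wav");   // "RIFF" at offset 0, "WAVE" at offset 8
        }
        return Optional.empty();
    }

    /** True if data contains the signature at the given offset. */
    private static boolean matches(byte[] data, int offset, int... signature) {
        if (data == null || data.length < offset + signature.length) {
            return false;
        }
        for (int i = 0; i < signature.length; i++) {
            if ((data[offset + i] & 0xFF) != signature[i]) {
                return false;
            }
        }
        return true;
    }
}
```

A signature match only says what a file claims to be at the byte level. In a design like Disarm's it would narrow down which decoder to try, with the re-encoding step still doing the actual sanitization.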

Why This Matters

File upload handling is one of the most consistently exploited attack surfaces in web applications. Libraries that simply validate file extensions or check MIME headers provide weak guarantees — MIME types are trivially spoofed and extension checks are easily bypassed. A library that re-encodes every input from scratch, validates it against strict format-specific limits before touching it, and produces output with known-good properties is a fundamentally stronger defense.

Disarm's approach is simple enough to audit, opinionated enough to be useful out of the box, and configurable enough to adapt to different deployment requirements. For any Java application that needs to handle untrusted media uploads, it's a solid foundation.

Disarm is MIT licensed. Source: disarm.n0153.dev