simple-mail-cleaner/README.md
2026-01-23 14:01:49 +01:00

# Simple Mail Cleaner
A mail cleanup tool with multi-tenant support, newsletter unsubscribe automation, and a modern web UI.
## Stack
- Backend: Node.js + TypeScript + Fastify
- Frontend: React + Vite + TypeScript + i18n
- DB: PostgreSQL
- Queue: Redis + BullMQ worker
Node.js:
- Docker images use Node.js 24.13.0 (LTS)
## Quick start
```bash
docker compose up --build
```
- Web UI: `http://localhost:${WEB_PORT}` (see root `.env`)
- API: `http://localhost:${API_PORT}`
- API Docs: `http://localhost:${API_PORT}/docs` (only if `ENABLE_SWAGGER=true`)
## API (initial)
- `POST /auth/register` `{ tenantName, email, password }`
- `POST /auth/login` `{ email, password }`
- `GET /tenants/me` (auth)
- `GET /mail/accounts` (auth)
- `POST /mail/accounts` (auth)
- `PUT /mail/accounts/:id` (auth)
- `DELETE /mail/accounts/:id` (auth)
- `POST /mail/cleanup` (auth) `{ mailboxAccountId, dryRun, unsubscribeEnabled, routingEnabled }`
- `GET /jobs` (auth)
- `GET /jobs/:id/events` (auth)
- `GET /jobs/:id/stream-token` (auth) -> short-lived SSE token
- `GET /jobs/:id/stream?token=...` (SSE using short-lived token)
- `GET /rules` (auth)
- `POST /rules` (auth)
- `PUT /rules/:id` (auth)
- `DELETE /rules/:id` (auth)
- `GET /admin/tenants` (admin)
- `PUT /admin/tenants/:id` (admin)
- `GET /admin/users` (admin)
- `PUT /admin/users/:id` (admin)
- `PUT /admin/users/:id/role` (admin)
- `POST /admin/users/:id/reset` (admin)
- `GET /admin/accounts` (admin)
- `PUT /admin/accounts/:id` (admin)
- `GET /admin/jobs` (admin)
- `GET /admin/jobs/:id/events` (admin)
- `POST /admin/jobs/:id/cancel` (admin)
- `POST /admin/jobs/:id/retry` (admin)
- `DELETE /admin/jobs/:id` (admin)
- `POST /admin/impersonate/:userId` (admin)
- `GET /admin/tenants/:id/export` (admin)
- `GET /admin/tenants/:id/export?scope=users|accounts|jobs|rules&format=csv|zip` (admin, zip returns jobId)
- `GET /admin/exports` (admin)
- `GET /admin/exports/:id` (admin)
- `GET /admin/exports/:id/download` (admin)
- `POST /admin/exports/purge` (admin)
- `DELETE /admin/exports/:id` (admin)
- `GET /jobs/exports/:id/stream-token` (admin) -> short-lived SSE token
- `GET /jobs/exports/:id/stream?token=...` (SSE using short-lived token)
- `DELETE /admin/tenants/:id` (admin)
Export queue:
- ZIP exports are queued via Redis/BullMQ and processed by the worker container.
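
The two-step job stream flow from the endpoint list (`/jobs/:id/stream-token`, then `/jobs/:id/stream?token=...`) could be driven from a client roughly as sketched below. The `{ token }` response shape, the `Authorization: Bearer` header, and the helper names `buildStreamUrl`, `getJobStreamUrl`, and `GetJson` are assumptions made for illustration; this README does not document them.

```typescript
// Minimal fetcher signature so the flow can be exercised without a network.
// Hypothetical: in a browser this would wrap fetch() and return the parsed JSON body.
type GetJson = (url: string, headers: Record<string, string>) => Promise<unknown>;

// Builds the SSE URL that carries only the short-lived token.
function buildStreamUrl(apiBase: string, jobId: string, token: string): string {
  const base = apiBase.replace(/\/+$/, "");
  return `${base}/jobs/${encodeURIComponent(jobId)}/stream?token=${encodeURIComponent(token)}`;
}

async function getJobStreamUrl(
  apiBase: string,
  jobId: string,
  userJwt: string,
  getJson: GetJson,
): Promise<string> {
  const base = apiBase.replace(/\/+$/, "");
  // Step 1: authenticated request for a short-lived token.
  // Assumption: the user JWT is sent as a Bearer header, never in the URL.
  const body = await getJson(`${base}/jobs/${encodeURIComponent(jobId)}/stream-token`, {
    Authorization: `Bearer ${userJwt}`,
  });
  // Assumption: the response body looks like { token: "..." }.
  const token =
    typeof body === "object" && body !== null ? (body as { token?: unknown }).token : undefined;
  if (typeof token !== "string" || token.length === 0) {
    throw new Error("stream-token response did not contain a token");
  }
  // Step 2: only the short-lived token appears in the SSE URL.
  return buildStreamUrl(base, jobId, token);
}
```

In a browser, the returned URL would typically be passed to `new EventSource(url)`. Keeping the user JWT out of the query string means it does not end up in proxy logs or browser history.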
OAuth:
- `POST /oauth/gmail/url` (auth)
- `GET /oauth/gmail/callback` (Google redirect)
- `GET /oauth/gmail/status/:accountId` (auth)
- `GET /oauth/gmail/ping/:accountId` (auth)
UI:
- Admin panel supports password reset, job cancel/retry, tenant export/delete, and impersonation.
## Notes
- Newsletter detection uses `List-Unsubscribe` headers + heuristics.
- Web-link unsubscribe tries HTTP first and falls back to mailto (SMTP required).
- Worker scans headers and applies routing rules (MOVE/DELETE) when not in dry run.
## Cleanup job behavior (what the button does)
When you click **“Bereinigung starten / Start cleanup”**, a cleanup job is created and queued. The worker connects to the selected mailbox and:
1. Opens the INBOX (or first mailbox matching “inbox”).
2. Fetches recent message headers (subject/from/headers).
3. Detects newsletter candidates (`List-Unsubscribe`, `List-Id`, heuristics).
4. Applies your routing rules (MOVE/ARCHIVE/LABEL/DELETE) if enabled.
5. Attempts to unsubscribe using `List-Unsubscribe` (HTTP one-click or mailto).
6. Logs all actions and progress as job events (visible in the UI).
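
Step 3 and the first half of step 5 can be sketched as a small header parser. This is an illustration, not the worker's actual code: `parseListUnsubscribe` and `isNewsletterCandidate` are hypothetical names, the additional heuristics mentioned above are left out, and the one-click check follows the `List-Unsubscribe-Post: List-Unsubscribe=One-Click` convention from RFC 8058.

```typescript
type UnsubscribeTargets = { http: string[]; mailto: string[]; oneClick: boolean };

// Header names are case-insensitive, so normalise the keys first.
function lowerCaseKeys(headers: Record<string, string>): Record<string, string> {
  const out: Record<string, string> = {};
  for (const name of Object.keys(headers)) {
    out[name.toLowerCase()] = headers[name] || "";
  }
  return out;
}

function parseListUnsubscribe(headers: Record<string, string>): UnsubscribeTargets {
  const lower = lowerCaseKeys(headers);
  const targets: UnsubscribeTargets = { http: [], mailto: [], oneClick: false };
  const raw = lower["list-unsubscribe"] || "";

  // RFC 2369: a comma-separated list of URIs, each wrapped in angle brackets.
  const pattern = /<([^>]+)>/g;
  let match: RegExpExecArray | null = pattern.exec(raw);
  while (match !== null) {
    const uri = (match[1] || "").trim();
    const scheme = uri.toLowerCase();
    if (scheme.indexOf("mailto:") === 0) {
      targets.mailto.push(uri);
    } else if (scheme.indexOf("https://") === 0 || scheme.indexOf("http://") === 0) {
      targets.http.push(uri);
    }
    match = pattern.exec(raw);
  }

  // RFC 8058: one-click POST is signalled by a companion header.
  const post = lower["list-unsubscribe-post"] || "";
  targets.oneClick = targets.http.length > 0 && /List-Unsubscribe=One-Click/i.test(post);
  return targets;
}

// Header-only detection; the project's extra heuristics are not modelled here.
function isNewsletterCandidate(headers: Record<string, string>): boolean {
  const lower = lowerCaseKeys(headers);
  const targets = parseListUnsubscribe(headers);
  const hasListId = (lower["list-id"] || "").trim() !== "";
  return hasListId || targets.http.length > 0 || targets.mailto.length > 0;
}
```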
### The three checkboxes explained
**Dry run (keine Änderungen / no changes)**
Runs the full scan and logs what *would* happen, but **does not move or delete any mail** and **does not send unsubscribe requests or emails**. Useful for testing rules safely.
**Unsubscribe aktiv (unsubscribe enabled)**
Enables `List-Unsubscribe` handling.
- **Preference** is controlled by the admin setting **“Unsubscribe-Methode bevorzugen”** (preferred unsubscribe method, `UNSUBSCRIBE_METHOD_PREFERENCE`): `http` (default), `mailto`, or `auto`.
- **HTTP** is tried first when the preference is `http` or `auto` (one-click POST when supported).
- **Fallback:** if HTTP fails, the worker **automatically falls back to mailto** when available.
- **MAILTO** sends an email via SMTP (requires SMTP host + app password).
- If preference is `mailto`, mailto is tried first; HTTP is only attempted if no mailto target exists.
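
The preference rules above can be read as a function that returns the order in which methods are attempted. The sketch below is one interpretation, with hypothetical names (`unsubscribeAttemptOrder`, `UnsubscribePreference`). The README does not say how `http` and `auto` differ, so the sketch treats them identically, which is an assumption.

```typescript
type UnsubscribePreference = "http" | "mailto" | "auto";
type UnsubscribeMethod = "http" | "mailto";

// Returns the methods to try, in order, given which targets the message offers.
function unsubscribeAttemptOrder(
  preference: UnsubscribePreference,
  available: { http: boolean; mailto: boolean },
): UnsubscribeMethod[] {
  if (preference === "mailto") {
    // mailto first; HTTP is only attempted when no mailto target exists.
    if (available.mailto) return ["mailto"];
    return available.http ? ["http"] : [];
  }
  // "http" and "auto": HTTP first, then mailto as the automatic fallback.
  const order: UnsubscribeMethod[] = [];
  if (available.http) order.push("http");
  if (available.mailto) order.push("mailto");
  return order;
}
```

A worker loop would then walk the returned list and stop at the first method that succeeds.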
**Routing aktiv (routing enabled)**
Applies your configured rules (conditions → actions).
- MOVE/ARCHIVE/LABEL/DELETE will be executed when not in dry run.
- If disabled, no rule actions are executed (only detection + optional unsubscribe).
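
The three checkboxes map onto the `dryRun`, `unsubscribeEnabled`, and `routingEnabled` fields of `POST /mail/cleanup`. The following sketch spells out how the flags combine according to the descriptions above. The field types and the `describeCleanupEffects` helper are assumptions for illustration only.

```typescript
// Field names come from `POST /mail/cleanup`; the field types are assumptions.
type CleanupRequestBody = {
  mailboxAccountId: string;
  dryRun: boolean;
  unsubscribeEnabled: boolean;
  routingEnabled: boolean;
};

type CleanupEffects = {
  evaluateRules: boolean;            // rules are matched and logged
  executeRuleActions: boolean;       // MOVE/ARCHIVE/LABEL/DELETE actually run
  detectUnsubscribeTargets: boolean; // List-Unsubscribe targets are evaluated
  sendUnsubscribeRequests: boolean;  // HTTP requests or mailto emails are sent
};

function describeCleanupEffects(body: CleanupRequestBody): CleanupEffects {
  // A dry run logs what would happen but never changes the mailbox or sends anything.
  const live = !body.dryRun;
  return {
    evaluateRules: body.routingEnabled,
    executeRuleActions: body.routingEnabled && live,
    detectUnsubscribeTargets: body.unsubscribeEnabled,
    sendUnsubscribeRequests: body.unsubscribeEnabled && live,
  };
}
```

For example, a request with `dryRun: true` and both other flags set evaluates rules and unsubscribe targets but executes nothing.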
## Seed data
```bash
cd backend
DATABASE_URL=postgresql://mailcleaner:mailcleaner@localhost:5432/mailcleaner \
SEED_ADMIN_EMAIL=admin@simplemailcleaner.local \
SEED_ADMIN_PASSWORD=change-me-now \
SEED_TENANT="Default Tenant" \
SEED_TENANT_ID=seed-tenant \
npm run prisma:seed
```
- GDPR (DSGVO): tenant isolation supported; sensitive secrets are encrypted at rest when `ENCRYPTION_KEY` is set.
## Prisma 7 config
Prisma 7 moved datasource URLs out of the schema into `backend/prisma.config.ts`.
- `DATABASE_URL` must be set when running Prisma CLI commands (generate/migrate).
- The config loads the repo root `.env` automatically when run from `backend/`.
## Admin password reset (CLI)
Reset an admin password via CLI:
```
docker compose exec api npm run admin:reset -- admin@simplemailcleaner.local NEW_PASSWORD
```
Generate a temporary password (forces change on next login):
```
docker compose exec api npm run admin:reset -- admin@simplemailcleaner.local
```
## Security hardening (public hosting)
The app includes a security hardening pass for public deployments. Highlights:
- **No public DB/Redis ports** by default (only API/Web are bound, DB/Redis are internal to Docker).
- **CORS locked down** via `CORS_ORIGINS`.
- **Rate limiting** globally and stricter on auth endpoints.
- **Short-lived SSE tokens** instead of using the user JWT in URLs.
- **OAuth state signed** to prevent token injection.
- **SSRF protections** for `List-Unsubscribe` HTTP and custom mail hosts.
- **Secrets encrypted at rest** (OAuth tokens, app passwords, Google client secret).
- **Swagger disabled** by default in production.
- **Production env validation** rejects default secrets and missing encryption key.
### Findings and fixes (audit log)
- **Open DB/Redis ports** → removed public port bindings in `docker-compose.yml`.
- **Default secrets in production** → config validation blocks default JWT/seed secrets in `NODE_ENV=production`.
- **Tokens/app passwords stored in plain text** → encrypted at rest with `ENCRYPTION_KEY`.
- **SSRF via unsubscribe URLs / custom hosts** → private network block + scheme validation + timeouts.
- **OAuth state not verifiable** → state is now a signed, expiring JWT.
- **JWT in SSE URL** → replaced with short-lived stream tokens.
- **CORS allow-all** → restricted by `CORS_ORIGINS`.
- **Swagger exposed** → disabled by default in production.
- **No rate limiting** → global and auth-specific rate limits added.
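
As an illustration of the "private network block + scheme validation" fix for unsubscribe URLs, a URL pre-check might look like the sketch below. It is a simplified stand-in, not the project's implementation: `isPrivateIPv4` and `isAllowedUnsubscribeUrl` are hypothetical names, IPv6 literals are rejected wholesale to keep the example short, and hostnames are not resolved via DNS, which a real SSRF guard would also need to do (together with request timeouts).

```typescript
// True for loopback, RFC 1918, link-local, and "this network" IPv4 literals.
function isPrivateIPv4(host: string): boolean {
  const m = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(host);
  if (m === null) return false;
  const a = Number(m[1]);
  const b = Number(m[2]);
  return (
    a === 0 ||
    a === 10 ||
    a === 127 ||
    (a === 172 && b >= 16 && b <= 31) ||
    (a === 192 && b === 168) ||
    (a === 169 && b === 254)
  );
}

function isAllowedUnsubscribeUrl(raw: string): boolean {
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    return false; // not a parseable absolute URL
  }
  // Scheme validation: only plain web URLs are followed.
  if (url.protocol !== "http:" && url.protocol !== "https:") return false;

  const host = url.hostname.toLowerCase();
  if (host === "localhost" || host.endsWith(".localhost")) return false;
  // Simplification (assumption): reject every IPv6 literal instead of classifying ranges.
  if (host.startsWith("[")) return false;
  return !isPrivateIPv4(host);
}
```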
### Required production settings
Set these in `.env` before going public:
- `NODE_ENV=production`
- `JWT_SECRET=<strong secret>`
- `ENCRYPTION_KEY=<min 32 chars>`
- `CORS_ORIGINS=https://your-domain.tld`
- `TRUST_PROXY=true` (when behind nginx)
- `ENABLE_SWAGGER=false`
- `SEED_ENABLED=false` (after initial setup)
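
A startup check along the lines of the production env validation mentioned earlier could be sketched as follows. The function name `validateProductionEnv` and the list of placeholder secrets are hypothetical; the project's real defaults and error messages are not given in this README.

```typescript
type ProcessEnvLike = Record<string, string | undefined>;

// Hypothetical placeholder values; the project's actual default secrets may differ.
const PLACEHOLDER_SECRETS: string[] = ["change-me", "change-me-now", "secret", "dev-secret"];

// Returns a list of problems; an empty list means the settings look acceptable.
function validateProductionEnv(env: ProcessEnvLike): string[] {
  // The checks only apply to production deployments.
  if (env["NODE_ENV"] !== "production") return [];

  const problems: string[] = [];
  const jwtSecret = env["JWT_SECRET"] || "";
  if (jwtSecret === "" || PLACEHOLDER_SECRETS.indexOf(jwtSecret) >= 0) {
    problems.push("JWT_SECRET must be set to a strong, non-default value");
  }
  const encryptionKey = env["ENCRYPTION_KEY"] || "";
  if (encryptionKey.length < 32) {
    problems.push("ENCRYPTION_KEY must be at least 32 characters");
  }
  const corsOrigins = env["CORS_ORIGINS"] || "";
  if (corsOrigins.trim() === "" || corsOrigins.trim() === "*") {
    problems.push("CORS_ORIGINS must list explicit origins");
  }
  if (env["ENABLE_SWAGGER"] === "true") {
    problems.push("ENABLE_SWAGGER should be false in production");
  }
  return problems;
}
```

Calling it with `process.env` at startup and refusing to boot when the list is non-empty would fail fast on a misconfigured deployment.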
### Optional hardening
- `ALLOW_CUSTOM_MAIL_HOSTS=false` (default) to force provider defaults
- `BLOCK_PRIVATE_NETWORKS=true` (default) to block private IPs in unsubscribe URLs
## Environment
All config lives in the repo root `.env` (see `.env.example`).
## Docker services (docker-compose)
The stack is split into small, single-purpose services so each part can scale and restart independently:
- **web** (Frontend UI)
  - React + Vite single-page app.
  - Serves the user/admin interface and talks to the API via `VITE_API_URL`.
  - In dev it runs the Vite server; in production it serves the built assets.
- **api** (Backend API)
  - Fastify server with auth, rules, mail account management, and job control.
  - Issues short-lived SSE tokens and exposes the job/event endpoints.
  - Connects to Postgres for persistence and Redis for queues.
- **worker** (Background jobs)
  - BullMQ worker that executes cleanup jobs and export jobs.
  - Handles IMAP/Gmail processing, unsubscribe actions, and rule execution.
  - Writes progress events and updates job state in Postgres.
- **postgres** (Database)
  - Primary data store for users, tenants, mailboxes, rules, jobs, candidates, and settings.
  - Keeps all state so jobs can resume after restarts.
- **redis** (Queue & cache)
  - BullMQ queue backend for cleanup/export jobs.
  - Used for fast job coordination between API and worker.
How they interact:
1. **web** calls **api** for login, rule management, and job start.
2. **api** enqueues jobs in **redis** and persists state in **postgres**.
3. **worker** consumes jobs from **redis**, processes mailboxes, and writes results to **postgres**.
4. **web** streams job events from **api** (SSE) for live progress.
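
The interaction above can be modelled with in-memory stand-ins to make the hand-offs explicit. This is a toy model only: the array stands in for the Redis/BullMQ queue, the map and event list stand in for Postgres tables, and the class and method names (`DemoStack`, `startJob`, `processNext`, `eventsFor`) are hypothetical.

```typescript
type DemoJobState = "queued" | "running" | "done";
type DemoJobEvent = { jobId: string; message: string };

class DemoStack {
  private readonly queue: string[] = [];                   // stand-in for the Redis/BullMQ queue
  private readonly jobs = new Map<string, DemoJobState>(); // stand-in for job rows in Postgres
  private readonly events: DemoJobEvent[] = [];            // stand-in for persisted job events

  // Step 2: the api persists the job and enqueues it.
  startJob(jobId: string): void {
    this.jobs.set(jobId, "queued");
    this.queue.push(jobId);
    this.log(jobId, "queued");
  }

  // Step 3: the worker takes the next job, processes it, and writes the results.
  processNext(): boolean {
    const jobId = this.queue.shift();
    if (jobId === undefined) return false;
    this.jobs.set(jobId, "running");
    this.log(jobId, "running");
    // Real mailbox processing would happen here.
    this.jobs.set(jobId, "done");
    this.log(jobId, "done");
    return true;
  }

  // Step 4: the api serves recorded events to the web UI (over SSE in the real app).
  eventsFor(jobId: string): string[] {
    return this.events.filter((e) => e.jobId === jobId).map((e) => e.message);
  }

  stateOf(jobId: string): DemoJobState | undefined {
    return this.jobs.get(jobId);
  }

  private log(jobId: string, message: string): void {
    this.events.push({ jobId, message });
  }
}
```

Because job state and events are persisted separately from the queue, the UI can reload progress for a job even after the api or worker has restarted.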
Export settings:
- `EXPORT_DIR` (default `/tmp/mailcleaner-exports`)
- `EXPORT_TTL_HOURS` (default `24`)
Cleanup settings:
- `CLEANUP_SCAN_LIMIT` (default `0` = no limit). Set to a number to cap how many recent emails are scanned per run.
Proxy settings (Nginx Proxy Manager):
- `TRUST_PROXY=true`
- `VITE_API_URL=https://your-domain.tld`
- `GOOGLE_REDIRECT_URI=https://your-domain.tld/oauth/gmail/callback`
- `CORS_ORIGINS=https://your-domain.tld`
Local ports (override via `.env` in repo root):
- `BIND_IP` (default `127.0.0.1`)
- `API_PORT` (default `8000`, now set to `8201` in `.env`)
- `WEB_PORT` (default `3000`, now set to `3201` in `.env`)
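
The documented defaults for these settings can be summarised in a small loader. This is a sketch with hypothetical names (`loadSettings`, `RuntimeSettings`, `intOr`); silently falling back to the default when a value is not a non-negative integer is a design assumption, not documented behaviour.

```typescript
type EnvVars = Record<string, string | undefined>;

type RuntimeSettings = {
  exportDir: string;
  exportTtlHours: number;
  cleanupScanLimit: number | null; // null means "no limit"
  bindIp: string;
  apiPort: number;
  webPort: number;
};

// Parses a non-negative integer, falling back to the default when unset or invalid.
function intOr(value: string | undefined, fallback: number): number {
  if (value === undefined || value.trim() === "") return fallback;
  const n = Number(value);
  return Number.isInteger(n) && n >= 0 ? n : fallback;
}

function loadSettings(env: EnvVars): RuntimeSettings {
  const scanLimit = intOr(env["CLEANUP_SCAN_LIMIT"], 0);
  return {
    exportDir: env["EXPORT_DIR"] || "/tmp/mailcleaner-exports",
    exportTtlHours: intOr(env["EXPORT_TTL_HOURS"], 24),
    // CLEANUP_SCAN_LIMIT=0 is documented as "no limit".
    cleanupScanLimit: scanLimit === 0 ? null : scanLimit,
    bindIp: env["BIND_IP"] || "127.0.0.1",
    apiPort: intOr(env["API_PORT"], 8000),
    webPort: intOr(env["WEB_PORT"], 3000),
  };
}
```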
## Reverse proxy notes (Nginx)
- Terminate TLS at nginx.
- Only expose nginx (80/443) publicly.
- Keep API/Web bound to `127.0.0.1` (or internal Docker network).
- Set `TRUST_PROXY=true` so the app honors `X-Forwarded-*` headers.
## Nginx Proxy Manager (NPM) setup
Minimal steps to run behind Nginx Proxy Manager with limited nginx customization.
### 1) Bind services locally
In `.env`:
```
BIND_IP=127.0.0.1
API_PORT=8201
WEB_PORT=3201
```
### 2) Put NPM and Mailcleaner in the same Docker network
If NPM runs in Docker, attach both stacks to a shared network (example: `proxy`).
Create network once:
```
docker network create proxy
```
Add to `docker-compose.yml`:
```
networks:
  proxy:
    external: true
```
Then attach services:
```
services:
  api:
    networks: [proxy]
  web:
    networks: [proxy]
```
### 3) Create proxy hosts in NPM
Create **two** Proxy Hosts:
**Frontend**
- Domain: `app.your-domain.tld`
- Scheme: `http`
- Forward Hostname/IP: `mailcleaner-web`
- Forward Port: `3000`
- Websockets: ON
- Block Common Exploits: ON
- SSL: Let's Encrypt, Force SSL
**API**
- Domain: `api.your-domain.tld`
- Scheme: `http`
- Forward Hostname/IP: `mailcleaner-api`
- Forward Port: `8201`
- Websockets: ON
- Block Common Exploits: ON
- SSL: Let's Encrypt, Force SSL
### 4) Environment for public hosting
Set in `.env`:
```
NODE_ENV=production
TRUST_PROXY=true
CORS_ORIGINS=https://app.your-domain.tld
VITE_API_URL=https://api.your-domain.tld
GOOGLE_REDIRECT_URI=https://api.your-domain.tld/oauth/gmail/callback
ENABLE_SWAGGER=false
JWT_SECRET=<strong secret>
ENCRYPTION_KEY=<min 32 chars>
SEED_ENABLED=false
```