Migrating from JSON Files to DynamoDB with boto3

I stored everything in JSON files — blog posts, social data, safety caches, geocoded addresses. It worked for a single-instance Flask app, but JSON on disk doesn't survive container recycling, can't be shared across nodes, and is coupled to your deploy pipeline.

Why DynamoDB

Cost and simplicity. On-demand pricing means $0/month at rest and ~$6/month at my write volume. No servers, no connection pools, no schema migrations. The architecture stayed the same: in-memory globals for fast reads, with DynamoDB replacing json.dump() for persistence. On startup, the app queries DynamoDB to populate memory.

Two tables handle everything. com-blog-data holds user/admin content (blog posts, social posts, sightings, comments) with TTL-based expiration per entity type. com-blog-cache holds machine-generated data (3,323 safety incidents, 4,328 geocoded addresses). Both use a generic PK/SK/ttl/data schema with no GSIs: all filtering happens in memory.
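The startup load under that schema might look like the sketch below. The helper names (query_all, index_by_sk) and the POST partition key are illustrative assumptions, not the actual code; only the com-blog-data table name comes from the setup above.

```python
def query_all(table, pk):
    """Fetch every item in one partition, following DynamoDB's 1 MB page limit."""
    kwargs = {
        "KeyConditionExpression": "PK = :pk",
        "ExpressionAttributeValues": {":pk": pk},
    }
    items = []
    while True:
        page = table.query(**kwargs)
        items.extend(page["Items"])
        if "LastEvaluatedKey" not in page:
            return items
        kwargs["ExclusiveStartKey"] = page["LastEvaluatedKey"]

def index_by_sk(items):
    """Build the in-memory view (sort key -> payload) so reads never hit DynamoDB."""
    return {item["SK"]: item["data"] for item in items}

if __name__ == "__main__":
    import boto3  # only needed when talking to the real table
    table = boto3.resource("dynamodb").Table("com-blog-data")
    POSTS = index_by_sk(query_all(table, "POST"))  # hypothetical partition key
```

Pagination matters even for a small blog: a single Query response is capped at 1 MB, so the loop keeps following LastEvaluatedKey until the partition is exhausted.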

Secrets Migration

A deployment crash revealed that .ebignore's recursive auth.json pattern was silently breaking subdirectory imports. This prompted an audit of every module reading credentials from files. Now every module checks the environment variable first, falls back to auth.json for local dev, and handles errors gracefully. GEMINI_API_KEY, FLASK_SECRET_KEY, and auth credentials all follow this pattern.
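That lookup order can be sketched as follows. get_secret is a hypothetical helper name, not the project's actual function; the key point is that a missing credential returns None instead of raising at import time.

```python
import json
import os

def get_secret(name, auth_path="auth.json"):
    """Environment variable first; fall back to auth.json for local dev.

    Returns None rather than raising, so a module can still be imported
    when the credential is absent (e.g. when .ebignore excludes auth.json).
    """
    value = os.environ.get(name)
    if value:
        return value
    try:
        with open(auth_path) as f:
            return json.load(f).get(name)
    except (FileNotFoundError, json.JSONDecodeError):
        return None

GEMINI_API_KEY = get_secret("GEMINI_API_KEY")
```

Features that need the key can then check for None at call time and degrade gracefully, instead of taking the whole app down at import.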

Lessons

A one-character .ebignore change silently broke three modules at import time. The real fix wasn't patching the ignore file: it was making every module resilient to missing credentials.

DynamoDB stores all numbers as Decimal, so values need conversion on both write and read, and it's easy to forget one direction.

Each migration phase renamed JSON files to .migrated instead of deleting them, so rollback is just a rename and redeploy. Four phases, each independently deployable: blog posts first, then user-generated content, then large caches, then cleanup. Total cost: ~$6/month for storage that survives deployments and scales to multiple nodes.
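The Decimal round-trip can be handled with a pair of recursive converters like the sketch below (helper names are mine, not the project's): floats become Decimal before put_item, and Decimals become int or float after get_item.

```python
from decimal import Decimal

def to_dynamo(obj):
    """DynamoDB rejects Python floats; convert them to Decimal before writing."""
    if isinstance(obj, float):
        return Decimal(str(obj))  # str() avoids binary-float artifacts
    if isinstance(obj, list):
        return [to_dynamo(v) for v in obj]
    if isinstance(obj, dict):
        return {k: to_dynamo(v) for k, v in obj.items()}
    return obj

def from_dynamo(obj):
    """Reads come back as Decimal; convert to int/float for json.dumps and math."""
    if isinstance(obj, Decimal):
        return int(obj) if obj % 1 == 0 else float(obj)
    if isinstance(obj, list):
        return [from_dynamo(v) for v in obj]
    if isinstance(obj, dict):
        return {k: from_dynamo(v) for k, v in obj.items()}
    return obj
```

Going through str() on the way in matters: Decimal(0.1) captures the binary-float representation error, while Decimal("0.1") stores exactly what the JSON had.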