How a Simple Export Feature Turned Into a Performance Bottleneck and How We Scaled Past 20,000 Records
Ever faced a situation where a simple record export feature evolves into a full-blown performance nightmare? 🚨
Let me walk you through our journey of scaling challenges, a story that might hit close to home for many engineers.
The Starting Point:
The export feature began as a straightforward implementation:
• Synchronous processing
• Minimal complexity
Handling under 3,000 records was a breeze, until our user base grew and things started to break.
🔥 The Challenge:
• Export requests timing out and blocking the main thread.
• Server resources maxing out, causing delays.
• Frustrated users unable to access their data.
1️⃣ Phase 1: Synchronous Processing
• Initial setup: everything ran inside the request/response cycle.
• Problem: Main thread blocking
• Impact: System-wide slowdowns
• Reality check: We needed a scalable approach (the original flow is sketched below).
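For context, the Phase 1 flow looked roughly like this. It's a simplified sketch rather than our exact code (the Record model and build_export_file helper are placeholders), but it shows why one big synchronous export ties up the request worker:

import io

from django.http import FileResponse

def export_records(request):
    export_format = request.GET.get("format", "pdf")
    records = Record.objects.filter(owner=request.user)     # placeholder model
    file_bytes = build_export_file(records, export_format)  # placeholder helper
    # The worker serving this request is blocked until the whole file is built,
    # which is what caused timeouts once exports grew past a few thousand records.
    return FileResponse(io.BytesIO(file_bytes), as_attachment=True,
                        filename=f"export.{export_format}")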
2️⃣ Phase 2: Going Asynchronous with Django RQ
To tackle the bottleneck, we:
• Started running export operations as background jobs using Django RQ.
• Ensured users received progress feedback on the initiated export (see the sketch after this list).
• Result: Smooth sailing to 14K records
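Here's a minimal sketch of the async setup, assuming django-rq's @job decorator and RQ's job metadata for progress reporting; the job body, Record model, and append_to_export helper are illustrative, not our exact implementation:

from django.http import JsonResponse
from django_rq import job
from rq import get_current_job

@job("default")  # django-rq's decorator gives the function a .delay() method
def generate_export(user_id, export_format):
    current = get_current_job()
    records = Record.objects.filter(owner_id=user_id)  # placeholder model
    total = max(records.count(), 1)
    for index, record in enumerate(records, start=1):
        append_to_export(record, export_format)        # placeholder helper
        # Progress lives on the job itself so an API endpoint can poll it.
        current.meta["progress"] = int(index / total * 100)
        current.save_meta()

def start_export(request):
    # Enqueue the job and return immediately instead of blocking the request.
    rq_job = generate_export.delay(request.user.id, request.GET.get("format", "pdf"))
    return JsonResponse({"job_id": rq_job.id, "status": rq_job.get_status()})

From there, a small status endpoint can look the job up by its ID and report the stored progress back to the frontend.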
By the time we hit 14,000 records, the timeouts reappeared. This time, though, they weren't blocking the main thread, since the exports were running asynchronously. The system wasn't broken, but it couldn't keep up with the demand.
We briefly considered switching to Celery, a robust option for managing tasks at scale. But given our existing infrastructure and familiarity with Django RQ, we decided to stick with it. Instead, we focused on optimizing how we processed records.
3️⃣ Phase 3: Batch Processing
Instead of processing 10,000+ records in one massive operation, we split the work into format-sized batches:
PDF_BATCH_SIZE = 1000
XLS_BATCH_SIZE = 2500

# Pick the batch size for the requested format, then fan the work out.
batch_size = PDF_BATCH_SIZE if export_format == "pdf" else XLS_BATCH_SIZE
total = records.count()
if total > batch_size:
    for i in range(0, total, batch_size):
        batch = records[i:i + batch_size]
        process_export_batch.delay(batch, export_format)
else:
    process_export_batch.delay(records, export_format)
• Split exports into manageable chunks based on format:
- PDFs: Smaller batches (~1,000 records) due to higher processing demands.
- Spreadsheets (XLS/CSV): Larger batches (~2,500 records) because they’re less resource-intensive.
• Processed each batch independently as its own background job (see the sketch after this list).
• Result: Significantly reduced processing time and resource consumption.
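On the worker side, each batch job is self-contained. The sketch below assumes every batch renders a partial file that a later step merges; the renderer and storage helpers are placeholders, not our actual code:

from django_rq import job

@job("default")
def process_export_batch(batch, export_format):
    # Each batch is rendered independently, so one heavy chunk
    # can't stall the whole export or exhaust worker memory.
    if export_format == "pdf":
        partial = render_pdf_pages(batch)         # placeholder renderer
    else:
        partial = render_spreadsheet_rows(batch)  # placeholder renderer
    # Persist the partial result; a final step stitches the pieces together
    # once every batch has finished.
    store_partial_export(partial, export_format)  # placeholder helper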
This simple yet effective technique allowed us to seamlessly scale to 20,000+ records without further interruptions.
🔑 Key Takeaways for Backend Engineers
1️⃣ Scaling is iterative: Each growth milestone brings new challenges—be ready to adapt.
2️⃣ Batch processing for the win: Breaking large tasks into smaller chunks minimizes resource strain.
3️⃣ Stick with what works—but optimize: Incremental changes often trump overhauls.
Scaling isn't about finding a silver bullet—it’s about evolving step by step.
Have you tackled any scaling challenges? Let’s share insights in the comments!