Overview
In the Distributed Service Management UI, administrators can configure the Batch Size for Ingestion jobs.
Batch size controls how many records are processed per database transaction before the system commits and distributes work to processing instances.
The default values are tuned for an 8-core processor, but optimal settings may vary depending on:
Total record volume
Number of ingestion instances
SQL Server performance
Storage I/O capability
This article explains how to choose an appropriate batch size based on your environment.
Note: If you have to choose between a single 16-core machine and two 8-core machines, two 8-core machines are generally the better choice. The reason is how I/O fetch works: in one fetch cycle, each machine fetches 100 units regardless of its core count, so two 8-core machines together fetch 200 units, whereas a single 16-core machine fetches only 100 units in the same cycle.
This makes the dual 8-core setup more efficient for I/O-heavy workloads.
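The arithmetic in the note can be sketched as follows. This is only an illustration: the function name and constant are hypothetical, and the figure of 100 units per machine per fetch cycle is taken directly from the note above.

```python
# Figure from the note: each machine fetches 100 units per fetch cycle,
# independent of how many cores it has.
UNITS_PER_MACHINE_PER_CYCLE = 100


def units_fetched_per_cycle(machines: int) -> int:
    """Total units fetched in one cycle across all machines (hypothetical helper)."""
    if machines < 1:
        raise ValueError("machines must be at least 1")
    return machines * UNITS_PER_MACHINE_PER_CYCLE


single_16_core = units_fetched_per_cycle(1)  # one 16-core machine
dual_8_core = units_fetched_per_cycle(2)     # two 8-core machines
print(single_16_core, dual_8_core)
```

Under this model, adding machines raises fetch capacity per cycle, while adding cores to one machine does not.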
Key Factors That Affect Batch Size
Two primary variables determine optimal configuration:
Total Records Being Processed
High volume (100,000+ records, assuming 8 instances)
Low volume (under 100,000 records)
Number of Ingestion Instances
Low (≤ 8 instances)
High (more than 8 instances)
Recommended Batch Size Guidelines
| Record Volume | Instances | Recommended Batch Size | Reasoning |
|---|---|---|---|
| High (100K+) | ≤ 8 | 100 | Fewer instances reduce SQL contention. Larger batch size reduces repeated database fetch operations. |
| Low (<100K) | ≤ 8 | 25 | Smaller dataset does not require large batches. Prevents uneven workload distribution. |
| High (100K+) | > 8 | 50 | More instances increase SQL I/O pressure. Moderate batch size balances distribution and prevents SQL choking. |
| Low (<100K) | > 8 | 25 | Smaller batches allow better distribution across workers and prevent idle instances. |
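The table can also be read as a small lookup. The sketch below is not part of the product; `recommended_batch_size` is a hypothetical helper that encodes the four rows, assuming "100K+" means 100,000 records or more and that the high-instance rows apply to more than 8 instances.

```python
def recommended_batch_size(total_records: int, instances: int) -> int:
    """Return the starting batch size from the guideline table (hypothetical helper)."""
    if total_records < 0 or instances < 1:
        raise ValueError("total_records must be >= 0 and instances >= 1")

    high_volume = total_records >= 100_000   # assumption: "100K+" includes 100,000
    many_instances = instances > 8           # assumption: high-instance rows mean more than 8

    if not high_volume:
        # Both low-volume rows recommend 25, regardless of instance count.
        return 25
    # High volume: 100 for up to 8 instances, 50 for more than 8.
    return 50 if many_instances else 100


print(recommended_batch_size(250_000, 4))   # high volume, few instances
print(recommended_batch_size(250_000, 12))  # high volume, many instances
print(recommended_batch_size(50_000, 12))   # low volume
```

Treat the returned value as a starting point only; the sections below describe when to move away from it.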
Why Batch Size Matters
Batch size directly impacts:
SQL CPU utilization
Disk I/O
Worker distribution efficiency
Overall ingestion throughput
If Batch Size Is Too High:
SQL Server may experience high CPU utilization
Increased I/O pressure
Longer lock durations
Potential performance degradation across the system
If Batch Size Is Too Low:
Increased database round trips
More frequent commits
Reduced overall throughput
Possible worker idle time in distributed environments
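To make the trade-off concrete, the number of commits (and database round trips) for a job can be estimated from the record count and the batch size. This is a rough sketch that assumes one commit per batch, as described in the overview; `commits_needed` is a hypothetical helper, not a product function.

```python
def commits_needed(total_records: int, batch_size: int) -> int:
    """Estimate commits for a job, assuming one commit per batch."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    if total_records < 0:
        raise ValueError("total_records must not be negative")
    # Integer ceiling division: a final partial batch still needs its own commit.
    return (total_records + batch_size - 1) // batch_size


for size in (25, 50, 100):
    print(size, commits_needed(200_000, size))
```

For 200,000 records, a batch size of 25 needs 8,000 commits while a batch size of 100 needs 2,000. Each of those larger transactions handles four times as many records, which is where the longer lock durations come from.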
General Recommendation
The numbers above are guidelines, not fixed rules.
Final batch size selection should consider:
SQL Server performance capacity
CPU and RAM availability
Disk I/O speed
Network latency (if distributed across machines)
If your SQL Server is high-performing and not experiencing CPU or I/O pressure, batch sizes may be increased cautiously while monitoring performance metrics.
Monitoring Best Practices
When adjusting batch size:
Monitor SQL CPU usage
Monitor disk latency
Watch ingestion throughput (records/hour)
Check for worker idling
Observe SQL blocking or long-running transactions
Adjust incrementally rather than making large jumps.
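When comparing two batch size settings, it helps to normalize throughput to records/hour so that runs of different lengths are comparable. The helpers below are a hypothetical sketch; the record counts and durations are made-up sample values, not measurements.

```python
def records_per_hour(records_processed: int, elapsed_seconds: float) -> float:
    """Normalize a measured run to records/hour."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return records_processed * 3600 / elapsed_seconds


def throughput_change_pct(before: float, after: float) -> float:
    """Percentage change in throughput after a batch size adjustment."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (after - before) / before * 100


# Sample values only: two 15-minute observation windows.
baseline = records_per_hour(30_000, 900)
adjusted = records_per_hour(36_000, 900)
print(baseline, adjusted, throughput_change_pct(baseline, adjusted))
```

A throughput gain is only meaningful if SQL CPU usage, disk latency, and blocking stayed within acceptable limits during the same window.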
Summary
Batch size should balance:
Efficient distribution across ingestion instances
Controlled SQL resource usage
Stable ingestion throughput
Start with the recommended guidelines and fine-tune based on your system’s performance characteristics.
Flowchart Diagram
Final Monitoring Step (Always Apply)
After setting batch size:
Is the SQL Server instance experiencing memory pressure or deadlock contention?
│
├── YES → Reduce Batch Size
│
└── NO → System stable
    │
    └── If SQL Server is underutilized, cautiously increase batch size
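The monitoring step above can be sketched as a small decision function. This is an illustration only: `next_batch_size` and its `step`, `minimum`, and `maximum` parameters are hypothetical values chosen for the example, and how you decide that SQL Server is under pressure or underutilized depends on your own monitoring.

```python
def next_batch_size(
    current: int,
    sql_under_pressure: bool,
    sql_underutilized: bool,
    step: int = 5,        # hypothetical increment, kept small on purpose
    minimum: int = 5,     # hypothetical lower bound
    maximum: int = 200,   # hypothetical upper bound
) -> int:
    """Apply one pass of the monitoring flowchart to the current batch size."""
    if sql_under_pressure:
        # YES branch: reduce the batch size, but never below the minimum.
        return max(minimum, current - step)
    if sql_underutilized:
        # NO branch with spare capacity: increase cautiously, up to the maximum.
        return min(maximum, current + step)
    # NO branch, system stable: keep the current setting.
    return current


print(next_batch_size(50, sql_under_pressure=True, sql_underutilized=False))
print(next_batch_size(50, sql_under_pressure=False, sql_underutilized=True))
print(next_batch_size(50, sql_under_pressure=False, sql_underutilized=False))
```

Pressure is checked first, so a server showing any pressure is never pushed to a larger batch size, and the small step keeps each change incremental.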