Use this to figure out how big your virtual machine should be for the On Premise Engine.
TLDR Simple Requirements
More cores decrease runtime. Min 2GB of RAM per core.
History Scan Core Count Equation - 100 mailboxes with 1 year of history on a 4 core machine takes roughly 2.3 to 11.4 days to process.
Ongoing daily scans - Minimum 2 cores suggested. Each core can probably handle 200 mailboxes.
The On Premise Engine can't fully utilize more than 24 cores, so you'll see diminishing returns beyond that.
Detailed Breakdown
200K Emails/CPU/Day
Based on our experience a modern CPU can process about 200K emails per day per core.
The On Premise Engine is normally configured to not transmit internal-only emails, so only external emails count toward the load. The 95th percentile mailbox receives about 250 external emails per day, while the average mailbox receives only about 50.
History Scans
For a history scan, where you are going back multiple days, you can shorten the runtime by allocating more CPU cores. A rough estimate is below.
Mailboxes X 250 emails per day X Days of history = Total Emails
Total Emails / CPU Cores / 200K = Days to Process History
Example History Pull
4 core machine
100 mailboxes
365 days of history
Worst Case (250 external emails per day)
100 mailboxes X 250 external emails per day X 365 days of history = 9.125M emails
9.125M emails / 4 CPU Cores / 200K = 11.4 days to process history
Best Case (50 external emails per day)
100 mailboxes X 50 external emails per day X 365 days of history = 1.825M emails
1.825M emails / 4 CPU Cores / 200K = 2.28 days to process history
⚠️ After 24 cores performance per core may not continue to scale.
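The history scan estimate above can be sketched as a small Python helper. This is only an illustration of the arithmetic in this article, not part of the On Premise Engine; the function name and parameters are made up for the example, and the 200K emails per core per day figure is the rough throughput quoted earlier. It uses the unrounded email total, so the worst case lands a little above 11.4 days.

```python
# Rough throughput figure from this article: emails processed per core per day.
EMAILS_PER_CORE_PER_DAY = 200_000


def history_scan_days(mailboxes, emails_per_mailbox_per_day, days_of_history, cpu_cores):
    """Estimate how many days a history scan takes (hypothetical helper)."""
    if cpu_cores < 1:
        raise ValueError("cpu_cores must be at least 1")
    total_emails = mailboxes * emails_per_mailbox_per_day * days_of_history
    return total_emails / cpu_cores / EMAILS_PER_CORE_PER_DAY


# The example from this article: 100 mailboxes, 365 days of history, 4 cores.
worst = history_scan_days(100, 250, 365, 4)  # 250 external emails per day
best = history_scan_days(100, 50, 365, 4)    # 50 external emails per day
print(f"worst case: {worst:.2f} days, best case: {best:.2f} days")
# prints "worst case: 11.41 days, best case: 2.28 days"
```

The estimate assumes throughput scales linearly with cores, which per the warning above may stop holding past 24 cores.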
Ongoing Daily Scans
We suggest always having at least 2 cores even for daily ongoing scans.
For daily processing, you need enough cores to process 1 day of data. Additional mailboxes are sometimes added later, so you'll also need some spare capacity to scan the history of those mailboxes.
Worst Case (250 external emails per day)
100 mailboxes X 250 external emails per day X 1 day of history = 25K emails
25K emails / 1 CPU core / 200K = 12.5% utilization
Best Case (50 external emails per day)
100 mailboxes X 50 external emails per day X 1 day of history = 5K emails
5K emails / 1 CPU core / 200K = 2.5% utilization
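The same daily utilization arithmetic can be written as a short Python sketch, which may be handy for plugging in your own mailbox counts. The function name is hypothetical and the 200K emails per core per day figure is the rough estimate from this article, so treat the result as a ballpark rather than a guarantee.

```python
# Rough throughput figure from this article: emails processed per core per day.
EMAILS_PER_CORE_PER_DAY = 200_000


def daily_utilization(mailboxes, emails_per_mailbox_per_day, cpu_cores):
    """Fraction of daily capacity used by one day of email (hypothetical helper)."""
    if cpu_cores < 1:
        raise ValueError("cpu_cores must be at least 1")
    daily_emails = mailboxes * emails_per_mailbox_per_day
    return daily_emails / (cpu_cores * EMAILS_PER_CORE_PER_DAY)


print(f"{daily_utilization(100, 250, 1):.1%}")  # prints "12.5%"
print(f"{daily_utilization(100, 50, 1):.1%}")   # prints "2.5%"
```

Whatever is left over (for example, about 87.5% in the worst case above) is the spare capacity available for scanning the history of newly added mailboxes.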
RAM Requirements
You should allocate at least 2 GB of RAM per CPU core.
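As a quick illustration of the RAM rule, the sketch below lists the minimum RAM for a few core counts. The helper name and the chosen core counts are just examples; only the 2 GB per core minimum comes from this article.

```python
# Minimum from this article: 2 GB of RAM for every CPU core.
GB_RAM_PER_CORE = 2


def minimum_ram_gb(cpu_cores):
    """Minimum RAM in GB for a given number of cores (hypothetical helper)."""
    return cpu_cores * GB_RAM_PER_CORE


for cores in (2, 4, 8, 24):
    print(f"{cores} cores -> at least {minimum_ram_gb(cores)} GB RAM")
# prints:
# 2 cores -> at least 4 GB RAM
# 4 cores -> at least 8 GB RAM
# 8 cores -> at least 16 GB RAM
# 24 cores -> at least 48 GB RAM
```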
Cores, not logical processors
Don't count logical processors as part of the core count.
For example, if you have 8 cores and 16 logical processors, then only 8 are "real" cores. The other 8 are hyperthreading cores. When CPU usage is reported across logical processors, you'll never see it go meaningfully above 50%. If you're at 50%, you're using 100% of your cores.
The important part is that the two logical processors on a core share the same execution engine, meaning the units that make up the core are not duplicated. For example, while the arithmetic unit is in use by one thread, it cannot be used by the other thread. This prevents full parallelism: two threads can't execute instructions of the same type at the same time, so one has to wait for the other to finish.
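To make the counting concrete, here is a small Python sketch that converts logical processor numbers into the "real core" numbers used in this article. It assumes 2 logical processors per core, which is the common hyperthreading layout but not universal, and the function names are hypothetical. Note that Python's os.cpu_count() reports logical processors, not physical cores.

```python
import os


def physical_core_estimate(logical_processors, threads_per_core=2):
    """Estimate real cores from logical processors (assumes 2 threads per core)."""
    return max(1, logical_processors // threads_per_core)


def utilization_of_real_cores(reported_cpu_percent, threads_per_core=2):
    """Convert CPU usage reported across logical processors into usage of
    real cores, following the rule of thumb that 50% reported is 100% used."""
    return min(100.0, reported_cpu_percent * threads_per_core)


print(physical_core_estimate(16))       # prints 8
print(utilization_of_real_cores(50.0))  # prints 100.0
print(utilization_of_real_cores(25.0))  # prints 50.0

# os.cpu_count() returns the number of logical processors (or None if unknown).
logical = os.cpu_count() or 1
print(f"{logical} logical processors, about {physical_core_estimate(logical)} real cores if hyperthreading is on")
```

If hyperthreading is disabled on your machine, logical processors and cores are the same number and no halving is needed.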
has_hyperthreading_cores = true
If your server has hyperthreading enabled, you need to set has_hyperthreading_cores = true in the appsettings.json file or in the environment variables. Otherwise, SigParser will oversubscribe the cores, which makes performance worse.