Some thoughts on load balancing...

Andy_Turner
Posts: 18
Joined: Tue Jan 12, 2016 11:38 am

Post by Andy_Turner » Mon Sep 19, 2016 1:18 pm

Since development is turning to variance reduction and weight windows, and a few people have asked about a 'rendezvous/dump' capability, I just wanted to share my thoughts on some related issues.

In MCNP, the existing work distribution model is not ideal. Because restart files are written at intervals during a calculation, histories are run in fixed batches. Within a batch, histories are distributed equally between MPI ranks, and all ranks must finish their share of the batch before the next batch is distributed.

Variance reduction leads to some histories being terminated quickly, whilst others can be sampled many times. This spread in time-per-history can lead to MPI inefficiency, because ranks that finish early sit idle until the slowest rank completes. An extreme case would be 1-2 histories that take hours, which then can't be distributed evenly across nodes.
I have often said to MCNP developers that what would really help scalability is a cleverer work distribution policy.
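
To put a rough number on this, here is a minimal Python sketch of the static model described above. It is a toy calculation, not how MCNP is implemented: the function name, the round-robin dealing and the history times are all assumptions made for illustration.

```python
def static_batch_efficiency(history_times, n_ranks):
    """Parallel efficiency of one batch when histories are dealt out
    round-robin and every rank waits for the slowest rank to finish.
    (Hypothetical helper, for illustration only.)"""
    per_rank = [0.0] * n_ranks
    for i, t in enumerate(history_times):
        per_rank[i % n_ranks] += t
    wall_time = max(per_rank)                 # batch ends when the slowest rank ends
    ideal_time = sum(history_times) / n_ranks  # perfectly balanced case
    return ideal_time / wall_time


# Assumed batch: 999 quick histories (1 time unit each) and one that takes 1000.
times = [1.0] * 999 + [1000.0]
eff = static_batch_efficiency(times, n_ranks=4)
print(f"efficiency on 4 ranks: {eff:.2f}")  # about 0.40
```

In this made-up batch a single long history holds the efficiency to roughly 40% on 4 ranks, and adding ranks only makes it worse, since the wall time can never drop below the time of that one history.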

- Assume there is a requirement to write a restart file at intervals. In that case, when any process has exhausted its history stack it can't just go and get more work from the master.
- However, it would be good for a rank to be able to give work back to the master or to idle ranks, e.g. if one node still has a large particle bank whilst another is doing no work, the master can claw some of the work back from the busy node and send it out to the idle one. This would require a little communication at 'balance intervals' between the main rendezvous cycles.
- If possible, it would be useful to permit the particle bank associated with a single history to be distributed across nodes, e.g. where one history is particularly involved, parts of it can be started on other nodes.
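
The 'balance interval' idea in the list above could be sketched as follows. This is only a planning step in plain Python under stated assumptions: the coordinator already knows every rank's bank size, banked particles are treated as interchangeable units of work, and `plan_transfers` is a hypothetical name. A real code would also need to move the particle data (e.g. over MPI) and keep track of which history each banked particle belongs to for tallying.

```python
def plan_transfers(bank_sizes):
    """Given the number of banked particles on each rank, return a list of
    (source_rank, destination_rank, count) moves that evens out the banks.
    (Hypothetical helper, for illustration only.)"""
    n = len(bank_sizes)
    base, extra = divmod(sum(bank_sizes), n)
    # Target size per rank; the first `extra` ranks take one more particle.
    targets = [base + (1 if r < extra else 0) for r in range(n)]
    surplus = [[r, bank_sizes[r] - targets[r]] for r in range(n)
               if bank_sizes[r] > targets[r]]
    deficit = [[r, targets[r] - bank_sizes[r]] for r in range(n)
               if bank_sizes[r] < targets[r]]
    transfers = []
    i = j = 0
    while i < len(surplus) and j < len(deficit):
        count = min(surplus[i][1], deficit[j][1])
        transfers.append((surplus[i][0], deficit[j][0], count))
        surplus[i][1] -= count
        deficit[j][1] -= count
        if surplus[i][1] == 0:
            i += 1
        if deficit[j][1] == 0:
            j += 1
    return transfers


# Assumed snapshot at a balance interval: rank 0 is busy, the rest nearly idle.
transfers = plan_transfers([120, 0, 5, 3])
print(transfers)  # [(0, 1, 32), (0, 2, 27), (0, 3, 29)]
```

Only the bank sizes have to reach the coordinator, so the communication at each balance interval is small compared with a full rendezvous.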

I have also read that for scaling to 10,000 cores and beyond, people need to implement decentralised communication, i.e. there is no rank 0 coordinator, and each rank is only allowed to talk to a limited number of other ranks. You distribute the particle banks to the nodes initially, and at fixed balance intervals the busy nodes pass some of their particle bank directly to the less busy nodes.
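
A toy version of such a decentralised scheme, simulated in plain Python rather than run over MPI, might look like the sketch below. The ring topology, the choice of moving a quarter of the difference, and the name `diffuse_once` are all assumptions for illustration; this is not taken from any particular code or paper.

```python
def diffuse_once(banks):
    """One balance interval on a ring of ranks. Each rank compares its bank
    only with its right-hand neighbour, and the busier of the two hands over
    a quarter of the difference. There is no central coordinator.
    (Hypothetical helper, for illustration only.)"""
    n = len(banks)
    new = list(banks)
    for r in range(n):
        nb = (r + 1) % n
        diff = banks[r] - banks[nb]          # decisions use the old bank sizes
        move = diff // 4 if diff > 0 else -((-diff) // 4)
        new[r] -= move
        new[nb] += move
    return new


# Assumed start: all work sits on rank 0 of an 8-rank ring.
banks = [1000, 0, 0, 0, 0, 0, 0, 0]
first = diffuse_once(banks)
print(first)  # [500, 250, 0, 0, 0, 0, 0, 250]

for _ in range(50):
    banks = diffuse_once(banks)
print(banks, "spread:", max(banks) - min(banks))
```

Each rank exchanges with two neighbours only, so the cost per balance interval does not grow with the number of ranks. The price is that work spreads gradually, over several intervals, instead of being evened out in one step.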

Anyway, maybe there is potential to implement something smarter and more effective than what MCNP has done.
- Andy Turner, CCFE

gavin.ridley.utk
Posts: 95
Joined: Wed Mar 16, 2016 7:07 am

Re: Some thoughts on load balancing...

Post by gavin.ridley.utk » Thu Nov 01, 2018 1:50 am

Are you referring to Paul Romano's PhD thesis in terms of the decentralized communications?
Gavin Ridley

Yellowstone Energy
