Results from MPI runs

Questions and discussion about applications, input, output and general user topics
Peter Wolniewicz
Posts: 135
Joined: Mon Dec 13, 2010 5:50 pm

Results from MPI runs

Post by Peter Wolniewicz » Wed Sep 28, 2011 2:51 pm

Hi! In an MPI run, is the output that is continuously written to the screen (or to nohup.out) the output from one of the tasks, or is it the average over all tasks?

That is, are the calculated k-effective and the detector output collected continuously, or only after all runs have finished?

The reason I'm asking:

I have two files, the only difference between them is:

diff 19 19long
< set pop 200000 100000 100
---
> set pop 200000 100000 1000

The first one is run with 1 CPU and the second one with 8 CPUs. Neither run has completed, but a check on the .res files gives, for file "19":

% ----- Begin active cycles -----

1.00191 1.00191 0.00000
1.00083 1.00137 0.00054
1.00280 1.00185 0.00057
1.00091 1.00161 0.00046
1.00051 1.00139 0.00042
1.00296 1.00165 0.00043
0.99701 1.00099 0.00076
0.99869 1.00070 0.00072
1.00384 1.00105 0.00072
1.00127 1.00107 0.00064
0.99944 1.00092 0.00060
1.00165 1.00098 0.00055
... and so on ...

and for file "19long", which is run with MPI:

% ----- Begin active cycles -----

0.99839 0.99839 0.00000
1.01223 1.00531 0.00688
0.98707 0.99923 0.00728
0.99960 0.99932 0.00515
0.99600 0.99866 0.00405
1.00167 0.99916 0.00334
0.98720 0.99745 0.00331
0.99654 0.99734 0.00287
1.00643 0.99835 0.00272
0.99697 0.99821 0.00244
0.99881 0.99827 0.00221
0.98858 0.99746 0.00217
0.99477 0.99725 0.00201
0.98995 0.99673 0.00193
... and so on ...

As you can see, the file run on only one CPU and with fewer inactive cycles seems to have a much smaller uncertainty.
I expected the MPI run to be less uncertain, not only because it uses 8 CPUs but also because it has a longer history of inactive cycles.

What have I missed?

cheers

/Peter
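
For reference, the three columns in the active-cycle listings above can be read as the cycle-wise k-eff estimate, the running mean over the cycles so far, and the standard deviation of that mean. A minimal Python sketch (not Serpent source, just this reading of the output) reproduces the first three rows of file "19":

import math

def running_statistics(keff_per_cycle):
    # Yields (cycle k-eff, running mean, std. dev. of the mean) per active cycle.
    total = 0.0
    total_sq = 0.0
    for n, k in enumerate(keff_per_cycle, start=1):
        total += k
        total_sq += k * k
        mean = total / n
        if n > 1:
            var = (total_sq - n * mean * mean) / (n - 1)  # sample variance
            sigma = math.sqrt(max(var, 0.0) / n)          # std. dev. of the mean
        else:
            sigma = 0.0
        yield k, mean, sigma

for row in running_statistics([1.00191, 1.00083, 1.00280]):
    print("%.5f %.5f %.5f" % row)

If the 200000 histories per cycle are divided among the 8 MPI tasks, each task simulates only a fraction of the population, which would explain the larger cycle-to-cycle scatter, and hence the larger third column, in "19long".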

Jaakko Leppänen
Site Admin
Posts: 2377
Joined: Thu Mar 18, 2010 10:43 pm
Location: Espoo, Finland

Re: Results from MPI runs

Post by Jaakko Leppänen » Wed Sep 28, 2011 9:33 pm

The results from all MPI tasks are combined at the end of the transport cycle. All results printed before that are from a single CPU. The history output doesn't combine results at all, not even at the end.
- Jaakko
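
As a toy illustration of this pattern (not Serpent code; it assumes the mpi4py package is available), each task tallies independently and prints only its own running results, and the combined estimate exists only after the final reduce:

# Run with e.g.: mpirun -np 4 python combine_at_end.py
from mpi4py import MPI
import random

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
random.seed(rank)                      # each task uses its own random stream

local_sum = 0.0
local_n = 0
for cycle in range(100):
    score = random.gauss(1.0, 0.01)    # stand-in for a cycle-wise k-eff estimate
    local_sum += score
    local_n += 1
    if rank == 0:                      # the on-screen output: one task only
        print("cycle %3d  task-0 mean %.5f" % (cycle + 1, local_sum / local_n))

# Only here, at the very end, are the per-task tallies combined:
global_sum = comm.reduce(local_sum, op=MPI.SUM, root=0)
global_n = comm.reduce(local_n, op=MPI.SUM, root=0)
if rank == 0:
    print("combined mean over all tasks: %.5f" % (global_sum / global_n))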

Peter Wolniewicz
Posts: 135
Joined: Mon Dec 13, 2010 5:50 pm

Re: Results from MPI runs

Post by Peter Wolniewicz » Fri Sep 30, 2011 12:19 am

OK, so it's pointless to have a run going continuously, as you will never get any combined results from it? :)

Peter Wolniewicz
Posts: 135
Joined: Mon Dec 13, 2010 5:50 pm

Re: Results from MPI runs

Post by Peter Wolniewicz » Fri Sep 30, 2011 12:23 am

BTW, by "the end of the transport cycle" do you mean the end of each cycle, or the end of all the cycles in the input file?

Jaakko Leppänen
Site Admin
Posts: 2377
Joined: Thu Mar 18, 2010 10:43 pm
Location: Espoo, Finland

Re: Results from MPI runs

Post by Jaakko Leppänen » Fri Sep 30, 2011 1:07 am

Yes, the results are combined at the end of the entire transport simulation (not after each generation), and if the calculation isn't completed, you'll never get the results from the parallel tasks. This is something I'm planning to fix in Serpent 2.
- Jaakko

fabioalcaro
Posts: 7
Joined: Wed Sep 19, 2012 6:00 pm
Location: Torino

Re: Results from MPI runs

Post by fabioalcaro » Mon Sep 24, 2012 4:33 pm

Hi,
I have a question concerning the parallel-mode calculation. I ran the BWR example in three different configurations, with a fixed seed for the random number generator. The exec commands for the three cases are, respectively:
(1) sss bwr > bwr.log
(2) mpirun -np 2 sss bwr > bwr.log
(3) mpirun -np 4 sss bwr > bwr.log
The run parameters POP=2000 and CYCLES=500 are the same for all the runs, but I obtain the following results for KEFF (of course, other counters and reaction rates also differ!):
(1) ANA_KEFF (idx, [1: 2]) = [ 1.06977E+00 0.00113 ];
(2) ANA_KEFF (idx, [1: 2]) = [ 1.07017E+00 0.00115 ];
(3) ANA_KEFF (idx, [1: 2]) = [ 1.06839E+00 0.00118 ];
I expected to obtain the same results, just with different computation times. Where is the flaw in my reasoning?

SOLVED
Fabio

Jaakko Leppänen
Site Admin
Posts: 2377
Joined: Thu Mar 18, 2010 10:43 pm
Location: Espoo, Finland

Re: Results from MPI runs

Post by Jaakko Leppänen » Mon Sep 24, 2012 6:47 pm

The MPI parallel mode in Serpent 1 is not reproducible, similar to MCNP. You should expect to see differences in the results when running with a different number of parallel tasks, but the differences should not go beyond statistical accuracy.
- Jaakko
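
A quick check of the numbers above bears this out. Reading the second entry of ANA_KEFF as the relative 1-sigma statistical error (with k close to 1, the distinction between relative and absolute barely matters for this check), the three estimates agree within about one combined standard deviation:

import math

# (k-eff, quoted error) from the three runs above
runs = [(1.06977, 0.00113), (1.07017, 0.00115), (1.06839, 0.00118)]

for i in range(len(runs)):
    for j in range(i + 1, len(runs)):
        ki, ei = runs[i]
        kj, ej = runs[j]
        sigma = math.hypot(ki * ei, kj * ej)   # combined absolute 1-sigma
        print("run %d vs run %d: diff %.5f = %.2f sigma"
              % (i + 1, j + 1, abs(ki - kj), abs(ki - kj) / sigma))

The largest difference comes out at about one sigma, i.e. the runs differ, but not beyond statistical accuracy.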

Peter Wolniewicz
Posts: 135
Joined: Mon Dec 13, 2010 5:50 pm

Re: Results from MPI runs

Post by Peter Wolniewicz » Wed Jul 30, 2014 11:41 pm

Has this been solved in the current version? That is, I'm looking for the combined results from the DET.m file.

Jaakko Leppänen
Site Admin
Posts: 2377
Joined: Thu Mar 18, 2010 10:43 pm
Location: Espoo, Finland

Re: Results from MPI runs

Post by Jaakko Leppänen » Thu Jul 31, 2014 3:40 pm

No, the calculation is still not reproducible in MPI mode. To attain reproducibility, we would have to sacrifice the good scalability of the MPI parallelization, and it would also require major changes in the code. Serpent 2 has an option to run reproducible MPI calculations, but the simulation slows down so much that it's only practical for debugging purposes. Even so, reproducibility is often lost in burnup calculations because of the limited floating-point precision of the nuclide densities.
- Jaakko
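
The floating-point remark can be made concrete: addition is not associative, so reducing per-task partial sums in a different grouping (which is effectively what a different task count does) can change the last bits of an accumulated tally or nuclide density. A tiny self-contained illustration (not Serpent code):

import random

random.seed(1)
scores = [random.uniform(0.0, 1.0) for _ in range(100000)]

serial = sum(scores)                                  # 1 task: one long running sum
half = len(scores) // 2
two_tasks = sum(scores[:half]) + sum(scores[half:])   # 2 tasks: two partial sums reduced

print("serial : %.17f" % serial)
print("split  : %.17f" % two_tasks)
print("identical:", serial == two_tasks)              # typically False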

Peter Wolniewicz
Posts: 135
Joined: Mon Dec 13, 2010 5:50 pm

Re: Results from mpi runs

Post by Peter Wolniewicz » Thu Jul 31, 2014 3:42 pm

Great! Thanks for clarifying that!

Best regards and keep up the good work,

Peter
