
MPI run does not stop using Torque

Posted: Wed Oct 18, 2017 3:23 am
by drvcj
Our cluster uses Torque/Maui for workload management, but I'm not sure if this is relevant. The issue is that a Serpent MPI calculation does not stop automatically on completion: it just keeps running without producing any new results or updating the output files.

I'm running Serpent 2.1.29 and the MPI executable was compiled using MPICH.

Has anyone else seen this before?

Re: MPI run does not stop using Torque

Posted: Wed Oct 18, 2017 9:16 am
by Jaakko Leppänen
What is the last thing that the code prints out?

Re: MPI run does not stop using Torque

Posted: Sat Oct 21, 2017 7:00 am
by drvcj
In the .out file, the last lines are:

Isotopic composition (non-zero densities):

-------------------------------------------------------------------
Nuclide     a. weight   temp     a. dens       a. frac       m. frac
-------------------------------------------------------------------
6012.12c    11.99999    1200.0   3.13749E-05   3.54948E-04   8.00000E-05
......

In the _res.m file:
% Delayed neutron parameters (Meulekamp method):

BETA_EFF (idx, [1: 14]) = [ 6.98240E-03 0.00799 2.03562E-04 0.04540 1.07096E-03 0.02025 1.04737E-03 0.02181 3.18216E-03 0.01214 1.09692E-03 0.02026 3.81435E-04 0.03572 ];
LAMBDA (idx, [1: 14]) = [ 8.69875E-01 0.01924 1.24908E-02 3.0E-06 3.15473E-02 0.00039 1.10531E-01 0.00046 3.21989E-01 0.00035 1.34274E+00 0.00025 8.97243E+00 0.00224 ];

In the _sens.m file:
ADJ_PERT_KEFF_SENS_E_INT = reshape(ADJ_PERT_KEFF_SENS_E_INT, [2, SENS_N_PERT, SENS_N_ZAI, SENS_N_MAT]);
ADJ_PERT_KEFF_SENS_E_INT = permute(ADJ_PERT_KEFF_SENS_E_INT, [4, 3, 2, 1]);

BTW, I'm running a sensitivity calculation. It has been hanging there for a day. Thank you very much for your help.

Re: MPI run does not stop using Torque

Posted: Sat Oct 21, 2017 4:53 pm
by Jaakko Leppänen
What about the run-time log?

Re: MPI run does not stop using Torque

Posted: Sat Oct 21, 2017 6:36 pm
by drvcj
Hi,

Here is the run-time log.

Code:

------------------------------------------------------------

Serpent 2.1.29 -- Static criticality source simulation

Title: "DLFR-Core"

Active cycle  500 / 500  Source neutrons :  7981

Running time :                  6:30:02
Estimated running time :        6:30:02
Estimated running time left :   0:00:00

Estimated relative CPU usage :    99.5%

k-eff (analog)    = 1.03990 +/- 0.00066  [1.03861  1.04119]
k-eff (implicit)  = 1.04034 +/- 0.00027  [1.03981  1.04088]

(O4) (SENS) (MPI=1) (OMP=1)
------------------------------------------------------------

Transport cycle completed in 4.44 hours.
Note that there is another "completion" section 206 lines above the bottom of the log:

Code:

------------------------------------------------------------

Serpent 2.1.29 -- Static criticality source simulation

Title: "DLFR-Core"

Active cycle  500 / 500  Source neutrons :  7981

Running time :                  6:05:09
Estimated running time :        6:05:09
Estimated running time left :   0:00:00

Estimated relative CPU usage :    98.6%

k-eff (analog)    = 1.03990 +/- 0.00066  [1.03861  1.04119]
k-eff (implicit)  = 1.04034 +/- 0.00027  [1.03981  1.04088]

(O4) (SENS) (MPI=1) (OMP=1)
------------------------------------------------------------

Transport cycle completed in 4.35 hours.

Re: MPI run does not stop using Torque

Posted: Tue Oct 24, 2017 12:56 pm
by Jaakko Leppänen
The "MPI=1" suggests that you may be running multiple independent calculations instead of a single MPI-parallelized calculation. See: http://serpent.vtt.fi/mediawiki/index.p ... t_MPI_mode

Re: MPI run does not stop using Torque

Posted: Wed Oct 25, 2017 3:23 am
by drvcj
Thanks for the explanation.

I compiled Serpent 2 with the following options:

Code:

# GNU Compiler:

CC       = gcc
CFLAGS   = -Wall -ansi -ffast-math -O3
LDFLAGS  = -lm

# Parallel calculation using Open MP:

CFLAGS  += -DOPEN_MP
CFLAGS  += -fopenmp
LDFLAGS += -fopenmp

# This is needed in newer gcc versions to suppress some unnecessary warnings

CFLAGS += -Wno-unused-but-set-variable

# Remove this if compilation with mpicc produces unnecessary warnings

CFLAGS += -pedantic
At the same time, I also turned on:

Code:

# Parallel calculation using MPI:

# NOTE: The use of hybrid MPI/OpenMP mode requires thread-safe MPI
#       implementation. Some MPI implementations, such as some versions (?) of
#       Open MPI are not thread safe, which will cause problems in memory
#       management routines (calloc, realloc and free). These problems may
#       result in failure in memory allocation or unexpected behaviour due to
#       corrupted registers (?).

CC       = mpicc
CFLAGS  += -DMPI
The calculation is then executed with "sss2 -mpi N". To be honest, I'm not sure whether MPI is what messes things up, so I'm going to run a case with "sss2 -omp N".
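
For what it's worth, here is a minimal sketch of how the two Makefile fragments above might be combined consistently for a hybrid MPI/OpenMP build, i.e. with CC set once to mpicc instead of first to gcc and then to mpicc. It only rearranges the options already quoted in this post; it is not an officially recommended configuration.

Code:

# Compiler: MPI wrapper around gcc
CC       = mpicc
CFLAGS   = -Wall -ansi -ffast-math -O3
LDFLAGS  = -lm

# Parallel calculation using OpenMP (threads within each MPI task):
CFLAGS  += -DOPEN_MP
CFLAGS  += -fopenmp
LDFLAGS += -fopenmp

# Parallel calculation using MPI (distribution across tasks):
CFLAGS  += -DMPI

# Suppress some unnecessary warnings in newer gcc versions:
CFLAGS  += -Wno-unused-but-set-variable

# -pedantic is omitted here, since the stock Makefile notes it may produce
# unnecessary warnings when compiling with mpicc.

After changing these options, the code of course has to be recompiled from scratch before the MPI task count reported in the run-time log can change.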