Problematic parallel calculation on certain workstations

Post by xixipang » Tue Apr 24, 2018 4:41 pm

Hi!

Recently I have been trying to run Serpent 2.1.30 on some of my workstations.

The problem is that when the number of OpenMP threads exceeds the number of cores (hyper-threading is already enabled in the BIOS), memory usage increases drastically as the depletion steps proceed, and the program eventually stops due to insufficient memory. The code was compiled without MPI. The problem does not occur when the number of threads is less than the number of cores; in that case memory usage stays stable during depletion. However, when other codes are running on the workstation and occupying the CPU at the same time, the problem occasionally appears even when the number of threads is less than the number of cores.
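To make the thread-versus-core comparison reproducible, something like the following could be run on each workstation. It is only a sketch: it assumes that "cores" means physical cores, that the machine is x86 Linux where /proc/cpuinfo exposes "physical id" and "core id" fields, and the function names count_physical_cores and oversubscribed are made up for illustration.

```python
import os

def count_physical_cores(cpuinfo_text):
    """Count unique (physical id, core id) pairs in /proc/cpuinfo-style text."""
    cores = set()
    physical_id = core_id = None
    # The extra "" makes sure the last logical-CPU block is flushed.
    for line in cpuinfo_text.splitlines() + [""]:
        if not line.strip():
            # A blank line ends the block for one logical CPU.
            if physical_id is not None and core_id is not None:
                cores.add((physical_id, core_id))
            physical_id = core_id = None
            continue
        key, _, value = line.partition(":")
        key = key.strip()
        if key == "physical id":
            physical_id = value.strip()
        elif key == "core id":
            core_id = value.strip()
    return len(cores)

def oversubscribed(omp_threads, physical_cores):
    """True when more OpenMP threads are requested than physical cores exist."""
    return omp_threads > physical_cores

if __name__ == "__main__" and os.path.exists("/proc/cpuinfo"):
    with open("/proc/cpuinfo") as f:
        n_cores = count_physical_cores(f.read())
    print("logical CPUs (hyper-threads included):", os.cpu_count())
    print("physical cores:", n_cores)
```

If the fields are missing (some virtual machines and non-x86 systems omit them), the count comes out as 0 and the result should not be trusted.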

The input is the sample PWR fuel assembly input given in the manual. I have compiled Serpent with gcc 4.8, gcc 5.4, gcc 7.2 and icc 2018, and run it on Ubuntu 16.04, CentOS 6.9 and CentOS 7.0 installed on a ThinkStation P910 (2 × Xeon E5-2690 v4, 56 threads, 64 GB memory) and a Lenovo System 3850 (2 × Xeon E7-4809 v3, 32 threads, 32 GB memory). The memory growth seems to occur generally on both of these workstations. However, it does not occur on a Dell Precision 7820T (2 × Xeon Gold 5120, 56 threads, 192 GB memory, Ubuntu 16.04). No matter which workstation the Serpent executable was compiled on, the problem appears only on the first two.

It seems that the compiler and the operating system have been excluded as causes. Could this be a bug in the code, or is there really some hardware incompatibility with the first two workstations?

As a workaround, we recompiled Serpent with MPI and now run it with the following configuration:

mpiexec -np 2 serpent -omp 20 input

Sometimes this works and sometimes it fails; the behavior is quite random. Because of the limited memory on the first two workstations, I haven't tried running with MPI only.
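Since the failures look random, it may help to record how the memory grows in each run so that good and bad runs can be compared. The following is a rough sketch that polls the resident memory (the VmRSS line in /proc/<pid>/status) of a running process on Linux. The names parse_vmrss_kib and log_rss and the 10-second interval are arbitrary choices for illustration, not part of Serpent.

```python
import sys
import time

def parse_vmrss_kib(status_text):
    """Return the VmRSS value in kB from /proc/<pid>/status text, or None."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1])
    return None

def log_rss(pid, interval_s=10.0, samples=None):
    """Print one VmRSS sample per interval until the process exits.

    If samples is given, stop after that many samples. Returns the
    number of samples taken.
    """
    taken = 0
    while samples is None or taken < samples:
        try:
            with open(f"/proc/{pid}/status") as f:
                rss = parse_vmrss_kib(f.read())
        except (FileNotFoundError, ProcessLookupError):
            break  # the process has finished or was killed
        print(f"{time.strftime('%H:%M:%S')} pid={pid} VmRSS={rss} kB", flush=True)
        taken += 1
        time.sleep(interval_s)
    return taken

if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python this_script.py <pid of the running calculation>
    log_rss(int(sys.argv[1]))
```

With the MPI configuration above there are two processes, so each PID would need to be logged separately.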

Does anyone have the same problem?

Thanks a lot!
xixipang
 
Posts: 23
Joined: Mon Nov 04, 2013 12:37 pm
