Problematic parallel calculation on certain workstations

Report any suspected bugs and unexpected behavior here

Problematic parallel calculation on certain workstations

Postby xixipang » Tue Apr 24, 2018 4:41 pm

Hi!

Recently I have been trying to use Serpent 2.1.30 on some of my workstations.

The problem is that when the number of OpenMP threads exceeds the number of cores (I have already turned on hyper-threading in the BIOS), the memory usage increases drastically over the depletion steps and the program stops due to insufficient memory. No MPI is used in compiling the code. The problem does not occur when the number of threads is less than the number of cores; in that case the memory usage is stable during depletion. However, when other codes are running on the workstation and occupying the CPU at the same time, the problem occasionally appears even when the number of threads is less than the number of cores.

The input is the sample PWR fuel assembly input given in the manual. I have tried compiling Serpent with gcc 4.8, gcc 5.4, gcc 7.2, and icc 2018, running on Ubuntu 16.04, CentOS 6.9, and CentOS 7.0, installed on a ThinkStation P910 (Xeon E5-2690 v4 ×2, 56 threads, 64 GB RAM) and a Lenovo System 3850 (Xeon E7-4809 v3 ×2, 32 threads, 32 GB RAM). The increasing memory usage seems to be general on both workstations. However, it does not happen on a Dell Precision 7820T workstation (Xeon Gold 5120 ×2, 56 threads, 192 GB RAM, Ubuntu 16.04). No matter which workstation the executable was compiled on, the problem appears only on the first two.

It seems that the compiler and operating system can be excluded as causes. Could this be a bug in the code, or is there really some hardware incompatibility with the first two workstations?

As a compromise, we have now recompiled Serpent with MPI and run it with the following configuration:

mpiexec -np 2 serpent -omp 20 input

Sometimes this works and sometimes it fails; it is quite random. Due to the memory limitation on the first two workstations I have not tried running with full MPI.

Does anyone have the same problem?

Thanks a lot!
xixipang
 
Posts: 23
Joined: Mon Nov 04, 2013 12:37 pm

Re: Problematic parallel calculation on certain workstations

Postby aak52 » Mon Sep 03, 2018 7:41 pm

Hi,
We also see this problem on two different machines. Has a solution been found?
aak52
 
Posts: 5
Joined: Wed Apr 20, 2016 8:37 pm

Re: Problematic parallel calculation on certain workstations

Postby Ville Valtavirta » Tue Sep 04, 2018 7:03 pm

Hi,

At which point of the calculation does this problem occur (preprocessing, transport, after active cycles, i.e. burnup)?

Does the problem occur when the number of threads exceeds the number of physical or logical cores?

As Serpent should not do any memory allocation in OpenMP-parallel parts, I'm not sure what this problem might be related to.

It might be a good idea to try different OpenMP libraries. Are you using the Intel or GNU libraries?

-Ville
Ville Valtavirta
 
Posts: 281
Joined: Fri Sep 07, 2012 1:43 pm

Re: Problematic parallel calculation on certain workstations

Postby Jaakko Leppänen » Wed Sep 05, 2018 10:31 am

When you say "no MPI is used in compiling the code", do you mean that the option "CFLAGS += -DMPI" is also disabled in the Makefile?

Have you tried running the calculation in DEBUG mode? Memory issues are often a symptom of some other problem, which may be caught earlier by setting the debug options on.
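
If it helps, debug mode is enabled in the same way as the MPI option, by uncommenting the corresponding line in the Makefile and recompiling (a sketch only; check your own Makefile for the exact name and placement of the flag):

Code: Select all
# Debug mode (uncomment and recompile; the exact flag name may differ in your Makefile)
CFLAGS += -DDEBUG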
- Jaakko
Jaakko Leppänen
Site Admin
 
Posts: 1971
Joined: Thu Mar 18, 2010 10:43 pm
Location: Espoo, Finland

Re: Problematic parallel calculation on certain workstations

Postby aak52 » Tue Sep 11, 2018 3:00 pm

Hi,

Memory usage increases exponentially with burnup. The attached plot shows MISC_MEMSIZE from the .res output versus burnup, reaching 150 GB after 130 steps.

I compiled with icc 17.0.4, with "CFLAGS += -DMPI" disabled in the Makefile. I am running on an HPC cluster, on a single node (allocated exclusively to my Serpent job) with 2 × Intel Xeon Gold 6142 (Skylake) processors, 2.6 GHz, 16 cores each (32 cores per node), and 192 GB of memory per node.

I am not running more threads than the number of physical cores (as far as I can tell). I've seen the problem with -omp 32, -omp 16, and -omp 15, but it occurs intermittently, and sometimes the memory grows very slowly, ending up at only ~200 MB after 100 depletion steps. We will try DEBUG mode next and report if we find anything.

Thanks!
Attachments
serpent_mem.jpg (24.78 KiB)
aak52
 
Posts: 5
Joined: Wed Apr 20, 2016 8:37 pm

Re: Problematic parallel calculation on certain workstations

Postby Jaakko Leppänen » Fri Sep 14, 2018 12:04 pm

I think the best way to start debugging this would be to figure out which subroutine allocates the memory. The line that prints MISC_MEMSIZE in the standard output is:

Code: Select all
printf("MISC_MEMSIZE              (idx, 1)        = %1.2f;\n", RDB[DATA_TOT_MISC_BYTES]/MEGA);

If you copy this line into burnupcycle.c before the various function calls, you should get a rough idea of where the memory allocation occurs (add a comment to each printout to distinguish them; a sketch is included at the end of this post).

The main loop starts on line 95:

Code: Select all
      for (step = 0; step < steps; step++)

The main subroutines that follow are:

PrepareTransportCycle()
TransportCycle()
PrintCompositions()
WriteDepFile()
PrintDepOutput()
DepletionPolyfit()
SetDepStepSize()
BurnMaterials()
CollectBurnData()
SumDivCompositions()

Could you please add these checks and report back?
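
For illustration, the inserted printouts could look roughly like the sketch below. The MISC_MEM* tags are just arbitrary labels, and the actual function calls with their arguments are only indicated by comments:

Code: Select all
/* Sketch only: tagged memory printouts inside the step loop of burnupcycle.c. */
/* The MISC_MEM* tags are arbitrary labels used to tell the printouts apart.   */

for (step = 0; step < steps; step++)
  {
    printf("MISC_MEM0 (start of step)     = %1.2f;\n",
           RDB[DATA_TOT_MISC_BYTES]/MEGA);

    /* ... PrepareTransportCycle() ... */

    printf("MISC_MEM1 (before transport)  = %1.2f;\n",
           RDB[DATA_TOT_MISC_BYTES]/MEGA);

    /* ... TransportCycle() ... */

    printf("MISC_MEM2 (after transport)   = %1.2f;\n",
           RDB[DATA_TOT_MISC_BYTES]/MEGA);

    /* ... and so on, before and after each of the remaining calls ... */
  }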
- Jaakko
Jaakko Leppänen
Site Admin
 
Posts: 1971
Joined: Thu Mar 18, 2010 10:43 pm
Location: Espoo, Finland

Re: Problematic parallel calculation on certain workstations

Postby aak52 » Fri Sep 14, 2018 5:30 pm

Hi Jaakko,

Please see the attached files. I've included the modified burnupcycle.c with MISC_MEM0...MISC_MEM10 printouts inserted as you suggested, the output (sorry, only the MISC_MEM* printouts, as the full output is too large to upload!), and my input file (run with -omp 32). It looks like the memory usage increases during TransportCycle().

Cheers
Alisha
Attachments
serpent_memory.zip
(71.89 KiB)
aak52
 
Posts: 5
Joined: Wed Apr 20, 2016 8:37 pm

Re: Problematic parallel calculation on certain workstations

Postby Jaakko Leppänen » Mon Sep 17, 2018 9:32 am

Thank you. That should narrow down the candidates to two subroutines:

normalizecritsrc.c
flushbank.c

Could you add similar print commands at the beginning and end of each to see if the memory size is increased while executing these subroutines?
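
For example, something along these lines at the top and bottom of the function in normalizecritsrc.c (and similarly in flushbank.c); the tags are again just arbitrary labels:

Code: Select all
/* At the beginning of the function in normalizecritsrc.c: */
printf("MISC_MEM (enter normalizecritsrc) = %1.2f;\n",
       RDB[DATA_TOT_MISC_BYTES]/MEGA);

/* ... existing function body ... */

/* Just before the function returns: */
printf("MISC_MEM (exit normalizecritsrc)  = %1.2f;\n",
       RDB[DATA_TOT_MISC_BYTES]/MEGA);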
- Jaakko
Jaakko Leppänen
Site Admin
 
Posts: 1971
Joined: Thu Mar 18, 2010 10:43 pm
Location: Espoo, Finland

Re: Problematic parallel calculation on certain workstations

Postby aak52 » Mon Sep 17, 2018 2:22 pm

It appears the increases are coming from normalizecritsrc.c. Modified functions and printouts attached. Thanks!
Attachments
serpent_mem2.zip
(45.66 KiB)
aak52
 
Posts: 5
Joined: Wed Apr 20, 2016 8:37 pm

Re: Problematic parallel calculation on certain workstations

Postby Jaakko Leppänen » Tue Sep 18, 2018 11:57 am

I think the problem is caused by the fact that you are running only 2000 neutrons per cycle on 32 threads (fewer than 100 neutrons per thread). The random variation in the population size causes one of the thread-wise particle stacks to run out, which triggers an error condition in which the code starts to reallocate more memory after each cycle.

So the problem might be solved by increasing the population size (try at least 20000), which should also improve parallel scalability. You can also try changing the for-loop on line 430 of normalizecritsrc.c from:

Code: Select all
for (n = 0; n < (long)RDB[DATA_OMP_MAX_THREADS]; n++)
  if ((long)GetPrivateData(ptr, n) < min)
    min = (long)GetPrivateData(ptr, n);

to:

Code: Select all
/* Only consider thread-wise stacks that still contain particles */
for (n = 0; n < (long)RDB[DATA_OMP_MAX_THREADS]; n++)
  if ((i = (long)GetPrivateData(ptr, n)) > 0)
    if (i < min)
      min = i;

This should get rid of the error condition.
- Jaakko
Jaakko Leppänen
Site Admin
 
Posts: 1971
Joined: Thu Mar 18, 2010 10:43 pm
Location: Espoo, Finland
