Hybrid MPI/Multithread runs with High memory requirements

Parallelization with OpenMP and MPI, scalability, reproducibility, errors, problems, suggestions


Postby Diego » Wed Dec 20, 2017 6:48 pm

Dear colleagues;
I'm trying to run a quite large burn problem (~60 GB RAM) on a cluster with up to 64 GB per node (several nodes available). The cluster uses MOAB for administration (jobs are submitted with msub, indicating the number of MPI tasks and OMP threads, similar to qsub).

For this purpose I used the hybrid MPI/Multithread compilation options of Serpent 2.1.29, and I ran into the following issues:

1- If I use OpenMPI (non-thread safe), everything looks OK, but when increasing memory I get a calloc error:
Fatal error in function Mem:
Memory allocation failed (calloc, 655360, 8, 4612.92)

I suppose this is related to the non-thread-safe issue (the failure point is quite random with respect to problem size and running progress).
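For reference, the three numbers in that error message line up with the arguments of the failing calloc call. A minimal sketch of what such a wrapper might look like (this is a hypothetical illustration, not Serpent's actual Mem function; here the last number is assumed to be the total memory already allocated, in MB):

```c
#include <stdio.h>
#include <stdlib.h>

/* Running total of memory handed out by the wrapper, in MB (assumption:
   Serpent tracks something similar and prints it in the error message). */
static double total_alloc_mb = 0.0;

void *MemAlloc(size_t n, size_t size)
{
  /* Zero-initialized allocation, as in the error message */
  void *ptr = calloc(n, size);

  if (ptr == NULL)
    {
      /* Report element count, element size and total allocated MB */
      fprintf(stderr, "Fatal error in function Mem:\n"
              "Memory allocation failed (calloc, %zu, %zu, %.2f)\n",
              n, size, total_alloc_mb);
      exit(EXIT_FAILURE);
    }

  total_alloc_mb += (double)(n*size)/(1024.0*1024.0);

  return ptr;
}
```

So "calloc, 655360, 8, 4612.92" would mean a request for 655360 8-byte elements (5 MB) failed while ~4.6 GB was already allocated.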

2- If I use the thread-safe Intel MPI compilation (Intel MPI 2017), I get an error from the queue system with an exit code I cannot interpret:
===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= PID 3330 RUNNING AT nodeXX
= EXIT CODE: 9
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================
(I also performed some checks in debug mode, but no extra info was obtained.) To me it looks like some thread exceeds the memory limit, but I am not sure how to check that. I have tried to assign all the memory in the node to a single MPI task, but then all OMP threads appear to run on a single processor (despite the output showing (O1) (U) (XE) (MPI=X) (OMP=YY) (CE)).

Obviously, this problem disappears when the memory requirement is lower.

Has anybody experienced similar problems before, or does anyone have some advice/ideas to share?

Thanks in advance,
Best,
Diego
Diego
 
Posts: 70
Joined: Wed Jun 01, 2011 8:49 pm

Re: Hybrid MPI/Multithread runs with High memory requirements

Postby Jaakko Leppänen » Wed Dec 20, 2017 9:18 pm

How do you run this calculation (how many OpenMP threads and how many MPI tasks)? Are you sure all MPI tasks are started on different nodes, so that they do not share memory space?

Also, at which point does this error occur? What is the last thing printed in the output log?
- Jaakko
User avatar
Jaakko Leppänen
Site Admin
 
Posts: 1962
Joined: Thu Mar 18, 2010 10:43 pm
Location: Espoo, Finland

Re: Hybrid MPI/Multithread runs with High memory requirements

Postby Diego » Thu Dec 21, 2017 11:47 am

Hi Jaakko,
I'm performing calculations with 20 OMP threads (fixed) and increasing the number of MPI tasks from 1 to 100. To ensure that the MPI tasks land on different nodes, I ask mpirun to report the bindings (-report-bindings), and I get the list of nodes and ranks.

For the non-thread-safe cases, the error appears at diverse processing stages as memory increases. For some ~30 GB cases I even get through transport, but then hit the error in the burnup step, something like:
Running burnup calculation:
- Step 1 / 2
- Algorithm: CE
- Time interval: 40.3 minutes
- Burnup interval: 0.01 MWd/kgU
- Cumulative burn time after step: 40.3 minutes
- Cumulative burnup after step: 0.01 MWd/kgU
- Transmutation cross sections from direct tallies
- Bateman equations solved by CRAM
Burning 77248 materials:
0% complete
***** Fri Dec 15 14:40:51 2017:
- MPI task = 0
- OpenMP thread = 16
- RNG parent seed = 1513345091
***** Fri Dec 15 14:40:51 2017:
- MPI task = 0
- OpenMP thread = 13
- RNG parent seed = 1513345091
***** Fri Dec 15 14:40:51 2017:
- MPI task = 0
- OpenMP thread = 19
- RNG parent seed = 1513345091
- RNG history seed = 9963469143737094211
Fatal error in function Mem:
Memory allocation failed (calloc, 1557, 8, 35691.26)
Simulation aborted.


For the thread-safe build, the failure point is quite random as well (sometimes in memory allocation, sometimes while calculating the DT neutron majorant cross section, etc.), but I get no print at all in the output (it looks like the queue system aborts the job without much notice).

Thanks,
Diego

Re: Hybrid MPI/Multithread runs with High memory requirements

Postby Jaakko Leppänen » Sat Dec 23, 2017 2:52 pm

The error message means that the system call handling memory allocation returns an error, so basically no more memory can be allocated. Can you check how much memory the calculation uses while it is running? If the consumption is close to the limit, the additional memory required by the processing and burnup routines may just push it over. The thing is that this error usually occurs at the same point of the calculation, unless some other process is running in the same memory space.
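One way to check this from inside the code, assuming the cluster nodes run Linux, is to read the resident set size (VmRSS) from /proc/self/status and print it at a few points during the run. This helper is an illustration, not part of Serpent:

```c
#include <stdio.h>
#include <string.h>

/* Return the resident set size of the current process in kB, read from
   /proc/self/status (Linux-specific), or -1 on failure. Printing this
   from each MPI task before the burnup step would show how close each
   task gets to the node's memory limit. */
long GetRSSkB(void)
{
  FILE *fp;
  char line[256];
  long rss = -1;

  if ((fp = fopen("/proc/self/status", "r")) == NULL)
    return -1;

  /* Scan for the "VmRSS:  <n> kB" line */
  while (fgets(line, sizeof(line), fp) != NULL)
    if (sscanf(line, "VmRSS: %ld kB", &rss) == 1)
      break;

  fclose(fp);

  return rss;
}
```

Calling `printf("RSS = %ld kB\n", GetRSSkB());` before and after the transport cycle would make it easy to compare each task's real consumption against Serpent's own MEMSIZE report.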
- Jaakko

Re: Hybrid MPI/Multithread runs with High memory requirements

Postby Diego » Thu Jan 04, 2018 1:38 pm

Thanks Jaakko!
Unfortunately I have no way to check the memory allocated on the system (only what the Serpent output reports). When I compile with OpenMPI (non-thread safe), I can run the transport cycles as long as the requested RAM is well below the node's limit (considering the 0.8 factor Serpent uses by default). Everything looks fine until the burnup calculation, where I get the error I wrote before:

Fatal error in function Mem:

Memory allocation failed (calloc, 1557, 8, 35611.26)

Simulation aborted.

It is strange, since according to the Serpent output I should have enough memory available:

AVAIL_MEM (idx, 1) = 64040.27 ;
ALLOC_MEMSIZE (idx, 1) = 35611.26;
MEMSIZE (idx, 1) = 32846.30;

When I try to use the thread-safe build (Intel MPI), I am still stuck with the error from the queue system (maybe some mpiexec.hydra options are not set properly, and I cannot figure out what's wrong).

I would appreciate any ideas!
Thanks,
Diego

Re: Hybrid MPI/Multithread runs with High memory requirements

Postby Jaakko Leppänen » Fri Jan 05, 2018 11:11 am

In burnmaterials.c there is a subroutine called BurnMaterials0(), which handles the OpenMP-parallel part of the burnup solution. Inside this subroutine there are calls to four major functions: CalculateTransmuXS(), StoreTransmuXS(), MakeBurnMatrix() and MatrixExponential().

Could you add some print statements to the subroutine to see how far the calculation gets? What makes things a bit more complicated is that several threads run the same routine and you get prints from all of them, so you should probably add the thread number (OMP_THREAD_NUM) to each print.
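A sketch of what such thread-tagged prints might look like (the function name and loop body here are placeholders standing in for the real BurnMaterials0() internals; the commented-out calls are the ones named above):

```c
#include <stdio.h>

#ifdef _OPENMP
#include <omp.h>
#else
/* Fallback so the sketch also compiles without OpenMP */
static int omp_get_thread_num(void) { return 0; }
#endif

/* Placeholder for the OpenMP-parallel burnup loop: each thread prefixes
   its debug output with its thread number so interleaved messages can
   be told apart. */
void BurnMaterials0Debug(int nmat)
{
  int i;

#pragma omp parallel for
  for (i = 0; i < nmat; i++)
    {
      int tid = omp_get_thread_num();

      printf("[thread %d] material %d: before CalculateTransmuXS()\n", tid, i);
      /* CalculateTransmuXS(...); */

      printf("[thread %d] material %d: before MatrixExponential()\n", tid, i);
      /* MatrixExponential(...); */
    }
}
```

With the thread number in each line, the last print seen from a given thread pinpoints which of the four calls it died in.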
- Jaakko

Re: Hybrid MPI/Multithread runs with High memory requirements

Postby Diego » Thu Jan 11, 2018 1:42 pm

Jaakko,
I added several print statements to figure out where exactly the memory allocation fails (it's a little tricky with OMP).
I found that the allocation fails at line 381 of burnmaterials.c (Serpent version 2.1.29), inside the BurnMaterials0() function, and only for some OMP threads:
l. 378  /* Allocate memory for composition vectors */
l. 379
l. 380  N0 = (double *)Mem(MEM_ALLOC, sz, sizeof(double));
l. 381  Neos = (double *)Mem(MEM_ALLOC, sz, sizeof(double));


I cannot figure out why this happens.

Besides, if I change the compiler (from Intel to GNU, keeping OpenMPI), I get the allocation error a little further on (at line 91 of makeburnmatrix.c):
l. 89  /* Allocate memory for matrix */
l. 90
l. 91  A = ccsMatrixNew(sz, sz, nsz);


Again, I cannot figure out why.

Furthermore, we have also detected memory allocation failures at other calculation stages for other (simpler) models with lower memory demands. For example, for a simple FA case we also get a memory allocation failure with this compilation after "Processing nuclide inventory list...", which is avoided if the TMP treatment is not carried out.

BTW, do you have any recommendation of compilers (and options) for the hybrid MPI/Multithread mode that you have already tested?
Thanks in advance,
Diego

Re: Hybrid MPI/Multithread runs with High memory requirements

Postby Jaakko Leppänen » Thu Jan 11, 2018 4:10 pm

The failure seems to occur when the code is trying to allocate more memory, so it may just be a case of insufficient memory to run the problem. Could you add another printf statement before line 381:

printf("%ld %lf\n", sz, RDB[DATA_REAL_BYTES]/MEGA);

This prints the length of the nuclide vector (should be ~1600) and the total amount of allocated memory.
- Jaakko

Re: Hybrid MPI/Multithread runs with High memory requirements

Postby Diego » Fri Jan 12, 2018 10:49 am

Jaakko,
As you said, the output I get after including this print statement is:

1557 35611.256039


BTW, the limit of available memory is ~64 GB.

Diego

Re: Hybrid MPI/Multithread runs with High memory requirements

Postby Jaakko Leppänen » Fri Jan 12, 2018 6:40 pm

So basically 34 GB out of 64 GB is used when memory allocation for 1557 doubles fails. Doesn't make any sense...
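For scale, the failing request really is tiny compared with the nominally free memory. A quick check using the numbers quoted above in this thread:

```c
#include <stdio.h>

/* Size in kB of a calloc request for n elements of the given size;
   values below are taken from the error message in this thread. */
double RequestedKB(long n, long size)
{
  return (double)(n*size)/1024.0;
}

void PrintScale(void)
{
  double req = RequestedKB(1557, 8);      /* 1557 doubles: ~12.2 kB */
  double free_mb = 64040.27 - 35611.26;   /* AVAIL_MEM - ALLOC_MEMSIZE: ~28 GB */

  printf("requested: %.1f kB, nominally free: %.1f MB\n", req, free_mb);
}
```

A ~12 kB request failing with ~28 GB nominally free points at a per-process or per-thread limit (e.g. one imposed by the queue system) rather than the node running out of physical memory.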

Can you double-check that there is nothing else running on the same node? E.g. log in and check with top or something like that?
- Jaakko
