Serpent 2 on Blue Gene/Q

orca.blu
Posts: 59
Joined: Wed Apr 20, 2011 1:39 pm

Post by orca.blu » Thu Apr 18, 2013 5:08 pm

Hi,

I am trying to use Serpent 2 on a Blue Gene/Q system with runjob.

I do not know how to pass the -mpi and -omp arguments to Serpent.

After a successful cross-compilation, I tried:
    [...]
    EXEC=/fermi/home/userexternal/maufiero/serpent/zero/serpent2.1.12/sss2
    runjob --np $TOTAL_MPI_PROCESSES --ranks-per-node $TASK_PER_NODE --envs OMP_NUM_THREADS=$THREADS --env-all : $EXEC -mpi 1024 -omp 2 /gpfs/scratch/userexternal/maufiero/input/beta/8_u235_u238_xs_drift_i/main_input$
    [...]
but it does not work:
    Job started at Thu Apr 18 15:51:21 CEST 2013

    ***** Thu Apr 18 15:51:24 2013 (seed = 1366293084, MPI task = 0, OMP thread = 0)

    Fatal error in function ParseCommandLine:

    Recursive call failed

    Simulation aborted.
I also tried with “--args” but I got the same result.

We have asked the support team... but maybe it is something really stupid.

Do you have an idea?

Thank you.

Jaakko Leppänen
Site Admin
Posts: 2356
Joined: Thu Mar 18, 2010 10:43 pm
Location: Espoo, Finland

Re: Serpent 2 on Blue Gene/Q

Post by Jaakko Leppänen » Thu Apr 18, 2013 11:35 pm

The error message is related to MPI. When started with the -mpi command line option, Serpent actually executes itself using mpirun. My guess is that the executable is not found in the path where Serpent thinks it is. The path is defined near the beginning of header.h:

/* mpirun executable path */

#define MPIRUN_PATH "mpirun"
Another option is to run sss2 directly using mpirun without the -mpi command line option. In fact, I think you are already doing the same thing with runjob, so maybe just leave out the "-mpi 1024" option and see if that works?
- Jaakko
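
The self-relaunch described above can be pictured with a minimal sketch. The thread does not show how Serpent actually builds its command line, so the function name `build_mpirun_command`, its parameters, and the `-np` flag are assumptions for illustration; only the `MPIRUN_PATH` macro comes from the quoted header.h. On a system where jobs are launched with runjob rather than mpirun, a command built this way would plausibly fail to start, which would be consistent with the "Recursive call failed" message.

```c
#include <assert.h>
#include <stdio.h>

#define MPIRUN_PATH "mpirun" /* default quoted from header.h in this thread */

/* Hypothetical helper, not taken from the Serpent source: formats a
   relaunch command of the form "<mpirun> -np <ntasks> <exe> <input>".
   Returns 0 on success, -1 if the command does not fit into out. */
int build_mpirun_command(char *out, size_t cap, const char *exe,
                         int ntasks, const char *input)
{
    int n;

    assert(out != NULL && exe != NULL && input != NULL && cap > 0);

    n = snprintf(out, cap, "%s -np %d %s %s",
                 MPIRUN_PATH, ntasks, exe, input);

    /* snprintf returns the length it wanted to write; a value >= cap
       means the command was truncated */
    return (n >= 0 && (size_t)n < cap) ? 0 : -1;
}
```

If `MPIRUN_PATH` does not resolve to a working launcher on the target machine, starting the executable directly from the system's own launcher (here runjob) without `-mpi` avoids this step entirely, as suggested above.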

Post by orca.blu » Mon Apr 22, 2013 11:44 am

Thanks, it starts after removing “-mpi 1024”.

Unfortunately, a new problem arose.

Maybe you have a simple, working suggestion for this problem too...

Serpent keeps on reading the input.
After adding some fprintf calls in readtextfile.c,
I realized that the EOF (end-of-file) character is never recognized.

Does anyone have similar problems?
Is the EOF encoding machine-dependent? Do you know where it is defined?
Any other suggestions?

Thank you!
Manuele Aufiero
LPSC/IN2P3/CNRS Grenoble

Post by Jaakko Leppänen » Mon Apr 22, 2013 12:08 pm

Have you tried adding an extra newline at the end of the input file?
- Jaakko

Post by orca.blu » Mon Apr 22, 2013 3:21 pm

Yes, but it does not help.

It keeps on reading the EOF character, without realizing that it is the EOF character...

I added some fprintf calls:

[...]
      while((c = fgetc(fp)) != EOF)
        {
           fprintf(out, "%c ... %i\n", c, sz);
[...]
and I got:

[...]
s ... 1294
e ... 1295
t ... 1296
  ... 1297
d ... 1298
e ... 1299
l ... 1300
n ... 1301
u ... 1302
  ... 1303
0 ... 1304

 ... 1305

 ... 1306           <-- this is the last real character (newline)
� ... 1307
� ... 1308
� ... 1309
� ... 1310
� ... 1311
[...]

Post by Jaakko Leppänen » Mon Apr 22, 2013 11:49 pm

Have you tried running the same input on another system? It could be something platform-dependent.

Also, DOS-format text files are different from UNIX-format files, and since the input routine cannot handle DOS format, Serpent tries to identify the format and terminates the calculation with an error message. This could be something that slips through the check.
- Jaakko
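
For illustration, a DOS-format check of the kind mentioned above could look like the following minimal sketch. It is not Serpent's actual check, which the thread does not show; the function name `has_dos_line_endings` is hypothetical. It only looks for a carriage return immediately followed by a line feed in a buffer that has already been read.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if the buffer contains a CR LF pair
   (the DOS line ending), 0 otherwise. */
int has_dos_line_endings(const char *text, size_t len)
{
    size_t i;

    assert(text != NULL);

    /* i + 1 < len keeps the look-ahead inside the buffer */
    for (i = 0; i + 1 < len; i++)
        if (text[i] == '\r' && text[i + 1] == '\n')
            return 1;

    return 0;
}
```

A file that passes such a check can still trigger other platform-dependent behaviour, so a clean result here does not rule out a problem in the reading loop itself.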

Post by Jaakko Leppänen » Mon Apr 22, 2013 11:55 pm

Could you try printing the numerical value of the character:

[...]
      while((c = fgetc(fp)) != EOF)
        {
           fprintf(out, "%c %d... %i\n", c, (int)c, sz);
[...]
- Jaakko

Post by orca.blu » Tue Apr 23, 2013 11:42 am

I tried on many other systems (x86) without any problem.

Serpent keeps reading character 255 instead of EOF.
I believe it is a machine-dependent problem; I have asked the support team for help.

s 115... 1294
e 101... 1295
t 116... 1296
  32... 1297
d 100... 1298
e 101... 1299
l 108... 1300
n 110... 1301
u 117... 1302
  32... 1303
0 48... 1304

 10... 1305

 10... 1306
� 255... 1307
� 255... 1308
� 255... 1309
� 255... 1310
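
A value of 255 is what EOF (a negative int, typically -1) turns into when it is stored in an unsigned 8-bit char. Plain char is typically unsigned on PowerPC platforms and signed on x86, which would explain why the same loop behaves differently on the two. The sketch below reproduces the effect, assuming that the variable `c` in readtextfile.c is declared as `char`; the thread does not show the declaration, so this is a guess. The function names and the `max_reads` safety cap are hypothetical, and `unsigned char` is used explicitly so the effect shows up on any platform.

```c
#include <assert.h>
#include <stdio.h>

/* Mimics a platform where plain char is unsigned: the result of fgetc()
   is narrowed to unsigned char, so EOF becomes 255 and the comparison
   with EOF never succeeds. max_reads stops the otherwise endless loop. */
long count_with_unsigned_char(FILE *fp, long max_reads)
{
    unsigned char c;
    long n = 0;

    assert(fp != NULL);

    while (n < max_reads && (c = (unsigned char)fgetc(fp)) != EOF)
        n++;

    return n;
}

/* Keeps the result of fgetc() in an int, as the C standard intends,
   so EOF stays distinguishable from every valid byte value. */
long count_with_int(FILE *fp)
{
    int c;
    long n = 0;

    assert(fp != NULL);

    while ((c = fgetc(fp)) != EOF)
        n++;

    return n;
}
```

With a four-byte file, `count_with_int` stops after four characters, while `count_with_unsigned_char` runs until the cap is reached. If the assumption about the declaration holds, changing the variable to `int` would be a smaller fix than rewriting every loop.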

Post by orca.blu » Tue Apr 23, 2013 3:50 pm

They suggested changing the code everywhere to:


while (1) {
    c = fgetc(fp);
    if (feof(fp)) break;
    [...]
}

or something like that.

:(

As far as you remember, are there many source files in which text files are read?
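
A complete version of the feof-based loop suggested by the support team might look like the sketch below. The function name `read_text_feof` and the buffer interface are hypothetical and not taken from the Serpent source. Compared with the fragment above, it keeps the character in an `int` and also checks `ferror`, because `feof` alone never becomes true after a read error and the loop would then not terminate.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: reads at most cap - 1 bytes from fp into buf,
   NUL-terminates buf and returns the number of bytes stored. */
size_t read_text_feof(FILE *fp, char *buf, size_t cap)
{
    size_t n = 0;

    assert(fp != NULL && buf != NULL && cap > 0);

    while (n + 1 < cap) {
        int c = fgetc(fp);

        /* stop on end of file or on a read error */
        if (feof(fp) || ferror(fp))
            break;

        buf[n++] = (char)c;
    }

    buf[n] = '\0';
    return n;
}
```

Because the end-of-file test no longer depends on the value stored in a char, a genuine byte 255 in the input is read as data and is not confused with the end of the file.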

Post by Jaakko Leppänen » Wed Apr 24, 2013 12:22 pm

I counted about 12 subroutines that compare a string character to EOF. So there are a lot of if-statements that need to be changed, but the modification should be relatively simple.

Did you try the feof function in readtextfile.c? If it works on Blue Gene/Q, does it also work on other systems?
- Jaakko
