Channel: Clusters and HPC Technology

Incorrect result of mpi_reduce over real(16) sums. (2019)


I have found that MPI_REDUCE does not correctly perform a sum reduction over real(16) variables.

Here is a simple test program:

program testred16

use mpi_f08

implicit none

integer :: me,np
real(16) :: voltq,voltq0
real(8) :: voltd,voltd0
!
! initialize mpi and get the rank and total number of processes
!
call mpi_init
call mpi_comm_rank(mpi_comm_world,me)
call mpi_comm_size(mpi_comm_world,np)
!
! determine total volume of active computational domain and send to the master
!
voltq = 1.0q0
voltd = 1.0d0
write(*,*) 'voltq is',voltq,'in rank',me
write(*,*) 'voltd is',voltd,'in rank',me
voltq0 = 0.0q0
voltd0 = 0.0d0

call mpi_reduce(voltq,voltq0,1,mpi_real16,mpi_sum,0,mpi_comm_world)
call mpi_reduce(voltd,voltd0,1,mpi_real8, mpi_sum,0,mpi_comm_world)

if(me.eq.0) then
  write(*,*) 'voltq0 (16):',voltq0
  write(*,*) 'voltd0 ( 8):',voltd0
endif

call mpi_finalize

end program

I have compiled it by issuing the following command:

mpiifort -o test-mpi-real-16 test-mpi-real-16.f90 -check all -traceback -O0 -debug -warn all

Here are some results:

$ mpiexec -np 2 ./test-mpi-real-16
 voltq is   1.00000000000000000000000000000000       in rank           1
 voltd is   1.00000000000000      in rank           1
 voltq is   1.00000000000000000000000000000000       in rank           0
 voltd is   1.00000000000000      in rank           0
 voltq0 (16):   1.00000000000000000000000000000000      
 voltd0 ( 8):   2.00000000000000     
$ mpiexec -np 4 ./test-mpi-real-16
 voltq is   1.00000000000000000000000000000000       in rank           1
 voltq is   1.00000000000000000000000000000000       in rank           2
 voltd is   1.00000000000000      in rank           2
 voltq is   1.00000000000000000000000000000000       in rank           3
 voltd is   1.00000000000000      in rank           3
 voltd is   1.00000000000000      in rank           1
 voltq is   1.00000000000000000000000000000000       in rank           0
 voltd is   1.00000000000000      in rank           0
 voltq0 (16):   1.00000000000000000000000000000000      
 voltd0 ( 8):   4.00000000000000     
$ mpiexec -np 8 ./test-mpi-real-16
 voltq is   1.00000000000000000000000000000000       in rank           1
 voltd is   1.00000000000000      in rank           1
 voltq is   1.00000000000000000000000000000000       in rank           2
 voltd is   1.00000000000000      in rank           2
 voltq is   1.00000000000000000000000000000000       in rank           4
 voltd is   1.00000000000000      in rank           4
 voltq is   1.00000000000000000000000000000000       in rank           6
 voltd is   1.00000000000000      in rank           6
 voltq is   1.00000000000000000000000000000000       in rank           7
 voltd is   1.00000000000000      in rank           7
 voltq is   1.00000000000000000000000000000000       in rank           3
 voltd is   1.00000000000000      in rank           3
 voltq is   1.00000000000000000000000000000000       in rank           5
 voltd is   1.00000000000000      in rank           5
 voltq is   1.00000000000000000000000000000000       in rank           0
 voltd is   1.00000000000000      in rank           0
 voltq0 (16):   1.00000000000000000000000000000000      
 voltd0 ( 8):   8.00000000000000

The reduction of the real(16) variable is wrong, whereas the real(8) reduction is correct: on the root, voltq0 stays at 1 for every process count, while voltd0 equals the number of processes as expected. I encountered the same error in previous versions (2017, 2018), but setting the environment variable I_MPI_ADJUST_REDUCE=1 fixed it. Now I cannot recover the correct result whatever value I set (or if I leave it unset).
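For reference, this is a minimal sketch of how the workaround from the 2017 and 2018 versions is applied. It assumes a POSIX shell, that mpiexec from Intel MPI is on the PATH, and that the binary is the one built with the compile command above. The guard and the skip message are illustrative additions, not part of the original runs.

```shell
# Select the reduce algorithm that gave correct real(16) sums with 2017/2018.
export I_MPI_ADJUST_REDUCE=1

# Illustrative guard (an assumption, not from the original runs):
# only launch the test when both the launcher and the binary are available.
if command -v mpiexec >/dev/null 2>&1 && [ -x ./test-mpi-real-16 ]; then
    mpiexec -np 2 ./test-mpi-real-16
else
    echo "mpiexec or ./test-mpi-real-16 not found; skipping run"
fi
```

With the 2019 version, voltq0 is still wrong with this setting, so the sketch only documents the procedure that used to help.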

