Channel: Clusters and HPC Technology

Intel multinode Run Problem

Hi There,

I have a system with 6 compute nodes. The /opt folder is shared over NFS, and the cluster version of Intel Parallel Studio is installed on the NFS server.

I am using Slurm as the workload manager. When I run a VASP job on 1 node there is no problem, but when I run the job on 2 or more nodes I get the following errors:

rank = 28, revents = 29, state = 1
Assertion failed in file ../../src/mpid/ch3/channels/nemesis/netmod/tcp/socksm.c at line 2988: (it_plfd->revents & POLLERR) == 0
internal ABORT - process 0
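
For reference, the revents value in the log is a poll() bitmask and can be decoded by hand. Below is a minimal sketch, assuming the Linux bit values from <poll.h>; the helper name decode_revents is hypothetical and only for illustration.

```python
# Bit values as defined in Linux <poll.h>, hard-coded so the sketch
# does not depend on the platform it is run on.
POLL_FLAGS = {
    "POLLIN": 0x001,
    "POLLPRI": 0x002,
    "POLLOUT": 0x004,
    "POLLERR": 0x008,
    "POLLHUP": 0x010,
    "POLLNVAL": 0x020,
}


def decode_revents(revents):
    """Return the names of the poll flags that are set in revents."""
    return [name for name, bit in POLL_FLAGS.items() if revents & bit]


if __name__ == "__main__":
    # 29 is the revents value reported for rank 28 in the log above.
    print(decode_revents(29))
```

For 29 this gives POLLIN, POLLOUT, POLLERR and POLLHUP, which matches the failed assertion on POLLERR: the TCP socket to another rank reported an error or hang-up.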

I tested SSH between the compute nodes with sshconnectivity.exp /nodefile.
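
The SSH test only covers the SSH port, while the failing assertion comes from the MPI TCP netmod, which opens its own connections between nodes on other ports. A separate check of plain TCP reachability might look like the sketch below. It assumes that something is already listening on the chosen port on the remote node; the function name can_connect and the command-line usage are hypothetical, not part of any Intel tool.

```python
import socket
import sys


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


if __name__ == "__main__":
    # Hypothetical usage: python tcpcheck.py <port> <node> [<node> ...]
    if len(sys.argv) > 2:
        port = int(sys.argv[1])
        for node in sys.argv[2:]:
            status = "ok" if can_connect(node, port) else "FAILED"
            print(node, port, status)
```

If this fails between two compute nodes while SSH works, a firewall rule or a wrong network interface would be a likely place to look, but that is only a guess from the error message.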

User information is shared through an LDAP server, which runs on the head node.

I couldn't find a working solution on the net. Has anyone had this error before?

Thanks.