Hello All,
I installed gridengine 6.2u2 from binaries on a Linux x86_64 machine (CentOS 5.3) using the GUI installer. The installer showed "processing" for the first host for a few minutes, then failed (and failed all other hosts by dependency). The file in install_logs shows:
starting sge_qmaster
sge_qmaster start problem
Reached 5min timeout, while waiting for qmaster PID file.
sge_qmaster daemon didn't start. Please check your
autoinstall configuration file! Installation failed!
When I start /etc/init.d/sgeqmaster.default I get the same error. For 5 minutes it keeps trying to execute
/usr/share/gridengine/bin/lx24-amd64/qping -info red2 6444 qmaster 1
which fails with:
endpoint red2/qmaster/1 at port 6444: can't find connection
The same thing happens if I use the CLI installer. Any help would be greatly appreciated.
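The retry behaviour described above can be reproduced by hand with a much shorter timeout, which makes it quicker to test a fix. This is only a rough sketch: `wait_for` is a hypothetical helper, not part of gridengine, and it simply reruns whatever command it is given.

```shell
#!/bin/sh
# wait_for TRIES DELAY CMD [ARGS...]
# Hypothetical helper: rerun CMD until it succeeds or TRIES attempts are used up.
wait_for() {
    tries=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$tries" ]; do
        # Discard the command's own output; only its exit status matters here.
        if "$@" >/dev/null 2>&1; then
            echo "succeeded on attempt $i"
            return 0
        fi
        i=$((i + 1))
        # Sleep only if another attempt is still to come.
        [ "$i" -le "$tries" ] && sleep "$delay"
    done
    echo "gave up after $tries attempts"
    return 1
}
```

For example, `wait_for 10 3 /usr/share/gridengine/bin/lx24-amd64/qping -info red2 6444 qmaster 1` would give up after about half a minute instead of five, assuming qping exits non-zero when it cannot connect.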
qmaster won't start
Hi!
I have the same problem as "guyt", but when I run /bin/lx24-x86/gethostip hostname I get
hostname: Unknown host. I don't know why.
Thank you very much
How is your hostname resolved?
Can you call /usr/share/gridengine/utilbin/lx24-amd64/gethostname? What does it return? Is this the expected hostname?
Check if you have a /tmp/sge_messages file on the host where you tried to install/start qmaster. If so, what does it say?
Lubos.
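The checks above can be roughly approximated with standard tools when the gridengine utilities are not at hand. This is a minimal sketch, not a replacement for gridengine's own gethostname: it assumes `getent` and `uname -n` see the same name resolution that qmaster does, and the function names (`is_loopback`, `resolve_name`, `check_host`) are made up for illustration. Flagging a loopback address is also an assumption on my part, based on it being a commonly reported cause of qmaster startup trouble.

```shell
#!/bin/sh
# Return 0 if the given address is a loopback address (127.x.x.x or ::1).
is_loopback() {
    case "$1" in
        127.*|::1) return 0 ;;
        *) return 1 ;;
    esac
}

# Print the first address the system resolver returns for a name (nothing if none).
resolve_name() {
    getent hosts "$1" 2>/dev/null | awk 'NR == 1 { print $1 }'
}

# Classify a hostname as UNRESOLVED, LOOPBACK or OK.
check_host() {
    name=$1
    addr=$(resolve_name "$name")
    if [ -z "$addr" ]; then
        echo "UNRESOLVED $name"
    elif is_loopback "$addr"; then
        echo "LOOPBACK $name $addr"
    else
        echo "OK $name $addr"
    fi
}

# Check the name this machine reports for itself.
check_host "$(uname -n)"

# Show the early qmaster startup messages, if the file exists.
if [ -f /tmp/sge_messages ]; then
    cat /tmp/sge_messages
else
    echo "no /tmp/sge_messages"
fi
```

An UNRESOLVED or LOOPBACK result would point at /etc/hosts or DNS rather than at gridengine itself.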
Re: How is your hostname resolved?
Thank you Lubos,
My hostname resolves fine and there was no /tmp/sge_messages. I was banging my head against the wall with this for a couple of days, then a kernel update came through the usual channel. I installed it and rebooted the machine, and now qmaster starts without a problem. I don't know what it was. I doubt the new kernel had anything to do with it; more likely it was the reboot (how Windows is that?).
Anyway, it's fixed now, so thanks.
Guyt