qmaster won't start

Hello All,

I've installed gridengine 6.2u2 from binaries on Linux x86_64 (CentOS 5.3) using the GUI installer. The installer showed "processing" for the first host for a few minutes, then failed (and failed all other hosts by dependency). The log file in install_logs shows:


starting sge_qmaster

sge_qmaster start problem

Reached 5min timeout, while waiting for qmaster PID file.
sge_qmaster daemon didn't start. Please check your
autoinstall configuration file! Installation failed!

When I run /etc/init.d/sgeqmaster.default I get the same error. For 5 minutes it keeps trying to execute
/usr/share/gridengine/bin/lx24-amd64/qping -info red2 6444 qmaster 1, which fails with:

endpoint red2/qmaster/1 at port 6444: can't find connection

The same thing happens if I use the CLI installer. Any help would be greatly appreciated.

qmaster won't start

Hi!

I have the same problem as "guyt", but when I run /bin/lx24-x86/gethostip hostname I get
hostaname: UnKnown host. I don't know why.

Thank you very much

How is your hostname resolved?

Can you call /usr/share/gridengine/utilbin/lx24-amd64/gethostname? What does it return? Is this the expected hostname?

Check if you have a /tmp/sge_messages file on the host where you tried to install/start qmaster. If so, what does it say?
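
If it helps, the checks above can be scripted. This is only a rough sketch, assuming Python 3 is available on the qmaster host (the stock Python on older distributions may be too old for it). The function names are made up for illustration and are not part of Grid Engine; port 6444 and the /tmp/sge_messages path are taken from this thread. It does not replace gethostname or qping, it only looks at the same things from the operating system's point of view.

```python
import os
import socket


def check_resolution(hostname):
    """Resolve hostname to an IPv4 address and back.

    Returns (ip, reverse_name); either value is None if that lookup fails.
    """
    try:
        ip = socket.gethostbyname(hostname)
    except (OSError, UnicodeError):
        return None, None
    try:
        reverse_name = socket.gethostbyaddr(ip)[0]
    except OSError:
        reverse_name = None
    return ip, reverse_name


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def read_messages(path="/tmp/sge_messages"):
    """Return the contents of the messages file, or None if it does not exist."""
    if not os.path.isfile(path):
        return None
    with open(path, errors="replace") as fh:
        return fh.read()


if __name__ == "__main__":
    name = socket.gethostname()
    ip, reverse_name = check_resolution(name)
    print("hostname:       ", name)
    print("resolves to:    ", ip)
    print("reverse lookup: ", reverse_name)
    # 6444 is the qmaster port used in the qping command earlier in the thread.
    print("port 6444 open: ", port_is_open(name, 6444))
    messages = read_messages()
    print("/tmp/sge_messages:", "not present" if messages is None else "\n" + messages)
```

If the name does not resolve, or the reverse lookup returns a different name than you expect, that is worth fixing first. A closed port only tells you that nothing is listening there, not why.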

Lubos.

Re: How is your hostname resolved?

Thank you Lubos,

My hostname resolves fine and there was no /tmp/sge_messages. I was banging my head against the wall with this for a couple of days. Then a kernel update came through the usual channel; I installed it and rebooted the machine, and now qmaster starts without a problem. I don't know what the cause was. I doubt the new kernel had anything to do with it; more likely it was the reboot (how Windows is that?).

Anyway, it's fixed now, so thanks.

Guyt