[Nagios-devel] Nagios3 hang
ae at op5.se
Fri May 8 07:39:45 UTC 2009
Oliver Peng wrote:
> Hi all,
> I wrote an event handler that starts our daemon service. The problem is
> that once the event handler finishes, the whole Nagios process hangs.
> CPU usage goes to 100%, and strace shows the following system call
> repeated over and over:
> read(5, 0xbf879450, 1023) = -1 EAGAIN (Resource temporarily unavailable)
> After checking the source code, I found the cause. The Nagios main
> process creates a pipe (fds 5 and 6) and forks a child process. The
> child uses popen() to create a pipe and invoke the event handler
> script. Once the child has the output from the event handler script,
> it passes it to the main process through fd 6. Because popen() does
> not close fd 6 before invoking the event handler script, fd 6 is
> inherited by our daemon process. Because our daemon process does not
> close fd 6, the Nagios main process keeps getting EAGAIN.
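The mechanism described above can be reproduced in a few lines. This is only a sketch, not Nagios code: it is written in Python for brevity, the function names are made up for illustration, and os.dup() stands in for the copy of fd 6 that the daemon inherits. A non-blocking read on an empty pipe reports EAGAIN for as long as any copy of the write end is open, and reports end-of-file only once the last copy is closed.

```python
import os


def read_status(fd):
    """Do one non-blocking read and classify the result."""
    try:
        data = os.read(fd, 1023)
    except BlockingIOError:      # errno EAGAIN: no data yet, but a writer still exists
        return "EAGAIN"
    return "DATA" if data else "EOF"


def demo():
    r, w = os.pipe()             # plays the role of fds 5 and 6
    os.set_blocking(r, False)
    leaked = os.dup(w)           # assumption: stands in for the daemon's inherited fd 6
    os.close(w)                  # the worker child exits and closes its own copy
    before = read_status(r)      # write end is still open through `leaked`
    os.close(leaked)             # the daemon finally closes (or exits)
    after = read_status(r)
    os.close(r)
    return before, after


print(demo())  # -> ('EAGAIN', 'EOF')
```

A reader that simply retries on EAGAIN therefore spins for as long as the daemon lives, which matches the 100% CPU and the strace output.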
> To fix this problem, I would suggest two changes:
> 1. Don't use popen(). Instead, create a pipe, fork a child process, and
> close all unused file descriptors before invoking the event handler
> script.
> 2. In the main process, set a timeout on retrying after EAGAIN to
> avoid an endless loop.
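A rough sketch of both suggestions together, again in Python rather than in the C that Nagios uses, and not taken from the Nagios source. The name run_handler, the 4096 upper bound on descriptors, and the 5-second default are all assumptions made for illustration. The child keeps only stdin, stdout and stderr before exec, and the reader waits with select() against a deadline instead of retrying on EAGAIN.

```python
import os
import select
import signal
import time


def run_handler(argv, timeout=5.0):
    """Run argv and return (output_bytes, timed_out). Hypothetical helper."""
    r, w = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: route stdout/stderr to the pipe, then close everything else
        # so the handler (and anything it starts) cannot inherit stray fds.
        try:
            os.dup2(w, 1)
            os.dup2(w, 2)
            os.closerange(3, 4096)      # 4096 is an assumed upper bound
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)               # reached only if exec failed
    os.close(w)                         # parent keeps the read end only
    deadline = time.monotonic() + timeout
    chunks, timed_out = [], False
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            timed_out = True
            break
        ready, _, _ = select.select([r], [], [], remaining)
        if not ready:                   # nothing arrived before the deadline
            timed_out = True
            break
        data = os.read(r, 1023)
        if not data:                    # EOF: every write end is closed
            break
        chunks.append(data)
    os.close(r)
    if timed_out:
        os.kill(pid, signal.SIGKILL)    # give up on the direct child
    os.waitpid(pid, 0)                  # reap it so no zombie is left
    return b"".join(chunks), timed_out
```

For example, run_handler(["echo", "hello"]) returns the output with timed_out set to False, while run_handler(["sleep", "5"], timeout=0.2) gives up after about 0.2 seconds. The deadline is still needed with the descriptors closed, because a daemon started by the handler can inherit the handler's stdout and hold the pipe open that way.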
Thanks for the writeup of this problem. I'll look into it.
Andreas Ericsson andreas.ericsson at op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231