[Nagios-devel] Nagios3 hang

Oliver Peng oliver.peng at skywave.com
Fri May 8 02:35:07 UTC 2009


Hi All:

I wrote an event handler to start our daemon service. The problem is that once the event handler finishes, the whole Nagios process hangs. CPU usage goes to 100%, and strace shows a stream of the following system call:

read(5, 0xbf879450, 1023)               = -1 EAGAIN (Resource temporarily unavailable)

After checking the source code, I found the cause of the problem.

The Nagios main process creates a pipe (fds 5 and 6) and forks a child process. The child uses popen() to create a second pipe and run the event handler script. Once the child has the output of the event handler script, it passes it to the main process through fd 6. Because popen() does not close fd 6 before running the event handler script, fd 6 is inherited by the script and, from there, by our daemon process. Our daemon never closes fd 6, so the write end of the pipe stays open after the child exits. The main process therefore never sees end-of-file on fd 5, and its non-blocking read keeps returning EAGAIN.
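The effect can be reproduced outside Nagios. The following is a minimal sketch, not Nagios code: the function name read_result_with_child and the two-pipe setup are my own assumptions for illustration. A forked child stands in for the daemon. If it keeps the inherited write end open, a non-blocking read in the parent fails with EAGAIN; if it closes the write end, the same read returns 0 (end-of-file).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo helper (not part of Nagios).
 * Returns the errno of a non-blocking read() on the pipe's read end,
 * 0 if read() reported end-of-file, or -1 if the setup failed. */
int read_result_with_child(int close_in_child)
{
    int data[2], ready[2];

    assert(close_in_child == 0 || close_in_child == 1);
    if (pipe(data) == -1)
        return -1;
    if (pipe(ready) == -1) {
        close(data[0]); close(data[1]);
        return -1;
    }

    pid_t pid = fork();
    if (pid == -1) {
        close(data[0]); close(data[1]);
        close(ready[0]); close(ready[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: plays the role of the long-running daemon. */
        close(data[0]);
        close(ready[0]);
        if (close_in_child)
            close(data[1]);          /* the well-behaved case */
        if (write(ready[1], "x", 1) != 1)
            _exit(1);
        for (;;)
            pause();                 /* stay alive, like a daemon */
    }

    /* Parent: wait until the child has finished its setup. */
    close(ready[1]);
    char c;
    ssize_t got = read(ready[0], &c, 1);
    close(ready[0]);

    int result = -1;
    if (got == 1) {
        close(data[1]);              /* parent drops its own write end */
        int flags = fcntl(data[0], F_GETFL);
        fcntl(data[0], F_SETFL, flags | O_NONBLOCK);

        char buf[16];
        ssize_t n = read(data[0], buf, sizeof buf);
        result = (n == -1) ? errno : 0;
    } else {
        close(data[1]);
    }

    close(data[0]);
    kill(pid, SIGKILL);
    waitpid(pid, NULL, 0);
    return result;
}
```

Calling read_result_with_child(0) corresponds to our daemon today, and read_result_with_child(1) to a daemon that closes the descriptors it inherited.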

Suggestion:

To fix this problem, I suggest two changes:

1. Don't use popen(). Instead, create the pipe, fork the child process, and close all unused file descriptors before running the event handler script.

2. In the main process, set a timeout on the EAGAIN retries to avoid an endless loop.
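The two suggestions could be combined roughly as follows. This is only a sketch under my own assumptions, not a patch against the Nagios source: run_handler, now_ms, the 65536 descriptor cap, and the return codes are all hypothetical. The child closes every descriptor above stderr before exec, and the parent waits in poll() with a deadline instead of retrying read() forever.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Milliseconds on the monotonic clock. */
static long now_ms(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (long)ts.tv_sec * 1000L + ts.tv_nsec / 1000000L;
}

/* Hypothetical helper: runs argv[0] with stdout connected to a pipe and
 * copies up to cap-1 bytes of its output into out (NUL-terminated).
 * Returns the number of bytes stored, -1 on error, -2 on timeout. */
int run_handler(char *const argv[], char *out, size_t cap, int timeout_ms)
{
    int fd[2];

    assert(argv != NULL && argv[0] != NULL && out != NULL);
    if (cap == 0 || pipe(fd) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fd[0]); close(fd[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: keep only stdin, stdout and stderr. */
        if (dup2(fd[1], STDOUT_FILENO) == -1)
            _exit(127);
        long max_fd = sysconf(_SC_OPEN_MAX);
        /* Assumption: cap the scan so the sketch stays fast; descriptors
         * above the cap would not be closed. */
        if (max_fd < 0 || max_fd > 65536)
            max_fd = 65536;
        for (int i = 3; i < max_fd; i++)
            close(i);
        execvp(argv[0], argv);
        _exit(127);                  /* exec failed */
    }

    close(fd[1]);
    size_t used = 0;
    int status = 0;                  /* 0 ok, -1 error, -2 timeout */
    long deadline = now_ms() + timeout_ms;

    for (;;) {
        long left = deadline - now_ms();
        if (left <= 0) { status = -2; break; }

        struct pollfd p = { .fd = fd[0], .events = POLLIN };
        int r = poll(&p, 1, (int)left);
        if (r == -1) {
            if (errno == EINTR) continue;
            status = -1; break;
        }
        if (r == 0) { status = -2; break; }

        char buf[256];
        ssize_t n = read(fd[0], buf, sizeof buf);
        if (n == -1) {
            if (errno == EINTR || errno == EAGAIN) continue;
            status = -1; break;
        }
        if (n == 0) break;           /* EOF: every write end is closed */
        for (ssize_t i = 0; i < n && used + 1 < cap; i++)
            out[used++] = buf[i];
    }

    out[used] = '\0';
    close(fd[0]);
    if (status == -2)
        kill(pid, SIGKILL);          /* give up on the handler */
    waitpid(pid, NULL, 0);
    return status ? status : (int)used;
}
```

Note that a daemon started by the handler can still inherit the handler's stdout and hold that pipe open, which is why the deadline in the parent is needed in addition to closing descriptors in the child.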

Oliver