[Nagios-devel] [PATCH] Re: alternative scheduler

Fredrik Thulin ft at it.su.se
Fri Dec 3 09:20:54 UTC 2010


On Thu, 2010-12-02 at 10:29 +0100, Andreas Ericsson wrote:
...
> > I haven't used Merlin yet (I intend to do some testing), but the model
> > of distributed schedulers each handling smaller numbers of checks
> > works around that problem. But if it works that way then it really
> > just hides the issue.
> 
> Yes and no. No matter what scheduler you're using you'll sooner or later
> run into networks too large for one system to handle. When that happens,
> you'll have to expand sideways, and merlin lets you do just that. It's
> orthogonal to having a scheduler that distributes load somewhat evenly
> on a single system.

Yes, Merlin and similar approaches that divide the total workload over
multiple machines let you scale sideways, but do they scale (well)
within each server? Is there still just one scheduler per server (or
worse, one for the whole system)?

Anyone using Nagios in a large environment can be expected to have at
least 4 cores today - probably 8. I believe what you write later on:

> Yes. The current problem is that Nagios has problems saturating the
> CPUs on powerful hardware. It's quite commonplace to see a system
> with high latency and low load.

is an effect of Nagios having a single scheduler which, as multi-core
machines continue to evolve, will have a smaller and smaller share of
a server's total CPU cycles at its disposal.

Shameless plug: with an Erlang-based scheduler, you get one run queue
per available core for free. https://github.com/fredrikt/nagios-pers

To really make my point: consider how poorly current Nagios would
utilize a server with 100 cores of, say, 1 GHz each. The single scheduler
running on one of the cores wouldn't be able to keep very many of the
other 99 cores busy. Does Merlin solve this?
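
A rough back-of-envelope sketch of that argument, in Python. The model
and every number in it are assumptions made up for illustration, not
measurements of Nagios: one scheduler thread spends a fixed amount of
CPU time to dispatch each check, and each check then occupies one
worker core for a fixed amount of CPU time. The function name and
parameters are hypothetical.

```python
def busy_worker_cores(dispatch_cost_us, check_cost_us, worker_cores):
    """Upper bound on the worker cores a single scheduler can keep busy.

    Assumed model (not taken from Nagios itself): the scheduler needs
    dispatch_cost_us microseconds of CPU per check it starts, and each
    check burns check_cost_us microseconds on one worker core.
    """
    if dispatch_cost_us <= 0 or check_cost_us < 0 or worker_cores < 0:
        raise ValueError("costs and core count must be non-negative, "
                         "and dispatch cost must be positive")
    # In the time one check runs, the scheduler can start this many more,
    # so this is the number of checks in flight at steady state.
    in_flight = check_cost_us / dispatch_cost_us
    # It can never use more cores than actually exist.
    return min(float(worker_cores), in_flight)


if __name__ == "__main__":
    # Hypothetical figures: 1 ms to dispatch, 10 ms of CPU per check,
    # 99 cores left over for running checks.
    print(busy_worker_cores(1000, 10000, 99))
```

With those made-up figures the single scheduler keeps at most 10 of
the 99 other cores busy. In this model, adding cores changes nothing
once the scheduler itself is the limit; only cheaper dispatch or more
schedulers would.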

/Fredrik





