[Nagios-devel] freshness_threshold bug - big problem

Jochen Bern Jochen.Bern at LINworks.de
Fri Dec 17 12:07:41 UTC 2010


On 12/17/2010 12:10 PM, Rodney Ramos wrote:
> Then I understand that you confirm the problem

I confirm that my 3.2.3 autodetermines the host's freshness threshold as
check_interval+additional_freshness_latency, even in SOFT non-OK cases,
when active checks would use retry_interval instead.

I'm not calling it a "problem" yet, though, because the specifics you
quote (apparently from a local copy of the docs?) are absent from the
docs at http://nagios.sourceforge.net/docs/3_0/freshness.html .

Nonetheless, when I compare base/checks.c::is_host_result_fresh() to
base/checks.c::is_service_result_fresh(), it seems that the latter
*does* do the if-then-else you describe, while it's absent from the former:

[...]
/* tests whether or not a service's check results are fresh */
int is_service_result_fresh(service *temp_service, time_t current_time, int log_this){
[...]
   /* use user-supplied freshness threshold or auto-calculate a freshness threshold to use? */
   if(temp_service->freshness_threshold==0){
      if(temp_service->state_type==HARD_STATE || temp_service->current_state==STATE_OK)
         freshness_threshold=(temp_service->check_interval*interval_length)+temp_service->latency+additional_freshness_latency;
      else
         freshness_threshold=(temp_service->retry_interval*interval_length)+temp_service->latency+additional_freshness_latency;
      }
   else
      freshness_threshold=temp_service->freshness_threshold;
[...]
/* checks to see if a hosts's check results are fresh */
int is_host_result_fresh(host *temp_host, time_t current_time, int log_this){
[...]
   /* use user-supplied freshness threshold or auto-calculate a freshness threshold to use? */
   if(temp_host->freshness_threshold==0)
      freshness_threshold=(temp_host->check_interval*interval_length)+temp_host->latency+additional_freshness_latency;
   else
      freshness_threshold=temp_host->freshness_threshold;
[...]

I have no idea whether that's intentional, though ...
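
For illustration only, here is a minimal standalone sketch of what the
host-side calculation might look like if it mirrored the service-side
if-then-else. It is not a patch against base/checks.c: the sketch_host
struct, the function name host_freshness_threshold() and the constant
values are stand-ins I made up for this mail, and the defaults of 60 and
15 for interval_length and additional_freshness_latency are assumed.

```c
/* Stand-in constants for this sketch; the real definitions live in the
   Nagios headers and are only assumed to have these values. */
#define HOST_UP    0
#define HOST_DOWN  1
#define SOFT_STATE 0
#define HARD_STATE 1

/* Hypothetical, cut-down host record holding only the fields that the
   threshold calculation reads. */
typedef struct {
    int    freshness_threshold; /* seconds; 0 means "auto-calculate" */
    double check_interval;      /* in units of interval_length */
    double retry_interval;      /* in units of interval_length */
    double latency;             /* seconds */
    int    state_type;          /* SOFT_STATE or HARD_STATE */
    int    current_state;       /* HOST_UP, HOST_DOWN, ... */
} sketch_host;

/* Assumed defaults for the two global settings used by the formula. */
static int interval_length = 60;
static int additional_freshness_latency = 15;

/* Returns the freshness threshold in seconds, using retry_interval
   while the host is in a SOFT non-UP state, as the service code does. */
int host_freshness_threshold(const sketch_host *h)
{
    /* A user-supplied threshold always wins. */
    if (h->freshness_threshold != 0)
        return h->freshness_threshold;

    /* HARD state or UP: base the threshold on the normal check interval. */
    if (h->state_type == HARD_STATE || h->current_state == HOST_UP)
        return (int)(h->check_interval * interval_length
                     + h->latency + additional_freshness_latency);

    /* SOFT non-UP: base it on the retry interval instead. */
    return (int)(h->retry_interval * interval_length
                 + h->latency + additional_freshness_latency);
}
```

With check_interval 15, retry_interval 2 and an assumed latency of 2
seconds, this returns 917 seconds (15m 17s, the figure in the quoted log)
for a HARD state and 137 seconds (2m 17s) for a SOFT DOWN state.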

> 18:56:13 Warning: The results of host 'Unfresh' are stale by 0d 0h 0m 59s
>   (threshold=0d 0h 15m 17s). I'm forcing an immediate check of the host.
> 18:56:23 HOST ALERT: Unfresh;DOWN;SOFT;2;(null)
> 
> --> It's wrong. It should be about 18:42:05, 2 minutes after the SOFT1, as
> your retry_interval is 2 minutes.
> 
> 19:28:13 Warning: The results of host 'Unfresh' are stale by 0d 0h 0m 39s
>   (threshold=0d 0h 15m 18s). I'm forcing an immediate check of the host.
> 19:28:23 HOST ALERT: Unfresh;DOWN;SOFT;3;CRITICAL: All life functions
> terminated
> 
> --> It's wrong. It should be about 18:58:23, 2 minutes after the SOFT2, as
> your retry_interval is 2 minutes.

(You missed the spurious *second* SOFT2 between these two, which upends
the prediction of "correct" check times even further ...)

P.S. to my previous mail: I also noted that, in spite of the config
saying "initial_state o", the host was listed as PENDING in the CGIs
after the first reload. Is that expected behaviour?

Kind regards,
								J. Bern
-- 
Jochen Bern, Systemingenieur --- LINworks GmbH <http://www.LINworks.de/>
Postfach 100121, 64201 Darmstadt | Robert-Koch-Str. 9, 64331 Weiterstadt
PGP (1024D/4096g) FP = D18B 41B1 16C0 11BA 7F8C DCF7 E1D5 FAF4 444E 1C27
Tel. +49 6151 9067-231, Zentr. -0, Fax -299 - Amtsg. Darmstadt HRB 85202
Unternehmenssitz Weiterstadt, Geschäftsführer Metin Dogan, Oliver Michel



