Detecting an unwanted condition — Part 1

An earlier post, Managing channels, ends with the observation that "perhaps there should be monitoring in place, so that if no readings have come in during some period of time the pico's owner could be notified by text or email." This recently became an issue because of a different failure mode with a similar symptom.

In the earlier case, an error in managing channels meant that no temperatures were recorded for hours on end. A few days ago, just one of the sensors stopped sending temperatures. Several hours passed before this was noticed, and days before someone could go on-site to reset it.

This post will show how to watch for the situation where readings are not received from one or more sensors for over an hour. It is easy to react to an event, but harder to react to something not happening.

We are going to layer on a ruleset that will send an email message to the pico's owner when it notices that expected readings have not arrived.

An early post, Layering rulesets in a pico, defines "terminal events", one of which we will be able to use for this purpose.

Plan for two phases

We will layer on a new ruleset, which we will name io.picolabs.plan.wovyn-monitor. Its job will be to count the number of readings received so far from each sensor (phase one) and then to complain if any sensor has sent none in an hour's time (phase two).

This will be possible because the acceptHeartbeat rule in the io.picolabs.plan.wovyn-sensors ruleset has always raised a terminal event. In context, it looks like this:

  rule acceptHeartbeat {
    select when io_picolabs_plan_wovyn_sensors heartbeat
      ...
    pre {
      device = event:attrs{["property","name"]}
      local_name = ent:mapping{device}
      ...
    }
    if local_name then noop()
    fired {
      ...
      raise io_picolabs_plan_wovyn_sensors event "temp_recorded"
        attributes {"name":local_name,"time":time.makeMT(),"temp":tempF}
    }
  }

Besides the things that the rule needs to do to record an incoming temperature, this rule raises the io_picolabs_plan_wovyn_sensors:temp_recorded event. The ruleset did not need that event; it was merely raised for future purposes*.

Phase One (counting)

Today we're going to add a ruleset to simply count the number of temperature reports, per location, using this rule to react to the heretofore unused event:

  rule count {
    select when io_picolabs_plan_wovyn_sensors:temp_recorded
      name re#(.+)# setting(local_name)
    fired {
      ent:counts{local_name} := ent:counts{local_name}.defaultsTo(0) + 1
    }
  }

Aside. Just seven lines of code, but there is a lot going on. This rule will be considered each time the event is raised, but it will only be selected when the value of the name attribute matches the regular expression; the captured value is then bound within the rule to the name local_name. When selected, the rule will always fire (since no conditional action is specified), and when it fires, it will assign a value in the ent:counts map using the local name as the key. The value will be the previous value (or zero, supplied by the defaultsTo operator, if there is none) plus one. Aside over.
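
To make those steps visible, here is a sketch of roughly the same behavior written out longhand. The rule name countLonghand is made up for this illustration and is not part of the ruleset. It is not an exact equivalent: this version is selected for every temp_recorded event and relies on a conditional action to decide whether to fire, whereas the original is not even selected unless the regular expression matches.

```krl
  rule countLonghand { // hypothetical name, for illustration only
    select when io_picolabs_plan_wovyn_sensors temp_recorded
    pre {
      // take the name straight from the event attributes
      local_name = event:attrs{"name"}
    }
    // the rule is always selected, but fires only when a name was supplied
    if local_name then noop()
    fired {
      // previous count (or zero if there is none yet) plus one
      ent:counts{local_name} := ent:counts{local_name}.defaultsTo(0) + 1
    }
  }
```

A rule like this would be used in place of the count rule, not alongside it; with both installed, every reading would be counted twice.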

We'll use the bazaar app (described in the Tutorial for a new application post) to create the boilerplate code for a new PLAN application, which will give us this code:

ruleset io.picolabs.plan.wovyn-monitor {
  meta {
    name "counts"
    use module io.picolabs.plan.apps alias app
    shares count
  }
  global {
    count = function(_headers){
      app:html_page("manage counts", "",
<<
<h1>Manage counts</h1>
>>, _headers)
    }
  }
}

We'll insert the count rule into the code above (just before the closing curly brace). Then, we'll also insert a rule to initialize the entity variable ent:counts when the application is installed:

  rule initialize {
    select when io_picolabs_plan_wovyn_monitor factory_reset
    fired {
      ent:counts := {}
    }
  }

We'll finish off this phase by showing the counters in the count function (the new code is the #{...} expression that follows the heading line):

    count = function(_headers){
      app:html_page("manage counts", "",
<<
<h1>Manage counts</h1>
#{ent:counts.map(function(v,k){
  <<#{k}: #{v}<br>
>>
}).values().join("")}
>>, _headers)
    }
  }

Now we can install the app, by copying the link from the bottom of the code editor page, pasting it into the box at the bottom of the list of apps in our Manage applications page, and clicking the Add button. A link to the app will then appear in the list of installed apps, and when we run it, after waiting for some readings to come in, it will display something like:


In this case, the screenshot was taken about three hours after the phase one app had been installed. It has gathered all of the local names as keys of the ent:counts map, and counted 21 readings received from each one of them.

The complete ruleset at the end of phase one can be seen here.

Interlude: testing

After leaving this app running for over a day, it displays updated counts:

So, it appears to be working fine. It even provided this useful and unexpected information: the porch sensor must be missing a reading every so often.

We're going to add some code to check and reset the counters every hour. But how will we know that it will actually send an email message when a failure occurs?

One idea is to start the countdown, and then, with just a couple of minutes remaining in the first hour, manually reset all of the counters to zero. By doing this, we should simulate a failure and see the email message.

So, here we are going to add a button to manually reset the counters.

First, a rule to reset the counters:

  rule resetCounts {
    select when io_picolabs_plan_wovyn_monitor manual_reset
    fired {
      ent:counts := ent:counts.map(function(v,k){0})
    }
  }

Aside. The right-hand side of the assignment applies the built-in Map operator map to ent:counts, creating a new map in which the value at each key is replaced with zero. The new (zeroed-out) map is then assigned as the new value of ent:counts. Aside over.
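
The filter operator can be applied to a map in the same functional style, and could help when checking our test: after a manual reset, every sensor should look silent until its next reading arrives. A minimal sketch, assuming a helper named silentSensors (a made-up name, not part of the ruleset as published) added to the global block:

```krl
    // hypothetical helper: the local names whose counter is currently zero
    silentSensors = function(){
      ent:counts.filter(function(v,k){v == 0}).keys()
    }
```

This is only one way the check might be written. It assumes ent:counts has already been initialized to a map.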

Then we'll add a reset button to the screen computed by the count function:

<form action="#{app:event_url(meta:rid,"manual_reset")}">
<button type="submit">reset</button>
</form>

The button works, but the count.html page needs to be redisplayed afterwards, so we'll add this rule to the ruleset:

  rule redirectToHomePage {
    select when io_picolabs_plan_wovyn_monitor manual_reset
    send_directive("_redirect",{"url":app:query_url(meta:rid,"count.html")})
  }

Now that we have the ability to manually test the automation we can proceed to phase two. The ruleset at this point can be found here.

Phase Two

To be described in a subsequent post.

Notes

*One possible future use could have been a rule to detect a freezing temperature in the shed. An example (showing the power of a declarative event expression):

  rule detectShedProblem {
    select when io_picolabs_plan_wovyn_sensors temp_recorded
      name re#Shed#
      where event:attrs{"temp"} < 32
    fired {
      ... // take appropriate action
    }
  }

Other possible uses would be computing various aggregations, such as keeping a running average temperature per day per location, or keeping maximum and minimum values, etc. These could then be displayed on demand.
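
As one illustration of such an aggregation, here is a sketch of a rule that keeps the lowest and highest temperature seen so far for each location. The rule name trackExtremes and the entity variable ent:extremes are assumptions made for this example. It also assumes that the temp attribute is always a number and that ent:extremes has been initialized to an empty map, in the same way as ent:counts.

```krl
  rule trackExtremes { // hypothetical rule, for illustration only
    select when io_picolabs_plan_wovyn_sensors temp_recorded
      name re#(.+)# setting(local_name)
    pre {
      temp = event:attrs{"temp"}
      // previous extremes for this location, or this reading if none yet
      old = ent:extremes{local_name}.defaultsTo({"min":temp,"max":temp})
    }
    fired {
      // keep whichever is lower for "min" and whichever is higher for "max"
      ent:extremes{local_name} := {
        "min": (temp < old{"min"}) => temp | old{"min"},
        "max": (temp > old{"max"}) => temp | old{"max"}
      }
    }
  }
```

Like the count rule, this one needs nothing from the wovyn-sensors ruleset beyond the event it already raises.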
