Picos and computation


Picos are persistent objects living on the Internet. They can also do computation. However, they work best when only a limited amount of computation is required to react to each event or respond to each query.

Most web applications fit this pattern well. They need persistent state, including sessions. They need to be on the Internet and readily available. They need to respond and react very quickly to actions taken by the persons using them.

Service-level objectives

Web applications can be characterized by their requirements for availability, throughput, and frequency of use. Together, the goals for these characteristics of an application are called service-level objectives (SLOs).

Availability

As long as the pico engine hosting a pico is running, and the server it is running on is connected to the Internet, the pico will be available. It might be momentarily busy, but query requests and events will be queued up for it by the pico engine. When the pico will actually evaluate a query or event depends on the other characteristics.

Throughput

Long-running computations are inconsistent with the other requirements of web applications, and are not a good fit for picos. For example, you wouldn't want to ask a pico to compute the digits of π, do weather forecasting, or take on any other compute-intensive work.

Frequency of use

Unless your web application is insanely popular, like those of the tech giants, you will probably be fielding only a handful of requests per minute, or even fewer. Picos work well at such rhythms.

A pair of fallacies

Returning to the frequency of use metric, a few popular web applications are used much more frequently than most others. The distribution seems to follow Zipf's law, illustrated here:


[Figure: a plot of Zipf's curve]

The vertical scale is the number of visitors or frequency of requests, and the horizontal scale is the distinct web applications, ranked from most to least visited.

Very few web applications have a large number of visitors. There is a long tail of web applications with very few visitors.
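
In its usual textbook form (stated here from general knowledge, not read off the plot), Zipf's law says that the frequency f of the item ranked r is roughly inversely proportional to a power of its rank:

```latex
f(r) \propto \frac{1}{r^{s}}, \qquad s \approx 1
```

With s = 1, the web application ranked 10th would see about one tenth of the traffic of the top-ranked one, and the one ranked 1000th about one thousandth.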

Assuming success is measured by high numbers on the vertical scale, the two fallacies are:

  1. The developers of the popular web applications must be incredibly talented and their technical expertise must be credited for the success.
  2. New techniques created by these developers (after their success has been achieved), if adopted by me, would make my web application successful too.

The truth is that web applications can be successful (meet their objectives) without having an enormous frequency of use. A technique used by popular websites to meet their demand won't be necessary for most web applications. In other words, you aren't going to need it (YAGNI).

Advantages of using picos

Picos give several advantages to programmers describing computations. The programmer doesn't have to configure, set up, or manage an external database. The programmer doesn't have to worry about memory leaks or critical sections. The programmer has the full power of a functional programming paradigm.

Database-less

The pico engine provides a database used to store the state of each pico. All the programmer has to do is assign values to entity variables (in the postlude of rules) and those values are persisted.
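
As a rough sketch of what that looks like, the hypothetical ruleset below keeps a counter in an entity variable. The ruleset name, the event (counter:increment), and the name ent:count are all made up for illustration; the point is only the assignment in the postlude.

```krl
ruleset counter_sketch {
  meta {
    shares count              // allow the count function to be queried
  }
  global {
    // query function: read the persisted value, or 0 if nothing is stored yet
    count = function(){
      ent:count.defaultsTo(0)
    }
  }
  rule increment {
    select when counter increment
    fired {
      // postlude: assigning to an entity variable persists the value
      ent:count := ent:count.defaultsTo(0) + 1
    }
  }
}
```

No connection string, schema, or save call appears anywhere; the value is simply there the next time an event or query reaches the pico.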

Run to completion

Picos respond to queries (by running a function), and react to events (by evaluating a schedule of rules). These run one at a time, and then when they finish, the work is complete.

There are a number of advantages to the notion of a task running to completion. One is memory management: once the query or event is completed, all memory used by it is freed en masse, so there is no worry about memory leaks.

There is also no need for critical sections in the code, because the code for each query/event will run in a single thread. Only when the pico has finished with a query or event will it be given the next one to work on.

Functional programming in KRL

Values come into a computation either as literals or as provided by the external environment (that makes the request or generates the event).

Name binding

KRL uses a typical syntax for identifiers: letters of the Roman alphabet, decimal digits, and underscores (not starting with a digit).

A value can be bound to a name, using the equal sign as a binding operator. For example, fav_number = 5 binds the value 5 to the name fav_number and that name can be used from that point on as a synonym for 5. It cannot be re-bound, later, to another value.

Additionally, some statements use a setting clause to bind a value produced earlier in the statement to a name (enclosed in parentheses).

It can't be stressed enough that these are not variables: they are not, and cannot be, assigned a value. Names are bound to values.
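
A minimal sketch of both kinds of binding, assuming a hypothetical echo:hello event that carries an attribute called name (the rule name and the names who and greeting are likewise made up for illustration):

```krl
rule hello {
  // setting(who) binds the text captured by the regular expression to the name who
  select when echo hello name re#(.*)# setting(who)
  pre {
    // = binds a value to a name; greeting cannot be re-bound later in the rule
    greeting = "Hello " + who
  }
  send_directive("say", {"something": greeting})
}
```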

Value types

KRL uses pre-defined strings to identify value types:

  • "Boolean"
  • "Number"
  • "RegExp"
  • "String"
  • "Null"
  • "Array"
  • "Map"
  • "Function"

Given a value, either a literal or a name that is bound to a value, you can get the value's type by applying the typeof operator. See the documentation page "Universal Operators".
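
For instance, assuming the bindings below appear in a global block or in the prelude of a rule (every name on the left is made up for illustration), each t name is bound to the type string shown in its comment:

```krl
fav_number = 5
rulesets   = [{"rid": "a"}, {"rid": "b"}]
identity   = function(x){ x }

t_number   = fav_number.typeof()     // "Number"
t_string   = "five".typeof()         // "String"
t_array    = rulesets.typeof()       // "Array"
t_map      = rulesets[0].typeof()    // "Map"
t_function = identity.typeof()       // "Function"
```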

Array and Map values can be nested in themselves and each other. A very common combination is an Array of Maps.

Operations

Values may be operated upon by functions (either built-in or defined by a programmer) to produce new values. Again, values cannot be mutated.

For example, an Array of Maps can be sorted, based on a value bound to a key of each map in the array. If such an array* contains maps that consistently have this structure (this outline is not valid KRL: "a String" and "a Map" indicate that a string or a map is nested at that point)

[
  {
    "rid": a String,
    "url": a String,
    "config": a Map,
    "meta": {
      "krl": a String,
      "krlMeta": a Map,
      "hash": a String,
      "flushed": a String,
      "compiler": a Map,
    }
  },
...
]

it would be possible to sort that array by applying the sort operator (see documentation page "Array Operators") on the hash string in each map. The result will be a new array containing a copy of the maps in the desired order. The original array value is unmodified.

Functions are first-class values

The function named by, defined here, can be used as shown below to sort such an array of maps, bound to the name my_rulesets:

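    // by takes a key (or a path of keys) and returns a function comparing two maps
    by = function(key){
      // encode() renders each value as a string so that cmp can compare them
      function(a,b){a{key}.encode() cmp b{key}.encode()}
    }
    // sort my_rulesets by the value found at the path meta, hash in each map
    my_rulesets.sort(by(["meta","hash"]))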

The final expression is evaluated by first calling the function by, passing in an array of strings (functioning as a path into the map). That function will return a function that compares two maps in the manner required by the sort operator.

* A careful reader may have noticed that the array of maps shown (in outline form) is JSON. Any JSON value can be easily coerced into a KRL value. If it comes from an outside environment as a string, it is sufficient to apply the decode operator to that string value (see the documentation page "String Operators").
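
A minimal sketch of that coercion, assuming the JSON arrives as a string. The literal below is a made-up stand-in for whatever the outside environment provides, written with KRL's extended quotes so the inner double quotes need no escaping:

```krl
// a JSON string, as it might arrive from outside the pico
json_text = <<[{"rid": "a", "meta": {"hash": "abc"}}]>>

// decode parses the string into a KRL value, here an Array of Maps
my_rulesets = json_text.decode()

// the decoded value can be navigated like any other Array of Maps
first      = my_rulesets[0]
first_hash = first{["meta", "hash"]}   // "abc"
```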
