# Humans Need Not Apply

Under that category of “technology is neither good nor bad; and it is seldom neutral”,  I just watched a very interesting and well-done video about the impact of intelligent machine technology on our jobs.

In part, it compares horses and people.  When the automobile started entering our economy, horses might have said, “This will make horse-life easier and we horses can move to more interesting and easier jobs”.  But that didn’t happen; the horse population peaked in 1915 and has been declining ever since.  I’m sure we all agree that intelligent and cognitive applications will certainly replace some jobs.  The question is: will there be enough new jobs to keep humans fully employed?   Might unemployment raise to 45% as the video suggests?  How many future job descriptions will contain the phrase “Humans Need Not Apply”.

What the video fails to discuss is how massive unemployment might be averted.  I’d like to see even some proposals or suggestions.  Do you have any ideas?

I would also like to think that I—a high-tech, machine-learning, cognitive-app, AI technologist— would be immune to these kinds of changes.  But I’m less certain after watching this video.    You should definitely check it out.

# Characterizing Clouds

In a previous post, I made the distinction between global clouds (most aggregated) and local clouds (least aggregated.  But there can be clouds that aggregate at multiple levels.  What terminology should we use to describe such clouds?

Ownership:  Perhaps the most important attribute of any cloud is its owner.  I believe many/most people loose track of this idea.  Sure, they use the Facebook cloud, but they often forget that Facebook, the company, owns all that data and uses it for THEIR benefit.

Scope:  We can refer to clouds by the entity that aggregates them.  For example, we could talk about personal clouds, which are clouds scoped to a single person.  We could also have home clouds, neighborhood clouds, city clouds, state clouds, national clouds, and finally global clouds.  More about this in a future post.

Relative Size:  I originally considered using relative size terms like micro-cloud as a cloud a million times smaller than the currently largest cloud.  But, as the biggest clouds get bigger, all the terms must change, so relative size is a bad idea.

Absolute Size:  Rather than relative size, we should refer to clouds by their absolute size.  A cloud that aggregates roughly 10 total (atomic + compute) nodes is called a deca-cloud (the “deca” prefix means 10). A cloud that roughly aggregates 1000 nodes is called a kilo-cloud.   The largest clouds on the planet (Google, Facebook, etc) are giga-clouds or peta-clouds.

Network:  The network cost and complexity grows faster (super-linear) than the number of nodes it interconnects.  The network needs of 1000, isolated kilo-clouds is radically less than the network needs of a single giga-cloud, even though they contain the same number of nodes.  Raw network costs (in dollars) captures the idea, but is not as abstract as I’d like.

Bandwidth:  We can also characterize local and global clouds by their bandwidth (bytes sent per second) needs.  A larger cloud will naturally require more total bandwidth than a smaller cloud.  Bandwidth per node is a better measure.  But all hub and spoke networks (single hop between spoke nodes and the hub node) that perform the same computation have the same bandwidth/node; there is no distinction between a deca-cloud and a giga-cloud.

# Local Clouds

“Cloud” is one of the industry’s current buzz words.  And while there are many good reasons to implement and use global clouds,  we shouldn’t thoughtlessly push everything to such clouds. Let me explain why I say this, and some of its implications.  In particular, I am going to argue that local clouds are a better solution for some problems.

A big reason for global clouds is the desire to aggregate all the data into a single cloud (OK, it may just be virtually aggregated into a single, distributed cloud infrastructure—but all the points below still apply).  This use case is the poster child for “big data”.  All the aggregated data can be analyzed by the global cloud, this way and that way,  correlations and patterns can be found, predictive models can be constructed and validated,  etc.

But a big reason against global clouds—the one that pushed me down this line of thinking—is privacy.  There is a saying: “If you don’t pay for a product, you ARE the product”.  How can so many large cloud infrastructures be free to users?  Because cloud providers harvest and sell their users and users’ data.  I don’t like paying for services anymore than the next guy, but I really don’t like be sold.  Me any my data are MINE !

Another huge, industry buzz word is IoT (Internet of Things).  The IoT movement starts at the opposite end of the aggregation spectrum with atomic nodes on each individual device.  Each atomic node has to support a minimal set of sensors and actuators plus enough networking to enable other, more capable, nodes to read and manipulate the atomic node.  Most atomic nodes do not need to also be a “compute node”, which contain a general purpose processor and (relatively) large amounts of storage.  Most atomic nodes only need to communicate (directly or indirectly) with a compute node.

A common view of IoT is that the compute node should be in the global cloud.  This makes some sense, because ALL of the data from ALL of the atomic nodes are available for global analysis in the global cloud.  But it also gives tremendous (unfair?) advantage to the cloud owner.  Typically the cloud owner claims ownership of all this data—and they certainly have legal claim to all derived data they construct from the individual data.  Plus, they can reliably infer a lot of personal information from it:  personal habits, when are you active, what do you like, which topics interest you, etc, etc, etc.  I see this as a huge loss of personal privacy.

But do we really need to push massive oceans of data from all the atomic IoT nodes up to global clouds?  Are global clouds the only viable location for compute nodes?  I say, “No”.

I believe a better way—at least for some applications—is to build local compute nodes, near the atomic nodes.  The local compute node(s) acts like a local cloud.  But none of this data is ever pushed up to a global cloud.  This achieves many of our objectives (atomic nodes remain simple, a single compute node can service multiple atomic nodes) but does not violate privacy.

I’ll have more to say about “local clouds” in upcoming posts.

# 2015 is the Mole-of-Bits Year

Take a look at this graphic from IDC.  It estimates the totals number of bits in the world over time.  There are many things going on in this graphic.  It shows that enterprise data is certainly growing, but does not comprise the majority of data; sensor data and social media data far outstrip it.

It also shows that a huge fraction of all data contains uncertainty. This has dramatic implications for old school programmers.  Programming absolutely must continue to adopt new approaches to handle uncertain input data.  Particularly for emerging cognitive applications. The traditional excuse of classifying ANY input errors as “garbage in” just won’t not cut it any more.

But my favorite part of this graphic are the axes; forget the graphic curves (how often does that happen). The x-axis shows time with 2015 on the far right.  The y-axis shows the number of bits in the world. For the chemists among you, 10 to the 23rd is an essentially Avagadro’s number (6.02 E23), which is the number molecules in a “mole”.  What does this data mean?  Imagine you’re holding a tablespoon filled with water.   You’re holding a mole of water molecules.  The chart above implies that this year, 2015, there will be one bit of data for EVERY molecule of atomic H2O in that tablespoon.  To me, that is nothing short of INCREDIBLE and AWESOME.  When I was growing up, I remember trying to imagine how we would ever have such a gi-nor-mous number of macroscopic things.  Well here we are, and in my lifetime.  I’m moved.

So I hearby officially declare

2015 as the “mole of bits” year

# Will Superintelligent AIs Be Our Doom?

I am quite focused on advancing computer science so it becomes more capable and able to solve more of our problems.  Over the last many decades, procedural programming enabled us to solve many broad classes of problems, but there are still have many problems outside procedural’s grasp.  Artificial Intelligence (AI), aka cognitive computing, is one good way to approach many of the remaining problems.  So I spend a lot of time trying to advance these new technologies.

However, one of my favorite phrases is from Melvin Kranzberg:  “Technology is neither good nor bad; and it is seldom neutral”.  So we (both ME and YOU) must always carefully consider the implications of our technologies.

Along those lines, I just read this excerpt titled Will Superintelligent AIs Be Our Doom?  I don’t believe we should never explore a technology just because it might cause harm.  If that were the case, we should have never developed most of the technologies that make up modern life.  I do believe there is a possibility AI could get away from us.  The take-away for me is: we need to consider both the wildly good and wildly bad possibilities.  That at least helps us understand — as best as possible — what might actually happen.

# Sync

More travel for me and another great book.  This one is Sync: How Order Emerges from Chaos in The Universe, Nature, and Daily Live by Steven Strogatz.

The simplest example,  echoed on the book cover, is how do millions of fireflies synchronize their blinking.  There is no global “blink now” clock.  There is no master firefly orchestrating the performance.  So how do the fireflies end up synchronized?  It’s a simple idea that requires a lot of thought to truly understand.  The problem seems so simple when you first hear it.  I would have probably waved my hands at it, mumbled that the solution probably looked like XYZ, and then moved on. Only when you really try to solve it, do you begin to see the depth and subtly of this particular rat hole.

Strogatz also discusses many other sync problems:  pairs of electrons in superconductors, pacemaker cells in the heart, hints at neurons in the brain, and many others.  I was particularly intrigued by his discussion of the propagation of chemical activation waves in 2 and 3 dimensions.

There are no equations in the book, but Strogatz’s prose is sufficient to give a good taste of the novel mathematics he and others have used to address these problems.  It was a quick read.

As with most books I read, I begin to ponder how I might apply some of its ideas to multiagent systems.  For one, there are multiple ties to social networking (the A is-connected-to B kind; not the Twitter kind).  I also try to re-imagine his waves of chemical activations around a Petri dish transformed into waves of protocol interactions around a social network.

Most of Strogatz’s problems appear to require continuous-space and continuous-time.  Most of my multiagent system problems are simpler, requiring only discrete-space and discrete-time. I’ve developed some “half-vast” (say it out loud) ideas about using a model checker to approach protocol problems with sync-like elements.  I’d introduce some social operators into the model and an expanded CTL-like expression language.  Models would only need to be expanded enough to check the specific properties under consideration.  I’d also need to introduce agent variables that range over a group of agents to express the kinds of properties I have in mind.  The classic CTL temporal operators are strictly time-like, whereas the social operators would be strictly space-like.  Unfortunately, my current ideas could easily cause a massive state space explosion, so I still have work to do.

I so love reading books like Sync:  books that open intellectual doors and expand conceptual horizons.  I expect I’ll be thinking about these ideas for quite some time.

# Using Commitments: When Things Go Wrong

This is the last in my series on commitments, where we look at what happens when debtors don’t satisfy their commitments.

A debtor agent makes a commitment before satisfying the commitment.  In fact, that’s the whole point of using commitments.  It is a way for a debtor to tell others what it will do in the future.  Since debtors are assessed penalties when they fail to satisfy a commitment, they are certainly encouraged to do satisfy. However, bad things can happen between making and satisfying a commitment.

To paraphrase Spock in The Wrath of Khan, “There are two possibilities. They are unable to respond. They are unwilling to respond”First, a debtor may be unable to satisfy all of its commitments.  Toyota had a lot of commitments to deliver automobiles, but when the tsunami hit it, Toyota was no longer able to keep all of those commitments.

Second, event considering the cost of penalties, a debtor may be unwilling to satisfy all of its commitments.  Airlines knowingly overbook flights, creating more commitments than they can possibly keep.  They willing pay the penalties for their broken commitments.  This covers the case where debtors originally intended in good faith to satisfy their commitments but it is no longer profitable to do so, and where debtors made a commitment never intending to satisfy them.  We don’t distinguish between good and bad intentions in our commitment formalism, because it makes little difference and it is often impossible to distinguish anyway.

To model real world situations, we have to allow these messy cases.  We can not require debtors to always satisfy their commitments. The only thing we do require is that debtors must not leave their commitments hanging forever.  If they can’t satisfy a commitment, then then must eventually cancel it.   Commitment cancellation could be triggered by a timeout mechanism.

# Using Commitments: Serial Composition

I Promise

Last time, we discussed commitment transformations,.  This time we look at a different kind of transformation that captures the idea of a “chain reaction”:  where knocking over a single domino causes a whole chain of dominos to topple.

The basic idea is to combine multiple commitments that form a chain of commitments into a single commitment.  A chain is well-defined when

• If the first commitment is satisfied, then the second one becomes unconditional
• If the first 2 commitments satisfied, then the third becomes unconditional
• If the first N-1 commitments satisfied, then the Nth commitment becomes unconditional

Since we assume debtors typically honor their commitments (I’ll have more to say about this in a future post) then unconditional commitments will typically be satisfied and deliver the consequent to the creditors.  So whenever the first antecedent becomes true (“fires”), a “chain reaction” eventually occurs and the whole chain of commitment consequents eventually become true (“fire”).  The chain reaction may not occur instantly, but if all debtors eventually honor their commitments, the full chain reaction will eventually occur. Further, the result is a commitment from the set of all debtors to the set of all the creditors.

The chain is incrementally constructed using a serial composition operator of two commitments. So let’s define it and get those gorpy details out of the way now. This post requires more notation than previous posts. I try to avoid notation, but here its just the simplest approach.  I’m using the cool MathJax-Latex WordPress plugin to format this material. This should render in most browsers. Let me know if it doesn’t render well in your browser.

Serial Composition Operator:
When the condition $$C_A.csq \models C_B.ant$$ holds, then the serial composition of two commitments, $$C_{\oplus} = C_A \oplus C_B$$, is well-defined, and

\begin{align*} C_{\oplus}.debt & := C_A.debt \cup C_B.debt \\ C_{\oplus}.cred & := C_A.cred \cup C_B.cred \\ C_{\oplus}.ant & := C_A.ant \\ C_{\oplus}.csq & := C_A.csq \land C_B.ant \land C_B.csq \end{align*}

The well-defined condition simply means that $$C_B.ant$$ must be true whenever $$C_A.csq$$ is true. So if the first commitment becomes unconditional and is eventually satisfied, then the second comment will also eventually become unconditional.

The transformations in the previous posts never changed the debtors or creditors.  Here, the composite commitment unions the debtors and creditors.  All the debtors are responsible for ensuring the commitment is satisfied. There are different ways to combine responsibilities.  Responsibility can be several (each debtor is responsible for just its portion), joint (each debtor is individually responsible for the entire commitment), or joint and several (the creditors hold one debtor fully responsible, who then pursues other debtors). Serial composition uses several responsibility so that a debtor is never compelled to assume additional responsibilities. The result of serial composition is useful for reasoning about multiple commitments, but all the original commitments must be kept around if we need to determine which specific debtor(s) failed and broke the chain reaction.

Serial composition composes just two commitments, but we can use it repeatedly to combine a chain of multiple commitments. To compose more than two commitments, compose from left to right (the serial composition operator is left associative).

Enough theory, we need an example.  Assume Alice promises to pay Bob some money.  If Alice won’t see Bob today, she can pass the money through Chris, a trusted friend of hers.  (I refer to this as “pay via middle man” or PayViaMM).  There are two commitments.  The first is Alice’s commitment to pay Bob.  The second is Chris’s commitment to pass on any payments he receives (Alice trusts Chris because of this commitment).

C1 = C(Alice, Bob, promise, AlicePays)
C2
= C(Chris, Alice, AlicePays, ChrisPays)

Serial composition is well-defined because $$AlicePays \models AlicePays$$.  Applying the definition above we get the resulting commitment for this chain of two commitments.

$$C1 \oplus C2 := C(\{Alice, Chris\}, \{Alice, Bob\}, promise, AlicePays \land ChrisPays)$$

Just remember this key idea:  we defined an commitment operator (serial composition) that creates a single commitment that describes the effect of a chain of commitments.  The whole chain is well-defined if firing the first antecedent causes a chain reaction.

# Using Commitments: Transformations

I Promise

This is another post about using commitments to describe various real-world situations.  For thousands of years, mathematicians have used rules to mechanically transform algebraic equations into simpler and more useful forms.  Similarly, we can define rules to transform agent commitments into simpler and more useful forms.

Let’s extend the pizza example.  Assume Alice commits to Bob that she will give him a pizza and a beer if he either gives her $10 or he helps her move to a new house. C(Alice, Bob,$10 or helpmove, pizza and beer)

We can break this single commitment down into a set of simpler commitments.  First, Alice will give Bob pizza and beer in either of two situations (an OR operator in the commitment’s antecedent).  This gives us two simpler commitments.

C(Alice, Bob, $10, pizza and beer) C(Alice, Bob, helpmove, pizza and beer) We can decompose these down even further, because Alice will give Bob both a pizza and a beer (an AND operator in the commitment’s consequent). This gives us four commitments C(Alice, Bob,$10, pizza)
C(Alice, Bob, \$10, beer)
C(Alice, Bob, helpmove, pizza)
C(Alice, Bob, helpmove, beer)

Symbolically, we write these commitment transformations like this

C(debt, cred, ant1 OR ant2, csq) = C(debt, cred, ant1, csq),   C(debt, cred, ant2, csq)
C(debt, cred, ant, csq1 AND csq2) = C(debt, cred, ant, csq1),   C(debt, cred, ant, csq2)

In some situations, it is easier to work with these four separate commitments so decomposing a complex commitment makes sense.  In other cases, we might want fewer commitments, and reverse the transformations, merging simpler commitments into more complex commitments.

The key point is once we explicitly describe commitments, we can also define rules to mechanically manipulate them, just like we do with algebraic equations.

# On Intelligence

Last week I went to Singapore for business and endured some dang long flights with a lot of time on my hands.  I tried watching the “Anchorman 2” movie, but it was just too off the wall (unintelligent?) for me, so I gave up on it.  Some of the other movies, like “Jack Ryan: Shadow Recruit”, were better.  But I still had a lot of time, so I turned to reading “On Intelligence” by Jeff Hawkins.

In the early chapters, where he was poking holes at a number of established approaches to intelligence, I was a bit skeptical.  But then, as he settled into his memory-prediction framework, he started to win me over with his different view.

Hawkins talks in depth about the biological processes in the human neocortex, which was interesting.  But the most interesting idea to me was his description of a “memory-prediction framework”.  Basically, this framework includes the obvious case of signals flowing from sensors up to higher cognitive levels, plus the less obvious case of signals flowing down from higher to lower cognitive levels.  Each cognitive level remembers what has previously occurred and predicts what is likely to occur next.  These cognitive levels detect and predict “sequences of sequences” and “structures of structures”.  Predictions allow for missing or garbled sensor data to be filled in automatically. There is also an exception case, where the prediction from above is at odds with the sensor data from below.  Exceptions also flow up the cognitive hierarchy until they are handled at some level.  If they flow high enough, we become aware of them as “something’s not right”.

What I find most intriguing is how this memory-prediction framework might be implemented artificially.  While Hawkins does address this, layered Hidden Markov Models (HMMs) would seem to be a useful direction.  Jim Spohrer tells me that Kurzweil’s book “How to Create a Mind” suggests exactly this, so I’m adding that to my reading list.

I wonder how much training data it would take to train such a model.  I can’t help but think of a baby randomly jerking and flexing its arms and legs; boys endlessly throwing and catching balls; and kids riding their bikes for hours. All these activities would generate a lot of training data.

I also pondered the implications for service science.  Do service systems have a hierarchy of concepts similar to lower and higher cognitive functions?  What kind of “memories” and “predictions” do service systems have?  Service systems always have documents and policies, but that is not the kind of “active memory” Hawkins thinks is important.  Service employees clearly have internal memories, but are there active memories between small groups of employees?  Do departments or entire organizations have memories?  What are the important “invariant representations” of different service systems?  Should we focus on the differences between Person arrives at front desk vs. Guest checks-in vs. Guest is on a week long vacation vs. Guest is satisfied with service? What are the common sequences (or even sequences of sequences) in an evolving customer encounter?  If we knew them, could we can predict next events.  “Be prepared” seems like a more modest and achievable goal for a service system than the kind of moment-by-moment predication Hawkins envisions.

If you’re particularly interested in bio-inspired intelligence, there is a lot of meat in this book to keep you busy and fascinated.  If you’re more interested in the artificial mechanisms for intelligence, like I am, focus on the memory-prediction framework.  Either way, I recommend this book.