Tag Archives: AI

How should we understand the word “Understand”?

What does the word “understand” mean?  From the outside, is it possible to know whether someone, or some AI program, “understands” you?  What does that even mean?

I assert that if you “understand” something, then you should be able to answer questions and perform tasks based on your understanding. If there are multiple tasks, then there are multiple meanings of “understand”.  Consider this classic nursery rhyme:

Jack and Jill went up the hill
To fetch a pail of water
Jack fell down and broke his crown
And Jill came tumbling after

There are many different tasks an AI program can perform, leading to multiple different meanings of “understand”.  Different programs can perform different tasks: 

  1. Return Counts: 4 lines, 25 words
    A simple procedural program can possess a very rudimentary understanding of the text.
  2. Return Parts of speech: nouns: Jack, Jill, hill, … verbs: went, fell, broke
    Simple NLP processing can understand each word’s part of speech.
  3. Return Translation: Jack und Jill gingen den Hügel hinauf …
    Translation between, say, English and German, requires more understanding of the text to ensure noun and verb agreement, how to properly reorder the words, etc. 
  4. Return Summary: Story about boy and girl’s disaster while doing a daily task
    Summarization is a much harder task. 
  5. Use Common sense: it’s odd they went uphill to get water
    It’s “just common sense” that you go down the hill to get water, not up the hill.  This is a very hard problem. 
  6. Create Interpretation: Attempt by King Charles I to reform the taxes on liquid measures. He … ordered that the volume of a Jack (1/8 pint) be reduced, but the tax remained the same. … “Jill” (actually a “gill”, or 1/4 pint) is said to reflect that the gill dropped in volume as a consequence. [Wikipedia]
    I love this explanation of the nursery rhyme from Wikipedia: it was political condemnation, encoded as a poem, of King Charles’ attempt to raise tax revenue without changing the tax rate.  A program that could return explanations like this would have an extremely deep understanding of the poem and its social and political context.
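
To show how little “understanding” the first task on this list requires, here is a minimal sketch of it. I am assuming that a “word” is any whitespace-separated token; the function name `return_counts` is just something I made up for this example.

```python
RHYME = """Jack and Jill went up the hill
To fetch a pail of water
Jack fell down and broke his crown
And Jill came tumbling after"""


def return_counts(text):
    """Count non-empty lines and whitespace-separated words in text."""
    lines = [line for line in text.splitlines() if line.strip()]
    words = text.split()
    return len(lines), len(words)


line_count, word_count = return_counts(RHYME)
print(f"{line_count} lines, {word_count} words")  # prints "4 lines, 25 words"
```

Nothing in this program knows what a hill or a pail is, yet it performs task 1 perfectly.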

Of course, we could add many other definitions/tasks to this list, each leading to a new definition for “understand”. As the list grows, some pairs of definitions can’t be ordered according to difficulty, so that list would not be totally ordered.

This highlights a major source of confusion.  A company whose software implements a simple task (high on this list) can correctly claim their software “understands”. But the lay public most often interprets “understand” to be a very complex task (low on the list). When this happens the company has “overhyped” or “oversold” their software.

The fundamental problem is that some words, like “understand”, are just too vague.  Eskimos are often said to have over 50 words for different kinds of “snow”, each describing a particular shade of meaning. I assert we need more granular words for “understands”, and for other similarly vague words, to represent the different shadings.

A good example of what I mean comes from the US National Highway Traffic Safety Administration (NHTSA). They define multiple capability levels for autonomous vehicles:

  • Level 0: The driver (human) controls it all: steering, brakes, throttle, power.
  • Level 1: Most functions are still controlled by the driver, but a specific function (like steering or accelerating) can be done automatically by the car.
  • Level 2: at least one driver assistance system of “both steering and acceleration/deceleration using information about the driving environment” is automated, like cruise control and lane-centering. … The driver must still always be ready to take control of the vehicle.
  • Level 3: Drivers are able to completely shift “safety-critical functions” to the vehicle, under certain traffic or environmental conditions. The driver is still present and will intervene if necessary, but is not required to monitor the situation in the same way as for the previous levels.
  • Level 4: “fully autonomous” vehicles are “designed to perform all safety-critical driving functions and monitor roadway conditions for an entire trip.” However, … it does not cover every driving scenario.
  • Level 5: fully-autonomous system that expects the vehicle’s performance to equal that of a human driver, in every driving scenario—including extreme environments like dirt roads that are unlikely to be navigated by driverless vehicles in the near future.

If a company claims they offer a “Level 3” car, the public will correctly know what to expect.

So the next time someone says “I understand”, give them a few tasks to see how deeply they really do “understand”.

What do you think?  Did you “understand” this post?  🙂

AI Era Requires a Probabilistic Mindset

You may have heard that AI (or cognitive) era applications are “probabilistic”. What does that really mean?

Let me illustrate with two hypothetical applications.

First Application

You hire me to write a billing application for your consulting firm.  You give me the list of all your employees and their hourly billing rates.  You also give me last year’s data, including the number of hours each consultant worked for each client account and the bills you generated.

I do my development and come back to you later and say “I’ve finished the application and testing shows it computes the correct amount for 90% of the bills”.  What do you say?  You say,

  • You’re not done yet
  • You’re not getting paid yet
  • Get back to work

This is a classic “procedural era” application.  We all expect — and demand — 100% correct answers for procedural era applications.

Second Application

Now let’s change the requirements a little bit. Sometimes you give discounts and sometimes you give freebies.  You do this because it makes clients happy — and happy clients are important to your business.  Sometimes you do this for your largest clients, because they bring you so much business.  Sometimes you do this for clients that are considering big orders, because you want to show how much you value them.  Sometimes you do this for “friends & family” clients, because increasing their happiness increases your own happiness.

Again, you give me the bills for the last year.  And again, I do my development and come back to you and say “I’ve finished the application and testing shows it computes the correct amount for 90% of the bills”.  What do you say this time? You say

  • This is fabulous
  • Here’s a bonus for such a high accuracy rate
  • I’ve got this other program I’d like you to write for me.

Why such a different response between these two similar applications?

This is a classic “cognitive era” application. There’s no obvious formula for how to make clients happy.  An expectation of 100% correct answers is completely unrealistic.  For some medical diagnoses, even expert physicians only agree with each other about 85% of the time, so how can we ever even think a computer program can be correct 100% of the time?

The example illustrates that we must change our mindset as we move into the cognitive era. While a few applications achieve NEAR (but not exactly) 100% accuracy (e.g. handwritten digit classification), many successful cognitive era applications achieve well below 90% accuracy. For some of them, 70% to 80% accuracy is the best we’ve been able to achieve.
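
For concreteness, here is a minimal sketch of what “computes the correct amount for 90% of the bills” means as a measurement. The bill amounts are invented for illustration, and the function name `accuracy` is my own; a real evaluation would use last year’s actual bills.

```python
def accuracy(predicted, actual):
    """Fraction of predictions that exactly match the actual values."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("need at least one bill to score")
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return correct / len(actual)


# Hypothetical bill amounts in whole dollars, invented for illustration.
actual_bills = [1200, 800, 450, 3000, 975, 600, 1500, 2200, 310, 880]
predicted_bills = [1200, 800, 450, 3000, 975, 600, 1500, 2200, 310, 900]

score = accuracy(predicted_bills, actual_bills)
print(f"{score:.0%} of bills correct")  # prints "90% of bills correct"
```

The same 90% score is a failure for the first application and a success for the second; the measurement is identical, only the expectation changes.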

Perfection is simply not a realistic goal in the cognitive era.

That’s why we say cognitive applications are PROBABILISTIC.


Scott N. Gerard

What is Cognitive Computing?

Cognitive computing is all the rage these days.  But what is it, really?  I’ve been thinking about it quite a bit lately, and I believe I have come to a few novel conclusions.

Wikipedia has a nice long article about Cognition.  It expansively covers a great many things that I would agree are “cognitive”, but not (yet) “cognitive computing”.  I’m interested in writing cognitive software, not in constructing a full, artificially intelligent, “faux human”.  So, I’ll focus only on cognitive computing.

Rob High, CTO of IBM’s Watson Group, defines cognitive computing as four “E”s.

  • Cognitive systems are able to learn their behavior through education
  • That support forms of expression that are more natural for human interaction
  • Whose primary value is their expertise; and
  • That continue to evolve as they experience new information, new scenarios, and new responses
  • and does so at enormous scale.

Adaptive

I agree with education and evolve, although I see these two as similar concepts.  To make these ideas fit my somewhat artificial classification system below, I rename this idea to adaptive.

Ambiguity

However, I disagree with limiting the definition to human expression.  There are many processes that I believe require cognitive skills that are not naturally interpreted by humans.  Dolphin and bat echo-location are good examples; they are a kind of “seeing” but humans can’t do it.   Any application that can monitor the network communication into and out of an organization and correctly identify data leakage gets my vote for “cognitive”, even though humans can’t do it.

Ambiguity is a better criterion than human expression.

Many human expressions are difficult to interpret because they are ambiguous.  I offer the following two examples.

  • Natural language is very ambiguous.  The classic sentence “Time flies like an arrow; fruit flies like a banana” has many different possible interpretations.  Sentences can have ambiguous parses (is “time” a noun, or is it an adjective modifying “flies”; is “flies” a verb or a noun; etc).  Words can be ambiguous; resolving them is commonly called word sense disambiguation (WSD).  Is “bass” a fish or a kind of musical instrument?
  • Human emotions are ambiguous.  They require interpreting facial expressions, body language, sarcasm, etc. And people often disagree on the proper interpretation of a person’s emotion:  “Is Bob angry at me?”  “No, You know how he is.  He was just making a joke.”

To be more precise, an input is ambiguous when there are multiple output interpretations consistent with that input.  The goal is to determine which output interpretation(s) are, in some sense, most appropriate. Many elements (surrounding context, background knowledge, common sense, etc.) help decide which interpretations are most appropriate.

Pushing this idea farther, we should change from discussions of structured data vs. unstructured data and start discussing unambiguous data vs. ambiguous data.

From:  structured data vs. unstructured data

To:     unambiguous data vs. ambiguous data

There are many cases where structured vs. unstructured misses the point.  A row of structured data is easy to process not because it is physically separated into separate fields.  It is easy to process because there is only one way to interpret that row.  Structured data can even be ambiguous, in which case we need to “clean the data” (remove ambiguity).  Java code has exactly the same structure as natural language, but compilers are not “cognitive” because the Java programming language is unambiguous.

The fundamental problem is to accept an ambiguous input plus its available context, and search through the space of all possible interpretations for the most appropriate output(s).  That is, a cognitive process is a search process.
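
Here is a toy sketch of that search, using the “bass” example from above. The sense inventory and its cue words are invented for illustration, and counting overlapping cue words is only one crude way to score interpretations; I am not claiming this is how a real word sense disambiguation system works.

```python
# Hypothetical sense inventory: each interpretation of "bass" is paired with
# cue words that tend to appear near it. Invented for illustration.
SENSES = {
    "fish": {"lake", "river", "caught", "fishing", "water"},
    "instrument": {"guitar", "band", "played", "music", "amplifier"},
}


def rank_interpretations(sentence, senses):
    """Score every interpretation by how many of its cue words occur in the
    sentence, and return (sense, score) pairs, best first."""
    context = set(sentence.lower().replace(".", " ").split())
    scored = [(sense, len(cues & context)) for sense, cues in senses.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


ranking = rank_interpretations("He played bass in a band", SENSES)
print(ranking)  # prints [('instrument', 2), ('fish', 0)]
```

Note that the function returns every interpretation with a score rather than a single answer. With no useful context (“I like bass”), both senses tie at zero and the ambiguity simply remains.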

In the Programmable Era, programmers were able to resolve low levels of ambiguity by the seat of their pants, either because there were few possible interpretations or because interpretation resolution could be “factored” into a sequence of more or less independent resolution steps.  But as the amount of ambiguity increases, programmers are unable to satisfactorily resolve ambiguity by the seat of their pants.  In the Cognitive Era, programmers need Ambiguity Resolution Frameworks (ARFs) to help them process large amounts of ambiguity.  Machine learning is one kind of ARF: it takes as input multiple features (each of which can be understood by the programmer) and combines them to resolve the input down to a few interpretations (note that I’m not requiring ARFs to perfectly resolve all ambiguity to a single interpretation).  The Cognitive Era is largely populated by cases where imperfect resolution of large interpretation spaces is an unavoidable consequence of the input’s irreducible ambiguity.

Action

I also disagree that expertise is a defining criterion for cognitive computing.  A better, more inclusive, criterion is action.

Only humans can accept and interpret expertise.  Requiring a cognitive system to output expertise necessarily forces a human “into the loop”.  While appropriate in some cases, it is wrong to require a human in the loop of every cognitive system.  Rather, we should encourage the development of autonomous systems that are able to act on their own.  The distinction between expertise and action is not completely black and white:  Watson’s Jeopardy! system did both by ringing a buzzer (action) and providing a response (expertise).

Many years ago, IBM defined the “autonomic MAPE loop” consisting of four steps: M: monitor or sense, A: analyze, P: plan, and E: execute or effectors or act.  Not all cognitive systems must contain a MAPE loop, but I see it as more inclusive than the 4 E’s above.  Expertise is best characterized as the output of the Analyze step, requiring a human to perform the Plan step.  The Observe-Interpret-Evaluate-Decide loop is similar to the MAPE loop, with Observe=M, Interpret & Evaluate=A, Decide=P & E.  But they both end with an action.
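
To make the four steps concrete, here is a minimal sketch of one pass through a MAPE loop. The thermostat scenario, the function names, and the numbers are all my own invention for illustration; this is not IBM’s autonomic computing API, just the shape of the loop.

```python
def monitor(sensor_readings):
    """M: collect the latest observation (here, the last reading)."""
    return sensor_readings[-1]


def analyze(observation, target):
    """A: interpret the observation as an error relative to the target."""
    return observation - target


def plan(error, tolerance):
    """P: choose an action based on the analysis."""
    if error > tolerance:
        return "cool"
    if error < -tolerance:
        return "heat"
    return "idle"


def execute(action, log):
    """E: act on the plan (here, just record the action)."""
    log.append(action)
    return action


def mape_step(sensor_readings, target, tolerance, log):
    """Run one pass through Monitor, Analyze, Plan, Execute."""
    observation = monitor(sensor_readings)
    error = analyze(observation, target)
    action = plan(error, tolerance)
    return execute(action, log)


actions = []
for readings in ([20.0, 24.5], [24.5, 21.0], [21.0, 17.0]):
    mape_step(readings, target=21.0, tolerance=1.0, log=actions)
print(actions)  # prints ['cool', 'idle', 'heat']
```

A system that only outputs expertise would stop after `analyze` and hand the error to a human; the loop above keeps going and ends with an action.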

So instead of the 4 E’s, I suggest we define cognitive computing by the 3 A’s:  Adaptive, Ambiguous, and Action.

We Must Not Lose Control of Artificial Intelligence

There have been lots of science fiction stories where a scientist creates a technology with the best of intentions, but then something unforeseen happens, and the technology gets away from him.  The book Frankenstein was probably the first.  The movie Transcendence is a recent example where an AI project goes horribly wrong.  There are many other examples.

I really love AI because it truly can change our world for the better.  Such techniques will allow us to do all kinds of things that are unimagined today.  But there is also a real possibility that such powerful technologies can be used against us by evil people, and, yes, even the possibility that they turn into evil autonomous agents.  It is up to us to be careful and prudent about such possibilities.

The Future of Life Institute published an open letter urging additional research into ensuring that we don’t lose control of AI’s tremendous capabilities.  The letter is short, but contains, in part:

We recommend expanded research aimed at ensuring that increasingly capable AI systems are robust and beneficial: our AI systems must do what we want them to do. 

I encourage you to read this brief letter.  And if this concerns you as it concerns me, please join me in signing the open letter.

Humans Need Not Apply

Under the category of “technology is neither good nor bad; nor is it neutral”, I just watched a very interesting and well-done video about the impact of intelligent machine technology on our jobs.

In part, it compares horses and people.  When the automobile started entering our economy, horses might have said, “This will make horse-life easier and we horses can move to more interesting and easier jobs”.  But that didn’t happen; the horse population peaked in 1915 and has been declining ever since.  I’m sure we all agree that intelligent and cognitive applications will certainly replace some jobs.  The question is: will there be enough new jobs to keep humans fully employed?   Might unemployment rise to 45% as the video suggests?  How many future job descriptions will contain the phrase “Humans Need Not Apply”?

What the video fails to discuss is how massive unemployment might be averted.  I’d like to see even some proposals or suggestions.  Do you have any ideas?

I would also like to think that I (a high-tech, machine-learning, cognitive-app, AI technologist) would be immune to these kinds of changes.  But I’m less certain after watching this video.  You should definitely check it out.

Will Superintelligent AIs Be Our Doom?

I am quite focused on advancing computer science so it becomes more capable and able to solve more of our problems.  Over the last many decades, procedural programming enabled us to solve many broad classes of problems, but there are still many problems outside procedural’s grasp.  Artificial Intelligence (AI), aka cognitive computing, is one good way to approach many of the remaining problems.  So I spend a lot of time trying to advance these new technologies.

However, one of my favorite phrases is from Melvin Kranzberg:  “Technology is neither good nor bad; nor is it neutral”.  So we (both ME and YOU) must always carefully consider the implications of our technologies.

Along those lines, I just read this excerpt titled Will Superintelligent AIs Be Our Doom?  I don’t believe we should refuse to explore a technology just because it might cause harm.  If that were the case, we should have never developed most of the technologies that make up modern life.  I do believe there is a possibility AI could get away from us.  The take-away for me is: we need to consider both the wildly good and wildly bad possibilities.  That at least helps us understand, as best we can, what might actually happen.

On Intelligence

Last week I went to Singapore for business and endured some dang long flights with a lot of time on my hands.  I tried watching the “Anchorman 2” movie, but it was just too off the wall (unintelligent?) for me, so I gave up on it.  Some of the other movies, like “Jack Ryan: Shadow Recruit”, were better.  But I still had a lot of time, so I turned to reading “On Intelligence” by Jeff Hawkins.

In the early chapters, where he was poking holes at a number of established approaches to intelligence, I was a bit skeptical.  But then, as he settled into his memory-prediction framework, he started to win me over with his different view.

Hawkins talks in depth about the biological processes in the human neocortex, which was interesting.  But the most interesting idea to me was his description of a “memory-prediction framework”.  Basically, this framework includes the obvious case of signals flowing from sensors up to higher cognitive levels, plus the less obvious case of signals flowing down from higher to lower cognitive levels.  Each cognitive level remembers what has previously occurred and predicts what is likely to occur next.  These cognitive levels detect and predict “sequences of sequences” and “structures of structures”.  Predictions allow for missing or garbled sensor data to be filled in automatically. There is also an exception case, where the prediction from above is at odds with the sensor data from below.  Exceptions also flow up the cognitive hierarchy until they are handled at some level.  If they flow high enough, we become aware of them as “something’s not right”.
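
To get a feel for the remember/predict/exception cycle, here is a toy sketch of a single level. To be clear about what is assumed: this is my own drastic simplification (it remembers only which symbol followed which, with no hierarchy and no downward flow of predictions), and it is not Hawkins’ model. The class name `PredictiveLevel` is invented for this example.

```python
from collections import Counter, defaultdict


class PredictiveLevel:
    """One level that remembers which symbol followed which, predicts the
    next symbol, and reports an exception when the prediction fails."""

    def __init__(self):
        # Memory: previous symbol -> counts of the symbols that followed it.
        self.memory = defaultdict(Counter)
        self.previous = None

    def predict(self):
        """Return the most frequently seen successor of the previous symbol,
        or None if nothing has been remembered for it yet."""
        successors = self.memory.get(self.previous)
        if not successors:
            return None
        return successors.most_common(1)[0][0]

    def observe(self, symbol):
        """Compare the input with the prediction, then update memory.
        Returns True when the input contradicts a prediction (an exception)."""
        predicted = self.predict()
        exception = predicted is not None and predicted != symbol
        if self.previous is not None:
            self.memory[self.previous][symbol] += 1
        self.previous = symbol
        return exception


level = PredictiveLevel()
for symbol in "abcabcab":
    level.observe(symbol)

print(level.predict())     # prints c
print(level.observe("x"))  # prints True
```

In a fuller version, the `True` returned by `observe` is what would be passed up to the next level, and `predict` is what would let this level fill in a missing or garbled input.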

What I find most intriguing is how this memory-prediction framework might be implemented artificially.  While Hawkins does address this, layered Hidden Markov Models (HMMs) would seem to be a useful direction.  Jim Spohrer tells me that Kurzweil’s book “How to Create a Mind” suggests exactly this, so I’m adding that to my reading list.

I wonder how much training data it would take to train such a model.  I can’t help but think of a baby randomly jerking and flexing its arms and legs; boys endlessly throwing and catching balls; and kids riding their bikes for hours. All these activities would generate a lot of training data.

I also pondered the implications for service science.  Do service systems have a hierarchy of concepts similar to lower and higher cognitive functions?  What kind of “memories” and “predictions” do service systems have?  Service systems always have documents and policies, but that is not the kind of “active memory” Hawkins thinks is important.  Service employees clearly have internal memories, but are there active memories between small groups of employees?  Do departments or entire organizations have memories?  What are the important “invariant representations” of different service systems?  Should we focus on the differences between Person arrives at front desk vs. Guest checks-in vs. Guest is on a week-long vacation vs. Guest is satisfied with service? What are the common sequences (or even sequences of sequences) in an evolving customer encounter?  If we knew them, could we predict the next events?  “Be prepared” seems like a more modest and achievable goal for a service system than the kind of moment-by-moment prediction Hawkins envisions.

If you’re particularly interested in bio-inspired intelligence, there is a lot of meat in this book to keep you busy and fascinated.  If you’re more interested in the artificial mechanisms for intelligence, like I am, focus on the memory-prediction framework.  Either way, I recommend this book.