Moving Between Representation Levels – the Key to Making an AI System Work (Part 1)

Representation Levels: The Key to Understanding AI

 

“No computation without representation”

Jerry Fodor (1975). The Language of Thought, p.34. online access.

 

One of the key notions underlying artificial intelligence (AI) systems is not only that of knowledge representation, but that a good AI system will successively move disparate pieces of low-level, or signal-level information up the abstraction ladder.

For example, an image understanding system will have a low-level component that extracts edges and regions from the image (or from a multi-sensor set of images), and then progress upward in abstraction until, finally, the various elements of the image are understood as semantic entities.

There is a huge gap between processing low-level elements in a visual system (edges and regions), and actually interpreting the content of that image. This has been the challenge in creating vision systems for nearly forty years.

One of the big breakthroughs in understanding how to create a vision system (which is one of the classic tough-AI problems) came about in the early 1980’s, with the realization that we couldn’t just leap, like Superman, from the street-level to the top of a building. That is, we were not going to interpret a collection of edges and regions as something that made sense, all at once.

Similarly, we face an analogous challenge in language systems, whether we’re interpreting spoken language or text. They both require that transition from a collection of low-level things (e.g., extracted terms, or a “bag of words”) up to something that is much more abstract and symbolic (e.g., discourse representation, or article summary, or any number of cognitive tasks).

 

A Brief Word on the Low-Level and High-Level Representations

 

One of the tasks that I give my students is that of taking a piece of raw data, and interpreting it at multiple representation levels. So, if the student takes a sentence from an online article, then the student needs to figure out:

  1. Signal or statistical level: what are the specific things that are extracted? These will be the original set of strings, further reduced into “extracted terms” (nouns or noun phrases, and also other kinds of terms – verbs, adjectives, etc.),
  2. Syntactic, structural, or semiotic level: what are the relationships between the things in the original data source? This will include such things as parts-of-speech and overall grammatical structure (the syntax of a sentence) in natural language processing (NLP), and perceptual groupings and other relationships between boundary edges and regions in an image, and
  3. Symbolic level: what are the notional things that are represented? For example, if we have a picture of a cat, the “notional thing” is that there is a cat depicted in the image. If we have a piece of text that has the phrase “President Donald Trump,” then the notional thing is the person of Donald Trump, who has the role of being President, etc.
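The three levels can be made concrete with a minimal Python sketch. (The stop-word list, the tiny part-of-speech lookup, and the entity table below are all illustrative toy assumptions, not a real NLP pipeline.)

```python
# Illustrative sketch of the three representation levels for one sentence.
# The stop-word list, POS lookup, and entity table are toy assumptions.

sentence = "President Donald Trump signed the executive order"

# 1. Signal/statistical level: extract terms, drop stop words.
STOP_WORDS = {"a", "an", "the", "but", "and", "of"}
tokens = sentence.lower().split()
terms = [t for t in tokens if t not in STOP_WORDS]

# 2. Syntactic/structural level: relationships between the terms,
#    here approximated by a hand-made part-of-speech lookup.
POS = {"president": "NOUN", "donald": "PROPN", "trump": "PROPN",
       "signed": "VERB", "executive": "ADJ", "order": "NOUN"}
tagged = [(t, POS.get(t, "X")) for t in terms]

# 3. Symbolic level: the notional things the terms refer to.
ENTITIES = {("donald", "trump"): {"entity": "Donald Trump",
                                  "role": "President"}}
symbols = [ENTITIES[k] for k in ENTITIES
           if all(w in terms for w in k)]

print(terms)    # signal level
print(tagged)   # structural level
print(symbols)  # symbolic level
```

In a real system, the hand-made lookups would be replaced by a tokenizer, a trained POS tagger, and an entity linker; the point here is only the shape of each level's output.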

To recap: at the signal or statistical level, we have a set of representations, each of which is still very data-specific. These are the original data source and elements extracted from it, which may have some levels of processing associated with them. For example, in natural language processing (NLP), we extract “meaningful” terms, and disregard “stop words” such as “a,” “the,” “but,” etc. We may also disregard other terms that are too general to be of much use. Similarly, in image processing, we may extract a series of edges and regions, having disregarded certain small edges and extracted regions as being simply noise.

Thus, at the end of our signal / statistical processing phase, we have a set of things – often a vector list of things – and we have their attributes. For example, in NLP, we have a word frequency count of certain terms. In image processing, we know the location, size/shape, and extent of edges and regions.
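For the NLP case, the end product of the signal/statistical phase can be sketched in a few lines (the stop-word list is again an illustrative assumption):

```python
from collections import Counter

# Minimal sketch: the output of the signal/statistical phase is a set of
# extracted terms plus their attributes (here, frequency counts).
STOP_WORDS = {"a", "an", "the", "but", "and", "of", "on", "is"}

def term_frequencies(text: str) -> Counter:
    """Lowercase, tokenize, drop stop words, and count what remains."""
    tokens = text.lower().replace(",", " ").replace(".", " ").split()
    return Counter(t for t in tokens if t not in STOP_WORDS)

freqs = term_frequencies("The cat sat on the mat, and the cat purred.")
print(freqs.most_common(3))
```

The analogous output in image processing would be a list of edges and regions, each with location, size/shape, and extent attributes.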

This collection of lightly-processed low-level elements is not enough to give us a meaningful understanding of content.

In fact, getting to a “meaningful understanding” is still a long way off.

Before we address the transition from signal-to-symbolic processing, let’s identify what it is that we store at the semantic/symbolic level.

 

Semantic / Symbolic Representations

 

The notion of semantic or symbolic representation is not new. In fact, AI began with the realization that this kind of knowledge representation was centrally important. The first few decades of AI work were largely focused on explicit representation of either specific things (declarative representation) or specific procedures for figuring things out (procedural representation). One of the earliest important AI works was Allen Newell’s paper, The Knowledge Level, presented in 1980 and published in 1981.

At the semantic level, we have our world-view. We know about things, whether specific objects (e.g. Donald Trump, the person, in the role of being President of the United States), or even more abstract things (e.g., that there is such a thing as “presidency,” and it is part of the Executive Branch of the government of the United States). We not only know about things and their properties (roles and/or attributes), but also their relationship with each other.
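One common way to store this kind of world-view is as subject-predicate-object triples, as in a knowledge graph. Here is a minimal sketch (the specific facts and predicate names are illustrative assumptions):

```python
# A sketch of a semantic-level store as subject-predicate-object triples.
triples = {
    ("Donald Trump", "has_role", "President of the United States"),
    ("presidency", "part_of", "Executive Branch"),
    ("Executive Branch", "part_of", "U.S. Government"),
}

def query(subject=None, predicate=None, obj=None):
    """Return triples matching the given fields (None = wildcard)."""
    return [(s, p, o) for (s, p, o) in triples
            if (subject is None or s == subject)
            and (predicate is None or p == predicate)
            and (obj is None or o == obj)]

# What role does this specific thing have?
print(query(subject="Donald Trump", predicate="has_role"))
```

Real knowledge graphs (such as Google’s, shown below) are vastly larger and richer, but the core idea is the same: things, their properties, and their relationships with each other.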

For example, if we do a Google search on the well-known AI researcher Y. Bengio, we get an aspect of Google’s Knowledge Graph about Dr. Bengio, as shown in the following figure.

Google’s Knowledge Graph for AI researcher Y. Bengio. The Knowledge Graph comes to us in two parts: (1) a short (structured) summary of key properties (roles and attributes) (lower right of the graph), and (2) a visual depiction (via headshots and names) of Dr. Bengio’s closest research associates (upper part of graph). (Note: the listing of associates is an encapsulation of a graph.)

AI then went through a distinct phase. Expert systems, which were based on symbolic AI, were the earliest efforts to build useful AI applications, back in the early 1980s. Expert systems had well-known limitations – brittleness, difficulty in upgrading and maintenance, and so on.

Thus, when connectionist systems (neural networks) re-emerged in 1986 with useful learning rules – notably backpropagation – meaning that they really COULD solve problems, they were hailed as the great new approach to AI.

With some ups and downs, this has led us to our current fascination with deep learning.

Some high-level (symbolic) representations are implicitly embedded in neural network systems.

However, these systems still need improvement in moving from connectionist/statistical representations to symbolic knowledge representation.

 

Trying to understand perception by studying only neurons is like trying to understand bird flight by studying only feathers: It just cannot be done.

David Marr (1982). Vision (San Francisco: W.H. Freeman), p. 27

 

The two papers listed below are very good, very important reads.

Also, it’s really important to note: symbolic AI never really died. Over the past three decades, an immense amount of work went on behind the scenes, leading to systems that now have worldwide prominence, such as IBM’s Watson. Also, systems such as Alexa have very strong knowledge (symbolic) components.

We’ll pick up on this again.

In particular, we’ll focus on the question of: how do we go from signal-level data to symbolic AI?

If you’ll recall, the previous post addressed how the last election was hacked by humans aggressively using machine learning algorithms, combined with large-scale data aggregation and several other methods.

This coming election will have an even stronger element of AI in the various campaigns.

In order to understand an AI – in order to either build a good one, or to game against an existing one – we need a framework for understanding them. That’s why we’re starting with the basics: representations and transitioning between representation levels.

 
 

Live free or die, my friend –

AJ Maren

Live free or die: Death is not the worst of evils.
Attr. to Gen. John Stark, American Revolutionary War

 
 

Most Crucial To-Reads (Journal and arXiv)

 

  • Bengio, Y., Courville, A., and Vincent, P. (2012). Representation Learning: A Review and New Perspectives. arXiv:1206.5538 [cs.LG]. online access, accessed April 1, 2018 by AJM.

 

Abstract: The success of machine learning algorithms generally depends on data representation, and we hypothesize that this is because different representations can entangle and hide more or less the different explanatory factors of variation behind the data. Although specific domain knowledge can be used to help design representations, learning with generic priors can also be used, and the quest for AI is motivating the design of more powerful representation-learning algorithms implementing such priors. This paper reviews recent work in the area of unsupervised feature learning and deep learning, covering advances in probabilistic models, auto-encoders, manifold learning, and deep networks. This motivates longer-term unanswered questions about the appropriate objectives for learning good representations, for computing representations (i.e., inference), and the geometrical connections between representation learning, density estimation and manifold learning.

Dr. A.J.’s Note: This takes the whole notion of representation and representation levels closer to current time. Most of us can read through Sect. 5 (p. 11). (The remaining sections involve lots of math and lots of theory. We may have a chance to return to them later.)

 

  • Chen, X.L., Li, L.-J., Li, F.-F., and Gupta, A. (2018, Mar. 29). Iterative visual reasoning beyond convolutions, arXiv:1803.11189 [cs.CV]. online access, accessed April 1, 2018 by AJM.

    Abstract: We present a novel framework for iterative visual reasoning. Our framework goes beyond current recognition systems that lack the capability to reason beyond stack of convolutions. The framework consists of two core modules: a local module that uses spatial memory to store previous beliefs with parallel updates; and a global graph-reasoning module. Our graph module has three components: a) a knowledge graph where we represent classes as nodes and build edges to encode different types of semantic relationships between them; b) a region graph of the current image where regions in the image are nodes and spatial relationships between these regions are edges; c) an assignment graph that assigns regions to classes. Both the local module and the global module roll-out iteratively and cross-feed predictions to each other to refine estimates. The final predictions are made by combining the best of both modules with an attention mechanism. We show strong performance over plain ConvNets, e.g. achieving an 8.4% absolute improvement on ADE measured by per-class average precision. Analysis also shows that the framework is resilient to missing regions for reasoning.

     

    Academic Articles and Books on Representations for AI

    • Marr D. (1982). Vision (San Francisco: W.H. Freeman).
    • McClamrock, R. (1991, May). Marr’s Three Levels: A Re-evaluation. Minds and Machines, 1 (2), 185–196.
      online access, accessed April 1, 2018 by AJM.
    • Newell, A. (1980, Aug 19). The knowledge level. Presidential Address, American Association for Artificial Intelligence. AAAI80, Stanford University. Later published in Artificial Intelligence and AI Magazine (1981, July). online access, accessed April 1, 2018 by AJM.
    • Warren, W.H. (2012). Does this computational theory solve the right problem? Marr, Gibson, and the goal of vision. Perception, 41(9): 1053–1060. doi: 10.1068/p7327. online access, accessed April 1, 2018 by AJM.

     

    Useful Blogs

    • Singhal, A. (2012, May 16). Introducing the Knowledge Graph: things, not strings. Google. online access, accessed April 1, 2018 by AJM.

     

    Readable Techy-News on This and Related Subjects

    • Patterson, D. (2018, Feb. 20, 8:00 AM PST). How data quality affects the success or failure of a political campaign – Paul Westcott, vice president of data at L2, explains why data quality is everything when it comes to reaching voters and winning elections. TechRepublic. online access, accessed by AJM, Apr. 13, 2018. Contains video. Dr. A.J.’s Note: Warning! – This site may hack your computer and start playing vid after vid, and I had to reboot to get the vids to stop. Damn annoying.
    • Newcomb, A. (2018, Feb. 21, 7:57 AM ET). Artificial intelligence could supercharge hacking and election meddling, study warns – AI programs can make it easier for trolls with minimal technical skills to make fake videos, audio, researchers warn, NBC News. online access, accessed by AJM, Apr. 13, 2018. Contains video.

     

    Related

    • Roose, K. (2018, Feb. 10). His 2020 Campaign Message: The Robots Are Coming, The New York Times. online access, accessed by AJM, Apr. 13, 2018. (About Andrew Yang, Democrat announcing candidacy for 2020 election.)

     
     


     
     

    2 thoughts on “Moving Between Representation Levels – the Key to Making an AI System Work (Part 1)”

    1. Hi Alianna,

      I very definitely agree that ‘representation’ is the key to understanding AI. I further agree that “a good AI system will successively move disparate pieces of low-level, or signal-level information up the abstraction ladder.” I think your references are also useful and relevant to your points.

      I am no expert on image recognition, but my sense is that it is still more of a dark art because we really have no grounded basis for decomposing and analyzing images. We see shapes and structures building up to discrete images, but that is not the way (if I understand correctly) that deep learning works with images. The intermediate representations in DL are indeed abstract and difficult to understand on symbolic levels. I can also see how deep learning can abstract symbolic representations.

      When, however, we are dealing with language or human communications (even as interpreted by machines) I think we are also seeking abstractions (perhaps a better term is generalizations or prescissions) that give some sort of explanatory power. We may be able to get good results from deep learning on symbolic information, but we don’t know why and can’t see how it occurred. DL seems to be destined to be a black box.

      Now, I know you are not speaking specifically or particularly about deep learning. But for us to be able to meaningfully transition from the signal to the symbolic, I think we need to be able to understand the transitions all of the way up what you call the transition ladder. If we can not understand what happens as each rung of this ladder is climbed, do we know what any of it means?

      Put another way, my prejudice is that there needs to be explanatory power and understandability at each rung of the ladder for us to be dealing with real representations, not statistical facsimiles of same. For that requirement, we still need a representation that is itself understandable and grounded.

      I fear that much of what passes for AI now works within acceptable error bounds, and in many applications is already providing impressive results, but we do not know why and do not know really how to improve. I do not believe we will gain that understanding until the basis of our representations is grounded in what is real, whatever that means, not just arbitrary abstractions.

      Keep up the great posts!

      1. Mike, this is a great comment, and I agree with you on many points.
        Rather than attempt a detailed response here, I’m going to weave in responses in my coming posts – and link back to your comment here and to your website – http://www.mkbergman.com/.
        For all of you who are joining the conversation, Mike Bergman is one of the thought-leaders in developing not just ontologies, but a formalized, systematic, and principled way for developing ontologies. He’s got a book submitted for publication, and I’ll be reviewing and referencing that book as it comes out.
        In the meantime, we’re all looking for ways to transform the “dark art” into something more explicit.
        I’m liking the “dark art” reference, because in a sense, it’s as though we were doing chemistry – before the realization of the periodic table!
        Thanks, Mike – we’ll be keeping this conversation open!
        All my best – AJM
