Author Archive

Semantic-enabled Applications vs. non-Semantic-enabled Applications

July 17, 2011

This entry is offered as background.

Semantic-based applications operate on data. This can be data in a database, entries in a blog or wiki, or text in documents or web pages. The data can be real-time data such as clicks in a browser, streaming video, or a stock ticker. The data can be anything. This is of course true of non-semantic-based applications as well.

The difference between semantic-based applications and non-semantic-based applications comes from their use of ontologies.

Semantic-based applications use special information models (semantic models) that describe the data they operate on. These semantic models are commonly known as “ontologies”. Note: an ontology is different from a database schema, and creating ontologies is not the same as “data modeling”.

Ontologies are usually maintained and accessed separately from the data they describe. To accomplish this, semantic-based applications need some way to relate the data to the concepts in the ontology (or ontologies) that describe it. The simplest form of this relationship is via “semantic tags”.

Semantic tags are used to identify data as representing or containing information related to one or more concepts in the ontology. More sophisticated systems employ classification functions that dynamically identify data as representing one or more concepts. Still more sophisticated systems create and maintain mappings of relationships as well as the concepts. And even more sophisticated systems maintain these mappings on multiple dimensions (for example, mapping different “Points of View”).
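
As a rough illustration of the first two levels described above, here is a minimal sketch of semantic tags and a very naive classification function. Everything in it is an assumption made for the example: the `ONTOLOGY` and `KEYWORDS` tables, the `Document` class, and the keyword matching are hypothetical stand-ins, not how any particular product works.

```python
from dataclasses import dataclass, field

# Hypothetical toy ontology: concept -> parent concept (None for a root).
ONTOLOGY = {
    "Animal": None,
    "Dog": "Animal",
    "Cat": "Animal",
}

# Hypothetical trigger words for a very naive classification function.
KEYWORDS = {
    "Dog": {"dog", "puppy"},
    "Cat": {"cat", "kitten"},
}


@dataclass
class Document:
    text: str
    tags: set = field(default_factory=set)  # semantic tags (concept names)


def tag(doc: Document, concept: str) -> None:
    """Attach a semantic tag, rejecting concepts the ontology does not define."""
    if concept not in ONTOLOGY:
        raise KeyError(f"unknown concept: {concept}")
    doc.tags.add(concept)


def classify(doc: Document) -> set:
    """Identify concepts by keyword match (a stand-in for a real classifier)."""
    words = {w.strip(".,!?").lower() for w in doc.text.split()}
    return {concept for concept, keys in KEYWORDS.items() if words & keys}


doc = Document("The puppy chased the kitten.")
tag(doc, "Animal")         # a manually assigned semantic tag
doc.tags |= classify(doc)  # tags found by the classification function
print(sorted(doc.tags))    # ['Animal', 'Cat', 'Dog']
```

Note that the ontology lives apart from the document; the tags are the only link between the two.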

Semantic-based applications enable reasoning about data by examining the related ontologies. This can be as simple as maintaining a hierarchical index of the semantic contents of a document and then querying the index to find semantically related data. Logic can also be applied to find inferences where, for example, the existence of concepts A, B, and C in one or more documents implies the existence of concept D. Still more sophisticated systems can determine semantic overlap between the information sought and the information available, or between different “Points of View”. And even more sophisticated systems can discover new data and new concepts.
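
The two simplest kinds of reasoning mentioned above can be sketched in a few lines. This is only a sketch under stated assumptions: the `PARENT` hierarchy, the `INDEX` of documents, and the single rule in `RULES` are all made up for illustration.

```python
# Hypothetical concept hierarchy: child -> parent.
PARENT = {"Dog": "Animal", "Cat": "Animal", "Animal": "LivingThing"}

# Hypothetical index: document id -> concepts found in it.
INDEX = {
    "doc1": {"Dog"},
    "doc2": {"Cat"},
    "doc3": {"Smoke", "Heat", "Flame"},
}

# Hypothetical rule: if all premise concepts are present, infer the conclusion.
RULES = [({"Smoke", "Heat", "Flame"}, "Fire")]


def ancestors(concept):
    """Yield the concept itself and every more general concept above it."""
    while concept is not None:
        yield concept
        concept = PARENT.get(concept)


def find(concept):
    """Return ids of documents that mention the concept or a specialization of it."""
    return sorted(
        doc_id
        for doc_id, concepts in INDEX.items()
        if any(concept in ancestors(c) for c in concepts)
    )


def infer(concepts):
    """Apply the rules repeatedly until no new concept can be added."""
    known = set(concepts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in RULES:
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


print(find("Animal"))                   # ['doc1', 'doc2']
print("Fire" in infer(INDEX["doc3"]))   # True
```

Querying for "Animal" finds documents that only ever mention "Dog" or "Cat", which is the point of indexing against a hierarchy rather than against raw words.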

What is important here is this:

  1. Semantic-based applications use an external information model (ontology) to semantically describe data in terms of concepts and their relationships.
  2. There is some way to relate the data to the items in the ontology.
  3. Reasoning can be performed about the data based on its relationship to the ontology.

There are many ways all this can happen. Each has its own pros and cons. This will become evident in later blog entries.

Natural Language Understanding

July 17, 2011

Understanding Natural Language is complex. In particular, extracting the meaning from all but the simplest Natural Language communication presents a number of challenges for machine understanding. Let me explain.

In Natural Language there are many ways to say the same thing. Subjects, objects, and verbs can change position (as well as form) to form sentences each with an equivalent meaning. A subject, object, or verb may appear as a single word or a complex phrase. For that matter, whole sentences can be simple or complex, festooned with coordinating conjunctions, subordinate clauses, and all manner of punctuation (or frequently missing or incorrect punctuation).

In Natural Language flexibility abounds. There are many ways to express notions of time, identity, location, possession, and quantity. There are many ways to express characteristics, values, and units of measure. There are many ways to express relationships between two things, concepts, or other relationships. In addition, there are many ways to express negation.

In Natural Language the opportunity for ambiguity abounds. For a given grammar, there may be more than one way to correctly divide the sentence into terms and phrases. In fact, there may be more than one valid part-of-speech for a given word in a sentence. Punctuation may help but, more often than not, it is missing or incorrect.
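
To get a feel for how quickly part-of-speech ambiguity multiplies, consider this small sketch. The `LEXICON` is a hypothetical toy word list, and the example sentence is a well-known one chosen for illustration; a real system would prune these candidates with a grammar and with context.

```python
from itertools import product

# Hypothetical toy lexicon: each word lists every part of speech it may take.
LEXICON = {
    "time": ["NOUN", "VERB"],
    "flies": ["NOUN", "VERB"],
    "like": ["VERB", "PREP"],
    "an": ["DET"],
    "arrow": ["NOUN"],
}


def tag_sequences(sentence):
    """Return every possible part-of-speech assignment for the sentence."""
    words = sentence.lower().split()
    options = [LEXICON[w] for w in words]
    return [list(zip(words, tags)) for tags in product(*options)]


candidates = tag_sequences("Time flies like an arrow")
print(len(candidates))  # 8 candidate taggings before any grammar is applied
```

Five words with a handful of readings each already give eight candidates; longer sentences grow multiplicatively.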

A sentence can contain words that stand for, or reference, other words in the same sentence or in previous sentences. This includes pronouns (e.g., “he”, “she”, “it”, “those”, “they”, “him”, “her”, “them”, etc.) and possessive pronouns (e.g., “his”, “hers”, “theirs”, etc.). Further, a sentence can contain indexicals that refer to one or more concepts expressed in the current sentence, in one or more previous sentences, or in sentences yet to come.

Determining the sense of a word (or multi-word term) in a sentence can be tricky. The sense of a given word can change based on how it is used in a sentence. It can change based on the presence or absence of other words in a sentence. It can change based on the content of one or more previous sentences or sentences yet to come.

There are many kinds of sentences. There are declarations, questions, commands, and exclamations. Sentences can contain quotes. Sentences can be quotes. Sentences can contain conditionals (e.g. “when”, “if”, etc.) or statements of probability (e.g. “may” vs “will”). Sentences can directly reference other sentences.

When it comes to understanding Natural Language, context is crucial. Different readers (or listeners) derive the meaning of a communication based on their individual skill with the language, as well as their individual background knowledge of the content of the conversation. Much of our understanding of language is rooted in both a local semantic context (this document or set of documents) and one or more larger semantic contexts (domain-specific and general knowledge).

Many ways of saying the same thing

July 17, 2011

Subjects and objects can move around. For example, “Joe gave Bob the ball” can also be stated as “The ball was given to Bob by Joe.”

Verbs can change “direction.” For example, “Joe gave Bob the ball” can be restated as “Bob received the ball from Joe.”

Substitution of equivalent words or phrases (e.g. synonyms) is often not simple and may be constrained by the semantic context. For example, you may say that “Joe ran for office” is the same as “Joe campaigned for office.” Here “ran” and “campaigned” may be valid synonyms in this context. However, it may not be correct to substitute “sprinted” for “ran” to produce “Joe sprinted for office.” If you add the determiner “the” before “office”, though, that substitution may make sense, as in “Joe sprinted for the office.” Note: you will need to consult the semantic context to determine what is probably meant. There will be more on this later.
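
One way to picture “many ways of saying the same thing” is to map each surface form onto a single common frame. The sketch below does this for the three ball-giving sentences only. The `Transfer` frame and the regular-expression `PATTERNS` are assumptions made for this example; pattern matching of this kind would not scale to real text.

```python
import re
from typing import NamedTuple, Optional


class Transfer(NamedTuple):
    """Hypothetical common frame: who gave what to whom."""
    giver: str
    recipient: str
    item: str


# Hypothetical surface patterns; each maps its named groups onto the same frame.
PATTERNS = [
    re.compile(r"^(?P<giver>\w+) gave (?P<recipient>\w+) the (?P<item>\w+)\.?$"),
    re.compile(r"^The (?P<item>\w+) was given to (?P<recipient>\w+) by (?P<giver>\w+)\.?$"),
    re.compile(r"^(?P<recipient>\w+) received the (?P<item>\w+) from (?P<giver>\w+)\.?$"),
]


def to_frame(sentence: str) -> Optional[Transfer]:
    """Return the common frame for a sentence, or None if no pattern fits."""
    for pattern in PATTERNS:
        match = pattern.match(sentence)
        if match:
            return Transfer(**match.groupdict())
    return None


sentences = [
    "Joe gave Bob the ball",
    "The ball was given to Bob by Joe.",
    "Bob received the ball from Joe.",
]
frames = {to_frame(s) for s in sentences}
print(frames)  # one frame: giver='Joe', recipient='Bob', item='ball'
```

All three sentences collapse to the same frame, which is what lets later processing treat them as equivalent.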

Subjects and Objects

July 17, 2011

Subjects and objects of a sentence can be simple or complex:

  - A single word (e.g. “Joe”)
  - A pronoun (e.g. “he”, “she”, “it”,…)
  - An indexical (e.g. “that” as in “That made all the difference”). An indexical is a reference to something previously stated.
  - A noun phrase (e.g. “President Lincoln”)
  - A gerund phrase (e.g. “going to town”)
  - An infinitive phrase (e.g. “to go to town”)
  - Conjoined (e.g. “Joe and Bob”, “Joe or Bob”,…)

And there is more.

Word Sense in context

July 17, 2011

Depending on other words in the same sentence

Consider the word “took”:

  - “Joe took his medicine.” Here “took” may mean “swallowed”.
  - “Joe took his medicine with him.” Here “took” may mean “brought”.
  - “Joe took his medicine like a man.” Here “took his medicine” may mean “endured”.
  - “Joe took his medicine and hid it.” Here “took” may simply mean “moved”.

Depending on other words in other sentences

Consider the word “took” again in this paragraph:

“Joe was wrong and he knew it. Finally, it was time. Joe took his medicine.”

Here the earlier sentences suggest that “took his medicine” means “accepted the consequences”, even though the final sentence on its own reads exactly like the first example.
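
The “took” examples can be turned into a toy rule-based sense picker. This is a deliberately crude sketch: the cue words in `SENSE_RULES`, the default sense, and the check for “wrong” in earlier sentences are all assumptions invented for these few sentences, not a general word-sense method.

```python
# Hypothetical cue words for each sense of "took"; the first matching rule wins.
SENSE_RULES = [
    ("endured", {"like"}),      # "took his medicine like a man"
    ("brought", {"with"}),      # "took his medicine with him"
    ("moved", {"and", "hid"}),  # "took his medicine and hid it"
]
DEFAULT_SENSE = "swallowed"     # "Joe took his medicine."


def sense_of_took(sentence, previous=()):
    """Pick a sense for 'took' from cue words in this and earlier sentences."""
    # Cue from earlier sentences: talk of being wrong suggests the idiom.
    context = " ".join(previous).lower()
    if "wrong" in context:
        return "endured"
    # Cues from the sentence itself: every cue word of a rule must be present.
    words = set(sentence.lower().strip(".").split())
    for sense, cues in SENSE_RULES:
        if cues <= words:
            return sense
    return DEFAULT_SENSE


print(sense_of_took("Joe took his medicine."))           # swallowed
print(sense_of_took("Joe took his medicine with him."))  # brought
print(sense_of_took(
    "Joe took his medicine.",
    previous=["Joe was wrong and he knew it.", "Finally, it was time."],
))                                                       # endured
```

The same final sentence yields two different senses depending on what came before it, which is the behavior the paragraph example calls for.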

World View, Meaning and Understanding

July 17, 2011

Let us begin with some basic definitions.

An individual or collective World View is an operating model composed of concepts, instances, and relationships, along with how they are known to (or may possibly) inter-relate. These inter-relations may be constrained in time, space, identity, quantity, negation, and possession.

When we read, we experience the Natural Language text. The relationship between what we experience and our World View is the meaning we ascribe to what we read. Meaning is often a complex relationship between what we experience and our World View.

Understanding is gained as a result of determining the meaning of our experiences in terms of our World View.

From these definitions we can create a form of Machine Understanding by:

  1. creating and populating a persistent model of one or more World Views,
  2. establishing a model to represent the relationships between what is experienced (read) and the World View,
  3. providing the ability to render Natural Language sources to a common form for processing,
  4. providing the ability to map rendered Natural Language to a World View and store that mapping as meaning.
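
The four steps above can be sketched in miniature as follows. This is an illustrative toy under heavy assumptions, not a description of any real implementation: the `WorldView` and `Meaning` classes are hypothetical, the World View is held in memory rather than persisted, and `render` only handles three-word “Subject verb Object.” sentences.

```python
from dataclasses import dataclass, field


@dataclass
class WorldView:
    """Step 1: a model of known concepts and relationships (in memory only here)."""
    concepts: set = field(default_factory=set)
    relations: set = field(default_factory=set)  # (subject, relation, object) triples


@dataclass
class Meaning:
    """Step 2: a record linking what was read to the World View."""
    source: str
    known: list  # triples already present in the World View
    novel: list  # triples not yet present in the World View


def render(text):
    """Step 3: reduce text to triples; assumes toy 'Subject verb Object.' sentences."""
    triples = []
    for sentence in text.split("."):
        words = sentence.split()
        if len(words) == 3:
            triples.append(tuple(words))
    return triples


def understand(text, view):
    """Step 4: map rendered text onto the World View and return the mapping."""
    known, novel = [], []
    for triple in render(text):
        (known if triple in view.relations else novel).append(triple)
    return Meaning(text, known, novel)


view = WorldView({"Joe", "Bob"}, {("Joe", "knows", "Bob")})
meaning = understand("Joe knows Bob. Bob owns Rex.", view)
print(meaning.known)  # [('Joe', 'knows', 'Bob')]
print(meaning.novel)  # [('Bob', 'owns', 'Rex')]
```

Even at this scale, the split between what is already known and what is new shows how meaning depends on the World View that the text is read against.
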
Categories: Definitions, Technology

SIRA

July 17, 2011

We refer to the implementation of our patented and patent-pending core reading, understanding, and writing technologies together as the SIRA Technology (or just “SIRA”).

SIRA stands for “Semantic Insights Research Assistant”.

I pronounce it “sigh”-“rah”.  Others on the team say “sear”-“rah”.

Categories: Definitions, Technology

Ontology in the SIRA Technology

July 17, 2011

We built our system around the notion that language (or, for that matter, drawings, songs, smells, or any other encoding of perceived semantics) provides a systematic, although often complex, way to express relationships between and among items. These items may range from the physical to the abstract. The relationships may express time, location, participation, or a wide variety of physical or conceptual relatedness.

When you “teach” SIRA about your interests, SIRA constructs (or adds to) a semantic model (i.e. ontology) representing the concepts and relationships of interest. If these concepts are already known, they may be referenced and perhaps enhanced. If the relationships are already known, they too may be referenced and perhaps enhanced. How this happens will be covered in other blog entries.

A previously existing ontology (perhaps with a supplied upper ontology) may provide a “useful” organization of concepts and relationships. This organization is useful to SIRA in two ways. First, while reading, the specializations and instances of a concept or relationship of interest are automatically recognized. And second, while reporting, the written information can be organized with abstract concepts first, followed by successively more detailed concepts. This is in addition to SIRA’s other report organization schemes.
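
To make those two uses concrete, here is a small generic sketch of recognizing specializations of a concept of interest and ordering results from abstract to detailed. It is not SIRA's code or API; the `PARENT` table and the function names are hypothetical and exist only to illustrate the idea.

```python
# Hypothetical ontology fragment: concept -> more general concept.
PARENT = {
    "Vehicle": None,
    "Car": "Vehicle",
    "Truck": "Vehicle",
    "Sedan": "Car",
}


def depth(concept):
    """Number of generalization steps from the concept up to its root."""
    steps = 0
    while PARENT[concept] is not None:
        concept = PARENT[concept]
        steps += 1
    return steps


def is_a(concept, interest):
    """True if the concept is the interest itself or one of its specializations."""
    while concept is not None:
        if concept == interest:
            return True
        concept = PARENT[concept]
    return False


def report_order(found, interest):
    """Keep concepts relevant to the interest, most abstract first."""
    relevant = [c for c in found if is_a(c, interest)]
    return sorted(relevant, key=lambda c: (depth(c), c))


found = ["Sedan", "Truck", "Car", "Vehicle"]
print(report_order(found, "Car"))      # ['Car', 'Sedan']
print(report_order(found, "Vehicle"))  # ['Vehicle', 'Car', 'Truck', 'Sedan']
```

An interest in "Car" picks up "Sedan" automatically but ignores "Truck", and the output runs from the general concept down to the specific one.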

We should also state that “well designed upper ontologies” can provide an excellent foundation for reasoning that exploits generalization-specialization and part-whole relationships. The topic of reasoning in SIRA will be covered in other blog entries.

About Semantic Insights

December 30, 2010

Semantic Insights is the R&D division of Trigent Software, Inc.

We focus on developing semantics-based information products, and plug-ins for third-party products, that produce high-value results for general users with little or no training.

Visit us at www.semanticinsights.com