Our Mission

July 17, 2011

Our Mission:

  1. The SIRA technology was developed to automate research tasks that require natural language understanding, domain knowledge, and reasoning.
  2. SIRA-based products must be easy to use, requiring little or no training beyond what the user already understands.

Mission Status:

  1. We have developed a patented and patent-pending core technology capable of supporting a wide range of applications.
  2. We have developed PriArt, an embodiment of the SIRA technology that automates the reading and understanding of natural language text (initially English), gathers specific information of interest, and produces a variety of useful reports.
  3. Other SIRA embodiments have been conceived and prototyped.

Today SIRA can:

  1. Semantically understand a statement of your interest expressed in Natural Language
  2. Read through a vast number of documents
  3. Identify semantically relevant information of interest in Natural Language text
  4. Report the findings in useful ways including Natural Language text

Selected PriArt pilots begin

July 17, 2011

PriArt is a web-based application for conducting Semantic Search/Research.

By “Semantic Search” we mean:

“Rather than using ranking algorithms such as Google’s PageRank to predict relevancy, Semantic Search uses semantics, or the science of meaning in language, to produce highly relevant search results. […] the goal is to deliver the information queried by a user rather than have a user sort through a list of loosely related keyword results.” – http://en.wikipedia.org/wiki/Semantic_search

PriArt quickly reads through a potentially large corpus of documents and reports just the information you are looking for.
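
To make the distinction concrete, here is a minimal Python sketch contrasting plain keyword matching with a toy meaning-based match. The documents, query, and paraphrase set are hand-written for illustration only; PriArt’s actual mechanism is not shown here.

    # Toy contrast between keyword matching and meaning-based matching.
    # The paraphrase set is hand-written for illustration; a real
    # semantic search system derives such equivalences from its model
    # of the language rather than from a lookup table.
    documents = [
        "Acme Corp. acquired Widget Inc. last quarter.",
        "The acquisition of Widget Inc. by Acme Corp. closed recently.",
        "Widget Inc. announced a new product line.",
    ]

    query_terms = {"bought", "widget"}   # from "Who bought Widget Inc.?"
    BUY_PARAPHRASES = {"bought", "acquired", "acquisition", "purchased"}

    def terms(text):
        return {w.strip(".,?").lower() for w in text.split()}

    for doc in documents:
        # Keyword match: needs the literal word "bought", so it misses
        # the first two documents even though they answer the question.
        keyword_hit = query_terms <= terms(doc)
        # Toy semantic match: any known paraphrase of "buy" counts.
        semantic_hit = "widget" in terms(doc) and bool(terms(doc) & BUY_PARAPHRASES)
        print(f"keyword={keyword_hit!s:5} semantic={semantic_hit!s:5} {doc}")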

We are starting pilots with a limited number of users. To qualify please contact us here.

Fast, Accurate and Easy to Use

July 17, 2011

Ultimately, we (at Semantic Insights) are engineers and not researchers. Our goal is to build something useful, and something that will sustain itself (and us) financially. We recognize that whatever we develop needs to perform fast enough, produce accurate enough results, and be easy enough to use to warrant end user investment.

We knew that the system we envisioned needed to understand whole sentences in context. Existing technologies failed to meet one or more of our requirements. “Statistical proximity matching” is clearly not accurate enough. “Keyword search” misses anaphora, and “key-phrase search” misses intervening terms and other equivalent sentence structures. Traditional Natural Language Processing (NLP) appears too slow. Statistically based part-of-speech taggers are wholly inadequate. Building “ontologies” has all the problems of standards, in addition to requiring time, expertise, and significant up-front investment. “Text mining” results are too limited to warrant the implementation costs.
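
Two tiny Python checks illustrate the failure modes named above; the sentences are invented for the example:

    # Keyword and key-phrase search miss anaphora: the fact "Ann joined
    # the team" is spread across two sentences via the pronoun "She".
    text = "Acme hired Ann in March. She joined the research team."
    print("Ann joined" in text)   # False: "Ann" has become "She"
    print("Ann" in text and "joined" in text)   # True, but nothing here
    # says that "She" refers to Ann; that is the anaphora problem.

    # Key-phrase search also misses intervening terms:
    sentence = "Joe reluctantly gave Bob the ball"
    print("Joe gave" in sentence)   # False: "reluctantly" intervenes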

In short, we found current technology significantly lacking for our purpose. So, we began with first principles and built our approach from the bottom up.

Categories: Requirements, Technology

Semantic-enabled Applications vs. non-Semantic-enabled Applications

July 17, 2011

This entry is offered as background.

Semantic-based applications operate on data. This can be data in a database, entries in a blog or wiki, text in documents or web pages. The data can be real-time data such as clicks in a browser, streaming video, or a stock ticker. The data can be anything. This is of course true of non-semantic-based applications as well.

The difference between semantic-based applications and non-semantic-based applications comes from their use of ontologies.

Semantic-based applications use special information models (semantic models) that describe the data they operate on. These semantic models are commonly known as “ontologies”. Note: an ontology is not the same as a database schema, and creating ontologies is not the same as “data modeling”.

Ontologies are usually maintained and accessed separately from the data they describe. To accomplish this, semantic-based applications need some way to relate the data to the concepts in the ontology (or ontologies) that describe it. The simplest way to express this relationship is via “semantic tags”.

Semantic tags are used to identify data as representing or containing information related to one or more concepts in the ontology. More sophisticated systems employ classification functions that dynamically identify data as representing one or more concepts. Still more sophisticated systems create and maintain mappings of relationships as well as the concepts. And even more sophisticated systems maintain these mappings on multiple dimensions (for example, mapping different “Points of View”).
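
As a sketch of the “semantic tag” idea in Python (the ontology, tags, and concept names are invented for this example):

    # A toy ontology: concepts linked by is-a relationships. Real
    # ontologies (e.g. in OWL) are far richer; this shows only the shape.
    ontology = {
        "Employee": {"is_a": "Person"},
        "Person":   {"is_a": "Thing"},
        "Company":  {"is_a": "Thing"},
    }

    # Semantic tags: data items annotated with ontology concepts. The
    # simplest scheme is a static mapping from strings to concepts.
    tags = {"Ann": "Employee", "Acme": "Company"}

    def concepts_of(term):
        """Walk is-a links so a tag of Employee also counts as Person."""
        concept = tags.get(term)
        found = []
        while concept is not None:
            found.append(concept)
            concept = ontology.get(concept, {}).get("is_a")
        return found

    print(concepts_of("Ann"))    # ['Employee', 'Person', 'Thing']
    print(concepts_of("Acme"))   # ['Company', 'Thing']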

Semantic-based applications enable reasoning about data by examining the related ontologies. This can be as simple as maintaining a hierarchical index of the semantic contents of a document and then querying the index to find semantically related data. Logic can also be applied to draw inferences where, for example, the existence of concepts A, B, and C in one or more documents implies the existence of concept D. Still more sophisticated systems can determine the semantic overlap between the information sought and the information available, or between different “Points of View”. And even more sophisticated systems can discover new data and new concepts.
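
The “A and B and C implies D” pattern can be sketched as simple forward chaining; the rules and concept names below are hypothetical:

    # Toy forward chaining over concepts found in a document set. Each
    # rule reads: if every antecedent concept is present, infer the
    # consequent. The concepts are invented for illustration.
    rules = [
        ({"Fever", "Cough", "Fatigue"}, "PossibleFlu"),
        ({"PossibleFlu", "Outbreak"},   "PublicHealthAlert"),
    ]

    def infer(found):
        """Apply the rules repeatedly until nothing new is derived."""
        found = set(found)
        changed = True
        while changed:
            changed = False
            for antecedents, consequent in rules:
                if antecedents <= found and consequent not in found:
                    found.add(consequent)
                    changed = True
        return found

    print(infer({"Fever", "Cough", "Fatigue", "Outbreak"}))
    # Derives 'PossibleFlu' and, from it, 'PublicHealthAlert'.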

What is important here is this:

  1. Semantic-based applications use an external information model (ontology) to semantically describe data in terms of concepts and their relationships.
  2. There is some way to relate the data to the items in the ontology.
  3. Reasoning can be performed about the data based on its relationship to the ontology.

There are many ways all this can happen. Each has its own pros and cons. This will become evident in later blog entries.

Natural Language Understanding

July 17, 2011

Understanding Natural Language is complex. In particular, extracting the meaning from all but the simplest Natural Language communication presents a number of challenges for machine understanding. Let me explain.

In Natural Language there are many ways to say the same thing. Subjects, objects, and verbs can change position (as well as form) to yield sentences with equivalent meanings. A subject, object, or verb may appear as a single word or as a complex phrase. For that matter, whole sentences can be simple or complex, festooned with coordinating conjunctions, subordinate clauses, and all manner of punctuation (or, frequently, missing or incorrect punctuation).

In Natural Language flexibility abounds. There are many ways to express notions of time, identity, location, possession, and quantity. There are many ways to express characteristics, values, and units of measure. There are many ways to express relationships between two things, concepts, or other relationships. In addition, there are many ways to express negation.

In Natural Language the opportunity for ambiguity abounds. For a given grammar, there may be more than one way to correctly divide a sentence into terms and phrases. In fact, there may be more than one valid part of speech for a given word in a sentence. Punctuation may help, but more often than not it is missing or incorrect.
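
The tagging ambiguity is easy to see with a toy lexicon; real taggers use context models rather than enumerating alternatives, and the lexicon here is hand-written:

    from itertools import product

    # A toy lexicon listing each word's possible parts of speech.
    lexicon = {
        "time":  {"NOUN", "VERB"},
        "flies": {"NOUN", "VERB"},
        "like":  {"VERB", "PREP"},
        "an":    {"DET"},
        "arrow": {"NOUN"},
    }

    sentence = ["time", "flies", "like", "an", "arrow"]

    # Enumerate every tag sequence the lexicon allows; a parser still
    # has to choose among them, which is the ambiguity described above.
    taggings = list(product(*(sorted(lexicon[w]) for w in sentence)))
    print(len(taggings))   # 8 possible taggings for five words
    for tagging in taggings[:3]:
        print(list(zip(sentence, tagging)))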

A sentence can contain words that stand for, or reference, other words in the same sentence or in previous sentences. This includes pronouns (e.g., “he”, “she”, “it”, “those”, “they”, “him”, “her”, “them”, etc.) and possessive pronouns (e.g., “his”, “hers”, “theirs”, etc.). Further, a sentence can contain indexicals that refer to one or more concepts expressed in the current sentence, in one or more previous sentences, or in sentences yet to come.
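
A deliberately naive Python sketch of pronoun resolution (bind each pronoun to the most recent gender-compatible name) shows why these references are hard; the heuristic below fails on the second pronoun, which is exactly the point:

    # Naive anaphora resolution: bind a pronoun to the most recently
    # mentioned entity of compatible gender. Real systems use syntax,
    # semantics, and discourse context; this heuristic is illustrative.
    GENDER = {"Joe": "m", "Bob": "m", "Ann": "f"}
    PRONOUNS = {"he": "m", "him": "m", "she": "f", "her": "f"}

    def resolve(tokens):
        recent, resolved = [], []
        for tok in tokens:
            if tok in GENDER:
                recent.append(tok)
                resolved.append(tok)
            elif tok.lower() in PRONOUNS:
                wanted = PRONOUNS[tok.lower()]
                match = next((e for e in reversed(recent)
                              if GENDER[e] == wanted), tok)
                resolved.append(match)
            else:
                resolved.append(tok)
        return " ".join(resolved)

    print(resolve("Joe gave Bob the ball . He thanked him .".split()))
    # Prints "... Bob thanked Bob ..." -- recency alone cannot tell us
    # that "He" is Bob (the receiver) while "him" is Joe (the giver).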

Determining the sense of a word (or multi-word term) in a sentence can be tricky. The sense of a given word can change based on how it is used in a sentence. It can change based on the presence or absence of other words in the sentence. It can also change based on the content of one or more previous sentences or of sentences yet to come.
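
Sense selection can be sketched with a Lesk-style overlap between the sentence and a set of sense glosses; the glosses below are hand-written toys, not a real sense inventory:

    # Simplified Lesk-style word-sense disambiguation: choose the sense
    # whose gloss shares the most words with the sentence.
    senses = {
        "bank/river":   "sloping land beside a body of water",
        "bank/finance": "institution that accepts deposits and lends money",
    }

    def disambiguate(sentence):
        words = set(sentence.lower().split())
        return max(senses, key=lambda s: len(words & set(senses[s].split())))

    print(disambiguate("He sat on the bank of the river watching the water"))
    print(disambiguate("The bank approved the deposits and the loan money"))
    # The surrounding words, not "bank" itself, determine the sense.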

There are many kinds of sentences. There are declarations, questions, commands, and exclamations. Sentences can contain quotes. Sentences can be quotes. Sentences can contain conditionals (e.g. “when”, “if”, etc.) or statements of probability (e.g. “may” vs “will”). Sentences can directly reference other sentences.

When it comes to understanding Natural Language, context is crucial. Different readers (or listeners) derive the meaning of a communication based on their individual skill with the language, as well as their individual background knowledge of the content of the conversation. Much of our understanding of language is rooted both in a local semantic context (this document or set of documents) and in one or more larger semantic contexts (domain-specific and general knowledge).

Many ways of saying the same thing

July 17, 2011

Subjects and objects can move around. For example, “Joe gave Bob the ball” can also be stated as “The ball was given to Bob by Joe.” Verbs can change “direction”: “Joe gave Bob the ball” can be restated as “Bob received the ball from Joe.”

Substitution of equivalent words or phrases (e.g., synonyms) is often not simple and may be constrained by the semantic context. For example, you may say that “Joe ran for office” is the same as “Joe campaigned for office”; here “ran” and “campaigned” may be valid synonyms in this context. However, it may not be correct to substitute “sprinted” for “ran” to produce “Joe sprinted for office.” Yet if you add a determiner, “the”, before “office”, the substitution may make sense, as in “Joe sprinted for the office.” Note: you will need to consult the semantic context to determine what is probably meant. There will be more on this later.
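
One common way to treat such variants as “the same thing” is to normalize each sentence to a canonical tuple. The mini-grammar below covers only these three example sentences and is purely illustrative:

    import re

    # Normalize several surface forms of the same event to one
    # canonical (relation, giver, receiver, thing) tuple.
    PATTERNS = [
        # "Joe gave Bob the ball" (active, double object)
        (re.compile(r"(\w+) gave (\w+) the (\w+)"),
         lambda m: ("give", m[1], m[2], m[3])),
        # "The ball was given to Bob by Joe" (passive)
        (re.compile(r"The (\w+) was given to (\w+) by (\w+)"),
         lambda m: ("give", m[3], m[2], m[1])),
        # "Bob received the ball from Joe" (reversed verb direction)
        (re.compile(r"(\w+) received the (\w+) from (\w+)"),
         lambda m: ("give", m[3], m[1], m[2])),
    ]

    def canonicalize(sentence):
        for pattern, build in PATTERNS:
            m = pattern.match(sentence)
            if m:
                return build(m)
        return None

    for s in ("Joe gave Bob the ball",
              "The ball was given to Bob by Joe",
              "Bob received the ball from Joe"):
        print(canonicalize(s))   # all three: ('give', 'Joe', 'Bob', 'ball')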

Subjects and Objects

July 17, 2011

Subjects and objects of a sentence can be simple or complex. They can take many forms:

  1. A single word (e.g., “Joe”)
  2. A pronoun (e.g., “he”, “she”, “it”, …)
  3. An indexical, that is, a reference to something previously stated (e.g., “that” as in “That made all the difference”)
  4. A noun phrase (e.g., “President Lincoln”)
  5. A gerund phrase (e.g., “going to town”)
  6. An infinitive phrase (e.g., “to go to town”)
  7. A conjoined subject or object (e.g., “Joe and Bob”, “Joe or Bob”, …)

And there is more.
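
A toy Python classifier for these forms may help make the enumeration concrete; the heuristics are deliberately crude and invented for this example:

    # Crude heuristics for classifying the subject/object forms listed
    # above. A real grammar does far more; this is only illustrative.
    PRONOUNS   = {"he", "she", "it", "they", "him", "her", "them"}
    INDEXICALS = {"that", "this", "those", "these"}

    def classify(phrase):
        words = phrase.lower().split()
        if "and" in words or "or" in words:
            return "conjoined"
        if len(words) == 1 and words[0] in PRONOUNS:
            return "pronoun"
        if len(words) == 1 and words[0] in INDEXICALS:
            return "indexical"
        if words[0] == "to":
            return "infinitive phrase"
        if words[0].endswith("ing"):
            return "gerund phrase"
        return "single word" if len(words) == 1 else "noun phrase"

    for example in ("Joe", "he", "that", "President Lincoln",
                    "going to town", "to go to town", "Joe and Bob"):
        print(f"{example!r:20} -> {classify(example)}")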