Appendix: NLP Expressions

For a data source to interpret natural language queries, it must be taught how to convert those queries into machine-friendly terms. KnowledgeKube's NLP contains a substantial library of terms and how they translate to raw data, so applying this logic to your own data requires very little work.

When NLP is initialised for a data source, it will examine the name of each column in the source table in an attempt to understand the column's contents. For each word it recognises, it will create a rule linking that column to related natural language terms. These rules are known as NLP Expressions, and are managed via the Expressions panel.

Managing a Data Source's NLP expressions.

For example, if the source table contains a column named ProductWeight, KnowledgeKube will assume the word "weight" is the most relevant part of that name. Since the system understands the meaning of the word weight, it will automatically group related terms such as "mass", "weighing", and "weigh", and assign a default unit of measurement such as kg. Any logical queries associated with those words can now be used to interrogate the data source using natural language queries such as "products that weigh more than 1kg" and "products with a mass of 5".
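The column-name matching described above can be sketched roughly as follows. This is an illustration only, not KnowledgeKube's actual implementation: the `NLP_DICTIONARY` contents and the function names `split_column_name` and `build_expressions` are assumptions made for this example.

```python
import re

# Hypothetical mini-dictionary: recognised word -> (related terms, default unit).
# The real KnowledgeKube library is far larger and is not documented here.
NLP_DICTIONARY = {
    "weight": (["weight", "mass", "weighing", "weigh"], "kg"),
}

def split_column_name(name):
    """Split a CamelCase or snake_case column name into lowercase words."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", name)
    return [part.lower() for part in parts]

def build_expressions(columns):
    """Create one rule for each recognised word in each column name."""
    rules = []
    for column in columns:
        for word in split_column_name(column):
            if word in NLP_DICTIONARY:
                terms, unit = NLP_DICTIONARY[word]
                rules.append({"expression": column, "terms": terms, "unit": unit})
    return rules

rules = build_expressions(["ProductName", "ProductWeight"])
```

In this sketch only ProductWeight produces a rule, because "weight" is the only word found in the assumed dictionary.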

When no unit is specified in a natural language query, KnowledgeKube will use the default unit of measurement. For example, in a query such as "products longer than 20" where the Length column uses a default unit of feet, the query will be interpreted as "products longer than 20ft".
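As a rough illustration of the default-unit fallback, the sketch below parses the quantity from a query and fills in the column's default unit when none is given. The `DEFAULT_UNITS` mapping and the `apply_default_unit` function are hypothetical names used for this example, not part of KnowledgeKube.

```python
import re

# Assumed default units per column, as they might be set for each expression.
DEFAULT_UNITS = {"Length": "ft", "ProductWeight": "kg"}

def apply_default_unit(value_text, column):
    """Return (number, unit), using the column's default unit if none is given."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([A-Za-z]*)\s*", value_text)
    if match is None:
        raise ValueError(f"not a quantity: {value_text!r}")
    number = float(match.group(1))
    unit = match.group(2) or DEFAULT_UNITS[column]
    return number, unit

no_unit = apply_default_unit("20", "Length")              # unit omitted
with_unit = apply_default_unit("1kg", "ProductWeight")    # unit stated
```

Here "20" against the Length column is read as 20 feet, while "1kg" keeps the unit stated in the query.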

The Expression field in the Expressions panel requires a valid combination of field names, operators, constants, and functions. Each expression is assigned a type depending on its anticipated result, for example:

  • Expressions that contain only a column name will be assigned the same type as the data in that column.
  • Expressions that include a logical comparison will always return a True or False value, and are assigned the True/False type.
  • String manipulation expressions, such as concatenation, are expected to return a String and will therefore be assigned that type.
  • Arithmetic expressions are expected to return a Numeric value, and will therefore be assigned that type.
  • Date calculations are expected to return a DateTime value and will therefore be assigned that type.
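The typing rules listed above can be approximated with a small recursive check. The sketch below is a simplification under stated assumptions: it borrows Python's own parser to read the expression, which is not KnowledgeKube's expression syntax, and the `COLUMN_TYPES` table and the `infer_type` function are hypothetical.

```python
import ast

# Assumed column types for a hypothetical source table.
COLUMN_TYPES = {
    "ProductName": "String",
    "Stock": "Numeric",
    "ProductHeight": "Numeric",
    "ProductWidth": "Numeric",
    "ProductLength": "Numeric",
    "OrderDate": "DateTime",
}

def infer_type(expression, column_types=COLUMN_TYPES):
    """Guess the result type of a simple expression in Python-like syntax."""
    return _infer(ast.parse(expression, mode="eval").body, column_types)

def _infer(node, column_types):
    if isinstance(node, ast.Name):
        # A bare column name takes the type of the data in that column.
        return column_types[node.id]
    if isinstance(node, ast.Constant):
        if isinstance(node.value, bool):
            return "True/False"
        if isinstance(node.value, (int, float)):
            return "Numeric"
        if isinstance(node.value, str):
            return "String"
        raise ValueError(f"unsupported constant: {node.value!r}")
    if isinstance(node, (ast.Compare, ast.BoolOp)):
        # Logical comparisons always yield True or False.
        return "True/False"
    if isinstance(node, ast.BinOp):
        operand_types = {_infer(node.left, column_types),
                         _infer(node.right, column_types)}
        if "String" in operand_types:
            return "String"      # e.g. concatenation
        if "DateTime" in operand_types:
            return "DateTime"    # date calculation (simplified rule)
        return "Numeric"         # plain arithmetic
    raise ValueError("unsupported expression")
```

With these assumed column types, `infer_type("Stock < 50")` gives the True/False type, and `infer_type("ProductHeight * ProductWidth * ProductLength")` gives Numeric.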

When a valid expression has been entered, a message will appear underneath it stating the assigned type, and the remaining fields in the dialog will change to reflect that type. These fields are as follows:

Field: Noun or Verb
Used with Data Types: DateTime, Numeric, String
Description: Type one or more terms from the NLP dictionary to associate with the expression. This allows the chosen terms to be used as part of NLP queries on the data. If more than one word is entered, they must be separated using pipe symbols.
Example: "width|breadth" to describe "ProductWidth"

Field: "True" Phrase
Used with Data Types: True/False
Description: Type a term to describe values that cause the expression to return True.
Example: "scarce" to describe "Stock < 50"

Field: Dimension
Used with Data Types: Numeric
Description: Use the drop-down menu to select the physical quantity that best matches the value returned by the expression.
Example: "volume" to describe "ProductHeight * ProductWidth * ProductLength"

Field: Units
Used with Data Types: Numeric
Description: Use the drop-down menu to select the unit of measurement for the chosen Dimension.
Example: "yard" to describe "length"

Field: Direction
Used with Data Types: Numeric
Description: Use the drop-down to further clarify the numeric measurement. You will only see this field if the chosen Dimension requires it.
Example: "age" or "interval" to describe a "time" expression.
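To illustrate the pipe-separated format used by the Noun or Verb field, the following sketch splits an entry into individual terms and maps each term back to its expression. The `parse_terms` and `build_term_lookup` functions are hypothetical helpers written for this example, not KnowledgeKube functions.

```python
def parse_terms(field_value):
    """Split a Noun or Verb entry such as 'width|breadth' into clean terms."""
    return [term.strip() for term in field_value.split("|") if term.strip()]

def build_term_lookup(entries):
    """Map every term to the expression it describes."""
    lookup = {}
    for expression, field_value in entries.items():
        for term in parse_terms(field_value):
            lookup[term.lower()] = expression
    return lookup

lookup = build_term_lookup({"ProductWidth": "width|breadth"})
```

Both "width" and "breadth" now resolve to ProductWidth in the resulting lookup.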

Configuring a string expression using a data source field called ProductName.

The Examples field at the bottom of the dialog will update automatically each time you make a change to the expression or any of the other fields. It displays sample NLP queries compatible with the expression; these samples are shown to anyone who submits an invalid query against the NLP data. Although you can modify this text if you like, you should ensure what you type is a valid sample, or you risk confusing end users. Also be aware that whatever you type here will be overwritten if you make any further changes to the expression.

Message presented to users who type an invalid NLP query. Note the suggested examples in the lower half of the message.