Forming a Query

Overview

A search on the Parametric Search Appliance can be as simple or as complex as you need it to be. Usually you need only enter a few words that best describe what you are trying to locate. For more complicated searches, you can combine logic operators, special pattern matchers, concept expansion, and proximity operations.

Ranking Factors

The ranking algorithm takes into consideration relative word ordering, word proximity, database frequency, document frequency, and position in text. The relative importance of these factors in computing the quality of a hit can be altered under Ranking Factors on the Options page.
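
As a rough illustration of how adjustable factor weights can change the quality of a hit, here is a minimal Python sketch. It is not the appliance's actual algorithm, which this page does not describe. The factor names come from the list above; the function name, the 0-to-1 score scale, and the weighted-average formula are all assumptions made for the example.

```python
# Factor names taken from the paragraph above; everything else is hypothetical.
FACTORS = (
    "word_ordering",
    "word_proximity",
    "database_frequency",
    "document_frequency",
    "position_in_text",
)


def rank_score(factor_scores, weights):
    """Return a weighted average of per-factor scores.

    Each score is assumed to lie between 0 and 1. Raising a factor's
    weight makes that factor count for more in the final result.
    """
    total_weight = sum(weights[name] for name in FACTORS)
    if total_weight == 0:
        return 0.0
    weighted = sum(weights[name] * factor_scores[name] for name in FACTORS)
    return weighted / total_weight
```

With equal weights every factor counts the same. Raising one weight, much as you would under Ranking Factors on the Options page, pulls the score toward that factor.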

Natural Language Query

You may enter a query in the form of a sentence or question. The software automatically identifies the important words and phrases in the query and removes the "noise words."

Example:
What is the state of the art in text retrieval?

The software will search for:
state of the art AND text AND retrieval
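
The reduction shown above can be sketched in a few lines of Python. This is only an illustration of the idea, not the software's own method: the noise word list and the phrase list below are hypothetical stand-ins, chosen so that the example sentence reduces the same way.

```python
import re

# Hypothetical lists for illustration; the real lists are not documented here.
NOISE_WORDS = {"what", "is", "the", "of", "in", "a", "an"}
KNOWN_PHRASES = ["state of the art"]


def key_terms(query):
    """Return the phrases and words left after dropping noise words."""
    words = re.findall(r"\w+", query.lower())
    phrases = [phrase.split() for phrase in KNOWN_PHRASES]
    terms = []
    i = 0
    while i < len(words):
        for phrase in phrases:
            # Keep a known phrase whole, even if it contains noise words.
            if words[i:i + len(phrase)] == phrase:
                terms.append(" ".join(phrase))
                i += len(phrase)
                break
        else:
            if words[i] not in NOISE_WORDS:
                terms.append(words[i])
            i += 1
    return terms


print(" AND ".join(key_terms("What is the state of the art in text retrieval?")))
```

Phrases are checked before noise words so that "state of the art" survives even though "of" and "the" would otherwise be dropped.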

Keywords, Phrases, and Wildcards

Locating words is as easy as typing them in, just like in a word processor. Letter case is ignored.

The wild-card character * (asterisk) may be used to match just the prefix of a word or to skip over whatever appears in the middle of an item.

If the desired item is more complicated than the simple * wild-card can accomplish, try using the Regular Expression Matcher.

To locate a number of adjacent words in a specific order, surround them with " (double quotation mark) characters. Putting a - (hyphen) between words will also force order and one-word proximity.

Keyword Examples

Query            Locates
john             john, John
"john public"    John Public
web-browser      Web browser, web-browser
John*Public      John Q. Public, John Public
456*a*def        1-456-789-ABCDEF
activate         activate, activation, activated, ... *
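
To make the * wild-card examples concrete, the following Python sketch translates a wild-card query into a regular expression. It assumes that * stands for any run of characters and that letter case is ignored, which is consistent with the examples in the table but is not a description of the appliance's internal matching. The function name is hypothetical.

```python
import re


def wildcard_to_regex(pattern):
    """Compile a * wild-card query into a case-insensitive regex.

    Assumption: each * may stand for any run of characters (possibly empty).
    All other characters are matched literally.
    """
    literal_parts = [re.escape(piece) for piece in pattern.split("*")]
    return re.compile(".*?".join(literal_parts), re.IGNORECASE)


for query, text in [
    ("John*Public", "John Q. Public"),
    ("456*a*def", "1-456-789-ABCDEF"),
    ("John*Public", "Jane Public"),
]:
    found = wildcard_to_regex(query).search(text) is not None
    print(query, "in", repr(text), "->", found)
```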

Invoking Thesaurus Expansion

The Parametric Search Appliance has a vocabulary of over 250,000 word and phrase associations. Each entry is generally classifiable by either its meaning or part of speech.

Depending on the administrator's Synonyms setting for this profile, synonyms may already be included for each term in your query. If not, synonyms may be included for individual terms within your query by preceding them with a ~ (tilde) character.
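
The effect of the ~ (tilde) can be pictured as replacing a single term with a list of that term and its synonyms. The Python sketch below shows the idea using a tiny, made-up thesaurus and the parenthesized, comma-separated list notation this page uses for sets. The entries, the function name, and the rewriting approach are assumptions for illustration, not the appliance's vocabulary or implementation.

```python
# Hypothetical entries; the appliance's own vocabulary is far larger.
THESAURUS = {
    "car": ["automobile", "auto"],
    "fast": ["quick", "rapid"],
}


def expand_query(query):
    """Rewrite each ~term as a list of the term plus its synonyms."""
    expanded = []
    for term in query.split():
        if term.startswith("~") and len(term) > 1:
            word = term[1:]
            synonyms = THESAURUS.get(word.lower(), [])
            if synonyms:
                expanded.append("(" + ",".join([word] + synonyms) + ")")
            else:
                # No known synonyms: search for the word by itself.
                expanded.append(word)
        else:
            expanded.append(term)
    return " ".join(expanded)


print(expand_query("~fast car"))
```

Only the term marked with ~ is expanded; unmarked terms pass through unchanged.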

Applying Search Logic

Texis and Metamorph use set logic for text queries. Set logic is easier to use and more capable than Boolean logic. The examples below refer to single keywords, but keep in mind that each keyword can represent an entire list of things or any of the special pattern matchers.

Sets (or lists) of things are specified by placing the elements within parentheses, separated by commas. Example: (bob,joe,sam,sue). In the examples below, a list like this could replace any of the keywords.

The default behavior of the search is to locate an intersection (or 'AND') of every element within a query. This means that the query "microsoft bob interface" is equivalent to the Boolean query "microsoft AND bob AND interface."

  • - (without): The - (minus) is the most commonly used logic symbol. It means the answer should EXCLUDE references to that item.
  • + (mandatory): The + (plus) symbol in front of a search item means that the answer MUST INCLUDE that item. This is generally used in conjunction with the permutation operation.
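  • @N (permute): The @ followed by a number N sets how many intersections must be found among the terms in your query. One intersection is two terms occurring together, so @1 accepts any two of the terms, as the examples below show. This may be confusing at first, but it is very powerful.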

Search Logic Examples

Query              Finds
bob sam joe        Bob with Sam and Joe
bob sam -joe       Bob with Sam without Joe
bob sam joe @1     Bob with Sam, or Bob with Joe, or Joe with Sam
A B C D @1         AB or AC or AD or BC or BD or CD
+A B C D @1        ABC or ABD or ACD
A B C -D @1        (AB or AC or BC) without D
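
The examples above can be reproduced with a small Python model of the three symbols. This is a sketch inferred from the table, not the Texis or Metamorph implementation. It assumes that @N requires at least N+1 of the unmarked terms to be present, that a query with no @ requires all unmarked terms, and that each term is a single keyword. The function names are hypothetical.

```python
def parse_query(query):
    """Split a query into mandatory, excluded, and unmarked terms plus @N."""
    required, excluded, unmarked = set(), set(), set()
    intersections = None
    for token in query.lower().split():
        if token.startswith("@") and token[1:].isdigit():
            intersections = int(token[1:])
        elif token.startswith("+"):
            required.add(token[1:])
        elif token.startswith("-"):
            excluded.add(token[1:])
        else:
            unmarked.add(token)
    return required, excluded, unmarked, intersections


def matches(query, words_present):
    """Return True if a text containing words_present satisfies the query."""
    required, excluded, unmarked, intersections = parse_query(query)
    present = {word.lower() for word in words_present}
    if not required <= present:      # every + term must appear
        return False
    if excluded & present:           # no - term may appear
        return False
    # Assumption: N intersections means N + 1 unmarked terms together.
    needed = len(unmarked) if intersections is None else intersections + 1
    return len(unmarked & present) >= needed


print(matches("bob sam joe @1", {"bob", "joe"}))   # two of the three terms
print(matches("+A B C D @1", {"b", "c", "d"}))     # mandatory A is missing
```

The mandatory and excluded terms are checked first and do not count toward the @N total, which is what makes "+A B C D @1" yield ABC, ABD, or ACD rather than any pair.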

Using the Special Pattern Matchers

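These pattern matchers are used to locate hard-to-find items within text. The examples below cover three of them: the approximate matcher (% prefix), the regular expression matcher (/ prefix), and the numeric matcher (# prefix).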

If improperly used, these pattern matchers can slow queries. They therefore require at least one other keyword in the query and are disabled entirely under Page proximity.

For more details, see the Vortex manual on Query Protection.

Special Pattern Examples

Query                     Matcher    Finds
ronald %regan             Approx     Ronald Raygun, Ronald Re-an, Ronald 8eagan
%75MYPARTNO9045d/6a       Approx     Anything within 75% of looking like MYPARTNO9045d/6a
/19[789][0-9]             RegEXpr    1970-1999
/[1-9]{3}\-=[0-9]{4}      RegEXpr    Phone numbers: 555-1212, 820-2200
#87                       Numeric    four score and seven, 87
#>0<1                     Numeric    Fractions like 9/16, 55%, 0.123, 15 nanoseconds
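
The rule stated earlier in this section, that a special pattern matcher needs at least one other keyword in the query, can be sketched as a simple check. The prefixes and matcher names come from the table above. The function names are hypothetical, and the check treats every term without one of those prefixes as a plain keyword, which is a simplification. See the Vortex manual on Query Protection for the actual rules.

```python
# Prefix-to-matcher mapping as shown in the Special Pattern Examples table.
MATCHER_PREFIXES = {"%": "Approx", "/": "RegEXpr", "#": "Numeric"}


def classify(term):
    """Name the matcher a term would invoke, or 'Keyword' for a plain term."""
    return MATCHER_PREFIXES.get(term[0], "Keyword")


def is_allowed(query):
    """Return False if the query uses a special matcher with no keyword."""
    kinds = [classify(term) for term in query.split()]
    uses_matcher = any(kind != "Keyword" for kind in kinds)
    has_keyword = "Keyword" in kinds
    return has_keyword or not uses_matcher


for query in ["ronald %regan", "%regan", "/19[789][0-9]"]:
    print(repr(query), "->", "allowed" if is_allowed(query) else "needs a keyword")
```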

Rules of Thumb

If you get too many junk or nonsense answers, try:

  • Add some more words to your query.
  • Decrease the range of the Proximity control.
  • Change the Word Forms control to Exact.
  • Look at the Match Info to see why those answers are showing up.
  • Use the Exclusion Operator (-) to remove unwanted terms.
  • If you are searching for a phrase, hyphenate the words together.

If you don't get any answers, or just too few:

  • Remove some words from your query.
  • Examine your spelling.
  • Increase the scope of the Proximity control.
  • Consider that the item may simply not be there.