Search Query Syntax

Scout uses SQLite FTS5 for full-text search. The q parameter on any search endpoint accepts the full FTS5 query syntax, giving you powerful tools for building precise queries.

Note

All search queries operate on the content field only. The identifier field is used internally for upsert logic and is excluded from search matching and scoring. You never need to specify a column name in your queries.

The examples below assume you have a client initialized:

from scout.client import Scout
scout = Scout('http://localhost:8000')

Most examples use search(), which searches across all indexes (or a specified subset). To restrict a search to a single index, use get_index() instead — the query syntax is identical.

Quick Reference

Feature         Syntax            Example
--------------  ----------------  --------------------------------
Single term     word              q='python'
Implicit AND    word1 word2       q='python web'
OR              word1 OR word2    q='flask OR django'
NOT             word1 NOT word2   q='python NOT java'
Phrase          "word1 word2"     q='"web framework"'
Prefix          prefix*           q='frame*'
NEAR            NEAR(w1 w2, N)    q='NEAR(python web, 5)'
Initial token   ^word             q='^python'
Grouping        (expr)            q='(flask OR django) AND python'
All documents   *                 q='*'

Simple Queries

The simplest query is a single word. It matches any document whose content contains that word (after stemming; see Stemming below).

results = scout.search('python')

Multiple words are joined with an implicit AND, so all terms must be present in the document:

results = scout.search('python web framework')

This returns only documents that contain all three of the words python, web, and framework.

To restrict a search to specific indexes, pass index as a string or list:

# Single index
results = scout.search('python', index='my-index')

# Multiple indexes
results = scout.search('python', index=['idx1', 'idx2'])

You can also use get_index() to search within a single index:

results = scout.get_index('my-index', q='python')

All Documents

Use the wildcard query '*' to match every document, which lets you filter purely on metadata:

results = scout.search('*', category='tutorial')

Alternatively, you can omit the query and filter the documents of a single index or of all indexes:

# Just in "my-index".
results = scout.get_index('my-index', category='tutorial')

# All documents.
results = scout.get_documents(category='tutorial')

Boolean Operators

FTS5 supports three boolean operators: AND, OR, and NOT. Operators must be UPPERCASE.

OR: match documents containing either term:

results = scout.search('flask OR django')

NOT: exclude documents containing a term:

results = scout.search('python NOT javascript')

AND: explicitly require both terms (this is the default, so python AND web is equivalent to python web):

results = scout.search('python AND web')

Use parentheses to group sub-expressions:

results = scout.search('(flask OR django) AND python')

This matches documents that contain python and at least one of flask or django.

A more complex example:

results = scout.search('(flask OR django) NOT javascript')
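The uppercase requirement can be checked against FTS5 directly. The sketch below uses Python's built-in sqlite3 module with a stand-in table named docs (an assumption; Scout's actual schema is not shown here) and assumes the local SQLite build includes FTS5. Written in lowercase, or is parsed as an ordinary term, so the query becomes an implicit AND of three words:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Stand-in table for illustration only, not Scout's real schema.
conn.execute(
    "CREATE VIRTUAL TABLE docs USING fts5(content, tokenize='porter unicode61')"
)
conn.executemany(
    "INSERT INTO docs(content) VALUES (?)",
    [("flask tutorial",), ("django guide",)],
)


def match(query):
    """Return the content of every row matching an FTS5 query."""
    rows = conn.execute(
        "SELECT content FROM docs WHERE docs MATCH ? ORDER BY rowid", (query,)
    )
    return [row[0] for row in rows]


# Uppercase OR is the boolean operator: either term may match.
upper = match("flask OR django")

# Lowercase "or" is a search term: flask AND or AND django must all be present.
lower = match("flask or django")

print(upper)
print(lower)
```

If a query with OR unexpectedly returns nothing, operator case is a likely cause.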

Phrase Queries

Wrap terms in double quotes to require an exact sequence of tokens:

results = scout.search('"web framework"')

The query "web framework" matches the text 'a web framework for python' but not 'the framework is web-based', because in the second text the tokens web and framework do not appear next to each other in that order.

Phrases can be combined with boolean operators:

results = scout.search('"web framework" OR "REST API"')

Prefix Queries

Append * to a token to match any word that starts with that prefix:

results = scout.search('frame*')

This matches framework, frameworks, framed, etc.

Note

Scout’s FTS index is configured with prefix = [2, 3], meaning 2- and 3-character prefixes are pre-indexed for fast lookup. Longer prefixes work too, but in FTS5 a prefix index only accelerates queries whose prefix length it covers, so longer prefixes are resolved from the main index instead.

Prefix queries combine naturally with other features:

results = scout.search('pyth* NOT javascript')

NEAR Queries

The NEAR operator matches documents where two or more terms appear within a specified distance (in tokens) of each other:

results = scout.search('NEAR(python web, 3)')

This matches documents where python and web are within 3 tokens of each other. The default distance (when omitted) is 10:

results = scout.search('NEAR(python web)')
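The distance counts the tokens that sit between the matched terms. The following sketch illustrates this against FTS5 directly with Python's sqlite3 module; the docs table and its sample sentence are made up for the illustration, and the local SQLite build is assumed to include FTS5:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Stand-in table for illustration only, not Scout's real schema.
conn.execute(
    "CREATE VIRTUAL TABLE docs USING fts5(content, tokenize='porter unicode61')"
)
# Three tokens ("is", "a", "great") separate python and web.
conn.execute(
    "INSERT INTO docs(content) VALUES ('python is a great web language')"
)


def count_matches(query):
    """Return how many rows match an FTS5 query."""
    sql = "SELECT count(*) FROM docs WHERE docs MATCH ?"
    return conn.execute(sql, (query,)).fetchone()[0]


# Up to 3 intervening tokens are allowed, so the document matches.
within_3 = count_matches("NEAR(python web, 3)")

# Only 2 intervening tokens are allowed, so it does not.
within_2 = count_matches("NEAR(python web, 2)")

print(within_3, within_2)
```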

Initial Token Queries

The ^ operator matches only if the token appears at the very beginning of the content field:

results = scout.search('^python')

This matches python web framework but not learning python basics.

Stemming

Scout uses the Porter stemmer with the unicode61 tokenizer. This means queries automatically match morphological variants of words:

  • run matches running, runs

  • believe matches believes, believed, believing

  • connection matches connected, connect, connecting, connects, etc.

Stemming is applied to both document content at index time and to query terms at search time, so you do not need to worry about exact word forms.
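The effect of stemming can be reproduced outside Scout. This is a minimal sketch using Python's sqlite3 module and an FTS5 table with the same tokenizer combination; the table name docs and the sample rows are assumptions for illustration, and the local SQLite build is assumed to include FTS5:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Same tokenizer combination as described above; the schema itself is a stand-in.
conn.execute(
    "CREATE VIRTUAL TABLE docs USING fts5(content, tokenize='porter unicode61')"
)
conn.executemany(
    "INSERT INTO docs(content) VALUES (?)",
    [
        ("running shoes",),
        ("he runs daily",),
        ("the runway lights",),
        ("connected devices",),
    ],
)


def match(query):
    """Return the content of every row matching an FTS5 query."""
    rows = conn.execute(
        "SELECT content FROM docs WHERE docs MATCH ? ORDER BY rowid", (query,)
    )
    return [row[0] for row in rows]


# run, runs, and running share a stem; runway is a different word.
run_hits = match("run")

# connection and connected share the stem "connect".
connection_hits = match("connection")

print(run_hits)
print(connection_hits)
```

Stemming is not substring matching: runway contains the letters run but has a different stem, so it is not returned.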

Case Sensitivity

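Search terms are case-insensitive. The queries Python, python, and PYTHON all return the same results. The boolean operators are the exception: AND, OR, and NOT must be uppercase, as noted under Boolean Operators.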

Combining Features

All of the above features can be combined freely:

# Phrase + boolean + prefix
results = scout.search('"web framework" OR pyth*')

# NEAR + NOT
results = scout.search('NEAR(python web, 5) NOT django')

# Initial token + boolean grouping
results = scout.search('^python AND (flask OR django)')

# Complex grouped expression
results = scout.search('(flask OR django) AND "REST API" NOT legacy')

Combined with metadata filters:

# Using search() across all indexes
results = scout.search(
    'python OR javascript',
    category='tutorial',
    level='beginner')

# Using get_index() to also restrict to a single index
results = scout.get_index(
    'my-index',
    q='python OR javascript',
    category='tutorial',
    level='beginner')

For the full set of metadata filter operations, see Filtering on Metadata.

Error Handling

If you send a malformed query (unbalanced quotes, dangling operators, etc.), Scout returns a 400 Bad Request with a JSON error message:

{"error": "Invalid search query: unterminated string"}

Common mistakes:

  • "unclosed quote - unterminated double quote

  • AND OR foo - consecutive operators

  • (foo AND bar - unbalanced parentheses

  • NOT - operator with no operand
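Some of these mistakes can be caught on the client before a request is sent. The sketch below is a hypothetical helper, find_query_problems, which is not part of Scout or its client. It only checks for unbalanced double quotes and parentheses; it does not detect dangling or consecutive operators, so the server's 400 response remains the authoritative check.

```python
def find_query_problems(q: str) -> list[str]:
    """Return a list of obvious structural problems in a query string.

    Hypothetical pre-flight check: covers quotes and parentheses only.
    """
    problems = []

    # An odd number of double quotes means a phrase was never closed.
    if q.count('"') % 2:
        problems.append("unbalanced double quote")

    depth = 0
    in_quote = False
    for ch in q:
        if ch == '"':
            in_quote = not in_quote
        elif not in_quote:
            # Parentheses inside a quoted phrase are not grouping syntax.
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
                if depth < 0:
                    break  # closing parenthesis with no opener
    if depth != 0:
        problems.append("unbalanced parentheses")

    return problems


print(find_query_problems('"unclosed quote'))
print(find_query_problems("(foo AND bar"))
print(find_query_problems("(flask OR django) AND python"))
```

An empty list means only that these two checks passed, not that the query is valid.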

Ranking

By default, search results are ranked using the BM25 algorithm, which is built into SQLite FTS5. Documents that are a better match for your query appear first.

Each document in the response includes a score field. Scores are negative, and lower (more negative) values indicate better matches:

{
  "content": "python web framework flask tutorial",
  "score": -1.4206,
  ...
}

This convention comes from SQLite FTS5’s built-in rank column, which returns negative BM25 values so that a simple ascending sort puts the best matches first. A score of -2.98 is a better match than -0.02.
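If you re-sort results on the client, keep the sign convention in mind. A small sketch with made-up documents (only the score field mirrors the response format above; the content values are invented for illustration):

```python
# Illustrative hits; real responses carry additional fields.
hits = [
    {"content": "weak match", "score": -0.02},
    {"content": "strong match", "score": -2.98},
    {"content": "medium match", "score": -1.4206},
]

# A plain ascending sort puts the most negative (best) score first.
best_first = sorted(hits, key=lambda doc: doc["score"])

for doc in best_first:
    print(doc["score"], doc["content"])
```

Sorting in descending order would put the weakest match first.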

You can control ranking with the ranking parameter:

# Default BM25 ranking (best match first)
results = scout.search('python', ranking='bm25')

# No ranking — results returned in insertion order, score omitted
results = scout.search('python', ranking='none')

You can also control sort order with the ordering parameter:

# Sort by score (best match first — this is the default with BM25)
results = scout.search('python', ordering='score')

# Sort by ID descending (newest first)
results = scout.search('python', ordering='-id')

# Sort by content alphabetically
results = scout.search('python', ordering='content')

Valid ordering choices: id, identifier, content, and score (score is available only when a search query is present). Prefix a field with - for descending order.