Examples

This document walks through two complete examples showing how to use Scout in practice. Both examples use the Python client, but every operation has an equivalent curl call documented in the Scout Server reference.

Example 1: Personal Blog

Consider a personal blog with two kinds of searchable content: entries and comments. We will create a separate index for each, populate them with documents, attach images to entries, and then search and filter the results.

Setting up

Start the Scout server (assumed here to listen on http://localhost:8000), then create a client:

from scout.client import Scout

scout = Scout('http://localhost:8000')

Create the two indexes:

scout.create_index('blog-entries')
scout.create_index('blog-comments')

Indexing blog entries

Each blog entry becomes a document whose content is the full text of the post. Metadata stores structured fields we will want to filter on later:

scout.create_document(
    'Welcome to my blog!  This is my very first post, where I introduce '
    'myself and talk about what I plan to write about.',
    'blog-entries',
    identifier='entry-1',
    title='Welcome to my blog!',
    url='/blog/welcome/',
    published='true',
    date='2026-01-15')

scout.create_document(
    'Today my cat ate a spider and later he was sick.',
    'blog-entries',
    identifier='entry-2',
    title='Spider Adventures',
    url='/blog/spiders/',
    published='true',
    date='2026-02-03')

scout.create_document(
    'My cat ate another spider, I could tell because there were some '
    'legs on his bed. He was not sick, so it must have been a different '
    'type of spider',
    'blog-entries',
    identifier='entry-3',
    title='More Spider News',
    url='/blog/more-spiders/',
    published='true',
    date='2026-03-20')

# A draft that has not been published yet.
scout.create_document(
    'Draft post about pest control...still fleshing out ideas here.',
    'blog-entries',
    identifier='entry-4',
    title='Pest Control (draft)',
    url='/blog/pest-control/',
    published='false',
    date='2026-04-01')
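
In a real blog these arguments would usually come from application data rather than literals. The following is a minimal sketch of that mapping, not part of Scout: the post dictionary, its keys (id, slug, title, body, is_published, date), and the helpers entry_metadata and index_post are all hypothetical names chosen for illustration.

```python
def entry_metadata(post):
    """Build the create_document keyword arguments for one blog post.

    `post` is a hypothetical dict from the application's own database;
    its keys are assumptions made for this sketch.
    """
    return {
        'identifier': 'entry-%s' % post['id'],
        'title': post['title'],
        'url': '/blog/%s/' % post['slug'],
        # The examples above store flags as the strings 'true' / 'false',
        # so the boolean is spelled out the same way here.
        'published': 'true' if post['is_published'] else 'false',
        'date': post['date'],
    }


def index_post(client, post, index='blog-entries'):
    """Index one post using any client that has create_document()."""
    return client.create_document(post['body'], index, **entry_metadata(post))


post = {
    'id': 2,
    'slug': 'spiders',
    'title': 'Spider Adventures',
    'body': 'Today my cat ate a spider and later he was sick.',
    'is_published': True,
    'date': '2026-02-03',
}
print(entry_metadata(post)['identifier'])  # entry-2
```

With the client created earlier, index_post(scout, post) would issue the same call as the second literal example above.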

Attaching images

If your blog entries have a primary image, you can attach it to the document so that search results can display a thumbnail:

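# Open each file in a with block so the handle is closed after the upload.
with open('/path/to/spider.jpg', 'rb') as fh:
    scout.attach_files('entry-2', {'spider.jpg': fh})

with open('/path/to/spider-remnants.png', 'rb') as fh:
    scout.attach_files('entry-3', {'spider-remnants.png': fh})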

When you retrieve or search documents later, each result will include an attachments list with download URLs.
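
When an entry has several images, contextlib.ExitStack can open them all and guarantee that every handle is closed once the upload finishes. This is only a sketch: attach_paths is a hypothetical helper, and it assumes attach_files reads the files before it returns.

```python
import contextlib
import os


def attach_paths(client, identifier, paths):
    """Attach files by path, closing every file handle afterwards.

    `client` is any object with an attach_files(identifier, files) method,
    such as the Scout client used above. This helper is not part of Scout.
    """
    with contextlib.ExitStack() as stack:
        # Map each filename to an open binary handle; the stack closes
        # all of them when the with block exits, even on error.
        files = {
            os.path.basename(path): stack.enter_context(open(path, 'rb'))
            for path in paths
        }
        return client.attach_files(identifier, files)
```

A call such as attach_paths(scout, 'entry-2', ['/path/to/spider.jpg']) would then replace the manual open() calls.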

Indexing comments

Comments are stored in a separate index. The metadata includes the parent entry’s identifier (for filtering) and a spam flag:

scout.create_document(
    'Looking forward to the content',
    'blog-comments',
    identifier='comment-1',
    entry_id='entry-1',
    author='alice',
    spam='false',
    date='2026-01-16')

scout.create_document(
    'What did the spider look like?',
    'blog-comments',
    identifier='comment-2',
    entry_id='entry-2',
    author='bob',
    spam='false',
    date='2026-02-04')

scout.create_document(
    'Buy cheap watches at http://example.com',
    'blog-comments',
    identifier='comment-3',
    entry_id='entry-2',
    author='spambot',
    spam='true',
    date='2026-02-04')

Searching entries

Full-text search over all published entries:

results = scout.get_documents(q='spiders', index='blog-entries', published='true')
for doc in results['documents']:
    print(doc['metadata']['title'], '-', doc['metadata']['url'], doc['score'])
# Spider Adventures - /blog/spiders/ -0.268...
# More Spider News - /blog/more-spiders/ -0.252...

Search with a wildcard to match prefixes:

results = scout.get_documents(q='spid*', index='blog-entries')
for doc in results['documents']:
    print(doc['metadata']['title'])
# Spider Adventures
# More Spider News

Filtering by date range (all entries from February 2026 onward):

results = scout.get_documents(
    index='blog-entries',
    published='true',
    date__ge='2026-02-01')
for doc in results['documents']:
    print(doc['metadata']['date'], doc['metadata']['title'])
# 2026-02-03 Spider Adventures
# 2026-03-20 More Spider News
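
The range filter above depends on dates being written as YYYY-MM-DD, a format in which alphabetical order and chronological order coincide. The sketch below checks that property and builds the filter arguments from date objects. The date_range helper is hypothetical, and the sketch assumes metadata values are compared as the strings that were stored.

```python
from datetime import date


def date_range(field, start=None, end=None):
    """Return filter keyword arguments for an inclusive date range.

    Uses the `__ge` and `__le` suffixes shown in this document. The helper
    name and signature are hypothetical.
    """
    filters = {}
    if start is not None:
        filters['%s__ge' % field] = start.isoformat()
    if end is not None:
        filters['%s__le' % field] = end.isoformat()
    return filters


# ISO 8601 dates sort identically as strings and as dates.
stamps = ['2026-03-20', '2026-01-15', '2026-02-03']
assert sorted(stamps) == sorted(stamps, key=date.fromisoformat)

print(date_range('date', start=date(2026, 2, 1)))
# {'date__ge': '2026-02-01'}
```

The result can be unpacked into a query, for example scout.get_documents(index='blog-entries', **date_range('date', start=date(2026, 2, 1))).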

Exclude drafts from results:

results = scout.get_documents(index='blog-entries', published='true')
print(len(results['documents']))  # 3 (the draft is excluded)

Searching comments

Find all non-spam comments on a particular entry:

results = scout.get_documents(
    index='blog-comments',
    entry_id='entry-2',
    spam='false')
for doc in results['documents']:
    print(doc['metadata']['author'], ':', doc['content'])
# bob : What did the spider look like?

Search comments across all entries:

results = scout.get_documents(q='spiders', index='blog-comments', spam='false')
for doc in results['documents']:
    print(doc['metadata']['author'], 'on', doc['metadata']['entry_id'])
# bob on entry-2

Updating and deleting

Because every document was created with an identifier, we can update and delete without ever tracking Scout’s internal integer IDs.

Publish a draft by updating its metadata:

scout.update_document(
    document_id='entry-4',
    metadata={
        'title': 'Pest Control',
        'url': '/blog/pest-control/',
        'published': 'true',
        'date': '2026-04-10',
    })

Remove a spam comment:

scout.delete_document('comment-3')

Re-indexing content

Identifiers also give you upsert semantics: calling create_document with an identifier that already exists will update the existing document rather than creating a duplicate. This makes it safe to re-run your indexing script at any time — for example after editing a blog post:

# This updates entry-2 in place because the identifier already exists.
scout.create_document(
    'Today my cat ate a spider and later he was sick. I think it was '
    'one of those little, fast spiders.',
    'blog-entries',
    identifier='entry-2',
    title='Spider Adventures (updated)',
    url='/blog/spiders/',
    published='true',
    date='2026-02-03')

# Fetch the document by identifier and confirm it holds the new content.
doc = scout.get_document('entry-2')
print(doc['content'][-21:])  # little, fast spiders.

This pattern is convenient for a full re-index. Iterate over every post in your application database and call create_document with the same identifiers. Scout handles the create-or-update logic for you.
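
A full re-index along these lines might look like the sketch below. The posts list and its keys are hypothetical, and InMemoryClient is only a stand-in that imitates the create-or-update behaviour described above so the loop can run without a server.

```python
def reindex(client, posts, index='blog-entries'):
    """Re-send every post; identifiers make each call a create-or-update.

    `posts` is a hypothetical iterable of dicts with id, title, body keys.
    Returns the number of posts sent.
    """
    count = 0
    for post in posts:
        client.create_document(
            post['body'],
            index,
            identifier='entry-%s' % post['id'],
            title=post['title'])
        count += 1
    return count


class InMemoryClient:
    """Stand-in for the Scout client that mimics identifier upserts."""

    def __init__(self):
        self.documents = {}

    def create_document(self, content, indexes, identifier=None, **metadata):
        # Same identifier: overwrite the stored document, never duplicate.
        self.documents[identifier] = {'content': content, 'metadata': metadata}


client = InMemoryClient()
posts = [{'id': 1, 'title': 'Welcome', 'body': 'First post.'},
         {'id': 2, 'title': 'Spiders', 'body': 'My cat ate a spider.'}]
reindex(client, posts)
reindex(client, posts)  # Running twice leaves two documents, not four.
print(len(client.documents))  # 2
```

Passing the real scout client instead of InMemoryClient would send the same calls to the server.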

Example 2: News Website

A news website has several content types — articles, local events, and sports scores — each in its own index. A master index that contains every document allows site-wide search.

Setting up

from scout.client import Scout

scout = Scout('http://localhost:8000')

scout.create_index('articles')
scout.create_index('events')
scout.create_index('sports')
scout.create_index('master')  # Everything goes here too.

Indexing content

A helper function keeps things DRY by always adding documents to the master index alongside the category-specific index:

def index_content(content, category, **metadata):
    """Index a piece of content into its category index and the master index."""
    return scout.create_document(
        content,
        [category, 'master'],
        **metadata)

# Articles
index_content(
    'The city council voted Tuesday to approve the new downtown park '
    'proposal after months of public debate.',
    'articles',
    identifier='article-100',
    headline='City Council Approves Downtown Park',
    section='local',
    date='2024-06-11')

index_content(
    'Global markets rallied on Friday after the central bank signaled a '
    'pause in rate hikes.  Tech stocks led the gains.',
    'articles',
    identifier='article-101',
    headline='Markets Rally on Rate Pause Signal',
    section='business',
    date='2024-06-14')

index_content(
    'A new study shows that urban green spaces significantly improve '
    'mental health outcomes for nearby residents.',
    'articles',
    identifier='article-102',
    headline='Green Spaces Linked to Better Mental Health',
    section='science',
    date='2024-06-15')

# Local events
index_content(
    'The annual Summer Jazz Festival returns to Riverside Park on July 4th '
    'with headlining performances by several Grammy-winning artists.',
    'events',
    identifier='event-200',
    title='Summer Jazz Festival',
    venue='Riverside Park',
    date='2024-07-04')

index_content(
    'Downtown Farmers Market every Saturday morning from 8am to noon.  '
    'Fresh produce, baked goods, and local crafts.',
    'events',
    identifier='event-201',
    title='Downtown Farmers Market',
    venue='Main Street Plaza',
    date='2024-06-01',
    recurring='true')

# Sports scores
index_content(
    'The Lions defeated the Bears 27-14 in a dominant home performance.  '
    'Quarterback Smith threw for 3 touchdowns.',
    'sports',
    identifier='game-300',
    home_team='Lions',
    away_team='Bears',
    score='27-14',
    date='2024-06-09')

index_content(
    'Eagles and Hawks played to a 1-1 draw in a rain-soaked match.  '
    'Both goals came in the second half.',
    'sports',
    identifier='game-301',
    home_team='Eagles',
    away_team='Hawks',
    score='1-1',
    date='2024-06-10')

Searching within a category

Search only articles:

results = scout.get_documents(q='park', index='articles')
for doc in results['documents']:
    print(doc['metadata']['headline'])
# City Council Approves Downtown Park

Filter articles by section:

results = scout.get_documents(index='articles', section='business')
for doc in results['documents']:
    print(doc['metadata']['headline'])
# Markets Rally on Rate Pause Signal

Search only events at a particular venue:

results = scout.get_documents(index='events', venue='Riverside Park')
for doc in results['documents']:
    print(doc['metadata']['title'], '-', doc['metadata']['date'])
# Summer Jazz Festival - 2024-07-04

Filter sports results by home team:

results = scout.get_documents(index='sports', home_team='Lions')
for doc in results['documents']:
    print(doc['metadata']['home_team'], 'vs', doc['metadata']['away_team'],
          doc['metadata']['score'])
# Lions vs Bears 27-14
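
The filter above matches only games where the team played at home. One way to cover both sides is to run one query per metadata field and combine the results, as in this sketch. games_for_team and StubClient are hypothetical names; the stub exists only so the example runs without a server, and any paging of results is ignored.

```python
def games_for_team(client, team, index='sports'):
    """Return games where `team` played at home or away.

    Runs one metadata filter per field and concatenates the results.
    A team cannot be both home and away in one game, so no
    de-duplication is needed.
    """
    games = []
    for field in ('home_team', 'away_team'):
        results = client.get_documents(index=index, **{field: team})
        games.extend(results['documents'])
    return games


class StubClient:
    """Tiny stand-in for the Scout client, for demonstration only."""

    def __init__(self, documents):
        self.documents = documents

    def get_documents(self, index=None, **filters):
        # Keep documents whose metadata equals every requested filter.
        matches = [doc for doc in self.documents
                   if all(doc['metadata'].get(key) == value
                          for key, value in filters.items())]
        return {'documents': matches}


client = StubClient([
    {'metadata': {'home_team': 'Lions', 'away_team': 'Bears', 'score': '27-14'}},
    {'metadata': {'home_team': 'Eagles', 'away_team': 'Hawks', 'score': '1-1'}},
])
for game in games_for_team(client, 'Bears'):
    print(game['metadata']['home_team'], 'vs', game['metadata']['away_team'])
# Lions vs Bears
```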

Site-wide search

The master index lets you search across every content type at once:

results = scout.get_index('master', q='park')
# OR: results = scout.get_documents(q='park', index='master')
for doc in results['documents']:
    print(doc['indexes'], doc['content'][:60] + '...')
# ['articles', 'master'] The city council voted Tuesday to approve the new do...
# ['events', 'master']   The annual Summer Jazz Festival returns to Riverside ...

Using the documents endpoint with multiple indexes achieves the same thing without a dedicated master index:

results = scout.get_documents(
    q='park',
    index=['articles', 'events', 'sports'])
for doc in results['documents']:
    print(doc['content'][:60] + '...')

Date range queries work the same way across all indexes:

results = scout.get_index(
    'master',
    date__ge='2024-06-10',
    date__le='2024-06-15')
for doc in results['documents']:
    print(doc['metadata']['date'], doc['content'][:50] + '...')

Equivalent example specifying indexes explicitly:

results = scout.get_documents(
    index=['articles', 'events', 'sports'],
    date__ge='2024-06-10',
    date__le='2024-06-15')
for doc in results['documents']:
    print(doc['metadata']['date'], doc['content'][:50] + '...')
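
Because each site-wide result carries an indexes list, a results page can group hits by content type. The sketch below works on a hand-written sample shaped like the 'park' output printed earlier in this section; group_by_category is a hypothetical helper, not part of the client.

```python
from collections import defaultdict


def group_by_category(results, skip=('master',)):
    """Group search results by the category index each document is in.

    `results` is a dict with a 'documents' list whose items carry an
    'indexes' list. Index names listed in `skip` are left out.
    """
    groups = defaultdict(list)
    for doc in results['documents']:
        for name in doc['indexes']:
            if name not in skip:
                groups[name].append(doc)
    return dict(groups)


# Hand-written sample mirroring the 'park' search output.
sample = {'documents': [
    {'indexes': ['articles', 'master'], 'content': 'The city council voted...'},
    {'indexes': ['events', 'master'], 'content': 'The annual Summer Jazz...'},
]}
for name, docs in sorted(group_by_category(sample).items()):
    print(name, len(docs))
# articles 1
# events 1
```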

Working with attachments

Attach a PDF of the full print article:

# Open the file in a with block so the handle is closed after the upload.
with open('downtown-park-full.pdf', 'rb') as fh:
    scout.attach_files('article-100', {'downtown-park-full.pdf': fh})

Download the attachment to a local file:

raw_bytes = scout.download_attachment('article-100', 'downtown-park-full.pdf')
with open('downloaded-article.pdf', 'wb') as fh:
    fh.write(raw_bytes)

Later, find all PDF attachments across the articles index:

pdfs = scout.search_attachments(index='articles', mimetype='application/pdf')
for att in pdfs['attachments']:
    print(att['filename'], att['data_length'], 'bytes')
# downtown-park-full.pdf 84521 bytes

Using SearchSite for automatic indexing

If your application uses model classes, you can use SearchSite to automatically index and remove objects without manually calling create_document and delete_document:

from scout.client import Scout, SearchProvider, SearchSite

class ArticleProvider(SearchProvider):
    def content(self, article):
        return '%s %s' % (article.headline, article.body)

    def identifier(self, article):
        return 'article-%s' % article.id

    def metadata(self, article):
        return {
            'headline': article.headline,
            'section': article.section,
            'date': str(article.pub_date),
        }

scout = Scout('http://localhost:8000')
site = SearchSite(scout, 'articles')
site.register(Article, ArticleProvider)

# When a new article is created:
site.store(article)

# When an article is deleted:
site.remove(article)

This pattern works well inside ORM hooks (such as Peewee post_save and post_delete signals or Django signals) to keep the search index in sync with your database automatically.
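
The hook side of that arrangement can be as small as the sketch below. IndexSync and RecordingSite are hypothetical names: the first forwards save and delete notifications to any object with store and remove methods, and the second stands in for SearchSite so the example runs on its own. Handler signatures differ between frameworks, so the extra positional and keyword arguments are accepted and ignored; connecting the handlers to actual signals is framework-specific and left out.

```python
class IndexSync:
    """Forward save/delete notifications to a SearchSite-like object."""

    def __init__(self, site):
        self.site = site

    def on_save(self, sender, instance, *args, **kwargs):
        # Called after an object is saved: (re)index it.
        self.site.store(instance)

    def on_delete(self, sender, instance, *args, **kwargs):
        # Called after an object is deleted: drop it from the index.
        self.site.remove(instance)


class RecordingSite:
    """Stand-in for SearchSite that records calls instead of indexing."""

    def __init__(self):
        self.calls = []

    def store(self, obj):
        self.calls.append(('store', obj))

    def remove(self, obj):
        self.calls.append(('remove', obj))


site = RecordingSite()
sync = IndexSync(site)
sync.on_save(None, 'article-1', True)
sync.on_delete(None, 'article-1')
print(site.calls)
# [('store', 'article-1'), ('remove', 'article-1')]
```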
