Search

Build user-facing Elasticsearch searches with Sigmie — typo tolerance, faceted navigation, highlighting, semantic search, and filter-parser syntax.

newSearch() is the high-level entry point for user-facing search: typo tolerance, faceting, highlighting, weighting, semantic matching, all in one fluent chain.

For lower-level access to Elasticsearch’s boolean query DSL, see Advanced Queries.

use Sigmie\Mappings\NewProperties;
 
$props = new NewProperties;
$props->name();
$props->text('description');
 
$results = $sigmie->newSearch('fairy-tales')
->properties($props)
->queryString('snow white')
->get();

A typical search needs two things: the properties (so Sigmie knows how to query each field) and the query string.

Query string

The user input you’re searching for:

$sigmie->newSearch('fairy-tales')
->properties($props)
->queryString('snow white')
->get();

Add multiple query strings with different weights to bias the score:

$sigmie->newSearch('characters')
->properties($props)
->queryString('Mickey', weight: 2)
->queryString('Goofy', weight: 1)
->get();

Limit which fields are searched

By default, every searchable field in your properties is queried. Narrow to specific fields with fields():

$sigmie->newSearch('fairy-tales')
->properties($props)
->queryString('Snow White')
->fields(['name']) // only search `name`
->get();

Limit which fields are returned

Reduce response size by selecting only the fields you need:

$sigmie->newSearch('fairy-tales')
->properties($props)
->queryString('Snow White')
->retrieve(['name', 'description'])
->get();

Filter

The filter parser reads filters in a human-friendly syntax:

$sigmie->newSearch('fairy-tales')
->properties($props)
->queryString('Sleeping Beauty')
->filters('stock>0 AND is:active AND NOT category:"Drama"')
->get();

Filters narrow the result set but don’t affect relevance scoring.

Sort

$sigmie->newSearch('fairy-tales')
->properties($props)
->queryString('Snow White')
->sort('_score:desc name:asc')
->get();

_score:desc is the default. _score:asc is not allowed. See Sort Parser for full syntax.

Typo tolerance

$sigmie->newSearch('fairy-tales')
->properties($props)
->queryString('Sleping Buety') // typos OK
->typoTolerance()
->get();

The default policy: one typo allowed for terms 3+ characters long, two typos for 6+. Override the thresholds:

->typoTolerance(oneTypoChars: 4, twoTypoChars: 8)
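
To make the threshold policy concrete, here is a minimal sketch of the rule as described above. allowedTypos() is a hypothetical helper written for illustration only; it is not part of Sigmie, and the library's internal implementation may differ.

```php
<?php

/**
 * Hypothetical helper (not a Sigmie API): how many typos the policy
 * described above would allow for a term of the given length.
 */
function allowedTypos(int $termLength, int $oneTypoChars = 3, int $twoTypoChars = 6): int
{
    if ($termLength >= $twoTypoChars) {
        return 2; // long terms tolerate two typos
    }

    if ($termLength >= $oneTypoChars) {
        return 1; // medium-length terms tolerate one typo
    }

    return 0; // short terms must match exactly
}

// strlen() counts bytes, which equals characters for these ASCII terms.
$defaults  = allowedTypos(strlen('Sleping'));       // 7 characters, thresholds 3 and 6
$overrides = allowedTypos(strlen('Sleping'), 4, 8); // 7 characters, thresholds 4 and 8
```

With the default thresholds the 7-character term falls in the two-typo band; with the stricter 4 and 8 thresholds it drops to the one-typo band.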

Restrict typos to specific fields:

$sigmie->newSearch('fairy-tales')
->properties($props)
->queryString('Sleping Buety')
->typoTolerance()
->typoTolerantAttributes(['name'])
->get();

Highlight matches

Wrap matching tokens in HTML for direct display:

$sigmie->newSearch('fairy-tales')
->properties($props)
->queryString('sleeping beauty')
->highlighting(
    ['name'],
    prefix: '<mark>',
    suffix: '</mark>',
)
->get();

The default prefix and suffix are <em> and </em>.

Weight fields

Give certain fields more influence on relevance:

$sigmie->newSearch('fairy-tales')
->properties($props)
->queryString('sleeping beauty')
->weight(['name' => 4, 'description' => 1])
->get();

A match in name now scores 4× higher than the same match in description.

Minimum score

Drop low-relevance results:

$sigmie->newSearch('fairy-tales')
->properties($props)
->queryString('Mickey')
->weight(['name' => 5])
->minScore(2)
->get();

Paginate

$sigmie->newSearch('fairy-tales')
->properties($props)
->queryString('sleeping beauty')
->from(10)
->size(10)
->get();

from(10)->size(10) returns the second page (skip first 10, take next 10).

page() is a shortcut:

->page(2, 20) // page 2, 20 per page (== from(20)->size(20))
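
The arithmetic behind the shortcut can be sketched as follows. pageToFromSize() is a hypothetical helper, not a Sigmie method; it only assumes that pages are numbered from 1, which is what the equivalence in the comment above implies.

```php
<?php

/**
 * Hypothetical helper (not a Sigmie API): converts a 1-based page number
 * into the equivalent from/size pair.
 */
function pageToFromSize(int $page, int $perPage): array
{
    if ($page < 1 || $perPage < 1) {
        throw new InvalidArgumentException('Page and page size must be at least 1.');
    }

    return [
        'from' => ($page - 1) * $perPage, // documents to skip
        'size' => $perPage,               // documents to return
    ];
}

$secondPageOfTwenty = pageToFromSize(2, 20);
$secondPageOfTen    = pageToFromSize(2, 10);
```

Page 2 with 20 per page skips 20 documents, and page 2 with 10 per page skips 10, matching the from()/size() examples in this section.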

Deduplicate

Return one hit per value of a field. Useful for product variants:

$sigmie->newSearch('products')
->properties($props)
->queryString('sneakers')
->uniqueBy('product_id')
->get();

Include the next best matches from each group as inner hits:

->uniqueBy('product_id', top: 3)

The collapse field must be single-valued (e.g. keyword).
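
As an illustration of what "one hit per value" means, the sketch below deduplicates an in-memory list. It is not how Sigmie or Elasticsearch implement collapsing; the function name, the sample hits, and the assumption that hits arrive sorted best-first are all made up for this example, and it covers only the basic case without inner hits.

```php
<?php

/**
 * Illustration only (not Sigmie code): keep the best-ranked hit for each
 * distinct value of $field. Hits are assumed to be sorted best first.
 */
function keepFirstPerValue(array $hits, string $field): array
{
    $seen = [];
    $unique = [];

    foreach ($hits as $hit) {
        $key = $hit[$field];

        if (isset($seen[$key])) {
            continue; // a better-ranked hit already represents this group
        }

        $seen[$key] = true;
        $unique[] = $hit;
    }

    return $unique;
}

// Made-up sample data: three variants belonging to two products.
$hits = [
    ['product_id' => 'p1', 'color' => 'red',  'score' => 3.2],
    ['product_id' => 'p1', 'color' => 'blue', 'score' => 2.9],
    ['product_id' => 'p2', 'color' => 'red',  'score' => 2.1],
];

$unique = keepFirstPerValue($hits, 'product_id');
```

The blue variant of p1 is dropped because a higher-ranked variant of the same product already appears in the result.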

Facets

Build sidebar filters with one method (see Facets for the full syntax):

$response = $sigmie->newSearch('products')
->properties($props)
->queryString('laptop')
->facets('brand category price:100')
->get();
 
$facets = $response->json('facets');

Semantic search

Enable vector matching alongside keyword search:

$sigmie->newSearch('articles')
->properties($props)
->semantic()
->queryString('artificial intelligence')
->get();

Use vectors only (no keyword matching):

->semantic()->disableKeywordSearch()

See Semantic Search for embeddings setup and accuracy levels.

Autocomplete

$response = $sigmie->newSearch('fairy-tales')
->properties($props)
->autocompletePrefix('m')
->fields(['name'])
->retrieve(['name'])
->get();
 
$suggestions = $response->json('autocomplete');

Multi-language

Search several language-specific indices at once by passing a comma-separated list of index names:

$result = $sigmie->newSearch("$germanIndex,$englishIndex")
->properties($props)
->queryString('door tür')
->get();

Nested fields

Search and retrieve nested fields with dot notation:

$sigmie->newSearch('users')
->properties($props)
->queryString('Pluto')
->fields(['contact.dog.name'])
->retrieve(['contact.dog.name'])
->get();

Reading results

$response = $sigmie->newSearch('fairy-tales')
->properties($props)
->queryString('mickey')
->get();
 
$response->total(); // total matching documents
$response->hits(); // array of hits
$response->json('hits'); // raw hits array
$response->json('hits.0._source'); // a specific value via dot notation
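
For readers unfamiliar with dot notation, the following sketch shows how a path such as hits.0._source walks a nested array. getByDotPath() and the sample body are hypothetical and exist only for illustration; this is not Sigmie's implementation of json().

```php
<?php

/**
 * Illustration only (not Sigmie code): resolve a dot-separated path
 * against a nested array, returning null when any segment is missing.
 */
function getByDotPath(array $data, string $path)
{
    $current = $data;

    foreach (explode('.', $path) as $segment) {
        if (!is_array($current) || !array_key_exists($segment, $current)) {
            return null;
        }

        $current = $current[$segment];
    }

    return $current;
}

// Made-up response body; the real structure depends on your index.
$body = [
    'hits' => [
        ['_id' => '1', '_source' => ['name' => 'Mickey']],
    ],
];

$source  = getByDotPath($body, 'hits.0._source'); // first hit's source document
$missing = getByDotPath($body, 'hits.5._source'); // no sixth hit in this body
```

Each segment selects one level: the hits key, then index 0, then the _source key.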

Empty query strings

By default, an empty query string returns every document. To return nothing instead:

->noResultsOnEmptySearch()

Async execution

promise() returns a Guzzle promise instead of executing immediately:

$promise = $sigmie->newSearch('fairy-tales')
->properties($props)
->queryString('mickey')
->promise();

Iterating over all matching hits

size() is for UIs. For exports, migrations, or bulk re-processing, use each() or lazy() to stream every matching document. Both reuse your filters, query string, and field scoping, and page internally using Point-in-Time + search_after — so concurrent writes don’t break the cursor.
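
The search_after idea can be sketched with an in-memory simulation: each page asks for documents whose sort key comes after the last one already delivered. Everything below (function names, sample documents, integer sort keys) is an assumption made for illustration; it does not call Elasticsearch, does not open a Point-in-Time, and is not Sigmie's code.

```php
<?php

/**
 * Hypothetical stand-in for one search request: returns up to $size docs
 * whose sort key is strictly greater than $after (null means "from the start").
 */
function fetchPage(array $sortedDocs, ?int $after, int $size): array
{
    $page = [];

    foreach ($sortedDocs as $doc) {
        if ($after !== null && $doc['sort'] <= $after) {
            continue; // already delivered on an earlier page
        }

        $page[] = $doc;

        if (count($page) === $size) {
            break;
        }
    }

    return $page;
}

/** Streams every doc one page at a time, resuming from the last sort key seen. */
function streamAll(array $sortedDocs, int $chunk): Generator
{
    $after = null;

    while (true) {
        $page = fetchPage($sortedDocs, $after, $chunk);

        if ($page === []) {
            return; // an empty page means everything has been delivered
        }

        foreach ($page as $doc) {
            yield $doc;
        }

        $after = $page[count($page) - 1]['sort'];
    }
}

// Made-up documents, already ordered by a unique sort key.
$docs = [
    ['id' => 'a', 'sort' => 1],
    ['id' => 'b', 'sort' => 2],
    ['id' => 'c', 'sort' => 3],
    ['id' => 'd', 'sort' => 4],
    ['id' => 'e', 'sort' => 5],
];

$ids = [];
foreach (streamAll($docs, 2) as $doc) {
    $ids[] = $doc['id'];
}
```

Because each page resumes from a sort key rather than a numeric offset, the cursor depends on sort keys being unique, which is why a deterministic sort with a tiebreaker matters during iteration.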

With a callback

use Sigmie\Document\Hit;
 
$sigmie->newSearch('orders')
->properties($props)
->filters('status:completed')
->each(function (Hit $hit) use ($csv): void {
    $csv->writeRow($hit->_source);
});

Each Hit exposes _id, _source, and _score.

With a generator

$generator = $sigmie->newSearch('orders')
->properties($props)
->filters('status:completed')
->lazy();
 
foreach ($generator as $hit) {
    processHit($hit);
}

Page size

The default is 500 hits per page. Tune it with chunk() to trade memory for round-trips:

$sigmie->newSearch('products')
->properties($props)
->chunk(100)
->each(function (Hit $hit): void {
    // hits are fetched 100 at a time
});

Sort during iteration

Point-in-Time needs a deterministic sort. Sigmie handles this for you:

  • NewSearch::sort() — your sort string is kept. Sigmie appends a stable tiebreaker (_shard_doc on Elasticsearch, _id on OpenSearch) if you didn’t already provide one. _score-only or _doc-only sorts are replaced by the tiebreaker.
  • NewQuery::sortString() / sort(array) — call before the query method (matchAll, bool, etc.). Omit sort entirely to stream in stable but unranked order. Use field names that exist in your mapping (often a .keyword sub-field for text).
  • raw() — include a top-level sort key in the body you pass.
// $multi comes from $sigmie->newMultiSearch() (see Multi-search)
$multi->raw('orders', [
    'query' => ['match_all' => (object) []],
    'sort' => [['processed_at' => 'asc']],
]);

When the body includes collapse, Sigmie does not append the tiebreaker — Elasticsearch only allows one sort key with collapse + search_after, and that’s your responsibility.
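
The tiebreaker rules above can be summarized in a small sketch. withTiebreaker() is a hypothetical function written from the description in this section, not taken from Sigmie; in particular, the 'asc' direction for the tiebreaker is an assumption.

```php
<?php

/**
 * Illustration only (not Sigmie code): apply the tiebreaker rules described
 * above to a sort array shaped like [['field' => 'direction'], ...].
 */
function withTiebreaker(array $sort, string $tiebreaker, bool $hasCollapse = false): array
{
    if ($hasCollapse) {
        return $sort; // with collapse, the sort is left entirely to the caller
    }

    $fields = array_map(fn (array $clause) => array_key_first($clause), $sort);

    if ($fields === ['_score'] || $fields === ['_doc']) {
        return [[$tiebreaker => 'asc']]; // replaced by the tiebreaker ('asc' is assumed)
    }

    if (in_array($tiebreaker, $fields, true)) {
        return $sort; // the caller already provided the tiebreaker
    }

    $sort[] = [$tiebreaker => 'asc']; // otherwise append it as the last sort key

    return $sort;
}

$kept     = withTiebreaker([['processed_at' => 'asc']], '_shard_doc');
$replaced = withTiebreaker([['_score' => 'desc']], '_shard_doc');
$asIs     = withTiebreaker([['name' => 'asc'], ['_id' => 'asc']], '_id');
```

An empty sort ends up with just the tiebreaker, which corresponds to streaming in stable but unranked order.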

Multi-search

newMultiSearch() registers multiple queries; a single _msearch returns one page each. To stream all matching hits across registered queries, call each() or lazy() on the multi-search:

use Sigmie\Document\Hit;
 
$multi = $sigmie->newMultiSearch();
 
$multi->newSearch('orders')
->properties($orderProps)
->filters('status:pending')
->chunk(200);
 
$multi->newQuery('products')->matchAll();
 
$multi->raw('orders', [
    'query' => ['term' => ['status' => 'pending']],
])->chunk(200);
 
foreach ($multi->lazy() as $hit) {
    exportRow($hit);
}

Each registered search runs its own PIT iteration; results yield in registration order. Set chunk() per query — the multi-search has no global chunk size.
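
"Registration order" means each query's stream is drained completely before the next one starts. A minimal sketch of that behavior with plain PHP generators follows; streamInOrder() and the sample values are hypothetical and are not Sigmie's implementation.

```php
<?php

/**
 * Illustration only (not Sigmie code): drain each stream completely
 * before moving on to the next, preserving registration order.
 */
function streamInOrder(iterable ...$streams): Generator
{
    foreach ($streams as $stream) {
        yield from $stream;
    }
}

// Made-up stand-ins for two registered searches.
$orders = (function () {
    yield 'order-1';
    yield 'order-2';
})();

$products = (function () {
    yield 'product-1';
})();

// Pass false for preserve_keys because each generator restarts its keys at 0.
$all = iterator_to_array(streamInOrder($orders, $products), false);
```

All hits from the first registered stream appear before any hit from the second, regardless of how many pages each one needs.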

Note: each() and lazy() ignore from(), size(), page(), and highlighting() — these are pagination/display concerns. Sort is honored as described above.