Basic fuzzy queries:
GET /docs/doc/_search
{
  "query": {
    "fuzzy": {
      "body": "robot"
    }
  }
}
GET /docs/doc/_search
{
  "query": {
    "match": {
      "body": {
        "query": "robot",
        "fuzziness": "auto"
      }
    }
  }
}
You can use either the fuzzy query or a match query with an explicit fuzziness attribute. Be warned that the fuzzy query is not analyzed.
fuzziness is the maximum edit distance allowed between a matched word and the word in the query. Your best option is AUTO, which picks the value according to the word's length. Valid values are 0, 1 and 2; you can't go higher than 2.
This query yields only the "Transformers" document in the results. If you set fuzziness to 2 it will yield "Robin Hood" as well: robot has an edit distance of 2 from robin.
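
To make the numbers above concrete, here is a minimal Python sketch of the two ideas involved. It assumes the default AUTO thresholds (terms shorter than 3 characters must match exactly, terms of 3 to 5 characters allow one edit, longer terms allow two) and uses plain Levenshtein distance. Elasticsearch by default also counts a swap of two adjacent characters as a single edit, which this sketch does not model. The function names are made up for illustration and are not part of any Elasticsearch API.

```python
def edit_distance(a: str, b: str) -> int:
    """Plain Levenshtein distance: insertions, deletions, substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


def auto_fuzziness(term: str) -> int:
    """Edits allowed by AUTO, assuming its default thresholds of 3 and 6."""
    if len(term) < 3:
        return 0
    if len(term) < 6:
        return 1
    return 2


print(auto_fuzziness("robot"))          # 1
print(edit_distance("robot", "robin"))  # 2
```

"robot" is 5 characters long, so AUTO allows a single edit, while "robin" is two substitutions away. That is why "Robin Hood" only shows up once fuzziness is raised to 2.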
GET /docs/doc/_search
{
  "query": {
    "multi_match": {
      "fields": [ "title", "body", "keywords" ],
      "query": "leetle nemmo fisch",
      "fuzziness": "auto",
      "operator": "and"
    }
  }
}
"fuzziness": "auto" is the recommended setting; match-type queries apply no fuzziness unless you set it. The default for operator is or; with "operator": "and" every term in the query has to match, which brings the results closer to phrase matching, just like you need. match_phrase does not support fuzziness, so you have to fall back to this strategy.
You can use prefix_length to set the number of leading characters that must match exactly and are never fuzzied, which makes the query less expensive:
GET /docs/doc/_search
{
  "query": {
    "multi_match": {
      "fields": [ "title*", "body", "keywords" ],
      "query": "litle nemmo fisch",
      "fuzziness": "auto",
      "operator": "and",
      "prefix_length": 3
    }
  }
}
If you increase prefix_length to 4 there will be no matches for the example query (litl does not match litt, fisc does not match fish...).
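
The effect of prefix_length can be sketched in a few lines of Python. This is only an approximation of what happens per term, under two assumptions: the indexed terms are "little", "nemo" and "fish" (the real terms depend on your analyzer), and AUTO allows one edit for these 5-character query terms. The helper names are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Plain Levenshtein distance: insertions, deletions, substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,
                               current[j - 1] + 1,
                               previous[j - 1] + cost))
        previous = current
    return previous[-1]


def fuzzy_term_matches(query_term: str, indexed_term: str,
                       max_edits: int, prefix_length: int = 0) -> bool:
    """True if the first prefix_length characters are identical and the
    whole terms are within max_edits of each other."""
    if query_term[:prefix_length] != indexed_term[:prefix_length]:
        return False
    return edit_distance(query_term, indexed_term) <= max_edits


# Assumed (query term, indexed term) pairs for the example query.
pairs = [("litle", "little"), ("nemmo", "nemo"), ("fisch", "fish")]
for query_term, indexed_term in pairs:
    with_3 = fuzzy_term_matches(query_term, indexed_term, 1, prefix_length=3)
    with_4 = fuzzy_term_matches(query_term, indexed_term, 1, prefix_length=4)
    print(query_term, indexed_term, with_3, with_4)
```

With a prefix of 3 all three terms pass the prefix check (lit, nem, fis) and are one edit away from the indexed term. With a prefix of 4 every one of them fails on the fourth character, so the and operator leaves nothing to return.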
Let's get all the documents whose title does not contain "nemo" and whose body does not contain "fish", using the misspelled forms "nemmo" and "fisch" in the query:
GET /docs/doc/_search
{
  "query": {
    "bool": {
      "must_not": [
        {
          "match": {
            "title": {
              "query": "nemmo",
              "fuzziness": "auto"
            }
          }
        },
        {
          "match": {
            "body": {
              "query": "fisch",
              "fuzziness": "auto"
            }
          }
        }
      ]
    }
  }
}
must_not works as a filter: it excludes documents without affecting the score. You can add as many must_not clauses as you want; just keep the following format:
{
  "match": {
    "title": {
      "query": "nemmo",
      "fuzziness": "auto"
    }
  }
}
It's the usual match query, but with an explicit fuzziness attribute to make it fuzzy.
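
If you build the request in application code, the clause format above is easy to generate. The following Python sketch only assembles the request body as a plain dictionary. The helper names fuzzy_match and exclude_all are invented for this example and belong to no client library; sending the body to /docs/doc/_search is left to whatever HTTP or Elasticsearch client you use.

```python
import json


def fuzzy_match(field: str, text: str, fuzziness: str = "auto") -> dict:
    """One match clause with an explicit fuzziness attribute."""
    return {"match": {field: {"query": text, "fuzziness": fuzziness}}}


def exclude_all(*clauses: dict) -> dict:
    """Search body that drops every document matching any given clause."""
    return {"query": {"bool": {"must_not": list(clauses)}}}


body = exclude_all(
    fuzzy_match("title", "nemmo"),
    fuzzy_match("body", "fisch"),
)
print(json.dumps(body, indent=2))
```

The printed JSON has the same structure as the must_not query shown earlier, and further exclusions are just extra fuzzy_match(...) arguments.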
This query gives a higher boost to documents where the match is in the body field:
GET /docs/doc/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "keywords": {
              "query": "action",
              "fuzziness": "auto"
            }
          }
        },
        {
          "match": {
            "body": {
              "query": "litle",
              "fuzziness": "auto",
              "boost": 5
            }
          }
        }
      ]
    }
  }
}
It is not clear what "no-op" means; could you please elaborate?
Anyway, the basic must query is this:
GET /docs/doc/_search
{
  "query": {
    "bool": {
      "must": [
        { "fuzzy": { "title": "nemmo" } },
        { "fuzzy": { "body": "fisch" } }
      ]
    }
  }
}
Remember that fuzzy is not analyzed. It may work for you if you don't need analysis, but if you do, you should use match + fuzziness as in the examples above.
Here I am enhancing the should fuzzy query with a geo_distance filter:
GET /docs/doc/_search
{
  "query": {
    "filtered": {
      "query": {
        "bool": {
          "should": [
            {
              "match": {
                "keywords": {
                  "query": "action",
                  "fuzziness": "auto"
                }
              }
            },
            {
              "match": {
                "body": {
                  "query": "litle",
                  "fuzziness": "auto"
                }
              }
            }
          ]
        }
      },
      "filter": {
        "geo_distance": {
          "distance": "100km",
          "location": {
            "lat": 45,
            "lon": 10
          }
        }
      }
    }
  }
}
The query without the geo_distance filter was returning 2 results; now it returns only the one within the expected distance.
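
One caveat if you are on a newer cluster: the filtered query was deprecated in Elasticsearch 2.0 and removed in 5.0, where the same request is written as a bool query with the scoring part under must and the filter under filter. The Python sketch below shows that rewrite on the body used above. filtered_to_bool is a hypothetical helper written for this answer, not a library function.

```python
import json


def filtered_to_bool(filtered: dict) -> dict:
    """Rewrite the body of a filtered query as an equivalent bool query."""
    bool_body = {}
    if "query" in filtered:
        bool_body["must"] = filtered["query"]     # scoring part
    if "filter" in filtered:
        bool_body["filter"] = filtered["filter"]  # non-scoring part
    return {"bool": bool_body}


filtered = {
    "query": {
        "bool": {
            "should": [
                {"match": {"keywords": {"query": "action", "fuzziness": "auto"}}},
                {"match": {"body": {"query": "litle", "fuzziness": "auto"}}},
            ]
        }
    },
    "filter": {
        "geo_distance": {
            "distance": "100km",
            "location": {"lat": 45, "lon": 10},
        }
    },
}

print(json.dumps({"query": filtered_to_bool(filtered)}, indent=2))
```

The inner bool is kept as a nested clause under must on purpose. If the should clauses were moved next to filter in the outer bool, they would become optional, because minimum_should_match defaults to 0 as soon as a bool query also has a must or filter clause.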