Introducing JSON queries | Vespa Blog

We recently introduced a new addition to the Search API – JSON queries. The search request can now be executed with a POST request, which includes the query-parameters within its payload. Along with this new query we also introduce a new parameter SELECT with the sub-parameters WHERE and GROUPING, which is equivalent to YQL.

The new query

With the Search APIs newest addition, it is now possible to send queries with HTTP POST. The query-parameters has been moved out of the URL and into a POST request body – therefore, no more URL-encoding. You also avoid getting all the queries in the log, which can be an advantage.

This is how a GET query looks like:

GET /search/?param1=value1¶m2=value2&...

The general form of the new POST query is:

POST /search/ { param1 : value1, param2 : value2, ... }

The dot-notation is gone, and the query-parameters are now nested under the same key instead.

Let’s take this query:

GET /search/?yql=select+%2A+from+sources+%2A+where+default+contains+%22bad%22%3B&ranking.queryCache=false&ranking.profile=vespaProfile&ranking.matchPhase.ascending=true&ranking.matchPhase.maxHits=15&ranking.matchPhase.diversity.minGroups=10&presentation.bolding=false&presentation.format=json&nocache=true

and write it in the new POST request-format, which will look like this:

POST /search/ { "yql": "select \* from sources \* where default contains \"bad\";", "ranking": { "queryCache": "false", "profile": "vespaProfile", "matchPhase": { "ascending": "true", "maxHits": 15, "diversity": { "minGroups": 10 } } }, "presentation": { "bolding": "false", "format": "json" }, "nocache": true }

With Vespa running (see Quick Start or
Blog Search Tutorial),
you can try building POST-queries with the new querybuilder GUI at http://localhost:8080/querybuilder/, which can help you build queries with e.g. autocompletion of YQL:

image

The Select-parameter

The SELECT-parameter is used with POST queries and is the JSON equivalent of YQL queries, so they can not be used together. The query-parameter will overwrite SELECT, and decide the query’s querytree.

Where

The SQL-like syntax is gone and the tree-syntax has been enhanced. If you’re used to the query-parameter syntax you’ll feel right at home with this new language. YQL is a regular language and is parsed into a query-tree when parsed in Vespa. You can now build that tree in the WHERE-parameter with JSON. Lets take a look at the yql: select * from sources * where default contains foo and rank(a contains "A", b contains "B");, which will create the following query-tree:

image

You can build the tree above with the WHERE-parameter, like this:

{
    "and" : [
        { "contains" : ["default", "foo"] },
        { "rank" : [
            { "contains" : ["a", "A"] },
            { "contains" : ["b", "B"] }
        ]}
    ]
}

Which is equivalent with the YQL.

Grouping

The grouping can now be written in JSON, and can now be written with structure, instead of on the same line. Instead of parantheses, we now use curly brackets to symbolise the tree-structure between the different grouping/aggregation-functions, and colons to assign function-arguments.

A grouping, that will group first by year and then by month, can be written as such:

| all(group(time.year(a)) each(output(count())
         all(group(time.monthofyear(a)) each(output(count())))

and equivalentenly with the new GROUPING-parameter:

"grouping" : [
    {
        "all" : {
            "group" : "time.year(a)",
            "each" : { "output" : "count()" },
            "all" : {
                "group" : "time.monthofyear(a)",
                "each" : { "output" : "count()" },
            }
        }
    }
]

Wrapping it up

In this post we have provided a gentle introduction to the new Vepsa POST query feature, and the SELECT-parameter. You can read more about writing POST queries in the Vespa documentation. More examples of the POST query can be found in the Vespa tutorials.

Please share experiences. Happy searching!

Vespa Product Updates, October/November 2019: Nearest Neighbor and Tensor Ranking, Optimized JSON Tensor Feed Format, Matched Elements in Complex Multi-value Fields, Large Weighted Set Update Performance, and Datadog Monitoring Support

Kristian Aune

Kristian Aune

Head of Customer Success, Vespa


In the September Vespa product update, we mentioned Tensor Float Support, Reduced Memory Use for Text Attributes, Prometheus Monitoring Support, and Query Dispatch Integrated in Container.

This month, we’re excited to share the following updates:

Nearest Neighbor and Tensor Ranking

Tensors are native to Vespa. We compared elastic.co to vespa.ai testing nearest neighbor ranking using dense tensor dot product. The result of an out-of-the-box configuration demonstrated that Vespa performed 5 times faster than Elastic. View the test results.

Optimized JSON Tensor Feed Format

A tensor is a data type used for advanced ranking and recommendation use cases in Vespa. This month, we released an optimized tensor format, enabling a more than 10x improvement in feed rate. Read more.

Matched Elements in Complex Multi-value Fields 

Vespa is used in many use cases with structured data – documents can have arrays of structs or maps. Such arrays and maps can grow large, and often only the entries matching the query are relevant. You can now use the recently released matched-elements-only setting to return matches only. This increases performance and simplifies front-end code.

Large Weighted Set Update Performance

Weighted sets in documents are used to store a large number of elements used in ranking. Such sets are often updated at high volume, in real-time, enabling online big data serving. Vespa-7.129 includes a performance optimization for updating large sets. E.g. a set with 10K elements, without fast-search, is 86.5% faster to update.

Datadog Monitoring Support

Vespa is often used in large scale mission-critical applications. For easy integration into dashboards,
Vespa is now in Datadog’s integrations-extras GitHub repository.
Existing Datadog users will now find it easy to monitor Vespa.
Read more.

About Vespa: Largely developed by Yahoo engineers, Vespa is an open source big data processing and serving engine. It’s in use by many products, such as Yahoo News, Yahoo Sports, Yahoo Finance, and the Verizon Media Ad Platform. Thanks to feedback and contributions from the community, Vespa continues to grow.

We welcome your contributions and feedback (tweet or email) about any of these new features or future improvements you’d like to request.