Vespa Product Updates, August 2019: BM25 Rank Feature, Searchable Parent References, Tensor Summary Features, and Metrics Export

Kristian Aune


Head of Customer Success, Vespa


In the recent Vespa product update, we mentioned Large Machine Learning Models, Multithreaded Disk Index Fusion, Ideal State Optimizations, and Feeding Improvements. Largely developed by Yahoo engineers, Vespa is an open source big data processing and serving engine. It’s in use by many products, such as Yahoo News, Yahoo Sports, Yahoo Finance, and the Verizon Media Ad Platform. Thanks to feedback and contributions from the community, Vespa continues to grow.

This month, we’re excited to share the following feature updates with you:

BM25 Rank Feature

The BM25 rank feature implements the Okapi BM25 ranking function and is a great candidate for a first-phase ranking function when ranking text documents. Read more.
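As a sketch (schema and field names here are illustrative), a first-phase BM25 expression can be configured in a rank profile; note that the field's index needs `enable-bm25` so the data BM25 requires is available:

```
schema doc {
    document doc {
        field content type string {
            indexing: index | summary
            index: enable-bm25
        }
    }
    rank-profile default {
        first-phase {
            expression: bm25(content)
        }
    }
}
```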

Searchable Reference Attribute

A reference attribute field can be searched using the document id of the parent document-type instance as the query term, making it easy to find all children of a parent document. Learn more.
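For illustration, assuming a child document type with a `parent_ref` reference field and a parent document with a known document id (all names here are made up), such a query could look like:

```
select * from child where parent_ref contains "id:mynamespace:parent::parent-1"
```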

Tensor in Summary Features

A tensor can now be returned in summary features.
This makes rank tuning easier and can be used in custom Searchers when generating result sets.
Read more.
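A minimal sketch (profile and attribute names are illustrative) — listing a tensor attribute in summary-features returns its value with each hit:

```
rank-profile with-embedding {
    first-phase {
        expression: nativeRank
    }
    summary-features {
        attribute(embedding)
    }
}
```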

Metrics Export

To export metrics out of Vespa, you can now use the new node metric interface. Aliasing metric names is possible and metrics are assigned to a namespace. This simplifies integration with monitoring products like CloudWatch and Prometheus. Learn more about this update.
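As a hedged sketch of what such a configuration might look like in services.xml (the consumer id, metric id, and alias are illustrative — consult the Vespa metrics documentation for the exact element names):

```
<admin version="2.0">
    <adminserver hostalias="admin0" />
    <metrics>
        <consumer id="my-consumer">
            <metric id="content.proton.documentdb.documents.active.last"
                    display-name="active_documents" />
        </consumer>
    </metrics>
</admin>
```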

We welcome your contributions and feedback (tweet or email) about any of these new features or future improvements you’d like to request.

Vespa Product Updates, August 2020

Kristian Aune


Head of Customer Success, Vespa


In the June updates,
we mentioned Approximate Nearest Neighbor Search, Streaming Search Speedup, Rank Features, and a GKE Sample Application.

This month, we’re excited to share the following updates:

Introducing NLP with Transformers on Vespa

There has been considerable interest lately in bringing sophisticated natural language processing (NLP)
to production using machine-learned models such as BERT and other transformers.
We have extended the tensor execution engine in Vespa to support transformer based models,
so you can deploy transformer models as part of your Vespa applications
and evaluate these models in parallel on each content partition when executing a query.
This makes it possible to scale evaluation to any corpus size without sacrificing latency.
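As a rough sketch only (the model name, file path, and exact feature syntax are assumptions — see the Vespa documentation on ONNX model evaluation for the authoritative syntax), deploying a transformer model can look like declaring it in the schema and evaluating it in ranking:

```
onnx-model reranker {
    file: files/reranker.onnx
}
rank-profile transformer {
    first-phase {
        expression: sum(onnx(reranker))
    }
}
```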

Grafana how-to

We released a new Grafana integration
that leverages our existing Prometheus integration, with a few improvements.
This lets you add Grafana monitoring to the Quick Start,
and you can generate random load to produce sample graphs.
We’ve provided a sample application to get you started with monitoring Vespa using Grafana.

Improved GEO Search Support

We added support for geoLocation items
to the Vespa query language
to make it possible to create arbitrary query conditions which include positional information.
We also added additional distance rank features to provide more support for ranking by positions.
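For example (the field name and coordinates are illustrative), a geoLocation item restricts matches to a radius around a point, and the distance rank features can then be used for ranking by position:

```
select * from sources * where geoLocation(location, 63.45, 10.39, "50 km")
```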

Query Profile Variants Optimizations

Query Profile Variants
make it possible to configure bundles of query parameters that vary by properties of the request, such as market, bucket, or device.
We added a new algorithm for resolving the parameters that apply to a given query, which greatly reduces both compilation and resolution time with variants,
leading to faster container startup and lower query latency for applications using variants.
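A small sketch of variants (dimension and field names are illustrative): the parent profile declares the dimensions, and nested profiles override values for given dimension combinations:

```
<query-profile id="default">
    <dimensions>market, device</dimensions>
    <field name="hits">10</field>
    <query-profile for="us, mobile">
        <field name="hits">5</field>
    </query-profile>
</query-profile>
```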

Build Vespa on Debian 10

Thanks to contributions from ygrek, you can now
build Vespa on Debian 10.


About Vespa: Largely developed by Yahoo engineers,
Vespa is an open source big data processing and serving engine.
It’s in use by many products, such as Yahoo News, Yahoo Sports, Yahoo Finance, and the Verizon Media Ad Platform.
Thanks to feedback and contributions from the community, Vespa continues to grow.

We welcome your contributions and feedback (tweet
or email) about any of these new features or future improvements you’d like to request.

Subscribe to the mailing list for more frequent updates!

Vespa Newsletter, August 2023

Kristian Aune


Head of Customer Success, Vespa


In the previous update,
we mentioned Vector Streaming Search, Embedder Models from Huggingface,
GPU Acceleration of Embedding Models, Model Hub and Dotproduct distance metric for ANN.
Today, we’re excited to share the following updates:

Multilingual sample app

In the previous newsletter, we announced Vespa E5 model support.
Now we’ve added a multilingual-search sample application.
Using Vespa’s powerful indexing language
and integrated embedding support, you can embed and index:

field embedding type tensor<float>(x[384]) {
    indexing {
        "passage: " . input title . " " . input text | embed | attribute
    }
}

Likewise, for queries:

{
    "yql": "select ..",
    "input.query(q)": "embed(query: the query to encode)"
}

With this, you can easily use multilingual E5 for great relevance;
see the simplify search with multilingual embeddings
blog post for results.
Remember to try the sample app,
using trec_eval to compute NDCG@10.

ANN targetHits

Vespa uses targetHits
in approximate nearest neighbor queries.
When searching the HNSW index in a post-filtering case,
this is auto-adjusted in an effort to still expose targetHits hits to first-phase ranking after post-filtering
(by exploring more nodes).
This increases query latency as more candidates are evaluated.
Since Vespa 8.215, the following formula is used to ensure an upper bound of adjustedTargetHits:

adjustedTargetHits = min(targetHits / estimatedHitRatio,
                         targetHits * targetHitsMaxAdjustmentFactor)

You can use this to choose returning fewer hits over spending more time searching the index.
The target-hits-max-adjustment-factor
can be set in a rank profile and overridden
per query.
The value is in the range [1.0, inf], default 20.0.
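For illustration (schema, field names, and values are made up), the factor can be set in a rank profile:

```
rank-profile ann {
    target-hits-max-adjustment-factor: 5.0
}
```

and overridden in the query:

```
{
    "yql": "select * from doc where {targetHits: 100}nearestNeighbor(embedding, q)",
    "ranking.matching.targetHitsMaxAdjustmentFactor": 2.0
}
```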

Tensor short query format in inputs

In Vespa 8.217, a short format for mapped tensors can be used in input values.
Together with the short indexed tensor format, query tensors can be like:

"input": {
    "query(my_indexed_tensor)": [1, 2, 3, 4],
    "query(my_mapped_tensor)": {
        "Tablet Keyboard Cases": 0.8,
        "Keyboards":0.3
    }
}

Pyvespa

During the last month, we’ve released PyVespa
0.35,
0.36 and
0.37:

  • Requires minimum Python 3.8.
  • Support setting default stemming of Schema: #510.
  • Add support for first phase ranking:
    #512.
  • Support using key/cert pair generated by Vespa CLI:
    #513
    and add deploy_from_disk for Vespa Cloud: #514 –
    this makes it easier to interoperate with Vespa Cloud and local experiments.
  • Specify match-features in RankProfile:
    #525.
  • Add utility to create a vespa feed file for easier feeding using Vespa CLI:
    #536.
  • Add support for synthetic fields: #547
    and support for Component config:
    #548.
    With this, one can run the multivector sample application –
    try it using the multi-vector-indexing notebook.

Vespa CLI functions

The Vespa command-line client has been made smarter:
it will now check local deployments (e.g. on your laptop) and wait for the container cluster(s) to come up:

$ vespa deploy
Waiting up to 1m0s for deploy API to become ready...
Uploading application package... done

Success: Deployed . with session ID 2
Waiting up to 1m0s for deployment to converge...
Waiting up to 1m0s for cluster discovery...
Waiting up to 1m0s for container default...

The new function vespa destroy
is built for quick dev cycles on Vespa Cloud.
When developing, easily reset the state in your Vespa Cloud application by calling vespa destroy.
This is also great for automation, e.g., in a GitHub Action.
Local deployments should reset with fresh Docker/Podman containers.

Optimizations and features

  • Vespa indexing language now supports to_epoch_second
    for converting ISO-8601 date strings to epoch time.
    Available since Vespa 8.215.
    Use this to easily convert from strings to a number when indexing –
    see example.
  • Since Vespa 8.218, Vespa uses onnxruntime 1.15.1.
  • Since Vespa 8.218, one can use create to create non-existing cells before a
    modify-update operation is applied to a tensor.
  • Vespa allows referring to models by URL in the application package.
    Such files can be large, and are downloaded per deploy-operation.
    Since 8.217, Vespa will use a previously downloaded model file if it exists on the requesting node.
    New versions of the model must use a different URL.
  • Some Vespa topologies use groups of nodes to optimize query performance –
    each group has a replica of a document.
    High-query Vespa applications might have tens or even hundreds of groups.
    Upgrading such clusters in Vespa Cloud takes time, as only one replica (= one group) is taken out at any time.
    With groups-allowed-down-ratio,
    one can instead allow a percentage of groups to be down,
    say 25%, so a full content cluster upgrades in only 4 cycles.
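A sketch of a modify-update using create (document id, field, and cell names are illustrative) — with "create": true, a non-existing cell is created with a default value before the operation is applied:

```
{
    "update": "id:mynamespace:doc::1",
    "fields": {
        "my_mapped_tensor": {
            "modify": {
                "operation": "add",
                "create": true,
                "cells": { "Keyboards": 0.1 }
            }
        }
    }
}
```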
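As a hedged sketch (the cluster id is illustrative; check the services.xml reference for exact placement), the ratio is set in the content cluster tuning:

```
<content id="mycluster" version="1.0">
    <tuning>
        <cluster-controller>
            <groups-allowed-down-ratio>0.25</groups-allowed-down-ratio>
        </cluster-controller>
    </tuning>
</content>
```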



Thanks for reading! Try out Vespa on Vespa Cloud
or grab the latest release at vespa.ai/releases and run it yourself! 😀