Vespa Product Updates, January 2019: Parent/Child, Large File Config Download, and a Simplified Feeding Interface

In last month’s Vespa update, we mentioned ONNX integration, precise transaction log pruning, grouping on maps, and improvements to streaming search performance. Largely developed by Yahoo engineers, Vespa is an open source big data processing and serving engine. It’s in use by many products, such as Yahoo News, Yahoo Sports, Yahoo Finance, and Oath Ads Platforms. Thanks to feedback and contributions from the community, Vespa continues to evolve.

This month, we’re excited to share the following updates with you:


Parent/Child

We’ve added support for multiple levels of parent-child document references. Documents with references to parent documents can now import fields, with minimal impact on performance. This simplifies updates to parent data, as no denormalization is needed, and supports use cases with many-to-many relationships, like product search. Read more in parent-child.
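As a sketch (schema and field names here are hypothetical), a child document type can reference a parent type and import one of its fields:

```
schema product {
    document product {
        field brand_ref type reference<brand> {
            indexing: attribute
        }
    }
    # the parent's field becomes usable in the child, with no denormalization
    import field brand_ref.name as brand_name {}
}
```

Updating a brand document then immediately affects all product documents referencing it.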

File URL references in application packages

Serving nodes sometimes require data files which are so large that it doesn’t make sense for them to be stored and deployed in the application package. Such files can now be included in application packages by URL reference. When the application is redeployed, the files are automatically downloaded and injected into the components that depend on them.
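As a hypothetical sketch, assuming a custom component with a config definition containing a field of type url, the file can be referenced from services.xml and downloaded at deployment:

```
<container version="1.0">
    <component id="my-component" class="com.example.MyComponent" bundle="my-bundle">
        <config name="com.example.my-component">
            <!-- downloaded to the node and injected into the component -->
            <modelUrl>https://example.com/models/large-model.onnx</modelUrl>
        </config>
    </component>
</container>
```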

Batch feed in Java client

The new SyncFeedClient provides a simplified API for feeding batches of data with high performance using the Java HTTP client. This is convenient when feeding from systems without full streaming support such as Kafka and DynamoDB.

We welcome your contributions and feedback (tweet or email) about any of these new features or future improvements you’d like to see.

Vespa Product Updates, January 2020

Kristian Aune

Head of Customer Success, Vespa

In the December product update,
we mentioned improved ONNX support,
new rank feature attributeMatch().maxWeight,
free lists for attribute multivalue mapping,
faster updates for out-of-sync documents,
and ZooKeeper 3.5.6.

This month, we’re excited to share the following updates:

Tensor Functions

The tensor language has been extended with functions to allow the representation of very complex neural nets, such as BERT models, and better support for working with mapped (sparse) tensors:

  • Slice
    makes it possible to extract values and subspaces from tensors.
  • Literal tensors
    make it possible to create tensors on the fly, for instance from values sliced out of other tensors
    or from a list of scalar attributes or functions.
  • Merge
    produces a new tensor from two mapped tensors of the same type,
    where a lambda to resolve is invoked only for overlapping values.
    This can be used, for example, to supply default values which are overridden by an argument tensor.
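As a sketch in the tensor expression language (tensor, attribute and query-value names are hypothetical):

```
# slice: extract the value at label "b" from a mapped tensor t
t{x:"b"}

# literal tensor: create a tensor on the fly from attributes or query values
tensor(x{}):{a: attribute(price), b: query(bonus)}

# merge: combine two mapped tensors; the lambda resolves overlapping cells,
# here letting values in overrides win over those in defaults
merge(defaults, overrides, f(l, r)(r))
```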

New Sizing Guides

Vespa is used for applications with high performance or cost requirements.
New sizing guides for queries and writes
are now available to help teams use Vespa optimally.

Performance Improvement for Matched Elements in Map/Array-of-Struct

As maps or arrays in documents can often grow large,
applications use matched-elements-only
to return only matched items. This also simplifies application code.
Performance for this feature is now improved – for example, an array or map with 20,000 elements is now 5x faster.
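In the schema, this is enabled per field; a minimal sketch with a hypothetical map field:

```
field my_map type map<string, string> {
    indexing: summary
    summary: matched-elements-only
    struct-field key   { indexing: attribute }
    struct-field value { indexing: attribute }
}
```

Only the map entries that matched the query are then returned in the document summary.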

Boolean Field Query Optimization

Applications with strict latency requirements that use boolean fields under concurrent feed and query load see reduced latency since Vespa 7.165.5, due to an added bitCount cache. For example, we measured a latency improvement from 3 ms to 2 ms for an application with a 30k write rate. Details in #11879.

Bug fixes / errata

Regression introduced in Vespa 7.141 may cause data loss or inconsistencies when using ‘create: true’ updates

There exists a regression introduced in Vespa 7.141 where updates marked as create: true (i.e. create if missing) may cause data loss or undetected inconsistencies in certain edge cases. This regression was introduced as part of an optimization effort to greatly reduce the common-case overhead of updates when replicas are out of sync.

Fixed in Vespa 7.157.9 and beyond. If you are running a version affected (7.141 up to and including 7.147) you are strongly advised to upgrade.

See #11686 for details.

About Vespa: Largely developed by Yahoo engineers,
Vespa is an open source big data processing and serving engine.
It’s in use by many products, such as Yahoo News, Yahoo Sports, Yahoo Finance, and the Verizon Media Ad Platform.
Thanks to feedback and contributions from the community, Vespa continues to grow.

We welcome your contributions and feedback (tweet
or email) about any of these new features or future improvements you’d like to request.

Vespa Product Updates, January 2021

Kristian Aune

Head of Customer Success, Vespa

In the previous update,
we mentioned Improved Tensor Ranking Performance, Apache ZooKeeper Integration, Vespa Python API for Researchers and ONNX Integration.

Subscribe to the mailing list to get these updates delivered to your inbox.

This month, we’re excited to share the following updates:

Automatic Reindexing

When the indexing pipeline of a Vespa application changes
(index script / index mode, or linguistics libraries),
Vespa can automatically reprocess stored data
such that the index is updated according to the new specification.
Reindexing can be triggered and inspected for an application’s full corpus, for only certain content clusters,
or for only certain document types in certain clusters, using the new reindex endpoint.
This eliminates the need for data re-feed and makes it easier to improve the application’s relevance.
Read more.

Tensor Optimizations

Sparse tensor dot product performance has improved with a new optimized implementation.
In tests on a single node with 9M passages using ColBERT
(as in vespa-engine/vespa#15854),
this cut latency by 64% and hence tripled query throughput.

Query Profile Variant Initialization Speedup

Query profiles are used to store query variables in configuration.
In some applications, it is convenient to allow the values in query profiles to vary
depending on variables input in the query.
E.g., a query profile can contain values that depend on the market in which the request originated,
the device model, and the bucket in question.
With many dimensions, the space of possible combinations grows huge.
With vespa-engine/vespa#15969,
container query profile configuration loads 10x faster for an extreme use case with variants in many dimensions.
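A sketch of a query profile with variants (dimension names and values are hypothetical):

```
<query-profile id="default">
    <dimensions>market, device, bucket</dimensions>

    <!-- default value -->
    <field name="hits">10</field>

    <!-- override for requests matching these dimension values -->
    <query-profile for="us, phone, *">
        <field name="hits">5</field>
    </query-profile>
</query-profile>
```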

Explainlevel Query Parameter

Use the new explainlevel
query parameter to trace query execution in Vespa.
With this, you can see the query plan used in the matching and ranking engine –
use this for low level debugging of query execution.
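For illustration, a hypothetical search request with the parameter set; the endpoint, document source, and YQL below are placeholders:

```python
from urllib.parse import urlencode

# Build a query URL with explainlevel enabled; endpoint and YQL
# are placeholders for illustration only.
params = {
    "yql": 'select * from sources * where text contains "vespa"',
    "explainlevel": 1,  # include the query plan used by the matching engine
    "tracelevel": 1,    # include regular query tracing as well
}
url = "http://localhost:8080/search/?" + urlencode(params)
print(url)
```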

PR System Testing

The Vespa team loves contributions!
However, all pull request checks must pass.
Since January 6, you can invoke system testing from pull requests.
If you have made changes involving the config model, OSGi bundles or dependency injection,
we require that the pull request is created with [run-systemtest] in the title.
This will run an extended test suite as part of the checks.
Read more in contributing.


Vespa Newsletter, January 2022

Kristian Aune

Head of Customer Success, Vespa

In the previous update,
we mentioned Tensor performance improvements, Match features and the Vespa IntelliJ plugin.
Today, we’re excited to share the following updates:

Faster node recovery and re-balancing

When Vespa content nodes are added or removed,
data is auto-migrated between nodes
to maintain the configured data distribution.
The throughput of this migration is throttled to avoid impact on regular query and write traffic.
We have worked to improve this throughput by using available resources better,
and since November we have been able to approximately double it –
read the blog post.

Reindexing speed

Most schema changes in Vespa take effect immediately,
but some require reindexing.
Reindexing the corpus takes time and consumes resources.
It is now possible to configure how fast to reindex in order to balance this tradeoff,
see reindex speed.
Read more about schema changes.


pyvespa

pyvespa 0.14.0 is released with the following changes:

  • Add retry strategy to delete_data,
    get_data and update_data (#222).
  • Deployment parameter disk_folder defaults to the current working directory for both Docker and Cloud deployments.
  • The Vespa connection now accepts cert and key as separate arguments.
    Using both certificate and key values in the cert file continues to work as before.

Explore the new text-image
and text-video sample applications with pyvespa,
and read more about pyvespa.

Improved support for Weak And and unstructured user input

You can now use type=weakAnd in the Query API.
Used with userInput,
it is easy to create a weakAnd query
from unstructured input data, for a better relevance/performance tradeoff compared to all or any queries.
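For illustration, a hypothetical request built in Python; type=weakAnd tells Vespa to parse the user input as a weakAnd query rather than all/any (the YQL and input text are placeholders):

```python
from urllib.parse import urlencode

# Build the query string for a weakAnd query from free-form user text.
params = {
    "yql": "select * from sources * where userInput(@text)",
    "text": "best pizza in oslo",
    "type": "weakAnd",  # parse the unstructured input as weakAnd
}
query_string = urlencode(params)
print(query_string)
```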


Semantic rules

Semantic Rules now have better support for making synonym expansion rules through the * operator,
see #20386,
and proper stemming in multiple languages,
see Semantic Rules directives.
Read more about query rewriting.

Language detection

If no language is explicitly set in a document or a query, and stemming/nlp tokenization is used,
Vespa will run a language detector on the available text.
Since Vespa 7.518.53, the default has changed from Optimaize to OpenNLP.
Read more.

New blog posts

  • ML model serving at scale
    is about model serving latency and concurrency,
    and is a great primer on inference threads, intra-operation threads and inter-operation threads.
  • Billion-scale knn part two
    goes in detail on tensor vector precision types, memory usage, precision and performance
    for both nearest neighbor and approximate nearest neighbor search.
    Also learn how HNSW works with number of links in the graph and neighbors to explore at insert time,
    and how this affects precision.

Vespa Newsletter, January 2023

Kristian Aune

Head of Customer Success, Vespa

It’s a busy winter at the Vespa HQ. We are working on some major new features,
which will be announced soon, but we’re also finding the time to make smaller improvements – see below!

Interested in search ranking? Don’t miss these blog posts

We have done some deep diving into using machine learning to improve ranking in search applications lately,
and of course, we’re blogging and open-sourcing all the details to make it easy for you to build on what we are doing.
See these recent blog posts:

New Vespa improvements

In the previous update,
we mentioned ANN pre-filter performance, parent field hit estimates,
model training notebooks, and Vespa Cloud GCP Support.
This time, we have the following improvements:

Simpler tensor JSON format

Since Vespa 8.111, Vespa allows tensor field values to be written in JSON
without the intermediate map containing “blocks”, “cells” or “values”.
The tensor type will then dictate the format of the tensor content.
Tensors can also be returned in this format in results and model evaluation
by requesting the format short-value –
see the tensor format documentation.
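For illustration, the same values in the verbose and the short form (field and type are hypothetical): for an indexed type like tensor(x[3]) the short form is a plain array, and for a mapped type like tensor(x{}) a plain object.

```python
import json

# Verbose form: a wrapper object naming the layout explicitly
verbose_indexed = {"values": [1.0, 2.0, 3.0]}

# Short form since Vespa 8.111: the tensor type dictates the layout,
# so the wrapper can be dropped
short_indexed = [1.0, 2.0, 3.0]      # tensor(x[3])
short_mapped = {"a": 1.0, "b": 2.0}  # tensor(x{})

print(json.dumps(short_indexed))
```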

Supplying values for missing fields during indexing

Vespa allows you to add fields outside the “document” section in the schema configuration
that get their values from fields in the document.
For example, you can add a vector embedding of a title and description field like this:

field myEmbedding type tensor(x[128]) {
    indexing: input title . " " . input description | embed | attribute
}

But what if descriptions are sometimes missing?
Then Vespa won’t produce an embedding value at all, which may not be what you want.
From 8.116, you can specify an alternative value for expressions that don’t produce a value
using the || syntax:

field myEmbedding type tensor(x[128]) {
    indexing: input title . " " . (input description || "") | embed | attribute
}

Vespa Cloud: AWS PrivateLink

Since January 31, it is possible to set up private connectivity between a customer’s VPC
and their Vespa Cloud application using AWS PrivateLink.
This provides clients safe, non-public access to their applications
using private IPs accessible from within their own VPCs –
read more.

Content node performance

Vespa content nodes store the data written to Vespa, maintain indexes over it, and run matching and ranking.
Most applications spend the majority of their hardware resources on content nodes.

  • Improving query performance is made easier with new match phase profiling
    since Vespa 8.114. This gives insight into what’s most costly in matching your queries (ranking was already supported).
    Read more at phased-ranking.html.
  • Since Vespa 8.116, Vespa requires minimum
    Haswell microarchitecture.
    A more recent optimization target enables better optimizations and, in some cases, gives 10-15% better ranking performance.
    It is still possible to run on older microarchitectures, but then you must compile from source;
    see #25693.

Vespa start and stop script improvements

Vespa runs in many environments, from various self-hosted technology stacks to Vespa Cloud –
see multinode-systems
and basic-search-on-gke.
To support running as a non-root user inside containers with better debug support,
the vespa start/stop-scripts are now refactored and simplified –
this will also make Vespa start/stop snappier in some cases.

Container Performance and Security

With Vespa 8.111, Vespa upgraded its embedded Jetty server from version 9.x to 11.0.13.
The upgrade increases performance in some use cases, mainly when using HTTP/2,
and also includes several security fixes provided with the Jetty upgrade.

Log settings in services.xml

During debugging, it is useful to be able to tune which messages end up in the log,
especially when developing custom components.
This can be done with the vespa-logctl tool on each node.
Since Vespa 8.100, you can also control log settings in services.xml –
see logging.
This is also very convenient when deploying on Vespa Cloud.

Vespa Cloud: Autoscaling with multiple groups

When allocating resources on Vespa Cloud
you can specify both the number of nodes and node groups you want in content clusters
(each group has one or more complete copies of all the data and can handle a query independently):

<nodes count="20" groups="2">

If you want the system to automatically find the best values for the given load, you can configure ranges:

<nodes count="[10, 30]" groups="[1, 3]">

This might lead to groups of sizes from 4 to 30, which may be fine,
but sometimes you want to control the size of groups instead.
From 8.116, you can configure group size instead of (or in addition to) the number of groups:

<nodes count="[10, 30]" group-size="10">

Like the other values, group-size can also be a range.
See the documentation.

In addition to choosing resources, a content cluster must also be configured with a redundancy –
the number of copies to keep of each piece of data in each group.
With variable groups this may cause you to have more copies than you strictly need to avoid data loss,
so since 8.116, you can instead configure the minimum redundancy:

<min-redundancy>2</min-redundancy>

The system will then ensure you have at least this many copies of the data,
but not make more copies than necessary in each group.

Vespa Cloud: Separate read/write data plane access control

When configuring the client certificate to use for your incoming requests (data plane) on Vespa Cloud,
you can now specify whether each certificate should have read- or write-access or both.
This allows you to, e.g., use one certificate for clients with read access, while having another –
perhaps less widely distributed – certificate for write access.
See the Security Guide
for more details on how to configure it.

Thanks for reading! Try out Vespa on Vespa Cloud
or grab the latest release and run it yourself! 😀