Vespa Product Updates, December 2018: ONNX Import and Map Attribute Grouping

Hi Vespa Community!

Today we’re kicking off a blog post series of need-to-know updates on Vespa, summarizing the features and fixes detailed in GitHub issues.

We welcome your contributions and feedback about any new features or improvements you’d like to see.

For December, we’re excited to share the following product news:

Streaming Search Performance Improvement
Streaming Search is a solution for applications where each query searches only a small, statically determined subset of the corpus. In this case, Vespa searches without building inverted indexes, reducing storage cost and making writes more efficient. With the latest changes, the document type is used to further limit data scanning, resulting in lower latencies and higher throughput. Read more here.
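Below is a minimal sketch of how such a query can be issued over HTTP, assuming a local Vespa instance on port 8080; the group name, endpoint and YQL are illustrative assumptions, not from the original post.

    # Sketch of a streaming-search query with Python's requests library.
    # The group name "user42" and the YQL are hypothetical.
    import requests

    response = requests.get(
        "http://localhost:8080/search/",
        params={
            "yql": 'select * from sources * where default contains "shoes"',
            # Restrict the query to one group of documents; with this
            # update, the document type further limits data scanning.
            "streaming.groupname": "user42",
        },
    )
    print(response.json()["root"].get("children", []))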

ONNX Integration
ONNX is an open ecosystem for interchangeable AI models. Vespa now supports importing models in the ONNX format and transforming them into tensors for use in ranking. This adds to the TensorFlow import included earlier this year and allows Vespa to support many training tools. While Vespa’s strength is real-time model evaluation over large datasets, you can try the stateless model evaluation API to get started with single data points. Explore this integration further in Ranking with ONNX models.
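As a hedged illustration of the stateless model evaluation API mentioned above, the sketch below lists and evaluates models over HTTP; the endpoint layout, the model name "my_model" and its input name "input" are assumptions.

    # Sketch of the stateless model evaluation REST API with Python's
    # requests library; model and input names are hypothetical.
    import requests

    # List the models known to the container.
    print(requests.get("http://localhost:8080/model-evaluation/v1/").json())

    # Evaluate one model on a single data point, passing the input as a
    # Vespa tensor literal (here for a tensor with dimensions d0[1],d1[3]).
    result = requests.get(
        "http://localhost:8080/model-evaluation/v1/my_model/eval",
        params={"input": "{{d0:0,d1:0}:0.1,{d0:0,d1:1}:0.2,{d0:0,d1:2}:0.3}"},
    )
    print(result.json())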

Precise Transaction Log Pruning
Vespa is built for large applications running continuous integration and deployment. This means nodes restart often for software upgrades, and node restart time matters. A common pattern is to keep serving while restarting hosts one by one. Vespa has optimized transaction log pruning with prepareRestart, which flushes as much as possible to disk before stopping; this is quicker than replaying the same data from the transaction log after restarting. This feature is on by default. Learn more in live upgrade and prepareRestart.

Grouping on Maps
Grouping is used to implement faceting. Vespa has added support for grouping on map attribute fields, creating a group for values whose keys match a specified key, or for field values referenced by the key. This support is useful for creating indirections and relations in data and is great for use cases with structured data, such as e-commerce. Leverage key values instead of field names to simplify the search definition. Read more in Grouping on Map Attributes.
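For illustration, the sketch below sends a grouping request over a map attribute via the query API; the document type "product", the map field "properties" and the key "color" are hypothetical.

    # Sketch of grouping on a map attribute, assuming a schema with a
    # map<string,string> attribute "properties"; all names are hypothetical.
    import requests

    yql = ('select * from sources * where sddocname contains "product" '
           '| all(group(properties{"color"}) each(output(count())))')
    response = requests.get("http://localhost:8080/search/", params={"yql": yql})
    for group in response.json()["root"].get("children", []):
        print(group)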

Questions or suggestions? Send us a tweet or an email.

Vespa Product Updates, December 2019: Improved ONNX support, New rank feature attributeMatch().maxWeight, Free lists for attribute multivalue mapping, Faster updates for out-of-sync documents, ZooKeeper 3.5.6

Kristian Aune

Head of Customer Success, Vespa


In the November Vespa product update, we mentioned Nearest Neighbor and Tensor Ranking, Optimized JSON Tensor Feed Format, Matched Elements in Complex Multi-value Fields, Large Weighted Set Update Performance and Datadog Monitoring Support.

Today, we’re excited to share the following updates:

Improved ONNX Support

Vespa has added more operations to its ONNX model API, such as General Matrix Multiplication (GEMM) –
see the list of supported opsets.
Vespa has also improved support for PyTorch through ONNX;
see the pytorch_test.py example.
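As a sketch of the PyTorch-to-ONNX path (the model below is ours for illustration, not taken from pytorch_test.py), a model can be exported with torch.onnx.export and then imported by Vespa:

    # Export a small PyTorch model to ONNX for import into Vespa; the
    # architecture and file name are illustrative only.
    import torch

    class TinyRanker(torch.nn.Module):
        def __init__(self):
            super().__init__()
            self.linear = torch.nn.Linear(3, 1)

        def forward(self, x):
            return torch.sigmoid(self.linear(x))

    model = TinyRanker()
    dummy_input = torch.randn(1, 3)  # fixes the input shape for the export
    torch.onnx.export(model, dummy_input, "tiny_ranker.onnx")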

New Rank Feature attributeMatch().maxWeight

attributeMatch(name).maxWeight was added in Vespa-7.135.5. The value is the maximum weight of the matched keys in a weighted set attribute.
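One way to inspect the new feature is to ask Vespa to return all computed rank feature values with each hit; in the sketch below, the weighted set attribute "tags" is a hypothetical example.

    # Sketch of inspecting attributeMatch(tags).maxWeight, assuming a
    # weighted set attribute "tags"; field names are hypothetical.
    import requests

    response = requests.get(
        "http://localhost:8080/search/",
        params={
            "yql": 'select * from sources * where tags contains "news"',
            # Return all computed rank features in the "rankfeatures" field.
            "ranking.listFeatures": "true",
        },
    )
    hit = response.json()["root"]["children"][0]
    print(hit["fields"]["rankfeatures"]["attributeMatch(tags).maxWeight"])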

Free Lists for Attribute Multivalue Mapping

Since Vespa-7.141.8, multivalue attributes use free lists to improve performance. This reduces CPU usage (compaction jobs are no longer needed) and memory use by approximately 10%. The change primarily benefits applications with a high update rate to such attributes.

Faster Updates for Out-of-Sync Documents

Vespa handles replica consistency using bucket checksums. Updating a document can be cheaper than putting a new one, due to fewer updates to posting lists. For updates to documents in inconsistent buckets, a GET-UPDATE is now used instead of a GET-PUT whenever the document to update is consistent across replicas. This is the common case when only a subset of the documents in the bucket are out of sync. It is useful for applications with high update rates that update multi-value fields with large sets. Explore the details here.
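The sketch below shows the kind of partial update that benefits, using the /document/v1 API; the namespace, document type and field name are hypothetical.

    # Sketch of a partial update to a weighted set field via /document/v1;
    # namespace, document type and field names are hypothetical.
    import requests

    response = requests.put(
        "http://localhost:8080/document/v1/mynamespace/product/docid/123",
        json={
            "fields": {
                # Add (or overwrite) one key in a large weighted set
                # instead of re-putting the whole document.
                "tag_weights": {"add": {"sale": 10}}
            }
        },
    )
    print(response.json())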

ZooKeeper 3.5.6

Vespa now uses Apache ZooKeeper 3.5.6 and can encrypt communication between ZooKeeper servers.

About Vespa: Largely developed by Yahoo engineers, Vespa is an open source big data processing and serving engine. It’s in use by many products, such as Yahoo News, Yahoo Sports, Yahoo Finance, and the Verizon Media Ad Platform. Thanks to feedback and contributions from the community, Vespa continues to grow.

We welcome your contributions and feedback (tweet or email) about any of these new features or future improvements you’d like to request.

Vespa Product Updates, December 2020

Kristian Aune

Head of Customer Success, Vespa


In the previous update,
we mentioned new Container Thread Pools and Feed Throughput improvements.

Subscribe to the mailing list to get these updates delivered to your inbox.

This month, we’re excited to share the following updates:

Tensor Performance Improvements

Vespa 7.319.17 and onwards include new optimizations for
tensors with sparse dimensions.
We have implemented new memory structures to represent sparse and mixed tensors
and a new pipeline for evaluating tensor operations.
This has enabled applications to deploy new advanced ranking models using mixed tensors in production.
An example is a use case where end-to-end average latency went from 135 ms to 13 ms, a 10x speedup.
When measuring the latency of only mixed tensor operations, the speedup is 150x.
Latency improvement for basic sparse tensor operations is around 40%,
while more advanced sparse tensor operations have a speedup of up to 50x.

Vespa Container Apache ZooKeeper Integration

Vespa allows you to add custom Java components for query and document processing.
If this code needs a shared lock across servers in a cluster,
you can now configure a container cluster to run an embedded ZooKeeper cluster
and access it through an injected component.
Read more

Pyvespa

pyvespa is a Python library created to enable faster prototyping
and facilitate Machine Learning experiments for Vespa applications.
The library is under active development and ready for trial usage.
Please give it a try and help the Vespa team improve it through feedback and contributions.
Read more
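As a minimal sketch of trying it out (the endpoint and YQL are illustrative, and the library API may have evolved since this was written):

    # Query a running Vespa application with pyvespa.
    from vespa.application import Vespa

    app = Vespa(url="http://localhost", port=8080)
    result = app.query(body={
        "yql": 'select * from sources * where default contains "vespa"',
        "hits": 5,
    })
    print(result.hits)  # matching hits as a list of dicts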

ONNX Runtime

To increase Vespa’s capacity for evaluating large models,
both in performance and model types supported,
Vespa has integrated ONNX Runtime.
This makes it easier to use both Vespa and ONNX, as no model conversion is needed.
See the blog post for details.


About Vespa: Largely developed by Yahoo engineers,
Vespa is an open source big data processing and serving engine.
It’s in use by many products, such as Yahoo News, Yahoo Sports, Yahoo Finance, and the Verizon Media Ad Platform.
Thanks to feedback and contributions from the community, Vespa continues to grow.

Vespa Newsletter, December 2021

Kristian Aune

Head of Customer Success, Vespa


In the previous update,
we mentioned schema inheritance, improved data dump performance,
“true” query item, faster deployments and Hamming distance for ranking.
This time, we have the following updates:

Tensor performance improvements

Since Vespa 7.507.67, Euclidean distance calculations using int8 are 250% faster, using HW-accelerated instructions.
This speeds up feeding to HNSW-based indices and reduces latency for nearest neighbor queries.
This is relevant for applications with large data sets per node – using int8 instead of float uses 4x less memory,
and the performance improvement is measured to bring us to 10k puts/node when using HNSW.
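As a heavily hedged sketch of the kind of query that benefits, the snippet below issues a nearest-neighbor search against an int8 embedding field with an HNSW index; the field name, tensor type, rank profile and annotation syntax are all assumptions for illustration.

    # Sketch of a nearestNeighbor query over a hypothetical
    # tensor<int8>(x[4]) attribute "embedding" indexed with HNSW.
    import requests

    yql = ('select * from sources * where '
           '{targetHits: 10}nearestNeighbor(embedding, q)')
    response = requests.get(
        "http://localhost:8080/search/",
        params={
            "yql": yql,
            # Query tensor as a Vespa tensor literal matching the field type.
            "ranking.features.query(q)": "{{x:0}:1,{x:1}:2,{x:2}:3,{x:3}:4}",
            "ranking": "ann_profile",  # hypothetical rank profile
        },
    )
    print(response.json()["root"].get("children", []))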

With Vespa 7.514.11, tensor field memory alignment for types <= 16 bytes is optimized.
For example, a 104-bit (13-byte) int8 tensor field is now aligned at 16 bytes instead of 32, a 2x improvement.
Query latency might improve too, due to lower memory bandwidth use.

Refer to #20073 Representing SPANN with Vespa
for details on this work, and also see
Bringing the neural search paradigm shift to production
from the London Information Retrieval Meetup Group.

Match features

Any Vespa rank feature or function output can be returned along with regular document fields by adding it to the list of
summary-features of the rank profile.
If a feature is both used for ranking and returned with results,
it is re-calculated by Vespa when fetching the document data of the final result,
as this happens after the global merge of matched and scored documents.
This can be wasteful when these features are the output of complex functions such as a neural language model.

The new match-features setting
allows you to configure features that are returned from content nodes
as part of the document information returned before merging the global list of matches.
This avoids re-calculating such features when serving results
and makes it possible to use them as inputs to a (third) re-ranking phase evaluated over the globally best-ranking hits.
Furthermore, calculating match-features is part of the
multi-threaded matching and ranking execution on the content nodes,
while features fetched with summary-features are calculated single-threaded.
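For illustration, match features arrive with each hit and can be read directly from the result; in this sketch the rank profile name is hypothetical, and the features are assumed to be returned in the hit's matchfeatures summary field.

    # Sketch of reading match features from query results.
    import requests

    response = requests.get(
        "http://localhost:8080/search/",
        params={
            "yql": 'select * from sources * where default contains "vespa"',
            "ranking": "my_profile",  # profile configuring match-features
        },
    )
    for hit in response.json()["root"].get("children", []):
        print(hit["fields"].get("matchfeatures"))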

Vespa IntelliJ plugin

Shahar Ariel has created an IntelliJ plugin for editing schema files,
find it at docs.vespa.ai/en/schemas.html#intellij-plugin.
Thanks a lot for the contribution!