Vespa Product Updates, June 2020

Kristian Aune

Head of Customer Success, Vespa


In the May updates,
we mentioned Improved Slow Node Tolerance, Multi-Threaded Rank Profile Compilation, Reduced Peak Memory at Startup, Feed Performance Improvements and Increased Tensor Performance.

This month, we’re excited to share the following updates:

Approximate Nearest Neighbor Search

Vespa now supports approximate nearest neighbor search, which can be combined with filters and text search.
By using a native implementation of the HNSW algorithm,
Vespa provides state-of-the-art vector search performance:
typical single-digit-millisecond response times while searching hundreds of millions of documents per node.
Uniquely, it also allows vector query operators to be combined efficiently with filters and text search –
which is usually a requirement for real-world applications such as text search and recommendation.
Vectors can be updated in real time with a sustained write rate of a few thousand vectors per node per second.
Read more in the documentation on nearest neighbor search.
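As a sketch of what such a combined query can look like: the snippet below builds the HTTP query parameters for an ANN search with a filter. The field names (`image_embedding`, `market`), tensor name, and vector are hypothetical; check the Vespa query documentation for the exact operator syntax.

```python
# Build a Vespa query combining the nearestNeighbor operator with a
# regular filter. Names and values are illustrative, not from a real schema.
from typing import Optional

def build_ann_query(field: str, query_tensor: str, target_hits: int,
                    filter_expr: Optional[str] = None) -> dict:
    """Build HTTP query parameters for a (filtered) ANN search."""
    ann = f'([{{"targetHits": {target_hits}}}]nearestNeighbor({field}, {query_tensor}))'
    where = f"{ann} and {filter_expr}" if filter_expr else ann
    return {
        "yql": f"select * from sources * where {where};",
        f"ranking.features.query({query_tensor})": "[0.1, 0.2, 0.3]",  # example vector
    }

params = build_ann_query("image_embedding", "q_vec", 10,
                         filter_expr='market contains "us"')
print(params["yql"])
```

The same builder works without the filter argument for a pure vector search.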

Streaming Search Speedup

Streaming Search is a feature unique to Vespa.
It is optimized for use cases like personal search and e-mail search –
but is also useful in high-write applications querying a fraction of the total data set.
With #13508,
read throughput from storage increased by up to 5x due to better parallelism.

Rank Features

  • The (Native)fieldMatch rank features are optimized to use less CPU at query time, improving query latency for
    Text Matching and Ranking.
  • The new globalSequence rank feature provides an inexpensive global ordering of documents in a system with a stable
    system state. For a system where node indexes change, this ordering is inaccurate.
    See the globalSequence documentation for alternatives.

GKE Sample Application

Thank you to Thomas Griseau for contributing a new sample application
for Vespa on GKE,
which is a great way to start using Vespa on Kubernetes.


About Vespa: Largely developed by Yahoo engineers,
Vespa is an open source big data processing and serving engine.
It’s in use by many products, such as Yahoo News, Yahoo Sports, Yahoo Finance, and the Verizon Media Ad Platform.
Thanks to feedback and contributions from the community, Vespa continues to grow.

We welcome your contributions and feedback (tweet
or email) about any of these new features or future improvements you’d like to request.

Subscribe to the mailing list for more frequent updates!

Vespa Newsletter, June 2022

Kristian Aune

Head of Customer Success, Vespa


In the previous update,
we mentioned tensor formats, grouping improvements, new query guides,
modular rank profiles and pyvespa docker image and deployments.
Today, we’re excited to share the following updates:

Vespa 8

Vespa 8 is released. Vespa is now on Java 17 and
CentOS Stream 8.
Read more about what this means for you in the blog post.

Pre/Post ANN filtering support

Approximate Nearest Neighbor is a popular feature in Vector Search applications, also supported in Vespa.
Vespa has built-in support for combining ANN search with filters,
such as “similar articles to this, in the US market, not older than 14 days”.
From Vespa 7.586.113, users can configure whether to use pre- or post-filtering, with thresholds.
This enables a much better toolset to trade off precision with performance, i.e. balance cost and quality.
Read more in constrained-approximate-nearest-neighbor-search.
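As a sketch of how this can be configured, assuming the `post-filter-threshold` and `approximate-threshold` rank-profile settings described in the linked documentation (the threshold values below are illustrative, not recommendations):

```
rank-profile filtered_ann inherits default {
    # Use post-filtering when the filter is estimated to match more
    # than 75% of the documents (illustrative value)
    post-filter-threshold: 0.75
    # Fall back to exact nearest neighbor search when the filter
    # matches fewer than 5% of the documents (illustrative value)
    approximate-threshold: 0.05
}
```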

Fuzzy matching

Thanks to alexeyche, Vespa supports fuzzy query matching since 7.585 –
a user typing “spageti” will now match documents with “spaghetti”.
This is implemented using Levenshtein edit distance search –
e.g. one must make two “edits” (one-character changes) to make “spaghetti” from “spageti”.
Find the full contribution in #21689 and documentation at
query-language-reference.html#fuzzy.
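The fuzzy matching above is based on Levenshtein edit distance; here is a plain-Python sketch of that metric (not Vespa's actual implementation) showing that “spageti” is two edits away from “spaghetti”:

```python
# Dynamic-programming Levenshtein edit distance: the minimum number of
# one-character edits (insert, delete, substitute) to turn a into b.
def levenshtein(a: str, b: str) -> int:
    prev = list(range(len(b) + 1))   # distances from "" to each prefix of b
    for i, ca in enumerate(a, 1):
        cur = [i]                    # distance from a[:i] to ""
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # deletion
                cur[j - 1] + 1,            # insertion
                prev[j - 1] + (ca != cb),  # substitution (free if equal)
            ))
        prev = cur
    return prev[-1]

print(levenshtein("spageti", "spaghetti"))  # 2 insertions: "h" and a second "t"
```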

pyvespa

pyvespa 0.22 introduces an experimental ranking module
to support learning to rank tasks that can be applied to
data collected from Vespa applications containing ranking features.
It starts with a listwise ranking framework based on TensorFlow Ranking, covering data pipelines,
model fitting and feature selection algorithms.

Embedding support

A common technique in modern big data serving applications is to map the subject data – say, text or images –
to points in an abstract vector space and then do computation in that vector space.
For example, retrieve similar data by finding nearby points in the vector space,
or using the vectors as input to a neural net.
This mapping is usually referred to as embedding –
read more about Vespa’s built-in support.
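To illustrate the idea: map items to points in a vector space, then retrieve similar items by finding nearby points. The toy 3-d vectors below stand in for real embedding-model output.

```python
# Toy vector-space retrieval: find the corpus document whose embedding
# is closest (by cosine similarity) to the query embedding.
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

corpus = {                      # hypothetical document embeddings
    "pasta recipe": [0.9, 0.1, 0.0],
    "noodle dish":  [0.8, 0.2, 0.1],
    "stock market": [0.0, 0.1, 0.9],
}

query = [0.85, 0.15, 0.05]      # embedding of, say, "spaghetti dinner"
best = max(corpus, key=lambda doc: cosine_similarity(query, corpus[doc]))
print(best)
```

At scale, this nearest-point lookup is exactly what the ANN/HNSW support described earlier accelerates.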

Tensors and ranking

fast-rank
enables ranking expression evaluation without deserialization, decreasing latency at the expense of increased memory use.
It is supported for tensor field types with at least one mapped dimension.

Tensor short format
is now supported in the /document/v1 API.
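As a sketch, assuming a hypothetical mapped tensor field `my_weights` of type `tensor<float>(category{})`, the short format lets a /document/v1 payload write the cells directly as an object:

```python
# Build a /document/v1 PUT body using the short tensor format for a
# mapped tensor. Schema and field names are hypothetical.
import json

doc = {
    "fields": {
        "my_weights": {
            # short format: mapped cells as a key/value object, instead
            # of the verbose list of {"address": ..., "value": ...} cells
            "cells": {"sports": 0.8, "finance": 0.2}
        }
    }
}

body = json.dumps(doc)
print(body)
```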

Support for importing ONNX models in rank profiles has been added.
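A minimal schema sketch of what this enables (model path, model name, and the ranking expression are hypothetical; real models also bind their inputs to query or attribute tensors):

```
rank-profile with_onnx {
    # Hypothetical ONNX model declared inside the rank profile
    onnx-model my_model {
        file: models/my_model.onnx
    }
    first-phase {
        expression: sum(onnx(my_model))
    }
}
```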

Blog posts and training videos

Find great Vespa blog posts on
constrained ANN-search,
hybrid billion scale vector search,
and Lester Solbakken + Jo Kristian Bergum at the
Berlin Buzzwords conference –
follow Jo Kristian for industry-leading commentary.

New training videos for Vespa startup troubleshooting and auto document redistribution
are available at vespa.ai/resources:

  • Vespa.ai: Troubleshooting startup, singlenode
  • Vespa.ai: Troubleshooting startup, multinode
  • Vespa.ai: Bucket distribution - intro