Vespa Product Updates, March 2019: Tensor updates, Query tracing and coverage

Kristian Aune

Head of Customer Success, Vespa


In last month’s Vespa update, we mentioned Boolean Field Type, Environment Variables, and Advanced Search Core Tuning. Largely developed by Yahoo engineers, Vespa is an open source big data processing and serving engine. It’s in use by many products, such as Yahoo News, Yahoo Sports, Yahoo Finance, and Oath Ads Platforms. Thanks to feedback and contributions from the community, Vespa continues to grow.

This month, we’re excited to share the following updates with you:

Tensor updates

Easily update individual tensor cells: adding, removing, and modifying cells is now supported. This enables high-throughput, continuous updates, as tensor values can be changed without writing the full tensor.
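A partial tensor update is expressed as a regular document update. Below is a minimal sketch of building such a payload, assuming a hypothetical tensor field named `user_profile` with a mapped dimension `category` (the `add` and `remove` operations take cells and addresses in the same style):

```python
import json

def modify_cell_update(field, address, value):
    """Build a /document/v1/ update payload that replaces one tensor cell."""
    return {
        "fields": {
            field: {
                "modify": {
                    "operation": "replace",  # replace the value of matching cells
                    "cells": [{"address": address, "value": value}],
                }
            }
        }
    }

payload = modify_cell_update("user_profile", {"category": "sports"}, 0.8)
print(json.dumps(payload))
```

Sending this as the body of a document update touches only the addressed cell, which is what makes high write throughput possible.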

Advanced Query Trace

Query tracing now includes matching and ranking execution information from content nodes.
This Query Explain feature is useful for performance optimization.
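Trace output is requested per query. A minimal sketch, assuming an illustrative `title` field and using the `tracelevel` query parameter to control how much execution detail is returned:

```python
from urllib.parse import urlencode

# Build the query string for a traced Vespa query.
params = {
    "yql": 'select * from sources * where title contains "vespa"',
    "tracelevel": 5,  # higher levels include more execution detail in the response
    "hits": 10,
}
query_string = urlencode(params)
print(query_string)
```

The resulting string would be appended to the `/search/` endpoint; the trace appears in the JSON response alongside the hits.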

Search coverage in access log

Search coverage is now available in the access log. This enables operators to track the fraction of queries that are degraded with lower coverage. Vespa has features to gracefully reduce query coverage in overload situations, and it is now easier to track this. Search coverage is a useful signal for when to reconfigure or increase the capacity of the application. Explore the access log documentation to learn more.
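As a sketch of how an operator might use this, the snippet below scans JSON access-log lines and collects queries answered with reduced coverage. The exact field layout here is illustrative, not authoritative; check the access log documentation for the real schema:

```python
import json

# Hypothetical access-log lines with a coverage percentage per query.
log_lines = [
    '{"uri": "/search/?query=a", "coverage": {"coverage": 100}}',
    '{"uri": "/search/?query=b", "coverage": {"coverage": 87}}',
]

# Collect the URIs of queries that were answered with degraded coverage.
degraded = [
    json.loads(line)["uri"]
    for line in log_lines
    if json.loads(line)["coverage"]["coverage"] < 100
]
print(degraded)  # queries answered with reduced coverage
```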

We welcome your contributions and feedback (tweet or email) about any of these new features or future improvements you’d like to request.

Vespa Product Updates, March 2021

Kristian Aune

Head of Customer Success, Vespa


In the previous update,
we mentioned Automatic Reindexing, Tensor Optimizations, Query Profile Variant Initialization Speedup,
Explainlevel Query Parameter and PR System Testing.
Subscribe to the mailing list to get these updates delivered to your inbox.

This month, we’re excited to share the following updates:

New features in document/v1/

The /document/v1/ API
is the easiest way to interact with documents.
Since Vespa 7.354, this API lets users easily update or remove a selection of the documents,
rather than just single documents at a time.
It also lets users copy documents directly between clusters.
These new features are efficient and useful for production use cases,
and they also increase the expressiveness of the API,
which is great for experimenting with and learning Vespa.
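A minimal sketch of updating a selection of documents through this API, assuming a hypothetical namespace `mynamespace`, document type `music`, and cluster `music-cluster` (a `DELETE` with the same `selection` parameter removes the matching documents instead):

```python
from urllib.parse import urlencode

# Build the URL and body for a PUT that updates every document
# matching the selection expression, not just a single document.
base = "/document/v1/mynamespace/music/docid"
params = urlencode({
    "selection": "music.genre=='blues'",  # document selection expression
    "cluster": "music-cluster",
})
url = f"{base}?{params}"
body = {"fields": {"on_sale": {"assign": True}}}
print(url)
```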

weakAnd.replace

Queries with many OR-terms can recall a large set of the corpus for first-phase ranking,
hence increasing query latency.
In many cases, using WeakAnd (WAND)
can improve query performance by skipping the most irrelevant hits.
Since Vespa 7.356, you can use weakAnd.replace
to auto-convert from OR to WeakAnd to cut query latency.
Thanks to Kyle Rowan for submitting this in
#16411!
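Enabling the conversion is a single query parameter. A sketch, with an illustrative `title` field:

```python
from urllib.parse import urlencode

# weakAnd.replace=true asks Vespa to auto-convert OR terms to WeakAnd,
# skipping the most irrelevant hits during first-phase matching.
params = {
    "yql": 'select * from sources * where title contains "rock" or title contains "jazz"',
    "weakAnd.replace": "true",
}
print(urlencode(params))
```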

Improved feed-block at full node

Vespa has protection against corrupting indices when exhausting disk or memory:
Content nodes block writes at a given threshold.
Recovering from a blocked-write situation is now easier with
resource-limits –
this blocks external writes at a lower threshold than internal redistribution,
so the content nodes retain capacity to rebalance data.
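The thresholds are set per content cluster in services.xml. A minimal sketch, with illustrative limit values (fractions of total disk and memory at which external writes are blocked):

```xml
<content id="music" version="1.0">
  <tuning>
    <resource-limits>
      <disk>0.80</disk>
      <memory>0.75</memory>
    </resource-limits>
  </tuning>
  <!-- redundancy, documents, nodes, ... -->
</content>
```

Because internal redistribution blocks at a higher threshold than external writes, the cluster keeps headroom to rebalance data while feed is blocked.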

Reduced memory at stop/restart

Index and attribute structures are flushed when Vespa is stopped.
Since Vespa 7.350,
the flushing is staggered based on the size of the in-memory structures to minimize temporary memory use.
This allows higher memory utilization and hence lower cost,
particularly for applications with multiple large in-memory structures.


About Vespa: Largely developed by Yahoo engineers,
Vespa is an open source big data processing and serving engine.
It’s in use by many products, such as Yahoo News, Yahoo Sports, Yahoo Finance, and the Verizon Media Ad Platform.
Thanks to feedback and contributions from the community, Vespa continues to grow.

Vespa Newsletter, March 2023

Kristian Aune

Head of Customer Success, Vespa


In the previous update,
we mentioned Better Tensor formats, AWS PrivateLink, Autoscaling, Data Plane Access Control
as well as Container and Content Node Performance.

We also want to thank you for your PRs! In particular (see below),
most of the new pyvespa features were submitted by non-Vespa Team members – thank you!
We are grateful for the contributions – please keep those PRs coming!

We’re excited to share the following updates:

GPU-accelerated ML inference

In machine learning, computing model inference is a good candidate for being accelerated by special-purpose hardware, such as GPUs.
Vespa supports evaluating multiple types of machine-learned models in stateless containers,
e.g., TensorFlow,
ONNX,
XGBoost,
and LightGBM models.
For some use cases, using a GPU makes it possible to perform model inference with higher performance,
and at a lower price point, when compared to using a general-purpose CPU.

The Vespa Team is announcing support for GPU-accelerated ONNX model inference in Vespa,
including support for GPU instances in Vespa Cloud –
read more.

Vespa Cloud: BCP-aware autoscaling

As part of a business continuity plan (BCP),
applications are often deployed to multiple zones so the system has ready, on-hand capacity
to absorb traffic should a zone fail.
Using autoscaling in Vespa Cloud sets aside resources in each zone to handle an equal share of the traffic from the other zones
in case one of them goes down – i.e., it assumes a flat BCP structure.

This is not always how applications wish to structure their BCP traffic shifting though –
so applications can now define their BCP structure explicitly
using the BCP tag in
deployment.xml.
Also, during a BCP event, when it is acceptable to have some delay until capacity is ready,
you can set a deadline until another zone must have sufficient capacity to accept the overload;
permitting delays like this allows autoscaling to save resources.
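A sketch of what such an explicit BCP structure might look like in deployment.xml, with hypothetical region names and an illustrative 15-minute deadline for capacity to be ready:

```xml
<deployment version="1.0">
  <prod>
    <region>aws-us-east-1c</region>
    <region>aws-eu-west-1a</region>
  </prod>
  <bcp>
    <group deadline="15m">
      <region fraction="0.5">aws-us-east-1c</region>
      <region>aws-eu-west-1a</region>
    </group>
  </bcp>
</deployment>
```

Here the group declares which regions cover for each other and how much of a failing region's traffic each member must be able to absorb, letting autoscaling reserve less idle capacity.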

Vespa for e-commerce


Vespa is often used in e-commerce applications.
We have added exciting features to the shopping sample application:

  • Use NLP techniques to generate query suggestions from the index content
    based on spaCy and en_core_web_sm.
  • Use the fuzzy query operator
    and prefix search for great query suggestions –
    this handles misspelled words and creates much better suggestions than prefix search alone.
  • For query-contextualized navigation,
    the order in which the groups are rendered is determined by both hit counts and hit relevance.
  • Native embedders are used to map the textual query and document representations into dense high-dimensional vectors,
    which are used for semantic search – see embeddings.
    The application uses an open-source embedding model,
    and inference is performed using stateless model evaluation,
    during document and query processing.
  • Hybrid ranking /
    Vector search:
    The default retrieval uses approximate nearest neighbor search in combination with traditional lexical matching.
    The keyword and vector matching is constrained by the filters such as brand, price, or category.
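As a sketch of the fuzzy-plus-prefix suggestion technique above, here are two YQL queries against a hypothetical `suggestion` field – prefix matching for type-ahead, and the fuzzy operator to tolerate misspellings:

```python
# Prefix search: match suggestions starting with the (possibly partial) input.
prefix_yql = (
    'select * from sources * where '
    'suggestion contains ({prefix:true}"runing sho")'
)

# Fuzzy search: tolerate up to two character edits in the misspelled term.
fuzzy_yql = (
    'select * from sources * where '
    'suggestion contains ({maxEditDistance:2}fuzzy("runing"))'
)
print(fuzzy_yql)
```

Combining the two is what lets the application suggest "running shoes" even when the user types "runing sho".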

Read more about these and other Vespa features used in
use-case-shopping.

Optimizations and features

  • Vespa supports multiple schemas with multiple fields.
    This can amount to thousands of fields.
    Vespa’s index structures are built for real-time, high-throughput reads and writes.
    With Vespa 8.140, the static memory usage is cut by 75%, depending on field types.
    Find more details in #26350.
  • Extracting documents is now easier using vespa visit in the Vespa CLI.
    This simplifies cloning applications and their data
    between self-hosted and Vespa Cloud deployments.

pyvespa

Pyvespa – the Vespa Python experimentation library – is now split into two repositories:
pyvespa and learntorank;
this is for better separation of the python API and to facilitate prototyping and experimentation for data scientists.
Pyvespa 0.32 has been released with many new features for fields and ranking;
see the release notes.

This time, most of the new pyvespa features were submitted by non-Vespa Team members!
We are grateful for – and welcome more – contributions. Keep those PRs coming!

GCP Private Service Connect in Vespa Cloud

In January, we announced AWS Private Link.
We are now happy to announce support for GCP Private Service Connect in Vespa Cloud.
With this service, you can set up private endpoint services on your application clusters in Google Cloud,
providing clients with safe, non-public access to the application!

In addition, Vespa Cloud supports deployment to both AWS and GCP regions in the same application deployment.
This support simplifies migration projects, optimizes costs, adds cloud provider redundancy, and reduces complexity.
We’ve made adopting Vespa Cloud into your processes easy!

Blog posts since the last newsletter


Thanks for reading! Try out Vespa on Vespa Cloud
or grab the latest release at vespa.ai/releases and run it yourself! 😀