Join us in San Francisco on September 26th for a Meetup

Hi Vespa Community,

Several members from our team will be traveling to San Francisco on September 26th for a meetup and we’d love to chat with you there.

Jon Bratseth (Distinguished Architect) will present a Vespa overview and answer any questions.

To learn more and RSVP, please visit:

https://www.meetup.com/SF-Big-Analytics/events/254461052/

Hope to see you!

The Vespa Team

Vespa Product Updates, September 2019: Tensor Float Support, Reduced Memory Use for Text Attributes, Prometheus Monitoring Support, and Query Dispatch Integrated in Container

Kristian Aune

Head of Customer Success, Vespa


In the August Vespa product update, we mentioned BM25 Rank Feature, Searchable Parent References, Tensor Summary Features, and Metrics Export. Largely developed by Yahoo engineers, Vespa is an open source big data processing and serving engine. It’s in use by many products, such as Yahoo News, Yahoo Sports, Yahoo Finance, and the Verizon Media Ad Platform. Thanks to feedback and contributions from the community, Vespa continues to grow.

This month, we’re excited to share the following updates with you:

Tensor Float Support

Tensors now support float cell values, for example tensor<float>(key{}, x[100]). Using the 32-bit float type cuts memory footprint in half compared to the 64-bit double, and can increase ranking performance by up to 30%. Vespa’s TensorFlow and ONNX integration now converts to float tensors for higher performance. Read more.
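As a sketch, a float-cell tensor field could be declared like this in a schema (the schema and field names here are hypothetical):

```
schema product {
    document product {
        # 32-bit float cells halve memory use compared to the default double
        field similarity type tensor<float>(key{}, x[100]) {
            indexing: attribute
        }
    }
}
```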

Reduced Memory Use for Text Attributes 

Attributes in Vespa are fields stored in columnar form in memory for access during ranking and grouping. From Vespa 7.102, the enum store used to hold attribute data uses a set of smaller buffers instead of one large buffer. This typically cuts static memory usage by 5%, but more importantly reduces peak memory usage (during background compaction) by 30%.

Prometheus Monitoring Support

Integrating with the Prometheus open-source monitoring solution is now easy to do
using the new interface to Vespa metrics.
Read more.

Query Dispatch Integrated in Container

The Vespa query flow is optimized for multi-phase evaluation over a large set of search nodes. Since Vespa 7.109.10, the dispatch function is integrated into the Vespa Container process, which simplifies the architecture with one less service to manage. Read more.

We welcome your contributions and feedback (tweet or email) about any of these new features or future improvements you’d like to request.

Vespa Product Updates, September 2020

Kristian Aune

Head of Customer Success, Vespa


Photo by ThisisEngineering RAEng on Unsplash

In the August updates,
we mentioned NLP with Transformers on Vespa, Grafana How-to, Improved GEO Search and Query Profile Variants.

This month, we have several exciting updates to share:

ONNX-Runtime

We have completed integration with ONNX-Runtime in Vespa’s ranking framework,
which vastly increases the capabilities of evaluating large deep-learning models in Vespa
both in terms of model types we support and evaluation performance.
New capabilities within hardware acceleration and model optimizations – such as quantization –
allows for efficient evaluation of large NLP models like BERT and other Transformer models during ranking.
To demonstrate this, we have created an end-to-end question/answering system entirely within Vespa,
using approximate nearest neighbors and large BERT models to reach state-of-the-art results on the Natural Questions benchmark.
Read more.

Hamming Distance

The approximate nearest neighbor ranking feature now also supports the
hamming distance metric.
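As a sketch, the distance metric is selected per field in the schema (the field name and dimension here are hypothetical; hamming distance is typically used with binary codes packed into int8 cells):

```
field code type tensor<int8>(x[64]) {
    indexing: attribute
    attribute {
        # compare bit patterns with hamming distance in nearestNeighbor
        distance-metric: hamming
    }
}
```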

Conditional Update Performance Improvements

Conditional writes are used for test-and-set operations when updating the document corpus.
As long as the fields in the condition are
attributes (i.e. in memory),
the write throughput is now the same as without a condition, up to 3x better than before the optimization.
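As a sketch, a conditional partial update looks like this in the /document/v1 JSON format (the document type and field names are hypothetical); the update is applied only if the condition matches:

```json
{
    "condition": "music.artist == 'Bob Dylan'",
    "fields": {
        "listens": {
            "increment": 1
        }
    }
}
```

Since both artist and listens would be attributes in this example, the condition is evaluated in memory and the update runs at full throughput.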

Compressed Transaction Log with Synced Ack

Vespa uses a transaction log for write performance.
The transaction log is now synced to disk before the write ack is returned.
The transaction log is now also compressed in order to reduce IO,
and can improve update throughput by 10X if writing to attributes only.

In the News

Learn from the OkCupid Engineering Blog how OkCupid uses Vespa to launch new features,
serve ML models in query serving, simplify operations and drastically cut deployment time:
tech.okcupid.com/vespa-vs-elasticsearch-for-matching-millions-of-people


About Vespa: Largely developed by Yahoo engineers,
Vespa is an open source big data processing and serving engine.
It’s in use by many products, such as Yahoo News, Yahoo Sports, Yahoo Finance, and the Verizon Media Ad Platform.
Thanks to feedback and contributions from the community, Vespa continues to grow.

We welcome your contributions and feedback (tweet
or email) about any of these new features or future improvements you’d like to request.

Subscribe to the mailing list for more frequent updates!

Vespa Product Updates, September 2021

Kristian Aune

Head of Customer Success, Vespa


In the previous update,
we mentioned HTTP/2, ONNX Runtime and factory.vespa.oath.cloud.

This month, we’re excited to share the following updates:

Vespa CLI

Vespa CLI is a zero-dependency tool built with Go, available for Linux, macOS and Windows –
it greatly simplifies interaction with a Vespa instance. Use Vespa CLI to:

  • Clone Vespa sample applications
  • Deploy an application to a Vespa installation running locally or remotely
  • Deploy an application to a dev zone in Vespa Cloud
  • Feed and query documents
  • Send custom requests with automatic authentication

Read more.
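A minimal session could look like the following (the sample application name and document file are illustrative; assumes a Vespa instance running locally):

```shell
# Clone a sample application and deploy it to the local instance
vespa clone album-recommendation myapp && cd myapp
vespa deploy --wait 300

# Feed a document and run a query
vespa document put doc.json
vespa query "yql=select * from music where true"
```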

Nearest neighbor search performance improvement

In Vespa 7.457.52, exact nearest neighbor search without an HNSW index is up to 20x faster
when combined with query filters and using multiple threads per query.
HNSW index itself has a reduced memory footprint, too –
this enables applications to fit larger data sets for nearest neighbor use cases.
Reindexing an HNSW index
has been multithreaded since Vespa 7.436.31.
This makes it much faster to apply changes, e.g. to the distance function, depending on available CPU cores –
and is now as fast as regular multi-threaded updates to the HNSW index.
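As a sketch, multiple threads per query are enabled in the rank profile (the profile and field names here are hypothetical):

```
rank-profile ann_profile {
    # evaluate each query with up to 4 threads on the content node
    num-threads-per-search: 4
    first-phase {
        expression: closeness(field, embedding)
    }
}
```

In the query, annotating the nearestNeighbor operator with [{"approximate": false}] requests exact evaluation even when an HNSW index is present.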

Paged tensor attributes

Fields indexed as attributes are stored in memory which enables fast partial updates,
flexible match modes, grouping, range queries, sorting, parent/child imports, and direct use in ranking.
Dense tensor attributes can now be set as paged, which means they will mostly reside on disk rather than in memory.
This is useful for large tensors where fast access over many documents per query is not required.
Read more.
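A paged dense tensor attribute could be declared like this (the field name and dimension are hypothetical):

```
field doc_embedding type tensor<float>(x[512]) {
    indexing: attribute
    # keep the tensor data mostly on disk instead of in memory
    attribute: paged
}
```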

mTLS

With #7219,
Vespa now supports mTLS across all internal services and endpoints.
See the blog post
for an introduction and the reference documentation for setup instructions.
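As a sketch, node-level TLS is configured through a JSON file referenced by the VESPA_TLS_CONFIG_FILE environment variable; the paths below are placeholders – see the reference documentation for the exact format:

```json
{
    "files": {
        "ca-certificates": "/absolute/path/to/ca-certs.pem",
        "certificates": "/absolute/path/to/host-certs.pem",
        "private-key": "/absolute/path/to/private-key.pem"
    }
}
```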

Feed performance

Many applications use Vespa for real-time partial updates
at rates in the thousands per node per second.
Since Vespa 7.468.9, the Vespa Distributor uses multiple threads by default,
with each thread handling a distinct subset of the document bucket space.
Context switching is reduced by using async operations in the network threads.
End-to-end feed throughput has increased significantly:

  • 25-40% increase in throughput for partial updates
  • 25% increase in throughput for puts of summary-only data

Sentencepiece Embedder

A common task in modern IR is to embed a document or query in a vector space for retrieval and/or ranking,
which often means turning a natural language text into a tensor.
Since 7.474.25, Vespa ships with a native implementation
of SentencePiece,
a language agnostic and fast algorithm for this task.
You can use it by having it injected into your own Java code, or by:

  • On the query side, pass tensors as “embed(some text)”
  • On the indexing side, use the “embed” command to turn a text field into a tensor

As part of this we added a generic Embedder interface,
so applications can plug in any algorithm and use it in queries and indexing as described above.
See this system test
for an example.
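As a sketch, the two usages look like this (the field names and tensor dimension are hypothetical):

```
# Query side: a request parameter turning text into a query tensor
ranking.features.query(q)=embed(how to make pizza)

# Indexing side: in the schema, turn a text field into a tensor
field embedding type tensor<float>(x[768]) {
    indexing: input text | embed | attribute
}
```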


About Vespa: Largely developed by Yahoo engineers,
Vespa is an open source big data processing and serving engine.
It’s in use by many products, such as Yahoo News, Yahoo Sports, Yahoo Finance, and the Yahoo Ad Platform.
Thanks to feedback and contributions from the community, Vespa continues to grow.

Vespa Newsletter, September 2022

Kristian Aune

Head of Customer Success, Vespa


In the previous update,
we mentioned Vespa 8, pre/post ANN filtering support, fuzzy matching, pyvespa experimental ranking module,
embedding support and new tensor features.
Today, we’re excited to share the following updates:

Rank-phase statistics

With rank-phase statistics
it is easy to measure relative query performance at a per-document level,
answering questions like “Which documents appear most often in results, and which ones never do?”.
The statistics are written to configurable attributes per document,
for analysis using the Vespa query and aggregation APIs.
Use this feature for real-time tracking of ranking performance,
and combine with real-time updates for tuning.
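A sketch of the configuration, assuming mutable attributes named match_count and summary_count have been defined in the schema; see the rank-phase statistics documentation for the exact syntax:

```
rank-profile tracking {
    mutate {
        # increment per-document counters as documents pass each phase
        on-match   { match_count += 1 }
        on-summary { summary_count += 1 }
    }
    first-phase {
        expression: nativeRank
    }
}
```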

Schema feeding flexibility

Since Vespa 8.20, a document feed can contain unknown fields using
ignore-undefined-fields.
While the default behavior is to reject feeds with unknown fields,
this option can make it easier to optimize or evolve the schema for new use cases,
with less need to coordinate with client feeds.

Beta: Query Builder and Trace Visualizer

New beta applications for building queries and analyzing query traces are available at
github.com/vespa-engine/vespa/tree/master/client/js/app.
This is the first step towards helping users experiment easily with queries,
and the Trace Visualizer can be used to help pinpoint query latency bottlenecks.

Rank trace profiling

Use rank trace profiling to expose how the time spent on ranking is distributed between individual
rank features.
Available since Vespa 8.48,
use trace.profileDepth
as a query parameter, e.g. &tracelevel=1&trace.profileDepth=10.
This feature can be used for content node rank performance analysis.
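For example, against a local endpoint (the URL and document type are illustrative):

```shell
# Enable rank trace profiling for a single query
curl -s "http://localhost:8080/search/?yql=select%20*%20from%20doc%20where%20true&tracelevel=1&trace.profileDepth=10"
```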

Feeding bandwidth test

When doing feeding throughput tests, it can often be hard to distinguish latency inside your Vespa application
from limitations in the available bandwidth between client and server.
Since Vespa 8.35, the vespa-feed-client
supports the --speed-test parameter for bandwidth testing.
Note that both client and server Vespa must be on 8.35 or higher.
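For example (the endpoint is illustrative):

```shell
# Measure client-server bandwidth without feeding real documents
vespa-feed-client --speed-test --endpoint https://vespa.example.com:8080/
```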

Training video

Vespa allows plugging in your own Java code in both the document- and query-flows, to implement advanced use cases.
Using query tracing and a debugger can be very useful in developing and troubleshooting this custom code.
For an introduction, see Debugging a Vespa Searcher:

Vespa.ai: Debugging a Vespa Searcher