Internship at Vespa | Vespa Blog

Mathias Chunnoo

Intern I, Summer of 2021

Ole-Magnus Vian Norum

Intern II, Summer of 2021

Photo by Charles Deluvio on Unsplash

Over the course of the summer, we interns have gotten to explore the Vespa
engine and the workings of the company. At the start of our internship we got
an introduction to the company and the tools that they use. To get
familiar with the Vespa engine we went through the getting started tutorial,
where we made a news recommendation system.

During our internship we got to work on many different things, but
our two main projects were implementing two sample applications that use
Vespa to search through the Vespa documentation. These two sample
apps were named search-as-you-type and search-suggestion.


The search-as-you-type application aims to implement an interface where results
are displayed live while the user is typing in the search bar. This requires
the search both to generate hits on incomplete words and to retrieve these hits
as close to instantly as possible. A substring search would fit the need for
incomplete word searches; however, for large corpora it would not meet the
performance requirement. Our solution instead uses n-grams (groups of n
characters) to simulate a substring-like search. The idea is to search for
n-grams and rank hits where the n-grams are bunched together higher than
hits where the n-grams are spread throughout the document. After trying
various configurations we found that 3-grams in combination with Vespa's
nativeRank fit our needs very well. In addition, we combined this with index
search so that if the search string consists of complete words, the indexed
search hits rank higher than the n-gram search hits.

schema doc {
    field gram_content type string {
        indexing: input content | index | summary
        match {
            # gram-size only takes effect when gram matching is enabled
            gram
            gram-size: 3
        }
        summary: dynamic
    }
    document doc {
        field content type string {
            indexing: index | summary
            summary: dynamic
            stemming: best
        }
    }
    rank-profile weighted_doc_rank inherits default {
        rank-properties {
            $contentWeight: 10.0
            $gramContentWeight: 1.0
        }
        first-phase {
            expression {
                query(contentWeight) * nativeRank(content)
                + query(gramContentWeight) * nativeRank(gram_content)
            }
        }
    }
}
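
To make the n-gram idea more concrete, here is a minimal Python sketch of how an incomplete word can still share 3-grams with a complete word in a document. The function names `ngrams` and `shared_grams` are our own illustrations, and splitting word by word (keeping short words whole) is a simplifying assumption; Vespa's own gram tokenization may differ in the details.

```python
def ngrams(text: str, n: int = 3) -> list[str]:
    """Split text into overlapping character n-grams, one word at a time."""
    grams = []
    for word in text.lower().split():
        if len(word) < n:
            grams.append(word)  # assumption: words shorter than n are kept whole
        else:
            grams.extend(word[i:i + n] for i in range(len(word) - n + 1))
    return grams


def shared_grams(query: str, document: str, n: int = 3) -> set[str]:
    """Return the query n-grams that also occur in the document."""
    return set(ngrams(query, n)) & set(ngrams(document, n))


print(ngrams("searc"))                              # ['sea', 'ear', 'arc']
print(sorted(shared_grams("searc", "search bar")))  # ['arc', 'ear', 'sea']
```

Every gram of the unfinished word "searc" is found in "search", which is why a gram field can produce hits before the user has finished typing.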

After the Vespa application was in place, we needed an actual search bar for
the search-as-you-type experience. This was implemented by incorporating a
simple Java server for a static web page into the Vespa application and writing
some JavaScript to query the Vespa application every time a character was
entered into the search bar. In addition, a debounce function was used to avoid
race conditions caused by simultaneous query requests.
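
The effect of debouncing can be illustrated with a small simulation. This is not the JavaScript used in the application; it is a Python sketch, under the assumption of a trailing-edge debounce, that takes timestamped keystrokes and reports which queries would actually be sent. The function name `debounce` and the 200 ms wait are illustrative choices.

```python
def debounce(events: list[tuple[int, str]], wait_ms: int) -> list[tuple[int, str]]:
    """Simulate a trailing-edge debounce over (timestamp_ms, text) keystrokes.

    Assumes events are sorted by timestamp. A query fires only when no newer
    keystroke arrives within wait_ms of the previous one.
    Returns the (fire_time_ms, text) pairs that would be sent.
    """
    fired = []
    for i, (t, text) in enumerate(events):
        is_last = i == len(events) - 1
        if is_last or events[i + 1][0] - t >= wait_ms:
            fired.append((t + wait_ms, text))
    return fired


keystrokes = [(0, "v"), (80, "ve"), (150, "ves"), (600, "vesp"), (650, "vespa")]
print(debounce(keystrokes, wait_ms=200))
# [(350, 'ves'), (850, 'vespa')]
```

Five keystrokes result in only two requests, so far fewer responses can arrive out of order.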


The idea behind the search-suggestion application is to suggest possible search
terms to the user before they have typed out their whole query. In our
implementation these suggestions come either from the document texts the users
are searching through or from previous searches performed by other users.

The first iteration of the search-suggestion application fed new search terms
by “put”, which resulted in storing multiple copies of the same term. To
calculate relevance and single out terms, the generated hits were grouped and
counted. This was not a scalable solution, as more data for the same search
terms would result in a linear increase in both storage space and processing
time for each query. To solve this we switched to feeding by “update”: unseen
terms are added, and the “query_count” field is incremented when processing
previously seen terms.
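
The scaling difference between the two feeding strategies can be sketched with an in-memory stand-in for the document store. This is only a toy model in Python; the function names are hypothetical and nothing here talks to Vespa.

```python
from collections import Counter


def feed_by_put(searches: list[str]) -> list[str]:
    """Store one document per search, so repeated terms are stored repeatedly."""
    return list(searches)


def count_at_query_time(stored: list[str]) -> Counter:
    """Group and count the stored copies, which touches every stored document."""
    return Counter(stored)


def feed_by_update(searches: list[str]) -> dict[str, int]:
    """Keep one document per term and increment its query_count on each search."""
    store: dict[str, int] = {}
    for term in searches:
        store[term] = store.get(term, 0) + 1  # create if missing, then increment
    return store


searches = ["ranking", "example", "ranking", "ranking"]
put_store = feed_by_put(searches)
update_store = feed_by_update(searches)
print(len(put_store), len(update_store))  # 4 2
print(update_store)                       # {'ranking': 3, 'example': 1}
```

With "put", storage grows with the number of searches and the counting happens at query time; with "update", storage grows only with the number of distinct terms and the count is already available.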

    "update": "id:term:term::example",
    "create": true,
    "fields": {
      "term": {
        "assign": "example"
      "corpus_count": {
        "assign": 181
      "document_count": {
        "assign": 40
    "update": "id:term:term::example",
    "create": true,
    "fields": {
      "term": { "assign": "example" },
      "query_count": { "increment": 1 }

Since we were going to use queries written by users as suggestions, we had to
implement some form of moderation of what could be suggested. To solve this
problem we made a list of allowed terms and used a document processor to filter
out any documents that contained terms not in the list. We chose to generate
the allowed-list by listing every word used in the document text. This way all
relevant terms could be suggested, while things that could be seen as
offensive or otherwise irrelevant would not come up as suggestions: since they
were not contained in the document text, they would be blocked by the document
processor and not fed.
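
The filtering logic can be sketched as follows. In the application this check runs in a Vespa document processor; the Python below only illustrates the idea, and the tokenization (lowercased runs of letters and digits) as well as the names `build_allowed_terms` and `is_allowed` are assumptions made for the example.

```python
import re


def build_allowed_terms(documents: list[str]) -> set[str]:
    """Collect every lowercase word that appears in the document texts."""
    allowed = set()
    for text in documents:
        allowed.update(re.findall(r"[a-z0-9]+", text.lower()))
    return allowed


def is_allowed(query: str, allowed: set[str]) -> bool:
    """Accept a query only if it is non-empty and every word is on the allowed-list."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    return bool(words) and all(word in allowed for word in words)


docs = ["Vespa ranks documents with rank profiles", "Feed documents with the document API"]
allowed = build_allowed_terms(docs)
queries = ["rank profiles", "document api", "buy cheap watches"]
print([q for q in queries if is_allowed(q, allowed)])
# ['rank profiles', 'document api']
```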

For the first iteration of the search-suggestion application, we used Vespa's
streaming search with prefix matching to search for documents with matching
prefixes. After a presentation of the application and some discussion, we
suspected that streaming search would not scale as the number of
concurrent users increased. To test this we benchmarked the
application using fbench (benchmark results).
As suspected, the performance of streaming search decreased drastically as
the number of concurrent users increased. We decided to change the application
to use index prefix search, and a comparison benchmark
confirmed that this implementation scaled much better than streaming search.
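
As a rough intuition for this result, compare scanning every term with looking up a prefix in a sorted structure. The Python sketch below is only an analogy and makes no claim about how Vespa implements either search mode; `scan_prefix` and `indexed_prefix` are hypothetical names.

```python
from bisect import bisect_left


def scan_prefix(terms: list[str], prefix: str) -> list[str]:
    """Check every term: the work grows with the total number of terms."""
    return [t for t in terms if t.startswith(prefix)]


def indexed_prefix(sorted_terms: list[str], prefix: str) -> list[str]:
    """Binary-search a sorted term list, then read only the matching run."""
    i = bisect_left(sorted_terms, prefix)
    matches = []
    while i < len(sorted_terms) and sorted_terms[i].startswith(prefix):
        matches.append(sorted_terms[i])
        i += 1
    return matches


terms = sorted(["rank", "ranking", "rank-profile", "range", "schema", "search"])
print(indexed_prefix(terms, "rank"))  # ['rank', 'rank-profile', 'ranking']
print(scan_prefix(terms, "rank") == indexed_prefix(terms, "rank"))  # True
```

Both functions return the same hits, but the scan does work proportional to the whole term set on every request, which adds up quickly when many users type at the same time.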

As we did with the search-as-you-type application, we incorporated a Java
server for a static web page into the Vespa application and wrote some
JavaScript for querying suggestions on every input and showing these
suggestions in a dropdown under the search bar.

We also took the two sample applications we made and integrated them into an
already existing sample application
which is deployed to Vespa Cloud. As of now this deployment is also used for
search suggestions in the search bar on the Vespa documentation sites.

AWS Lambda

One of the goals of the search-suggestion application was to favor terms
that had previously been searched for. To accomplish this we decided to create
an AWS Lambda function that reads query logs and feeds search terms from these
back into the Vespa application. We chose Lambda because the query logs
were stored in AWS S3 buckets, which made it possible to trigger the Lambda
function continuously and process future query logs. The
biggest problem we faced when writing this Lambda function was decompressing
the logs. Vespa stores its query logs compressed with zstd, and
finding a zstd library usable in an AWS Lambda context was not straightforward.
Initially a lot of time was spent learning AWS SAM and deploying a Docker image
to the Lambda function, as this would let us use native C++ libraries in the
Lambda. However, we later found a library compiled to WebAssembly that
let us decompress files with just a Node.js Lambda function.
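
Once a log file is decompressed, the remaining step is to pull the search terms out of it. The sketch below shows one way this could look in Python (the actual function was written for Node.js). It assumes, hypothetically, that each log line is a JSON object with a "uri" field and that the user input is carried in a "query" URL parameter; the real log layout may differ.

```python
import json
from collections import Counter
from urllib.parse import parse_qs, urlsplit


def extract_query_terms(log_lines: list[str]) -> Counter:
    """Count the user queries found in JSON log lines.

    Assumption: each line has a "uri" field whose query string carries
    the user input in a "query" parameter (hypothetical layout).
    """
    counts: Counter = Counter()
    for line in log_lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if not isinstance(entry, dict):
            continue
        uri = str(entry.get("uri", ""))
        for value in parse_qs(urlsplit(uri).query).get("query", []):
            term = value.strip().lower()
            if term:
                counts[term] += 1
    return counts


lines = [
    '{"uri": "/search/?query=rank+profile"}',
    '{"uri": "/search/?query=Rank%20profile"}',
    '{"uri": "/search/?query=streaming"}',
    "not json",
]
print(extract_query_terms(lines).most_common())
# [('rank profile', 2), ('streaming', 1)]
```

Each counted term could then be turned into an update operation that increments its "query_count".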

Other projects

While working on the main projects, we also got to work on other smaller side
projects. One of these side projects was to implement a visualization view of
proton’s memory usage. This view was created using React and incorporated into
the Vespa Console. We also got to
work with Vespa’s performance tests and moved some of the private performance
tests over to the public open-source repository by changing the document sets so
that they did not use private or sensitive data.

The experience at Vespa

At the start of the internship it felt a bit daunting to make our own sample
applications, given that the Vespa engine was something completely new to us.
Even after the getting started tutorial many things were still unclear, but as we
started working on the sample applications, more and more things became clear, as
we had to dive into the documentation and previously written code to be able to
make the applications from scratch.

Since there is so much you can do with Vespa, it was hard at times to find the answers
to our questions in the documentation. Not necessarily because they did not exist, but
because we did not know what to search for to find the right documents. This led to us
sometimes using Vespa's public GitHub repositories to find answers to our questions.
Even though we sometimes did not find the answers we were looking for, we never felt
lost, as there was always someone ready to help us out when we got stuck.

During our internship at Vespa we got to learn a lot about the Vespa search
engine and information retrieval in general. In particular, we have learned
several methods for searching with incomplete queries and for query processing.
We have also gotten a feel for what it is like to work in a software company
through daily stand-ups and presentations of our projects.

Working here has given us insight into the workflow and GitHub etiquette of a
software company. The internship has given us experience with working in a
team of developers and with collaborating effectively through GitHub. We
have touched upon various technologies, from writing user interfaces in React to
writing performance tests in Ruby. We have also gotten to work with and learn
about important platforms and services like Amazon Web Services and Docker, which
are commonly used in companies but not taught in schools.

Even after the internship there are still many things that we have not touched
upon or learned about, and we wish we could explore more: to go deeper
into the lower levels of the code and the workings of Vespa, and to learn more about
search from the experienced people working at Vespa. We grew attached to the
projects we worked on and wish we had more time to fine-tune and improve
them to get an even better search in the Vespa documentation.

We really enjoyed our stay here, with a nice staff who have incredible
expertise in search and information retrieval, and from whom we have learned a
lot. The experience at Vespa has been really pleasant and educational, and it is something
that has benefited us and will continue to benefit us in the future.