Description
Is your feature request related to a problem? Please describe.
Queries look like this:
SELECT *
FROM recommender_items
WHERE cluster_kmeans_flat_k_8192_faiss_id IN (@cluster_kmeans_flat_k_8192_faiss_ids)
LIMIT 0 | all(group(cluster_kmeans_flat_k_8192_faiss_id)
max(50)
each(max(80)each(output(summary()))))
The summary fetching time is very long because, in the summary, we want to fetch the scores/signals of the first-phase score components (which include the vector similarity) so that we can track them for further scoring optimisations (the idea is the same as described here).
Such a workload causes high match thread utilisation even when CPU utilisation is fairly low; see the screenshots below:


Some more context:
My use case is to provide diversity in search results according to a pre-calculated cluster_id, and we use grouping to do that. We also need the fine-grained matching statistics (e.g. vector similarity) that were used for ranking, so that we can further tune the scoring model.
We fetch about 50 groups from Vespa, each with about 70 hits, which gives 3000-4000 hits per request for which summary features must be fetched. The main issue is increasing latency: even under moderate load (~40% CPU utilization), latencies grow considerably and Vespa cannot respond within the latency budget (250 ms). The Vespa response timing shows that summary fetching takes more time than the search itself:
{
"querytime": 0.018000000000000002,
"summaryfetchtime": 0.023,
"searchtime": 0.042
}
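As a quick check on the numbers above, here is a minimal Python sketch that computes how much of the request is spent on summary fetching. It assumes that all three fields are in seconds and that searchtime is the total of the query and summary fetch phases; the helper name summary_share is made up for illustration.

```python
import json

# Timing block copied verbatim from the Vespa response above.
timing = json.loads("""
{
  "querytime": 0.018000000000000002,
  "summaryfetchtime": 0.023,
  "searchtime": 0.042
}
""")


def summary_share(t):
    """Fraction of searchtime spent fetching summaries (hypothetical helper)."""
    return t["summaryfetchtime"] / t["searchtime"]


share = summary_share(timing)
ratio = timing["summaryfetchtime"] / timing["querytime"]

print(f"summary fetch share of searchtime: {share:.0%}")
print(f"summary fetch vs. query phase: {ratio:.2f}x")
```

For this sample, summary fetching accounts for a bit more than half of the total time, which is why reducing the per-hit summary cost matters here.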
My experiments show that the high latency is due to summary-features being recalculated on each summary request. Reducing the number of features fetched lowered the latency significantly.
The idea is that if we could get the match-features for the grouped hits, we could avoid the long summary fetch times.
Describe the solution you'd like
I want to be able to add match-features to the hits that are grouped.
Describe alternatives you've considered
Exclude computationally expensive features from summaries and collect ranking scores offline.
Additional context
Slack thread.