Support match-features in grouping

**Is your feature request related to a problem? Please describe.**
Queries look like this:
```
SELECT *
FROM recommender_items
WHERE cluster_kmeans_flat_k_8192_faiss_id IN (@cluster_kmeans_flat_k_8192_faiss_ids)
LIMIT 0 | all(group(cluster_kmeans_flat_k_8192_faiss_id)
              max(50)
              each(max(80)each(output(summary()))))
```
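To reproduce this workload, the request can be assembled programmatically. The following is a minimal Python sketch and not part of the original report: `build_grouping_request` and its parameters are hypothetical names, and it assumes the `@cluster_kmeans_flat_k_8192_faiss_ids` placeholder is filled from a request parameter of the same name holding a comma-separated list.

```python
import json

FIELD = "cluster_kmeans_flat_k_8192_faiss_id"


def build_grouping_request(cluster_ids, max_groups=50, max_hits_per_group=80):
    """Build a request body for the grouped query (hypothetical helper)."""
    yql = (
        "SELECT * FROM recommender_items "
        f"WHERE {FIELD} IN (@{FIELD}s) "
        f"LIMIT 0 | all(group({FIELD}) max({max_groups}) "
        f"each(max({max_hits_per_group}) each(output(summary()))))"
    )
    return {
        "yql": yql,
        # Assumption: this value is substituted for the @... placeholder in the YQL.
        f"{FIELD}s": ",".join(str(i) for i in cluster_ids),
    }


if __name__ == "__main__":
    # Example cluster ids are made up for illustration.
    print(json.dumps(build_grouping_request([3, 17, 42]), indent=2))
```

Keeping the group and hit limits as parameters makes it easy to vary the number of summaries fetched per request when measuring latency.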
Summary fetching takes very long because in the `summary` we want to fetch the scores/signals of the `first-phase` score components (which include the vector similarity), so that we can track them for further scoring optimisations (the idea is the same as described [here](https://x.com/jonbratseth/status/1912134283176583620)).

Such a workload causes high match thread utilisation even when CPU utilisation is fairly low; see the screenshots below:

<img width="1322" alt="Image" src="https://github.com/user-attachments/assets/676baee6-d99d-4da8-9f59-05526f577956" />

<img width="1331" alt="Image" src="https://github.com/user-attachments/assets/d41cdbc9-7676-4842-ae77-b535ead6bfca" />

Some more context:

My use case is to provide diversity in search results according to a pre-calculated cluster_id, and we use grouping to do that. We also need the fine-grained matching statistics (e.g. vector similarity) that were used for ranking, so that we can further tune the scoring model.
We fetch around 50 groups from Vespa, each with ~70 hits, which gives about 3000-4000 hits per request for which summary features must be fetched. The main issue is the increasing latency: even under a moderate load (~40% CPU utilization) latencies grow a lot and Vespa cannot respond within the latency budget (250ms). The Vespa response timing shows that summary fetching takes more time than searching itself:
{  
  "querytime": 0.018000000000000002,
  "summaryfetchtime": 0.023,
  "searchtime": 0.042
}
My experiments show that the high latency is due to summary-features being recalculated on each summary request: when I reduced the number of features fetched, the latency dropped significantly.
The idea is that if we could get the match-features for the grouped hits, we could avoid the long summary fetch times.
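The figures quoted above can be put in proportion with a quick calculation. This is only an illustrative Python sketch using the numbers from this report (50 groups, ~70 hits per group, and the timing block); the variable names are made up for the example.

```python
# Figures quoted in the report above.
groups = 50
hits_per_group = 70
timing = {
    "querytime": 0.018000000000000002,
    "summaryfetchtime": 0.023,
    "searchtime": 0.042,
}

# Number of hits whose summary features are fetched per request.
hits_per_request = groups * hits_per_group

# Share of the total search time spent fetching summaries.
summary_share = timing["summaryfetchtime"] / timing["searchtime"]

# Summary fetch time relative to the query (matching/ranking) time.
summary_vs_query = timing["summaryfetchtime"] / timing["querytime"]

print(f"hits per request: {hits_per_request}")
print(f"summary share of searchtime: {summary_share:.0%}")
print(f"summary fetch vs query time: {summary_vs_query:.2f}x")
```

With these numbers, summary fetching accounts for a bit over half of the reported search time, which is the part that returning match-features with the grouped hits would aim to remove.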

**Describe the solution you'd like**
I want to be able to add `match-features` to the hits that are grouped.

**Describe alternatives you've considered**
Exclude the computationally expensive features from summaries and collect ranking scores offline.

**Additional context**
Slack [thread](https://vespatalk.slack.com/archives/C01QNBPPNT1/p1745317765808609).


Support match-features in grouping #33961
