[Feature Request] Assess performance capability before a model is loaded #20998

Open
@beaufortfrancois

Describe the feature request

Assess performance capability without downloading the full model.

Describe scenario use case

For some models, performance may be a blocker. Since model downloads can be quite large, I wonder if there should be a way for web developers to determine their machine's performance class for running a model without first downloading it completely.

I believe this would involve running the model code with zeroed-out weights, which would still require buffer allocations but would allow the web app to catch out-of-memory and similar errors. The model architecture would still be needed to generate shaders, but it would be much smaller than the model weights.
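To make the idea concrete, here is a rough sketch of a cheaper first step that needs only the architecture metadata (tensor names, dtypes and shapes) and no weights: sum the buffer sizes the model would allocate and compare them with device limits. Everything here is hypothetical and not part of ONNX Runtime Web: the `TensorSpec` type, the `checkFit` function and the `memoryBudget` value are assumptions for illustration. `maxBufferSize` is meant to mirror the WebGPU limit of the same name (`adapter.limits.maxBufferSize`), and `memoryBudget` has to be an app-chosen guess, since WebGPU does not report total device memory.

```typescript
// Hypothetical description of one weight tensor: shape and dtype only, no data.
type DType = "float32" | "float16" | "int32" | "int8" | "uint8";

interface TensorSpec {
  name: string;
  dtype: DType;
  shape: number[];
}

// Assumed limits: maxBufferSize mirrors the WebGPU limit of the same name;
// memoryBudget is an app-chosen guess, not something the browser reports.
interface DeviceLimits {
  maxBufferSize: number;
  memoryBudget: number;
}

interface FitReport {
  totalBytes: number;
  largestTensorBytes: number;
  fits: boolean;
  reason?: string;
}

const BYTES_PER_ELEMENT: Record<DType, number> = {
  float32: 4,
  float16: 2,
  int32: 4,
  int8: 1,
  uint8: 1,
};

// Size in bytes of the buffer a tensor of this shape and dtype would need.
// An empty shape is treated as a scalar (one element).
function tensorBytes(spec: TensorSpec): number {
  const elements = spec.shape.reduce((count, dim) => count * dim, 1);
  return elements * BYTES_PER_ELEMENT[spec.dtype];
}

// Estimate whether the weights alone could be allocated within the limits.
function checkFit(specs: TensorSpec[], limits: DeviceLimits): FitReport {
  let totalBytes = 0;
  let largestTensorBytes = 0;
  let largestName = "";
  for (const spec of specs) {
    const bytes = tensorBytes(spec);
    totalBytes += bytes;
    if (bytes > largestTensorBytes) {
      largestTensorBytes = bytes;
      largestName = spec.name;
    }
  }
  if (largestTensorBytes > limits.maxBufferSize) {
    return {
      totalBytes,
      largestTensorBytes,
      fits: false,
      reason: `tensor "${largestName}" exceeds maxBufferSize`,
    };
  }
  if (totalBytes > limits.memoryBudget) {
    return {
      totalBytes,
      largestTensorBytes,
      fits: false,
      reason: "total weight size exceeds memoryBudget",
    };
  }
  return { totalBytes, largestTensorBytes, fits: true };
}
```

This only bounds the weight buffers. It says nothing about activations, intermediate buffers or shader compilation, so actually running with zeroed-out weights as described above would still be the more reliable check.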

cc @xenova @guschmue

Originally posted at huggingface/transformers.js#545 (comment)

Labels: ep:WebGPU (ort-web webgpu provider), feature request (request for unsupported feature or enhancement), platform:web (issues related to ONNX Runtime web; typically submitted using template)