Intel's scikit-learn extension (scikit-learn-intelex), cuML, and several other libraries should have optimized variants of PCA, and likely of a few other algorithms used here. I can submit a PR implementing a few of these if you'd like. For certain types of vectors, this can give a noticeable speedup.
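A minimal sketch of how optional backends could be selected, assuming each candidate module exposes a scikit-learn-style `PCA` class. The function name `pick_pca_backend` and the `CANDIDATES` tuple are hypothetical, not part of this project or of any of the libraries; only the module paths refer to real packages, and whether they are installed is checked at runtime.

```python
import importlib

# Candidate modules, tried in order. Assumption of this sketch: each one
# exposes a PCA class with a scikit-learn-style fit/transform interface.
CANDIDATES = (
    "cuml.decomposition",       # GPU (RAPIDS cuML)
    "sklearnex.decomposition",  # Intel Extension for Scikit-learn
    "sklearn.decomposition",    # plain scikit-learn fallback
)


def pick_pca_backend(candidates=CANDIDATES):
    """Return (module_name, PCA_class) for the first usable candidate.

    Returns (None, None) if no candidate can be imported.
    """
    for name in candidates:
        try:
            module = importlib.import_module(name)
        except Exception:
            # Not installed, or the import failed for another reason
            # (for example a GPU library without a usable driver).
            continue
        pca_cls = getattr(module, "PCA", None)
        if pca_cls is not None:
            return name, pca_cls
    return None, None


if __name__ == "__main__":
    backend_name, PCA = pick_pca_backend()
    print("Selected PCA backend:", backend_name)
```

Passing a shorter `candidates` tuple (for example, one without the cuML entry) would keep the selection on CPU-only implementations.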
One concern: GPU implementations might harm reproducibility. There may be other issues that I haven't thought of. Thoughts?