In my work environment, we install most Python packages in editable mode. This leads to long search paths; 200 entries is common. I think it should be possible to significantly improve mypy's performance in this case.
My benchmark workload is mypy -c "import torch" on a mypyc-compiled mypy with compile level 3.
I'll run it in the following environments:
clean
rm -rf clean
python -m venv clean
uv pip install torch --python clean/bin/python
long
rm -rf long
python -m venv long
uv pip install torch --python long/bin/python
for i in $(seq 1 200); do
    dir="$(pwd)/repo/$i"
    mkdir -p "$dir"
    echo "$dir" >> "$(long/bin/python -c "import site; print(site.getsitepackages()[0])")/repo.pth"
done
openai
This is my main dev environment. I'll see if I can build an artificial environment that matches its performance characteristics more closely (this should be pretty easy: it just needs a bunch of third-party libraries installed).
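To sanity-check that an environment really ends up with a long search path, something like the following sketch could be used. It asks a given interpreter for the length of its `sys.path`. The helper name `search_path_length` is hypothetical, and `sys.path` is only a proxy for the search paths mypy computes for `--python-executable`, so treat the number as a rough indicator.

```python
import subprocess
import sys


def search_path_length(python_executable: str) -> int:
    """Return the number of sys.path entries seen by the given interpreter.

    Hypothetical helper for illustration; not part of mypy.
    """
    result = subprocess.run(
        [python_executable, "-c", "import sys; print(len(sys.path))"],
        check=True,
        capture_output=True,
        text=True,
    )
    return int(result.stdout.strip())


# Example usage (paths are the venvs created above):
#   search_path_length("clean/bin/python")
#   search_path_length("long/bin/python")
print(sys.executable, search_path_length(sys.executable))
```

If the `.pth` file was picked up, the long environment should report roughly 200 more entries than the clean one.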
bd9200b is my baseline commit
λ hyperfine -w 1 -M 3 '/tmp/mypy_primer/timer_mypy_bd9200bda/venv/bin/mypy -c "import torch" --python-executable=clean/bin/python --no-incremental'
Benchmark 1: /tmp/mypy_primer/timer_mypy_bd9200bda/venv/bin/mypy -c "import torch" --python-executable=clean/bin/python --no-incremental
Time (mean ± σ): 19.372 s ± 0.179 s [User: 17.018 s, System: 2.285 s]
Range (min … max): 19.223 s … 19.570 s 3 runs
λ hyperfine -w 1 -M 3 '/tmp/mypy_primer/timer_mypy_bd9200bda/venv/bin/mypy -c "import torch" --python-executable=long/bin/python --no-incremental'
Benchmark 1: /tmp/mypy_primer/timer_mypy_bd9200bda/venv/bin/mypy -c "import torch" --python-executable=long/bin/python --no-incremental
Time (mean ± σ): 34.571 s ± 0.085 s [User: 31.770 s, System: 2.762 s]
Range (min … max): 34.499 s … 34.664 s 3 runs
λ hyperfine -w 1 -M 3 '/tmp/mypy_primer/timer_mypy_bd9200bda/venv/bin/mypy -c "import torch" --no-incremental --python-executable /opt/oai/bin/python'
Benchmark 1: /tmp/mypy_primer/timer_mypy_bd9200bda/venv/bin/mypy -c "import torch" --no-incremental --python-executable /opt/oai/bin/python
Time (mean ± σ): 51.342 s ± 0.472 s [User: 46.853 s, System: 4.423 s]
Range (min … max): 50.840 s … 51.776 s 3 runs
#17920 has already provided a big win here
88ae62b was the commit I measured
λ hyperfine -w 1 -M 3 '/tmp/mypy_primer/timer_mypy_88ae62b4a/venv/bin/mypy -c "import torch" --python-executable=clean/bin/python --no-incremental'
Benchmark 1: /tmp/mypy_primer/timer_mypy_88ae62b4a/venv/bin/mypy -c "import torch" --python-executable=clean/bin/python --no-incremental
Time (mean ± σ): 19.094 s ± 0.195 s [User: 16.782 s, System: 2.243 s]
Range (min … max): 18.935 s … 19.312 s 3 runs
λ hyperfine -w 1 -M 3 '/tmp/mypy_primer/timer_mypy_88ae62b4a/venv/bin/mypy -c "import torch" --python-executable=long/bin/python --no-incremental'
Benchmark 1: /tmp/mypy_primer/timer_mypy_88ae62b4a/venv/bin/mypy -c "import torch" --python-executable=long/bin/python --no-incremental
Time (mean ± σ): 24.838 s ± 0.237 s [User: 22.038 s, System: 2.750 s]
Range (min … max): 24.598 s … 25.073 s 3 runs
λ hyperfine -w 1 -M 3 '/tmp/mypy_primer/timer_mypy_88ae62b4a/venv/bin/mypy -c "import torch" --no-incremental --python-executable /opt/oai/bin/python'
Benchmark 1: /tmp/mypy_primer/timer_mypy_88ae62b4a/venv/bin/mypy -c "import torch" --no-incremental --python-executable /opt/oai/bin/python
Time (mean ± σ): 34.161 s ± 0.163 s [User: 29.818 s, System: 4.289 s]
Range (min … max): 34.013 s … 34.336 s 3 runs
You can see that mypy in my environment is still 1.8x slower than in the clean environment (34.2 s vs. 19.1 s), and 1.3x slower in the reproducible long environment (24.8 s vs. 19.1 s).
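The slowdown factors can be reproduced from the hyperfine means with a few lines of arithmetic. This is only a convenience sketch; the dictionary layout and the function name `slowdown_vs_clean` are made up for illustration, and the numbers are the mean wall-clock times reported in the runs above.

```python
# Mean wall-clock times in seconds, copied from the hyperfine output.
MEAN_SECONDS = {
    "bd9200b": {"clean": 19.372, "long": 34.571, "openai": 51.342},
    "88ae62b": {"clean": 19.094, "long": 24.838, "openai": 34.161},
}


def slowdown_vs_clean(commit: str, env: str) -> float:
    """Ratio of an environment's mean time to the clean environment's mean time."""
    times = MEAN_SECONDS[commit]
    return times[env] / times["clean"]


for commit in MEAN_SECONDS:
    for env in ("long", "openai"):
        print(f"{commit} {env}: {slowdown_vs_clean(commit, env):.2f}x")
```

At the baseline commit this gives 1.78x (long) and 2.65x (openai); at 88ae62b it gives 1.30x and 1.79x.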
Some ideas for things to experiment with:
- We could make fscache cleverer, seeing if we can scandir on parents to get cheaper is_dir and is_file, especially when querying the existence of entries that are on search paths.
- We could avoid some of the case-sensitivity handling if we know our file system is case-sensitive.
- We could vendor some of os.path into mypy, so that mypyc can compile these functions
  - Starting with "Let mypyc optimise os.path.join" (#17949)
- "Make is_sub_path faster" (#17962), although even just not using pathlib would probably be great
- Use the fast path in modulefinder in more places
- Misc
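As a rough illustration of the first idea, the sketch below answers is_dir and is_file queries from a single cached `os.scandir` pass over the parent directory instead of issuing one stat call per query. This is not mypy's fscache implementation; the class name `DirListingCache` and its methods are hypothetical, and the sketch assumes the file system does not change while the cache is alive.

```python
from __future__ import annotations

import os


class DirListingCache:
    """Answer is_dir/is_file from one cached os.scandir() pass per parent directory.

    Hypothetical sketch, not mypy's FileSystemCache. Entries are never
    invalidated, so it only suits a single run over an unchanging file system.
    """

    def __init__(self) -> None:
        # Maps directory path -> {entry name: "dir" | "file" | "other"}.
        self._listings: dict[str, dict[str, str]] = {}

    def _listing(self, directory: str) -> dict[str, str]:
        cached = self._listings.get(directory)
        if cached is None:
            cached = {}
            try:
                with os.scandir(directory or ".") as entries:
                    for entry in entries:
                        if entry.is_dir():
                            cached[entry.name] = "dir"
                        elif entry.is_file():
                            cached[entry.name] = "file"
                        else:
                            cached[entry.name] = "other"
            except OSError:
                # Missing or unreadable directory: leave whatever was collected.
                pass
            self._listings[directory] = cached
        return cached

    def is_dir(self, path: str) -> bool:
        directory, name = os.path.split(path)
        return self._listing(directory).get(name) == "dir"

    def is_file(self, path: str) -> bool:
        directory, name = os.path.split(path)
        return self._listing(directory).get(name) == "file"
```

The potential win is for lookups that mostly miss: with 200 search path entries, probing each one for the same module name costs one directory scan per entry the first time, and later probes in the same directories are dictionary lookups. Whether this beats plain stat calls in practice would need measuring, since scanning a very large directory to answer a single query could be slower.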