Description
We use pytest in a Jenkins job, which ends up running many concurrent pytest processes on a single Linux worker under the same user account (`jenkins`). The test suite has some session-scoped fixtures which use `tmpdir_factory` to create temporary directories and files in them, which are then used for the duration of the entire test run.
Recently our pytest runs started taking longer than 3 hours and we started to see tests randomly fail near the end of the suite with errors like this:
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/pytest-of-jenkins/pytest-47927/...
indicating that something had deleted the tmp files created by the session-scoped fixture at the beginning of the test run.
Pytest's `tmpdir_factory` fixture calls `make_numbered_dir_with_cleanup` to handle creation and cleanup of tmp directories:
https://github.com/pytest-dev/pytest/blob/4.4.0/src/_pytest/tmpdir.py#L75
That function has some logic which looks for the presence of a lock file to detect if the directory is still being used by another pytest process:
https://github.com/pytest-dev/pytest/blob/4.4.0/src/_pytest/pathlib.py#L205
but the logic assumes that any lock file older than the given "lock timeout" (hardcoded to 3 hours) is stale and can just be deleted. But if the other pytest process is actually still running after 3 hours, its tmp directory will be deleted while it is still in use.
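To illustrate the failure mode, here is a minimal sketch of an age-only staleness check. It is a simplified model, not pytest's actual code: `lock_looks_stale`, `demo`, and the `pytest-0/.lock` layout are names made up for this example, and only the 3-hour value comes from the description above.

```python
import os
import tempfile
import time
from pathlib import Path

# The 3-hour lock timeout described above, in seconds.
LOCK_TIMEOUT = 3 * 60 * 60


def lock_looks_stale(lock_path: Path, now: float, timeout: float = LOCK_TIMEOUT) -> bool:
    """Hypothetical helper: treat a lock as stale purely by its age.

    Because age is the only criterion, a live but long-running owner
    cannot be told apart from one that died without cleaning up.
    """
    try:
        mtime = lock_path.stat().st_mtime
    except OSError:
        # No lock file: nothing claims the directory in this model.
        return True
    return now - mtime > timeout


def demo() -> bool:
    """Simulate a lock held by a process that has run for just over 3 hours."""
    with tempfile.TemporaryDirectory() as root:
        lock = Path(root) / "pytest-0" / ".lock"  # illustrative layout only
        lock.parent.mkdir()
        lock.touch()
        # Backdate the lock's mtime; the owning process is assumed to be alive.
        created = time.time() - (LOCK_TIMEOUT + 60)
        os.utime(lock, (created, created))
        return lock_looks_stale(lock, now=time.time())


if __name__ == "__main__":
    print("lock considered stale:", demo())
```

In this model the directory of a still-running session becomes eligible for deletion as soon as the lock's age passes the timeout, which matches the failures we see near the end of runs longer than 3 hours.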
The Linux worker is CentOS 7.8 with Python 3.6.8 and pytest 4.4.0. The pytest master branch still has the same 3-hour timeout logic, so I think this bug affects all recent versions of pytest.