Skip to content

Support making a variable Optional in an else branch #11002

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Aug 27, 2021
Merged

Conversation

msullivan
Copy link
Collaborator

That is, support patterns such as:

if condition:
    foo = Foo()
else:
    foo = None

Currently this does not work, but the the reverse does (because foo
will be inferred as a PartialType).

I think it might be worth tackling this in a more general way, for
other types, though I think that is a little fiddlier and likely to be
more controversial, so I'm starting with something special-cased for
the "assigning literal None" case first.

The rule we implement is that we allow updating the type of a variable
when assigning None to it if the variable's type was inferred and it
was defined in an earlier branch of the same if/then/else statement.

Some infrastructure is added to make determinations about that.

Given that this is probably my single biggest frustration with mypy,
and that this PR took me less than three hours to prepare, I am pretty
angry at myself for not having done this three years ago.

That is, support patterns such as:
```
if condition:
    foo = Foo()
else:
    foo = None
```

Currently this does not work, but the the *reverse* does (because foo
will be inferred as a PartialType).

I think it might be worth tackling this in a more general way, for
other types, though I think that is a little fiddlier and likely to be
more controversial, so I'm starting with something special-cased for
the "assigning literal None" case first.

The rule we implement is that we allow updating the type of a variable
when assigning `None` to it if the variable's type was inferred and it
was defined in an earlier branch of the same `if/then/else` statement.

Some infrastructure is added to make determinations about that.

Given that this is probably my single biggest frustration with mypy,
and that this PR took me less than three hours to prepare, I am pretty
angry at myself for not having done this three years ago.
@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

Copy link
Member

@emmatyping emmatyping left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems really great, its a common annoyance so fixing it would be a nice win :)

I had a couple of suggestions but otherwise looks good.

@@ -832,3 +832,99 @@ main:4: error: Argument 1 to "asdf" has incompatible type "List[str]"; expected
main:4: note: "List" is invariant -- see https://mypy.readthedocs.io/en/stable/common_issues.html#variance
main:4: note: Consider using "Sequence" instead, which is covariant
[builtins fixtures/list.pyi]

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other ideas for tests would be mixing different kinds of frames and control flow e.g. try/except and with statements

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, related to that: maybe it's worth supporting this pattern for try/except also?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I decided to try that, though I could be easily convinced to drop it!

@github-actions

This comment has been minimized.

return z # E: Incompatible return value type (got "Optional[int]", expected "int")

def f3(b: bool) -> int:
# XXX: This one is a little questionable! Maybe we *do* want to allow this?
Copy link
Member

@emmatyping emmatyping Aug 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this would be nice to have, but it probably should be discussed more (esp. since there are few cases where the type is different from the first assignment)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, and if we want to allow this, this can be implemented in a separate PR.

Copy link
Member

@emmatyping emmatyping left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this looks good. Might be nice to get some more opinions on the TODO re: assigning None in an if clause, but otherwise it seems good to land.

@msullivan
Copy link
Collaborator Author

@JukkaL what do you think?

Copy link
Collaborator

@JukkaL JukkaL left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this is a very nice feature that I've also wanted to have for a long time! I didn't do a full review yet, but I came up with various additional test cases.

return z # E: Incompatible return value type (got "Optional[int]", expected "int")

def f3(b: bool) -> int:
# XXX: This one is a little questionable! Maybe we *do* want to allow this?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, and if we want to allow this, this can be implemented in a separate PR.

else:
y = None
reveal_type(y) # N: Revealed type is "Union[builtins.int, None]"

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideas for additional test cases below. I'm not sure what the semantics should be in some of these cases, but at least it makes sense to have some well-defined semantics for these. Omitted most reveal_type(z) etc. for brevity.

Case 1:

if b:
    z = 2
elif c:
    z = None
elif d:
    z = 3

Case 2:

if c:
    x = []
else:
    x = None

Case 3:

if b:
    z = 4
else:
    if c:
        z = 5
    else:
        z = None

Case 4:

Use this at module top level, or within class body.

Case 5:

if b:
    z = 5
else:
    z = None
    if c:
        z = 6

Case 6:

if b:
    x = []
    x.append(1)
else:
    x = None

Case 7:

if b:
    x: Any = 5
else:
    x = None
reveal_type(x)

Case 8:

[Tricky, optional] Use at module top level, and have a forward reference that causes two passes over the top level during type checking.

Case 9:

[Tricky, optional] Have a forward reference where the type of the reference target is not ready yet within a function between the first and second assignment. This causes the function to be deferred.

Case 10:

Define an attribute using self.x = 5 / self.x = None. I think that it's not safe to allow this, and we may need to reject this, since the attribute type is visible outside the method (see below for an example where this might cause trouble).

Case 11:

[Very tricky, optional] Define an attribute, and cause the method to be deferred after the initial self.x assignment. Now another method refers to the x attribute, and it should not see the non-optional type.

Copy link
Collaborator Author

@msullivan msullivan Aug 25, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I've addressed most of the tests. Some notes:
2. Demands a type annotation
4. Is disallowed at top level. I'm not sure it has to be, but there could be complications and it seems unlikely to come up? We ignore = None assignments at class level so that pre 3.6 code can write type comments, so nothing useful happens there.
6. Works
7. Infers Any
8, 10, 11: All are disallowed
9: Works fine

@github-actions

This comment has been minimized.

Copy link
Collaborator

@JukkaL JukkaL left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the updates! Looks pretty good. It is hacky, but this is an important use case so it seems worth it, as I can't think of a cleaner way to do this (without a major redesign). Left a few remaining comments. Feel free to merge once you've addressed them.

mypy/checker.py Outdated
@@ -203,6 +203,12 @@ class TypeChecker(NodeVisitor[None], CheckerPluginInterface):
# directly or indirectly.
module_refs: Set[str]

# A map from variable nodes to a snapshot of the current frames
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Current frames" here means frames active when the variable is first initialized? Can you explain this in more detail here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added some more text. Also realized that we don't need to store the bools at all, since we can just read it out from the frame

mypy/checker.py Outdated
# A map from variable nodes to a snapshot of the current frames
# and whether they are associated with `if` statements. This can
# be used to determine if a variable is defined in a different
# branch of the same `if` statement.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mention that the inner dictionary is keyed by frame id.

Maybe clarify that if we have the same frame id in both first initialization and follow-up assignment with a true value, they are in different branches of the same if statement (assuming that I got this right)?

@github-actions
Copy link
Contributor

Diff from mypy_primer, showing the effect of this PR on open source code:

jax (https://github.com/google/jax.git)
+ jax/interpreters/pxla.py:807: error: Argument 1 to "len" has incompatible type "Optional[List[Any]]"; expected "Sized"  [arg-type]
- jax/_src/numpy/lax_numpy.py:1846: error: Argument 1 to "_wraps" has incompatible type "Optional[Any]"; expected "Callable[..., Any]"  [arg-type]
+ jax/_src/numpy/lax_numpy.py:1846: note: (Skipping most remaining errors due to unresolved imports or missing stubs; fix these first)
- jax/experimental/host_callback.py:1705: note: (Skipping most remaining errors due to unresolved imports or missing stubs; fix these first)

aioredis (https://github.com/aio-libs/aioredis.git)
- aioredis/lock.py:218: error: Incompatible types in assignment (expression has type "None", variable has type "int")
- aioredis/client.py:3403: error: Incompatible types in assignment (expression has type "None", variable has type "ValuesView[Any]")

pandas (https://github.com/pandas-dev/pandas.git)
+ pandas/plotting/_matplotlib/core.py:1664: error: unused "type: ignore" comment

poetry (https://github.com/python-poetry/poetry.git)
+ poetry/mixology/failure.py:144: error: Item "NoVersionsCause" of "Union[RootCause, NoVersionsCause, DependencyCause, ConflictCause, PythonCause, PlatformCause, PackageNotFoundCause]" has no attribute "conflict"

@msullivan msullivan merged commit 56b6803 into master Aug 27, 2021
@msullivan msullivan deleted the rage-optional branch August 27, 2021 03:59
@ikonst
Copy link
Contributor

ikonst commented Jan 26, 2022

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants