Skip to content

[mypyc] Use vectorcalls to call Python objects #10153

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 36 commits into from
Mar 6, 2021
Merged

[mypyc] Use vectorcalls to call Python objects #10153

merged 36 commits into from
Mar 6, 2021

Conversation

JukkaL
Copy link
Collaborator

@JukkaL JukkaL commented Feb 28, 2021

The vectorcall APIs can make calls to callable Python objects
significantly faster when using keyword arguments.

This makes a microbenchmark that calls interpreted functions using
keyword arguments about 1.9x faster (Python 3.8). The main benefit is
from not needing to construct a dictionary for the keyword arguments.

This also gives a minor speedup when only using positional arguments.
The nested_func microbenchmark is now about 9% faster.

The main PyObject_VectorCall API is available on Python 3.8+ (with an
underscore prefix on 3.8). On Python 3.9+ we can also use a dedicated
method vectorcall function (PyObject_VectorcallMethod).

The new APIs can be used for calls that don't use caller *args or
**kwargs. For other calls we continue to use the legacy APIs. On older
Python versions we also continue use the slower legacy APIs.

See https://docs.python.org/3.9/c-api/call.html for more details.

Vectorcalls require arguments to be stored in a C array. Add the
RArray primitive type to support them. Also add the AssignMulti op to
initialize an array. By having a dedicated op for this makes it easy
to generate compact C code. This is useful since calls are common
operations.

The reference count transform can't reason about references held in
RArray values. Add a no-op KeepAlive op that is only used to ensure
that the operands aren't freed until after the call op. KeepAlive must
also be used with SetMem and LoadMem, for consistency. Previously
these ops had similar functionality built in.

Generate vectorcall ops in IR build test cases only when the test case
opts into them by specifying a recent enough Python version through a
_python3_8 test name suffix (or _python_3_9). This avoids having to
modify a large number of test cases.

@msullivan msullivan self-requested a review March 1, 2021 03:04
@TH3CHARLie TH3CHARLie self-requested a review March 1, 2021 09:58
s = str(val)
if val >= (1 << 31):
# Avoid overflowing signed 32-bit int
s += 'U'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have a test for this?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added unit test.

return isinstance(other, RVoid)

def __hash__(self) -> int:
return 12345
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks weird, maybe use something like hash(RVoid) (the class object)?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

self.item_type = item_type
# Number of items
self.length = length
self._ctype = 'xxx'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this is needed? I mean the xxx.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed it. It seems useless.


Note that this interacts atypically with reference counting. We
assume that each RArray register is initialized exactly once
with this op.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this can be enforced somehow, for example have a class level set of destination registers, and check in __init__ that the destination is not in this set?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's something we should do once we have general validation of the generated IR. However, this is probably not the best starting point for IR validation. Anyway, the C compilation will fail if we try to initialize an array twice.

Copy link
Collaborator

@TH3CHARLie TH3CHARLie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally looks great to me and this is a great improvement! Left a comment on potential long-term improvement

@@ -488,6 +502,10 @@ def visit_load_address(self, op: LoadAddress) -> None:
src = self.reg(op.src) if isinstance(op.src, Register) else op.src
self.emit_line('%s = (%s)&%s;' % (dest, typ._ctype, src))

def visit_keep_alive(self, op: KeepAlive) -> None:
# This is a no-op.
pass
Copy link
Collaborator

@TH3CHARLie TH3CHARLie Mar 4, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea of having KeepAlive, but I wonder whether we should keep this analysis-only IR all the way to codegen. If more analysis-only (or mid-end only) IRs are introduced in the future, it may become reasonable to eliminate them first.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This actually gets stripped along the way. I'll try raising an exception if we reach here.

JukkaL added 3 commits March 6, 2021 16:55
Otherwise we might have instances of KeepAlive in the IR during
code generation, which we want to avoid.
@JukkaL JukkaL merged commit 7b7cce3 into master Mar 6, 2021
@JukkaL JukkaL deleted the vectorcall-4 branch March 6, 2021 17:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants