-
-
Notifications
You must be signed in to change notification settings - Fork 3k
[mypyc] Use vectorcalls to call Python objects #10153
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This way the same tests work on all supported Python versions. We'll need to also add an option to generate IR for later Python versions.
Also fix one test case. Refactor test name parsing code.
Attempt to fix tests.
s = str(val) | ||
if val >= (1 << 31): | ||
# Avoid overflowing signed 32-bit int | ||
s += 'U' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have a test for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added unit test.
mypyc/ir/rtypes.py
Outdated
return isinstance(other, RVoid) | ||
|
||
def __hash__(self) -> int: | ||
return 12345 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks weird, maybe use something like hash(RVoid)
(the class object)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
mypyc/ir/rtypes.py
Outdated
self.item_type = item_type | ||
# Number of items | ||
self.length = length | ||
self._ctype = 'xxx' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why this is needed? I mean the xxx
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed it. It seems useless.
|
||
Note that this interacts atypically with reference counting. We | ||
assume that each RArray register is initialized exactly once | ||
with this op. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this can be enforced somehow, for example have a class level set of destination registers, and check in __init__
that the destination is not in this set?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's something we should do once we have general validation of the generated IR. However, this is probably not the best starting point for IR validation. Anyway, the C compilation will fail if we try to initialize an array twice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally looks great to me and this is a great improvement! Left a comment on potential long-term improvement
@@ -488,6 +502,10 @@ def visit_load_address(self, op: LoadAddress) -> None: | |||
src = self.reg(op.src) if isinstance(op.src, Register) else op.src | |||
self.emit_line('%s = (%s)&%s;' % (dest, typ._ctype, src)) | |||
|
|||
def visit_keep_alive(self, op: KeepAlive) -> None: | |||
# This is a no-op. | |||
pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the idea of having KeepAlive
, but I wonder whether we should keep this analysis-only IR all the way to codegen. If more analysis-only (or mid-end only) IRs are introduced in the future, it may become reasonable to eliminate them first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This actually gets stripped along the way. I'll try raising an exception if we reach here.
Otherwise we might have instances of KeepAlive in the IR during code generation, which we want to avoid.
The vectorcall APIs can make calls to callable Python objects
significantly faster when using keyword arguments.
This makes a microbenchmark that calls interpreted functions using
keyword arguments about 1.9x faster (Python 3.8). The main benefit is
from not needing to construct a dictionary for the keyword arguments.
This also gives a minor speedup when only using positional arguments.
The nested_func microbenchmark is now about 9% faster.
The main PyObject_VectorCall API is available on Python 3.8+ (with an
underscore prefix on 3.8). On Python 3.9+ we can also use a dedicated
method vectorcall function (PyObject_VectorcallMethod).
The new APIs can be used for calls that don't use caller *args or
**kwargs. For other calls we continue to use the legacy APIs. On older
Python versions we also continue use the slower legacy APIs.
See https://docs.python.org/3.9/c-api/call.html for more details.
Vectorcalls require arguments to be stored in a C array. Add the
RArray primitive type to support them. Also add the AssignMulti op to
initialize an array. By having a dedicated op for this makes it easy
to generate compact C code. This is useful since calls are common
operations.
The reference count transform can't reason about references held in
RArray values. Add a no-op KeepAlive op that is only used to ensure
that the operands aren't freed until after the call op. KeepAlive must
also be used with SetMem and LoadMem, for consistency. Previously
these ops had similar functionality built in.
Generate vectorcall ops in IR build test cases only when the test case
opts into them by specifying a recent enough Python version through a
_python3_8 test name suffix (or _python_3_9). This avoids having to
modify a large number of test cases.