issues with the semantics of top-level expressions #24569

Closed
@JeffBezanson

Top-level expressions are special because they are able to cause certain side-effects that strongly affect program behavior and that ordinary expressions can't cause. These are:

  1. Binding resolution (global, const, import, using)
  2. Method definition

Fortunately, that's all. (Type definitions could be on this list, but the only tricky part of their behavior is assigning a global name, which is equivalent to item 1.)

The most desirable behavior for any statements not inside methods is "interpreter-like": all statements should see all effects of all previous statements. Unfortunately this is at odds with compiling fast code, which would like to assume a fixed world state (i.e. world counter). This tension has led to several bugs and tricky design decisions, such as:

  - Code behaving differently based on whether it gets compiled: #2586 #24566
  - Top-level code breaking inference's fixed-world assumption: #24316
  - Problems with binding effects: #18933 #22984 #12010

Three general kinds of solutions are possible: (1) Change top-level semantics to enable optimizations, (2) make optimizations sound with respect to top-level semantics, or (3) don't optimize top-level expressions. (1) is likely to be unpopular, since it would lead to things like:

f(x) = 1
begin
    f(x::Int) = 2
    f(1)  # gives 1
end

due to inference's usual fixed-world assumption. That leads us to (2). But an optimizer that stays sound with respect to top-level semantics wouldn't be able to optimize very much. Consider a case like

for i = 1:100
    include("file$i.jl")
    f(1)
end

where we can't reasonably assume anything about what f(1) does.
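
For contrast, code inside a method body already runs under the fixed-world assumption, so it shows what option (1) would feel like at top level. The following is a minimal sketch, assuming a Julia version with the world counter (0.6 or later); `define_and_call` is a made-up name for illustration, and `Base.invokelatest` is used only to show the "interpreter-like" answer side by side.

```julia
f(x) = 1

# Hypothetical helper: defines a new method while it is already running.
function define_and_call()
    # Adds a more specific method; this advances the global world counter.
    @eval f(x::Int) = 2
    # Ordinary call: dispatches in the world in which define_and_call was
    # invoked, so the new method is not visible yet.
    fixed = f(1)
    # invokelatest dispatches in the newest world and does see the new method.
    latest = Base.invokelatest(f, 1)
    return (fixed, latest)
end

result = define_and_call()   # (1, 2) on the first call
```

A top-level statement after the call sees the new method as usual, which is the behavior the `begin` block example above would lose under option (1).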

That brings us to (3). In practice, it is basically never useful to infer and compile a top-level expression. The only exceptions are casual benchmarking (@time), and maybe an occasional long-running loop in the REPL. So to compromise, for a long time we've been compiling top-level expressions if they have loops. But that's a rather blunt instrument, since compiling loops that e.g. call eval is not worthwhile, and in other cases is unsound (#24316).

I'm not sure what to do, but overall I think we should optimize top-level code much less often (which might improve load times as well). One idea is to add a @compile or @fast macro that turns off top-level semantics, basically equivalent to wrapping the code in a 0-arg function and calling it.
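
To make the macro idea concrete, here is a rough sketch of what it could look like. The name `@fast` is taken from the suggestion above and is not an existing Base macro; the implementation is only one possible reading of "wrap the code in a 0-arg function and call it".

```julia
# Hypothetical macro, not part of Base: wraps the expression in a
# zero-argument anonymous function and calls it immediately, so the body is
# compiled like any other function body.
macro fast(ex)
    quote
        (() -> $(esc(ex)))()
    end
end

# Inside the wrapper, `s` is a local variable of the anonymous function, so
# the loop can be inferred and compiled without touching global bindings.
total = @fast begin
    s = 0
    for i = 1:1000
        s += i
    end
    s
end
```

The trade-off is the one described above: inside the wrapped code, method definitions made through `eval` would not be visible to later calls in the same block, and assignments create locals rather than globals.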

Thoughts?
