Skip to content

create singleton ExceptionSink object to centralize logging of fatal errors#6533

Merged
stuhood merged 4 commits into
pantsbuild:masterfrom
cosmicexplorer:setup-global-error-sink
Sep 20, 2018
Merged

create singleton ExceptionSink object to centralize logging of fatal errors#6533
stuhood merged 4 commits into
pantsbuild:masterfrom
cosmicexplorer:setup-global-error-sink

Conversation

@cosmicexplorer

Copy link
Copy Markdown
Contributor

Problem

See #6530. We want to make it simpler to understand where logging of fatal errors occurs and to consume those logs in a structured way.

Solution

Result

We now have an object that can be used to centralize the logging of fatal errors and enables the creation of a single point of failure (by which I mean, a single area of code which describes where this process's exceptions are logged and how to access and append to them), to be implemented in followup PRs.

@stuhood stuhood left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I think moving Exiter._log_exception to this class blocks landing this, but it looks fine otherwise.

Comment thread src/python/pants/bin/exception_sink.py Outdated
return cls()

def __init__(self):
self._cached_destination = None

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably default to cwd so that it is always useable, even before the destination has been set the first time.

Also I'm not sure it's really a "cached" destination... maybe just "destination".

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds great, implementing this and the below now.

Comment thread src/python/pants/bin/exception_sink.py Outdated

class ExceptionSinkError(Exception): pass

def _get_timestamp(self):

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than adding timestamp logic, I'd recommend moving the Exiter._log_exception method into this class.

And I think that if you guarantee that self._cached_destination is always "something" by initializing it to cwd, then you should be able to have set_destination log to the previous destination if it fails.

@cosmicexplorer cosmicexplorer Sep 20, 2018

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have done that and added thorough testing in c562a49!

@cosmicexplorer

cosmicexplorer commented Sep 20, 2018

Copy link
Copy Markdown
Contributor Author

Moving Exiter._log_exception() to ExceptionSink was larger in scope (conceptually, in my head) than I expected this PR to be, but it works perfectly -- the implementation is now just:

  def _log_exception(self, msg):
    if self._workdir:
      ExceptionSink.instance.set_destination(self._workdir)
    ExceptionSink.instance.log_exception(msg)

It's great.

@stuhood stuhood left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thanks!

Please: add the per-pid logs before landing.

Comment thread src/python/pants/base/exception_sink.py Outdated
class ExceptionSink(object):
"""A mutable singleton object representing where exceptions should be logged to."""

@memoized_classproperty

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A pattern we've used elsewhere to do the "mutable singleton" thing would be to ban calls to __new__, and then put all properties directly on the class.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

Comment thread src/python/pants/base/exception_sink.py Outdated
return cls()

def __init__(self, destination=None):
# TODO: what assumptions can we make about the current directory of a process? Does it have to

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The point of initializing it to something is just to increase the chances that even before it's been configured that we're still able to capture exceptions somewhere... so we're not really make assumptions, other than trying to increase our chances. The goal isn't every to actually use it without configuring it... rather to just still catch errors if we happen to fail that early.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, my concern here was more "can we assume we can write to os.getcwd()?"

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can't, but that's not cromulent given the above.

def _iso_timestamp_for_now(self):
return datetime.datetime.now().isoformat()

# NB: This includes a trailing newline, but no leading newline.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Making the logs structured is fine. But we are still going to want to have a separate log per $pid. The client shouldn't have to parse anything... just look for a log with the relevant pid.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

@stuhood stuhood left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@classmethod
def set_destination(cls, dir_path):
if not is_writable_dir(dir_path):
# TODO: when this class sets up excepthooks, raising this should be safe, because we always

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could just be a note.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can see you've removed it in #6539, sounds great.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you need any of this in your next PR, let me know and I can backport it. Or feel free to just cherry-pick to your branch.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively, I could land #6539 with the test marked xfail.

@cosmicexplorer cosmicexplorer Sep 21, 2018

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll cherry-pick whatever I need, I was thinking -- the changes are mostly orthogonal to what I have now. Landing with it marked xfail is a-ok to me as long as the changes don't break other tests (wouldn't think so).

class ExceptionSink(object):
"""A mutable singleton object representing where exceptions should be logged to."""

_destination = os.getcwd()

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should probably explain why we do this ("blah blah increase the chances that if an exception occurs early in initialization we still record it somewhere").

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Absolutely agree, this is highly mysterious as is.

# avoids any errors with concurrent modification of the shared log affecting the per-pid log.
with safe_open(cls._exceptions_log_path(for_pid=pid), 'a') as pid_error_log:
pid_error_log.write(fatal_error_log_entry)
# TODO: we should probably guard this against concurrent modification somehow.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that because this is a single write call it's fairly likely it goes to the log atomically. Not sure though.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The implication from a few minutes of searching is that people have lots of ways to create a file atomically, but appending atomically is a less common use case. I might check the CPython source for fun later. My first question would be: if I open a file to append with mode 'a', does python seek to the end of the file in the open call, or in the write() call? I would assume the latter and since this is a small message I could see atomicity. I will flesh out this TODO somewhat in the followup PR.

@stuhood stuhood merged commit 17ddd39 into pantsbuild:master Sep 20, 2018
stuhood pushed a commit that referenced this pull request Sep 22, 2018
### Problem

#6530 describes cases where when `pantsd` (or a `pantsd-runner` instance) is killed, there can be gaps in our exception logging (and rendering) that result in exceptions not reaching either the client or a log.

### Solution

Build atop #6533 to consume written logs and render them for a `pantsd` client. When either `pantsd` or a `pantsd-runner` instance die, if the `ExceptionSink` (or `Exiter`) has installed an exception hook that renders the exception to the appropriate `pid`, it will be logged, and rendered to the user.

### Result

Killed `pantsd-runner` or `pantsd` processes have increased visibility and debuggability.
cosmicexplorer added a commit that referenced this pull request Oct 3, 2018
…errors (#6533)

### Problem

See #6530. We want to make it simpler to understand where logging of fatal errors occurs and to consume those logs in a structured way.

### Solution

- First step in implementing #6530, a (process-)global singleton to manage where logs are kept (no logging is done yet).
- Add some very basic validation of input and very basic tests.

### Result

We now have an object that can be used to centralize the logging of fatal errors and enables the creation of a single point of failure (by which I mean, a single area of code which describes where this process's exceptions are logged and how to access and append to them), to be implemented in followup PRs.
cosmicexplorer pushed a commit that referenced this pull request Oct 3, 2018
### Problem

#6530 describes cases where when `pantsd` (or a `pantsd-runner` instance) is killed, there can be gaps in our exception logging (and rendering) that result in exceptions not reaching either the client or a log.

### Solution

Build atop #6533 to consume written logs and render them for a `pantsd` client. When either `pantsd` or a `pantsd-runner` instance die, if the `ExceptionSink` (or `Exiter`) has installed an exception hook that renders the exception to the appropriate `pid`, it will be logged, and rendered to the user.

### Result

Killed `pantsd-runner` or `pantsd` processes have increased visibility and debuggability.
cosmicexplorer added a commit that referenced this pull request Oct 10, 2018
…errors (#6533)

### Problem

See #6530. We want to make it simpler to understand where logging of fatal errors occurs and to consume those logs in a structured way.

### Solution

- First step in implementing #6530, a (process-)global singleton to manage where logs are kept (no logging is done yet).
- Add some very basic validation of input and very basic tests.

### Result

We now have an object that can be used to centralize the logging of fatal errors and enables the creation of a single point of failure (by which I mean, a single area of code which describes where this process's exceptions are logged and how to access and append to them), to be implemented in followup PRs.
cosmicexplorer pushed a commit that referenced this pull request Oct 10, 2018
### Problem

#6530 describes cases where when `pantsd` (or a `pantsd-runner` instance) is killed, there can be gaps in our exception logging (and rendering) that result in exceptions not reaching either the client or a log.

### Solution

Build atop #6533 to consume written logs and render them for a `pantsd` client. When either `pantsd` or a `pantsd-runner` instance die, if the `ExceptionSink` (or `Exiter`) has installed an exception hook that renders the exception to the appropriate `pid`, it will be logged, and rendered to the user.

### Result

Killed `pantsd-runner` or `pantsd` processes have increased visibility and debuggability.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants