Skip to content

Catch exceptions when emitting records #15

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Oct 25, 2016

Conversation

mikekap
Copy link
Contributor

@mikekap mikekap commented Oct 14, 2016

So the thread emitting them doesn't die. Also kill the process if the runner thread dies for some reason.

Closes #14

So the thread emitting them doesn't die. Also kill the process if the runner thread dies for some reason.
Copy link
Collaborator

@errm errm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happy to merge if we drop abort_on_exception bit

@@ -29,6 +29,7 @@ def start
@running = true
pos_writer.start
@thread = Thread.new(&method(:run))
@thread.abort_on_exception = true
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is that Thread#abort_on_exception is a global flag. I.e. Setting it here will cause an exception in any thread to cause the program to exit... I am not sure that this is what we wan't here, And I am not sure what the expectations of other code in the fluent ecosystem wrt to this flag, It is prob not the responsiblity of a plugin to set this flag one way or another and should really fall to fluentd...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep definitely agree that we shouldn't set it globally. I tried to set the thread-local flag here (http://ruby-doc.org/core-1.9.3/Thread.html#method-i-abort_on_exception) but I don't know Ruby well enough to know that I succeeded. Does that look like the right way to call the instance method rather than the class method?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you are calling the instance method here.

As I understand the documentation, the accessor #abort_on_exception reports the thread local state of this flag, but the writer #abort_on_exception= sets this flag globally.

When set to true, causes all threads (including the main program) to abort if an exception is raised in thr. The process will effectively exit(0)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like the docs are wrong. I just tested this via

#Thread.abort_on_exception = true
thread = Thread.new { sleep }
thread.abort_on_exception = true

thread2 = Thread.new { raise "Exception from thread" }
thread.join

If you uncomment the Thread.... this will exit; otherwise this will hang. I opened an issue in the ruby bug tracker to fix the docs: https://bugs.ruby-lang.org/issues/12847

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hah seems we both misread that doc. To highlight: When set to true, causes all threads to abort if an exception is raised in thr - i.e. the thread that you're setting this on, not all of them.

router.emit(tag, entry.realtime_timestamp.to_i, formatted(entry))
rescue => e
log.error("Exception emitting record: #{e}")
end
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems reasonable...

Were you seeing anything being thrown at this point that could be a bug, that we should look at?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's definitely a bug, but it's not in this plugin. In my case the kubernetes client that the kubernetes metadata filter plugin uses sometimes throws an SSL error which isn't wrapped properly. This kills poor journald daemon thread 😔. My plan is to report it upstream once I save the exact exception.

@@ -29,6 +29,7 @@ def start
@running = true
pos_writer.start
@thread = Thread.new(&method(:run))
@thread.abort_on_exception = true
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As an external view, there is a race here, since the thread can start and the abort_on_exception flag is set concurrently, which means #run could exit silently if it happens to start executing before the flag is set. One fix would be to have instead Thread.current.abort_on_exception = true inside #run as the first statement.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. Done.

@reevoo-samuel
Copy link

Er, hello?

This Pull Request is too damn old! Merge or close this, sucka.

@mikekap
Copy link
Contributor Author

mikekap commented Oct 25, 2016

ping

@errm
Copy link
Collaborator

errm commented Oct 25, 2016

Thanks @mikekap this looks great now...thanks for the review @eregon

@errm errm merged commit eb717a3 into fluent-plugins-nursery:master Oct 25, 2016
errm added a commit that referenced this pull request Oct 25, 2016
Includes some improvments to the handling of Exceptions
thrown while emitting records.

#15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants