When a Rust program panics, it tries to get symbols for the backtrace. In the case of ELF executables, the symbols don't have to be in the same executable, and can be loaded from external locations via build-id or GNU debuglink paths.
This is generally very nice and works well, but it causes problems in more complex deployments:
It tries to open and mmap the files, which doesn't work when the program is sandboxed. I can make the sandboxed program ask for the right file, but I can't just make open work. For example, seccomp can't deny/allow access based on paths, but can allow pre-opening a file descriptor before denying all other filesystem access. It also means that I need to configure seccomp to be softer and gracefully deny access with EPERM rather than kill the process immediately when it tries to call open, as otherwise panics would get the process killed before it had a chance to print the backtrace.
The debug symbols can be large. I have servers that intentionally don't have any local disks, and only run off a RAM-based filesystem, but this means that disk space is precious. Keeping debug symbol files on disk just in case a server panics is wasteful. I have a debug symbol server which allows fetching symbols on demand over HTTP, so I'd prefer to fetch them lazily.
I've considered using a network mount for the symbols dir, but that makes a lot of other things more complicated (e.g. the symbols are installed and upgraded via package manager, so the network mount would have to be writeable and reliable to make the package manager happy).
AFAIK currently backtrace-rs is built together with libstd, so it's not easy to patch or replace.
Would it be possible to add some sort of hook to Rust/libstd/backtrace-rs to give control over how the debug symbols are obtained? I imagine something similar to the panic hook, called when backtrace-rs wants to create a new mapping.
Setting the panic hook would be a workaround today, at the expense of having to manually depend on backtrace-rs and copy/paste much of the std/backtrace code.
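To make the workaround concrete, here is a minimal sketch of replacing the default hook using only std. The `REPORTS` sink and the `install_recording_hook` name are made up for illustration; a sandboxed program would more likely write to a file descriptor it opened before entering the sandbox. Note that this only replaces what happens at panic time. Controlling where symbols come from would still require calling backtrace-rs from inside the hook, as described above.

```rust
use std::panic;
use std::sync::Mutex;

// Hypothetical in-memory sink for panic reports (illustration only).
static REPORTS: Mutex<Vec<String>> = Mutex::new(Vec::new());

/// Replace the default panic hook, so std's own backtrace printing is
/// no longer invoked, and record the message and location instead.
pub fn install_recording_hook() {
    panic::set_hook(Box::new(|info| {
        // The payload is usually a &str or a String, depending on how
        // panic! was invoked.
        let payload = info.payload();
        let message = if let Some(s) = payload.downcast_ref::<&str>() {
            (*s).to_string()
        } else if let Some(s) = payload.downcast_ref::<String>() {
            s.clone()
        } else {
            "<non-string panic payload>".to_string()
        };
        let location = info
            .location()
            .map(|l| format!("{}:{}", l.file(), l.line()))
            .unwrap_or_else(|| "<unknown>".to_string());
        // Ignore a poisoned lock rather than panicking inside the hook.
        if let Ok(mut reports) = REPORTS.lock() {
            reports.push(format!("panic at {location}: {message}"));
        }
    }));
}
```

Calling `install_recording_hook()` early in `main` is enough; any later panic lands in `REPORTS` instead of on stderr.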
In any case, many people have wanted more control over backtraces. Even aside from the size of debug symbols, the symbolization code itself is large (which can matter for some applications) and sometimes gnarly. I think the most promising way forward is one of the following RFCs that seeks to generalise the way #[global_allocator] works so it can be used for other things.
Seems like both issues could be solved by passing a socket that can talk whatever protocol debuginfod is speaking, then the server can take care of lookup?
My understanding is that debuginfod doesn't use sockets. Rather, the client library downloads files from the debuginfod server over HTTP. I think adding HTTP+TLS support to every Rust executable is way too much, and a terrible idea especially for security-sensitive privileged executables.
I would like the ability (in release mode) to just dump basic library+offset and then have tooling to resolve this to the proper address later.
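A rough sketch of what "library+offset" could look like, assuming the list of loaded modules and their address ranges is already known (on Linux it could be read from /proc/self/maps, for example). The `Module` type and both function names are hypothetical, not an existing API. The offset here is relative to the start of the mapping; an offline tool would still need the ELF segment layout to turn it into an address in the debug info.

```rust
/// One loaded module (executable or shared library) and the address
/// range it occupies in memory. Hypothetical type for illustration.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Module {
    pub name: String,
    pub start: u64, // first address of the mapping (inclusive)
    pub end: u64,   // end of the mapping (exclusive)
}

/// Translate an absolute address into (module name, offset from the
/// start of that module's mapping), or None if no module contains it.
pub fn to_module_offset(modules: &[Module], addr: u64) -> Option<(&str, u64)> {
    modules
        .iter()
        .find(|m| addr >= m.start && addr < m.end)
        .map(|m| (m.name.as_str(), addr - m.start))
}

/// Format one frame as `module+0xoffset`, which can be symbolised
/// later on a machine that has the debug info.
pub fn format_frame(modules: &[Module], addr: u64) -> String {
    match to_module_offset(modules, addr) {
        Some((name, offset)) => format!("{name}+{offset:#x}"),
        None => format!("<unknown>+{addr:#x}"),
    }
}
```

With a module `app` mapped at `0x1000..0x2000`, the address `0x1234` would be reported as `app+0x234`, which needs no symbol files on the machine that crashed.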
There are things like the Chromium and Mozilla crash handlers. But those are complex pieces of software hardwired to connect to servers for reporting (I was looking for options a few years ago). Great, except if you make airgapped/offline software (like I did). The only option was often to get the crash dump on a USB drive which someone carried/drove back to a location with Internet and then sent the info to us by email.
Another issue is for smaller open source projects (e.g. single developer on github) where you don't have the ability to run a big crash reporting server and all the infra like Mozilla or similar sized projects might. Still, you would prefer to not include the debug info but have a way for user to send in some form of backtrace that you can symbolise afterwards.
Both of these scenarios are ill served currently, because you need at a minimum to attach additional info to the trace to be able to figure out which exact build and debug info to use for symbolising. That isn't possible currently.
Also the dump just goes to stderr, while you might want to launch a separate crash handler process that pops up a dialog with some info on what happened and how to report it. Or store it in a log directory and automatically reboot the system.
Servers might have yet different needs, maybe report it automatically to some observability platform or such (I'll have to be vague on this, it is not an area I have worked in, except for airgapped servers).
Replacing the entire panic handler is a big hammer, and it feels like there should be something with more customisation points instead. Also, as I understand it, the panic handler won't be able to handle SIGSEGV/SIGBUS etc?
I don't think there is currently a comprehensive solution for this in the Rust ecosystem (as far as I have found, at least).
No, the backtrace is printed before the abort or unwind (or rather the panic hook is called, which by default prints a backtrace). The unstable panic_immediate_abort does abort without calling the panic hook because it's equivalent to manually calling abort.
Edited to add: Interestingly this also means that the panic message is printed even if you use catch_unwind.
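A small sketch of that ordering, assuming the default panic=unwind strategy. The `HOOK_RAN` flag and the `panic_and_catch` function are invented for the demonstration: the hook runs at the point of the panic, before unwinding reaches catch_unwind, so catching the panic does not suppress whatever the hook does.

```rust
use std::panic;
use std::sync::atomic::{AtomicBool, Ordering};

// Set by the custom hook so we can observe that it ran.
static HOOK_RAN: AtomicBool = AtomicBool::new(false);

/// Panic inside catch_unwind and report whether the panic was caught.
/// The hook fires first, even though the panic never escapes.
pub fn panic_and_catch() -> bool {
    panic::set_hook(Box::new(|_| {
        HOOK_RAN.store(true, Ordering::SeqCst);
    }));

    let result = panic::catch_unwind(|| {
        panic!("caught later");
    });

    // Unregister our hook, which reinstates the default one.
    let _ = panic::take_hook();
    result.is_err()
}
```

After `panic_and_catch()` returns `true`, `HOOK_RAN` is set as well. With the default hook in place of this one, that is the step where the message (and backtrace, if enabled) gets printed.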
And unfortunately neither is what we use: we catch_unwind on Android to turn panics into AssertionErrors. And even if we didn’t, we’d still want to stick a backtrace into our log file; it would just never be symbolicated. As mentioned we can override the panic hook, but std::backtrace will always contain the code to try to load symbols, even with -Zbuild-std, even if we dynamically don’t request it. At least, that’s where things were last time I looked into this.
Yes, this is still true. If one of the RFCs I linked earlier is accepted then that would allow it to be pluggable. A solution that I've brought up before would be to add an unstable feature to std that makes the default hook a no-op or at least removes symbolisation. This can then be used with -Zbuild-std-features. This idea was rejected in favour of a more principled solution but maybe libs can be persuaded to provide it as an interim solution.
Not if you disable the default backtrace feature of std, right? The trick is to also pass -Zbuild-std-features="", because that stops the backtrace feature from being included. With just -Zbuild-std, libstd will be built with the backtrace feature enabled.
Well, yes, but I want a backtrace, I just don’t want it symbolicated. One way to accomplish this would be to split out, e.g. backtrace-unsymbolicated as a separate feature in the std build, but I figure if I proposed that it would end up back here for a general symbolication overhaul (also I don’t currently have the impetus to drive a proposal).