author | Mark Wielaard <[email protected]> | 2020-06-11 23:16:21 +0200 |
---|---|---|
committer | Mark Wielaard <[email protected]> | 2020-06-11 23:16:21 +0200 |
commit | 50a6eeef7d87623faa65126dc3d16c2a8e613aea | |
tree | 19a35135efaac56c49a30316c6572c7b4d6ec4aa /doc/debuginfod.8 | |
parent | 49f13584d60322578c19b6118393ab04236ca7bf | |
parent | a2bc0214a5615551d89cef8d160bdbaafd5f1a83 |
Merge tag 'elfutils-0.180' into mjw/RH-DTSdts-0.180
elfutils 0.180 release
Diffstat (limited to 'doc/debuginfod.8')
-rw-r--r-- | doc/debuginfod.8 | 193 |
1 files changed, 124 insertions, 69 deletions
diff --git a/doc/debuginfod.8 b/doc/debuginfod.8 index 210550e8..a645ceed 100644 --- a/doc/debuginfod.8 +++ b/doc/debuginfod.8 @@ -24,7 +24,7 @@ debuginfod \- debuginfo-related http file-server daemon .SH DESCRIPTION \fBdebuginfod\fP serves debuginfo-related artifacts over HTTP. It periodically scans a set of directories for ELF/DWARF files and their -associated source code, as well as RPM files containing the above, to +associated source code, as well as archive files containing the above, to build an index by their buildid. This index is used when remote clients use the HTTP webapi, to fetch these files by the same buildid. @@ -34,17 +34,23 @@ debuginfod servers, it queries them for the same information, just as \fBdebuginfod-find\fP would. If successful, it locally caches then relays the file content to the original requester. -If the \fB\-F\fP option is given, each listed PATH creates a thread to -scan for matching ELF/DWARF/source files under the given physical -directory. Source files are matched with DWARF files based on the -AT_comp_dir (compilation directory) attributes inside it. Duplicate -directories are ignored. You may use a file name for a PATH, but -source code indexing may be incomplete; prefer using a directory that -contains the binaries. Caution: source files listed in the DWARF may -be a path \fIanywhere\fP in the file system, and debuginfod will -readily serve their content on demand. (Imagine a doctored DWARF file -that lists \fI/etc/passwd\fP as a source file.) If this is a concern, -audit your binaries with tools such as: +Indexing the given PATHs proceeds using multiple threads. One thread +periodically traverses all the given PATHs logically or physically +(see the \fB\-L\fP option). Duplicate PATHs are ignored. You may use +a file name for a PATH, but source code indexing may be incomplete; +prefer using a directory that contains the binaries. 
The traversal +thread enumerates all matching files (see the \fB\-I\fP and \fB\-X\fP +options) into a work queue. A collection of scanner threads (see the +\fB\-c\fP option) wait at the work queue to analyze files in parallel. + +If the \fB\-F\fP option is given, each file is scanned as an ELF/DWARF +file. Source files are matched with DWARF files based on the +AT_comp_dir (compilation directory) attributes inside it. Caution: +source files listed in the DWARF may be a path \fIanywhere\fP in the +file system, and debuginfod will readily serve their content on +demand. (Imagine a doctored DWARF file that lists \fI/etc/passwd\fP +as a source file.) If this is a concern, audit your binaries with +tools such as: .SAMPLE % eu-readelf -wline BINARY | sed -n '/^Directory.table/,/^File.name.table/p' @@ -55,33 +61,55 @@ or even use debuginfod itself: ^C .ESAMPLE -If the \fB\-R\fP option is given each listed PATH creates a thread to -scan for ELF/DWARF/source files contained in matching RPMs under the -given physical directory. Duplicate directories are ignored. You may -use a file name for a PATH, but source code indexing may be -incomplete; prefer using a directory that contains normal RPMs -alongside debuginfo/debugsource RPMs. Because of complications such -as DWZ-compressed debuginfo, may require \fItwo\fP scan passes to -identify all source code. Source files for RPMs are only served -from other RPMs, so the caution for \-F does not apply. +If any of the \fB\-R\fP, \fB-U\fP, or \fB-Z\fP options is given, each +file is scanned as an archive file that may contain ELF/DWARF/source +files. Archive files are recognized by extension. If \-R is given, +".rpm" files are scanned; if \-D is given, ".deb" and ".ddeb" files +are scanned; if \-Z is given, the listed extensions are scanned. +Because of complications such as DWZ-compressed debuginfo, may require +\fItwo\fP traversal passes to identify all source code. 
Source files +for RPMs are only served from other RPMs, so the caution for \-F does +not apply. Note that due to Debian/Ubuntu packaging policies & +mechanisms, debuginfod cannot resolve source files for DEB/DDEB at +all. -If no PATH is listed, or neither \-F nor \-R option is given, then -\fBdebuginfod\fP will simply serve content that it scanned into its -index in previous runs: the data is cumulative. - -File names must match extended regular expressions given by the \-I -option and not the \-X option (if any) in order to be considered. +If no PATH is listed, or none of the scanning options is given, then +\fBdebuginfod\fP will simply serve content that it accumulated into +its index in all previous runs, and federate to any upstream +debuginfod servers. .SH OPTIONS .TP .B "\-F" -Activate ELF/DWARF file scanning threads. The default is off. +Activate ELF/DWARF file scanning. The default is off. + +.TP +.B "\-Z EXT" "\-Z EXT=CMD" +Activate an additional pattern in archive scanning. Files with name +extension EXT (include the dot) will be processed. If CMD is given, +it is invoked with the file name added to its argument list, and +should produce a common archive on its standard output. Otherwise, +the file is read as if CMD were "cat". Since debuginfod internally +uses \fBlibarchive\fP to read archive files, it can accept a wide +range of archive formats and compression modes. The default is no +additional patterns. This option may be repeated. .TP .B "\-R" -Activate RPM file scanning threads. The default is off. +Activate RPM patterns in archive scanning. The default is off. +Equivalent to \fB\%\-Z\~.rpm=cat\fP, since libarchive can natively +process RPM archives. If your version of libarchive is much older +than 2020, be aware that some distributions have switched to an +incompatible zstd compression for their payload. You may experiment +with \fB\%\-Z\ .rpm='(rpm2cpio|zstdcat)<'\fP instead of \fB\-R\fP. 
+ +.TP +.B "\-U" +Activate DEB/DDEB patterns in archive scanning. The default is off. +Equivalent to \fB\%\-Z\ .deb='dpkg-deb\ \-\-fsys\-tarfile\fP' +\fB\%\-Z\ .ddeb='dpkg-deb\ \-\-fsys\-tarfile'\fP. .TP .B "\-d FILE" "\-\-database=FILE" @@ -91,7 +119,7 @@ data. It will contain absolute file path names, so it may not be portable across machines. It may be frequently read/written, so it should be on a fast filesytem. It should not be shared across machines or users, to maximize sqlite locking performance. The -default database file is $HOME/.debuginfod.sqlite. +default database file is \%$HOME/.debuginfod.sqlite. .TP .B "\-D SQL" "\-\-ddl=SQL" @@ -102,9 +130,10 @@ repeated. The default is nothing extra. .TP .B "\-p NUM" "\-\-port=NUM" -Set the TCP port number on which debuginfod should listen, to service -HTTP requests. Both IPv4 and IPV6 sockets are opened, if possible. -The webapi is documented below. The default port number is 8002. +Set the TCP port number (0 < NUM < 65536) on which debuginfod should +listen, to service HTTP requests. Both IPv4 and IPV6 sockets are +opened, if possible. The webapi is documented below. The default +port number is 8002. .TP .B "\-I REGEX" "\-\-include=REGEX" "\-X REGEX" "\-\-exclude=REGEX" @@ -114,13 +143,14 @@ extended REs, thus may include alternation. They are evaluated against the full path of each file, based on its \fBrealpath(3)\fP canonicalization. By default, all files are included and none are excluded. A file that matches both include and exclude REGEX is -excluded. (The \fIcontents\fP of RPM files are not subject to -inclusion or exclusion filtering: they are all processed.) +excluded. (The \fIcontents\fP of archive files are not subject to +inclusion or exclusion filtering: they are all processed.) Only the +last of each type of regular expression given is used. .TP .B "\-t SECONDS" "\-\-rescan\-time=SECONDS" -Set the rescan time for the file and RPM directories. 
This is the -amount of time the scanning threads will wait after finishing a scan, +Set the rescan time for the file and archive directories. This is the +amount of time the traversal thread will wait after finishing a scan, before doing it again. A rescan for unchanged files is fast (because the index also stores the file mtimes). A time of zero is acceptable, and means that only one initial scan should performed. The default @@ -143,8 +173,8 @@ independent of the groom time (including if it was zero). .B "\-G" Run an extraordinary maximal-grooming pass at debuginfod startup. This pass can take considerable time, because it tries to remove any -debuginfo-unrelated content from the RPM-related parts of the index. -It should not be run if any recent RPM-related indexing operations +debuginfo-unrelated content from the archive-related parts of the index. +It should not be run if any recent archive-related indexing operations were aborted early. It can take considerable space, because it finishes up with an sqlite "vacuum" operation, which repacks the database file by triplicating it temporarily. The default is not to @@ -152,11 +182,11 @@ do maximal-grooming. See also the \fIDATA MANAGEMENT\fP section. .TP .B "\-c NUM" "\-\-concurrency=NUM" -Set the concurrency limit for all the scanning threads. While many -threads may be spawned to cover all the given PATHs, only NUM may -concurrently do CPU-intensive operations like parsing an ELF file -or an RPM. The default is the number of processors on the system; -the minimum is 1. +Set the concurrency limit for the scanning queue threads, which work +together to process archives & files located by the traversal thread. +This important for controlling CPU-intensive operations like parsing +an ELF file and especially decompressing archives. The default is the +number of processors on the system; the minimum is 1. .TP .B "\-L" @@ -168,6 +198,19 @@ loops in the symbolic directory tree might lead to \fIinfinite traversal\fP. 
.TP +.B "\-\-fdcache\-fds=NUM" "\-\-fdcache\-mbs=MB" "\-\-fdcache\-prefetch=NUM2" +Configure limits on a cache that keeps recently extracted files from +archives. Up to NUM requested files and up to a total of MB megabytes +will be kept extracted, in order to avoid having to decompress their +archives over and over again. In addition, up to NUM2 other files +from an archive may be prefetched into the cache before they are even +requested. The default NUM, NUM2, and MB values depend on the +concurrency of the system, and on the available disk space on the +$TMPDIR or \fB/tmp\fP filesystem. This is because that is where the +most recently used extracted files are kept. Grooming cleans this +cache. + +.TP .B "\-v" Increase verbosity of logging to the standard error file descriptor. May be repeated to increase details. The default verbosity is 0. @@ -226,10 +269,11 @@ is made up of multiple CUs. Therefore, to disambiguate, debuginfod expects source queries to prefix relative path names with the CU compilation-directory, followed by a mandatory "/". -Note: contrary to RFC 3986, the client should not elide \fB../\fP or -\fB/./\fP or extraneous \fB///\fP sorts of path components in the -directory names, because if this is how those names appear in the -DWARF files, that is what debuginfod needs to see too. +Note: the caller may or may not elide \fB../\fP or \fB/./\fP or extraneous +\fB///\fP sorts of path components in the directory names. debuginfod +accepts both forms. Specifically, debuginfod canonicalizes path names +according to RFC3986 section 5.2.4 (Remove Dot Segments), plus reducing +any \fB//\fP to \fB/\fP in the path. For example: .TS @@ -257,10 +301,10 @@ many files. This section offers some advice about the implications. As a general explanation for size, consider that debuginfod indexes ELF/DWARF files, it stores their names and referenced source file -names, and buildids will be stored. 
When indexing RPMs, it stores -every file name \fIof or in\fP an RPM, every buildid, plus every -source file name referenced from a DWARF file. (Indexing RPMs takes -more space because the source files often reside in separate +names, and buildids will be stored. When indexing archives, it stores +every file name \fIof or in\fP an archive, every buildid, plus every +source file name referenced from a DWARF file. (Indexing archives +takes more space because the source files often reside in separate subpackages that may not be indexed at the same pass, so extra metadata has to be kept.) @@ -283,14 +327,14 @@ This means that the sqlite files grow fast during initial indexing, slowly during index rescans, and periodically shrink during grooming. There is also an optional one-shot \fImaximal grooming\fP pass is available. It removes information debuginfo-unrelated data from the -RPM content index such as file names found in RPMs ("rpm sdef" -records) that are not referred to as source files from any binaries -find in RPMs ("rpm sref" records). This can save considerable disk -space. However, it is slow and temporarily requires up to twice the -database size as free space. Worse: it may result in missing -source-code info if the RPM traversals were interrupted, so the not -all source file references were known. Use it rarely to polish a -complete index. +archive content index such as file names found in archives ("archive +sdef" records) that are not referred to as source files from any +binaries find in archives ("archive sref" records). This can save +considerable disk space. However, it is slow and temporarily requires +up to twice the database size as free space. Worse: it may result in +missing source-code info if the archive traversals were interrupted, +so that not all source file references were known. Use it rarely to +polish a complete index. You should ensure that ample disk space remains available. (The flood of error messages on -ENOSPC is ugly and nagging. 
But, like for most @@ -317,7 +361,7 @@ happens, new versions of debuginfod will issue SQL statements to \fIdrop\fP all prior schema & data, and start over. So, disk space will not be wasted for retaining a no-longer-useable dataset. -In summary, if your system can bear a 0.5%-3% index-to-RPM-dataset +In summary, if your system can bear a 0.5%-3% index-to-archive-dataset size ratio, and slow growth afterwards, you should not need to worry about disk space. If a system crash corrupts the database, or you want to force debuginfod to reset and start over, simply @@ -347,25 +391,34 @@ enabled. .SH "ENVIRONMENT VARIABLES" -.TP 21 +.TP +.B TMPDIR +This environment variable points to a file system to be used for +temporary files. The default is /tmp. + +.TP .B DEBUGINFOD_URLS This environment variable contains a list of URL prefixes for trusted debuginfod instances. Alternate URL prefixes are separated by space. Avoid referential loops that cause a server to contact itself, directly or indirectly - the results would be hilarious. -.TP 21 +.TP .B DEBUGINFOD_TIMEOUT This environment variable governs the timeout for each debuginfod HTTP -connection. A server that fails to respond within this many seconds -is skipped. The default is 5. +connection. A server that fails to provide at least 100K of data +within this many seconds is skipped. The default is 90 seconds. (Zero +or negative means "no timeout".) + -.TP 21 +.TP .B DEBUGINFOD_CACHE_PATH This environment variable governs the location of the cache where downloaded files are kept. It is cleaned periodically as this -program is reexecuted. The default is $HOME/.debuginfod_client_cache. -.\" XXX describe cache eviction policy +program is reexecuted. If XDG_CACHE_HOME is set then +$XDG_CACHE_HOME/debuginfod_client is the default location, otherwise +$HOME/.cache/debuginfod_client is used. For more information regarding +the client cache see \fIdebuginfod_find_debuginfo(3)\fP. 
.SH FILES .LP @@ -376,8 +429,10 @@ Default database file. .PD .TP 20 -.B $HOME/.debuginfod_client_cache +.B $XDG_CACHE_HOME/debuginfod_client Default cache directory for content from upstream debuginfods. +If XDG_CACHE_HOME is not set then \fB$HOME/.cache/debuginfod_client\fP +is used. .PD
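
The webapi hunk above says that debuginfod now canonicalizes source path names according to RFC 3986 section 5.2.4 (Remove Dot Segments), plus reducing any `//` to `/`. The following is a minimal Python sketch of that rule for readers who want to predict how a query path will be normalized. It is not debuginfod's own code: the function names `remove_dot_segments` and `canon_path` are hypothetical, and collapsing repeated slashes before removing dot segments is an assumption, since the patch text does not state the order.

```python
import re


def remove_dot_segments(path: str) -> str:
    """Apply the RFC 3986 section 5.2.4 algorithm to a path string."""
    inp = path
    out = []  # output buffer, one entry per emitted segment
    while inp:
        # Step A: drop a leading "../" or "./"
        if inp.startswith("../"):
            inp = inp[3:]
        elif inp.startswith("./"):
            inp = inp[2:]
        # Step B: replace a leading "/./" (or a final "/.") with "/"
        elif inp.startswith("/./"):
            inp = inp[2:]
        elif inp == "/.":
            inp = "/"
        # Step C: replace a leading "/../" (or a final "/..") with "/"
        # and remove the last segment already emitted, if any
        elif inp.startswith("/../"):
            inp = inp[3:]
            if out:
                out.pop()
        elif inp == "/..":
            inp = "/"
            if out:
                out.pop()
        # Step D: a bare "." or ".." disappears
        elif inp in (".", ".."):
            inp = ""
        # Step E: move the first segment (with its leading "/") to the output
        else:
            nxt = inp.find("/", 1)
            if nxt == -1:
                out.append(inp)
                inp = ""
            else:
                out.append(inp[:nxt])
                inp = inp[nxt:]
    return "".join(out)


def canon_path(path: str) -> str:
    # Assumption: repeated slashes are collapsed first, then dot segments
    # are removed. debuginfod may order these steps differently.
    return remove_dot_segments(re.sub(r"/{2,}", "/", path))


if __name__ == "__main__":
    print(canon_path("/usr/src/debug/./foo/../bar//baz.c"))
```

Under these assumptions, a client may send either the raw DWARF path or the already-normalized one and expect both to resolve to the same name, which is the behavior the revised note describes.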