# tar-stream

tar-stream is a streaming tar parser and generator and nothing else. It operates purely on streams, which means you can extract and parse tarballs without ever touching the file system.

Note that you still need to gunzip your data if you have a `.tar.gz`. We recommend using [gunzip-maybe](https://github.com/mafintosh/gunzip-maybe) in conjunction with this.
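
As a minimal sketch of the gunzip step, assuming you would rather use Node's built-in `zlib` module than gunzip-maybe: the hypothetical helper `gunzipInto` below (not part of tar-stream) decompresses a gzipped source before the bytes reach the destination. Unlike gunzip-maybe, `zlib.createGunzip()` does not detect uncompressed input, so it only suits data known to be gzipped.

``` js
const zlib = require('zlib')

// Hypothetical helper (not part of tar-stream): gunzip `source` and write
// the decompressed bytes to `destination`. Returns `destination`.
function gunzipInto (source, destination) {
  return source.pipe(zlib.createGunzip()).pipe(destination)
}
```

With tar-stream, `destination` would typically be the stream returned by `tar.extract()`, and `source` something like `fs.createReadStream('archive.tar.gz')`, where the file name is only an illustration.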

```
npm install tar-stream
```

## Usage

tar-stream exposes two streams: [pack](https://github.com/mafintosh/tar-stream#packing), which creates tarballs, and [extract](https://github.com/mafintosh/tar-stream#extracting), which extracts tarballs. To [modify an existing tarball](https://github.com/mafintosh/tar-stream#modifying-existing-tarballs), use both.

It implements USTAR with additional support for pax extended headers. It should be compatible with all popular tar distributions (gnutar, bsdtar, etc.).

## Related

If you want to pack/unpack directories on the file system, check out [tar-fs](https://github.com/mafintosh/tar-fs), which provides file system bindings to this module.

## Packing

To create a pack stream, use `tar.pack()` and call `pack.entry(header, [callback])` to add tar entries.

``` js
const tar = require('tar-stream')
const pack = tar.pack() // pack is a stream

// add a file called my-test.txt with the content "Hello World!"
pack.entry({ name: 'my-test.txt' }, 'Hello World!')

// add a file called my-stream-test.txt from a stream
const entry = pack.entry({ name: 'my-stream-test.txt', size: 11 }, function (err) {
  if (err) throw err
  // the stream was added
  // no more entries
  pack.finalize()
})

entry.write('hello')
entry.write(' ')
entry.write('world')
entry.end()

// pipe the pack stream somewhere
pack.pipe(process.stdout)
```

## Extracting

To extract a stream, use `tar.extract()` and listen for its `entry` event, whose listener receives `(header, stream, next)`.

``` js
const extract = tar.extract()

extract.on('entry', function (header, stream, next) {
  // header is the tar header
  // stream is the content body (might be an empty stream)
  // call next when you are done with this entry

  stream.on('end', function () {
    next() // ready for next entry
  })

  stream.resume() // just auto drain the stream
})

extract.on('finish', function () {
  // all entries read
})

pack.pipe(extract)
```

The tar archive is streamed sequentially, meaning you **must** drain each entry's stream as you get it; otherwise the main extract stream will receive backpressure and stop reading.
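
The draining rule above can be sketched as a small helper that reads every entry to the end before asking for the next one. `collectEntries` is a hypothetical name, not part of tar-stream; it assumes the `entry` and `finish` events described in this section and buffers everything in memory, so it only suits small archives.

``` js
// Hypothetical helper (not part of tar-stream): buffer every entry of an
// extract stream in memory. Resolves with a Map from entry name to Buffer
// once the extract stream emits 'finish'.
function collectEntries (extract) {
  return new Promise(function (resolve, reject) {
    const files = new Map()

    extract.on('entry', function (header, stream, next) {
      const chunks = []

      // attaching a 'data' listener drains the entry's stream
      stream.on('data', function (chunk) {
        chunks.push(chunk)
      })

      stream.on('end', function () {
        files.set(header.name, Buffer.concat(chunks))
        next() // entry fully consumed, ready for the next one
      })

      stream.on('error', reject)
    })

    extract.on('finish', function () {
      resolve(files)
    })

    extract.on('error', reject)
  })
}
```

A call such as `collectEntries(extract).then(files => ...)` has to be made before data is piped into `extract`, so that no `entry` event is missed.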

## Extracting as an async iterator

In addition to being a writable stream, the extraction stream is also an async iterator.

``` js
const extract = tar.extract()

someStream.pipe(extract)

for await (const entry of extract) {
  entry.header // the tar header
  entry.resume() // the entry is also the stream
}
```
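
In a CommonJS module, `for await` is only valid inside an `async` function, so the loop above usually ends up wrapped in one. The following is a minimal sketch under that assumption; `listEntryNames` is a hypothetical helper, not part of tar-stream, and it relies only on the `entry.header` and `entry.resume()` members shown above.

``` js
// Hypothetical helper (not part of tar-stream): collect the names of all
// entries yielded by an extract stream (or any compatible async iterable).
async function listEntryNames (extract) {
  const names = []

  for await (const entry of extract) {
    names.push(entry.header.name)
    entry.resume() // drain the body so iteration can reach the next entry
  }

  return names
}
```

It could be called as `listEntryNames(extract).then(console.log)` after piping a tarball into `extract`.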

## Headers

The header object used in `entry` should contain the following properties.
Most of these values can be found by stat'ing a file.

``` js
{
  name: 'path/to/this/entry.txt',
  size: 1314,        // entry size. defaults to 0
  mode: 0o644,       // entry mode. defaults to 0o755 for dirs and 0o644 otherwise
  mtime: new Date(), // last modified date for entry. defaults to now.
  type: 'file',      // type of entry. defaults to file. can be:
                     // file | link | symlink | directory | block-device
                     // character-device | fifo | contiguous-file
  linkname: 'path',  // linked file name
  uid: 0,            // uid of entry owner. defaults to 0
  gid: 0,            // gid of entry owner. defaults to 0
  uname: 'maf',      // uname of entry owner. defaults to null
  gname: 'staff',    // gname of entry owner. defaults to null
  devmajor: 0,       // device major version. defaults to 0
  devminor: 0        // device minor version. defaults to 0
}
```
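
To illustrate how stat'ing a file maps onto these properties, here is a minimal sketch of a hypothetical helper, `headerFromStat`, that is not part of tar-stream. It assumes an `fs.Stats` object from Node's `fs` module and only distinguishes regular files from directories; symlinks, devices, and owner names would need extra handling.

``` js
// Hypothetical helper (not part of tar-stream): build a header object
// from an entry name and the fs.Stats of the file it describes.
function headerFromStat (name, stats) {
  const isDirectory = stats.isDirectory()

  return {
    name,
    size: isDirectory ? 0 : stats.size, // directories carry no body
    mode: stats.mode & 0o7777,          // keep only the permission bits
    mtime: stats.mtime,
    type: isDirectory ? 'directory' : 'file',
    uid: stats.uid,
    gid: stats.gid
  }
}
```

The result can be passed as the first argument to `pack.entry()`, for example with `fs.statSync('notes.txt')` as the stats, where `notes.txt` is just a placeholder file name.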

## Modifying existing tarballs

With tar-stream it is easy to rewrite paths, change modes, etc. in an existing tarball.

``` js
const extract = tar.extract()
const pack = tar.pack()
const path = require('path')

extract.on('entry', function (header, stream, callback) {
  // let's prefix all names with 'tmp'
  header.name = path.join('tmp', header.name)
  // write the new entry to the pack stream
  stream.pipe(pack.entry(header, callback))
})

extract.on('finish', function () {
  // all entries done - let's finalize it
  pack.finalize()
})

// pipe the old tarball to the extractor
oldTarballStream.pipe(extract)

// pipe the new tarball to another stream
pack.pipe(newTarballStream)
```

## Saving tarball to fs

``` js
const fs = require('fs')
const tar = require('tar-stream')

const pack = tar.pack() // pack is a stream
const path = 'YourTarBall.tar'
const yourTarball = fs.createWriteStream(path)

// add a file called YourFile.txt with the content "Hello World!"
pack.entry({ name: 'YourFile.txt' }, 'Hello World!', function (err) {
  if (err) throw err
  pack.finalize()
})

// pipe the pack stream to your file
pack.pipe(yourTarball)

yourTarball.on('close', function () {
  console.log(path + ' has been written')
  fs.stat(path, function (err, stats) {
    if (err) throw err
    console.log(stats)
    console.log('Got file info successfully!')
  })
})
```

## Performance

[See tar-fs for a performance comparison with node-tar](https://github.com/mafintosh/tar-fs/blob/master/README.md#performance)

# License

MIT