upx-3.91-src.tar.gz_UPX_compress_upx3_upxsrc_upx3.91, UPX unpacker 3.91 resources - CSDN download

compress

74 views, uploaded 2022-09-24 09:14:05, 1.17 MB, GZ

426 files in total

h: 137

s: 135

cpp: 55

upx-3.91-src.tar.gz_UPX_compress_upx 3_upx src (426 files)

README.1ST 3KB

README.1ST 773B

bits.ash 11KB

nrv2e_d.ash 7KB

nrv2d_d.ash 7KB

bits.ash 7KB

nrv2b_d.ash 6KB

nrv2e_d.ash 3KB

nrv2d_d.ash 3KB

nrv2b_d.ash 3KB

rename.ash 3KB

macros.ash 2KB

i386-dos32.djgpp2-stubify.asm 33KB

BUGS 2KB

i386-linux.elf-main.c 26KB

i386-openbsd.elf-main.c 19KB

i386-bsd.elf-main.c 19KB

amd64-darwin.macho-main.c 18KB

i386-darwin.macho-main.c 18KB

sstrip.c 15KB

i386-linux.elf.execve-main.c 15KB

i386-bsd.elf.execve-main.c 14KB

i386-linux.elf.interp-main.c 14KB

powerpc-darwin.macho-main.c 14KB

arm-darwin.macho-main.c 13KB

amd64-linux.elf-main.c 12KB

i386-linux.elf.shell-main.c 12KB

powerpc-linux.elf-main.c 11KB

armpe_tester.c 9KB

cc_test.c 4KB

l_test.c 4KB

lzma_d_c.c 3KB

mipsel.r3000-linux.elf-main.c 34B

armeb-linux.elf-main.c 33B

arm-linux.elf-main.c 33B

mips.r3000-linux.elf-main.c 33B

armel-linux.elf-main.c 33B

COPYING 18KB

p_lx_elf.cpp 119KB

p_mach.cpp 60KB

pefile.cpp 55KB

p_w64pep.cpp 54KB

p_w32pe.cpp 53KB

pepfile.cpp 52KB

packer.cpp 48KB

p_vmlinx.cpp 46KB

main.cpp 44KB

p_armpe.cpp 38KB

p_vmlinz.cpp 34KB

linker.cpp 27KB

snprintf.cpp 27KB

p_exe.cpp 26KB

p_wcle.cpp 25KB

compress_lzma.cpp 25KB

p_ps1.cpp 24KB

p_tos.cpp 23KB

p_unix.cpp 20KB

p_lx_exc.cpp 20KB

ui.cpp 19KB

util.cpp 14KB

help.cpp 14KB

p_djgpp2.cpp 14KB

s_vcsa.cpp 13KB

s_djgpp2.cpp 12KB

s_win32.cpp 11KB

packer_f.cpp 11KB

filteri.cpp 11KB

packmast.cpp 10KB

file.cpp 10KB

packer_c.cpp 10KB

lefile.cpp 10KB

p_tmt.cpp 10KB

compress_ucl.cpp 10KB

p_lx_interp.cpp 10KB

work.cpp 9KB

packhead.cpp 8KB

p_com.cpp 8KB

c_screen.cpp 7KB

filter.cpp 7KB

compress.cpp 7KB

compress_zlib.cpp 7KB

msg.cpp 6KB

mem.cpp 5KB

except.cpp 5KB

p_lx_sh.cpp 5KB

p_elks.cpp 4KB

p_sys.cpp 4KB

c_init.cpp 4KB

p_w16ne.cpp 3KB

s_object.cpp 2KB

c_file.cpp 2KB

stdcxx.cpp 2KB

c_none.cpp 2KB

Makefile.extra 11KB

Makefile.extra 3KB

Makefile.extra 2KB

This document explains the concept of "filtering" in UPX. Basically filtering is a data preprocessing method which could improve the compression ratio of the files UPX processes.

Currently the filters UPX uses are all based on one very special algorithm which is working well on ix86 executable files. This is what UPX calls the "naive" implementation. There is also a "clever" method which works only with 32-bit executable file formats and was first implemented in UPX.

Let's start with an example (from this point I assume a 32-bit file format). Consider this code fragment:

    00025970: E877410600    calln  FatalError
    00025975: 8B414C        mov    eax,[ecx+4C]
    00025978: 85C0          test   eax,eax
    0002597A: 7419          je     file:00025995
    0002597C: 85F6          test   esi,esi
    0002597E: 7504          jne    file:00025984
    00025980: 89C6          mov    esi,eax
    00025982: EB11          jmps   file:00025995
    00025984: 39C6          cmp    esi,eax
    00025986: 740D          je     file:00025995
    00025988: 83C4F4        add    (d) esp,F4
    0002598B: 68A0A91608    push   0816A9A0
    00025990: E857410600    calln  FatalError
    00025995: FF45F4        inc    [ebp-0C]

Here you can find two calls to a function called "FatalError". As you probably know, the compression ratio is better if the compressor engine finds longer sequences of repeated strings. In this case the engine sees the following two byte sequences:

    E877 410600 8B
    E857 410600 FF

So it can find a 3-byte-long match.

Now comes the trick. On ix86 near calls are encoded as 0xE8 then a 32-bit relative offset to the destination address. Let's see what happens if the position of the call is added to that offset:

    0x64177 + 0x25970 = 0x89AE7
    0x64157 + 0x25990 = 0x89AE7

    E8 E79A0800 8B
    E8 E79A0800 FF

As you can see, now the compressor engine finds a 5-byte-long match, which means that we've just saved 2 bytes of compressed data. Not bad.

So this is the basic idea (the "naive" implementation). All we have to do is to "filter" the uncompressed data using this method before compression, and "unfilter" it after decompression.
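The naive scheme can be sketched in a few lines of Python. This is only an illustration of the idea described above, not UPX's actual code: the names `naive_filter`, `naive_unfilter` and the `base` parameter (the file position of the first byte) are assumptions made for this example. The byte string is the code fragment from the listing above.

```python
import struct

CALL_OPCODE = 0xE8  # ix86 near call: 0xE8 followed by a 32-bit relative offset


def _transform(buf: bytes, base: int, sign: int) -> bytes:
    """Add (sign=+1) or subtract (sign=-1) the call position to each offset."""
    out = bytearray(buf)
    i = 0
    while i + 5 <= len(out):
        if out[i] == CALL_OPCODE:
            (value,) = struct.unpack_from("<I", out, i + 1)
            value = (value + sign * (base + i)) & 0xFFFFFFFF
            struct.pack_into("<I", out, i + 1, value)
            i += 5  # skip the offset bytes so filter and unfilter stay in step
        else:
            i += 1
    return bytes(out)


def naive_filter(buf: bytes, base: int = 0) -> bytes:
    return _transform(buf, base, +1)


def naive_unfilter(buf: bytes, base: int = 0) -> bytes:
    return _transform(buf, base, -1)


# The fragment from the listing, starting at file position 0x25970.
fragment = bytes.fromhex(
    "E877410600" "8B414C" "85C0" "7419" "85F6" "7504" "89C6"
    "EB11" "39C6" "740D" "83C4F4" "68A0A91608" "E857410600" "FF45F4"
)
filtered = naive_filter(fragment, base=0x25970)
print(filtered[0x00:0x05].hex())  # first call
print(filtered[0x20:0x25].hex())  # second call, now identical to the first
```

Both the filter and the unfilter skip the four offset bytes after each 0xE8 they process. This keeps the two passes looking at the same positions, so the unfilter restores the original bytes exactly.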
Simply go over the memory, find 0xE8 bytes and process the next 4 bytes as specified above.

Of course there are several ways in which this scheme could be improved. First, not only calls could be handled this way - near jumps (0xE9 + 32-bit offset) could work similarly.

A second improvement is to limit this filtering to the area occupied by real code - there is no point in messing with general data.

Another improvement comes if the byte order of the 32-bit offset is reversed. Why? Here is another call which follows the above fragment:

    000261FA: E8C9390600    calln  ErrorF

    0x639C9 + 0x261FA = 0x89BC3

    E8 C39B 0800      compare this with      E8 E79A 0800

As you can see, these two functions are quite close together, but the compressor is not able to utilize this information (2-byte-long matches are usually not useful) unless the byte order of the offsets is reversed. In this case:

    E8 0008 9AE7
    E8 0008 9BC3

So the compressor engine finds a 3-byte-long match here. This is a nice improvement - now the engine utilizes the similarity of nearby destinations too.

This is nice, but what happens when we find a "fake" call - i.e. an 0xE8 which is part of another instruction? Like this:

    0002A3B1: C745 E8 00000000    mov    [ebp-18],00000000

In this case those nice 0x00 bytes are overwritten with some less compressible data. This is the disadvantage of the "naive" implementation.

So let's be clever and try to detect and process only "real" calls. In UPX a simple method is used to find these calls. We simply check that the destinations of these calls are inside the same area as the calls themselves (so the above code is still a false positive, but it helps generally). A better method would be to actually disassemble the code - contributions are welcome :-)

But this is only half of the job. We cannot simply process one call and then skip another one - the unfiltering process needs some information to be able to reverse the filtering. UPX uses the following idea, which works nicely.
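As a brief aside, the byte-order argument above can be checked with a small Python sketch. The helper names `encode_call` and `longest_common_run` are made up for this illustration. The positions and offsets are the FatalError and ErrorF calls from the examples.

```python
import struct


def encode_call(pos: int, rel: int, reversed_order: bool) -> bytes:
    """Return 0xE8 plus (pos + rel), stored little-endian or byte-reversed."""
    target = (pos + rel) & 0xFFFFFFFF
    fmt = ">I" if reversed_order else "<I"
    return b"\xE8" + struct.pack(fmt, target)


def longest_common_run(a: bytes, b: bytes) -> int:
    """Length of the longest byte run that appears in both a and b."""
    best = 0
    for i in range(len(a)):
        for j in range(len(b)):
            k = 0
            while i + k < len(a) and j + k < len(b) and a[i + k] == b[j + k]:
                k += 1
            best = max(best, k)
    return best


fatal_error = (0x25970, 0x64177)  # destination 0x89AE7
error_f = (0x261FA, 0x639C9)      # destination 0x89BC3

for reversed_order in (False, True):
    a = encode_call(*fatal_error, reversed_order)
    b = encode_call(*error_f, reversed_order)
    print(a.hex(), b.hex(), longest_common_run(a, b))
```

With the original byte order the longest shared run is the 2-byte tail of the offset. With the reversed order the shared bytes sit directly after 0xE8 and form a 3-byte run, which matches the figures given in the text.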
First we assume that the size of the area that should be filtered is less than 16 MiB. Then UPX scans over this area and keeps a record of the bytes that follow the 0xE8 bytes. If we are lucky, there will be byte values that were never found following 0xE8. These bytes are our candidates to be used as markers.

Do you still remember that we assumed that the size of the scanned area is less than 16 MiB? Well, this means that when we process a real call, the resulting offset will not exceed 0x00FFFFFF either. So the most significant byte (MSB) is always 0x00, which is a nice place to store our marker. Of course we should reverse the byte order in the resulting offset - so this marker will appear just after the 0xE8 byte and not 4 bytes after it.

That's all. Just go over the memory area, identify the "real" calls, and use this method to mark them. Then the job of the unfilter is very easy - it just searches for a 0xE8 + marker sequence and does the unfiltering if it finds one. It's clever, isn't it? :)

To tell you the truth, it's not this simple in UPX. It can use an additional parameter ("add_value") which makes things a little bit more complicated (for example, it can happen that a found marker proves to be unusable because of some overflow during an addition). And the whole algorithm is optimized for simplicity on the unfiltering side (as short and as fast assembly as possible - see stub/macros.ash), which makes the filtering process a little more difficult (fcto_ml.ch, fcto_ml2.ch, filteri.cpp).

As can be seen in filteri.cpp, there are lots of variants of this filtering implemented - naive/clever, calls/jumps/calls&jumps, reversed/unreversed offsets - a total of 18 slightly different filters (and another 9 variants for 16-bit programs).

You can select one of them using the command line parameter "--filter=" or try most of them with "--all-filters". Or just let UPX use the one we defined as the default for that executable format.
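The marker idea can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not a port of filteri.cpp: it ignores "add_value", handles calls only, treats positions as offsets from the start of the buffer, and passes the marker to the unfilter as a plain argument. The names `pick_marker`, `clever_filter` and `clever_unfilter` are invented for this example.

```python
import struct

CALL_OPCODE = 0xE8


def pick_marker(buf: bytes) -> int:
    """Return a byte value that never follows 0xE8 anywhere in buf."""
    seen = {buf[i + 1] for i in range(len(buf) - 1) if buf[i] == CALL_OPCODE}
    for candidate in range(256):
        if candidate not in seen:
            return candidate
    raise ValueError("no unused byte value after 0xE8, cannot mark calls")


def clever_filter(buf: bytes):
    """Filter only calls whose destination lies inside buf; return (data, marker)."""
    assert len(buf) < (1 << 24), "area must be smaller than 16 MiB"
    marker = pick_marker(buf)
    out = bytearray(buf)
    i = 0
    while i + 5 <= len(out):
        if out[i] == CALL_OPCODE:
            (rel,) = struct.unpack_from("<I", out, i + 1)
            dest = (i + rel) & 0xFFFFFFFF
            if dest < len(out):  # "real" call: destination inside the area
                # Reversed byte order; the always-zero top byte holds the marker.
                out[i + 1:i + 5] = bytes(
                    [marker, (dest >> 16) & 0xFF, (dest >> 8) & 0xFF, dest & 0xFF]
                )
                i += 5
                continue
        i += 1
    return bytes(out), marker


def clever_unfilter(buf: bytes, marker: int) -> bytes:
    """Reverse clever_filter by looking for 0xE8 followed by the marker."""
    out = bytearray(buf)
    i = 0
    while i + 5 <= len(out):
        if out[i] == CALL_OPCODE and out[i + 1] == marker:
            dest = (out[i + 2] << 16) | (out[i + 3] << 8) | out[i + 4]
            struct.pack_into("<I", out, i + 1, (dest - i) & 0xFFFFFFFF)
            i += 5
        else:
            i += 1
    return bytes(out)


data = (
    bytes.fromhex("E810000000")        # call whose destination is inside the area
    + bytes.fromhex("C745E800000000")  # the mov from the text: still a false positive
    + bytes.fromhex("C745E8FFFFFF7F")  # fake call, destination far outside the area
    + bytes(45)                        # padding
)
filtered, marker = clever_filter(data)
restored = clever_unfilter(filtered, marker)
print(hex(marker), filtered[:19].hex(), restored == data)
```

In this sample the third instruction is left untouched because its apparent destination falls outside the buffer, while the mov from the text is still filtered, as the text warns. Since the marker never followed a 0xE8 in the original data, the unfilter can recognize exactly the calls that were filtered.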
