From: Nikolai Weibull
Date: 2011-10-28T15:35:55+09:00
Subject: [ruby-core:40487] Re: [ruby-core:40482] [ruby-trunk - Bug #5486] rb_stat() doesn’t respect input encoding

On Fri, Oct 28, 2011 at 08:14, Nikolai Weibull wrote:
> On Fri, Oct 28, 2011 at 07:28, Usaku NAKAMURA wrote:
>
>> Sorry, I can't understand your point.
>> If you think there is a bug, would you show us the bug by code?
>
> That’s hard to do, but name a file in an encoding other than
> 'filesystem' on an NTFS filesystem.  What I did was accidentally
> create a file whose name was encoded in UTF-16.  Then, do
> Dir['dir'].entries.each{ |e| printf "%p: %s\n", e, File.file? e },
> where 'dir' is the directory containing this file.  e.file? will
> return false for this file, even though it’s a file.  The problem is,
> as explained, in rb_stat(), as it re-encodes its argument in the
> 'filesystem' encoding.

Actually, it’s probably easier than that. It can be done on a HFS+
filesystem (and probably any other, as well) just as easily:

% echo $LC_CTYPE
UTF-8
% mkdir t
% touch t/å
% cat > a.rb
# -*- coding: utf-8 -*-
Dir.new('t').entries.each{ |e| printf "%p, %p, %s\n", e, e.encoding, File.file?(e) }
^D
% ruby --version
ruby 2.0.0dev (2011-10-26 trunk 33526) [x86_64-darwin10.8.0]
% ruby a.rb
".", #<Encoding:UTF-8>, false
"..", #<Encoding:UTF-8>, false
"å", #<Encoding:UTF-8>, false

I guess the problem is that Ruby assumes that it can apply an encoding
to something that it gets from the filesystem when it would probably
be better to not do so. It should probably be BINARY or ASCII-8BIT
instead of UTF-8.

(It turns out that this example gave the same results in 1.8.7 (minus
the e.encoding), so perhaps I’m doing something else wrong.)

Trying to do

p File.file?('t/å'.encode('UTF-16LE'))

results in

in `file?': path name must be ASCII-compatible (UTF-16LE): "t/\u00E5" (Encoding::CompatibilityError)

I give up.
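
A possible reading of the identical 1.8.7 result, offered as an assumption and not something established in this thread: Dir entries are bare names, so File.file?(e) is resolved against the current directory, not against 't'. The sketch below compares the bare-name check with a check on the joined path. The helper name file_checks and the temporary directory layout are hypothetical, introduced only for illustration.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical helper (not from the thread): for each entry of dir, record
# whether File.file? succeeds on the bare name and on the path joined with dir.
def file_checks(dir)
  Dir.entries(dir).sort.map do |e|
    { name: e, bare: File.file?(e), joined: File.file?(File.join(dir, e)) }
  end
end

Dir.mktmpdir do |tmp|
  t = File.join(tmp, 't')
  Dir.mkdir(t)
  # Same file name as in the transcript above (U+00E5).
  FileUtils.touch(File.join(t, "\u00E5"))

  # Run from the parent of 't', as the a.rb example does.
  Dir.chdir(tmp) do
    file_checks('t').each do |r|
      printf "%p, %p, bare=%s, joined=%s\n",
             r[:name], r[:name].encoding, r[:bare], r[:joined]
    end
  end
end
```

If the joined check returns true for the non-dot entry while the bare check does not, the false values in the transcript would be explained by path resolution alone, and the encoding question raised for the NTFS case would need a separate reproduction.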