From: vmakarov@...
Date: 2016-12-05T16:44:40+00:00
Subject: [ruby-core:78499] [Ruby trunk Bug#13002] Hash calculations no longer using universal hashing

Issue #13002 has been updated by Vladimir Makarov.

Martin Dürst wrote:
>
> I think it may still be somewhat too early to completely give up on the
> strong/weak distinction. I really like using CPU cycles only when
> necessary, and that's exactly what this proposal is about.
>
> I think we have to distinguish three different cases:
>
> 1. Calls to `#hash` inside a class (as e.g. in Bug #9381):
> This always has to use strong (in the sense of universal, not
> necessarily in the sense of cryptographically strong) hashing.
>
> 2. Calls to `#hash` for general objects/classes from inside st.c:
> These have to always use strong hashing, because the hash is defined as
> a method on the class, and can't be switched between weak and strong.
>
> 3. Calculations of hash for special objects such as `String`, `Symbol`,
> `Integer`,... from directly inside st.c: In this case, I think it would be
> possible to use the weak/strong distinction. I also think that the
> majority, probably even a big majority, of hash keys are `String`s,
> `Symbol`s, and `Integer`s, so that would keep a big part of the savings.
> I haven't studied the code in detail, but I could imagine that
> special-casing for `String`, `Symbol`, and (the old `Fixnum` part of) `Integer`
> e.g. in do_hash could do the job.
> Something along the lines of the
> following pseudo-code:
>

That is what I thought until I saw the test you mentioned:

```
def test_wrapper_of_special_const
  bug9381 = '[ruby-core:59638] [Bug #9381]'

  wrapper = Class.new do
    def initialize(obj)
      @obj = obj
    end

    def hash
      @obj.hash
    end

    def eql?(other)
      @obj.eql?(other)
    end
  end

  bad = [
    5, true, false, nil,
    0.0, 1.72723e-77,
    :foo, "dsym_#{self.object_id.to_s(16)}_#{Time.now.to_i.to_s(16)}".to_sym,
  ].select do |x|
    hash = {x => bug9381}
    hash[wrapper.new(x)] != bug9381
  end
  assert_empty(bad, bug9381)
end
```

You cannot use different hash functions for integers (strong for @obj.hash and weak inside the table), otherwise the test will fail. Integer.hash should be the same as the hash used in the tables (all hash tables).

I even think that if the Integer.hash value is saved (e.g. in initialize) and then used later in the wrapper object, people would still expect the test to succeed. In this case, even simultaneously switching all hashes (for tables and Integer.hash) will not work.

The strong/weak approach could work for other languages though. I wanted to use it in https://github.com/dino-lang/dino and I don't see problems with it. But actually I decided to use mum-hash (https://github.com/vnmakarov/mum-hash), which is as fast as the fastest non-crypto hash functions and, I believe, strong enough to prevent a denial-of-service attack even when the seed is known. Using a random seed makes it even stronger. But I cannot propose such a solution to the Ruby community, as nobody has done a cryptanalysis of mum-hash as has been done for SipHash.

> ```C
> #### aside: Don't see why we need conditional compilation for the
> #### following 5 lines.
> #if SIZEOF_INT == SIZEOF_VOIDP
> st_hash_t hash = h;
> #else
> st_hash_t hash = h;
> #endif
> ```
>

I suspect it is a leftover from some experiments with the code. I am thinking about some small improvements for hash tables after the 2.4 release.
So if nobody removes it, I'll remove it with the changes I am planning.

Thank you, Martin.

----------------------------------------
Bug #13002: Hash calculations no longer using universal hashing
https://bugs.ruby-lang.org/issues/13002#change-61877

* Author: Martin Dürst
* Status: Open
* Priority: Normal
* Assignee:
* ruby -v:
* Backport: 2.1: DONTNEED, 2.2: DONTNEED, 2.3: DONTNEED
----------------------------------------
When preparing for my lecture on hash tables last week, I found that Ruby trunk doesn't do universal hashing anymore. See http://events.ccc.de/congress/2011/Fahrplan/attachments/2007_28C3_Effective_DoS_on_web_application_platforms.pdf for background.

I contacted security@ruby-lang.org, but was told by Shugo that because trunk is not a published version, we can talk about it publicly. Shugo also said that the change was introduced in r56650.

Following is some output from two different versions of Ruby that show the problem:

On Ruby 2.2.3, different hash value for the same number every time Ruby is restarted:

    C:\Users\duerst>ruby -v
    ruby 2.2.3p173 (2015-08-18 revision 51636) [i386-mingw32]

    C:\Users\duerst>ruby -e 'puts 12345678.hash'
    611647260

    C:\Users\duerst>ruby -e 'puts 12345678.hash'
    -844752827

    C:\Users\duerst>ruby -e 'puts 12345678.hash'
    387106497

On Ruby trunk, always the same value:

    duerst@Arnisee /cygdrive/c/Data/ruby
    $ ruby -v
    ruby 2.4.0dev (2016-12-02 trunk 56965) [x86_64-cygwin]

    duerst@Arnisee /cygdrive/c/Data/ruby
    $ ruby -e 'puts 12345678.hash'
    1846311797112760547

    duerst@Arnisee /cygdrive/c/Data/ruby
    $ ruby -e 'puts 12345678.hash'
    1846311797112760547

    duerst@Arnisee /cygdrive/c/Data/ruby
    $ ruby -e 'puts 12345678.hash'
    1846311797112760547

---Files--------------------------------
switching_hash_removal.patch (9.21 KB)
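
For readers who want to try the constraint discussed in the reply, here is a minimal standalone sketch, assuming a plain Ruby interpreter and no test framework. It checks that a key's own `#hash` agrees with the hash the table computes internally, which is what rules out a weak hash inside the table combined with a strong `Integer#hash`. The `Wrapper` class and the `wrapper_lookup_failures` helper are hypothetical names introduced here for illustration, modelled on the quoted test_wrapper_of_special_const; they are not part of Ruby.

```ruby
# Hypothetical wrapper, modelled on the anonymous class in
# test_wrapper_of_special_const. It forwards #hash and #eql? to the
# wrapped object, so it should behave like that object as a Hash key.
class Wrapper
  def initialize(obj)
    @obj = obj
  end

  # Uses the wrapped object's public #hash method.
  def hash
    @obj.hash
  end

  def eql?(other)
    @obj.eql?(other)
  end
end

# Hypothetical helper: returns the keys for which a lookup through the
# wrapper misses. A miss means the table hashed the raw key differently
# from what key.hash returns.
def wrapper_lookup_failures(keys)
  keys.reject do |key|
    table = { key => :found }
    table[Wrapper.new(key)] == :found
  end
end

# Special constants taken from the quoted test.
KEYS = [5, true, false, nil, 0.0, 1.72723e-77, :foo].freeze

# An empty array is expected when the table and #hash agree.
p wrapper_lookup_failures(KEYS)
```

If the table used a different function for `Integer` keys than `Integer#hash` does, `5` would show up in the printed array.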