From: eregontp@...
Date: 2016-03-28T11:09:55+00:00
Subject: [ruby-core:74617] [Ruby trunk Feature#12222] Introducing basic statistics methods for Enumerable (and optimized implementation for Array)

Issue #12222 has been updated by Benoit Daloze.

It seems to me Enumerable is not the right place for this, because these methods expect more than just #each.
Also, these methods are likely useful only for numeric collections.

Maybe a "Statistics" module in stdlib?
Statistics.mean/variance/etc(enum) would be a nicer API than mixing everything into Enumerable IMHO.

----------------------------------------
Feature #12222: Introducing basic statistics methods for Enumerable (and optimized implementation for Array)
https://bugs.ruby-lang.org/issues/12222#change-57742

* Author: Kenta Murata
* Status: Assigned
* Priority: Normal
* Assignee: Yukihiro Matsumoto
----------------------------------------
Python has had a statistics library for calculating the mean, variance, etc. of arrays and iterators since version 3.4. I would like to propose introducing such features for the built-in Enumerable, with an optimized implementation for Array.

In particular, I want to provide Enumerable#mean and Enumerable#variance as built-in features, because they should be implemented with precision-compensated algorithms.

The following example shows that a simple variance algorithm cannot give the standard deviation for some arrays, because it produces a negative variance.

```ruby
class Array
  # Kahan summation
  def sum
    s = 0.0
    c = 0.0
    n = self.length
    i = 0
    while i < n
      y = self[i] - c
      t = s + y
      c = (t - s) - y
      s = t
      i += 1
    end
    s
  end

  # precision compensated algorithm
  def variance
    n = self.length
    return Float::NAN if n < 2
    m1 = 0.0
    m2 = 0.0
    i = 0
    while i < n
      x = self[i]
      delta = x - m1
      m1 += delta / (i + 1)
      m2 += delta * (x - m1)
      i += 1
    end
    m2 / (n - 1)
  end
end

ary = [
  1.0000000081806004,
  1.0000000009124625,
  1.0000000099201818,
  1.0000000061821668,
  1.0000000042644555
]

# simple variance algorithm
a = ary.map {|x| x ** 2 }.sum
b = ary.sum ** 2 / ary.length
p((a - b) / (ary.length - 1)) #=> -2.220446049250313e-16

# precision compensated algorithm
p ary.variance #=> 1.2248208046392579e-17
```

I think a precision-compensated algorithm is too complicated to expect users to implement it themselves.
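
For comparison, the module-style API suggested in the reply could look roughly like the sketch below. This is only an illustration, not the proposed implementation: the `Statistics` module name comes from the reply, while the `moments` helper, the method signatures, and the `stdev` name are assumptions. It reuses the same one-pass update as the `Array#variance` example (Welford's method) and relies only on `#each`, so it accepts any Enumerable.

```ruby
# Hypothetical module: the name is taken from the reply, but the
# method names and signatures here are assumptions for illustration.
module Statistics
  module_function

  # Returns [count, mean, m2] for any object that responds to #each,
  # where m2 is the running sum of squared deviations from the mean
  # (Welford's one-pass update).
  def moments(enum)
    n = 0
    mean = 0.0
    m2 = 0.0
    enum.each do |x|
      n += 1
      delta = x - mean
      mean += delta / n
      m2 += delta * (x - mean)
    end
    [n, mean, m2]
  end

  # Arithmetic mean; NaN for an empty collection.
  def mean(enum)
    n, mean, _m2 = moments(enum)
    n.zero? ? Float::NAN : mean
  end

  # Unbiased sample variance (divides by n - 1); NaN if fewer than 2 items.
  def variance(enum)
    n, _mean, m2 = moments(enum)
    n < 2 ? Float::NAN : m2 / (n - 1)
  end

  # Sample standard deviation.
  def stdev(enum)
    Math.sqrt(variance(enum))
  end
end

ary = [
  1.0000000081806004,
  1.0000000009124625,
  1.0000000099201818,
  1.0000000061821668,
  1.0000000042644555
]

p Statistics.mean(1..4)     # a Range works, since only #each is needed
p Statistics.variance(ary)  # non-negative, so the square root is defined
p Statistics.stdev(ary)
```

With this shape nothing is mixed into Enumerable, at the cost of writing `Statistics.variance(ary)` instead of `ary.variance`; an Array-specific fast path would still need a separate implementation.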