
Higher precision for floats in range [0,1) #1

Merged
dhardy merged 5 commits into dhardy:master from pitdicker:master on Sep 3, 2017

Conversation

@pitdicker commented Aug 23, 2017

First, a disclaimer :).
With all the discussions happening around rand, I am not sure where to open a pull request.
Also, it is a bit difficult for me to explain what I am doing here...

This commit changes the way floats are created with the distribution Uniform01.
This is not a method I have found anywhere else, but it is loosely inspired by
https://readings.owlfolio.org/2007/generating-pseudorandom-floating-point-values/ .

With this method floats have a much better precision near zero (9 extra bits of precision for f32, and 12 extra bits for f64).
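Roughly, the idea looks like this (an illustrative sketch, not the exact code in this commit): use the leading zeros of the random integer to lower the exponent, and the bits after the leading one as the fraction, so every leading zero adds one bit of precision.

// Illustrative sketch, not the exact code in this commit.
// x / 2^32 = 1.fraction * 2^-(lz + 1), where lz is the number of leading
// zeros, so each leading zero lowers the exponent by one and leaves one
// more useful fraction bit, up to 32 - 23 = 9 extra bits for f32.
fn high_precision_f32(x: u32) -> f32 {
    const FRACTION_BITS: u32 = 23;
    const EXPONENT_BIAS: u32 = 127;

    if x == 0 {
        return 0.0;
    }
    let lz = x.leading_zeros(); // 0..=31, since x != 0
    let exponent = (EXPONENT_BIAS - 1 - lz) << FRACTION_BITS;
    // Shift out the leading zeros and the (now implicit) leading one,
    // and keep the top 23 remaining bits as the fraction.
    let fraction = ((x << lz) << 1) >> (32 - FRACTION_BITS);
    f32::from_bits(exponent | fraction)
}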

It seems to have almost no influence on the benchmarks:
Before:
test distr_baseline ... bench: 1,616 ns/iter (+/- 68) = 4950 MB/s
After:
test distr_baseline ... bench: 1,621 ns/iter (+/- 30) = 4935 MB/s

@dhardy (Owner) commented Aug 23, 2017

Nice. I haven't had the time today but I'll take a look and merge.

I haven't focussed on changes like this since getting the main architecture right seemed like a priority, but this is definitely worth bringing up and mentioning on the main RFC thread.

@pitdicker (Author)

Thank you, and there is no hurry for me.

I am now looking at what the consequences are for other parts of rand that use this.

range, for example, loses most of the precision again if one of its bounds is negative. I have fixed this, but not tested and benchmarked it yet.
open01 also seems to rely on some precision...

Tomorrow I will try to write something on the RFC thread.

@pitdicker (Author)

A quick update:
open01, closed01 and range now preserve the extra bits of precision near 0.0.

The benchmark distr_range2_float regressed by 40%. Moving the uint-to-float conversion out of closed01, uniform01 and uniform11 fixes this (see the sketch below). Probably a problem with inlining?
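For illustration, the shape of the fix looks something like this (hypothetical names and a stand-in conversion, not the actual diff):

// Hypothetical illustration: keep integer generation and the uint-to-float
// conversion as separate steps, with the conversion in a small function the
// compiler can reliably inline.
#[inline(always)]
fn closed_open01(x: u32) -> f32 {
    // Stand-in conversion: top 24 bits scaled into [0, 1).
    (x >> 8) as f32 * (1.0 / (1u32 << 24) as f32)
}

fn sample<R: FnMut() -> u32>(mut next_u32: R) -> f32 {
    let x = next_u32(); // generate the integer...
    closed_open01(x)    // ...then convert it separately
}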

This is a good chance to simplify the code to use constants, and wrap it in a macro so there is no duplication between f32 and f64.

Which is a long way of saying, please wait a few more days before reviewing :)

@pitdicker force-pushed the master branch 3 times, most recently from 7f9bc7e to 800bcf3 on August 27, 2017 19:01
@pitdicker (Author)

It should be ok now.

@pitdicker (Author)

The code in ziggurat in distributions uses its own method of converting a random int to a float.
It uses only as many bits of precision as there are bits in the float's fraction, and uses the rest to pick a row from the ziggurat. The previous code from Uniform01 is a perfect fit here. The distribution benchmarks improve by 5~7%.
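Roughly, the trick is that one u64 has more random bits than an f64 fraction needs, so the leftover bits can select the table row. An illustrative sketch (made-up names, not this PR's code):

// Illustrative sketch, made-up names: one u64 yields both a float in
// [0, 1) and a ziggurat layer index. The low 52 bits fill the fraction
// of an f64 in [1, 2), which becomes [0, 1) after subtracting 1.0; the
// top 8 bits are still independent and pick one of 256 layers.
fn split_u64(bits: u64) -> (f64, usize) {
    let fraction = bits & ((1u64 << 52) - 1);
    let f = f64::from_bits((1023u64 << 52) | fraction) - 1.0; // [0, 1)
    let i = (bits >> 56) as usize; // layer index 0..=255
    (f, i)
}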

This got me thinking that it may be best to separate out the int-to-float conversions into another file. They do not really fit in uniform.rs.

I'll work on it some more...

@dhardy (Owner) commented Aug 28, 2017

Sure, putting the float conversions in a separate file makes sense (uniform_float.rs or float.rs or some such).

I'm not so sure about making uniform01 etc. member functions on u32 / u64. It may be useful to make these public, sure, but in general it seems a weird operation to expose as a member function. Well, I suppose they are only available when people use FloatConversions, so maybe the design is fine.

@pitdicker (Author)

You are right that making them member functions is not ideal. But I don't know enough Rust yet for something better. Do you have a suggestion, or shall I just try a bit more?

@dhardy (Owner) commented Aug 28, 2017

Simple functions (generic or with separate names for each type) would work. But I don't know if this would be better. As noted in my edit above, using member functions is probably fine actually.

@pitdicker (Author) commented Aug 28, 2017

Moving the functions to a separate file took care of them being public, so I was lucky :-).

I have tried to clean up the commit history.

Some benchmarks:
Before:

test distr_baseline          ... bench:       1,620 ns/iter (+/- 17) = 4938 MB/s
test distr_exp               ... bench:       6,286 ns/iter (+/- 131) = 1272 MB/s
test distr_gamma_large_shape ... bench:      18,190 ns/iter (+/- 317) = 439 MB/s
test distr_gamma_small_shape ... bench:      21,233 ns/iter (+/- 427) = 376 MB/s
test distr_log_normal        ... bench:       6,650 ns/iter (+/- 94) = 1203 MB/s
test distr_normal            ... bench:       6,651 ns/iter (+/- 99) = 1202 MB/s
test distr_range2_float      ... bench:       1,620 ns/iter (+/- 17) = 4938 MB/s
test distr_range2_int        ... bench:       3,102 ns/iter (+/- 16) = 2578 MB/s
test distr_range_float       ... bench:       1,620 ns/iter (+/- 17) = 4938 MB/s
test distr_range_int         ... bench:       3,107 ns/iter (+/- 12) = 2574 MB/s

After:

test distr_baseline          ... bench:       1,617 ns/iter (+/- 15) = 4947 MB/s
test distr_exp               ... bench:       6,090 ns/iter (+/- 121) = 1313 MB/s
test distr_gamma_large_shape ... bench:      17,419 ns/iter (+/- 215) = 459 MB/s
test distr_gamma_small_shape ... bench:      20,304 ns/iter (+/- 259) = 394 MB/s
test distr_log_normal        ... bench:       5,849 ns/iter (+/- 64) = 1367 MB/s
test distr_normal            ... bench:       5,854 ns/iter (+/- 65) = 1366 MB/s
test distr_range2_float      ... bench:       1,620 ns/iter (+/- 7) = 4938 MB/s
test distr_range2_int        ... bench:       3,098 ns/iter (+/- 12) = 2582 MB/s
test distr_range_float       ... bench:       1,618 ns/iter (+/- 8) = 4944 MB/s
test distr_range_int         ... bench:       3,100 ns/iter (+/- 11) = 2580 MB/s

The benchmarks that depend on the ziggurat method have all improved, normal and log_normal by 12%, gamma by 4.5%, and exp by 3%. The benchmark distr_range2_float does not regress, thanks to splitting the float conversion functions off from those also generating the random integers.

Thanks for the encouragement until now!

@pitdicker (Author)

I just learned that the gamma and derived distributions do not use the ziggurat method. That makes me wonder why they have improved. Maybe the new code for Open01 is what makes them slightly faster...

@pitdicker force-pushed the master branch 2 times, most recently from 5b15c53 to aa9e54a on August 31, 2017 16:55
@pitdicker (Author)

The conversion functions are now moved to a different module. The new names make it clearer what is going on, so that is a plus :-)

@dhardy (Owner) left a comment

With a few minor changes I'd love to merge this, if you're finished tweaking of course (otherwise better to wait; there's no rush for this since the trait stuff is still undecided).

src/utils.rs Outdated

/// Helper function to combine the fraction, exponent and sign into a float.
/// `self` includes the bits from the fraction and sign.
fn combine_float(self, exponent: i32) -> Self::F;
@dhardy (Owner):

Maybe this should be called something like binor_exp? (I would suggest set_exp but if the exponent bits aren't currently zero it wouldn't do that.) Also the trait name: maybe FloatBitOps?
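Something with roughly this shape, I mean (a sketch using the suggested names, not the code in the PR):

// Sketch only, using the suggested names.
trait FloatBitOps {
    type F;
    /// Bitwise-OR the biased `exponent` into bits already holding the
    /// sign and fraction; assumes the exponent field is currently zero.
    fn binor_exp(self, exponent: i32) -> Self::F;
}

impl FloatBitOps for u32 {
    type F = f32;
    #[inline]
    fn binor_exp(self, exponent: i32) -> f32 {
        // Bias the exponent and shift it into bits 23..30 of an f32.
        let exponent_bits = ((exponent + 127) as u32) << 23;
        f32::from_bits(self | exponent_bits)
    }
}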

@pitdicker (Author):

It is just a helper function, but you are better at naming! I will change it.

// Because the chance to sample 0.0 is so low, this half-open
// range is a very good approximation of a closed range.
let x = $next_u(rng);
x.open_closed01()
@dhardy (Owner):

You give a good rationale for implementing [0, 1] sampling via (0, 1] instead. I wonder what happens with your extra precision sampling under common transformations. E.g. if a user samples x via this function, then what's the actual range of x + 1.0 — probably exactly [1, 2] since the loss of precision away from 0 forces rounding? Would x - 1.0 have range [-1, 0]?

@pitdicker (Author):

Yes, scaling should mostly preserve the precision, but addition forces rounding. I hope the extra precision for Open01 does not break anyone's code. But the most important reason to use Open01 is to not get 0.0, for log, division, etc. I don't foresee many problems in those cases.

x - 1.0 should have [-1, 0] as its range. I didn't think about that! But it hurts a bit now that I am focused on precision ;-)
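A small demonstration of the rounding involved (plain std floats, nothing from this PR):

// Near 1.0 the spacing between f64 values is about 2.2e-16, so any extra
// precision a sample had near 0.0 is rounded away once 1.0 is added.
fn main() {
    let x: f64 = 1.0e-17;      // perfectly representable on its own
    assert_eq!(x + 1.0, 1.0);  // rounded to 1.0: x + 1.0 can span exactly [1, 2]
    assert_eq!(x - 1.0, -1.0); // rounded to -1.0: x - 1.0 can reach exactly -1.0
}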

@dhardy (Owner):

I'm wondering now whether we should permanently document Closed01 as having range (0, 1], or have another distribution like OpenClosed01 as well, or instead of Closed01. Because not generating 0 may be a useful property.

Of course it may be surprising that OpenClosed01 + x will often have range [x, x+1] due to rounding, but there's not much we can do about that (not knowing in general how large x is).

@pitdicker (Author):

I would not hurry to add more range distributions with a different closed/open story... We already have maybe too many to choose from. If someone doesn't want to generate 0.0, they can use Open01. And otherwise there is now the conversion function open_closed01. So I would not add it right now, but who knows what the future brings?

} else {
RangeFloat::NegAndPos {
cutoff: high,
scale: (high + low) / 2.0 - high,
@dhardy (Owner):

It seems the half-open [low, high) range works well here, doesn't it? Is a closed version, [low, high] feasible? Where low is close to zero, the same approximation trick, (low, high] may be okay (and equiv for high < 0), but for ranges spanning 0 there's no good solution?

The other question is whether this actually matters much in practice... many applications don't need so much precision in the first place, but maybe some do?

@pitdicker (Author):

I have spent some hours trying to get the rounding at the edges of the ranges right, but without much success yet... So all the stuff about open/closed is theoretically nice, but not yet the reality :-(. That is one of the reasons I was not sure, in the other PR, whether it made sense to expose a closed range in the trait, because it does not really work out for floats.

But I will give my tests for the edge cases a few more tries. You asked above if I am done tweaking. Maybe not completely yet ;-)

@dhardy (Owner):

This is also a good rationale for not worrying too much about which open/closed distributions we have. I checked that the "pure math" version of these distributions gives the right open/closed behaviour, but what happens in practice due to rounding is another matter. :-/

@@ -174,13 +174,19 @@ fn ziggurat<R: Rng+?Sized, P, Z>(
// this may be slower than it would be otherwise.)
// FIXME: investigate/optimise for the above.
@dhardy (Owner):

I think you can remove this comment; we're already removing the next_f64 function from Rng.

let u = if symmetric {2.0 * f - 1.0} else {f};
let x = u * x_tab[i];
// FIXME: the distribution is not open, but half-open.
// Can that cause problems or a bias?
@dhardy (Owner):

I have no idea, sorry. But would rejection sampling slow it down much anyway, in practice? Maybe it's not significant enough to bother anyway.

@pitdicker (Author):

I have been reading up on the ziggurat method. There is even an improved version from 2015 that should be faster. But I added the note to make sure it doesn't get lost.

let f = if symmetric {
bits.closed_open11_fixed()
} else {
bits.closed_open01_fixed()
@dhardy (Owner):

Maybe documentation for these methods should make it very clear which bits are not used?

Paul Dicker added 5 commits September 3, 2017 07:55
Use a different technique to create floating point numbers uniformly distributed
over [0,1). The previous method only used 23 bits of precision for an f32; this
method uses all available 32 bits. This results in numbers with up to 9 extra
bits of precision closer to zero.

`Closed01` and `Open01` used multiplication to adjust the range sampled from
`Uniform01`. This does not play well with a `Uniform01` that gets higher
precision as it gets closer to 0.0. `Closed01` now does its own conversion to
sample from the range (0,1]. `Open01` uses the rejection method.
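The rejection method amounts to something like this sketch (hypothetical helper, not the commit's exact code):

// Sketch of the rejection idea: sample the half-open range [0, 1) and
// retry on the single excluded value, giving the open range (0, 1).
// With a high-precision f64 built from 64 random bits, 0.0 occurs with
// probability 2^-64, so the loop almost never runs more than once.
fn open01<F: FnMut() -> f64>(mut uniform01: F) -> f64 {
    loop {
        let x = uniform01();
        if x != 0.0 {
            return x;
        }
    }
}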
@pitdicker (Author)

Someday I will learn how to rebase properly...

I have made the changes you suggested.
I am done tweaking this code. Maybe in the future I will have another try at the rounding with ranges.

@dhardy (Owner) commented Sep 3, 2017

Great, thanks. Something like git rebase -i HEAD~5 is extremely useful for rebasing.

@dhardy merged commit e8b06db into dhardy:master on Sep 3, 2017
dhardy pushed a commit that referenced this pull request Jan 3, 2019
rand_os: doc and cargo improvements