
Higher precision for floats in range [0,1) #1

Merged
dhardy merged 5 commits into dhardy:master from pitdicker:master on Sep 3, 2017

Conversation

@pitdicker commented Aug 23, 2017

First, a disclaimer :).
With all the discussions happening around rand, I am not sure where to open a pull request.
Also, it is a bit difficult for me to explain what I am doing here...

This commit changes the way floats are created with the distribution Uniform01.
This is not a method I have found anywhere else, but it is loosely inspired by
https://readings.owlfolio.org/2007/generating-pseudorandom-floating-point-values/ .

With this method floats have a much better precision near zero (9 extra bits of precision for f32, and 12 extra bits for f64).
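Roughly, the idea looks like this (an illustrative sketch, not the exact code in this commit): use the leading zeros of the random integer to lower the exponent, and the bits after the leading one as the fraction, so every leading zero adds one bit of precision.

// Illustrative sketch, not the exact code in this commit.
// x / 2^32 = 1.fraction * 2^-(lz + 1), where lz is the number of leading
// zeros, so each leading zero lowers the exponent by one and leaves one
// more useful fraction bit, up to 32 - 23 = 9 extra bits for f32.
fn high_precision_f32(x: u32) -> f32 {
    const FRACTION_BITS: u32 = 23;
    const EXPONENT_BIAS: u32 = 127;

    if x == 0 {
        return 0.0;
    }
    let lz = x.leading_zeros(); // 0..=31, since x != 0
    let exponent = (EXPONENT_BIAS - 1 - lz) << FRACTION_BITS;
    // Shift out the leading zeros and the (now implicit) leading one,
    // and keep the top 23 remaining bits as the fraction.
    let fraction = ((x << lz) << 1) >> (32 - FRACTION_BITS);
    f32::from_bits(exponent | fraction)
}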

It seems to have almost no influence on the benchmarks:
Before:
test distr_baseline ... bench: 1,616 ns/iter (+/- 68) = 4950 MB/s
After:
test distr_baseline ... bench: 1,621 ns/iter (+/- 30) = 4935 MB/s

@dhardy (Owner) commented Aug 23, 2017

Nice. I haven't had the time today but I'll take a look and merge.

I haven't focussed on changes like this since getting the main architecture right seemed like a priority, but this is definitely worth bringing up and mentioning on the main RFC thread.

@pitdicker (Author)

Thank you, and there is no hurry for me.

I am now looking at what the consequences are for other parts of rand that use this.

range, for example, loses most of the precision again if one of its bounds is negative. I have fixed this, but not tested and benchmarked it yet.
open01 also seems to rely on some precision...

Tomorrow I will try to write something on the RFC thread.

@pitdicker (Author)

A quick update:
open01, closed01 and range now preserve the extra bits of precision near 0.0.

The benchmark distr_range2_float regressed by 40%. Moving the uint-to-float conversion out of closed01, uniform01 and uniform11 fixes this (see the sketch below). Probably a problem with inlining?
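For illustration, the shape of the fix looks something like this (hypothetical names and a stand-in conversion, not the actual diff):

// Hypothetical illustration: keep integer generation and the uint-to-float
// conversion as separate steps, with the conversion in a small function the
// compiler can reliably inline.
#[inline(always)]
fn closed_open01(x: u32) -> f32 {
    // Stand-in conversion: top 24 bits scaled into [0, 1).
    (x >> 8) as f32 * (1.0 / (1u32 << 24) as f32)
}

fn sample<R: FnMut() -> u32>(mut next_u32: R) -> f32 {
    let x = next_u32(); // generate the integer...
    closed_open01(x)    // ...then convert it separately
}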

This is a good chance to simplify the code to use constants, and wrap it in a macro so there is no duplication between f32 and f64.

Which is a long way of saying, please wait a few more days before reviewing :)

@pitdicker force-pushed the master branch 3 times, most recently from 7f9bc7e to 800bcf3 on August 27, 2017 19:01
@pitdicker (Author)

It should be ok now.

@pitdicker (Author)

The code in ziggurat in distributions uses its own method of converting a random int to a float.
It uses only as many bits of precision as there are bits in the float's fraction, and uses the rest to pick a row from the ziggurat. The previous code from Uniform01 is a perfect fit here. The distribution benchmarks improve by 5~7%.
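Roughly, the trick is that one u64 has more random bits than an f64 fraction needs, so the leftover bits can select the table row. An illustrative sketch (made-up names, not this PR's code):

// Illustrative sketch, made-up names: one u64 yields both a float in
// [0, 1) and a ziggurat layer index. The low 52 bits fill the fraction
// of an f64 in [1, 2), which becomes [0, 1) after subtracting 1.0; the
// top 8 bits are still independent and pick one of 256 layers.
fn split_u64(bits: u64) -> (f64, usize) {
    let fraction = bits & ((1u64 << 52) - 1);
    let f = f64::from_bits((1023u64 << 52) | fraction) - 1.0; // [0, 1)
    let i = (bits >> 56) as usize; // layer index 0..=255
    (f, i)
}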

This got me thinking that it may be best to separate out the int-to-float conversions into another file. They do not really fit in uniform.rs.

I'll work on it some more...

@dhardy (Owner) commented Aug 28, 2017

Sure, putting the float conversions in a separate file makes sense (uniform_float.rs or float.rs or some such).

I'm not so sure about making uniform01 etc. member functions on u32 / u64. It may be useful to make these public, sure, but in general it seems a weird operation to expose as a member function. Well, I suppose they are only available when people use FloatConversions, so maybe the design is fine.

@pitdicker (Author)

You are right that making them member functions is not ideal. But I don't know enough Rust yet for something better. Do you have a suggestion, or shall I just try a bit more?

@dhardy (Owner) commented Aug 28, 2017

Simple functions (generic or with separate names for each type) would work. But I don't know if this would be better. As noted in my edit above, using member functions is probably fine actually.

@pitdicker (Author) commented Aug 28, 2017

Moving the functions to a separate file took care of them being public, so I was lucky :-).

I have tried to clean up the commit history.

Some benchmarks:
Before:

test distr_baseline          ... bench:       1,620 ns/iter (+/- 17) = 4938 MB/s
test distr_exp               ... bench:       6,286 ns/iter (+/- 131) = 1272 MB/s
test distr_gamma_large_shape ... bench:      18,190 ns/iter (+/- 317) = 439 MB/s
test distr_gamma_small_shape ... bench:      21,233 ns/iter (+/- 427) = 376 MB/s
test distr_log_normal        ... bench:       6,650 ns/iter (+/- 94) = 1203 MB/s
test distr_normal            ... bench:       6,651 ns/iter (+/- 99) = 1202 MB/s
test distr_range2_float      ... bench:       1,620 ns/iter (+/- 17) = 4938 MB/s
test distr_range2_int        ... bench:       3,102 ns/iter (+/- 16) = 2578 MB/s
test distr_range_float       ... bench:       1,620 ns/iter (+/- 17) = 4938 MB/s
test distr_range_int         ... bench:       3,107 ns/iter (+/- 12) = 2574 MB/s

After:

test distr_baseline          ... bench:       1,617 ns/iter (+/- 15) = 4947 MB/s
test distr_exp               ... bench:       6,090 ns/iter (+/- 121) = 1313 MB/s
test distr_gamma_large_shape ... bench:      17,419 ns/iter (+/- 215) = 459 MB/s
test distr_gamma_small_shape ... bench:      20,304 ns/iter (+/- 259) = 394 MB/s
test distr_log_normal        ... bench:       5,849 ns/iter (+/- 64) = 1367 MB/s
test distr_normal            ... bench:       5,854 ns/iter (+/- 65) = 1366 MB/s
test distr_range2_float      ... bench:       1,620 ns/iter (+/- 7) = 4938 MB/s
test distr_range2_int        ... bench:       3,098 ns/iter (+/- 12) = 2582 MB/s
test distr_range_float       ... bench:       1,618 ns/iter (+/- 8) = 4944 MB/s
test distr_range_int         ... bench:       3,100 ns/iter (+/- 11) = 2580 MB/s

The benchmarks that depend on the ziggurat method have all improved, normal and log_normal by 12%, gamma by 4.5%, and exp by 3%. The benchmark distr_range2_float does not regress, thanks to splitting the float conversion functions off from those also generating the random integers.

Thanks for the encouragement until now!

@pitdicker (Author)

I just learned that the gamma and derived distributions do not use the ziggurat method. That makes me wonder why they have improved. Maybe the new code for Open01 is what makes them slightly faster...

@pitdicker force-pushed the master branch 2 times, most recently from 5b15c53 to aa9e54a on August 31, 2017 16:55
@pitdicker (Author)

The conversion functions are now moved to a different module. The new names make it clearer what is going on, so that is a plus :-)

@dhardy (Owner) left a comment

With a few minor changes I'd love to merge this, if you're finished tweaking of course (otherwise better to wait; there's no rush for this since the trait stuff is still undecided).

src/utils.rs Outdated

/// Helper function to combine the fraction, exponent and sign into a float.
/// `self` includes the bits from the fraction and sign.
fn combine_float(self, exponent: i32) -> Self::F;
@dhardy (Owner):

Maybe this should be called something like binor_exp? (I would suggest set_exp but if the exponent bits aren't currently zero it wouldn't do that.) Also the trait name: maybe FloatBitOps?
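Something with roughly this shape, I mean (a sketch using the suggested names, not the code in the PR):

// Sketch only, using the suggested names.
trait FloatBitOps {
    type F;
    /// Bitwise-OR the biased `exponent` into bits already holding the
    /// sign and fraction; assumes the exponent field is currently zero.
    fn binor_exp(self, exponent: i32) -> Self::F;
}

impl FloatBitOps for u32 {
    type F = f32;
    #[inline]
    fn binor_exp(self, exponent: i32) -> f32 {
        // Bias the exponent and shift it into bits 23..30 of an f32.
        let exponent_bits = ((exponent + 127) as u32) << 23;
        f32::from_bits(self | exponent_bits)
    }
}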

@pitdicker (Author):

It is just a helper function, but you are better at naming! I will change it.

// Because the chance to sample 0.0 is so low, this half-open
// range is a very good approximation of a closed range.
let x = $next_u(rng);
x.open_closed01()
@dhardy (Owner):

You give a good rationale for implementing [0, 1] sampling via (0, 1] instead. I wonder what happens with your extra precision sampling under common transformations. E.g. if a user samples x via this function, then what's the actual range of x + 1.0 — probably exactly [1, 2] since the loss of precision away from 0 forces rounding? Would x - 1.0 have range [-1, 0]?

@pitdicker (Author):

Yes, scaling should mostly preserve the precision, but addition forces rounding. I hope the extra precision for Open01 does not break anyone's code. But the most important reason to use Open01 is to not get 0.0, for log, division, etc. I don't foresee many problems in those cases.

x - 1.0 should have [-1, 0] as its range. I didn't think about that! But it hurts a bit now that I am focused on precision ;-)
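A small demonstration of the rounding involved (plain std floats, nothing from this PR):

// Near 1.0 the spacing between f64 values is about 2.2e-16, so any extra
// precision a sample had near 0.0 is rounded away once 1.0 is added.
fn main() {
    let x: f64 = 1.0e-17;      // perfectly representable on its own
    assert_eq!(x + 1.0, 1.0);  // rounded to 1.0: x + 1.0 can span exactly [1, 2]
    assert_eq!(x - 1.0, -1.0); // rounded to -1.0: x - 1.0 can reach exactly -1.0
}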

@dhardy (Owner):

I'm wondering now whether we should permanently document Closed01 as having range (0, 1], or have another distribution like OpenClosed01 as well, or instead of Closed01. Because not generating 0 may be a useful property.

Of course it may be surprising that OpenClosed01 + x will often have range [x, x+1] due to rounding, but there's not much we can do about that (not knowing in general how large x is).

@pitdicker (Author):

I would not hurry to add more range distributions with a different closed/open story... We already have maybe too many to choose from. If someone doesn't want to generate 0.0, they can use Open01. And otherwise there is now the conversion function open_closed01. So I would not add it right now, but who knows what the future brings?

} else {
RangeFloat::NegAndPos {
cutoff: high,
scale: (high + low) / 2.0 - high,
@dhardy (Owner):

It seems the half-open [low, high) range works well here, doesn't it? Is a closed version, [low, high] feasible? Where low is close to zero, the same approximation trick, (low, high] may be okay (and equiv for high < 0), but for ranges spanning 0 there's no good solution?

The other question is whether this actually matters much in practice... many applications don't need so much precision in the first place, but maybe some do?

@pitdicker (Author):

I have spent some hours trying to get the rounding at the edges of the ranges right, but without much success yet... So all the stuff about open/closed is theoretically nice, but not yet the reality :-(. That is one of the reasons I was not sure, in the other PR, whether it made sense to expose a closed range in the trait, because it does not really work out for floats.

But I will give my tests for the edge cases a few more tries. You asked above if I am done tweaking. Maybe not completely yet ;-)

@dhardy (Owner):

This is also a good rationale for not worrying too much about which open/closed distributions we have. I checked that the "pure math" version of these distributions gives the right open/closed behaviour, but what happens in practice due to rounding is another matter. :-/

@@ -174,13 +174,19 @@ fn ziggurat<R: Rng+?Sized, P, Z>(
// this may be slower than it would be otherwise.)
// FIXME: investigate/optimise for the above.
@dhardy (Owner):

I think you can remove this comment; we're already removing the next_f64 function from Rng.

let u = if symmetric {2.0 * f - 1.0} else {f};
let x = u * x_tab[i];
// FIXME: the distribution is not open, but half-open.
// Can that cause problems or a bias?
@dhardy (Owner):

I have no idea, sorry. But would rejection sampling slow it down much anyway, in practice? Maybe it's not significant enough to bother anyway.

@pitdicker (Author):

I have been reading up on the ziggurat method. There is even an improved version from 2015 that should be faster. But I added the note to make sure it doesn't get lost.

let f = if symmetric {
bits.closed_open11_fixed()
} else {
bits.closed_open01_fixed()
@dhardy (Owner):

Maybe documentation for these methods should make it very clear which bits are not used?

Paul Dicker added 5 commits September 3, 2017 07:55
Use a different technique to create floating point numbers uniformly distributed
over [0,1). The previous method only used 23 bits of precision for an f32; this
method uses all available 32 bits. This results in numbers with up to 9 extra
bits of precision closer to zero.

`Closed01` and `Open01` used multiplication to adjust the range sampled from
`Uniform01`. This does not play well with a `Uniform01` that gets higher
precision as it gets closer to 0.0. `Closed01` now does its own conversion to
sample from the range (0,1]. `Open01` uses the rejection method.
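The rejection method amounts to something like this sketch (hypothetical helper, not the commit's exact code):

// Sketch of the rejection idea: sample the half-open range [0, 1) and
// retry on the single excluded value, giving the open range (0, 1).
// With a high-precision f64 built from 64 random bits, 0.0 occurs with
// probability 2^-64, so the loop almost never runs more than once.
fn open01<F: FnMut() -> f64>(mut uniform01: F) -> f64 {
    loop {
        let x = uniform01();
        if x != 0.0 {
            return x;
        }
    }
}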
@pitdicker (Author)

Someday I will learn how to rebase properly...

I have made the changes you suggested.
I am done tweaking this code. Maybe in the future I will have another try at the rounding with ranges.

@dhardy (Owner) commented Sep 3, 2017

Great, thanks. Something like git rebase -i HEAD~5 is extremely useful for rebasing.

@dhardy merged commit e8b06db into dhardy:master on Sep 3, 2017
dhardy pushed a commit that referenced this pull request Jan 3, 2019
rand_os: doc and cargo improvements