Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sample #195

Merged
merged 9 commits into from
Dec 5, 2017
Merged
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion Cargo.toml
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
[package]
name = "rand"
version = "0.3.18"
version = "0.3.19"
authors = ["The Rust Project Developers"]
license = "MIT/Apache-2.0"
readme = "README.md"
Expand Down
27 changes: 25 additions & 2 deletions src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -260,8 +260,6 @@ pub use os::OsRng;

pub use isaac::{IsaacRng, Isaac64Rng};
pub use chacha::ChaChaRng;
#[deprecated(since="0.3.18", note="renamed to seq::sample_iter")]
pub use seq::{sample_iter as sample};

#[cfg(target_pointer_width = "32")]
use IsaacRng as IsaacWordRng;
Expand Down Expand Up @@ -1019,6 +1017,31 @@ pub fn random<T: Rand>() -> T {
thread_rng().gen()
}

#[inline(always)]
#[deprecated(since="0.3.18", note="renamed to seq::sample_iter")]
/// DEPRECATED: use `seq::sample_iter` instead.
///
/// Randomly sample up to `amount` elements from a finite iterator.
/// The order of elements in the sample is not random.
///
/// # Example
///
/// ```rust
/// use rand::{thread_rng, sample};
///
/// let mut rng = thread_rng();
/// let sample = sample(&mut rng, 1..100, 5);
/// println!("{:?}", sample);
/// ```
pub fn sample<T, I, R>(rng: &mut R, iterable: I, amount: usize) -> Vec<T>
where I: IntoIterator<Item=T>,
R: Rng,
{
// the legacy sample didn't care whether amount was met
seq::sample_iter(rng, iterable, amount)
.unwrap_or_else(|e| e)
}

#[cfg(test)]
mod test {
use super::{Rng, thread_rng, random, SeedableRng, StdRng, weak_rng};
Expand Down
27 changes: 15 additions & 12 deletions src/seq.rs
Original file line number Diff line number Diff line change
Expand Up @@ -13,33 +13,33 @@
use super::Rng;
use std::collections::hash_map::HashMap;

/// Randomly sample *up to* `amount` elements from a finite iterator.
/// Randomly sample `amount` elements from a finite iterator.
///
/// The values are non-repeating but the order of elements returned is *not* random.
/// The following can be returned:
/// - `Ok`: `Vec` of `amount` non-repeating randomly sampled elements. The order is not random.
/// - `Err`: `Vec` of *less than* `amount` elements in sequential order. This is considered an
/// error since exactly `amount` elements is typically expected.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the Err case I'd prefer more precise documentation: in case less than amount elements are available, a Vec of exactly the elements in iterable is returned. The order is not random.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the order is actually guaranteed to be sequential. I've updated the doc, let me know if that is better.

///
/// This implementation uses `O(len(iterable))` time and `O(amount)` memory.
///
/// > If `len(iterable) <= amount` then the values will be in sequential order. In all other
/// > cases the order of the elements is neither random nor guaranteed.
///
/// # Example
///
/// ```rust
/// use rand::{thread_rng, seq};
///
/// let mut rng = thread_rng();
/// let sample = seq::sample_iter(&mut rng, 1..100, 5);
/// let sample = seq::sample_iter(&mut rng, 1..100, 5).unwrap();
/// println!("{:?}", sample);
/// ```
pub fn sample_iter<T, I, R>(rng: &mut R, iterable: I, amount: usize) -> Vec<T>
pub fn sample_iter<T, I, R>(rng: &mut R, iterable: I, amount: usize) -> Result<Vec<T>, Vec<T>>
where I: IntoIterator<Item=T>,
R: Rng,
{
let mut iter = iterable.into_iter();
let mut reservoir = Vec::with_capacity(amount);
reservoir.extend(iter.by_ref().take(amount));

// continue unless the iterator was exhausted
// Continue unless the iterator was exhausted
//
// note: this prevents iterators that "restart" from causing problems.
// If the iterator stops once, then so do we.
Expand All @@ -50,12 +50,13 @@ pub fn sample_iter<T, I, R>(rng: &mut R, iterable: I, amount: usize) -> Vec<T>
*spot = elem;
}
}
Ok(reservoir)
} else {
// Don't hang onto extra memory. There is a corner case where
// `amount <<< len(iterable)` that we want to avoid.
// `amount` was much less than `len(iterable)`.
reservoir.shrink_to_fit();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you're not going with your own advice in #194 to make this fallible? I'm not sure personally; this is an iterator so shrinking to available data is common behaviour.

Err(reservoir)
}
reservoir
}

/// Randomly sample exactly `amount` values from `slice`.
Expand Down Expand Up @@ -224,11 +225,13 @@ mod test {

let mut r = thread_rng();
let vals = (min_val..max_val).collect::<Vec<i32>>();
let small_sample = sample_iter(&mut r, vals.iter(), 5);
let large_sample = sample_iter(&mut r, vals.iter(), vals.len() + 5);
let small_sample = sample_iter(&mut r, vals.iter(), 5).unwrap();
let large_sample = sample_iter(&mut r, vals.iter(), vals.len() + 5).unwrap_err();

assert_eq!(small_sample.len(), 5);
assert_eq!(large_sample.len(), vals.len());
// no randomization happens when amount >= len
assert_eq!(large_sample, vals.iter().collect::<Vec<_>>());

assert!(small_sample.iter().all(|e| {
**e >= min_val && **e <= max_val
Expand Down