Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RUST-1400 GridFS download methods #747

Merged
merged 3 commits into from
Oct 12, 2022

Conversation

isabelatkinson
Copy link
Contributor

@isabelatkinson isabelatkinson commented Sep 16, 2022

Implements the download_to_stream method described here and the download_to_stream_by_name method described here

Comment on lines 132 to 140
// to run a single test, set the TEST_DESCRIPTION environment variable to the test's
// description
if let Ok(description) = std::env::var("TEST_DESCRIPTION") {
if description != test_case.description {
continue;
}
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we all probably write something to this effect when trying to debug an individual test, so I thought it would be useful to have this in the code permanently

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we make this contains instead of strict equality? I normally add something like that in case I want to run a few tests at once. Great idea to add this btw; I think it'll save a ton of time for all of us going forward.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice 😎

src/gridfs.rs Outdated
files_id: Bson,
n: u32,
#[derive(Debug, Deserialize, Serialize)]
pub struct Chunk {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

going to disable the lint suppression for missing docs in places like these and add them in when I do RUST-1395

@isabelatkinson isabelatkinson marked this pull request as ready for review September 16, 2022 21:21
@isabelatkinson isabelatkinson requested review from a team and patrickfreed and removed request for a team September 16, 2022 21:21
@isabelatkinson isabelatkinson marked this pull request as draft September 26, 2022 19:08
@isabelatkinson
Copy link
Contributor Author

converting to draft as I pull in some improvements from #751, feel free to hold off on reviewing for now

@isabelatkinson isabelatkinson force-pushed the gridfs-download branch 2 times, most recently from 210d6c8 to c4f4acf Compare September 26, 2022 20:54
@isabelatkinson isabelatkinson marked this pull request as ready for review September 26, 2022 21:16
@isabelatkinson
Copy link
Contributor Author

ready for review again

Copy link
Contributor

@patrickfreed patrickfreed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay on the review!

id: ObjectId,
files_id: Bson,
n: u32,
// default size is 255 KiB
data: Vec<u8>,
data: Binary,
}

/// A collection in which information about stored files is stored. There will be one files
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this is existing code, but I think this docstring could be a tad clearer. For instance, this struct models the documents in the files collection, rather than being the collection itself. Also, it could be better to say something like "metadata" rather than information, and also could be useful to mention the collection in which the files are actually stored (the "chunks" collection, right?).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I left a comment about doing a later pass over documentation but it became outdated when I made some subsequent changes: #747 (comment)

I think most of the documentation needs some cleaning up so I figured I'd do it in one swoop at the end (although I made a few update in this PR) -- does that sound good re the rest of your documentation comments or would you rather have them addressed here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Follow up PR sounds good to me! Sorry, yeah I missed your previous comment.

pub upload_date: DateTime,
pub filename: Option<String>,
#[serde(skip_serializing_if = "Option::is_none")]
pub metadata: Option<Document>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be good to document each of these fields, similar to what we do for ChangeStreamEvent.

options::{FindOneOptions, FindOptions},
};

impl GridFsBucket {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we provide doctest examples for each of these? (sorry i know it will be a lot of similar examples, but I think it's something users really appreciate)

Comment on lines 132 to 140
// to run a single test, set the TEST_DESCRIPTION environment variable to the test's
// description
if let Ok(description) = std::env::var("TEST_DESCRIPTION") {
if description != test_case.description {
continue;
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we make this contains instead of strict equality? I normally add something like that in case I want to run a few tests at once. Great idea to add this btw; I think it'll save a ton of time for all of us going forward.

}

/// Downloads the contents of the stored file specified by `id` and writes
/// the contents to the `destination`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The method name implies this of course, but can we explicitly mention that the destination here implements the AsyncWrite trait from the futures 0.3 crate, and link to it? Ditto for the tokio ones. Might also be worth mentioning that once std stabilizes AsyncWrite, this method will be deprecated (ditto for tokio ones on this too).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the to_futures_0_3 methods are just using compat_write under the hood and are temporary, what would people think about leaving them out entirely and recommending users do this directly if needed in documentation?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@abr-egn I'd be fine with that

Copy link
Contributor

@patrickfreed patrickfreed Oct 5, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a little inconsistent with our various cursor APIs that implement the futures version of Stream, but having seen what the implementation actually entails here, I think I agree that maybe these separate methods probably aren't worth it. Instead, if we want to have consistency, maybe we should just also implement tokio's Stream on our cursor types. Then omitting the futures one here isn't as inconsistent, since we do have at least one crate's traits implemented across the board.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make more sense to have only the futures implementations then? That way we're runtime-agnostic and in line with what we already support for Cursor. The tokio compat utils go both ways, so I think the user conversion code would be basically the same regardless of which library we choose to support in our API.

id: ObjectId,
files_id: Bson,
n: u32,
// default size is 255 KiB
data: Vec<u8>,
data: Binary,
}

/// A collection in which information about stored files is stored. There will be one files
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I left a comment about doing a later pass over documentation but it became outdated when I made some subsequent changes: #747 (comment)

I think most of the documentation needs some cleaning up so I figured I'd do it in one swoop at the end (although I made a few update in this PR) -- does that sound good re the rest of your documentation comments or would you rather have them addressed here?

pub enum GridFsErrorKind {
/// The file with the given identifier was not found.
#[non_exhaustive]
FileNotFound { identifier: GridFsFileIdentifier },
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a little convoluted but the FileNotFound error can be returned from either download (which has an id) or download_by_name (which has a filename), and it didn't seem much better to have separate error cases for those.

id: ObjectId,
files_id: Bson,
n: u32,
// default size is 255 KiB
data: Vec<u8>,
data: Binary,
}

/// A collection in which information about stored files is stored. There will be one files
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Follow up PR sounds good to me! Sorry, yeah I missed your previous comment.

Copy link
Contributor

@patrickfreed patrickfreed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! tagging in the other reviews.

Going to leave my docs related commends unresolved so you can go back to them when doing the follow on PR.

@patrickfreed patrickfreed requested review from kmahar and abr-egn October 4, 2022 15:09
Copy link
Contributor

@abr-egn abr-egn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

}

/// Downloads the contents of the stored file specified by `id` and writes
/// the contents to the `destination`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the to_futures_0_3 methods are just using compat_write under the hood and are temporary, what would people think about leaving them out entirely and recommending users do this directly if needed in documentation?

Copy link
Contributor

@kmahar kmahar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a few minor questions/comments but LGTM overall!

}

/// Downloads the contents of the stored file specified by `id` and writes
/// the contents to the `destination`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@abr-egn I'd be fine with that

@isabelatkinson
Copy link
Contributor Author

Updated to remove the tokio methods. I initially wrote a test to verify that this works but removed it in favor of adding a doc example when I refresh all of the documentation which will give the same coverage.

@isabelatkinson isabelatkinson merged commit c9c4e3f into mongodb:main Oct 12, 2022
@isabelatkinson isabelatkinson deleted the gridfs-download branch March 6, 2023 18:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants