Add tarfile support #192

Merged
merged 7 commits into from
Jun 18, 2020

rjgildea
Contributor

Two alternative implementations of adding tarfile support:

  1. Decompress tarfiles immediately after downloading, before validation of the downloaded file (17484f2)
  2. Require a list of expected tar file contents to be defined in the definition.yml. After validating the downloaded tarfile, check whether all expected files in source["files"] exist locally, and if not, decompress the tarfile (0429b7e)

Minimal definition for 1. would be:

data:
 - url: https://zenodo.org/record/1443110/files/ccp4school2018_bl41xu_data03.tar.xz

Whereas for 2. more information is required:

data:
 - url: https://zenodo.org/record/1443110/files/ccp4school2018_bl41xu_data03.tar.xz
   files:
   - ccp4school2018_bl41xu/05/data03/data03_master.h5
   - ccp4school2018_bl41xu/05/data03/data03_data_000001.h5
   - ccp4school2018_bl41xu/05/data03/data03_data_000002.h5
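
As a rough illustration of option 2 (not the code in 0429b7e), the check-then-extract step could look something like the sketch below. The function names `missing_files` and `ensure_extracted` are hypothetical, and the sketch assumes the paths under `files` are relative to the directory the archive is unpacked into. It avoids f-strings so that it also runs on Python 3.5.

```python
import os
import tarfile


def missing_files(target_dir, expected):
    """Return the expected relative paths that do not exist under target_dir."""
    return [f for f in expected if not os.path.isfile(os.path.join(target_dir, f))]


def ensure_extracted(archive, target_dir, expected):
    """Unpack the archive only if at least one expected file is missing.

    Returns True if an extraction was performed, False if nothing was needed.
    Hypothetical helper: the archive is assumed to be already validated.
    """
    if not missing_files(target_dir, expected):
        return False
    # The default mode "r" detects gzip, bzip2 and xz compression transparently.
    with tarfile.open(archive) as tf:
        tf.extractall(target_dir)
    still_missing = missing_files(target_dir, expected)
    if still_missing:
        raise RuntimeError(
            "archive does not contain expected files: {}".format(still_missing)
        )
    return True
```

With a definition like the one above, `expected` would be the list under `files`, so a second run finds everything in place and skips the decompression.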

Fixes #11, fixes #12

Member

@benjaminhwilliams benjaminhwilliams left a comment

Looks like the f-strings are the cause of the Python 2.7 & 3.5 build failures. I don't know where we stand on Python version support for dials/data, so I'll defer to @Anthchirp.

I've made a suggestion about not necessarily extracting the entire archive if we don't need it. Not absolutely necessary but I could see it being useful for cases like https://zenodo.org/record/51405, which contain lots of tarballed sweeps from each of which one may only want a few images.
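
To make the partial-extraction idea concrete, here is a minimal sketch using only the standard-library `tarfile` module. The helper name `extract_selected` is hypothetical and this is not the suggestion as posted on the diff. Note that for a compressed archive the whole stream still has to be read to list its members, so the saving is in disk space rather than in decompression time.

```python
import tarfile


def extract_selected(archive, target_dir, wanted):
    """Extract only the members whose names appear in `wanted`.

    Returns the sorted names that were actually found and extracted, so the
    caller can compare them against what was requested.
    """
    wanted = set(wanted)
    with tarfile.open(archive) as tf:
        # getmembers() scans the archive; keep only the requested entries.
        members = [m for m in tf.getmembers() if m.name in wanted]
        tf.extractall(target_dir, members=members)
    return sorted(m.name for m in members)
```

The names passed in `wanted` would be the same archive-relative paths listed under `files` in the definition.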

@Anthchirp
Member

2.7 we can and should plausibly drop, so I've put in #193 for that.
3.5 is supported, even though we don't use it ourselves. I would suggest keeping 3.5 support until it is officially retired mid-September, i.e. leave out f-strings for now.

Member

@Anthchirp Anthchirp left a comment

Let's try it and see what happens

Successfully merging this pull request may close these issues.

Support compressed data sources
Add SPring8 CCP4 workshop data from zenodo
3 participants