Submitting Author: Name (@vnmabus)
Package Name: rdata
One-Line Description of Package: Read R datasets from Python.
Repository Link (if existing): https://github.com/vnmabus/rdata
Code of Conduct & Commitment to Maintain Package
I agree to abide by pyOpenSci's Code of Conduct during the review process and, should my package be accepted, while maintaining it afterwards.
Include a brief paragraph describing what your package does:
Community Partnerships
We partner with communities to support peer review with an additional layer of
checks that satisfy community requirements. If your package fits into an
existing community please check below:
Please indicate which category or categories.
Check out our package scope page to learn more about our
scope. (If you are unsure of which category you fit, we suggest you make a pre-submission inquiry):
This package is not specific to a particular domain, but it can be used in several of them.
Explain how and why the package falls under these categories (briefly, 1-2 sentences). Please note any areas you are unsure of:
Its main purpose is to read .rda and .rds files, the file formats used for storing data in the R programming language, and to convert their contents to Python objects for further processing.
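To make the two file types concrete, here is a minimal, standard-library-only sketch of how they can be told apart by their leading bytes. It is not rdata's implementation: `sniff_r_file` is a hypothetical helper, and the sketch assumes only the most common cases (optional gzip, bzip2 or xz compression, the `RDX2`/`RDX3` workspace headers, and the `X`/`A` serialization markers).

```python
import bz2
import gzip
import lzma

# Magic numbers of the compression formats commonly applied to R data files.
_DECOMPRESSORS = {
    b"\x1f\x8b": gzip.decompress,
    b"BZh": bz2.decompress,
    b"\xfd7zXZ\x00": lzma.decompress,
}


def sniff_r_file(data: bytes) -> str:
    """Guess whether raw bytes look like an .rda or an .rds file.

    Hypothetical helper for illustration; it only inspects the header.
    """
    # Undo the outer compression layer, if one is present.
    for magic, decompress in _DECOMPRESSORS.items():
        if data.startswith(magic):
            data = decompress(data)
            break

    # Workspace files (.rda / .RData) begin with an "RDX2" or "RDX3" line.
    if data.startswith((b"RDX2\n", b"RDX3\n")):
        return "rda"
    # Single serialized objects (.rds) begin directly with the format
    # marker: "X" for binary XDR, "A" for ASCII.
    if data.startswith((b"X\n", b"A\n")):
        return "rds"
    return "unknown"
```

A real reader would go on to parse the serialized object tree that follows the header; this sketch stops at identifying the container.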
Who is the target audience and what are the scientific applications of this package?
The target audience includes users who want to open datasets created in R from Python. These include scientists working in both Python and R, scientists who want to compare results between the two languages using the same data, or simply Python users who want access to the numerous datasets available in CRAN, the R package repository.
Are there other Python packages that accomplish similar things? If so, how does yours differ?
The package rpy2 can be used to interact with R from Python. This includes the ability to load data in the RData format and to convert it to equivalent Python objects. Although this is arguably the best package for interaction between the two languages, it has several disadvantages if one only wants to load RData datasets. First, the package requires an R installation, as it relies on launching an R interpreter and communicating with it. Second, launching R just to load data is inefficient, both in time and in memory. Finally, the package inherits the GPL license from the R language, which is not compatible with the more permissive licenses under which most Python packages are typically released.
The recent package pyreadr also provides functionality to read some R datasets. It relies on the C library librdata to parse the RData format. This adds a dependency on C build tools and requires that the package be compiled for every target operating system. Moreover, the package is limited to the functionality available in librdata, which at the time of writing does not include parsing of common objects such as R lists and S4 objects. The license can also be a problem, as it is part of the GPL family, whose copyleft terms can complicate use in commercial or proprietary software.
Any other questions or issues we should be aware of:
P.S. Have feedback/comments about our review process? Leave a comment here
Hi @vnmabus! Welcome to pyOpenSci and thank you for raising this detailed presubmission inquiry.
Yes, rdata is definitely in scope. Please do proceed with a full submission.
One thing to note is that our review process puts a strong emphasis on documentation. I can tell you have already done a lot of work, and this is not a requirement to start the review, but please consider adding concrete examples beyond what you have in "getting started". Since you're working with R, I'm guessing you're already familiar with the idea of vignettes. You'll want some like those for Pynteny or the examples in the PyGMT user guide.
Please also reference this issue when you do submit.
Once you confirm that you will submit, I'll close this issue.