-
Notifications
You must be signed in to change notification settings - Fork 127
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add artifacts interface #1342
Add artifacts interface #1342
Conversation
…ct provides better reusability of the processed curve data.
…ined in the single method _create_figures. This allows subclass to flexibly modify the figure generation without overwriting the entire _run_analysis.
…nd fit summary in CurveAnalysis with artifact container. Composite curve analysis is also simplified.
…have service API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @coruscating this looks good. Do you plan to add a tutorial for artifact? You also need to update unittest util for ExperimentData equality check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. @wshanks do you have a chance to check this PR? Newly introduced public APIs look reasonable to me. If we find any problem in their behavior I think we can fix without breaking API change. In that sense we can also merge this now for 0.6 release.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good. I have a couple of concerns to get your thoughts on. I also commented on several non-blocking things that could be turned into new issues.
def artifacts( | ||
self, | ||
artifact_key: int | str = None, | ||
) -> ArtifactData | list[ArtifactData]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally, I don't like APIs that change output type based on the value. They are convenient in some situations like a REPL but make the user do extra work to be correct in general. Maybe artifacts
could always return a list, and if artifacts()[0]
seems too awkward there could be a separate artifact()
method the only returns the first result (and maybe warns or errors if there were more than one result for query)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be fair this pattern appears everywhere in experiments. I agree with your point though. I would cleanup the entire interface in next release.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair. It's avoiding adding opportunity for deprecated behavior vs keeping consistency across the API.
docs/howtos/artifacts.rst
Outdated
scatter_table.dataframe | ||
|
||
The artifacts in a large composite experiment with ``flatten_results=True`` can be distinguished from | ||
each other using the :attr:`~.ArtifactData.experiment` and :attr:`~.ArtifactData.device_components` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A good follow up would be to add more query parameters to the artifacts
method, so you can do exp_data.artifacts("curve_data", device_components=[Qubit(1)])
instead of filtering the list manually after getting exp_data.artifacts("curve_data")
. Maybe there could be a shortcut like qubits=[1]
so you don't need to import the Qubit
class to do this query.
for i in exp_data._figures.keys(): | ||
self.assertEqual(exp_data._figures[i], copied._figures[i]) | ||
for i in exp_data._artifacts.keys(): | ||
self.assertEqual(exp_data._artifacts[i], copied._artifacts[i]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not use the public methods .artifacts()
, .figure_names()
, and .figure(name)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, that's better and I found a bug where figure_names
were being duplicated in the copied object.
if "artifact_files" in expdata.metadata: | ||
for filename in expdata.metadata["artifact_files"]: | ||
if service.experiment_has_file(experiment_id, filename): | ||
artifact_file = service.file_download(experiment_id, filename) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't have a lot of artifacts now so it shouldn't matter much, but we might want bulk upload/download functions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that's follow-up work for the cloud service.
@@ -2364,6 +2378,10 @@ def copy(self, copy_results: bool = True) -> "ExperimentData": | |||
new_instance._figures = ThreadSafeOrderedDict() | |||
new_instance.add_figures(self._figures.values()) | |||
|
|||
with self._artifacts.lock: | |||
new_instance._artifacts = ThreadSafeOrderedDict() | |||
new_instance.add_artifacts(self._artifacts.values()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't do enough threading for it to matter but ExperimentData
should probably have a top level lock. Ideally, all the data should be copied within a single lock instead of one per subcomponent.
Co-authored-by: Will Shanks <wshaos@posteo.net>
also deprecate accessing analysis results via numerical indices
@@ -1599,6 +1545,21 @@ def analysis_results( | |||
) | |||
self._retrieve_analysis_results(refresh=refresh) | |||
|
|||
if index == 0: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you want to check that index 0 holds fit parameters and otherwise fall back to the other branch here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. Added a check that the name starts with @
since importing PARAMS_ENTRY_PREFIX
would be circular and we already use starting with @
as a filtering criteria when sending data to the plotter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me! I will try to make issues from my other comments.
### Summary Thanks to #1342 we can cleanup internals of `CompositeCurveAnalysis`. Not API break and no feature upgrade with this PR. ### Details and comments Previously the curve data and fit summary data are internally created in `CurveAnalysis` but immediately discarded. The implementation in `CurveAnalysis._run_analysis` is manually copied to `CompositeCurveAnalysis._run_analysis` to access these artifact data to create composite artifact data from them. This makes code fragile since developers needed to manually update both base classes. With this PR, implementation of component analysis is encapsulated.
### Summary Thanks to #1342 we can cleanup internals of `CompositeCurveAnalysis`. Not API break and no feature upgrade with this PR. ### Details and comments Previously the curve data and fit summary data are internally created in `CurveAnalysis` but immediately discarded. The implementation in `CurveAnalysis._run_analysis` is manually copied to `CompositeCurveAnalysis._run_analysis` to access these artifact data to create composite artifact data from them. This makes code fragile since developers needed to manually update both base classes. With this PR, implementation of component analysis is encapsulated. (cherry picked from commit cb37d42)
Summary
This PR adds the artifacts interface following the design in https://github.com/Qiskit/rfcs/blob/master/0007-experiment-dataframe.md.
Details and comments
ArtifactData
dataclass for representing artifacts.ExperimentData.artifacts()
,.add_artifacts()
, anddelete_artifact()
for working with artifacts, which is stored in a thread safe list. Currently theScatterTable
andCurveFitResult
objects are stored as artifacts, and experiment serialization data will be added in the future..zip
formats for uploading to the cloud service. Inside each zipped file is a list of JSON artifact files with the filename equal to their unique artifact ID. For composite experiments withflatten_results=True
, allScatterTable
artifacts are stored incurve_data.zip
in individual jsons and so forth.dataframe.css
is for styling these tables).figure_names
were being duplicated in a copiedExperimentData
object.Example experiment with artifacts (link):
