Add artifacts interface #1342

coruscating · 2023-12-12T14:56:05Z

Summary

This PR adds the artifacts interface following the design in https://github.com/Qiskit/rfcs/blob/master/0007-experiment-dataframe.md.

Details and comments

Added the ArtifactData dataclass for representing artifacts.
Added ExperimentData.artifacts(), .add_artifacts(), and .delete_artifact() for working with artifacts, which are stored in a thread-safe list. Currently the ScatterTable and CurveFitResult objects are stored as artifacts; experiment serialization data will be added in the future.
Artifacts are grouped by type and stored in a compressed format so that composite experiments don't produce a huge number of individual files. As such, this PR depends on "Add .zip format for artifact upload" (qiskit-ibm-experiment#93) to allow .zip files to be uploaded to the cloud service. Each zipped file contains a list of JSON artifact files whose filenames are their unique artifact IDs. For composite experiments with flatten_results=True, all ScatterTable artifacts are stored in curve_data.zip as individual JSON files, and likewise for the other artifact types.
Added a how-to for artifacts and updated documentation to demonstrate dataframe objects like AnalysisResults and the ScatterTable (dataframe.css is for styling these tables).
Deprecated accessing analysis results via numerical indices to anticipate removing the curve fit result from analysis results altogether in the next release.
Fixed bug where figure_names were being duplicated in a copied ExperimentData object.
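
The grouping of artifacts into one zip file per type can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the code added by this PR: the helper names `pack_artifacts` and `unpack_artifacts`, the plain-dict payloads, and the `<artifact_id>.json` member naming are all hypothetical.

```python
import io
import json
import zipfile
from pathlib import PurePosixPath
from typing import Dict


def pack_artifacts(artifacts: Dict[str, dict]) -> bytes:
    """Write one JSON member per artifact into an in-memory zip archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as zipped:
        for artifact_id, payload in artifacts.items():
            # Member name is derived from the artifact ID (assumed naming scheme).
            zipped.writestr(f"{artifact_id}.json", json.dumps(payload))
    return buffer.getvalue()


def unpack_artifacts(blob: bytes) -> Dict[str, dict]:
    """Read every JSON member back, keyed by artifact ID."""
    restored = {}
    with zipfile.ZipFile(io.BytesIO(blob), "r") as zipped:
        for member in zipped.namelist():
            artifact_id = PurePosixPath(member).stem
            restored[artifact_id] = json.loads(zipped.read(member))
    return restored


# Two hypothetical curve-data artifacts from a flattened composite experiment.
curve_data = {
    "id-0001": {"name": "curve_data", "experiment": "T1", "xval": [1, 2]},
    "id-0002": {"name": "curve_data", "experiment": "T2", "xval": [3, 4]},
}
curve_data_zip = pack_artifacts(curve_data)  # stands in for curve_data.zip
assert unpack_artifacts(curve_data_zip) == curve_data
```

Keeping one archive per artifact type means the number of uploaded files depends on how many artifact types exist, not on how many component experiments were run.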

Example experiment with artifacts (link):

nkanazawa1989

Thanks @coruscating, this looks good. Do you plan to add a tutorial for artifacts? You also need to update the unittest utility for the ExperimentData equality check.

qiskit_experiments/curve_analysis/curve_analysis.py

qiskit_experiments/framework/artifact_data.py

qiskit_experiments/framework/experiment_data.py

releasenotes/notes/experiment-artifacts-c481f4e07226ce9e.yaml

nkanazawa1989

LGTM. @wshanks, do you have time to check this PR? The newly introduced public APIs look reasonable to me. If we find any problem in their behavior, I think we can fix it without a breaking API change. In that sense we can also merge this now for the 0.6 release.

wshanks

This looks good. I have a couple of concerns to get your thoughts on. I also commented on several non-blocking things that could be turned into new issues.

docs/howtos/artifacts.rst

wshanks · 2024-02-07T16:23:09Z

qiskit_experiments/framework/experiment_data.py

+    def artifacts(
+        self,
+        artifact_key: int | str = None,
+    ) -> ArtifactData | list[ArtifactData]:


Personally, I don't like APIs that change output type based on the value. They are convenient in some situations like a REPL but make the user do extra work to be correct in general. Maybe artifacts could always return a list, and if artifacts()[0] seems too awkward there could be a separate artifact() method that only returns the first result (and maybe warns or errors if there was more than one result for the query)?

To be fair, this pattern appears everywhere in experiments. I agree with your point though. I would clean up the entire interface in the next release.

Fair. It's avoiding adding opportunity for deprecated behavior vs keeping consistency across the API.
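
The two-method interface suggested in this thread could look roughly like the sketch below. `Artifact` and `ArtifactStore` are hypothetical stand-ins rather than the real ArtifactData and ExperimentData classes, and raising on zero or multiple matches is just one of the behaviors floated above (warning would be another).

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional


@dataclass
class Artifact:
    """Hypothetical stand-in for ArtifactData with only the fields used here."""

    artifact_id: str
    name: str
    data: Any = None


@dataclass
class ArtifactStore:
    """Hypothetical container illustrating a list-only query plus a single-item getter."""

    _items: List[Artifact] = field(default_factory=list)

    def add(self, artifact: Artifact) -> None:
        self._items.append(artifact)

    def artifacts(self, key: Optional[str] = None) -> List[Artifact]:
        """Always return a list: everything, or the entries matching an ID or name."""
        if key is None:
            return list(self._items)
        return [a for a in self._items if key in (a.artifact_id, a.name)]

    def artifact(self, key: str) -> Artifact:
        """Return exactly one artifact; raise if the query matches none or several."""
        matches = self.artifacts(key)
        if len(matches) != 1:
            raise LookupError(f"Expected one artifact for {key!r}, found {len(matches)}")
        return matches[0]


store = ArtifactStore()
store.add(Artifact("id-1", "curve_data"))
store.add(Artifact("id-2", "fit_summary"))
summary = store.artifact("fit_summary")    # a single Artifact
all_curves = store.artifacts("curve_data")  # always a list, here of length 1
```

With this split, callers never need an isinstance check on the return value: the plural method is safe to iterate over, and the singular method fails loudly when the query is ambiguous.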

docs/howtos/artifacts.rst

wshanks · 2024-02-07T16:33:28Z

docs/howtos/artifacts.rst

+    scatter_table.dataframe
+
+The artifacts in a large composite experiment with ``flatten_results=True`` can be distinguished from
+each other using the :attr:`~.ArtifactData.experiment` and :attr:`~.ArtifactData.device_components`


A good follow up would be to add more query parameters to the artifacts method, so you can do exp_data.artifacts("curve_data", device_components=[Qubit(1)]) instead of filtering the list manually after getting exp_data.artifacts("curve_data"). Maybe there could be a shortcut like qubits=[1] so you don't need to import the Qubit class to do this query.
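
As a rough sketch of that follow-up, the query could accept either device components or a plain list of qubit indices. Everything here is hypothetical: `Qubit` and `Artifact` are simplified stand-ins, and `query_artifacts` with its `device_components` and `qubits` parameters is the proposed shape, not an existing API.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class Qubit:
    """Hypothetical stand-in for the device component class."""

    index: int


@dataclass
class Artifact:
    """Hypothetical stand-in for ArtifactData."""

    name: str
    experiment: str
    device_components: Tuple[Qubit, ...]


def query_artifacts(
    artifacts: Iterable[Artifact],
    name: Optional[str] = None,
    device_components: Optional[Iterable[Qubit]] = None,
    qubits: Optional[Iterable[int]] = None,
) -> List[Artifact]:
    """Filter by name and, optionally, by an exact match on device components."""
    wanted = None
    if device_components is not None:
        wanted = tuple(device_components)
    elif qubits is not None:
        # Shortcut: build the components from bare indices so callers
        # do not need to import the Qubit class.
        wanted = tuple(Qubit(i) for i in qubits)
    return [
        a
        for a in artifacts
        if (name is None or a.name == name)
        and (wanted is None or tuple(a.device_components) == wanted)
    ]


items = [
    Artifact("curve_data", "T1", (Qubit(0),)),
    Artifact("curve_data", "T1", (Qubit(1),)),
    Artifact("fit_summary", "T1", (Qubit(1),)),
]
on_qubit_1 = query_artifacts(items, "curve_data", qubits=[1])
```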

docs/howtos/artifacts.rst

wshanks · 2024-02-07T18:40:09Z

test/database_service/test_db_experiment_data.py

+        for i in exp_data._figures.keys():
+            self.assertEqual(exp_data._figures[i], copied._figures[i])
+        for i in exp_data._artifacts.keys():
+            self.assertEqual(exp_data._artifacts[i], copied._artifacts[i])


Why not use the public methods .artifacts(), .figure_names(), and .figure(name)?

Right, that's better and I found a bug where figure_names were being duplicated in the copied object.

qiskit_experiments/framework/experiment_data.py

wshanks · 2024-02-07T18:57:58Z

qiskit_experiments/framework/experiment_data.py

+            if "artifact_files" in expdata.metadata:
+                for filename in expdata.metadata["artifact_files"]:
+                    if service.experiment_has_file(experiment_id, filename):
+                        artifact_file = service.file_download(experiment_id, filename)


We don't have a lot of artifacts now so it shouldn't matter much, but we might want bulk upload/download functions?

Yeah, that's follow-up work for the cloud service.

wshanks · 2024-02-07T19:00:34Z

qiskit_experiments/framework/experiment_data.py

@@ -2364,6 +2378,10 @@ def copy(self, copy_results: bool = True) -> "ExperimentData":
            new_instance._figures = ThreadSafeOrderedDict()
            new_instance.add_figures(self._figures.values())

+        with self._artifacts.lock:
+            new_instance._artifacts = ThreadSafeOrderedDict()
+            new_instance.add_artifacts(self._artifacts.values())


We don't do enough threading for it to matter but ExperimentData should probably have a top level lock. Ideally, all the data should be copied within a single lock instead of one per subcomponent.
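
A minimal sketch of the single-lock idea follows, assuming a simplified container. `DataContainer` and its methods are hypothetical and far smaller than ExperimentData; the point is only that one re-entrant lock guards every subcomponent during the copy.

```python
import threading
from collections import OrderedDict
from copy import deepcopy


class DataContainer:
    """Hypothetical container with one re-entrant lock guarding all of its state."""

    def __init__(self):
        self._lock = threading.RLock()
        self._figures = OrderedDict()
        self._artifacts = OrderedDict()

    def add_figure(self, name, figure):
        with self._lock:
            self._figures[name] = figure

    def add_artifact(self, artifact_id, artifact):
        with self._lock:
            self._artifacts[artifact_id] = artifact

    def copy(self):
        new_instance = DataContainer()
        # One lock acquisition covers every subcomponent, so the copy is a
        # consistent snapshot even if another thread is adding data.
        with self._lock:
            new_instance._figures = deepcopy(self._figures)
            new_instance._artifacts = deepcopy(self._artifacts)
        return new_instance


original = DataContainer()
original.add_figure("fig_0", b"svg-bytes")
original.add_artifact("id-1", {"name": "curve_data"})
snapshot = original.copy()
```

With one lock per subcomponent, another thread could add an artifact between the figure copy and the artifact copy, leaving the two halves of the snapshot out of step; a single top-level lock rules that out.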

qiskit_experiments/framework/experiment_data.py

wshanks · 2024-02-08T01:04:07Z

qiskit_experiments/framework/experiment_data.py

@@ -1599,6 +1545,21 @@ def analysis_results(
            )
        self._retrieve_analysis_results(refresh=refresh)

+        if index == 0:


I think you want to check that index 0 holds fit parameters and otherwise fall back to the other branch here.

Good catch. Added a check that the name starts with @, since importing PARAMS_ENTRY_PREFIX would be circular and we already use a leading @ as a filtering criterion when sending data to the plotter.
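
The check discussed here can be illustrated as below. This is a sketch under assumptions, not the merged code: `Result`, `result_by_index`, the warning texts, and the example entry name are made up for the example, and only the "index 0 whose name starts with @" condition comes from the thread.

```python
import warnings
from typing import List, NamedTuple


class Result(NamedTuple):
    """Hypothetical stand-in for an analysis result entry."""

    name: str
    value: object


def result_by_index(results: List[Result], index: int) -> Result:
    """Return results[index], choosing the deprecation message by what index 0 holds."""
    if index == 0 and results and results[0].name.startswith("@"):
        # Index 0 really is the fit-parameter entry: point users to artifacts.
        warnings.warn(
            "Curve fit results will be available only as artifacts in a future release.",
            DeprecationWarning,
            stacklevel=2,
        )
    else:
        # Any other case falls back to the generic numerical-index deprecation.
        warnings.warn(
            "Accessing analysis results by numerical index is deprecated; "
            "use the result name instead.",
            DeprecationWarning,
            stacklevel=2,
        )
    return results[index]
```

Checking the name rather than the index alone matters because an analysis that produces no fit-parameter entry still has some result at index 0, and the artifact-specific message would be misleading there.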

wshanks

Looks good to me! I will try to make issues from my other comments.

Summary

Thanks to #1342 we can clean up the internals of `CompositeCurveAnalysis`. This PR introduces no API break and no feature upgrade.

Details and comments

Previously, the curve data and fit summary data were created internally in `CurveAnalysis` but immediately discarded. The implementation in `CurveAnalysis._run_analysis` was manually copied into `CompositeCurveAnalysis._run_analysis` so that it could access these artifact data and build composite artifact data from them. This made the code fragile, since developers needed to update both classes manually. With this PR, the implementation of the component analysis is encapsulated.

The backport of this change carries the same description, with the note: (cherry picked from commit cb37d42)

nkanazawa1989 and others added 13 commits August 31, 2023 17:43

Add new data model as a replacement of CurveData dataclass. This object provides better reusability of the processed curve data.

fa675b4

Refactor the internals of curve fit. Figure creation code is now combined in the single method _create_figures. This allows subclass to flexibly modify the figure generation without overwriting the entire _run_analysis.

fc68e24

Add pandas to requirements

4633b2f

Update unittest for scatter table.

731b763

Update type hint

07e85fc

Add release note

3f55ad9

Introduce artifact data and replace analysis results for curve data and fit summary in CurveAnalysis with artifact container. Composite curve analysis is also simplified.

458b98c

Introduce base class of experiment data DataCollection which doesn't have service API.

0a37892

Move FigureData to dedicated file

c942460

WIP: decouple service handling from ExperimentData.

c22b3db

Update docstrings and merge some main changes

c7964f7

result metadata parsing

e4da036

add test and some fixes

f60ad6c

coruscating added this to the Release 0.6 milestone Dec 12, 2023

coruscating added 6 commits January 8, 2024 00:03

merge main

9b9c193

revert experimentdata refactoring

ed5cde2

fix bugs

ef376d3

merge main

65ad21b

fix curve analysis behavior and update tests

f1a5671

fix tests and revert service handler

4e46084

coruscating marked this pull request as ready for review January 9, 2024 14:58

coruscating added 5 commits January 11, 2024 17:04

add release note and minor fixes

de3bc90

update serialization and add_artifact behavior, add test

2dacc98

fix flatten_results behavior

1293fa0

fix test

7f016c6

merge main

a755171

nkanazawa1989 reviewed Jan 18, 2024

coruscating and others added 3 commits January 25, 2024 11:51

move ArtifactData to containers

e71e962

Update internals of AnalysisResultTable

78a967a

Update internals of ScatterTable

7a83179

add intro

f74564b

wshanks mentioned this pull request Feb 6, 2024

0.6 release notes and deprecation policy #1385

Merged

coruscating added 2 commits February 6, 2024 22:33

update howto

eb89f37

Merge remote-tracking branch 'upstream/main' into dataframe-pr3

f9a987e

wshanks reviewed Feb 7, 2024

releasenotes/notes/experiment-artifacts-c481f4e07226ce9e.yaml (outdated, resolved)

coruscating added 2 commits February 7, 2024 00:31

fix artifact copy and add test

6b56b55

fix release note

83d33d4

nkanazawa1989 approved these changes Feb 7, 2024

nkanazawa1989 mentioned this pull request Feb 7, 2024

Epic - Implementation of RFC 0007: Dataframe for Qiskit Experiments Qiskit/RFCs#62

Closed

5 tasks

coruscating added the Changelog: New Feature Include in the "Added" section of the changelog label Feb 7, 2024

wshanks reviewed Feb 7, 2024

coruscating and others added 6 commits February 7, 2024 15:05

review comments

93b1298

Apply suggestions from code review

a448bb8

Co-authored-by: Will Shanks <wshaos@posteo.net>

review comments

fe7485f

minor fixes

62184d0

update plot

2b3733e

put curve fit back in analysis results and deprecate access

74a65ba

also deprecate accessing analysis results via numerical indices

wshanks reviewed Feb 8, 2024

update deprecation message

f91bfe0

wshanks approved these changes Feb 8, 2024

coruscating added this pull request to the merge queue Feb 8, 2024

Merged via the queue into qiskit-community:main with commit a7d260a Feb 8, 2024
11 checks passed

coruscating deleted the dataframe-pr3 branch February 8, 2024 04:41

nkanazawa1989 mentioned this pull request Feb 8, 2024

Cleanup composite analysis #1397

Merged

coruscating mentioned this pull request Feb 8, 2024

Clearly mark analysis result names in documentation #1400

Open

mergify bot mentioned this pull request Apr 22, 2024

Cleanup composite analysis (backport #1397) #1447

Closed

coruscating mentioned this pull request May 1, 2024

Remove deprecated code for 0.7 #1452

Merged
