
Maintenance #64

Merged · 25 commits · Nov 30, 2023
Changes from 1 commit
updated description and removed dask
Signed-off-by: lc3267 <melvin.strobl@kit.edu>
stroblme committed Nov 14, 2023
commit 8eaf5226827a379387e601d8730ce03502bfcd11
README.md: 56 changes (33 additions, 23 deletions)
@@ -44,7 +44,7 @@ mkdocs build

***

-## :rocket: Usage
+## Usage :rocket:

Without any configuration needed, you can execute
```
@@ -76,13 +76,13 @@ poetry kedro run --pipeline "visualize"
```
after running the "prepare" pipeline.

-This project can take advantage of multiprocessing using [Dask](dask.org/) to evaluate numerous combinations of *qubits*, *depths* and *shots*.
+This project can take advantage of multiprocessing to evaluate numerous combinations of *qubits*, *depths* and *shots*.
To enable this, you can run
```
-poetry run kedro run --pipeline "measure" --env dask --runner quafel.runner.DaskRunner
+poetry run kedro run --pipeline "measure" --env dask --runner quafel.runner.MyParallelRunner
```
which will calculate the duration and result for each configuration.
-See [Dask Setup](#runner-dask-setup) for detail on this.
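For intuition, here is a minimal sketch of how such a sweep expands into one simulation job per parameter combination. This is illustrative only; the variable names and values are made up and this is not the project's actual partitioning code.

```python
# Hypothetical example: expanding a parameter sweep into independent jobs.
from itertools import product

qubits = [2, 4, 8]   # assumed example values, not the project's defaults
depths = [1, 5, 10]
shots = [100, 1000]

partitions = [
    {"qubits": q, "depth": d, "shots": s}
    for q, d, s in product(qubits, depths, shots)
]
print(len(partitions))  # 3 * 3 * 2 = 18 independent simulation jobs
```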

For details on the output, see the [Data Structure Section](#floppy_disk-data-structure).


@@ -92,51 +92,61 @@ For details on the output, see the [Data Structure Section](#floppy_disk-data-st
Check out the pre-defined VSCode tasks if you want to develop on the project.

***
-### Tuning the test circuits
+## Configuration :wrench:
+
+### Tweaking the Partitions

Circuits are being generated in the ```data_generation``` namespace of the project.
-To adjust the number of qubits, depth of the circuit and other parameters, checkout [conf/base/parameters/data_generation.yml](/conf/base/parameters/data_generation.yml).
+To adjust the number of qubits, depth of the circuit, enabled frameworks and more, check out [conf/base/parameters/data_generation.yml](/conf/base/parameters/data_generation.yml).

-### Selecting a Framework and Execution behaviour
+### Tweaking the Execution Behaviour

Everything related to executing the circuits and time measurements is contained in the ```data_science``` namespace.
Head to [conf/base/parameters/data_science.yml](/conf/base/parameters/data_science.yml) to specify a framework and set e.g. the number of evaluations.

-### :eyeglasses: Pipeline
+### Tweaking the Visualization

+So far, there is no specific Kedro-style configuration.
+The generated plots can be adjusted using the `design` class located in [src/quafel/pipelines/visualization/nodes.py](src/quafel/pipelines/visualization/nodes.py).
+Propagating these settings to a `.yml` file is on the agenda!
+
+### Pipeline :eyeglasses:

You can actually see what's going on by running
```
poetry run kedro-viz
```
which will open a browser with [kedro-viz](https://github.com/kedro-org/kedro-viz) showing the pipeline.

![kedro-viz view of the pipeline](docs/kedro_view.png)

-### :floppy_disk: Data Structure
+## Data Structure :floppy_disk:

- [data/01_raw](data/01_raw):
-  - [Evaluation Matrix](data/01_raw/dataset.json) containing all valid values for ```frameworks```, ```qubits```, ```depths```, and ```shots``` as specified in the [data_generation.yml](conf/base/parameters/data_generation.yml) file.
+  - **Versioned** [Evaluation Matrix](data/01_raw/dataset.json) containing all valid values for ```frameworks```, ```qubits```, ```depths```, and ```shots``` as specified in the [data_generation.yml](conf/base/parameters/data_generation.yml) file.
- [data/02_intermediate](data/02_intermediate):
-  - Evaluation Partitions split into single ```.csv``` files
+  - Evaluation Partitions split into single ```.csv``` files.
+  - The number of partitions depends on the configuration.
- [data/03_qasm_circuits](data/03_qasm_circuits/):
-  - as the name suggests, all generated qasm circuits for the job with the corresponding id
+  - A QASM circuit for each partition.
- [data/04_execution_results](data/04_execution_results/):
-  - simulator results of the job with the corresponding id.
-  - result formats are unified as a dictionary with the keys containing the binary bit representation of the measured qubit and the normalized counts as values.
-  - results are zero padded, so it is ensured that also state combinations with $0$ probability are represented.
+  - Simulator results of the job with the corresponding id.
+  - Result formats are unified as a dictionary whose keys are the binary bitstring representations of the measured qubits and whose values are the normalized counts.
+  - Results are zero-padded, so state combinations with $0$ probability are also represented (see the zero-padding sketch after this list).
- [data/05_execution_durations](data/05_execution_durations/):
-  - duration for the simulation of the job with the corresponding id.
-  - duration is only measured for the execution of the simulator
-  - combining and post-processing results (to obtain the dictionary representation) is not involved
+  - Duration for the simulation of the job with the corresponding id.
+  - The duration is only measured for the execution of the simulator (see the timing sketch after this list).
+  - Combining and post-processing the results (to obtain the dictionary representation) is not included.
- [data/06_evaluations_combined](data/06_evaluations_combined/):
  - **Versioned** dataset combining the input parameters (```framework```, ```qubits```, ```depth```, ```shots```), the measured duration, and the simulator results.
- [data/07_reporting](data/07_reporting):
  - **Versioned** dataset with the ```.json``` formatted plotly heatmaps.
  - The data in this folder is named by the framework and the fixed parameter. E.g. when the number of ```qubits``` is plotted against the ```shots``` and the ```qiskit_fw``` is being used to simulate a circuit of ```depth``` $3$, the filename would be ```qiskit_fw_depth_3```.
- [data/08_print](data/08_print):
  - Print-ready output of the visualization pipeline in `pdf` and `png` format.

Note that all datasets that are not marked as "**versioned**" will be overwritten on the next run!
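To make the unified result format concrete, here is a minimal sketch of normalizing and zero-padding raw counts. It is an assumption for illustration, not the project's actual code; only the bitstring construction is taken from this README.

```python
# Hypothetical helper: unify raw counts into the format described above,
# i.e. one key per bitstring, normalized counts as values, zero-padded so
# that states with probability 0 appear as well.
def unify_counts(raw_counts: dict, n_qubits: int) -> dict:
    total = sum(raw_counts.values())
    bitstrings = [format(i, f"0{n_qubits}b") for i in range(2**n_qubits)]
    return {b: raw_counts.get(b, 0) / total for b in bitstrings}

print(unify_counts({"00": 900, "11": 100}, n_qubits=2))
# -> {'00': 0.9, '01': 0.0, '10': 0.0, '11': 0.1}
```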
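Similarly, a sketch of the timing scheme described above: only the simulator call itself is timed, while combining and post-processing are excluded. Again an assumption for illustration; `run_simulator` is a hypothetical stand-in for a framework's execution step.

```python
import time

def run_simulator():
    # Placeholder for the actual simulator execution.
    time.sleep(0.01)
    return {"00": 900, "11": 100}

start = time.perf_counter()
raw_counts = run_simulator()                # only this call contributes to the duration
duration = time.perf_counter() - start

total = sum(raw_counts.values())            # post-processing, deliberately not timed
normalized = {k: v / total for k, v in raw_counts.items()}
print(f"simulation took {duration:.4f} s")
```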

-## :construction: Adding new frameworks
+## Adding new frameworks :construction:

New frameworks can easily be added by editing the [frameworks.py](src/quafel/pipelines/data_science/frameworks.py) file.
Frameworks are defined by classes following the ```NAME_fw``` naming template, where ```NAME``` should be replaced by the name of the framework being implemented.
@@ -162,7 +172,7 @@ This dictionary is required to contain all combinations of bitstrings that resul
```python
bitstrings = [format(i, f"0{self.n_qubits}b") for i in range(2**self.n_qubits)]
```
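Putting the pieces together, a hypothetical skeleton of such a class might look as follows. Only the ```NAME_fw``` naming template and the result-dictionary requirement come from this README; the constructor signature and the `execute` method name are assumptions.

```python
class dummy_fw:
    """Hypothetical framework skeleton; not part of the actual codebase."""

    def __init__(self, qasm_circuit: str, n_shots: int):
        self.qasm_circuit = qasm_circuit
        self.n_shots = n_shots
        self.n_qubits = 2  # in practice this would be parsed from the QASM header

    def execute(self) -> dict:
        # Must return all 2**n_qubits bitstrings, even those with probability 0.
        bitstrings = [format(i, f"0{self.n_qubits}b") for i in range(2**self.n_qubits)]
        return {b: 1 / len(bitstrings) for b in bitstrings}  # uniform stand-in result
```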

+<!--
## :runner: Dask Setup
When running
@@ -173,4 +183,4 @@ without any additional configuration, Kedro creates Dask scheduler and also four
This behavior can be controlled in [conf/dask/parameters.yml](conf/dask/parameters.yml).
Setting the `address` parameter will cause Kedro to try to connect to an existing scheduler at the specified address.
You can create a scheduler and $N$ workers by running `.vscode/spawn_n_workers.sh -N` from the root folder of the project.
-Alternatively set `n_workers` in [conf/dask/parameters.yml](conf/dask/parameters.yml) and comment out `address` to specify the number of workers that kedro should spawn.
+Alternatively, set `n_workers` in [conf/dask/parameters.yml](conf/dask/parameters.yml) and comment out `address` to specify the number of workers that Kedro should spawn. -->