First try at scale transforms #134

Merged

Conversation

melissalinkert (Member)

See ome/ngff#57, follow-up to #121.

The main idea is to preserve physical sizes in the Zarr metadata itself, so that downstream applications don't have to read those values from the OME-XML. There is an open question of what to do for subresolutions; OME-XML won't store any physical sizes for those, so we could either omit them or store a calculated value.
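
As a rough sketch of that mapping (the IMetadata getters are real Bio-Formats API; the helper class, the TCZYX layout, and the µm conversion are illustrative assumptions, not code from this PR):

import loci.formats.meta.IMetadata;
import ome.units.UNITS;
import ome.units.quantity.Length;

// Hypothetical helper: map the OME physical pixel sizes for one series
// onto a TCZYX scale vector, using 1.0 whenever no size is recorded.
public class PhysicalScaleSketch {
  public static double[] physicalScale(IMetadata meta, int series) {
    Length x = meta.getPixelsPhysicalSizeX(series);
    Length y = meta.getPixelsPhysicalSizeY(series);
    Length z = meta.getPixelsPhysicalSizeZ(series);
    return new double[] {
      1.0,                                                       // t
      1.0,                                                       // c
      z == null ? 1.0 : z.value(UNITS.MICROMETER).doubleValue(), // z
      y == null ? 1.0 : y.value(UNITS.MICROMETER).doubleValue(), // y
      x == null ? 1.0 : x.value(UNITS.MICROMETER).doubleValue()  // x
    };
  }
}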

/cc @sbesson @dgault (since I can't add external reviewers)

@sbesson sbesson left a comment (Member)

Some feedback based on the initial conversion of a multi-image CZI file. In general, completely agreed on limiting this implementation to mapping OME.PixelSize{X,Y,Z} into scale, as this is the minimal transformation allowing compliance with the 0.4 specification.

Looking at the content of the .zattrs for one of the multiscales images:

{
  "multiscales" : [
    {
      "metadata" : {
        "method" : "loci.common.image.SimpleImageScaler",
        "version" : "Bio-Formats 6.8.0"
      },
      "axes" : [
        {
          "name" : "t",
          "type" : "time"
        },
        {
          "name" : "c",
          "type" : "channel"
        },
        {
          "name" : "z",
          "type" : "space"
        },
        {
          "unit" : "µm",
          "name" : "y",
          "type" : "space"
        },
        {
          "unit" : "µm",
          "name" : "x",
          "type" : "space"
        }
      ],
      "datasets" : [
        {
          "path" : "0",
          "transformations" : [
            {
              "scale" : [
                0.21998680079195246,
                0.21998680079195246
              ],
              "type" : "scale",
              "axisIndices" : [
                3,
                4
              ]
            }
          ]
        },
        {
          "path" : "1",
          "transformations" : [
            {
              "scale" : [
                0.21998680079195246,
                0.21998680079195246
              ],
              "type" : "scale",
              "axisIndices" : [
                3,
                4
              ]
            }
          ]
        },
        {
          "path" : "2",
          "transformations" : [
            {
              "scale" : [
                0.21998680079195246,
                0.21998680079195246
              ],
              "type" : "scale",
              "axisIndices" : [
                3,
                4
              ]
            }
          ]
        },
...
  • axisIndices has been reverted in the latest push for the 0.4 coordinateTransformations specification in favor of simplified scale and translate vectors - see https://github.com/ome/ngff/pull/85/files. Unless the goal is to track the older draft, this PR should follow the updated specification
  • scale should include all dimensions and use 1.0 when there is no metadata mapping to the physical space
  • unlike the Bio-Formats case, where we read native sub-resolutions, my understanding is that the Converter always generates the pyramidal levels with a known downsampling factor. In that case, I think the scale vector should be computed for each dataset using the original pixel size and the downsampling factor (see the sketch below).
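
For instance, with the pixel size from the .zattrs above and a fixed factor-2 downsampling, the per-level values would be computed along these lines (a sketch; pixelSize, pyramidScale and resolutions are illustrative names, not the converter's actual fields):

// r = 0 -> 0.2199868..., r = 1 -> 0.4399736..., r = 2 -> 0.8799472...
double pixelSize = 0.21998680079195246; // µm, from OME PhysicalSizeX/Y
int pyramidScale = 2;                   // fixed downsampling factor
int resolutions = 3;
for (int r = 0; r < resolutions; r++) {
  double levelScale = pixelSize * Math.pow(pyramidScale, r);
}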

@melissalinkert (Member Author)

I think d6fcc38 addresses everything in #134 (review).

@sbesson (Member) commented Mar 4, 2022

With the last commit, running a conversion on a real pyramidal input file leads to the expected layout:

{
  "multiscales" : [
    {
      "metadata" : {
        "method" : "loci.common.image.SimpleImageScaler",
        "version" : "Bio-Formats 6.8.0"
      },
      "axes" : [
        {
          "name" : "t",
          "type" : "time"
        },
        {
          "name" : "c",
          "type" : "channel"
        },
        {
          "name" : "z",
          "type" : "space"
        },
        {
          "unit" : "µm",
          "name" : "y",
          "type" : "space"
        },
        {
          "unit" : "µm",
          "name" : "x",
          "type" : "space"
        }
      ],
      "datasets" : [
        {
          "path" : "0",
          "coordinateTransformations" : [
            {
              "scale" : [
                1.0,
                1.0,
                1.0,
                0.21998680079195246,
                0.21998680079195246
              ],
              "type" : "scale"
            }
          ]
        },
        {
          "path" : "1",
          "coordinateTransformations" : [
            {
              "scale" : [
                1.0,
                1.0,
                1.0,
                0.4399736015839049,
                0.4399736015839049
              ],
              "type" : "scale"
            }
          ]
        },
        {
          "path" : "2",
          "coordinateTransformations" : [
            {
              "scale" : [
                1.0,
                1.0,
                1.0,
                0.8799472031678098,
                0.8799472031678098
              ],
              "type" : "scale"
            }
          ]
        },
        {
          "path" : "3",
          "coordinateTransformations" : [
            {
              "scale" : [
                1.0,
                1.0,
                1.0,
                1.7598944063356197,
                1.7598944063356197
              ],
              "type" : "scale"
            }
          ]
        },
        {
          "path" : "4",
          "coordinateTransformations" : [
            {
              "scale" : [
                1.0,
                1.0,
                1.0,
                3.5197888126712393,
                3.5197888126712393
              ],
              "type" : "scale"
            }
          ]
        },
        {
          "path" : "5",
          "coordinateTransformations" : [
            {
              "scale" : [
                1.0,
                1.0,
                1.0,
                7.039577625342479,
                7.039577625342479
              ],
              "type" : "scale"
            }
          ]
        },
        {
          "path" : "6",
          "coordinateTransformations" : [
            {
              "scale" : [
                1.0,
                1.0,
                1.0,
                14.079155250684957,
                14.079155250684957
              ],
              "type" : "scale"
            }
          ]
        },
        {
          "path" : "7",
          "coordinateTransformations" : [
            {
              "scale" : [
                1.0,
                1.0,
                1.0,
                28.158310501369915,
                28.158310501369915
              ],
              "type" : "scale"
            }
          ]
        },
        {
          "path" : "8",
          "coordinateTransformations" : [
            {
              "scale" : [
                1.0,
                1.0,
                1.0,
                56.31662100273983,
                56.31662100273983
              ],
              "type" : "scale"
            }
          ]
        },
        {
          "path" : "9",
          "coordinateTransformations" : [
            {
              "scale" : [
                1.0,
                1.0,
                1.0,
                112.63324200547966,
                112.63324200547966
              ],
              "type" : "scale"
            }
          ]
        }
      ],
      "version" : "0.2"
    }
  ]
}

The unit test expansion should also cover the scale values.
Converting a minimal test.fake without any physical size also produces compliant output:

{
  "multiscales" : [
    {
      "metadata" : {
        "method" : "loci.common.image.SimpleImageScaler",
        "version" : "Bio-Formats 6.8.0"
      },
      "axes" : [
        {
          "name" : "t",
          "type" : "time"
        },
        {
          "name" : "c",
          "type" : "channel"
        },
        {
          "name" : "z",
          "type" : "space"
        },
        {
          "name" : "y",
          "type" : "space"
        },
        {
          "name" : "x",
          "type" : "space"
        }
      ],
      "datasets" : [
        {
          "path" : "0",
          "coordinateTransformations" : [
            {
              "scale" : [
                1.0,
                1.0,
                1.0,
                1.0,
                1.0
              ],
              "type" : "scale"
            }
          ]
        },
        {
          "path" : "1",
          "coordinateTransformations" : [
            {
              "scale" : [
                1.0,
                1.0,
                1.0,
                1.0,
                1.0
              ],
              "type" : "scale"
            }
          ]
        }
      ],
      "version" : "0.2"
    }
  ]
}

This example raised a question related to the following sentence in the 0.4 specification: "If scaling information is not available or applicable for one of the axes set it to 1." Given the absence of physical metadata, I would consider the scale vectors used above to be in agreement with this statement.
However, one could certainly argue that all resolution levels beyond the first should include the downsampling factor, i.e. the expectation would be:

        {
          "path" : "1",
          "coordinateTransformations" : [
            {
              "scale" : [
                1.0,
                1.0,
                1.0,
                2.0,
                2.0
              ],
              "type" : "scale"
            }

@constantinpape is there an authoritative answer to this concern? Depending on the outcome, happy to open a PR or an issue against ome/ngff related to this clarification.

Once this is settled, the changes proposed here all look good to me and should allow us to bump the multiscales version as proposed in #135.

@constantinpape

> @constantinpape is there an authoritative answer to this concern? Depending on the outcome, happy to open a PR or an issue against ome/ngff related to this clarification.

Good catch; in this case we still need to add scaling information for the axes that are actually downscaled, so the second snippet for dataset "1" would be correct (otherwise the spaces wouldn't align). It would be good if you could open an issue / PR about this in ngff.
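
To make the alignment argument concrete (indices and factors are illustrative): an array index maps to a physical position as index * scale, so a downsampled level only lines up with the full-resolution level if its scale carries the downsampling factor.

// factor-2 downsample along x: level-1 index 100 covers the same
// location as level-0 index 200 only when level 1's scale is doubled
double level0 = 200 * 1.0; // 200.0
double level1 = 100 * 2.0; // 200.0 -- aligned
double naive  = 100 * 1.0; // 100.0 -- misaligned if scale stays at 1.0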

@sbesson sbesson left a comment (Member)

https://ngff.openmicroscopy.org/0.4/ has been updated to clarify that the scale vector should at minimum contain information about the downsampling of each resolution level, i.e. in the case of data without physical sizes downsampled by a factor of 2 along X and Y:

[1.0, 1.0, 1.0, 1.0, 1.0] for the first resolution level
[1.0, 1.0, 1.0, 2.0, 2.0] for the second resolution level
[1.0, 1.0, 1.0, 4.0, 4.0] for the third resolution level
...

double resolutionScale = Math.pow(PYRAMID_SCALE, r);
for (int i=axisOrder.length()-1; i>=0; i--) {
  Quantity axisScale = getScale(meta, series, axisOrder, i);
  if (axisScale != null) {
Member

This will need to be updated to store the following values (a sketch follows the list):

  • axisScale.value().doubleValue() * resolutionScale if downsampling happens along the dimension and there is some scaling metadata associated with the dimension
  • resolutionScale if downsampling happens along the dimension and there is no scaling metadata associated with the dimension
  • 1.0 otherwise
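
A minimal sketch of that logic, assuming a hypothetical isDownsampled(i) predicate for "this dimension is reduced between resolution levels"; axisScale and resolutionScale are the variables from the snippet above:

double scaleValue;
if (isDownsampled(i) && axisScale != null) {
  // downsampled dimension with physical metadata: combine both factors
  scaleValue = axisScale.value().doubleValue() * resolutionScale;
}
else if (isDownsampled(i)) {
  // downsampled dimension without physical metadata
  scaleValue = resolutionScale;
}
else {
  scaleValue = 1.0;
}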

Member Author

I think 21633ca covers this, and will use axisScale.value().doubleValue() if scaling metadata is associated with the dimension but downsampling does not happen. If that case should actually use 1.0, I can update.

@sbesson sbesson left a comment (Member)

Tested the last commit with several real and fake datasets, covering different combinations of output. The transformation metadata is stored as expected, including in the case where no physical metadata exists.

The expanded and new unit tests should cover the different scenarios (dimension with scaling information and downsampling, dimension with no scaling information and downsampling, dimension with scaling information and no downsampling, dimension with no scaling information and no downsampling).

Good to merge from my side
