New image buffer #570
Conversation
Sounds good, thanks! Also happy to discuss over Zoom if that's easier.
Thanks for all your hard work on this @henrypinkard, it's very promising!
I took a look over and compiled it for testing pymmcore-plus. I had to make a few changes to the C++ code in order to get pymmcore to compile, and I've tried to mark all of those places inline below.
A few pymmcore-plus tests failed with the changes in metadata that are included by default, specifically the timing metadata, which probably can't be moved to a non-default level without a deprecation.

As long as `core.enableV2Buffer` is not True, everything else works fine. If I do enable the v2 buffer, I have one test failing, and it relates to using hardware sequencing with multi-cam. I haven't quite nailed it down yet, so I don't have much useful info at the moment... but for some reason, the buffer seems to return more images than expected (i.e. `getRemainingImageCount` keeps being non-zero beyond what I would have expected it to, and beyond what it does without `core.enableV2Buffer` enabled). Will let you know when I learn more.

Thanks again!
MMCore/MMCore.h
```cpp
/** \name v2 buffer control. */
///@{
void enableV2Buffer(bool enable) throw (CMMError);
bool usesV2Buffer() const { return bufferManager_->IsUsingV2Buffer(); }
```
Feels quite similar to feature flags... Could we use `enableFeature`/`isFeatureEnabled` instead of adding two new public methods?
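That is, something like the following from application code (a sketch; `enableFeature`/`isFeatureEnabled` already exist on CMMCore, while the feature name here is just a placeholder pending the naming discussion below):

```cpp
// Instead of core.enableV2Buffer(true) / core.usesV2Buffer():
CMMCore::enableFeature("V2Buffer", true);              // placeholder feature name
bool usingV2 = CMMCore::isFeatureEnabled("V2Buffer");
```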
Fine by me. Do you agree, @marktsuchida?
Yes. And it would be good to take into account the guidelines in CoreFeatures.cpp: for example, make sure that the newly added methods that require the V2 buffer throw an error if invoked without enabling the feature (if they don't already).
I'd maybe also suggest using a name like `NewSequenceBuffer` or `SequenceBuffer2025` rather than "V2", since we probably don't want to appear to have a versioning scheme separate from MMCore's. It could also be a name indicating its properties, such as `FastSequenceBuffer`.
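To make the guard from the CoreFeatures guidelines concrete, each V2-only entry point might start with something like this (a sketch; the method name is hypothetical and the feature name follows the suggestion above):

```cpp
void CMMCore::popNextImagePointerExample()  // hypothetical V2-only method
{
   if (!isFeatureEnabled("NewSequenceBuffer"))
      throw CMMError("This method requires the NewSequenceBuffer feature to be enabled");
   // ... proceed using the new buffer ...
}
```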
I don't see a way to do this with the feature mechanism, because enabling or disabling the v2 buffer requires additional actions in the core:

```cpp
int ret = bufferManager_->EnableV2Buffer(enable);

// Default include when using circular buffer, exclude new buffer
imageMDIncludeLegacyCalibration_ = !enable;
imageMDIncludeSystemStateCache_ = !enable;
imageMDIncludeCameraTags_ = !enable;
```

and CoreFeatures seems to just be boolean flags without a pointer to the Core itself. @marktsuchida, can you clarify what the preferred behavior is here?
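One possible resolution, sketched under the assumption that CoreFeatures could be extended (nothing like this exists there today, and it would mean giving the feature registry access to the core instance): attach an optional side-effect hook to each feature entry, so flipping the flag can run the extra actions.

```cpp
#include <functional>
#include <map>
#include <string>

// Hypothetical extension of CoreFeatures: a boolean flag plus an optional hook.
struct Feature {
   bool enabled = false;
   std::function<void(CMMCore&, bool)> onChange;  // runs whenever the flag flips
};

void CMMCore::enableFeature(const std::string& name, bool enable)
{
   Feature& f = features_[name];  // features_: hypothetical per-core std::map<std::string, Feature>
   f.enabled = enable;
   if (f.onChange)
      f.onChange(*this, enable);  // e.g. calls bufferManager_->EnableV2Buffer and
                                  // sets the imageMDInclude* flags shown above
}
```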
```cpp
 * The caller should have locked the camera device, or be calling from a thread
 * in the camera (e.g. CoreCallback::InsertImage)
 */
void CMMCore::addCameraMetadata(std::shared_ptr<CameraInstance> pCam, Metadata& md, unsigned width, unsigned height,
```
```
mmCoreAndDevices/MMCore/MMCore.cpp:2706:15: error: exception specification in declaration does not match previous declaration
 2706 | void CMMCore::addCameraMetadata(std::shared_ptr<CameraInstance> pCam, Metadata& md, unsigned width, unsigned height,
      |               ^
```
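For anyone else hitting this: C++ requires a dynamic exception specification to match between declaration and definition, so `throw (CMMError)` has to appear on both or on neither. A trimmed illustration (the real signature has more parameters):

```cpp
// MMCore.h -- declaration carries the exception specification:
void addCameraMetadata(std::shared_ptr<CameraInstance> pCam, Metadata& md) throw (CMMError);

// MMCore.cpp -- the definition must repeat it exactly:
void CMMCore::addCameraMetadata(std::shared_ptr<CameraInstance> pCam, Metadata& md) throw (CMMError)
{
   // ...body unchanged...
}
```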
Not sure about this. Are you still seeing it with the latest code?
Thanks for the review @tlambert03! Are you on Mac/Linux? (I've only tested on Windows) Will address some of those specific bugs later.
I agree that the metadata needs to be discussed and figured out before merging.

Metadata

I favor limiting default metadata to minimize performance impact. Broken dependencies will become apparent and can be fixed, whereas including everything causes performance hits without clear attribution. We may also need a more nuanced approach, like enabling certain metadata in MMCoreJ or pymmcore. I think we first need to pool knowledge about the various pieces of metadata, their original purpose, and the present reasons for their use. I have only limited knowledge of this. @nicost @marktsuchida @jondaniels @go2scope please chime in.

Specific metadata (see the new consolidated function):
Width, Height, PixelType, Camera
Bit Depth
ROI/Binning
Timestamps/image number
Camera-specific Tags
Pixel-Size/affine transform
System state cache
Okay sounds good, let me know. If it's only breaking multi-cam with the v2 buffer, I'd say it's less critical, because the v2 buffer makes the workaround provided by the multi-camera adapter unnecessary. But it couldn't hurt to fix if you're able to nail it down.

Not urgent, but if you have a test suite for core functionality, you might consider in the future moving it to https://github.com/micro-manager/mmpycorex, so that it could also cover functionality via MMCoreJ
For the metadata, I think it is fine to default to a minimal set (or even empty) when using V2, given that you are providing control over including/excluding each group (using flags, as proposed above). MMStudio and other legacy code (Clojure engine) will need to start by enabling all of the flags, but the flags provide a nice way to migrate. Some comments on the individual groups:
This is the actual sample bit depth (e.g., 12 or 14 bit), not the sample format (8 or 16 bit).
It looks like this is

Potentially important if allowing

As for the actual camera-specific tags (which cameras send via
Excluding by default seems reasonable. Although I do think we need a solution for pixel sizes when using multiple cameras; that can be a separate discussion, including whether the mapping should be done by MMCore or by the application.
I agree. "Legacy metadata" is a good name for these. Or maybe "additional legacy metadata", because some of the others feel a bit legacy, too. Both the Clojure engine and MMStudio's use of AcqEngJ (in combination with the Datastores receiving the images from them) could potentially be relying on these fields being present under certain conditions, so it's good that they can be switched on until confirmed to be unnecessary.
Are we talking here about per-image data (I believe so, but good to get this clear)?
For area-sensor-like cameras, this is critical information. Data consumers need access to it. It may even change within a stream of data (several camera APIs have the option to send separate ROIs, all from the same exposure).

Ideally, each camera would add its own timestamp (and sequential image number). Knowing the exact time the exposure happens is critical for many applications. The problem has been that many camera device adapters do not provide this information, hence the workaround of recording the time the data arrived in the core (which also allows untangling camera time from computer time). I am not sure where this should live, but there has to be a way to record this data.

There needs to be a way to get this information from the image data. This may be stored just in summary metadata.

In an ideal world, it would always be possible to reconstruct the complete system state at the moment the image was taken. I believe that was the reason to include the complete system state cache. The cost was relatively high, limiting data rates to one or a few kHz, which was very detrimental for high-speed acquisition of many small images. In reality, changes in the system state during fast acquisitions are often not even recorded in the system state cache. It seems most useful to me to store the system state in the summary metadata, and then add known deviations from it to the image metadata. For hardware-synchronized acquisitions, it may be problematic that only the acquisition engine has complete knowledge of the (desired) state of the system. In that sense, adding the recipe for the desired changes to the summary metadata could suffice.
I noticed that a few new functions (

I would agree with a cautious approach here, because it would be bad if we have cameras that only work with V1 or V2. It would also be bad if every camera that supports
Answers to questions:
Makes sense.
Right, I just meant you can assume 12, 14, etc. map to 16. But maybe the performance hit is small enough that it can just be included.

Hmmm, I guess I am unfamiliar with this mechanism. But it's another one that's not a significant performance hit.
Agree
This might have been an accidental omission -- but I think perhaps the best strategy is to add them to ensure they work with the design, but keep them commented out for now. Thoughts?

Agree that we don't want cameras to have to be aware of which buffer they write into. It is already the case that
But I'm not sure how this could work without potential memory safety issues. Seems like it would be a shame to not allow cameras to write directly into the buffer just because v1 doesn't support it. Are there other ways to handle this?
Correct. Summary metadata is generated by the acquisition engine, not the core.
Okay so default include I suppose
Already is for AcqEngJ, so that argues to default exclude this one
In the existing implementation, this isn't exactly what is happening, since the system state cache is added at the time of image readout, not image acquisition
I'm not sure storing the changes would solve the performance problem, because (if I'm not mistaken) you'd still have to iterate through the full system state cache on each image to look for changes. One way to do this might be to have a thread in the core listen to callbacks from devices, and update what should be added to image metadata when it gets notified of a change. But this might require a good amount of work and should probably be a separate PR.
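A rough sketch of that idea, assuming the core can hang a listener off the existing property-change notifications (class and method names here are hypothetical):

```cpp
#include <map>
#include <mutex>
#include <string>
#include <utility>

// Accumulate only the properties that changed since the last image, so
// per-image metadata never has to walk the full system state cache.
class StateDeltaTracker {
   std::mutex mutex_;
   std::map<std::string, std::string> delta_;  // "Device-Property" -> new value
public:
   // Hooked into the core's property-change callback.
   void OnPropertyChanged(const std::string& device, const std::string& prop,
                          const std::string& value) {
      std::lock_guard<std::mutex> lock(mutex_);
      delta_[device + "-" + prop] = value;
   }
   // Called once per inserted image: hand over the accumulated changes.
   std::map<std::string, std::string> TakeDelta() {
      std::lock_guard<std::mutex> lock(mutex_);
      return std::exchange(delta_, {});
   }
};
```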
Adding the recipe for changes essentially already happens in AcqEngJ by converting acquisition events to image metadata. I think given all these considerations, I would propose:
Potential metadata plan. Are we in agreement on this?

Always include
Width, Height, PixelType, Camera

Default include
Bit Depth

Default include v1, default exclude v2 (+ add to summary metadata)
Camera-specific Tags

Default exclude
Legacy metadata: FrameIndex, SliceIndex, etc.
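If it helps, the plan expressed against the include flags from `enableV2Buffer` earlier in this thread (only the flags shown there are confirmed to exist; the legacy flag name is my invention):

```cpp
// Sketch of the proposed defaults when toggling the new buffer:
// Width, Height, PixelType, Camera: always written, no flag needed.
imageMDIncludeCameraTags_       = !enable;  // default include on v1, exclude on v2
imageMDIncludeSystemStateCache_ = !enable;  // per the discussion above
imageMDIncludeLegacyMD_         = false;    // hypothetical flag: FrameIndex, SliceIndex, etc.
```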
Sounds good. As for the System State Cache, there currently is a mechanism to switch that off if desired (MMCoreJ.i::includeSystemStateCache_). I am all for cleaning that up, but we will need a way to continue switching it off when using V1 as well (hoping to start V2 soon on the Java side, but expecting that will take some time).

I moved that method to the core and out of the SWIG layer, but will keep it indefinitely for backwards compatibility.

Added system state cache to summary metadata: micro-manager/AcqEngJ#127
Okay, I think I've addressed everything except for the two remaining unresolved comments above. I think what to do about acquiring write slots can be addressed in a future PR, but it would be good to figure out what the eventual strategy will be.
This was indeed an accidental omission. I've added them to the interface, but this is commented out for now
I considered creating an intermediate compatibility buffer that would temporarily store data when a camera acquires a write slot with

I think the path forward is to enable write slot acquisition in a future PR after additional testing, with the requirement that camera device adapters using the acquire/release slot feature must use the new buffer. Since the performance testing shows
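For context, the acquire/release pattern under discussion would presumably look something like this from a camera adapter (all names here are guesses; the PR keeps the actual interface commented out for now):

```cpp
// Hypothetical direct-write flow: the camera writes pixels straight into the
// v2 buffer instead of its own frame buffer, avoiding one copy.
unsigned char* dest = nullptr;
core_->AcquireImageWriteSlot(imgWidth * imgHeight * bytesPerPixel, &dest); // reserve space
ReadOutSensorInto(dest);                                                   // hypothetical camera readout
core_->ReleaseImageWriteSlot(dest, md);                                    // publish image + metadata
```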
I'm finally getting around to looking at the details of the image retrieval API for the V2 buffer. Please correct me if I'm misunderstanding anything below.

Using the new mechanism, the app (let's look at Java for now) calls

On the other hand, if the app obtains a

It's also generally hard to see what all the problems are that could arise from the lifetime management (or lack thereof) of buffer slots -- a problem in itself. I think the only safe way to deal with this is to explicitly share ownership of the buffer slots (including the memory backing them) between MMCore and the app. This could be done by managing the slots with

After having written the above, I realized that you don't have separate buffers for each slot (like the V1 buffer) but rather one big, contiguous buffer. You can still have

(It is not clear to me what the advantage of the contiguous buffer is. You end up using

As for the Java API for things that need to be explicitly "released" by user code, it should conform to the

(Ideally we also automatically release the

I'm afraid I cannot recommend merging this until these buffer lifetime issues are addressed. If possible, it might be productive to split this PR into two: one that cleans up the metadata handling without introducing the V2 buffer, and one that purely introduces the V2 buffer. That would speed up reviewing the changes to the metadata handling (which I still need to take another look at -- I'm mostly happy with it, but it's easier to make 100% sure there are no unknown changes in behavior than to later troubleshoot the existing (sometimes hacky) application code that might depend on exact behavior).
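A sketch of the shared-ownership proposal, assuming slots are handed out as `std::shared_ptr` whose deleter returns the slot to the buffer (the `DataBuffer` methods used here are hypothetical):

```cpp
#include <memory>

// The buffer keeps a slot alive as long as any handed-out pointer exists.
std::shared_ptr<const unsigned char> CheckOutSlot(DataBuffer* buf, size_t slot)
{
   buf->AddSlotRef(slot);                            // hypothetical refcount bump
   const unsigned char* px = buf->SlotPixels(slot);  // hypothetical accessor
   return std::shared_ptr<const unsigned char>(
      px, [buf, slot](const unsigned char*) {
         buf->ReleaseSlotRef(slot);                  // slot reusable only after last release
      });
}
```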
Thanks for taking a look. Your understanding is mostly correct -- but in a couple places I think you've misunderstood, and in fact the behavior you're advocating for is already implemented.
Correct
The clearing/deletion of the v2 buffer is handled differently than the v1 circular buffer for exactly this reason. The v1 buffer is cleared/reallocated every time before starting a sequence acquisition. The v2 buffer is not. For v2, we've now split this into two separate operations: clearing and resetting. Trying to clear when application code holds outstanding slots will throw an error. Reset is the more dangerous operation that has the problems you mention, which is why it's not simply slotted in everywhere the old circular buffer used to be cleared. For example, for the case of changing the buffer size, we have:

```cpp
void BufferManager::ReallocateBuffer(unsigned int memorySizeMB) {
   if (useNewDataBuffer_.load()) {
      int numOutstanding = newDataBuffer_->NumOutstandingSlots();
      if (numOutstanding > 0) {
         throw CMMError("Cannot reallocate NewDataBuffer: " + std::to_string(numOutstanding) +
                        " outstanding active slot(s) detected.");
      }
      delete newDataBuffer_;
      newDataBuffer_ = new DataBuffer(memorySizeMB);
   } else {
      delete circBuffer_;
      circBuffer_ = new CircularBuffer(memorySizeMB);
   }
}
```
You'd have to call
This is essentially what already happens (though not with
Empirical testing indicated that the current mechanism of memory-mapping a large buffer had the best performance. True about the fragmentation, though I don't think it is likely to happen much in practice (I can provide more detail if needed). In any case, this is an internal implementation detail that can always be changed in a future PR without breaking backwards compatibility.
In my opinion, 32-bit support should be retired.

I went back and forth on how forgiving the design should be about forgetting to call

I would say it's better not to upgrade to Java 9 first, in case that brings other unforeseen issues.
I understand the motivation for this, but unfortunately I think it would be very challenging. The metadata generation was tangled up in many other functions. I tried to do this very carefully to avoid unexpected changes in higher-level application code. At least it will be straightforward to implement fixes once issues are identified, since it is all centralized now. We could consider including legacy metadata by default on the v1 buffer (even though it would give a performance hit) so that unexpected things don't break.

Update: I split out changes to the circular buffer behavior into #588
@marktsuchida I've made changes based on our discussion yesterday:
Also some further explanation on this:
Note that the contiguous buffer is memory-mapped, so it's not the same as a regular contiguous allocation (which I tried first, and it was very slow). The combination of a contiguous memory mapping plus a slot-management system (free regions, etc.) is efficient because you never have to create new heap objects. The circular buffer is much slower to initialize (see the graph above), especially for small image sizes, because it pre-allocates many frameBuffers to hold its images. I'm not sure how the new buffer could stay flexible to different image sizes yet avoid this pre-allocation penalty without the current strategy of allocating one big block.
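To illustrate why the memory-mapped approach initializes quickly, a rough sketch of the reserve-on-demand behavior (POSIX shown; Windows would use VirtualAlloc -- this is my illustration, not the actual DataBuffer code):

```cpp
#include <sys/mman.h>
#include <cstddef>

// Reserve one large contiguous address range up front. The OS materializes
// physical pages only on first touch, so this is cheap even for gigabytes --
// unlike pre-allocating thousands of frame buffers on the heap (v1 behavior).
void* ReserveBuffer(std::size_t bytes)
{
   void* p = mmap(nullptr, bytes, PROT_READ | PROT_WRITE,
                  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
   return (p == MAP_FAILED) ? nullptr : p;
   // Slots are then carved out of this range by the free-region bookkeeping,
   // so inserting an image never needs a new heap allocation.
}
```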
The V2 buffer provides thread-safe, generic data storage with improved performance and cleaner abstractions.
Before merging
Design
Two core components:
DataBuffer: Thread-safe generic storage replacing CircularBuffer
BufferManager: Unified interface managing both legacy and new implementations
Key features of the new buffer system: it is thread-safe, it accepts arbitrary data sizes and types, it supports pointer-based (copy-free) image access, and it is designed to eventually let cameras write into it directly.
It can be enabled with `enableV2Buffer(true)`.
Performance
As a drop-in replacement for the circular buffer (i.e. copying the data the same number of times, but allowing arbitrary sizes and data types), the new buffer gives equal or better performance:
In sequence acquisitions:

In continuous sequence acquisitions (live mode):
It's significantly faster to allocate
Additionally, it has two key features that will enable much higher-performance code: pointer-based access to images from the application layer, and (in a future PR) letting cameras acquire write slots and write directly into the buffer.
Testing
I've written and validated the new buffer and circular buffer against many new tests here. (FYI these live in mmpycorex so they can easily test both MMCoreJ and pymmcore)
It also passes all the pycromanager acquisition tests, which test the various functionalities of the acquisition engine
Metadata
In conjunction with these changes, it made sense to standardize the metadata added to images. This was previously split among several places -- the SWIG wrapper, the core, the core callback, and the device code -- making it hard to keep track of and maintain. Some of it was generated at the time of image acquisition, and some at the time of image retrieval.

It has now all been consolidated into `void CMMCore::addCameraMetadata`, and the same metadata is added to all images whether snapped or passed through a buffer (with the small exception of some multi-camera device adapter-specific tags).

Testing reveals there's a substantial performance cost to adding so much metadata to all images:

Previously, much of this cost was incurred when reading images back out of the buffer. With the new changes, it is incurred at the time of insertion. However, I think it makes much more sense for this metadata to be added at insertion time, because that's when it's most likely to be in sync with the actual state of the hardware.

Since this consolidation takes place outside the BufferManager, it also affects the circular buffer and will change behavior even if the v2 buffer is disabled. We need to figure out what should be enabled here. It's unclear (to me) what higher-level code depends on which tags, but including the union of all of them by default will substantially hurt performance. I also have just a temporary function in the core API for controlling which metadata to add, which should perhaps be replaced with something more permanent.
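To make the consolidation concrete, the flow at insertion time is roughly as follows (a sketch; `InsertImage` stands in for whatever the buffer's insert entry point is actually named):

```cpp
// Single choke point for per-image metadata, run when the image is inserted,
// whether it was snapped or streamed from a sequence acquisition:
Metadata md;
addCameraMetadata(camera, md, width, height /*, ... */); // adds Width/Height/PixelType/Camera,
                                                         // plus the groups gated by the
                                                         // imageMDInclude* flags
bufferManager_->InsertImage(pixels, md);                 // name is my guess
```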
Multi-camera
While it is possible to use the v2 buffer with multi-camera setups, its flexibility (e.g. support for different image sizes, types, etc.) is a more general solution than the multi-camera device adapter, so in my opinion that adapter should be deprecated and application code that relies on it updated to the v2 buffer.

One addition here is `getLastTaggedImageFromDevicePointer("cameraLabel")`, which enables you to get the last image from a specific camera rather than having to search backwards through the most recent images and read their metadata. A sketch of possible usage follows.
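(A sketch only; I'm guessing at how the returned pointer object is handled, based on the pointer-based API described below.)

```cpp
// Grab the newest image from one specific camera in a multi-camera setup,
// without scanning back through interleaved images from other cameras:
auto img = core.getLastTaggedImageFromDevicePointer("Camera-B");
// ... read img->tags, optionally fetch the pixels ...
img->release();  // hand the slot back to the buffer when done
```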
A step towards a single route for all data

The pointer-based API gives a good opportunity to start moving towards a single route for all data, rather than separate routes for snaps and sequences. I don't think it's possible to fully do this without changing how cameras handle data for snaps, but in the meantime the `GetImagePointer` function now copies the snap buffer in camera adapters into the v2 buffer, returning a pointer to it. This should be faster than copying into the application memory space because it can be multithreaded, and it still allows pointer-based handling of the data from the application layer.

Pointer-based image handling
You get these through methods like `getLastImagePointer()`, which return a `TaggedImagePointer` object. This object is a wrapper around the `TaggedImage` object, but it will not load the pixels until you call `getPixels()`. If you never want to use the pixels, you can call `release()`, or just use the metadata without the pixels.