Commit a19d92f

Use OnceCell to cache Python operations
We were previously using `RefCell` to handle the interior mutability for the Python-operation caches. This works fine, but has two problems:

- `RefCell<Option<Py<PyAny>>>` is two pointers wide
- we had to do heavier dynamic runtime checks for the borrow status on every access, including simple gets.

This commit moves the caching to use `OnceCell` instead. The internal tracking for a `OnceCell<T>` is just whether an internal `Option<T>` is in the `Some` variant, and since `Option<Py<PyAny>>` uses the null-pointer optimisation for space, `OnceCell<Py<PyAny>>` is only a single pointer wide. Since it's not permissible to take out a mutable reference to the interior through a shared reference, there is also no dynamic borrow checking, only the null-pointer check that the cell is initialised, so access is faster as well.

We can still clear the cache if we hold a mutable reference to the `OnceCell`, and since every method that can invalidate the cache also necessarily changes Rust-space data, those methods already require mutable references to the cell, making it safe.

There is a risk here, though, in `OnceCell::get_or_init`. This function is not re-entrant; one cannot attempt the initialisation while another thread is initialising it. Typically this is not an issue, since the type isn't `Sync`; however, PyO3 allows types that are only `Send` and _not_ `Sync` to be put in `pyclass`es, which Python can then share between threads. This is usually safe due to the GIL mediating access to Python-space objects, so there are no data races. However, if the initialiser passed to `get_or_init` yields control at any point to the Python interpreter (such as by calling a method on a Python object), this gives the interpreter a chance to block the thread and allow another Python thread to run. The other thread could then attempt to enter the same cache initialiser, which is a violation of its contract.
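
The re-entrancy contract described above can be observed with the standard library alone. This is only an illustrative sketch: `reentrant_init_panics` is a hypothetical helper, a `u32` stands in for the cached Python object, and the nested `get_or_init` call plays the part of a second Python thread running the initialiser while the first one is suspended. With `std::cell::OnceCell`, the outer call panics when it finds the cell already populated.

```rust
use std::cell::OnceCell;
use std::panic::{catch_unwind, AssertUnwindSafe};

/// Returns `true` if a nested initialisation makes the outer `get_or_init` panic.
/// Hypothetical helper for illustration; it is not part of this commit.
fn reentrant_init_panics() -> bool {
    let cell: OnceCell<u32> = OnceCell::new();
    let outcome = catch_unwind(AssertUnwindSafe(|| {
        cell.get_or_init(|| {
            // Stand-in for "another thread ran the initialiser while we were
            // suspended": the cell is populated before the outer call finishes.
            let inner = *cell.get_or_init(|| 1);
            inner + 1
        });
    }));
    outcome.is_err()
}

fn main() {
    // The default panic hook also prints the panic message to stderr.
    println!("re-entrant init panicked: {}", reentrant_init_panics());
}
```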
PyO3's `GILOnceCell` can protect against this failure mode, but it is inconvenient for us to use, because it requires the GIL to be held to access the value at all, which is problematic during `Clone` operations. Instead, we make ourselves data-race safe by manually checking for initialisation, calculating the cache value ourselves, and then calling `set` or `get_or_init` passing the already created object. While the initialiser can still yield to the Python interpreter, the actual setting of the cell is now atomic in this sense, and thus safe.

The downside is that more than one thread might do the same initialisation work, but given how comparatively rare the use of Python threads is, it's better to prioritise memory use and single-Python-threaded access times over one-off cache population in multiple Python threads.
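
A minimal sketch of that check, compute, then set sequence, using only the standard library. The `Cached` type, its fields and the integer-parsing initialiser are hypothetical stand-ins; in the commit the fallible step is the one that may call into Python.

```rust
use std::cell::{Cell, OnceCell};
use std::num::ParseIntError;

/// Hypothetical stand-in for a struct with an on-demand cache.
struct Cached {
    raw: String,
    parsed: OnceCell<i64>,
    /// Counts initialiser runs, purely for demonstration.
    init_runs: Cell<u32>,
}

impl Cached {
    fn new(raw: &str) -> Self {
        Cached { raw: raw.to_owned(), parsed: OnceCell::new(), init_runs: Cell::new(0) }
    }

    /// Fallible initialiser, run outside the cell.
    fn compute(&self) -> Result<i64, ParseIntError> {
        self.init_runs.set(self.init_runs.get() + 1);
        self.raw.trim().parse::<i64>()
    }

    fn value(&self) -> Result<i64, ParseIntError> {
        // 1. Fast path: only checks whether the cell is populated.
        if let Some(value) = self.parsed.get() {
            return Ok(*value);
        }
        // 2. Build the value before touching the cell; an error leaves it empty.
        let value = self.compute()?;
        // 3. `set` fails only if the cell was populated in the meantime,
        //    in which case the existing entry is kept.
        let _ = self.parsed.set(value);
        Ok(value)
    }
}

fn main() {
    let ok = Cached::new(" 42 ");
    println!("{:?} then {:?}", ok.value(), ok.value());
    println!("initialiser runs: {}", ok.init_runs.get());
    println!("bad input is an error: {}", Cached::new("nope").value().is_err());
}
```

The cost of this shape is the one the message names: two callers that both miss the fast path each run the initialiser, and only one result is kept.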
1 parent 8e45a5a commit a19d92f

4 files changed: +73 -64 lines changed


crates/circuit/src/circuit_data.rs (+10 -10)

@@ -11,7 +11,7 @@
 // that they have been altered from the originals.

 #[cfg(feature = "cache_pygates")]
-use std::cell::RefCell;
+use std::cell::OnceCell;

 use crate::bit_data::BitData;
 use crate::circuit_instruction::{CircuitInstruction, OperationFromPython};
@@ -162,7 +162,7 @@ impl CircuitData {
                     params,
                     extra_attrs: None,
                     #[cfg(feature = "cache_pygates")]
-                    py_op: RefCell::new(None),
+                    py_op: OnceCell::new(),
                 });
                 res.track_instruction_parameters(py, res.data.len() - 1)?;
             }
@@ -218,7 +218,7 @@ impl CircuitData {
                     params,
                     extra_attrs: None,
                     #[cfg(feature = "cache_pygates")]
-                    py_op: RefCell::new(None),
+                    py_op: OnceCell::new(),
                 });
                 res.track_instruction_parameters(py, res.data.len() - 1)?;
             }
@@ -280,7 +280,7 @@ impl CircuitData {
             params,
             extra_attrs: None,
             #[cfg(feature = "cache_pygates")]
-            py_op: RefCell::new(None),
+            py_op: OnceCell::new(),
         });
         Ok(())
     }
@@ -542,7 +542,7 @@ impl CircuitData {
                         params: inst.params.clone(),
                         extra_attrs: inst.extra_attrs.clone(),
                         #[cfg(feature = "cache_pygates")]
-                        py_op: RefCell::new(None),
+                        py_op: OnceCell::new(),
                     });
                 }
             } else if copy_instructions {
@@ -554,7 +554,7 @@ impl CircuitData {
                         params: inst.params.clone(),
                         extra_attrs: inst.extra_attrs.clone(),
                         #[cfg(feature = "cache_pygates")]
-                        py_op: RefCell::new(None),
+                        py_op: OnceCell::new(),
                     });
                 }
             } else {
@@ -650,7 +650,7 @@ impl CircuitData {
                 inst.extra_attrs = result.extra_attrs;
                 #[cfg(feature = "cache_pygates")]
                 {
-                    *inst.py_op.borrow_mut() = Some(py_op.unbind());
+                    inst.py_op = py_op.unbind().into();
                 }
             }
             Ok(())
@@ -1142,7 +1142,7 @@ impl CircuitData {
             params: (!inst.params.is_empty()).then(|| Box::new(inst.params.clone())),
             extra_attrs: inst.extra_attrs.clone(),
             #[cfg(feature = "cache_pygates")]
-            py_op: RefCell::new(inst.py_op.borrow().as_ref().map(|obj| obj.clone_ref(py))),
+            py_op: inst.py_op.clone(),
         })
     }

@@ -1233,7 +1233,7 @@ impl CircuitData {
                     {
                         // Standard gates can all rebuild their definitions, so if the
                         // cached py_op exists, just clear out any existing cache.
-                        if let Some(borrowed) = previous.py_op.borrow().as_ref() {
+                        if let Some(borrowed) = previous.py_op.get() {
                             borrowed.bind(py).setattr("_definition", py.None())?
                         }
                     }
@@ -1302,7 +1302,7 @@ impl CircuitData {
                         previous.extra_attrs = new_op.extra_attrs;
                         #[cfg(feature = "cache_pygates")]
                         {
-                            *previous.py_op.borrow_mut() = Some(op.into_py(py));
+                            previous.py_op = op.into_py(py).into();
                         }
                         for uuid in uuids.iter() {
                             self.param_table.add_use(*uuid, usage)?
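
The hunks in this file lean on three standard-library behaviours of `OnceCell`: `From<T>` builds an already-populated cell (the `.into()` calls), assigning `OnceCell::new()` through a mutable reference empties it, and `Clone` copies the cached value when one is present. The sketch below shows those in isolation. `Inst`, `populated` and `invalidate` are hypothetical names, and a `String` stands in for `Py<PyAny>`, whose own cloning requirements are not modelled here.

```rust
use std::cell::OnceCell;

/// Hypothetical stand-in for an instruction that carries a cached object.
#[derive(Clone)]
struct Inst {
    cache: OnceCell<String>,
}

fn populated(value: &str) -> Inst {
    // `From<T> for OnceCell<T>` yields a cell that already holds the value.
    Inst { cache: value.to_owned().into() }
}

fn invalidate(inst: &mut Inst) {
    // With `&mut` access the whole cell can simply be replaced by an empty one.
    inst.cache = OnceCell::new();
}

fn main() {
    let mut original = populated("h");
    // Cloning the struct clones the cell together with its contents.
    let copy = original.clone();
    invalidate(&mut original);
    println!("original: {:?}, copy: {:?}", original.cache.get(), copy.cache.get());
}
```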

crates/circuit/src/circuit_instruction.rs (+24 -25)

@@ -11,7 +11,7 @@
 // that they have been altered from the originals.

 #[cfg(feature = "cache_pygates")]
-use std::cell::RefCell;
+use std::cell::OnceCell;

 use numpy::IntoPyArray;
 use pyo3::basic::CompareOp;
@@ -110,7 +110,7 @@ pub struct CircuitInstruction {
     pub params: SmallVec<[Param; 3]>,
     pub extra_attrs: Option<Box<ExtraInstructionAttributes>>,
     #[cfg(feature = "cache_pygates")]
-    pub py_op: RefCell<Option<PyObject>>,
+    pub py_op: OnceCell<Py<PyAny>>,
 }

 impl CircuitInstruction {
@@ -122,17 +122,18 @@ impl CircuitInstruction {
     /// Get the Python-space operation, ensuring that it is mutable from Python space (singleton
     /// gates might not necessarily satisfy this otherwise).
     ///
-    /// This returns the cached instruction if valid, and replaces the cached instruction if not.
-    pub fn get_operation_mut(&self, py: Python) -> PyResult<Py<PyAny>> {
-        let mut out = self.get_operation(py)?.into_bound(py);
-        if !out.getattr(intern!(py, "mutable"))?.extract::<bool>()? {
-            out = out.call_method0(intern!(py, "to_mutable"))?;
-        }
-        #[cfg(feature = "cache_pygates")]
-        {
-            *self.py_op.borrow_mut() = Some(out.to_object(py));
+    /// This returns the cached instruction if valid, but does not replace the cache if it created a
+    /// new mutable object; the expectation is that any mutations to the Python object need
+    /// assigning back to the `CircuitInstruction` completely to ensure data coherence between Rust
+    /// and Python spaces. We can't protect entirely against that, but we can make it a bit harder
+    /// for standard-gate getters to accidentally do the wrong thing.
+    pub fn get_operation_mut<'py>(&self, py: Python<'py>) -> PyResult<Bound<'py, PyAny>> {
+        let out = self.get_operation(py)?.into_bound(py);
+        if out.getattr(intern!(py, "mutable"))?.is_truthy()? {
+            Ok(out)
+        } else {
+            out.call_method0(intern!(py, "to_mutable"))
         }
-        Ok(out.unbind())
     }
 }

@@ -155,7 +156,7 @@ impl CircuitInstruction {
             params: op_parts.params,
             extra_attrs: op_parts.extra_attrs,
             #[cfg(feature = "cache_pygates")]
-            py_op: RefCell::new(Some(operation.into_py(py))),
+            py_op: operation.into_py(py).into(),
         })
     }

@@ -182,7 +183,7 @@ impl CircuitInstruction {
                 })
             }),
             #[cfg(feature = "cache_pygates")]
-            py_op: RefCell::new(None),
+            py_op: OnceCell::new(),
         })
     }

@@ -197,9 +198,14 @@ impl CircuitInstruction {
     /// The logical operation that this instruction represents an execution of.
     #[getter]
     pub fn get_operation(&self, py: Python) -> PyResult<PyObject> {
+        // This doesn't use `get_or_init` because a) the initialiser is fallible and
+        // `get_or_try_init` isn't stable, and b) the initialiser can yield to the Python
+        // interpreter, which might suspend the thread and allow another to inadvertently attempt to
+        // re-enter the cache setter, which isn't safe.
+
         #[cfg(feature = "cache_pygates")]
         {
-            if let Ok(Some(cached_op)) = self.py_op.try_borrow().as_deref() {
+            if let Some(cached_op) = self.py_op.get() {
                 return Ok(cached_op.clone_ref(py));
             }
         }
@@ -215,9 +221,7 @@ impl CircuitInstruction {

         #[cfg(feature = "cache_pygates")]
         {
-            if let Ok(mut cell) = self.py_op.try_borrow_mut() {
-                cell.get_or_insert_with(|| out.clone_ref(py));
-            }
+            self.py_op.get_or_init(|| out.clone_ref(py));
         }

         Ok(out)
@@ -337,7 +341,7 @@ impl CircuitInstruction {
                 params: params.unwrap_or(op_parts.params),
                 extra_attrs: op_parts.extra_attrs,
                 #[cfg(feature = "cache_pygates")]
-                py_op: RefCell::new(Some(operation.into_py(py))),
+                py_op: operation.into_py(py).into(),
             })
         } else {
             Ok(Self {
@@ -347,12 +351,7 @@ impl CircuitInstruction {
                 params: params.unwrap_or_else(|| self.params.clone()),
                 extra_attrs: self.extra_attrs.clone(),
                 #[cfg(feature = "cache_pygates")]
-                py_op: RefCell::new(
-                    self.py_op
-                        .try_borrow()
-                        .ok()
-                        .and_then(|opt| opt.as_ref().map(|op| op.clone_ref(py))),
-                ),
+                py_op: self.py_op.clone(),
             })
         }
     }

crates/circuit/src/dag_node.rs (+5 -5)

@@ -11,7 +11,7 @@
 // that they have been altered from the originals.

 #[cfg(feature = "cache_pygates")]
-use std::cell::RefCell;
+use std::cell::OnceCell;

 use crate::circuit_instruction::{CircuitInstruction, OperationFromPython};
 use crate::imports::QUANTUM_CIRCUIT;
@@ -170,7 +170,7 @@ impl DAGOpNode {
             instruction.operation = instruction.operation.py_deepcopy(py, None)?;
             #[cfg(feature = "cache_pygates")]
             {
-                *instruction.py_op.borrow_mut() = None;
+                instruction.py_op = OnceCell::new();
             }
         }
         let base = PyClassInitializer::from(DAGNode { _node_id: -1 });
@@ -223,7 +223,7 @@ impl DAGOpNode {
             params: self.instruction.params.clone(),
             extra_attrs: self.instruction.extra_attrs.clone(),
             #[cfg(feature = "cache_pygates")]
-            py_op: RefCell::new(None),
+            py_op: OnceCell::new(),
         })
     }

@@ -240,7 +240,7 @@ impl DAGOpNode {
         self.instruction.extra_attrs = res.extra_attrs;
         #[cfg(feature = "cache_pygates")]
         {
-            *self.instruction.py_op.borrow_mut() = Some(op.into_py(op.py()));
+            self.instruction.py_op = op.clone().unbind().into();
         }
         Ok(())
     }
@@ -399,7 +399,7 @@ impl DAGOpNode {
     /// Sets the Instruction name corresponding to the op for this node
     #[setter]
     fn set_name(&mut self, py: Python, new_name: PyObject) -> PyResult<()> {
-        let op = self.instruction.get_operation_mut(py)?.into_bound(py);
+        let op = self.instruction.get_operation_mut(py)?;
         op.setattr(intern!(py, "name"), new_name)?;
         self.instruction.operation = op.extract::<OperationFromPython>()?.operation;
         Ok(())

crates/circuit/src/packed_instruction.rs (+34 -24)

@@ -11,7 +11,7 @@
 // that they have been altered from the originals.

 #[cfg(feature = "cache_pygates")]
-use std::cell::RefCell;
+use std::cell::OnceCell;
 use std::ptr::NonNull;

 use pyo3::intern;
@@ -421,11 +421,17 @@ pub struct PackedInstruction {
     pub extra_attrs: Option<Box<ExtraInstructionAttributes>>,

     #[cfg(feature = "cache_pygates")]
-    /// This is hidden in a `RefCell` because, while that has additional memory-usage implications
-    /// while we're still building with the feature enabled, we intend to remove the feature in the
-    /// future, and hiding the cache within a `RefCell` lets us keep the cache transparently in our
-    /// interfaces, without needing various functions to unnecessarily take `&mut` references.
-    pub py_op: RefCell<Option<Py<PyAny>>>,
+    /// This is hidden in a `OnceCell` because it's just an on-demand cache; we don't create this
+    /// unless asked for it. A `OnceCell` of a non-null pointer type (like `Py<T>`) is the same
+    /// size as a pointer and there are no runtime checks on access beyond the initialisation check,
+    /// which is a simple null-pointer check.
+    ///
+    /// WARNING: remember that `OnceCell`'s `get_or_init` method is non-reentrant, so the initialiser
+    /// must not yield the GIL to Python space. We avoid using `GILOnceCell` here because it
+    /// requires the GIL to even `get` (of course!), which makes implementing `Clone` hard for us.
+    /// We can revisit once we're on PyO3 0.22+ and have been able to disable its `py-clone`
+    /// feature.
+    pub py_op: OnceCell<Py<PyAny>>,
 }

 impl PackedInstruction {
@@ -471,33 +477,37 @@ impl PackedInstruction {
     /// containing circuit; updates to its parameters, label, duration, unit and condition will not
     /// be propagated back.
     pub fn unpack_py_op(&self, py: Python) -> PyResult<Py<PyAny>> {
-        #[cfg(feature = "cache_pygates")]
-        {
-            if let Ok(Some(cached_op)) = self.py_op.try_borrow().as_deref() {
-                return Ok(cached_op.clone_ref(py));
-            }
-        }
-
-        let out = match self.op.view() {
-            OperationRef::Standard(standard) => standard
-                .create_py_op(
+        let unpack = || -> PyResult<Py<PyAny>> {
+            match self.op.view() {
+                OperationRef::Standard(standard) => standard.create_py_op(
                     py,
                     self.params.as_deref().map(SmallVec::as_slice),
                     self.extra_attrs.as_deref(),
-                )?
-                .into_any(),
-            OperationRef::Gate(gate) => gate.gate.clone_ref(py),
-            OperationRef::Instruction(instruction) => instruction.instruction.clone_ref(py),
-            OperationRef::Operation(operation) => operation.operation.clone_ref(py),
+                ),
+                OperationRef::Gate(gate) => Ok(gate.gate.clone_ref(py)),
+                OperationRef::Instruction(instruction) => Ok(instruction.instruction.clone_ref(py)),
+                OperationRef::Operation(operation) => Ok(operation.operation.clone_ref(py)),
+            }
         };

+        // `OnceCell::get_or_init` and the non-stabilised `get_or_try_init`, which would otherwise
+        // be nice here, are both non-reentrant. This is a problem if the init yields control to the
+        // Python interpreter as this one does, since that can allow CPython to freeze the thread
+        // and for another to attempt the initialisation.
         #[cfg(feature = "cache_pygates")]
         {
-            if let Ok(mut cell) = self.py_op.try_borrow_mut() {
-                cell.get_or_insert_with(|| out.clone_ref(py));
+            if let Some(ob) = self.py_op.get() {
+                return Ok(ob.clone_ref(py));
             }
         }
-
+        let out = unpack()?;
+        #[cfg(feature = "cache_pygates")]
+        {
+            // The unpacking operation can cause a thread pause and concurrency, since it can call
+            // interpreted Python code for a standard gate, so we need to take care that some other
+            // Python thread might have populated the cache before we do.
+            let _ = self.py_op.set(out.clone_ref(py));
+        }
         Ok(out)
     }
 }
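
The size claim in the `py_op` doc comment above can be checked with `std::mem::size_of`. This is a sketch under assumptions: `Handle` is a hypothetical alias that uses `NonNull<u8>` as a stand-in for `Py<PyAny>` (the doc comment describes `Py<T>` as a non-null pointer type), and `sizes_in_pointers` is an illustrative helper. The layout of `RefCell` is not a language guarantee, so the two-pointer figure reflects what current compilers produce on common targets.

```rust
use std::cell::{OnceCell, RefCell};
use std::mem::size_of;
use std::ptr::NonNull;

/// Hypothetical stand-in for `Py<PyAny>`: a pointer that is never null.
type Handle = NonNull<u8>;

/// Returns the sizes of the old and new cache fields, measured in pointer widths.
fn sizes_in_pointers() -> (usize, usize) {
    let pointer = size_of::<usize>();
    (
        // Old field: a borrow flag plus the niche-optimised `Option`.
        size_of::<RefCell<Option<Handle>>>() / pointer,
        // New field: only the niche-optimised `Option` inside the cell.
        size_of::<OnceCell<Handle>>() / pointer,
    )
}

fn main() {
    let (old, new) = sizes_in_pointers();
    println!("RefCell<Option<Handle>>: {old} pointer(s), OnceCell<Handle>: {new} pointer(s)");
}
```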
