Circuit sum layer outputs NaN when categorical layer has fixed parameters in LSE-sum semiring #319

Closed
n28div opened this issue Nov 11, 2024 · 0 comments · Fixed by #314
n28div commented Nov 11, 2024

When using a fixed parametrization on categorical layers, the output of MixingLayer is NaN for some combinations of the inputs. The issue seems to be related to the sum layer itself, and in particular to the sum parameters.

By fixing the sum parameters to be unitary and picking the LSE-sum semiring, the result is wrong when both variables have value zero (the categorical layers assign probability 0, i.e. log-probability -inf, to that value), but it is correct when either of them has a non-zero value.
Minimal code:

import numpy as np
import torch
import random

random.seed(0)
np.random.seed(0)
torch.manual_seed(0)
torch.cuda.manual_seed(0)

from cirkit.symbolic.layers import CategoricalLayer, MixingLayer, HadamardLayer
from cirkit.symbolic.parameters import Parameter, ConstantParameter
from cirkit.utils.scope import Scope
from cirkit.pipeline import PipelineContext
from cirkit.symbolic.circuit import Circuit

# Fixed categorical distribution: probability 0 for value 0 and probability 1 for value 1.
probs = lambda: Parameter.from_input(ConstantParameter(1, 1, 2, value=np.array([0.0, 1.0]).reshape(1, 1, -1)))

cl_1 = CategoricalLayer(Scope([0]), 1, 1, num_categories=2, probs=probs())
cl_2 = CategoricalLayer(Scope([1]), 1, 1, num_categories=2, probs=probs())

# Mixing (sum) layer over the two categorical inputs; its weights are randomly initialized.
sum_layer = MixingLayer(1, 2)

ctx = PipelineContext(backend='torch', fold=False, optimize=False, semiring='lse-sum')

symbolic_circuit = Circuit(
    1, 
    [cl_1, cl_2, sum_layer],
    { sum_layer: [cl_1, cl_2] },
    [sum_layer]
)
circuit = ctx.compile(symbolic_circuit)
print(circuit(torch.tensor([0, 0]).reshape(1, 1, 2)))
# >>> tensor([[[nan]]], grad_fn=<TransposeBackward0>)

print(circuit(torch.tensor([0, 1]).reshape(1, 1, 2)))
# >>> tensor([[[nan]]], grad_fn=<TransposeBackward0>)

print(circuit(torch.tensor([1, 0]).reshape(1, 1, 2)))
# >>> tensor([[[0.4324]]], grad_fn=<TransposeBackward0>)

print(circuit(torch.tensor([1, 1]).reshape(1, 1, 2)))
# >>> tensor([[[0.2212]]], grad_fn=<TransposeBackward0>)
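For reference, the [0, 0] evaluation should return -inf rather than NaN: both categorical inputs contribute log-probability -inf, and the limit of log-sum-exp over -inf entries is -inf for any finite log-weights. A minimal sketch of the expected behaviour (the 0.5/0.5 mixture weights are hypothetical, standing in for whatever the mixing layer holds):

import torch

# Both categoricals assign probability 0 to value 0, so their log-probabilities are -inf.
log_p = torch.tensor([float("-inf"), float("-inf")])

# Hypothetical mixture log-weights; any finite values give the same limit.
log_w = torch.log(torch.tensor([0.5, 0.5]))

# The correct LSE-sum reduction is -inf, not NaN.
print(torch.logsumexp(log_w + log_p, dim=0))  # tensor(-inf)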

Changing the random seed changes the results (e.g. with seed 1 only the first evaluation is NaN), and the same happens when fixing the sum parameters to ones, i.e. replacing

sum_layer = MixingLayer(1, 2)

with

sum_layer = MixingLayer(1, 2, weight_factory=lambda n: Parameter.from_input(ConstantParameter(*n, value=np.ones(n))))

I tracked down the error: it appears to be in cirkit.backend.torch.semiring.LSESumSemiring, in the method apply_reduce. When both inputs are 0, the variable xs is (tensor([[[[-inf]], [[-inf]]]]),), and at line 375 the subtraction between the two -inf values (part of the max-subtraction trick for a numerically stable log-sum-exp) is undefined and produces NaN.
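A minimal sketch of the failure mode and of the standard guard (illustrative only, assuming a naive max-subtraction reduction; this is not cirkit's actual apply_reduce code):

import torch

def naive_lse(x: torch.Tensor, dim: int) -> torch.Tensor:
    # Usual stabilization trick: subtract the max before exponentiating.
    # When every entry is -inf, x - m computes (-inf) - (-inf) = nan.
    m = x.max(dim=dim, keepdim=True).values
    return (m + torch.exp(x - m).sum(dim=dim, keepdim=True).log()).squeeze(dim)

def guarded_lse(x: torch.Tensor, dim: int) -> torch.Tensor:
    # Guard: replace a non-finite max with 0 so the subtraction is safe;
    # the all--inf case then correctly yields log(0) = -inf.
    m = x.max(dim=dim, keepdim=True).values
    m = torch.where(torch.isfinite(m), m, torch.zeros_like(m))
    return (m + torch.exp(x - m).sum(dim=dim, keepdim=True).log()).squeeze(dim)

x = torch.tensor([float("-inf"), float("-inf")])
print(naive_lse(x, dim=0))        # tensor(nan)  <- the bug
print(guarded_lse(x, dim=0))      # tensor(-inf) <- the correct limit
print(torch.logsumexp(x, dim=0))  # tensor(-inf), the built-in also guards this case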

loreloc added the bug label on Nov 11, 2024
loreloc added this to the cirkit 0.2.0 (murmur) milestone on Nov 11, 2024
loreloc self-assigned this on Nov 17, 2024
loreloc added a commit that referenced this issue on Nov 17, 2024
loreloc linked a pull request on Nov 20, 2024 that will close this issue