Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Possible out-of-bounds read from mo_match in spral_match_order #230

Closed
jwnimmer-tri opened this issue Mar 5, 2025 · 9 comments · Fixed by #231
Closed

Possible out-of-bounds read from mo_match in spral_match_order #230

jwnimmer-tri opened this issue Mar 5, 2025 · 9 comments · Fixed by #231
Assignees

Comments

@jwnimmer-tri
Copy link
Contributor

Thanks for your work on producing such useful software!

As part of the Drake test suite, we have code in Drake that calls Ipopt which calls SPRAL. And as part of our CI matrix, we run the tests under valgrind's memcheck tool which detects out-of-bounds reads (i.e., memory errors). After upgrading SPRAL to the latest tag 2025.03.03 (RobotLocomotion/drake#22693), we started seeing consistent reports of memory errors.

I'll post some details below, but take note that this isn't necessarily a bug in SPRAL. It's possible that either Drake or Ipopt has a bug where we are passing bad arguments down into SPRAL, causing the problem. It's also possible that Valgrind is mistaken about there being a bug and there's no problem anyway.

The error report is in match_order.f90, and if I revert commit c2641c0, the error report no longer occurs.

I thought I'd raise an issue now in case anything jumps out at the team here. I'm still going to work on trying to produce a more minimal test case to help zoom in on the problem, and understand if its real.

Here's the error report:

Invalid read of size 8
   at 0x26D510D: __spral_match_order_MOD_mo_match (match_order.f90:618)
   by 0x26D5C3B: __spral_match_order_MOD_mo_scale (match_order.f90:455)
   by 0x26D825D: __spral_match_order_MOD_match_order_metis_ptr64 (match_order.f90:194)
   by 0x26D1F1B: __spral_ssids_MOD_analyse_double (ssids.f90:352)
   by 0x26D3AD6: __spral_ssids_MOD_analyse_double_ptr32 (ssids.f90:121)
   by 0x26BD71F: spral_ssids_analyse_ptr32 (ssids.f90:397)
   by 0x2687AC4: Ipopt::SpralSolverInterface::MultiSolve(bool, int const*, int const*, int, double*, bool, int) (IpSpralSolverInterface.cpp:547)
   by 0x25CD26E: Ipopt::TSymLinearSolver::MultiSolve(Ipopt::SymMatrix const&, std::vector<Ipopt::SmartPtr<Ipopt::Vector const>, std::allocator<Ipopt::SmartPtr<Ipopt::Vector const> > >&, std::vector<Ipopt::SmartPtr<Ipopt::Vector>, std::allocator<Ipopt::SmartPtr<Ipopt::Vector> > >&, bool, int) (IpTSymLinearSolver.cpp:262)
   by 0x25DC5CC: Ipopt::StdAugSystemSolver::MultiSolve(Ipopt::SymMatrix const*, double, Ipopt::Vector const*, double, Ipopt::Vector const*, double, Ipopt::Matrix const*, Ipopt::Vector const*, double, Ipopt::Matrix const*, Ipopt::Vector const*, double, std::vector<Ipopt::SmartPtr<Ipopt::Vector const>, std::allocator<Ipopt::SmartPtr<Ipopt::Vector const> > >&, std::vector<Ipopt::SmartPtr<Ipopt::Vector const>, std::allocator<Ipopt::SmartPtr<Ipopt::Vector const> > >&, std::vector<Ipopt::SmartPtr<Ipopt::Vector const>, std::allocator<Ipopt::SmartPtr<Ipopt::Vector const> > >&, std::vector<Ipopt::SmartPtr<Ipopt::Vector const>, std::allocator<Ipopt::SmartPtr<Ipopt::Vector const> > >&, std::vector<Ipopt::SmartPtr<Ipopt::Vector>, std::allocator<Ipopt::SmartPtr<Ipopt::Vector> > >&, std::vector<Ipopt::SmartPtr<Ipopt::Vector>, std::allocator<Ipopt::SmartPtr<Ipopt::Vector> > >&, std::vector<Ipopt::SmartPtr<Ipopt::Vector>, std::allocator<Ipopt::SmartPtr<Ipopt::Vector> > >&, std::vector<Ipopt::SmartPtr<Ipopt::Vector>, std::allocator<Ipopt::SmartPtr<Ipopt::Vector> > >&, bool, int) (IpStdAugSystemSolver.cpp:210)
   by 0x25DE31C: Ipopt::AugSystemSolver::Solve(Ipopt::SymMatrix const*, double, Ipopt::Vector const*, double, Ipopt::Vector const*, double, Ipopt::Matrix const*, Ipopt::Vector const*, double, Ipopt::Matrix const*, Ipopt::Vector const*, double, Ipopt::Vector const&, Ipopt::Vector const&, Ipopt::Vector const&, Ipopt::Vector const&, Ipopt::Vector&, Ipopt::Vector&, Ipopt::Vector&, Ipopt::Vector&, bool, int) (IpAugSystemSolver.hpp:104)
   by 0x2625906: Ipopt::LowRankAugSystemSolver::Solve(Ipopt::SymMatrix const*, double, Ipopt::Vector const*, double, Ipopt::Vector const*, double, Ipopt::Matrix const*, Ipopt::Vector const*, double, Ipopt::Matrix const*, Ipopt::Vector const*, double, Ipopt::Vector const&, Ipopt::Vector const&, Ipopt::Vector const&, Ipopt::Vector const&, Ipopt::Vector&, Ipopt::Vector&, Ipopt::Vector&, Ipopt::Vector&, bool, int) (IpLowRankAugSystemSolver.cpp:182)
   by 0x2631C48: Ipopt::LeastSquareMultipliers::CalculateMultipliers(Ipopt::Vector&, Ipopt::Vector&) (IpLeastSquareMults.cpp:80)
   by 0x266F9E4: Ipopt::DefaultIterateInitializer::least_square_mults(Ipopt::Journalist const&, Ipopt::IpoptNLP&, Ipopt::IpoptData&, Ipopt::IpoptCalculatedQuantities&, Ipopt::SmartPtr<Ipopt::EqMultiplierCalculator> const&, double) (IpDefaultIterateInitializer.cpp:701)
   by 0x266CBD1: Ipopt::DefaultIterateInitializer::SetInitialIterates() (IpDefaultIterateInitializer.cpp:340)
   by 0x2658923: Ipopt::IpoptAlgorithm::InitializeIterates() (IpIpoptAlg.cpp:648)
   by 0x2657669: Ipopt::IpoptAlgorithm::Optimize(bool) (IpIpoptAlg.cpp:332)
 Address 0x7be6bb0 is 16 bytes before a block of size 560 alloc'd
   at 0x4848899: malloc (vg_replace_malloc.c:381)
   by 0x26D40A0: __spral_match_order_MOD_mo_match (match_order.f90:526)
   by 0x26D5C3B: __spral_match_order_MOD_mo_scale (match_order.f90:455)
   by 0x26D825D: __spral_match_order_MOD_match_order_metis_ptr64 (match_order.f90:194)
   by 0x26D1F1B: __spral_ssids_MOD_analyse_double (ssids.f90:352)
   by 0x26D3AD6: __spral_ssids_MOD_analyse_double_ptr32 (ssids.f90:121)
   by 0x26BD71F: spral_ssids_analyse_ptr32 (ssids.f90:397)
   by 0x2687AC4: Ipopt::SpralSolverInterface::MultiSolve(bool, int const*, int const*, int, double*, bool, int) (IpSpralSolverInterface.cpp:547)
   by 0x25CD26E: Ipopt::TSymLinearSolver::MultiSolve(Ipopt::SymMatrix const&, std::vector<Ipopt::SmartPtr<Ipopt::Vector const>, std::allocator<Ipopt::SmartPtr<Ipopt::Vector const> > >&, std::vector<Ipopt::SmartPtr<Ipopt::Vector>, std::allocator<Ipopt::SmartPtr<Ipopt::Vector> > >&, bool, int) (IpTSymLinearSolver.cpp:262)
   by 0x25DC5CC: Ipopt::StdAugSystemSolver::MultiSolve(Ipopt::SymMatrix const*, double, Ipopt::Vector const*, double, Ipopt::Vector const*, double, Ipopt::Matrix const*, Ipopt::Vector const*, double, Ipopt::Matrix const*, Ipopt::Vector const*, double, std::vector<Ipopt::SmartPtr<Ipopt::Vector const>, std::allocator<Ipopt::SmartPtr<Ipopt::Vector const> > >&, std::vector<Ipopt::SmartPtr<Ipopt::Vector const>, std::allocator<Ipopt::SmartPtr<Ipopt::Vector const> > >&, std::vector<Ipopt::SmartPtr<Ipopt::Vector const>, std::allocator<Ipopt::SmartPtr<Ipopt::Vector const> > >&, std::vector<Ipopt::SmartPtr<Ipopt::Vector const>, std::allocator<Ipopt::SmartPtr<Ipopt::Vector const> > >&, std::vector<Ipopt::SmartPtr<Ipopt::Vector>, std::allocator<Ipopt::SmartPtr<Ipopt::Vector> > >&, std::vector<Ipopt::SmartPtr<Ipopt::Vector>, std::allocator<Ipopt::SmartPtr<Ipopt::Vector> > >&, std::vector<Ipopt::SmartPtr<Ipopt::Vector>, std::allocator<Ipopt::SmartPtr<Ipopt::Vector> > >&, std::vector<Ipopt::SmartPtr<Ipopt::Vector>, std::allocator<Ipopt::SmartPtr<Ipopt::Vector> > >&, bool, int) (IpStdAugSystemSolver.cpp:210)
   by 0x25DE31C: Ipopt::AugSystemSolver::Solve(Ipopt::SymMatrix const*, double, Ipopt::Vector const*, double, Ipopt::Vector const*, double, Ipopt::Matrix const*, Ipopt::Vector const*, double, Ipopt::Matrix const*, Ipopt::Vector const*, double, Ipopt::Vector const&, Ipopt::Vector const&, Ipopt::Vector const&, Ipopt::Vector const&, Ipopt::Vector&, Ipopt::Vector&, Ipopt::Vector&, Ipopt::Vector&, bool, int) (IpAugSystemSolver.hpp:104)
   by 0x2625906: Ipopt::LowRankAugSystemSolver::Solve(Ipopt::SymMatrix const*, double, Ipopt::Vector const*, double, Ipopt::Vector const*, double, Ipopt::Matrix const*, Ipopt::Vector const*, double, Ipopt::Matrix const*, Ipopt::Vector const*, double, Ipopt::Vector const&, Ipopt::Vector const&, Ipopt::Vector const&, Ipopt::Vector const&, Ipopt::Vector&, Ipopt::Vector&, Ipopt::Vector&, Ipopt::Vector&, bool, int) (IpLowRankAugSystemSolver.cpp:182)
   by 0x2631C48: Ipopt::LeastSquareMultipliers::CalculateMultipliers(Ipopt::Vector&, Ipopt::Vector&) (IpLeastSquareMults.cpp:80)
   by 0x266F9E4: Ipopt::DefaultIterateInitializer::least_square_mults(Ipopt::Journalist const&, Ipopt::IpoptNLP&, Ipopt::IpoptData&, Ipopt::IpoptCalculatedQuantities&, Ipopt::SmartPtr<Ipopt::EqMultiplierCalculator> const&, double) (IpDefaultIterateInitializer.cpp:701)
   by 0x266CBD1: Ipopt::DefaultIterateInitializer::SetInitialIterates() (IpDefaultIterateInitializer.cpp:340)
   by 0x2658923: Ipopt::IpoptAlgorithm::InitializeIterates() (IpIpoptAlg.cpp:648)
@jfowkes
Copy link
Contributor

jfowkes commented Mar 5, 2025

Thank you very much for the bug report! I strongly suspect this is caused by #205 which made some (what we thought were minor) changes to the mo_match subroutine. It's not immediately obvious to me why this should lead to out-of-bounds reads though, would be really helpful if you could dig a little deeper so we can see where things are going wrong.

@mjacobse
Copy link
Collaborator

mjacobse commented Mar 5, 2025

I'm afraid this is my mistake, see #205 (comment)

@jfowkes
Copy link
Contributor

jfowkes commented Mar 5, 2025

@mjacobse don't we also need to remove the following lines from mo_scale otherwise we're again setting negative entries?

spral/src/match_order.f90

Lines 458 to 484 in 19f5b3c

if (struct_rank .ne. n) then
! structurally singular case. At this point, scaling factors
! for rows in corresponding to rank deficient part are set to
! zero. The following is to set them according to Duff and Pralet.
deallocate(ptr2, stat=stat)
deallocate(row2, stat=stat)
deallocate(val2, stat=stat)
allocate(cscale(n),stat=stat)
if (stat .ne. 0) then
flag = ERROR_ALLOCATION
return
end if
cscale(1:n) = scale(1:n)
do i = 1, n
if (cscale(i) .ne. -huge(scale)) cycle
do j = ptr(i), ptr(i+1)-1
k = row(j)
if (cscale(k) .eq. -huge(scale)) cycle
scale(i) = max(scale(i), val(j)+scale(k))
end do
if (scale(i) .eq. -huge(scale)) then
scale(i) = 0.0
else
scale(i) = -scale(i)
end if
end do
end if

@jwnimmer-tri
Copy link
Contributor Author

Thanks for looking into this. If you end up with a candidate PR with a fix, it's pretty quick for me to test it and report if I still see the problem. I can also still work on making a smaller reproducer, but it will take me a couple hours and I probably can't do it until tomorrow.

@jfowkes
Copy link
Contributor

jfowkes commented Mar 5, 2025

Thanks @jwnimmer-tri, we'll let you know once we have a candidate PR with a fix for testing.

@mjacobse
Copy link
Collaborator

mjacobse commented Mar 5, 2025

@mjacobse don't we also need to remove the following lines from mo_scale otherwise we're again setting negative entries?

[...]

But that is not about ordering and indices but instead just the scaling factors right? I'm not sure if the API of the scaling routines is supposed to promise anything about those scaling factors in case of singular matrices. Certainly worth to double check if the documentation says anything about this or if any use-cases assume a certain behaviour, but in any case would be mostly unrelated to this issue and #205 to my mind.

@jfowkes
Copy link
Contributor

jfowkes commented Mar 5, 2025

But that is not about ordering and indices but instead just the scaling factors right?

Oh yes, you're right I was getting confused between the scaling and the ordering routines.

I will create a PR with your proposed fix shortly.

@jfowkes jfowkes linked a pull request Mar 5, 2025 that will close this issue
@jfowkes
Copy link
Contributor

jfowkes commented Mar 6, 2025

@jwnimmer-tri fixed in v2025.03.06 release.

@jwnimmer-tri
Copy link
Contributor Author

jwnimmer-tri commented Mar 6, 2025

Perfect, thank you all!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants