Chunk large license scan uploads to avoid timeouts #1509

Merged
jssblck merged 7 commits into master from fix/sequential-license-finalized
Feb 20, 2025

@jssblck (Member) commented Feb 20, 2025

Overview

This PR fixes timeouts on large license scan uploads by splitting them into smaller batches. Processing hundreds of license scans in a single request could previously time out; sending them in batches keeps each request small enough to complete.

Acceptance criteria

When users upload large numbers of license scans (hundreds), the CLI should now:

  • Split the uploads into smaller batches of 100 items
  • Process each batch sequentially
  • Complete successfully without timeout errors
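
The batching behavior described above can be sketched roughly as follows. This is an illustration only, not the CLI's actual implementation: `chunks_of`, `upload_in_batches`, and the `upload_batch` callback are hypothetical names, and the only value taken from this PR is the batch size of 100.

```python
from typing import Callable, Iterator, Sequence, TypeVar

T = TypeVar("T")

BATCH_SIZE = 100  # batch size stated in the acceptance criteria


def chunks_of(items: Sequence[T], size: int) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of at most `size` items, preserving order."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def upload_in_batches(
    scans: Sequence[T],
    upload_batch: Callable[[Sequence[T]], None],
    size: int = BATCH_SIZE,
) -> int:
    """Send each batch in turn and return how many batches were sent.

    `upload_batch` is a hypothetical stand-in for whatever call performs
    the real upload; the next batch starts only after it returns.
    """
    sent = 0
    for batch in chunks_of(scans, size):
        upload_batch(batch)
        sent += 1
    return sent
```

With 250 scans, this sketch sends three batches of 100, 100, and 50 items, one after another.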

Testing plan

  1. Set up a project with >200 license scans to upload
  2. Run the CLI with license scanning enabled
  3. Verify that:
    • The uploads are processed in batches
    • No timeout errors occur
    • All scans complete successfully

I wrote a quick script that created 250 directories, each with a random file inside, plus a fossa-deps file, and used it to validate this change:
[image: screenshot of the validation run]
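
A fixture along those lines might be generated with something like the sketch below. It is not the script used for this PR. The file name `fossa-deps.yml`, the `vendored-dependencies` layout, the `vendored-NNN` directory names, and the placeholder version are all assumptions made for illustration.

```python
import secrets
import tempfile
from pathlib import Path


def make_fixture(root: Path, count: int = 250) -> Path:
    """Create `count` directories with one random file each, plus a manifest.

    The manifest name and its vendored-dependencies layout are assumptions,
    not details taken from this PR.
    """
    lines = ["vendored-dependencies:"]
    for i in range(count):
        name = f"vendored-{i:03d}"  # hypothetical directory naming scheme
        dep_dir = root / name
        dep_dir.mkdir(parents=True, exist_ok=True)
        # Random content so that every directory holds a distinct file.
        (dep_dir / "random.txt").write_text(secrets.token_hex(32) + "\n")
        lines += [f"- name: {name}", f"  path: {name}", '  version: "1.0.0"']
    manifest = root / "fossa-deps.yml"
    manifest.write_text("\n".join(lines) + "\n")
    return manifest


if __name__ == "__main__":
    target = Path(tempfile.mkdtemp(prefix="license-scan-fixture-"))
    print(f"fixture written to {make_fixture(target).parent}")
```

Running the CLI against the generated directory should then exercise the >200 scan case from the testing plan.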

Risks

The main risk is that we're introducing more API calls by splitting up the uploads. However, this is preferable to having the entire operation fail due to timeouts.
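
To put a number on the extra calls: assuming one request per batch (an assumption, since the PR does not list the individual requests), the count grows as the ceiling of the scan count divided by the batch size. `batch_count` below is a hypothetical helper, not part of the CLI.

```python
def batch_count(total_scans: int, batch_size: int = 100) -> int:
    """Number of batches needed, i.e. ceil(total_scans / batch_size)."""
    if total_scans < 0 or batch_size <= 0:
        raise ValueError("total_scans must be >= 0 and batch_size > 0")
    # Integer ceiling division avoids floating-point rounding.
    return -(-total_scans // batch_size)


print(batch_count(100))  # 1
print(batch_count(101))  # 2
print(batch_count(250))  # 3
```

So the 250-scan test case would go from one oversized request to three smaller ones.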

References

  • ANE-2272: License scan finalize endpoint times out with large number of scans

Checklist

  • No tests added as this is a performance optimization of existing functionality
  • No user-visible change in behavior, only in performance
  • No documentation updates needed as this is an internal performance improvement
  • No changelog update needed as this fixes an internal performance issue without changing user-visible behavior

@jssblck changed the title from "fix: chunk large license scan uploads to avoid timeouts" to "Chunk large license scan uploads to avoid timeouts" on Feb 20, 2025
Base automatically changed from fix/warn-when-walk-fails to master on February 20, 2025 20:46
@jssblck marked this pull request as ready for review on February 20, 2025 20:58
@jssblck requested a review from a team as a code owner on February 20, 2025 20:58
@spatten (Contributor) left a comment

This looks like a good and safe change to me

@jssblck merged commit c63817d into master on Feb 20, 2025
19 checks passed
@jssblck deleted the fix/sequential-license-finalized branch on February 20, 2025 22:27