Add crawl errors to database incrementally during crawl #1558

Closed
tw4l opened this issue Feb 28, 2024 · 0 comments · Fixed by #1561

tw4l commented Feb 28, 2024

Currently, unlike pages and crawl files, errors are added to the database only at the conclusion of a crawl. The associated API endpoint pulls them straight from Redis while the crawl is running and from the database after it finishes. This is perhaps a more complex solution than it needs to be, and it also leaves open the possibility that the crawler pod is shut down before all errors have been read into the database.

We should modify error handling to work like crawl pages/files: add errors to the database incrementally during crawling by popping them from the Redis queue as they show up.
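
A minimal sketch of the incremental approach, not the actual implementation: `InMemoryQueue` stands in for the Redis client (it only mimics `RPUSH`/`LPOP` on a list key), `stored_errors` stands in for the database collection, and the `"{crawl_id}:errors"` key name, the JSON encoding of each error, and the `drain_errors` helper are all assumptions made for illustration.

```python
import json
from collections import deque


class InMemoryQueue:
    """Stand-in for a Redis client; only mimics RPUSH and LPOP on list keys."""

    def __init__(self):
        self.lists = {}

    def rpush(self, key, value):
        # Append to the tail of the list stored at key.
        self.lists.setdefault(key, deque()).append(value)

    def lpop(self, key):
        # Remove and return the head of the list, or None if it is empty.
        items = self.lists.get(key)
        return items.popleft() if items else None


def drain_errors(queue, crawl_id, stored_errors, max_batch=100):
    """Pop up to max_batch queued errors and append them to stored_errors.

    Meant to be called repeatedly while the crawl is running, so errors
    reach the database before the crawler pod can be shut down.
    Returns the number of errors moved in this call.
    """
    key = f"{crawl_id}:errors"  # hypothetical key name
    moved = 0
    while moved < max_batch:
        raw = queue.lpop(key)
        if raw is None:  # queue is empty for now
            break
        stored_errors.append(json.loads(raw))  # assumes JSON-encoded errors
        moved += 1
    return moved
```

Calling `drain_errors` on each pass of whatever loop already syncs pages and files would let the errors endpoint read from the database alone. The `max_batch` cap is a design choice to keep any single pass short if a crawl produces a burst of errors.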

@tw4l tw4l self-assigned this Feb 28, 2024
@tw4l tw4l moved this from Triage to Implementing in Webrecorder Projects Feb 28, 2024
@tw4l tw4l added this to the v1.10.0 milestone Feb 28, 2024
@ikreymer ikreymer moved this from Implementing to Todo in Webrecorder Projects Feb 28, 2024
@tw4l tw4l moved this from Todo to In Review in Webrecorder Projects Feb 28, 2024
ikreymer pushed a commit that referenced this issue Feb 29, 2024
Fixes #1558 

- Adds crawl errors to database incrementally during crawl rather than after crawl completes
- Simplifies crawl /errors API endpoint to always return errors from database
@github-project-automation github-project-automation bot moved this from In Review to Done! in Webrecorder Projects Feb 29, 2024