Run post-breach phase in separate thread #758
Conversation
Codecov Report

@@           Coverage Diff           @@
##          develop    #758   +/-   ##
========================================
  Coverage    60.31%   60.31%
========================================
  Files          161      161
  Lines         4899     4899
========================================
  Hits          2955     2955
  Misses        1944     1944

Continue to review full report at Codecov.
While this is a good start, we'd like all PBAs to run simultaneously, not only run async. So we want to open a thread for each PBA (maybe up to a limit of 4 threads running at the same time).
Force-pushed from 6c51ac0 to d9cc851 (Compare)
Cool. Can you do some time measurements and see how much time this saves (and whether increasing the thread max limit helps)?
Running the PBAs with no threading takes 246,374,421 nanoseconds; with threading, 5 threads gives the best performance, at 241,448,650 nanoseconds.
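A minimal sketch of how timings like the ones above could be taken, using `time.perf_counter_ns` and a thread-backed pool. `run_pba` and `pba_list` here are hypothetical stand-ins (a `sleep` in place of a real post-breach action), not the PR's actual code:

```python
import time
from multiprocessing.dummy import Pool  # thread-backed Pool, same API as multiprocessing

def run_pba(pba):
    # Placeholder standing in for a real post-breach action (I/O-bound).
    time.sleep(0.01)

pba_list = list(range(10))

# Serial baseline: run each PBA one after another.
start = time.perf_counter_ns()
for pba in pba_list:
    run_pba(pba)
serial_ns = time.perf_counter_ns() - start

# Threaded run: 5 worker threads, mirroring Pool(5) in the PR.
start = time.perf_counter_ns()
with Pool(5) as pool:
    pool.map(run_pba, pba_list)
threaded_ns = time.perf_counter_ns() - start

print(serial_ns, threaded_ns)
```

With an I/O-bound placeholder like `sleep`, the threaded run finishes in roughly two batches instead of ten sequential calls; with CPU-bound PBAs the GIL would erase most of that gain, which may explain the near-identical numbers measured above.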
Hmm... Unfortunately it seems like this makes almost no difference :( If you're done with testing, I think we can merge. We do learn a very important lesson from this: we should profile the Monkey execution and see how long each part takes, so we can understand where we might be able to parallelize and make it quicker. Please create an issue for that.
Force-pushed from d9cc851 to 444c2cb (Compare)
Yes, I'll create an issue. I'm also done testing so I'm going to go ahead and merge this in a while. |
pool = Pool(5)
pool.map(self.run_pba, self.pba_list)
Have you tried this with multithreading? Performance was worse? Processes take longer to spawn, but are truly "parallel". If I/O is long, maybe we don't really need parallel execution, maybe we just need to init all I/O asap?
I'm importing Pool from multiprocessing.dummy. multiprocessing.dummy replicates the API of multiprocessing, but it is no more than a wrapper around the threading module.
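To illustrate the distinction being made: `multiprocessing.dummy.Pool` runs its workers as threads inside the current process, so every worker sees the same PID, whereas `multiprocessing.Pool` would spawn separate processes. A small self-contained check:

```python
import os
from multiprocessing.dummy import Pool as ThreadPool  # threads, not processes

def worker_pid(_):
    # With multiprocessing.dummy the workers are threads, so the PID
    # matches the parent's. With multiprocessing proper, it would not.
    return os.getpid()

with ThreadPool(3) as pool:
    pids = pool.map(worker_pid, range(6))

# Every "worker" reports the parent's PID: no processes were spawned.
print(set(pids) == {os.getpid()})
```

This is why thread count, not process spawn time, is the relevant cost here, and why the GIL still applies to the PBA workers.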
Ahh, I see!
I agree with Shay. Be wary of such issues: we should parallelize code only to increase performance, and if we need to increase performance, let's increase it where it matters. Parallelizing just because we can is not the best way to go, as it introduces a lot of potential problems with a questionable return.
try:
    LOG.debug("Executing PBA: '{}'".format(pba.name))
    pba.run()
except Exception as e:
    LOG.error("PBA {} failed. Error info: {}".format(pba.name, e))
Logging doesn't get messed up?
No? Just that all PBA logs are not logged together now.
What I'm concerned with is the possibility of log showing:
Executing PBA X
Output of PBA Y
Errors of PBA Y
This might be a bit misleading, but I think we'll manage.
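One common way to make interleaved lines attributable (a sketch, not the PR's code) is to rely on the fact that each `LogRecord` carries the emitting thread's name, so including `%(threadName)s` in the format, or inspecting the records, disambiguates which PBA produced which line:

```python
import logging
from multiprocessing.dummy import Pool  # thread-backed Pool, as in the PR

records = []

class ListHandler(logging.Handler):
    # Captures records so we can inspect their thread attribution.
    def emit(self, record):
        records.append(record)

LOG = logging.getLogger("pba_demo")
LOG.setLevel(logging.DEBUG)
LOG.addHandler(ListHandler())

def run_pba(name):
    # Each record remembers which worker thread emitted it.
    LOG.debug("Executing PBA: '%s'", name)

with Pool(2) as pool:
    pool.map(run_pba, ["PBA X", "PBA Y"])

print(sorted(r.getMessage() for r in records))
```

The logging module itself is thread-safe per handler, so lines won't be corrupted mid-write; they can only arrive out of order, which a `%(threadName)s` field in the formatter makes easy to untangle.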
Fixes #696