-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for dynamically reloading sampling strategies #2040
Conversation
Signed-off-by: Annanay <annanayagarwal@gmail.com>
Signed-off-by: Annanay <annanayagarwal@gmail.com>
Signed-off-by: Annanay <annanayagarwal@gmail.com>
Going through the discussion on #1688 - The obvious con is repetition of code. |
Signed-off-by: Annanay <annanayagarwal@gmail.com>
Codecov Report
@@ Coverage Diff @@
## master #2040 +/- ##
==========================================
- Coverage 97.39% 96.27% -1.13%
==========================================
Files 207 214 +7
Lines 10307 10564 +257
==========================================
+ Hits 10039 10171 +132
- Misses 224 332 +108
- Partials 44 61 +17
Continue to review full report at Codecov.
|
Signed-off-by: Annanay <annanayagarwal@gmail.com>
Deployment spec used in test -
Output -
|
Signed-off-by: Annanay <annanayagarwal@gmail.com>
Signed-off-by: Annanay <annanayagarwal@gmail.com>
Signed-off-by: Annanay <annanayagarwal@gmail.com>
This reverts commit c7eb095. Signed-off-by: Annanay <annanayagarwal@gmail.com>
Signed-off-by: Annanay <annanayagarwal@gmail.com>
Signed-off-by: Annanay <annanayagarwal@gmail.com>
@yurishkuro @jpkrohling - Can we merge this? |
// This is a workaround for k8s configmaps. Since k8s loads configmaps as | ||
// symlinked files within the containers, changes to the configmap register | ||
// as `fsnotify.Remove` events. | ||
if err := watcher.Add(strategiesFile); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't there a race condition here? If the file was removed, even temporarily, and you call watcher.Add()
on non-existing file, will it not error out immediately?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So this line is specifically added for the k8s configmap scenario. It is added to handle successive reloads without restarting the collector (without this line, the collector reloads the strategies only the first time the file was changed). I've verified that it works.
However, if the config file was manually removed, yes it will error out immediately, but that works for us.
Co-Authored-By: Yuri Shkuro <yurishkuro@users.noreply.github.com> Signed-off-by: Annanay <annanayagarwal@gmail.com>
Signed-off-by: Annanay <annanayagarwal@gmail.com>
9c15662
to
30a56eb
Compare
} | ||
continue | ||
} | ||
if event.Op&fsnotify.Write == fsnotify.Write { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what about Create event? I think it would've addressed my question above.
for { | ||
select { | ||
case event := <-watcher.Events: | ||
if event.Op&fsnotify.Remove == fsnotify.Remove { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are these treated as bitmasks? The Op constants are defined as ordinary numbers (https://godoc.org/github.com/fsnotify/fsnotify#Op).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
?
select { | ||
case event := <-watcher.Events: | ||
if event.Op&fsnotify.Remove == fsnotify.Remove { | ||
if event.Name == strategiesFile { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
given that we're watching just a single file, when would the name not equal?
logger.Info("watching", zap.String("file", options.StrategiesFile)) | ||
} | ||
|
||
dir := filepath.Dir(options.StrategiesFile) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we need to watch the dir?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
?
func (h *strategyStore) runWatcherLoop(watcher *fsnotify.Watcher, strategiesFile string) { | ||
for { | ||
select { | ||
case event := <-watcher.Events: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we not simplify this by handling all event? E.g.
case event := <-watcher.Events:
if event.Op == fsnotify.Remove {
// resubscribe to handle k8s use case
}
h.loadAndParseStrategies(strategiesFile)
continue
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current logic is actually very close to this
- Remove events could occur in a non k8s environment. In which case we need to handle only write events, hence the following block-
if event.Op&fsnotify.Write == fsnotify.Write {
h.loadAndParseStrategies(strategiesFile)
}
- If they occur in a k8s environment, we need to reload the file strategies file hence the first part
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a note: the second case isn't specific to Kubernetes, but it's how Kubernetes mounts those files in the local file system (with two symlinks at different levels). The same would happen on non-Kubernetes environments as well, if people use symlinks on directories pointing to the "current" version of a config map.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
my point is that you want special handing for Remove by re-subscribing, and all events should result in parsing the file. Right now the code ignores several events altogether.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right now the code ignores several events altogether.
?
OK, I think I added the directory watcher to get past Codecov. Could you check if the logic at this commit works? 31ea29e PR coverage is 54%, but that's because we're not writing k8s specific test cases. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, as long as Travis passes.
Signed-off-by: Annanay <annanayagarwal@gmail.com>
Gah, Codecov is at 40%, but the only lines left out are fsnotify errors. |
@jpkrohling @yurishkuro - soft ping, can we move forward with this? The test cases for this will have to be written in the operator repo |
Please address review comments. Only 40% of new code is covered, please bring it above the 95% target. |
@yurishkuro - I've already added one test case. Test cases to cover the remaining lines of code will have to be added in the operator repo. |
Tests in the operator repo do not affect code coverage. Why can’t we unit test? |
Yes, I'm aware. Lines checking for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Testing errors is no different from testing normal code, with the usual technique - using a stub. The fsnotify package can be abstracted with a small interface that could be stubbed in tests.
func (h *strategyStore) loadAndParseStrategies(strategiesFile string) error { | ||
s, err := loadStrategies(strategiesFile) | ||
if err != nil { | ||
h.logger.Warn("using the last saved configuration for sampling strategies.", zap.Error(err)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
h.logger.Warn("using the last saved configuration for sampling strategies.", zap.Error(err)) | |
h.logger.Warn("Unable to reload the sampling strategies file. Using the last saved configuration.", zap.Error(err), zap.String("file", strategiesFile)) |
logger.Info("watching", zap.String("file", options.StrategiesFile)) | ||
} | ||
|
||
dir := filepath.Dir(options.StrategiesFile) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
?
func (h *strategyStore) runWatcherLoop(watcher *fsnotify.Watcher, strategiesFile string) { | ||
for { | ||
select { | ||
case event := <-watcher.Events: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right now the code ignores several events altogether.
?
for { | ||
select { | ||
case event := <-watcher.Events: | ||
if event.Op&fsnotify.Remove == fsnotify.Remove { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
?
instead of monitoring file change, why dont we expose an API which can accept json payload of sampling strategy? |
Checking file periodically is another way to reload sampling strategies, which is simpler and more reliable. Can I create a PR to do this? @yurishkuro |
#2188 implements a simpler, timer-based reload, instead of watching for file changes. I suggest we consider closing this PR - @annanay25 |
Which problem is this PR solving?
Short description of the changes
RWMutex
for concurrent access.