Different F1 score for same signal for different time_segments_aggregate interval #511
Comments
Hi @Sid-030591, thank you for using Orion! Please refer to the documentation of `time_segments_aggregate` to see how the aggregation is made. As for the performance, can you provide a snippet of what the input & output look like? How many intervals did the pipeline detect?
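For intuition, the aggregation step is similar in spirit to a pandas resample. Here is a minimal sketch; the helper below is hypothetical, not Orion's actual implementation, and it assumes the usual `timestamp`/`value` input columns and the default `method='mean'`:

```python
import pandas as pd

# Hypothetical helper (not Orion code): approximates what
# time_segments_aggregate does with method='mean'.
def aggregate_segments(df, interval):
    out = df.copy()
    out['timestamp'] = pd.to_datetime(out['timestamp'], unit='s')
    agg = (
        out.set_index('timestamp')['value']
           .resample(f'{interval}s')   # fixed-width buckets of `interval` seconds
           .mean()                     # one averaged value per bucket
           .reset_index()
    )
    agg['timestamp'] = agg['timestamp'].astype('int64') // 10**9  # back to UNIX seconds
    return agg

# With interval=21600 on an hourly signal, six points collapse into one
# averaged point, so the model sees a 6x shorter and smoother sequence.
```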
Hello @sarahmish, thank you for your response.
Thanks for the description @Sid-030591! Your reported results make sense: when the threshold is fixed, we always capture the same extreme values (4 standard deviations away from the mean) and therefore obtain the same result. However, when the threshold is dynamic, it is derived from the error values themselves, so the captured values (and therefore the score) can change. I hope that this answers your question!
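The fixed-threshold rule can be sketched in a few lines of numpy. This is a simplified illustration of the "4 standard deviations from the mean" criterion described above, not Orion's `find_anomalies` primitive itself (whose exact parameters may differ across versions):

```python
import numpy as np

def fixed_threshold_anomalies(errors, k=4):
    """Flag indices whose error is more than k standard deviations
    from the mean error -- the fixed-threshold rule described above."""
    mu, sigma = errors.mean(), errors.std()
    return np.flatnonzero(np.abs(errors - mu) > k * sigma)

errors = np.random.default_rng(0).normal(size=1000)
errors[[100, 500]] += 10                   # inject two obvious anomalies
print(fixed_threshold_anomalies(errors))   # should flag indices 100 and 500
```

Because this rule depends only on the error array, identical errors always yield identical detections, which is why a fixed threshold gives reproducible results.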
Thank you @sarahmish for the answer.
Points 3 and 4 are not actual follow-ups on the initial topic, but rather than opening a new issue, I thought of writing them in this one; I hope that is OK with you. I have been doing some work in the AD (anomaly detection) area and thus have all these questions.
@sarahmish I think I found the answers to the 3rd and 4th questions. So, 0.2 is the validation split that you use (3rd question). Also, we can train the pipeline on multiple columns, but anomalies will be found on a univariate signal (a single target column has to be provided). Please confirm whether this is the correct understanding. Also, I would appreciate it if you could answer the 1st and 2nd questions.
Hi @Sid-030591, I was referring to the randomness of the model, and thereafter of the error values.
I would first recommend referring to issue #375, which discusses some of this in detail. Let me know if you have other questions!
Hello @sarahmish, thank you for your response. I understand the point regarding randomness in the model and in the post-processing step. I would like to know one more thing. Say you are doing some performance benchmarking/comparison, e.g., the AER model with a reg ratio of 0.5 (the default) versus 0.3 or 0.4. We will get different F1 values. How should we determine what part of this difference comes from the inherent randomness and what part from the actual change (the reg ratio value), especially when the values are not very different? One way could be to run many simulations and take an average to arrive at a better estimate. What's your understanding of this?
Depending on the variable you are changing, you can attribute this change accordingly. In practice, however, running the same model will yield close but not identical results. To reduce the variability, I recommend running the model for n iterations and checking for consistency in the results.
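To make that concrete, here is a sketch of running a configuration n times and summarizing the contextual F1 score. The helper name is made up here; the `Orion(pipeline='aer', ...)`, `fit_detect`, and `orion.evaluation.contextual_f1_score` entry points are assumed to behave as in recent Orion releases:

```python
import numpy as np
from orion import Orion
from orion.evaluation import contextual_f1_score

def mean_f1(data, known_anomalies, hyperparameters, n=10):
    """Fit/detect n times and report mean and std of the contextual F1,
    so the spread from random initialization is visible alongside the
    effect of a hyperparameter change (e.g. the AER reg ratio)."""
    scores = []
    for _ in range(n):
        orion = Orion(pipeline='aer', hyperparameters=hyperparameters)
        detected = orion.fit_detect(data)
        scores.append(contextual_f1_score(known_anomalies, detected, data=data))
    return np.mean(scores), np.std(scores)

# Compare, say, two reg ratios: if the difference between the two means
# is small relative to the stds, it is likely within run-to-run noise.
```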
Description
I am using the AER pipeline to detect anomalies on a synthetic dataset that I created. The dataset follows MA(1) characteristics with 7 anomalies added at random instants. The timestamp is sampled at 1 hour (3600 seconds). When I run this with `time_segments_aggregate` at a 3600-second interval, only 1 out of 7 anomalies is detected and it takes around 15 minutes. In contrast, when I run the same dataset with `time_segments_aggregate` at a 21600-second interval, all 7 anomalies are detected in around 3 minutes. Could you please explain how the interval value actually impacts the F1 score? I can understand its impact on the time taken.
What I Did
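A minimal sketch of how such a comparison can be set up with Orion. The synthetic-data generation and the primitive key below are assumptions for illustration (the key's namespace varies across Orion versions), not the exact code used:

```python
import numpy as np
import pandas as pd
from orion import Orion

# Synthetic stand-in for the dataset described above: an MA(1) series
# sampled hourly, with a few large spikes injected as anomalies.
rng = np.random.default_rng(42)
e = rng.normal(size=5000)
values = e[1:] + 0.7 * e[:-1]                  # MA(1) with theta = 0.7
values[[500, 1200, 2600, 3300, 4100]] += 8     # injected anomalies
data = pd.DataFrame({
    'timestamp': np.arange(len(values)) * 3600,  # hourly UNIX timestamps
    'value': values,
})

# Assumed primitive key; check your pipeline JSON for the exact name.
TSA = 'mlstars.custom.timeseries_preprocessing.time_segments_aggregate#1'

for interval in (3600, 21600):
    orion = Orion(pipeline='aer', hyperparameters={TSA: {'interval': interval}})
    anomalies = orion.fit_detect(data)
    print(f'interval={interval}: {len(anomalies)} interval(s) detected')
```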