Demo Data no longer accessible #415
Comments
Same with me.
Thank you for noting this issue. We're working on getting things back up and running again. Thanks for your patience.
Can you please let us know if there is any alternative in case it is not fixed?
@ushasai a fix is proposed in PR #418. Please use the following URL to download the data: https://sintel-orion.s3.amazonaws.com/
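For anyone who wants to pull a signal from the new bucket programmatically, here is a minimal sketch; the `S-1.csv` object name and layout are assumptions, so check the bucket index for the actual file names:

```python
# Minimal sketch: read one demo signal straight from the new S3 bucket.
# The "S-1.csv" object name is an assumption -- check the bucket index
# for the actual file names.
import pandas as pd

BUCKET_URL = "https://sintel-orion.s3.amazonaws.com"
signal_name = "S-1"  # example signal name

# pandas can read a CSV directly over HTTP(S)
data = pd.read_csv(f"{BUCKET_URL}/{signal_name}.csv")
print(data.head())
```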
@sarahmish is there a way to access the full list of datasets available (like an index)? In the past we could access it via https://d3-ai-orion.s3.amazonaws.com/index.html, but as of today the following message is shown:
On top of this, all of the datasets (either complete, train or test splits) come unlabelled. Does …
I just added an index to the s3 bucket. Yes, …
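For a programmatic view of the same index, a public S3 bucket answers a plain GET on its root with an XML listing; a rough sketch, assuming public listing is enabled on the bucket (the index.html page remains the simpler option for browsing by hand):

```python
# Rough sketch: list the objects in the bucket programmatically,
# assuming public listing is enabled on the bucket.
import requests
import xml.etree.ElementTree as ET

BUCKET_URL = "https://sintel-orion.s3.amazonaws.com"
NS = {"s3": "http://s3.amazonaws.com/doc/2006-03-01/"}

# A publicly listable S3 bucket returns a ListBucketResult XML document
# on GET of its root (up to 1000 keys per page).
response = requests.get(BUCKET_URL)
response.raise_for_status()

root = ET.fromstring(response.content)
keys = [key.text for key in root.findall(".//s3:Contents/s3:Key", NS)]
print(f"{len(keys)} objects found")
print(keys[:10])
```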
Thank you @sarahmish. One thing that is not clear to me: when do you use the train/test datasets versus the complete one? In the benchmarking results there's a field named "split" which I assume is related to this. Do the results shown refer to models fitted on the train dataset and later detecting anomalies on the test dataset? If so, how do you adjust this experimental setup for the cases where there is no train/test split, like in the Yahoo datasets? Sorry if this goes outside the scope of the original question.
@nunobv not at all! In terms of "split", sometimes data are divided into training/testing in advance (for example, signals in MSL and SMAP have a prior split). In order to have results comparable to other models, we use the same training/testing split. Yahoo datasets are not split, so the pipelines are applied to the entire signal. I hope this answers your question!
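To make the setup concrete, here is a minimal sketch of the split workflow using Orion's fit/detect API; the pipeline name and the "-train"/"-test" signal name suffixes are assumptions for illustration, not necessarily the exact names in the bucket:

```python
# Minimal sketch of the split workflow with Orion's fit/detect API.
# The pipeline name and the "-train"/"-test" suffixes are assumptions.
from orion import Orion
from orion.data import load_signal

train_data = load_signal('S-1-train')   # assumed name of the training split
test_data = load_signal('S-1-test')     # assumed name of the testing split

orion = Orion(pipeline='lstm_dynamic_threshold')

# Fit on the (anomaly-free) training split, then detect on the test split.
orion.fit(train_data)
anomalies = orion.detect(test_data)
print(anomalies)
```

For a dataset without a prior split (e.g. Yahoo), the same full signal would simply be used in both calls.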
Thanks @sarahmish. I've only taken a look at the first 7/8 signal splits in the SMAP dataset, but as far as I can tell the training splits do not contain any anomalies. As the same training strategy is being used for every single pipeline, I guess the effect is transversal and the current benchmarking can still be used, even if only in a "relative" fashion. What's your take on this? (In the meantime, I'll check whether the training splits have any anomalies for every other signal.)
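A quick sketch of that check: does any known anomaly interval overlap the time range of a training split? The `anomalies.csv` label file (columns: signal, start, end) and the split file name are hypothetical placeholders here; adapt them to wherever the ground-truth labels actually live.

```python
# Quick sketch: check whether any labelled anomaly interval overlaps the
# time range of a training split. File names and columns are hypothetical.
import pandas as pd

labels = pd.read_csv("anomalies.csv")    # hypothetical label file
train = pd.read_csv("S-1-train.csv")     # hypothetical training split

start, end = train["timestamp"].min(), train["timestamp"].max()
signal_labels = labels[labels["signal"] == "S-1"]

# An interval overlaps the split if it starts before the split ends
# and ends after the split starts.
overlapping = signal_labels[(signal_labels["start"] <= end) &
                            (signal_labels["end"] >= start)]
print(f"{len(overlapping)} labelled anomalies fall inside the training split")
```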
@nunobv yes, in the SMAP and MSL datasets the anomalies are only present in the test split. I agree that there is a level of supervision, since we have prior knowledge that the training split does not contain any anomalies, only "normal" observations. I want to emphasize a couple of points about the benchmark:
When investigating the benchmark results, you'll notice that pipelines have high F1 scores on the Yahoo and NAB datasets too, indicating that even without a split, the pipelines are able to find anomalies. Let me know if you have any further questions!
100% clear. Thank you for your (usual) diligence!
Description
Trying to import a signal from any of the example notebooks fails.
What I Did
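The original traceback is not preserved here; the call below is a sketch of the kind of import from the example notebooks that was failing while the demo data was unreachable, with `S-1` used only as an example signal name.

```python
# Sketch of the failing call from the example notebooks; "S-1" is just an
# example signal name and the original error output is not reproduced here.
from orion.data import load_signal

signal = load_signal('S-1')
signal.head()
```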