Reproduction of the tadGAN results over NASA telemetry dataset, as given in the corresponding research paper #521
Comments
After completing the training for 35 epochs:
Hi @Chiradipb02 – thanks for opening an issue and using Orion! To run the benchmark on the NASA dataset, you can use our benchmarking script, which automatically loads the necessary hyperparameter settings, i.e.
from orion.benchmark import benchmark, BENCHMARK_DATA
datasets = {
    "MSL": BENCHMARK_DATA["MSL"],
    "SMAP": BENCHMARK_DATA["SMAP"]
}
pipelines = {"tadgan": "tadgan"}
scores = benchmark(pipelines=pipelines, datasets=datasets)
You will need some compute for this to complete in a decent time. You can also find the latest results of the benchmark (which we run every release) in the detailed Google Sheets document, and the summarized results can be browsed in the summary Google Sheets document.
Thank you @sarahmish for your response.
But an error occurs for each of the datasets,
for which the output is:
Is it occurring because some values in the datasets are equal to -1 and 1, or due to the absence of
in tadgan_msl.json and tadgan_smap.json?
The issue can be solved by downgrading sklearn. Please make sure to install the compatible version of sklearn.
Thank you @sarahmish for your help, and sorry for the late reply. The code is running properly and giving the appropriate results, but it takes a really long time. One more question, on a meter-reading time-series dataset:
For 10 epochs on a particular portion of the dataframe of shape (18286, 2) in Orion format
For 25 epochs on the same dataframe
For 5 epochs on the whole dataset of shape (75224, 2)
For the other two, the losses did not cross -70. But the anomalous part is mainly the lower flat part. I have tried the fixed threshold parameter (True and False) with a similar number of epochs, but observed no improvement. What parameter values can be used here in general?
The loss is unbounded for the critic; therefore, it makes sense to see variance between one time series and another. If you'd like, you can set
To reduce/extend the range of the detected anomalies, there is a hyperparameter on the find_anomalies primitive:
hyperparameters = {
    'orion.primitives.timeseries_anomalies.find_anomalies#1': {
        'anomaly_padding': 0  # set to 50 by default
    }
}
For more information, visit the primitive page for find_anomalies. If all your anomalies look like the flat part of the signal, I think there are simpler algorithms that you can try and that are faster than tadgan.
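To make the effect of anomaly_padding concrete, here is a minimal sketch of the idea in plain NumPy. It is an illustration only, not Orion's implementation: the helpers pad_anomalies and to_intervals are hypothetical names, and the sketch assumes that padding simply marks a fixed number of points on each side of every flagged index as anomalous.

```python
import numpy as np


def pad_anomalies(above, padding):
    """Mark `padding` points on each side of every flagged index as anomalous.

    Hypothetical helper for illustration; not part of Orion.
    """
    above = np.asarray(above, dtype=bool)
    padded = above.copy()
    for idx in np.flatnonzero(above):
        start = max(0, idx - padding)
        stop = min(len(above), idx + padding + 1)
        padded[start:stop] = True
    return padded


def to_intervals(mask):
    """Convert a boolean mask into inclusive (start, end) index pairs."""
    mask = np.asarray(mask, dtype=bool)
    # Pad with zeros so that runs touching either end are detected too.
    edges = np.diff(np.concatenate(([0], mask.astype(int), [0])))
    starts = np.flatnonzero(edges == 1)
    ends = np.flatnonzero(edges == -1) - 1
    return list(zip(starts.tolist(), ends.tolist()))


# Toy example: three points exceed the error threshold.
flags = np.zeros(20, dtype=bool)
flags[[5, 6, 12]] = True

no_padding = to_intervals(pad_anomalies(flags, 0))
with_padding = to_intervals(pad_anomalies(flags, 2))
print(no_padding)    # [(5, 6), (12, 12)]
print(with_padding)  # [(3, 8), (10, 14)]
```

With a padding of 0 the reported intervals cover only the flagged points, while a larger padding widens each interval and can merge neighbouring ones, which is why lowering the value narrows the detected ranges.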
Thank you @sarahmish for your reply. I am using the AER model for anomaly detection. So far its performance is much faster than tadgan and it gives better results. But while trying the approach given in Tulog for tadgan, to see how the primitives work, defining the AER model requires the parameters--- and some hyperparameters. What layer architectures can I pass as parameters to build the model? To get the intermediate outputs, is there any option like setting visualization=True in the detect() method, as there was for tadgan in Tulog?
Can you please tell me what the functionalities of the hyperparameters are?
Description
I am trying to reproduce the results of the TadGAN model proposed in the paper 'TadGAN: Time Series Anomaly Detection Using Generative Adversarial Networks' and perform benchmarking. To reproduce the results efficiently for the SMAP and MSL spacecraft data, what hyperparameter values should I use? Or, if the trained model weights are available, where can I find them and how can I use them?
What I Did
I am currently using the hyperparameters given in the tadgan_smap.json file and the tadgan pipeline. But training for even 35 epochs is quite time-consuming and expensive on Colab.
Using the tadgan pipeline:
two of the losses are diverging
Using the tadgan.json pipeline:
the runtime gets disconnected partway through
Other Approach
There is a txt file in the NASA dataset zip linked in the paper. That txt file contains some model parameters as well. Also, in the models folder there are .h5 files for each dataset file.
I tried to load one of them into the tadgan model, after preprocessing the data as given in Tulog.ipynb.
There was a dimension error, as the required shape appeared to be (None, None, 25), so I had to reshape the data.
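For reference, a shape like (None, None, 25) usually reads as (batch, timesteps, features), so a 2D array of readings has to be cut into windows before it can be fed to such a model. The sketch below shows one way to do that with NumPy. It is an assumption-laden illustration: make_windows is a hypothetical helper, the window size and step are arbitrary, and the 25 features are taken only from the error message above.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def make_windows(values, window_size, step=1):
    """Turn a (timesteps, features) array into (n_windows, window_size, features).

    Hypothetical helper for illustration; window_size and step are arbitrary.
    """
    values = np.asarray(values)
    if values.ndim != 2:
        raise ValueError("expected a 2D array of shape (timesteps, features)")
    # sliding_window_view appends the window axis last: (n, features, window).
    windows = sliding_window_view(values, window_size, axis=0)
    # Reorder to (n, window, features) and keep every `step`-th window.
    return windows.transpose(0, 2, 1)[::step]


# Synthetic stand-in for a telemetry channel with 25 features.
telemetry = np.arange(300 * 25, dtype=float).reshape(300, 25)
batch = make_windows(telemetry, window_size=100, step=10)
print(batch.shape)  # (21, 100, 25)
```

Each window keeps the rows in time order, so the first row of window k is row k * step of the original array. Whether the windowed input is meaningful still depends on what the saved model was trained to do.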
The reconstruction from the trained model:
What can I do?
notebook link: https://colab.research.google.com/drive/1zahCbCImRuL2_Hc-ms1WSZl7oUyP32Q3?usp=sharing