Commit

Migrate to ml-stars (#418)
* update readme

* change mlprimitives to mlstars

* change s3 bucket

* fix mlstars import

* add tqdm to requirements

* fix imports

* change tutorials to mlstars

* update ml-stars

* fix order

* update mlstars
sarahmish authored May 19, 2023
1 parent a55f17b commit 601b1d4
Showing 141 changed files with 293 additions and 287 deletions.
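
For code that depends on Orion, the visible part of this migration is a rename: `from mlprimitives import load_primitive` becomes `from mlstars import load_primitive`, and primitive names that start with `mlprimitives.` now start with `mlstars.`. Below is a minimal sketch of updating an existing hyperparameters dictionary under that assumption. The helper `migrate_primitive_names` is hypothetical (it is not part of Orion or ml-stars), it only handles the prefix rename visible in this diff, and the `epochs` value is illustrative.

```python
OLD_PREFIX = "mlprimitives."
NEW_PREFIX = "mlstars."


def migrate_primitive_names(hyperparameters):
    """Return a copy of the dict with old-prefixed primitive names renamed.

    Hypothetical helper for illustration; keys without the old prefix
    (for example keras.* or orion.* primitives) are left unchanged.
    """
    migrated = {}
    for name, params in hyperparameters.items():
        if name.startswith(OLD_PREFIX):
            name = NEW_PREFIX + name[len(OLD_PREFIX):]
        migrated[name] = dict(params)  # shallow copy so the input is not mutated
    return migrated


old = {
    "mlprimitives.custom.timeseries_preprocessing.time_segments_aggregate#1": {
        "interval": 300
    },
    "keras.Sequential.LSTMTimeSeriesRegressor#1": {
        "epochs": 5
    },
}
new = migrate_primitive_names(old)
for key in new:
    print(key)
```

The `#1` suffix is part of the key and is carried over untouched, so the renamed entries still address the same step in the pipeline.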
65 changes: 39 additions & 26 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -18,22 +18,20 @@

A machine learning library for unsupervised time series anomaly detection.

| Important Links | |
| ----------------------------------- | -------------------------------------------------------------------- |
| :computer: **[Website]** | Check out the Sintel Website for more information about the project. |
| :book: **[Documentation]** | Quickstarts, User and Development Guides, and API Reference. |
| :star: **[Tutorials]** | Checkout our notebooks |
| :octocat: **[Repository]** | The link to the Github Repository of this library. |
| :scroll: **[License]** | The repository is published under the MIT License. |
| :keyboard: **[Development Status]** | This software is in its Pre-Alpha stage. |
| Important Links | |
| --------------------------------------------- | -------------------------------------------------------------------- |
| :computer: **[Website]** | Check out the Sintel Website for more information about the project. |
| :book: **[Documentation]** | Quickstarts, User and Development Guides, and API Reference. |
| :star: **[Tutorials]** | Checkout our notebooks |
| :octocat: **[Repository]** | The link to the Github Repository of this library. |
| :scroll: **[License]** | The repository is published under the MIT License. |
| [![][Slack Logo] **Community**][Community] | Join our Slack Workspace for announcements and discussions. |

[Website]: https://sintel.dev/
[Documentation]: https://sintel-dev.github.io/Orion
[Tutorials]: https://github.com/sintel-dev/Orion/tree/master/tutorials
[Repository]: https://github.com/sintel-dev/Orion
[License]: https://github.com/sintel-dev/Orion/blob/master/LICENSE
[Development Status]: https://pypi.org/search/?c=Development+Status+%3A%3A+2+-+Pre-Alpha
[Community]: https://join.slack.com/t/sintel-space/shared_invite/zt-q147oimb-4HcphcxPfDAM0O9_4PaUtw
[Slack Logo]: https://github.com/sintel-dev/Orion/blob/master/docs/images/slack.png

Expand Down Expand Up @@ -87,20 +85,20 @@ which should show a signal with `timestamp` and `value`.
4 1222905600 -0.370746
```

In this example we use `lstm_dynamic_threshold` pipeline and set some hyperparameters (in this case training epochs as 5).
In this example we use `aer` pipeline and set some hyperparameters (in this case training epochs as 5).

```python3
from orion import Orion

hyperparameters = {
'keras.Sequential.LSTMTimeSeriesRegressor#1': {
'orion.primitives.aer.AER#1': {
'epochs': 5,
'verbose': True
}
}

orion = Orion(
pipeline='lstm_dynamic_threshold',
pipeline='aer',
hyperparameters=hyperparameters
)

Expand Down Expand Up @@ -136,8 +134,8 @@ We run the benchmark on **11** datasets with their known grounth truth. We recor
| LSTM Autoencoder | 6 |
| Dense Autoencoder | 6 |
| VAE | 7 |
| GANF | 6 |
| Azure | 0 |
| [GANF](https://arxiv.org/pdf/2202.07857.pdf) | 6 |
| [Azure](https://azure.microsoft.com/en-us/products/cognitive-services/anomaly-detector/) | 0 |


You can find the scores of each pipeline on every signal recorded in the [details Google Sheets document](https://docs.google.com/spreadsheets/d/1HaYDjY-BEXEObbi65fwG0om5d8kbRarhpK4mvOZVmqU/edit?usp=sharing). The summarized results can also be browsed in the following [summary Google Sheets document](https://docs.google.com/spreadsheets/d/1ZPUwYH8LhDovVeuJhKYGXYny7472HXVCzhX6D6PObmg/edit?usp=sharing).
Expand All @@ -151,24 +149,22 @@ Additional resources that might be of interest:

# Citation

If you use **Orion** which is part of the **Sintel** ecosystem for your research, please consider citing the following paper:
If you use **AER** for your research, please consider citing the following paper:

Lawrence Wong, Dongyu Liu, Laure Berti-Equille, Sarah Alnegheimish, Kalyan Veeramachaneni. [AER: Auto-Encoder with Regression for Time Series Anomaly Detection](https://arxiv.org/pdf/2212.13558.pdf).

Sarah Alnegheimish, Dongyu Liu, Carles Sala, Laure Berti-Equille, Kalyan Veeramachaneni. [Sintel: A Machine Learning Framework to Extract Insights from Signals](https://dl.acm.org/doi/pdf/10.1145/3514221.3517910).
```
@inproceedings{alnegheimish2022sintel,
title={Sintel: A Machine Learning Framework to Extract Insights from Signals},
author={Alnegheimish, Sarah and Liu, Dongyu and Sala, Carles and Berti-Equille, Laure and Veeramachaneni, Kalyan},
booktitle={Proceedings of the 2022 International Conference on Management of Data},
pages = {1855–1865},
numpages = {11},
publisher={Association for Computing Machinery},
doi = {10.1145/3514221.3517910},
series = {SIGMOD '22},
@inproceedings{wong2022aer,
title={AER: Auto-Encoder with Regression for Time Series Anomaly Detection},
author={Wong, Lawrence and Liu, Dongyu and Berti-Equille, Laure and Alnegheimish, Sarah and Veeramachaneni, Kalyan},
booktitle={2022 IEEE International Conference on Big Data (IEEE BigData)},
pages={1152-1161},
doi={10.1109/BigData55660.2022.10020857},
organization={IEEE},
year={2022}
}
```


If you use **TadGAN** for your research, please consider citing the following paper:

Alexander Geiger, Dongyu Liu, Sarah Alnegheimish, Alfredo Cuesta-Infante, Kalyan Veeramachaneni. [TadGAN - Time Series Anomaly Detection Using Generative Adversarial Networks](https://arxiv.org/pdf/2009.07769v3.pdf).
Expand All @@ -184,3 +180,20 @@ Alexander Geiger, Dongyu Liu, Sarah Alnegheimish, Alfredo Cuesta-Infante, Kalyan
year={2020}
}
```

If you use **Orion** which is part of the **Sintel** ecosystem for your research, please consider citing the following paper:

Sarah Alnegheimish, Dongyu Liu, Carles Sala, Laure Berti-Equille, Kalyan Veeramachaneni. [Sintel: A Machine Learning Framework to Extract Insights from Signals](https://dl.acm.org/doi/pdf/10.1145/3514221.3517910).
```
@inproceedings{alnegheimish2022sintel,
title={Sintel: A Machine Learning Framework to Extract Insights from Signals},
author={Alnegheimish, Sarah and Liu, Dongyu and Sala, Carles and Berti-Equille, Laure and Veeramachaneni, Kalyan},
booktitle={Proceedings of the 2022 International Conference on Management of Data},
pages={1855–1865},
numpages={11},
publisher={Association for Computing Machinery},
doi={10.1145/3514221.3517910},
series={SIGMOD '22},
year={2022}
}
```
4 changes: 2 additions & 2 deletions docs/user_guides/primitives_pipelines/pipelines.rst
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,7 @@ The main component in the Orion project are the **Orion Pipelines**, which consi

As ``MLPipeline`` instances, **Orion Pipelines**:

* consist of a list of one or more `MLPrimitives <https://mlbazaar.github.io/MLPrimitives/>`__
* consist of a list of one or more `mlstars <https://sintel-dev.github.io/ml-stars/>`__
* can be *fitted* on some data and later on used to *predict* anomalies on more data
* can be *scored* by comparing their predictions with some known anomalies
* have *hyperparameters* that can be *tuned* to improve their anomaly detection performance
Expand Down Expand Up @@ -153,7 +153,7 @@ Since pipelines are composed of :ref:`primitives`, you can discover the interpre
"value": np.random.randint(0, 10, 500)})
hyperparameters = {
"mlprimitives.custom.timeseries_preprocessing.time_segments_aggregate#1": {
"mlstars.custom.timeseries_preprocessing.time_segments_aggregate#1": {
"interval": 300
},
'keras.Sequential.LSTMTimeSeriesRegressor#1': {
Expand Down
2 changes: 1 addition & 1 deletion docs/user_guides/primitives_pipelines/primitives.rst
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,7 @@
Primitives
==========

Primitives are data processing units. They are defined by the code that performs the actual processing and an annotated ``json`` file. To read more about primitives and their composition, visit `MLPrimitives <https://mlbazaar.github.io/MLPrimitives/>`__.
Primitives are data processing units. They are defined by the code that performs the actual processing and an annotated ``json`` file. To read more about primitives and their composition, visit `mlstars <https://sintel-dev.github.io/ml-stars/>`__.

Preprocessing
-------------
Expand Down
2 changes: 1 addition & 1 deletion docs/user_guides/primitives_pipelines/primitives/AER.rst
Original file line number Diff line number Diff line change
Expand Up @@ -40,7 +40,7 @@ argument type description
:okwarning:
import numpy as np
from mlprimitives import load_primitive
from mlstars import load_primitive
X = np.ones((64, 100, 1))
y = X[:,:, [0]] # signal to reconstruct from X (channel 0)
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -47,7 +47,7 @@ argument type description
:okwarning:
import numpy as np
from mlprimitives import load_primitive
from mlstars import load_primitive
X = np.array([1] * 100).reshape(1, -1, 1)
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -49,7 +49,7 @@ argument type description
:okwarning:
import numpy as np
from mlprimitives import load_primitive
from mlstars import load_primitive
X = np.array([1] * 100).reshape(1, -1, 1)
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -7,7 +7,7 @@ LSTM

**description**: this is a prediction model with double stacked LSTM layers used as a time series regressor. you can read more about it in the `related paper <https://arxiv.org/pdf/1802.04431.pdf>`__.

see `json <https://github.com/MLBazaar/MLPrimitives/blob/master/mlprimitives/primitives/keras.Sequential.LSTMTimeSeriesRegressor.json>`__.
see `json <https://github.com/MLBazaar/mlstars/blob/master/mlstars/primitives/keras.Sequential.LSTMTimeSeriesRegressor.json>`__.

====================== =================== ===========================================================================================================================================
argument type description
Expand Down Expand Up @@ -48,7 +48,7 @@ argument type description
:okwarning:
import numpy as np
from mlprimitives import load_primitive
from mlstars import load_primitive
X = np.array([1] * 100).reshape(1, -1, 1)
y = np.array([[1]])
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -7,7 +7,7 @@ MinMaxScaler

**description**: this primitive transforms features by scaling each feature to a given range.

see `json <https://github.com/MLBazaar/MLPrimitives/blob/master/mlprimitives/primitives/sklearn.preprocessing.MinMaxScaler.json>`__.
see `json <https://github.com/MLBazaar/mlstars/blob/master/mlstars/primitives/sklearn.preprocessing.MinMaxScaler.json>`__.

==================== =================== =============================================================================================================
argument type description
Expand All @@ -33,7 +33,7 @@ argument type description
:okwarning:
import numpy as np
from mlprimitives import load_primitive
from mlstars import load_primitive
X = np.array(range(5)).reshape(-1, 1)
primitive = load_primitive('sklearn.preprocessing.MinMaxScaler',
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -7,7 +7,7 @@ SimpleImputer

**description**: this primitive is an imputation transformer for filling missing values.

see `json <https://github.com/MLBazaar/MLPrimitives/blob/master/mlprimitives/primitives/sklearn.impute.SimpleImputer.json>`__.
see `json <https://github.com/MLBazaar/mlstars/blob/master/mlstars/primitives/sklearn.impute.SimpleImputer.json>`__.

==================== ========================================================= ==========================================
argument type description
Expand Down Expand Up @@ -35,7 +35,7 @@ argument type
:okwarning:
import numpy as np
from mlprimitives import load_primitive
from mlstars import load_primitive
X = np.array([1] * 4 + [np.nan]).reshape(-1, 1)
primitive = load_primitive('sklearn.impute.SimpleImputer',
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -44,7 +44,7 @@ argument type description
:okwarning:
import numpy as np
from mlprimitives import load_primitive
from mlstars import load_primitive
X = np.array([1] * 100).reshape(1, -1, 1)
y = X[:,:, [0]] # signal to reconstruct from X (channel 0)
Expand Down
2 changes: 1 addition & 1 deletion docs/user_guides/primitives_pipelines/primitives/VAE.rst
Original file line number Diff line number Diff line change
Expand Up @@ -46,7 +46,7 @@ argument type description
:okwarning:
import numpy as np
from mlprimitives import load_primitive
from mlstars import load_primitive
X = np.array([1] * 100).reshape(1, -1, 1)
Expand Down
4 changes: 2 additions & 2 deletions docs/user_guides/primitives_pipelines/primitives/arima.rst
Original file line number Diff line number Diff line change
Expand Up @@ -7,7 +7,7 @@ ARIMA

**description**: this is an Autoregressive Integrated Moving Average (ARIMA) prediction model.

see `json <https://github.com/MLBazaar/MLPrimitives/blob/master/mlprimitives/primitives/statsmodels.tsa.arima_model.Arima.json>`__.
see `json <https://github.com/MLBazaar/mlstars/blob/master/mlstars/primitives/statsmodels.tsa.arima_model.Arima.json>`__.

==================== =================== ==================================================================
argument type description
Expand Down Expand Up @@ -35,7 +35,7 @@ argument type description
:okwarning:
import numpy as np
from mlprimitives import load_primitive
from mlstars import load_primitive
X = np.array(range(100)).reshape(-1, 1)
primitive = load_primitive('statsmodels.tsa.arima_model.Arima',
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -35,7 +35,7 @@ argument type
:okwarning:
import numpy as np
from mlprimitives import load_primitive
from mlstars import load_primitive
X = np.array([1] * 4 + [np.nan]).reshape(-1, 1)
primitive = load_primitive('orion.primitives.timeseries_preprocessing.fillna',
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -39,7 +39,7 @@ argument type description
:okwarning:
import numpy as np
from mlprimitives import load_primitive
from mlstars import load_primitive
primitive = load_primitive('orion.primitives.timeseries_anomalies.find_anomalies',
arguments={"anomaly_padding": 1})
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -3,11 +3,11 @@
intervals to mask
~~~~~~~~~~~~~~~~~

**path**: ``mlprimitives.custom.timeseries_preprocessing.intervals_to_mask``
**path**: ``mlstars.custom.timeseries_preprocessing.intervals_to_mask``

**description**: this primitive creates boolean mask from given intervals.

see `json <https://github.com/MLBazaar/MLPrimitives/blob/master/mlprimitives/primitives/mlprimitives.custom.timeseries_preprocessing.intervals_to_mask.json>`__.
see `json <https://github.com/MLBazaar/mlstars/blob/master/mlstars/primitives/mlstars.custom.timeseries_preprocessing.intervals_to_mask.json>`__.

==================== =============================== =================================================================================================================================
argument type description
Expand All @@ -28,9 +28,9 @@ argument type description
:okwarning:
import numpy as np
from mlprimitives import load_primitive
from mlstars import load_primitive
primitive = load_primitive('mlprimitives.custom.timeseries_preprocessing.intervals_to_mask')
primitive = load_primitive('mlstars.custom.timeseries_preprocessing.intervals_to_mask')
index = np.array(range(10))
intervals = [(1, 3), (7, 7)]
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -38,7 +38,7 @@ argument type description
:okwarning:
import numpy as np
from mlprimitives import load_primitive
from mlstars import load_primitive
primitive = load_primitive('orion.primitives.timeseries_errors.reconstruction_errors')
y = np.array([[1]] * 100)
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -37,7 +37,7 @@ argument type description
:okwarning:
import numpy as np
from mlprimitives import load_primitive
from mlstars import load_primitive
primitive = load_primitive('orion.primitives.timeseries_errors.regression_errors')
y = np.array([[1]] * 100)
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -3,11 +3,11 @@
rolling window sequence
~~~~~~~~~~~~~~~~~~~~~~~

**path**: ``mlprimitives.custom.timeseries_preprocessing.rolling_window_sequences``
**path**: ``mlstars.custom.timeseries_preprocessing.rolling_window_sequences``

**description**: this primitive generates many sub-sequences of the original sequence. it uses a rolling window approach to create the sub-sequences out of time series data.

see `json <https://github.com/MLBazaar/MLPrimitives/blob/master/mlprimitives/primitives/mlprimitives.custom.timeseries_preprocessing.rolling_window_sequences.json>`__.
see `json <https://github.com/MLBazaar/mlstars/blob/master/mlstars/primitives/mlstars.custom.timeseries_preprocessing.rolling_window_sequences.json>`__.

==================== ============================================================== ==================================================================
argument type description
Expand Down Expand Up @@ -41,9 +41,9 @@ see `json <https://github.com/MLBazaar/MLPrimitives/blob/master/mlprimitives/pri
:okwarning:
import numpy as np
from mlprimitives import load_primitive
from mlstars import load_primitive
primitive = load_primitive('mlprimitives.custom.timeseries_preprocessing.rolling_window_sequences',
primitive = load_primitive('mlstars.custom.timeseries_preprocessing.rolling_window_sequences',
arguments={"window_size": 10, "target_size": 1, "step_size": 1, "target_column": 0})
X = np.array([1] * 50).reshape(-1, 1)
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -44,7 +44,7 @@ argument type description
:okwarning:
import numpy as np
from mlprimitives import load_primitive
from mlstars import load_primitive
primitive = load_primitive('orion.primitives.tadgan.score_anomalies',
arguments={"error_smooth_window": 10, "critic_smooth_window": 10,
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -3,11 +3,11 @@
time segments aggregate
~~~~~~~~~~~~~~~~~~~~~~~

**path**: ``mlprimitives.custom.timeseries_preprocessing.time_segments_aggregate``
**path**: ``mlstars.custom.timeseries_preprocessing.time_segments_aggregate``

**description**: this primitive creates an equi-spaced time series by aggregating values over fixed specified interval.

see `json <https://github.com/MLBazaar/MLPrimitives/blob/master/mlprimitives/primitives/mlprimitives.custom.timeseries_preprocessing.time_segments_aggregate.json>`__.
see `json <https://github.com/MLBazaar/mlstars/blob/master/mlstars/primitives/mlstars.custom.timeseries_preprocessing.time_segments_aggregate.json>`__.

==================== =========================================== =============================================================================================================================
argument type description
Expand All @@ -28,9 +28,9 @@ argument type description
.. ipython:: python
:okwarning:
from mlprimitives import load_primitive
from mlstars import load_primitive
primitive = load_primitive('mlprimitives.custom.timeseries_preprocessing.time_segments_aggregate',
primitive = load_primitive('mlstars.custom.timeseries_preprocessing.time_segments_aggregate',
arguments={"time_column": "timestamp", "interval":10, "method":'mean'})
df = pd.DataFrame({
Expand Down
2 changes: 1 addition & 1 deletion orion/benchmark.py
Original file line number Diff line number Diff line change
Expand Up @@ -28,7 +28,7 @@

LOGGER = logging.getLogger(__name__)

BUCKET = 'd3-ai-orion'
BUCKET = 'sintel-orion'
S3_URL = 'https://{}.s3.amazonaws.com/{}'

BENCHMARK_PATH = os.path.join(os.path.join(
Expand Down
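
The bucket rename only changes the host part of any URL built from the `S3_URL` template. As a rough illustration, formatting the template with the new bucket name looks like this. The object key `example.csv` is a placeholder rather than a file known to exist in the bucket, and how `benchmark.py` uses these constants is not shown in this hunk.

```python
BUCKET = 'sintel-orion'
S3_URL = 'https://{}.s3.amazonaws.com/{}'

# 'example.csv' is a hypothetical object key used only for illustration.
url = S3_URL.format(BUCKET, 'example.csv')
print(url)
```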
