Commit 57d1267

make release-tag: Merge branch 'master' into stable

2 parents 76e7b73 + 49d69e6

63 files changed: +10358 −7585 lines

.github/workflows/tests.yml (+23 −6)

```diff
@@ -7,11 +7,28 @@ on:
     branches: [ master ]
 
 jobs:
+  docs:
+    runs-on: ${{ matrix.os }}
+    strategy:
+      matrix:
+        python-version: [3.8]
+        os: [ubuntu-latest]
+    steps:
+    - uses: actions/checkout@v1
+    - name: Set up Python ${{ matrix.python-version }}
+      uses: actions/setup-python@v1
+      with:
+        python-version: ${{ matrix.python-version }}
+    - name: Install package
+      run: python -m pip install .[dev]
+    - name: make docs
+      run: make docs
+
   lint:
     runs-on: ${{ matrix.os }}
     strategy:
       matrix:
-        python-version: [3.7, 3.8]
+        python-version: [3.6, 3.7, 3.8]
         os: [ubuntu-latest]
     steps:
     - uses: actions/checkout@v1
@@ -30,7 +47,7 @@ jobs:
     runs-on: ${{ matrix.os }}
     strategy:
       matrix:
-        python-version: [3.7, 3.8]
+        python-version: [3.6, 3.7, 3.8]
         os: [ubuntu-latest]
     steps:
     - uses: actions/checkout@v1
@@ -52,8 +69,8 @@ jobs:
     runs-on: ${{ matrix.os }}
     strategy:
       matrix:
-        python-version: [3.7, 3.8]
-        os: [ubuntu-latest, macos-latest]
+        python-version: [3.6, 3.7, 3.8]
+        os: [ubuntu-latest, macos-10.15]
     steps:
     - uses: actions/checkout@v1
     - name: Set up Python ${{ matrix.python-version }}
@@ -71,7 +88,7 @@ jobs:
     runs-on: ${{ matrix.os }}
     strategy:
       matrix:
-        python-version: [3.7, 3.8]
+        python-version: [3.6, 3.7, 3.8]
         os: [ubuntu-latest]
     steps:
     - uses: actions/checkout@v1
@@ -90,7 +107,7 @@ jobs:
     runs-on: ${{ matrix.os }}
     strategy:
       matrix:
-        python-version: [3.7, 3.8]
+        python-version: [3.6, 3.7, 3.8]
         os: [ubuntu-latest]
     steps:
     - uses: actions/checkout@v1
```
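The widened matrices above mean each job now runs once per (python-version, os) combination. A minimal sketch of how such a build matrix expands into concrete job configurations (`expand_matrix` is an illustrative helper, not part of the repository or of GitHub Actions):

```python
from itertools import product

def expand_matrix(matrix):
    """Expand a GitHub Actions-style build matrix into one dict per job."""
    keys = list(matrix)
    return [dict(zip(keys, values)) for values in product(*matrix.values())]

# The unit-test matrix from the workflow above:
jobs = expand_matrix({
    'python-version': ['3.6', '3.7', '3.8'],
    'os': ['ubuntu-latest', 'macos-10.15'],
})
print(len(jobs))  # 3 Python versions x 2 OSes = 6 jobs
```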

.gitignore (+3)

```diff
@@ -112,3 +112,6 @@ notebooks-private/
 scripts/
 dask-worker-space/
 tutorials/*.pkl
+
+*.pkl
+*.DS_Store
```

HISTORY.md (+12 −1)

```diff
@@ -1,9 +1,20 @@
 # History
 
-## 0.1.0 - 2021-01-01
+
+## 0.2.0 - 2022-04-12
+
+This release features a reorganization and renaming of ``Draco`` pipelines. In addition,
+we update some of the dependencies for general housekeeping.
+
+* Update Draco dependencies - [Issue #66](https://github.com/signals-dev/Draco/issues/66) by @sarahmish
+* Reorganize pipelines - [Issue #63](https://github.com/signals-dev/Draco/issues/63) by @sarahmish
+
+
+## 0.1.0 - 2022-01-01
 
 * First release on ``draco-ml`` PyPI
 
+
 ## Previous GreenGuard development
 
 ### 0.3.0 - 2021-01-22
```

Makefile (+7 −1)

```diff
@@ -256,7 +256,7 @@ check-release: check-candidate check-clean check-master check-history ## Check i
 	@echo "A new release can be made"
 
 .PHONY: release
-release: check-release bumpversion-release docker-push publish bumpversion-patch
+release: check-release bumpversion-release publish bumpversion-patch
 
 .PHONY: release-test
 release-test: check-release bumpversion-release-test publish-test bumpversion-revert
@@ -267,6 +267,12 @@ release-candidate: check-master publish bumpversion-candidate
 .PHONY: release-candidate-test
 release-candidate-test: check-clean check-master publish-test
 
+.PHONY: release-minor
+release-minor: check-release bumpversion-minor release
+
+.PHONY: release-major
+release-major: check-release bumpversion-major release
+
 
 # DOCKER TARGETS
 
```
README.md (+7 −7)

````diff
@@ -220,18 +220,18 @@ The returned `pipeline` variable will be `list` containing the names of all the
 available in the Draco system:
 
 ```
-['classes.unstack_double_lstm_timeseries_classifier',
- 'classes.unstack_lstm_timeseries_classifier',
- 'classes.unstack_normalize_dfs_xgb_classifier',
- 'classes.unstack_dfs_xgb_classifier',
- 'classes.normalize_dfs_xgb_classifier']
+['dfs_xgb',
+ 'dfs_xgb_with_unstack',
+ 'dfs_xgb_with_normalization',
+ 'dfs_xgb_with_unstack_normalization',
+ 'dfs_xgb_prob_with_unstack_normalization']
 ```
 
 For the rest of this tutorial, we will select and use the pipeline
-`classes.normalize_dfs_xgb_classifier` as our template.
+`dfs_xgb_with_unstack_normalization` as our template.
 
 ```python3
-pipeline_name = 'classes.normalize_dfs_xgb_classifier'
+pipeline_name = 'dfs_xgb_with_unstack_normalization'
 ```
 
 ## 3. Fitting the Pipeline
````

docker/Dockerfile (+1 −1)

```diff
@@ -1,4 +1,4 @@
-FROM python:3.6
+FROM python:3.7
 
 ARG UID=1000
 EXPOSE 8888
```

draco/__init__.py (+5 −3)

```diff
@@ -4,16 +4,18 @@
 
 __author__ = """MIT Data To AI Lab"""
 __email__ = 'dailabmit@gmail.com'
-__version__ = '0.1.0'
+__version__ = '0.2.0.dev0'
 
 import os
 
 from draco.pipeline import DracoPipeline, get_pipelines
 
 _BASE_PATH = os.path.abspath(os.path.dirname(__file__))
-MLBLOCKS_PIPELINES = os.path.join(_BASE_PATH, 'pipelines')
 MLBLOCKS_PRIMITIVES = os.path.join(_BASE_PATH, 'primitives')
-
+MLBLOCKS_PIPELINES = tuple(
+    dirname
+    for dirname, _, _ in os.walk(os.path.join(_BASE_PATH, 'pipelines'))
+)
 
 __all__ = (
     'DracoPipeline',
```
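After this change `MLBLOCKS_PIPELINES` is no longer a single path but a tuple of every directory under `pipelines/`, so JSON pipelines in nested category folders can be discovered. A self-contained sketch of the same `os.walk` pattern, run against a temporary directory since the real package layout is not available here (folder names are illustrative):

```python
import os
import tempfile

def walk_pipeline_dirs(base_path):
    # Same expression as the new MLBLOCKS_PIPELINES definition:
    # every directory under base_path/pipelines, root first, via os.walk.
    return tuple(
        dirname
        for dirname, _, _ in os.walk(os.path.join(base_path, 'pipelines'))
    )

# Mimic a nested pipelines/ layout with two category subfolders.
base = tempfile.mkdtemp()
os.makedirs(os.path.join(base, 'pipelines', 'dfs_xgb'))
os.makedirs(os.path.join(base, 'pipelines', 'preprocessing'))

print(len(walk_pipeline_dirs(base)))  # 3: the pipelines/ root plus two subfolders
```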

draco/demo.py (+29 −6)

```diff
@@ -10,6 +10,17 @@
 S3_URL = 'https://d3-ai-greenguard.s3.amazonaws.com/'
 DEMO_PATH = os.path.join(os.path.dirname(os.path.abspath(__file__)), 'demo')
 
+_FILES = {
+    'DEFAULT': [
+        ('target_times', 'cutoff_time'),
+        ('readings', 'timestamp')
+    ],
+    'RUL': [
+        ('rul_train_target_times', 'cutoff_time'),
+        ('rul_test_target_times', 'cutoff_time'),
+        ('rul_readings', 'timestamp')
+    ]
+}
 
 def _load_or_download(filename, dates):
     filename += '.csv.gz'
@@ -27,23 +38,35 @@ def _load_or_download(filename, dates):
     return data
 
 
-def load_demo(load_readings=True):
+def load_demo(name='default', load_readings=True):
     """Load the demo included in the Draco project.
 
     The first time that this function is executed, the data will be downloaded
     and cached inside the `draco/demo` folder.
     Subsequent calls will load the cached data instead of downloading it again.
+
+    Args:
+        name (str):
+            Name of the dataset to load. If "RUL", load NASA's CMAPSS dataset
+            https://ti.arc.nasa.gov/tech/dash/groups/pcoe/prognostic-data-repository/#turbofan.
+            If "default" then load default demo.
+        load_readings (bool):
+            Whether to load the ``readings`` table or not.
 
     Returns:
         tuple[pandas.DataFrame]:
             target_times and readings tables
     """
-    target_times = _load_or_download('target_times', 'cutoff_time')
-    if load_readings:
-        readings = _load_or_download('readings', 'timestamp')
-        return target_times, readings
+    files = _FILES[name.upper()]
 
-    return target_times
+    if not load_readings:
+        files = files[:-1]
+
+    output = list()
+    for filename, dates in files:
+        output.append(_load_or_download(filename, dates))
+
+    return tuple(output)
 
 
 def generate_raw_readings(output_path='demo'):
```
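The refactored `load_demo` drives everything off the `_FILES` lookup: the readings table is always the last entry in each list, so it is simply dropped when `load_readings=False`. A standalone sketch of that selection logic with a stub loader (the `loader` parameter is an illustrative addition standing in for `_load_or_download`, which downloads from S3):

```python
_FILES = {
    'DEFAULT': [
        ('target_times', 'cutoff_time'),
        ('readings', 'timestamp')
    ],
    'RUL': [
        ('rul_train_target_times', 'cutoff_time'),
        ('rul_test_target_times', 'cutoff_time'),
        ('rul_readings', 'timestamp')
    ]
}

def load_demo(name='default', load_readings=True, loader=None):
    """Mirror the new selection logic; `loader` stubs out the S3 download."""
    loader = loader or (lambda filename, dates: filename)
    files = _FILES[name.upper()]
    if not load_readings:
        files = files[:-1]  # the readings table is always listed last
    return tuple(loader(filename, dates) for filename, dates in files)

print(load_demo('rul', load_readings=False))
# ('rul_train_target_times', 'rul_test_target_times')
```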

draco/pipeline.py (+28 −16)

```diff
@@ -9,7 +9,6 @@
 from copy import deepcopy
 from hashlib import md5
 
-import cloudpickle
 import keras
 import numpy as np
 from btb import BTBSession
@@ -54,7 +53,7 @@ def __setstate__(self, state):
 Sequential.__setstate__ = __setstate__
 
 
-def get_pipelines(pattern='', path=False, pipeline_type='classes'):
+def get_pipelines(pattern='', path=False, pipeline_type=None):
     """Get the list of available pipelines.
 
     Optionally filter the names using a pattern or obtain
@@ -66,25 +65,33 @@ def get_pipelines(pattern='', path=False, pipeline_type='classes'):
         path (bool):
             Whether to return a dictionary containing the pipeline
             paths instead of only a list with the names.
-        pipeline_type (str):
-            The pipeline category to filter by (`classes`, `probability` and `unstacked`).
-            Defaults to `classes`.
+        pipeline_type (str or list[str]):
+            The pipeline category to filter. Defaults to `None`.
 
     Return:
         list or dict:
             List of available and matching pipeline names.
             If `path=True`, return a dict containing the pipeline
             names as keys and their absolute paths as values.
     """
+    if isinstance(pipeline_type, str):
+        pipeline_type = [pipeline_type]
+    elif pipeline_type is None:
+        pipeline_type = os.listdir(PIPELINES_DIR)
+
     pipelines = dict()
-    pipelines_dir = os.path.join(PIPELINES_DIR, pipeline_type)
-
-    for filename in os.listdir(pipelines_dir):
-        if filename.endswith('.json') and pattern in filename:
-            name = os.path.basename(filename)[:-len('.json')]
-            name = f'{pipeline_type}.{name}'
-            pipeline_path = os.path.join(pipelines_dir, filename)
-            pipelines[name] = pipeline_path
+    pipelines_dir = [
+        os.path.join(PIPELINES_DIR, ptype)
+        for ptype in pipeline_type
+        if ptype != 'preprocessing'
+    ]
+
+    for pdir in pipelines_dir:
+        for filename in os.listdir(pdir):
+            if filename.endswith('.json') and pattern in filename:
+                name = os.path.basename(filename)[:-len('.json')]
+                pipeline_path = os.path.join(pdir, filename)
+                pipelines[name] = pipeline_path
 
     if not path:
         pipelines = list(pipelines)
@@ -604,14 +611,14 @@ def predict(self, target_times=None, readings=None, turbines=None,
         return predictions
 
     def save(self, path):
-        """Serialize and save this pipeline using cloudpickle.
+        """Serialize and save this pipeline using pickle.
 
         Args:
             path (str):
                 Path to the file where the pipeline will be saved.
         """
         with open(path, 'wb') as pickle_file:
-            cloudpickle.dump(self, pickle_file)
+            pickle.dump(self, pickle_file)
 
     @classmethod
     def load(cls, path):
@@ -626,4 +633,9 @@ def load(cls, path):
                 Loaded DracoPipeline instance.
         """
         with open(path, 'rb') as pickle_file:
-            return cloudpickle.load(pickle_file)
+            pipeline = pickle.load(pickle_file)
+
+        if not isinstance(pipeline, cls):
+            raise ValueError('Serialized object is not a DracoPipeline')
+
+        return pipeline
```
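With `pipeline_type=None`, `get_pipelines` now scans every category folder except `preprocessing` and returns bare pipeline names, dropping the old `classes.` prefix. A self-contained sketch of that lookup run against a temporary directory tree (the root is passed explicitly here instead of the module-level `PIPELINES_DIR`, and the folder and file names are illustrative):

```python
import os
import tempfile

def get_pipelines(pipelines_root, pattern='', path=False, pipeline_type=None):
    """Sketch of the new lookup: scan category dirs, skipping 'preprocessing'."""
    if isinstance(pipeline_type, str):
        pipeline_type = [pipeline_type]
    elif pipeline_type is None:
        pipeline_type = os.listdir(pipelines_root)

    pipelines = dict()
    for ptype in pipeline_type:
        if ptype == 'preprocessing':
            continue
        pdir = os.path.join(pipelines_root, ptype)
        for filename in os.listdir(pdir):
            if filename.endswith('.json') and pattern in filename:
                name = filename[:-len('.json')]  # bare name, no category prefix
                pipelines[name] = os.path.join(pdir, filename)

    return pipelines if path else list(pipelines)

# Build a tiny tree: one real category folder plus a 'preprocessing' folder.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, 'dfs_xgb'))
os.makedirs(os.path.join(root, 'preprocessing'))
open(os.path.join(root, 'dfs_xgb', 'dfs_xgb.json'), 'w').close()
open(os.path.join(root, 'preprocessing', 'unstack.json'), 'w').close()

print(get_pipelines(root))  # ['dfs_xgb'] -- preprocessing is excluded
```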

draco/pipelines/classes/normalize_dfs_xgb_classifier.json (−65)

This file was deleted.
