Add DCRBaselineProtection Metric #728
…ete_dcr_baseline_protection
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files
@@                Coverage Diff                @@
##     dcr_feature_branch     #728     +/-   ##
================================================
+ Coverage          95.39%    95.47%    +0.07%
================================================
  Files                114       115       +1
  Lines               4491      4570      +79
================================================
+ Hits                4284      4363      +79
  Misses               207       207
@@ -148,3 +148,16 @@ def allow_nan_array(attributes):
         ret.append(entry)
 
     return ret
+
+
+def validate_num_samples_num_iteration(num_rows_subsample, num_iterations):
Putting this in a separate function as we reuse this for DCROverfittingProtection.
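For readers without the full diff, here is a minimal sketch of what a shared validator with this signature might look like. The function body is not shown in this conversation, so every check and error message below is an assumption for illustration, not the code merged in this PR.

```python
def validate_num_samples_num_iteration(num_rows_subsample, num_iterations):
    """Hypothetical body: validate the subsampling arguments of a DCR metric.

    Assumed rules (not taken from the PR):
      * num_rows_subsample is None (no subsampling) or a positive integer.
      * num_iterations is a positive integer.
      * Running several iterations only makes sense when subsampling.
    """
    if num_rows_subsample is not None:
        # bool is a subclass of int, so reject it explicitly.
        if (
            isinstance(num_rows_subsample, bool)
            or not isinstance(num_rows_subsample, int)
            or num_rows_subsample < 1
        ):
            raise ValueError(
                f'num_rows_subsample ({num_rows_subsample}) must be None or a positive integer.'
            )

    if (
        isinstance(num_iterations, bool)
        or not isinstance(num_iterations, int)
        or num_iterations < 1
    ):
        raise ValueError(f'num_iterations ({num_iterations}) must be a positive integer.')

    if num_rows_subsample is None and num_iterations > 1:
        # Without subsampling every iteration would see identical data.
        raise ValueError('num_iterations should be 1 when no subsampling is requested.')
```

Keeping the checks in one helper lets both DCRBaselineProtection and DCROverfittingProtection call it before doing any distance computation, so the two metrics reject bad arguments in the same way.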
Looking good!
I just left one or two suggestions.
resolves #720
CU-86b3w869f
Without subsampling, the metric takes an average of 84 seconds to score a synthetic sample of 1000 rows against the demo dataset fake_hotels_guest (400 rows used for training and 100 rows used for validation).