
Commit 0fc7d47

committed Feb 20, 2025
params update, readme update
1 parent 40535e8 commit 0fc7d47

5 files changed: +80 −79 lines changed
 

‎workflows/hpo/deephyper/README.md

+70 −49
@@ -1,59 +1,80 @@
-# Run HPO using DeepHyper on Lambda with conda
+# Hyperparameter Optimization using DeepHyper
+
+## Overview
+
+The scripts contained here run Hyperparameter Optimization (HPO) using DeepHyper.
+
+## Requirements
+
+* [IMPROVE general environment](https://jdacs4c-improve.github.io/docs/content/INSTALLATION.html)
+* [DeepHyper](https://deephyper.readthedocs.io/en/stable/)
+* MPI (these instructions use [openmpi](https://www.open-mpi.org/))
+* [mpi4py](https://mpi4py.readthedocs.io/en/stable/)
+* An IMPROVE-compliant model and its environment
+
+## Installation and Setup
+
+Create a conda environment for DeepHyper:
+
+```
+module load openmpi
+conda create -n dh python=3.9 -y
+conda activate dh
+conda install gxx_linux-64 gcc_linux-64
+pip install "deephyper[default]"
+pip install mpi4py
+pip install improvelib
+```
+
+Install the model of choice, IMPROVE, and the benchmark datasets:
 
-## 1. Install conda environment for the curated model
-Install model, IMPROVE, and datasets:
 ```
 cd <WORKING_DIR>
 git clone https://github.com/JDACS4C-IMPROVE/<MODEL>
 cd <MODEL>
 source setup_improve.sh
 ```
 
-Install model environment (get the name of the yml file from model repo readme):
-The workflow will need to know the ./<MODEL_ENV_NAME>/.
+Create a conda environment for the model in the model directory (the name of the yml file is given in the model repo README):
+The workflow will need to know the `<MODEL_ENV_NAME>`.
 
 ```
 conda env create -f <MODEL_ENV>.yml -p ./<MODEL_ENV_NAME>/
 ```
 
-## 2. Perform preprocessing
-Run the preprocess script.
-The workflow will need to know the <PATH/TO/PREPROCESSED/DATA>.
+
+Run the preprocess script:
+The workflow will need to know the `<PATH/TO/PREPROCESSED/DATA>`.
 
 ```
-cd PathDSP
+cd <MODEL>
 conda activate ./<MODEL_ENV_NAME>/
 python <MODEL_NAME>_preprocess_improve.py --input_dir ./csa_data/raw_data --output_dir <PATH/TO/PREPROCESSED/DATA>
 conda deactivate
 ```
 
-## 3. Install conda environment for DeepHyper
-```
-module load openmpi
-conda create -n dh python=3.9 -y
-conda activate dh
-conda install gxx_linux-64 gcc_linux-64
-pip install "deephyper[default]"
-pip install mpi4py
-```
+## Parameter Configuration
+
+**Workflow Parameters**
+
+This workflow uses IMPROVE parameter handling. Create a config file following the template of `hpo_deephyper_params.ini`, with the parameters appropriate for your experiment. Parameters may also be specified on the command line.
+
+* `model_scripts_dir` should be set to the path of the model directory containing the model scripts (from step 1).
+* `input_dir` should be set to the location of the preprocessed data (above). We highly recommend that the name of this directory include the source and split (e.g. ./ml_data/CCLE-CCLE/split_0). You can provide an absolute or relative path, or just the name of the directory if it is in `model_scripts_dir`.
+* `model_name` should be set to your model name, with the same capitalization pattern as your model scripts (e.g. deepttc for deepttc_preprocess_improve.py).
+* `model_environment` should be set to the location of the model environment (from step 1). You can provide an absolute or relative path, or just the name of the directory if it is in `model_scripts_dir`.
+* `output_dir` should be set to the path where the output should be saved. We highly recommend that the name of this directory include the source and split (e.g. ./deephyper/CCLE/split_0).
+* `max_evals` should be set to the maximum number of evaluations to check for before launching additional training runs.
+* `hyperparameter_file` can be set to an alternate .json file containing hyperparameters. You can provide an absolute or relative path, or just the file name if it is in `model_scripts_dir`. See the Hyperparameters section below for how to change hyperparameters.
+* `val_metric` can be set to any IMPROVE metric you would like to optimize. 'mse' and 'rmse' are minimized; all other metrics are maximized. Note that this does not change the validation loss used by the model, only what HPO tries to optimize. Default is 'mse'.
+* `num_gpus_per_node` should be set to the number of GPUs per node on your system. Default is 2.
+* `epochs` can be set to the number of epochs to train for. If not specified, the model's default is used (default: None).
+* Parameters beginning with `CBO_` can be used to change the optimization protocol. The names of these parameters can be found by running `python hpo_deephyper_subprocess.py --help` or by looking in `hpo_deephyper_params_def.py`. Documentation of the DeepHyper CBO is available [here](https://deephyper.readthedocs.io/en/stable/_autosummary/deephyper.hpo.CBO.html#deephyper.hpo.CBO).
+
+**Hyperparameters**
 
-## 4. Modify configuration file
-`hpo_deephyper_params.ini` is an example configuration file for this workflow.
-You will need to change the following parameters for your model:
-`model_scripts_dir` should be set to the path to the model directory containing the model scripts (from step 1).
-`input_dir` should be set to the location of the preprocessed data (above). We highly recommend that the name of this directory includes the source and split (e.g. ./ml_data/CCLE-CCLE/split_0). You can provide a complete or relative path, or the name of the directory if it is in `model_scripts_dir`.
-`model_name` should be set to your model name (this should have the same capitalization pattern as your model scripts, e.g. deepttc for deepttc_preprocess_improve.py, etc).
-`model_environment` should be set to the location of the model environment (from step 1). You can provide a complete or relative path, or the name of the directory if it is in `model_scripts_dir`.
-`output_dir` should be set to path you would like the output to be saved to. We highly recommend that the name of this directory includes the source and split (e.g. ./deephyper/CCLE/split_0)
-`epochs` should be set to the maximum number of epochs to train for.
-`max_evals` should be set to the maximum number of evaluations to check for before launching additional training runs.
-`interactive_session` should be set to True to run on Lambda, Polaris, and Biowulf. Other implementations have not yet been tested.
-`hyperparameter_file` can be set to an alternate .json file containing hyperparameters. You can provide a complete or relative path, or the name of the directory if it is in `model_scripts_dir`. See below (step 5) for how to change hyperparameters.
-`val_metric` can be set to any IMPROVE metric you would like to optimize. 'mse' and 'rmse' are minimized, all other metrics are maximized. Note that this does not change what val loss is used by the model, only what HPO tries to optimize. Default is 'mse'.
-`num_gpus_per_node` should be set to the number of GPUs per node on your system. Default is 2.
-Parameters beginning with `CBO_` can be used to change the optimization protocol. The names of these parameters can be found by running `python hpo_deephyper_subprocess.py --help` or looking in `hpo_deephyper_params_def.py`. Documentation of the DeepHyper CBO can be found here: https://deephyper.readthedocs.io/en/stable/_autosummary/deephyper.hpo.CBO.html#deephyper.hpo.CBO
-
-
-## 5. Modify hyperparameters file
 `hpo_deephyper_hyperparameters.json` contains dictionaries for the hyperparameters.
 The default settings are as follows:
 
@@ -73,27 +94,18 @@ You can add more hyperparameters to test by adding additional dictionaries to th
 ```
 Note that boolean values must be lowercase in JSON files.
 
+## Usage
 
-## 6. Perform HPO
-Navigate to the DeepHyper directory
-```
-cd <WORKING_DIR>/IMPROVE/workflows/deephyper_hpo
-```
-If necesssary (i.e not proceeding directly from above steps), activate environment:
+Activate the DeepHyper environment:
 ```
 module load openmpi
 conda activate dh
 export PYTHONPATH=../../../IMPROVE
 ```
 
-Run HPO:
-```
-mpirun -np 10 python hpo_deephyper_subprocess.py
-```
-
-To run HPO with a different config file:
+Run HPO with DeepHyper:
 ```
-mpirun -np 10 python hpo_deephyper_subprocess.py --config <ALTERNATE_CONFIG_FILE>
+mpirun -np 10 python hpo_deephyper_subprocess.py --config <your_config.ini>
 ```
 
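The hyperparameter dictionaries in `hpo_deephyper_hyperparameters.json` ultimately define the search space that DeepHyper's CBO explores. As a rough illustration only (this is generic DeepHyper usage, not this workflow's JSON schema; it assumes the `deephyper.hpo` namespace from the documentation linked above, and older releases expose `HpProblem` under `deephyper.problem`):

```python
# Generic DeepHyper search-space sketch; parameter names and ranges are illustrative only.
from deephyper.hpo import HpProblem

problem = HpProblem()
problem.add_hyperparameter((1e-5, 1e-2, "log-uniform"), "learning_rate")  # continuous, log-scaled
problem.add_hyperparameter((8, 256), "batch_size")                        # integer range
problem.add_hyperparameter([True, False], "early_stopping")               # categorical
```

Roughly speaking, each evaluation launched by `mpirun` receives one point from such a space via `job.parameters` (visible in the `hpo_deephyper_subprocess.py` hunk further down).
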
@@ -131,5 +143,14 @@ export CUDA_VISIBLE_DEVICES=0,1,2,3
 mpirun -n ${NTOTRANKS} --ppn ${NRANKS_PER_NODE} --depth=${NDEPTH} --cpu-bind depth --env OMP_NUM_THREADS=${NTHREADS} python hpo_deephyper_subprocess.py
 ```
 
+To submit a job on Biowulf:
+
+```
+```
+
+## Output
+
+The output will be in the specified `output_dir` with the following structure
 
‎workflows/hpo/deephyper/hpo_deephyper_params.ini

+1 −3
@@ -3,7 +3,5 @@ model_scripts_dir = <PATH/TO/MODEL/DIR>
 input_dir = <PATH/TO/PREPROCESSED/DATA>
 model_name = <MODEL_NAME>
 model_environment = <MODEL_ENV_NAME>
-epochs = 3
-output_dir = ./test_PathDSP
+output_dir = ./test
 max_evals = 5
-interactive_session = True
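
For orientation, a filled-in sketch of a config following this template; the paths, model name, and environment name below are placeholders, and whatever section header the template file uses (not shown in this hunk) should be kept as-is:

```ini
; Illustrative values only; keep the section header from hpo_deephyper_params.ini.
model_scripts_dir = /path/to/DeepTTC
input_dir = ./ml_data/CCLE-CCLE/split_0
model_name = deepttc
model_environment = DeepTTC_env
output_dir = ./deephyper/CCLE/split_0
max_evals = 20
```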

‎workflows/hpo/deephyper/hpo_deephyper_params_def.py

+2 −17
@@ -18,18 +18,8 @@
 },
 {"name": "epochs",
 "type": int,
-"default": 10,
-"help": "Number of epochs"
-},
-{"name": "use_singularity",
-"type": bool,
-"default": True,
-"help": "Do you want to use singularity image for running the model?"
-},
-{"name": "singularity_image",
-"type": str,
-"default": '',
-"help": "Singularity image file of the model"
+"default": None,
+"help": "Number of epochs. If None, model default will be used."
 },
 {"name": "val_metric",
 "type": str,
@@ -41,11 +31,6 @@
 "default": 20,
 "help": "Number of evaluations"
 },
-{"name": "interactive_session",
-"type": bool,
-"default": True,
-"help": "Are you using an interactive session?"
-},
 {"name": "hyperparameter_file",
 "type": str,
 "default": './hpo_deephyper_hyperparameters.json',

‎workflows/hpo/deephyper/hpo_deephyper_subprocess.py

+4 −6
@@ -41,6 +41,8 @@ def run(job, optuna_trial=None):
     str(params['epochs']),
     str(os.environ["CUDA_VISIBLE_DEVICES"])
 ]
+if params['epochs'] is not None:
+    train_run = train_run + ['epochs'] + [params['epochs']]
 for hp in params['hyperparams']:
     train_run = train_run + [str(hp)]
     train_run = train_run + [str(job.parameters[hp])]
@@ -123,12 +125,8 @@ def run(job, optuna_trial=None):
 comm = MPI.COMM_WORLD
 rank = comm.Get_rank()
 size = comm.Get_size()
-if params['interactive_session']:
-    os.environ["CUDA_VISIBLE_DEVICES"] = str(rank % params['num_gpus_per_node'])
-    cuda_name = "cuda:" + str(rank % params['num_gpus_per_node'])
-else:
-    # CUDA_VISIBLE_DEVICES is now set via set_affinity_gpu_polaris.sh
-    local_rank = os.environ["PMI_LOCAL_RANK"]
+os.environ["CUDA_VISIBLE_DEVICES"] = str(rank % params['num_gpus_per_node'])
+cuda_name = "cuda:" + str(rank % params['num_gpus_per_node'])
 
 # Run DeepHyper
 with Evaluator.create(
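
The simplification above pins every MPI rank to a GPU by index, regardless of how the session was started. A self-contained sketch of that pattern (not the repository's code; the GPU count is an assumption matching the workflow's `num_gpus_per_node` default of 2):

```python
# Sketch: map each MPI rank to one GPU on its node, as in the `rank % num_gpus_per_node`
# logic introduced by this commit.
import os
from mpi4py import MPI

num_gpus_per_node = 2                             # assumption: workflow default
rank = MPI.COMM_WORLD.Get_rank()
gpu_id = rank % num_gpus_per_node                 # ranks cycle over the local GPUs
os.environ["CUDA_VISIBLE_DEVICES"] = str(gpu_id)  # inherited by the training subprocess
cuda_name = f"cuda:{gpu_id}"                      # mirrors the cuda_name computed in the diff
print(f"rank {rank} -> {cuda_name}")
```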

‎workflows/hpo/deephyper/hpo_deephyper_subprocess_train.sh

+3 −4
@@ -20,14 +20,13 @@ echo "Activated conda env $CONDA_ENV"
 SCRIPT=$2
 input_dir=$3
 output_dir=$4
-epochs=$5
-CUDA_VISIBLE_DEVICES=$6
+CUDA_VISIBLE_DEVICES=$5
 
-command="python $SCRIPT --input_dir $input_dir --output_dir $output_dir --epochs $epochs "
+command="python $SCRIPT --input_dir $input_dir --output_dir $output_dir "
 
 
 # append hyperparameter arguments to python call
-for i in $(seq 7 $#)
+for i in $(seq 6 $#)
 do
 if [ $(($i % 2)) == 0 ]; then
     command="${command} ${!i}"

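After this change the wrapper script reads the training script from `$2`, `input_dir` from `$3`, `output_dir` from `$4`, and `CUDA_VISIBLE_DEVICES` from `$5`, with hyperparameter name/value pairs consumed from `$6` onward by the loop above; the first positional argument is set before this hunk. A hedged sketch of how a caller might assemble that invocation (the names and the use of `subprocess.run` here are illustrative, not the workflow's exact call):

```python
# Sketch: invoke the training wrapper with its post-commit positional arguments,
# followed by hyperparameter name/value pairs starting at $6.
import subprocess

def launch_train(first_arg, script, input_dir, output_dir, cuda_devices, hp_pairs):
    cmd = ["bash", "hpo_deephyper_subprocess_train.sh",
           first_arg,             # e.g. the environment argument handled before this hunk
           script, input_dir, output_dir, str(cuda_devices)]
    for name, value in hp_pairs:  # consumed by the `for i in $(seq 6 $#)` loop
        cmd += [str(name), str(value)]
    return subprocess.run(cmd, capture_output=True, text=True)
```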