Merge pull request PaddlePaddle#44 from huangjun12/refine-bmn
refine details of bmn model
heavengate authored Apr 17, 2020
2 parents 434c5c2 + c4e9065 commit c54980b
Showing 8 changed files with 155 additions and 107 deletions.
32 changes: 13 additions & 19 deletions examples/bmn/README.md
@@ -29,7 +29,6 @@ BMN Overview
├── train.py # training script that trains the network
├── eval.py # evaluation script that measures network performance
├── predict.py # prediction script that runs inference on arbitrary input
├── bmn_model.py # network architecture and loss-function definitions
├── bmn_metric.py # accuracy-metric definitions
├── reader.py # data reader that builds the Dataset and DataLoader
├── bmn_utils.py # helper code for model details
@@ -41,7 +40,7 @@ BMN Overview

## Data Preparation

The training data for BMN comes from the dataset provided by ActivityNet1.3. We provide pre-processed video features: please download [bmn\_feat](https://paddlemodels.bj.bcebos.com/video_detection/bmn_feat.tar.gz), extract it, and update the feature path feat\_path in bmn.yaml accordingly. For the corresponding label file, please download [label](https://paddlemodels.bj.bcebos.com/video_detection/activitynet_1.3_annotations.json) and update the label-file path anno\_file in bmn.yaml.
The training data for BMN comes from the dataset provided by ActivityNet1.3. We provide pre-processed video features and the corresponding label file: please download the feature data [bmn\_feat](https://paddlemodels.bj.bcebos.com/video_detection/bmn_feat.tar.gz) and the label data [label](https://paddlemodels.bj.bcebos.com/video_detection/activitynet_1.3_annotations.json), then update the feature-file path feat\_path and the label-file path anno\_file in the configuration file bmn.yaml accordingly.


## Model Training
@@ -52,22 +51,17 @@ The training data for BMN comes from the dataset provided by ActivityNet1.3; we provide pre-processed

bash run.sh

For single-GPU training, launch as follows:
For single-GPU training, please set batch\_size in the configuration file bmn.yaml to 16, then launch as follows:

export CUDA_VISIBLE_DEVICES=0
python train.py

- Running the code requires installing pandas first

- To train from scratch, just use the launch command line or the script above; no pretrained model is needed
Static-graph training is used by default; to train with the dynamic graph instead, just add the `-d` flag to the run command, e.g.:

- For single-GPU training, please set batch_size in the configuration file to 16
python train.py -d

**Training strategy:**
- Running the code requires installing pandas first

* Adam optimizer with an initial learning\_rate of 0.001
* Weight-decay coefficient of 1e-4
* The learning rate is decayed once, by a factor of 0.1, when the iteration count reaches 4200
- To train from scratch, just use the launch command line or the script above; no pretrained model is needed
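The piecewise learning-rate schedule described in the training strategy (and configured via learning_rate, learning_rate_decay, and lr_decay_iter in bmn.yaml) can be sketched as a plain function; this is an illustrative simplification, not the optimizer code the repo actually builds:

```python
def bmn_learning_rate(iteration, base_lr=0.001, decay=0.1, decay_iter=4200):
    """Piecewise-constant schedule: base_lr until decay_iter, then decayed once."""
    return base_lr if iteration < decay_iter else base_lr * decay

# Before iteration 4200 the rate stays at 0.001; from 4200 onward it is 0.0001.
```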


## Model Evaluation
@@ -76,9 +70,9 @@ The training data for BMN comes from the dataset provided by ActivityNet1.3; we provide pre-processed

python eval.py --weights=$PATH_TO_WEIGHTS

- When evaluating, you can set the `weights` argument on the command line to choose the weights to evaluate; if unset, the default parameter file checkpoint/final.pdparams is used.
- When evaluating, you can set the `weights` argument on the command line to choose the weights to evaluate; if unset, the script downloads the released model [model](https://paddlemodels.bj.bcebos.com/hapi/bmn.pdparams) for evaluation

- The program above saves its run results under the output/EVAL/BMN\_results folder; the test results are saved in the evaluate\_results/bmn\_results\_validation.json file
- The program above saves its run results under the folder given by the `--output_path` argument (default output/EVAL/BMN\_results); the test results are saved under the folder given by the `--result_path` argument (default evaluate\_results).

- Note: the loss may come out as nan during evaluation. Evaluation runs on single samples, and a sample may contain no instance with iou>0.6, which produces nan; this does not affect the final evaluation results.

@@ -87,9 +81,9 @@ The training data for BMN comes from the dataset provided by ActivityNet1.3; we provide pre-processed

- For detailed usage instructions for the ActivityNet dataset, see its [official website](http://activity-net.org)

- To get the metric-evaluation code, download it from the [ActivityNet GitHub repository](https://github.com/activitynet/ActivityNet.git) and copy the Evaluation folder into the models/dygraph/bmn directory. (Note: the third-party evaluation code does not support python3, so we recommend evaluating with python2; if you use python3, the print statements need parentheses, so modify the .py files under the Evaluation directory accordingly.)
- To get the metric-evaluation code, download it from the [ActivityNet GitHub repository](https://github.com/activitynet/ActivityNet.git) and copy the Evaluation folder into the hapi/examples/bmn directory. (Note: the third-party evaluation code does not support python3, so we recommend evaluating with python2; if you use python3, the print statements need parentheses, so modify the .py files under the Evaluation directory accordingly.)

- Please download the [activity\_net\_1\_3\_new.json](https://paddlemodels.bj.bcebos.com/video_detection/activity_net_1_3_new.json) file and place it in the models/dygraph/bmn/Evaluation/data directory. Compared with the original activity\_net.v1-3.min.json file, it filters out some invalid video entries.
- Please download the [activity\_net\_1\_3\_new.json](https://paddlemodels.bj.bcebos.com/video_detection/activity_net_1_3_new.json) file and place it in the hapi/examples/bmn/Evaluation/data directory. Compared with the original activity\_net.v1-3.min.json file, it filters out some invalid video entries.

- Compute the accuracy metrics

@@ -100,7 +94,7 @@ The training data for BMN comes from the dataset provided by ActivityNet1.3; we provide pre-processed

| AR@1 | AR@5 | AR@10 | AR@100 | AUC |
| :---: | :---: | :---: | :---: | :---: |
| 33.46 | 49.25 | 56.25 | 75.40 | 67.16% |
| 33.10 | 49.18 | 56.54 | 75.12 | 67.16% |
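The AR@N columns above denote average recall with N proposals per video. A minimal single-threshold sketch of recall@N is below; note this is a simplification for illustration — the official ActivityNet Evaluation code averages recall over a range of tIoU thresholds, which is what the table reports:

```python
def temporal_iou(p, g):
    """Temporal IoU of two (start, end) segments."""
    inter = max(0.0, min(p[1], g[1]) - max(p[0], g[0]))
    union = (p[1] - p[0]) + (g[1] - g[0]) - inter
    return inter / union if union > 0 else 0.0

def average_recall_at_k(videos, k, iou_thresh=0.5):
    """videos: list of (proposals, ground_truths) pairs; proposals are
    score-sorted (start, end) segments. Returns mean per-video recall@k."""
    recalls = []
    for proposals, gts in videos:
        top = proposals[:k]
        hit = sum(1 for g in gts
                  if any(temporal_iou(p, g) >= iou_thresh for p in top))
        recalls.append(hit / len(gts))
    return sum(recalls) / len(recalls)
```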


## Model Inference
@@ -110,9 +104,9 @@ The training data for BMN comes from the dataset provided by ActivityNet1.3; we provide pre-processed
python predict.py --weights=$PATH_TO_WEIGHTS \
--filelist=$FILELIST

- When launching the program from the python command line, the `--filelist` argument specifies the list of files to run inference on; if unset, it defaults to ./infer.list. The `--weights` argument gives the trained weights; if unset, the default parameter file checkpoint/final.pdparams is used.
- When launching the program from the python command line, the `--filelist` argument specifies the list of files to run inference on; if unset, it defaults to ./infer.list. The `--weights` argument gives the trained weights; if unset, the script downloads the released model [model](https://paddlemodels.bj.bcebos.com/hapi/bmn.pdparams) for prediction

- The program above saves its run results under the output/INFER/BMN\_results folder; the test results are saved in the predict\_results/bmn\_results\_test.json file
- The program above saves its run results under the folder given by the `--output_path` argument (default output/INFER/BMN\_results); the test results are saved under the folder given by the `--result_path` argument (default predict\_results).
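The snms_alpha / snms_t1 / snms_t2 settings in bmn.yaml refer to the soft-NMS post-processing applied to the generated proposals. Below is a minimal Gaussian soft-NMS sketch for 1-D temporal proposals; the exact decay rule and threshold handling in bmn_utils.py may differ, so treat this as an illustration of the technique rather than the repo's implementation:

```python
import math

def soft_nms_1d(proposals, alpha=0.4, score_thresh=0.001, top_k=100):
    """proposals: list of (start, end, score). Gaussian soft-NMS: rather than
    discarding proposals that overlap the current best, decay their scores
    by exp(-iou^2 / alpha) and keep iterating on the re-scored list."""
    props = sorted(proposals, key=lambda p: p[2], reverse=True)
    keep = []
    while props and len(keep) < top_k:
        s0, e0, _ = best = props.pop(0)
        keep.append(best)
        rescored = []
        for s, e, sc in props:
            inter = max(0.0, min(e0, e) - max(s0, s))
            union = (e0 - s0) + (e - s) - inter
            iou = inter / union if union > 0 else 0.0
            sc *= math.exp(-(iou * iou) / alpha)
            if sc > score_thresh:  # prune near-duplicates whose score collapses
                rescored.append((s, e, sc))
        props = sorted(rescored, key=lambda p: p[2], reverse=True)
    return keep
```

An exact duplicate of the top proposal survives but with its score decayed by exp(-1/alpha), so it falls to the bottom of the ranking instead of being hard-suppressed.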


## Reference Papers
11 changes: 1 addition & 10 deletions examples/bmn/bmn.yaml
@@ -12,11 +12,10 @@ MODEL:
TRAIN:
subset: "train"
epoch: 9
batch_size: 4
batch_size: 4
num_workers: 4
use_shuffle: True
device: "gpu"
num_gpus: 4
learning_rate: 0.001
learning_rate_decay: 0.1
lr_decay_iter: 4200
@@ -29,21 +28,13 @@ TEST:
subset: "validation"
batch_size: 1
num_workers: 1
use_buffer: False
snms_alpha: 0.001
snms_t1: 0.5
snms_t2: 0.9
output_path: "output/EVAL/BMN_results"
result_path: "evaluate_results"

INFER:
subset: "test"
batch_size: 1
num_workers: 1
use_buffer: False
snms_alpha: 0.4
snms_t1: 0.5
snms_t2: 0.9
filelist: './infer.list'
output_path: "output/INFER/BMN_results"
result_path: "predict_results"
13 changes: 13 additions & 0 deletions examples/bmn/bmn_metric.py
@@ -36,8 +36,21 @@ def __init__(self, cfg, mode):
#get video_dict and video_list
if self.mode == 'test':
self.get_test_dataset_dict()
if not os.path.isdir(self.cfg.TEST.output_path):
os.makedirs(self.cfg.TEST.output_path)
if not os.path.isdir(self.cfg.TEST.result_path):
os.makedirs(self.cfg.TEST.result_path)
elif self.mode == 'infer':
self.get_infer_dataset_dict()
if not os.path.isdir(self.cfg.INFER.output_path):
os.makedirs(self.cfg.INFER.output_path)
if not os.path.isdir(self.cfg.INFER.result_path):
os.makedirs(self.cfg.INFER.result_path)

def add_metric_op(self, preds, label):
pred_bm, pred_start, pred_end = preds
video_index = label[-1]
return [pred_bm, pred_start, pred_end, video_index]  # return as a flat list
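The added add_metric_op hook repacks the model's three outputs and the video index (carried as the last label entry) into one flat list for the metric's update() step. A standalone illustration with placeholder values:

```python
def add_metric_op(preds, label):
    """Mirror of BmnMetric.add_metric_op: keep the three prediction maps
    and append the video index, which rides along as the last label."""
    pred_bm, pred_start, pred_end = preds
    video_index = label[-1]
    return [pred_bm, pred_start, pred_end, video_index]

# Placeholder values standing in for the BM map, start/end probabilities,
# the three ground-truth labels, and a video index of 7.
out = add_metric_op(["bm_map", "start_prob", "end_prob"],
                    ["gt_iou_map", "gt_start", "gt_end", 7])
```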

def update(self, pred_bm, pred_start, pred_end, fid):
# generate proposals
35 changes: 27 additions & 8 deletions examples/bmn/eval.py
@@ -37,7 +37,6 @@ def parse_args():
parser.add_argument(
"-d",
"--dynamic",
default=True,
action='store_true',
help="enable dygraph mode, only support dynamic mode at present time")
parser.add_argument(
@@ -56,6 +55,17 @@ def parse_args():
default=None,
help='weight path, None to automatically download weights provided by Paddle.'
)
parser.add_argument(
'--output_path',
type=str,
default="output/EVAL/BMN_results",
help='output dir path, default to use output/EVAL/BMN_results')
parser.add_argument(
'--result_path',
type=str,
default="evaluate_results/",
help='output dir path after post processing, default to use ./evaluate_results/'
)
parser.add_argument(
'--log_interval',
type=int,
@@ -67,17 +77,21 @@

# Performance Evaluation
def test_bmn(args):
# only support dynamic mode at present time
device = set_device(args.device)
fluid.enable_dygraph(device) if args.dynamic else None

#config setting
config = parse_config(args.config_file)
eval_cfg = merge_configs(config, 'test', vars(args))
if not os.path.isdir(config.TEST.output_path):
os.makedirs(config.TEST.output_path)
if not os.path.isdir(config.TEST.result_path):
os.makedirs(config.TEST.result_path)

feat_dim = config.MODEL.feat_dim
tscale = config.MODEL.tscale
dscale = config.MODEL.dscale
prop_boundary_ratio = config.MODEL.prop_boundary_ratio
num_sample = config.MODEL.num_sample
num_sample_perbin = config.MODEL.num_sample_perbin

#input and video index
inputs = [
Input(
[None, config.MODEL.feat_dim, config.MODEL.tscale],
@@ -97,9 +111,14 @@ def test_bmn(args):
eval_dataset = BmnDataset(eval_cfg, 'test')

#model
model = bmn(config, pretrained=args.weights is None)
model = bmn(tscale,
dscale,
prop_boundary_ratio,
num_sample,
num_sample_perbin,
pretrained=args.weights is None)
model.prepare(
loss_function=BmnLoss(config),
loss_function=BmnLoss(tscale, dscale),
metrics=BmnMetric(
config, mode='test'),
inputs=inputs,
37 changes: 28 additions & 9 deletions examples/bmn/predict.py
@@ -37,7 +37,6 @@ def parse_args():
parser.add_argument(
"-d",
"--dynamic",
default=True,
action='store_true',
help="enable dygraph mode, only support dynamic mode at present time")
parser.add_argument(
@@ -54,10 +53,21 @@ def parse_args():
help='weight path, None to automatically download weights provided by Paddle.'
)
parser.add_argument(
'--save_dir',
'--filelist',
type=str,
default="infer.list",
help='infer file list, default to use ./infer.list')
parser.add_argument(
'--output_path',
type=str,
default="output/INFER/BMN_results",
help='output dir path, default to use output/INFER/BMN_results')
parser.add_argument(
'--result_path',
type=str,
default="predict_results/",
help='output dir path, default to use ./predict_results/')
help='output dir path after post processing, default to use ./predict_results/'
)
parser.add_argument(
'--log_interval',
type=int,
@@ -69,18 +79,21 @@

# Prediction
def infer_bmn(args):
# only support dynamic mode at present time
device = set_device(args.device)
fluid.enable_dygraph(device) if args.dynamic else None

#config setting
config = parse_config(args.config_file)
infer_cfg = merge_configs(config, 'infer', vars(args))

if not os.path.isdir(config.INFER.output_path):
os.makedirs(config.INFER.output_path)
if not os.path.isdir(config.INFER.result_path):
os.makedirs(config.INFER.result_path)
feat_dim = config.MODEL.feat_dim
tscale = config.MODEL.tscale
dscale = config.MODEL.dscale
prop_boundary_ratio = config.MODEL.prop_boundary_ratio
num_sample = config.MODEL.num_sample
num_sample_perbin = config.MODEL.num_sample_perbin

#input and video index
inputs = [
Input(
[None, config.MODEL.feat_dim, config.MODEL.tscale],
@@ -92,7 +105,13 @@ def infer_bmn(args):
#data
infer_dataset = BmnDataset(infer_cfg, 'infer')

model = bmn(config, pretrained=args.weights is None)
#model
model = bmn(tscale,
dscale,
prop_boundary_ratio,
num_sample,
num_sample_perbin,
pretrained=args.weights is None)
model.prepare(
metrics=BmnMetric(
config, mode='infer'),
1 change: 0 additions & 1 deletion examples/bmn/run.sh
@@ -1,3 +1,2 @@
export CUDA_VISIBLE_DEVICES=0,1,2,3

python -m paddle.distributed.launch train.py
49 changes: 25 additions & 24 deletions examples/bmn/train.py
@@ -34,11 +34,7 @@
def parse_args():
parser = argparse.ArgumentParser("Paddle high level api of BMN.")
parser.add_argument(
"-d",
"--dynamic",
default=True,
action='store_true',
help="enable dygraph mode")
"-d", "--dynamic", action='store_true', help="enable dygraph mode")
parser.add_argument(
'--config_file',
type=str,
@@ -48,7 +44,7 @@ def parse_args():
'--batch_size',
type=int,
default=None,
help='training batch size. None to use config file setting.')
help='training batch size. None for read from config file.')
parser.add_argument(
'--learning_rate',
type=float,
@@ -68,8 +64,8 @@ def parse_args():
parser.add_argument(
'--epoch',
type=int,
default=9,
help='epoch number, 0 for read from config file')
default=None,
help='epoch number, None for read from config file')
parser.add_argument(
'--valid_interval',
type=int,
@@ -113,46 +109,51 @@ def train_bmn(args):
if not os.path.isdir(args.save_dir):
os.makedirs(args.save_dir)

#config setting
config = parse_config(args.config_file)
train_cfg = merge_configs(config, 'train', vars(args))
val_cfg = merge_configs(config, 'valid', vars(args))

inputs = [
Input(
[None, config.MODEL.feat_dim, config.MODEL.tscale],
'float32',
name='feat_input')
]
gt_iou_map = Input(
[None, config.MODEL.dscale, config.MODEL.tscale],
'float32',
name='gt_iou_map')
gt_start = Input([None, config.MODEL.tscale], 'float32', name='gt_start')
gt_end = Input([None, config.MODEL.tscale], 'float32', name='gt_end')
feat_dim = config.MODEL.feat_dim
tscale = config.MODEL.tscale
dscale = config.MODEL.dscale
prop_boundary_ratio = config.MODEL.prop_boundary_ratio
num_sample = config.MODEL.num_sample
num_sample_perbin = config.MODEL.num_sample_perbin

# input and label list
inputs = [Input([None, feat_dim, tscale], 'float32', name='feat_input')]
gt_iou_map = Input([None, dscale, tscale], 'float32', name='gt_iou_map')
gt_start = Input([None, tscale], 'float32', name='gt_start')
gt_end = Input([None, tscale], 'float32', name='gt_end')
labels = [gt_iou_map, gt_start, gt_end]

# data
train_dataset = BmnDataset(train_cfg, 'train')
val_dataset = BmnDataset(val_cfg, 'valid')

# model
model = bmn(config, pretrained=False)
model = bmn(tscale,
dscale,
prop_boundary_ratio,
num_sample,
num_sample_perbin,
pretrained=False)
optim = optimizer(config, parameter_list=model.parameters())
model.prepare(
optimizer=optim,
loss_function=BmnLoss(config),
loss_function=BmnLoss(tscale, dscale),
inputs=inputs,
labels=labels,
device=device)

# if resume weights is given, load resume weights directly
if args.resume is not None:
model.load(args.resume)

model.fit(train_data=train_dataset,
eval_data=val_dataset,
batch_size=train_cfg.TRAIN.batch_size,
epochs=args.epoch,
epochs=train_cfg.TRAIN.epoch,
eval_freq=args.valid_interval,
log_freq=args.log_interval,
save_dir=args.save_dir,
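This diff changes several CLI defaults (batch_size, epoch) to None so that the config file value is used unless the flag is actually supplied. The override behavior that merge_configs presumably implements can be sketched as follows — a hypothetical simplification for illustration, not the real utility from the repo:

```python
def merge_configs(cfg, section, args):
    """Hypothetical sketch: command-line values override the matching config
    section only when they were actually supplied (i.e. are not None)."""
    merged = dict(cfg[section.upper()])
    for key, value in args.items():
        if value is not None and key in merged:
            merged[key] = value
    return merged

config = {"TRAIN": {"batch_size": 4, "epoch": 9, "learning_rate": 0.001}}
cli_args = {"batch_size": 16, "epoch": None}  # e.g. `python train.py --batch_size 16`
train_cfg = merge_configs(config, "train", cli_args)
# batch_size comes from the CLI; epoch and learning_rate fall back to the config.
```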