add caffe2paddle.py #58
Merged: guoshengCS merged 8 commits into PaddlePaddle:develop from guoshengCS:add_caffe2paddle_tool on Jun 12, 2017
Changes from 5 commits (8 commits total, all by guoshengCS):

7167dd1  add caffe2paddle.py
4f95023  delete useless code
54c03f6  add param proto
ae9c48a  add caffe_predict to test
f8bc721  add README.md
d958e7a  update README and remove image.py
ffacc4b  Merge branch 'develop' of https://github.com/PaddlePaddle/models into…
868369d  update defalut name corresponding to up-to-date paddle
README.md (new file)
@@ -0,0 +1,55 @@
## Usage

`caffe2paddle.py` provides `ModelConverter`, an interface for converting models trained with Caffe into models that PaddlePaddle can use. It wraps conversion functions for layers commonly used in image models, such as Convolution and BatchNorm, and can convert popular models such as VGG and ResNet. The basic conversion process is: load the model through Caffe's Python API and fetch the information of each layer in turn, adapt the parameters to PaddlePaddle according to the layer type and serialize them to disk (layers without trainable parameters, such as Pooling, are skipped), and output a model file that PaddlePaddle's Python API can load directly.
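As a minimal illustration of that process (the prototxt and caffemodel paths below are placeholders, not files shipped with this tool), loading a Caffe network and walking over its layers and their learnable blobs looks roughly like this:

```python
import caffe

# Load the network in test mode; the file names here are placeholders.
net = caffe.Net("deploy.prototxt", "model.caffemodel", caffe.TEST)
for layer_name, layer in net.layer_dict.items():
    # layer.blobs holds the learnable parameters (e.g. weights and bias);
    # layers such as Pooling have no blobs and are simply skipped.
    if len(layer.blobs) > 0:
        print layer_name, layer.type, [blob.data.shape for blob in layer.blobs]
```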
The definition of `ModelConverter` and the purpose of its methods are as follows:

```python
class ModelConverter(object):
    # Set the Caffe network definition file, the Caffe model file path and the file
    # name of the Paddle model to be saved, and load the model with the Caffe API.
    def __init__(self, caffe_model_file, caffe_pretrained_file, paddle_tar_name)

    # Write out and save the Paddle model.
    def to_tar(self, f)

    # Serialize parameter values to binary output.
    @staticmethod
    def serialize(data, f)

    # Convert each layer in turn, naming layers and parameters according to name_map.
    def convert(self, name_map={})

    # Convert the parameters of a Convolution layer in the Caffe model; the name value
    # is used to name the parameters of the corresponding layer in the Paddle model.
    @wrap_name_default("img_conv_layer")
    def convert_Convolution_layer(self, params, name=None)

    # Convert the parameters of an InnerProduct layer in the Caffe model; the name value
    # is used to name the parameters of the corresponding layer in the Paddle model.
    @wrap_name_default("fc_layer")
    def convert_InnerProduct_layer(self, params, name=None)

    # Convert the parameters of a BatchNorm layer in the Caffe model; the name value
    # is used to name the parameters of the corresponding layer in the Paddle model.
    @wrap_name_default("batch_norm_layer")
    def convert_BatchNorm_layer(self, params, name=None)

    # Convert the parameters of a Scale layer in the Caffe model; the name value
    # is used to name the parameters of the corresponding layer in the Paddle model.
    def convert_Scale_layer(self, params, name=None)

    # Given an image path and a mean file path, run prediction with the loaded Caffe model.
    def caffe_predict(self, img, mean_file)
```
`ModelConverter` is used as follows:

```python
# Specify the Caffe network definition file, the model file path and the file name of
# the Paddle model to be saved, then load the model from the given files.
converter = ModelConverter("./ResNet-50-deploy.prototxt",
                           "./ResNet-50-model.caffemodel",
                           "Paddle_ResNet50.tar.gz")
# Run the model conversion.
converter.convert(name_map={})
# Run prediction and print the predicted probabilities so the conversion result can be verified.
converter.caffe_predict(img='./caffe/examples/images/cat.jpg')
```
To verify and use the converted model, you need to write the corresponding network configuration with the PaddlePaddle API; see the PaddlePaddle documentation for details. A ResNet configuration is attached here for convenience.

Note that the conversion above passes an empty `name_map` to `ModelConverter.convert`. In that case, the parameters saved while iterating over the layers follow PaddlePaddle's default layer and parameter naming rule: the layer name is built from the value given to `wrap_name_default` plus an invocation counter, and parameter names use it as a prefix (for example, the bias parameter of the first InnerProduct layer will be named `___fc_layer_0__.wbias`). The PaddlePaddle network configuration must therefore follow the same topological order as the Caffe model; in particular, for branched architectures such as ResNet, the branches must appear in the same order in PaddlePaddle and in Caffe, otherwise the model parameters cannot be loaded correctly.

If you do not want to rely on the default layer names, a finer-grained approach is available: build a `dict` that maps Caffe layer names to PaddlePaddle layer names and pass it to `ModelConverter.convert` as `name_map`; the given layer names will then be used when naming and saving each layer's parameters. The `name_map` only needs to cover the Convolution, InnerProduct and BatchNorm layers of the Caffe network configuration: layers that require no training, such as Pooling, do not need to be saved, so no conversion interface is provided for them; and because of implementation differences between Caffe and PaddlePaddle, PaddlePaddle's batch_norm layer combines Caffe's BatchNorm and Scale layers, so the Scale layer receives special handling.
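For illustration, a minimal `name_map` for a hypothetical network could look like the sketch below; the Caffe layer names on the left and the PaddlePaddle layer names on the right are invented for this example:

```python
# Hypothetical mapping from Caffe layer names to PaddlePaddle layer names; only
# Convolution, InnerProduct and BatchNorm layers need entries.
name_map = {
    "conv1": "img_conv1",
    "bn_conv1": "bn1",
    "fc1000": "output_fc",
}
converter.convert(name_map=name_map)
```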
Review comment: Please split this into multiple paragraphs.

Reply: done
caffe2paddle.py (new file)
@@ -0,0 +1,243 @@
import os
import functools
import inspect
import struct
import gzip
import tarfile
import cStringIO
import numpy as np
import caffe
from paddle.proto.ParameterConfig_pb2 import ParameterConfig
from image import load_and_transform


def __default_not_set_callback__(kwargs, name):
    return name not in kwargs or kwargs[name] is None


def wrap_param_default(param_names=None,
                       default_factory=None,
                       not_set_callback=__default_not_set_callback__):
    assert param_names is not None
    assert isinstance(param_names, list) or isinstance(param_names, tuple)
    for each_param_name in param_names:
        assert isinstance(each_param_name, basestring)

    def __impl__(func):
        @functools.wraps(func)
        def __wrapper__(*args, **kwargs):
            if len(args) != 0:
                argspec = inspect.getargspec(func)
                num_positional = len(argspec.args)
                if argspec.defaults:
                    num_positional -= len(argspec.defaults)
                assert argspec.varargs or len(
                    args
                ) <= num_positional, "Must use keyword arguments for non-positional args"
            for name in param_names:
                if not_set_callback(kwargs, name):  # Not set
                    kwargs[name] = default_factory(func)
            return func(*args, **kwargs)

        if hasattr(func, "argspec"):
            __wrapper__.argspec = func.argspec
        else:
            __wrapper__.argspec = inspect.getargspec(func)
        return __wrapper__

    return __impl__


class DefaultNameFactory(object):
    def __init__(self, name_prefix):
        self.__counter__ = 0
        self.__name_prefix__ = name_prefix

    def __call__(self, func):
        if self.__name_prefix__ is None:
            self.__name_prefix__ = func.__name__
        tmp = "__%s_%d__" % (self.__name_prefix__, self.__counter__)
        self.__check_name__(tmp)
        self.__counter__ += 1
        return tmp

    def __check_name__(self, nm):
        pass

    def reset(self):
        self.__counter__ = 0


def wrap_name_default(name_prefix=None, name_param="name"):
    """
    Decorator to set "name" arguments default to "{name_prefix}_{invoke_count}".

    .. code:: python

        @wrap_name_default("some_name")
        def func(name=None):
            print name  # name will never be None. If name is not set,
                        # name will be "some_name_%d"

    :param name_prefix: name prefix. wrapped function's __name__ if None.
    :type name_prefix: basestring
    :return: a decorator to set default name
    :rtype: callable
    """
    factory = DefaultNameFactory(name_prefix)
    return wrap_param_default([name_param], factory)

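# Note on naming: the factory above produces layer names such as "__fc_layer_0__"
# ("__" + prefix + "_" + invocation count + "__"); the converter methods below
# prepend one more underscore when building parameter names, e.g.
# "___fc_layer_0__.wbias" for the bias of the first InnerProduct layer.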
class ModelConverter(object):
    def __init__(self, caffe_model_file, caffe_pretrained_file,
                 paddle_tar_name):
        self.net = caffe.Net(caffe_model_file, caffe_pretrained_file,
                             caffe.TEST)
        self.tar_name = paddle_tar_name
        self.params = dict()
        self.pre_layer_name = ""
        self.pre_layer_type = ""

    def convert(self, name_map={}):
        layer_dict = self.net.layer_dict
        for layer_name in layer_dict.keys():
            layer = layer_dict[layer_name]
            layer_params = layer.blobs
            layer_type = layer.type
            if len(layer_params) > 0:
                self.pre_layer_name = getattr(
                    self, "convert_" + layer_type + "_layer")(
                        layer_params,
                        name=None
                        if name_map is None else name_map.get(layer_name))
                self.pre_layer_type = layer_type
        with gzip.open(self.tar_name, 'w') as f:
            self.to_tar(f)
        return

    def to_tar(self, f):
        tar = tarfile.TarFile(fileobj=f, mode='w')
        for param_name in self.params.keys():
            param_conf, param_data = self.params[param_name]

            confStr = param_conf.SerializeToString()
            tarinfo = tarfile.TarInfo(name="%s.protobuf" % param_name)
            tarinfo.size = len(confStr)
            buf = cStringIO.StringIO(confStr)
            buf.seek(0)
            tar.addfile(tarinfo, fileobj=buf)

            buf = cStringIO.StringIO()
            self.serialize(param_data, buf)
            tarinfo = tarfile.TarInfo(name=param_name)
            buf.seek(0)
            tarinfo.size = len(buf.getvalue())
            tar.addfile(tarinfo, buf)

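    # The header written by serialize() below packs three fields ("IIQ"): a version
    # number (0), the size in bytes of each value (4, i.e. float32), and the number
    # of values, followed by the raw parameter bytes. This is assumed to match
    # PaddlePaddle's parameter serialization layout.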
    @staticmethod
    def serialize(data, f):
        f.write(struct.pack("IIQ", 0, 4, data.size))
        f.write(data.tobytes())

@wrap_name_default("img_conv_layer") | ||
def convert_Convolution_layer(self, params, name=None): | ||
for i in range(len(params)): | ||
data = np.array(params[i].data) | ||
if len(params) == 2: | ||
suffix = "0" if i == 0 else "bias" | ||
file_name = "_%s.w%s" % (name, suffix) | ||
else: | ||
file_name = "_%s.w%s" % (name, str(i)) | ||
param_conf = ParameterConfig() | ||
param_conf.name = file_name | ||
param_conf.size = reduce(lambda a, b: a * b, data.shape) | ||
self.params[file_name] = (param_conf, data.flatten()) | ||
|
||
return name | ||
|
||
@wrap_name_default("fc_layer") | ||
def convert_InnerProduct_layer(self, params, name=None): | ||
for i in range(len(params)): | ||
data = np.array(params[i].data) | ||
if len(params) == 2: | ||
suffix = "0" if i == 0 else "bias" | ||
file_name = "_%s.w%s" % (name, suffix) | ||
else: | ||
file_name = "_%s.w%s" % (name, str(i)) | ||
data = np.transpose(data) | ||
param_conf = ParameterConfig() | ||
param_conf.name = file_name | ||
dims = list(data.shape) | ||
if len(dims) < 2: | ||
dims.insert(0, 1) | ||
param_conf.size = reduce(lambda a, b: a * b, dims) | ||
param_conf.dims.extend(dims) | ||
self.params[file_name] = (param_conf, data.flatten()) | ||
return name | ||
|
||
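    # Caffe's BatchNorm layer stores three blobs: running mean, running variance and
    # a moving-average scale factor; the stored statistics must be divided by that
    # factor (the multiplication by `scale` below). The learnable gamma/beta live in
    # Caffe's separate Scale layer, whereas PaddlePaddle's batch_norm layer holds
    # mean, variance, gamma and beta together, so convert_Scale_layer reuses the name
    # of the preceding BatchNorm layer.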
@wrap_name_default("batch_norm_layer") | ||
def convert_BatchNorm_layer(self, params, name=None): | ||
scale = 1 / np.array(params[-1].data)[0] if np.array( | ||
params[-1].data)[0] != 0 else 0 | ||
for i in range(2): | ||
data = np.array(params[i].data) * scale | ||
file_name = "_%s.w%s" % (name, str(i + 1)) | ||
param_conf = ParameterConfig() | ||
param_conf.name = file_name | ||
dims = list(data.shape) | ||
assert len(dims) == 1 | ||
dims.insert(0, 1) | ||
param_conf.size = reduce(lambda a, b: a * b, dims) | ||
param_conf.dims.extend(dims) | ||
self.params[file_name] = (param_conf, data.flatten()) | ||
return name | ||
|
||
    def convert_Scale_layer(self, params, name=None):
        assert self.pre_layer_type == "BatchNorm"
        name = self.pre_layer_name
        for i in range(len(params)):
            data = np.array(params[i].data)
            suffix = "0" if i == 0 else "bias"
            file_name = "_%s.w%s" % (name, suffix)
            param_conf = ParameterConfig()
            param_conf.name = file_name
            dims = list(data.shape)
            assert len(dims) == 1
            dims.insert(0, 1)
            param_conf.size = reduce(lambda a, b: a * b, dims)
            if i == 1:
                param_conf.dims.extend(dims)
            self.params[file_name] = (param_conf, data.flatten())
        return name

    def caffe_predict(self,
                      img,
                      mean_file='./caffe/imagenet/ilsvrc_2012_mean.npy'):
        net = self.net

        net.blobs['data'].data[...] = load_image(img, mean_file)
        out = net.forward()

        output_prob = net.blobs['prob'].data[0].flatten()
        print np.sort(output_prob)[::-1]
        print np.argsort(output_prob)[::-1]
        print 'predicted class is:', output_prob.argmax()

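# load_image: resize and crop the image with load_and_transform, reverse the channel
# order (presumably to match the channel order the Caffe model expects), subtract the
# per-channel mean taken from the mean file, and rescale the pixel values.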
def load_image(file, mean_file):
    im = load_and_transform(file, 256, 224, is_train=False)
    im = im[(2, 1, 0), :, :]
    mu = np.load(mean_file)
    mu = mu.mean(1).mean(1)
    im = im - mu[:, None, None]
    im = im / 255.0
    return im

if __name__ == "__main__":
    converter = ModelConverter("./resnet50/ResNet-50-deploy.prototxt",
                               "./resnet50/ResNet-50-model.caffemodel",
                               "paddle_resnet50.tar.gz")
    converter.convert(name_map=dict())
    converter.caffe_predict("./images/cat.jpg")
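To sanity-check the result on the PaddlePaddle side, the converted tarball can be loaded with the PaddlePaddle v2 Python API. The following is a minimal sketch, assuming PaddlePaddle v2 is installed and `paddle_resnet50.tar.gz` was produced by the script above:

```python
import gzip

import paddle.v2 as paddle

paddle.init(use_gpu=False, trainer_count=1)

# Load the parameter tarball written by caffe2paddle.py.
with gzip.open("paddle_resnet50.tar.gz") as f:
    parameters = paddle.parameters.Parameters.from_tar(f)

# With an empty name_map the parameter names follow the default naming scheme
# described in the README, e.g. "___fc_layer_0__.wbias".
print parameters.names()
```

To actually run inference, build a PaddlePaddle network configuration with the same topology as the Caffe model (as described in the README above) and pass the loaded parameters to `paddle.infer`.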
Review comment: The documentation should tell users clearly and concisely how to use the tool, e.g. after the parameters are configured, show what to run; then list the code-related explanations and caveats afterwards.

Reply: done