This code was used for offline experiments of JDRec (http://arxiv.org/abs/2102.xxxxx). You can reproduce the offline experiments in the paper, or just obtain the JDRec dataset for further research.
3 quick steps to reproduce the offline experiments in the paper:
- download the data:
sh get_data.sh
- train the models:
python main.py train
- evaluate the models:
python main.py eval
With Python 3.8 and TensorFlow 2.2.0, we obtained the following results, which can serve as a baseline for further research on reinforcement learning for recommender systems:
Evaluator | click AUC |
---|---|
Pointwise Evaluator | 0.7201 |
Listwise Evaluator | 0.7248 |

Generator | Average CTR |
---|---|
Naive RL Generator | 0.0390 |
CTR RL Generator | 0.0650 |
If you only need the JDRec dataset for further research, just run the script get_data.sh, or download the data directly through the links in the script. Please cite the paper if you use the data in any way.
In addition to the information provided in the paper, here are more details about the JDRec dataset:
In our offline experiments, the data format is one-item-per-line csv. Each line consists of 53 columns, in the following order:
Click, RerankIndex, Improv, RequestTime, SkuCategory1, SkuCategory2, SkuCategory3, SkuShopId, SkuVendorId, SkuBestProduct, SkuBrandId, PCtr, PCvr, PGmv, TotalValue, PCtrCtrInt, PCtrCvrInt, PCtrGmvInt, PageNum, PCtrInt, PCvrInt, PGmvInt, ValueInt, CidOneExpNum, CidOneClkNum, CidOneNoClkNum, CidOneExpGap, CidOneClkGap, CidOneClkTimestamp, CidTwoExpNum, CidTwoClkNum, CidTwoNoClkNum, CidTwoExpGap, CidTwoClkGap, CidTwoClkTimestamp, CidThreeExpNum, CidThreeClkNum, CidThreeNoClkNum, CidThreeExpGap, CidThreeClkGap, CidThreeClkTimestamp, BrandExpNum, BrandClkNum, BrandNoClkNum, BrandExpGap, BrandClkGap, BrandClkTimestamp, ProductExpNum, ProductClkNum, ProductNoClkNum, ProductExpGap, ProductClkGap, ProductClkTimestamp.
Each sample includes 44 items, so lines 1 through 44 belong to the first sample, lines 45 through 88 belong to the second one, etc. In each sample, the first 4 lines are the items finally selected by the online rerank module, while the following 40 lines are all the candidate items (including the 4 selected items). All samples are ordered by RequestTime.
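The layout described above can be read with a small helper like the one below. This is only a minimal sketch, not the loader used in main.py: the names `COLUMNS`, `iter_samples`, `ITEMS_PER_SAMPLE` and `NUM_SELECTED` are hypothetical, and it assumes the csv files are comma-separated with no header row.

```python
import csv
from itertools import islice

# The 23 leading columns, in the order listed above.
BASE_COLUMNS = [
    "Click", "RerankIndex", "Improv", "RequestTime",
    "SkuCategory1", "SkuCategory2", "SkuCategory3", "SkuShopId",
    "SkuVendorId", "SkuBestProduct", "SkuBrandId",
    "PCtr", "PCvr", "PGmv", "TotalValue",
    "PCtrCtrInt", "PCtrCvrInt", "PCtrGmvInt", "PageNum",
    "PCtrInt", "PCvrInt", "PGmvInt", "ValueInt",
]
# The remaining 30 columns are 5 groups of 6 history features each.
PREFIXES = ["CidOne", "CidTwo", "CidThree", "Brand", "Product"]
SUFFIXES = ["ExpNum", "ClkNum", "NoClkNum", "ExpGap", "ClkGap", "ClkTimestamp"]
COLUMNS = BASE_COLUMNS + [p + s for p in PREFIXES for s in SUFFIXES]

ITEMS_PER_SAMPLE = 44  # 4 selected items followed by 40 candidates
NUM_SELECTED = 4


def iter_samples(lines):
    """Yield (selected, candidates) for each 44-line sample.

    `lines` is any iterable of csv text lines, e.g. an open file.
    Assumption: comma-separated values, no header row.
    """
    reader = csv.reader(lines)
    while True:
        block = list(islice(reader, ITEMS_PER_SAMPLE))
        if not block:
            return
        if len(block) != ITEMS_PER_SAMPLE:
            raise ValueError("file ends in the middle of a sample")
        rows = []
        for row in block:
            if len(row) != len(COLUMNS):
                raise ValueError("expected %d columns, got %d" % (len(COLUMNS), len(row)))
            rows.append(dict(zip(COLUMNS, row)))
        # First 4 rows: items chosen by the online rerank module.
        # Next 40 rows: all candidates (the 4 chosen items appear here too).
        yield rows[:NUM_SELECTED], rows[NUM_SELECTED:]
```

For a downloaded file, pass the open file object to `iter_samples` (opened with `newline=""`, as the csv module recommends). Values are kept as strings; convert them as needed.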
If you are still confused about our csv data format, you can also choose the one-sample-per-line json format. The JDRec dataset in the two formats includes exactly the same data, except that the json format data also includes column information and sample-granularity structure. To download the json format dataset, you only need to replace every 'csv' in the links with 'json'. For example: http://storage.jd.com/jdrec-json/train_0.json, etc.
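The link substitution can be done programmatically, as in this minimal sketch. The helper name `to_json_url` and the csv link in the example are hypothetical; the real csv links are the ones listed in get_data.sh.

```python
def to_json_url(csv_url: str) -> str:
    """Replace every 'csv' in a download link with 'json'."""
    return csv_url.replace("csv", "json")


# Hypothetical csv link, used only for illustration; take the real ones
# from get_data.sh.
example_csv_url = "http://storage.jd.com/jdrec-csv/train_0.csv"
example_json_url = to_json_url(example_csv_url)
```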