SeekerYb/JDRec

JDRec

This repository contains the code used for the offline experiments of JDRec (http://arxiv.org/abs/2102.xxxxx). You can use it to reproduce the offline experiments in the paper, or simply to obtain the JDRec dataset for further research.

offline experiments

3 quick steps to reproduce the offline experiments in the paper:

  1. download the data: sh get_data.sh
  2. train the models: python main.py train
  3. evaluate the models: python main.py eval

Using Python 3.8 and TensorFlow 2.2.0, we obtained the following results, which can serve as a baseline for further research on reinforcement learning for recommender systems:

| Evaluator | Click AUC |
| --- | --- |
| Pointwise Evaluator | 0.7201 |
| Listwise Evaluator | 0.7248 |

| Generator | Average CTR |
| --- | --- |
| Naive RL Generator | 0.0390 |
| CTR RL Generator | 0.0650 |

JDRec dataset

If you only need the JDRec dataset, run the script get_data.sh, or download the data directly from the links in that script. Please cite the paper if you use the data in any way.

Beyond the information provided in the paper, a few more details about the JDRec dataset are worth noting:

In our offline experiments, the data is stored as CSV with one item per line. Each line consists of 53 columns, in the following order:

Click, RerankIndex, Improv, RequestTime, SkuCategory1, SkuCategory2, SkuCategory3, SkuShopId, SkuVendorId, SkuBestProduct, SkuBrandId, PCtr, PCvr, PGmv, TotalValue, PCtrCtrInt, PCtrCvrInt, PCtrGmvInt, PageNum, PCtrInt, PCvrInt, PGmvInt, ValueInt, CidOneExpNum, CidOneClkNum, CidOneNoClkNum, CidOneExpGap, CidOneClkGap, CidOneClkTimestamp, CidTwoExpNum, CidTwoClkNum, CidTwoNoClkNum, CidTwoExpGap, CidTwoClkGap, CidTwoClkTimestamp, CidThreeExpNum, CidThreeClkNum, CidThreeNoClkNum, CidThreeExpGap, CidThreeClkGap, CidThreeClkTimestamp, BrandExpNum, BrandClkNum, BrandNoClkNum, BrandExpGap, BrandClkGap, BrandClkTimestamp, ProductExpNum, ProductClkNum, ProductNoClkNum, ProductExpGap, ProductClkGap, ProductClkTimestamp.
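The column order above can be turned into a small parsing helper. The following is a minimal sketch, not part of the repository: it assumes the CSV files are comma-separated and carry no header row, and the names `COLUMNS` and `parse_line` are hypothetical. All values are kept as strings, since the column types are not stated here.

```python
import csv
import io

# The first 23 columns, in the order listed above.
_BASE = [
    "Click", "RerankIndex", "Improv", "RequestTime",
    "SkuCategory1", "SkuCategory2", "SkuCategory3", "SkuShopId",
    "SkuVendorId", "SkuBestProduct", "SkuBrandId",
    "PCtr", "PCvr", "PGmv", "TotalValue",
    "PCtrCtrInt", "PCtrCvrInt", "PCtrGmvInt", "PageNum",
    "PCtrInt", "PCvrInt", "PGmvInt", "ValueInt",
]

# The remaining 30 columns repeat six statistics for five groups.
_PREFIXES = ["CidOne", "CidTwo", "CidThree", "Brand", "Product"]
_SUFFIXES = ["ExpNum", "ClkNum", "NoClkNum", "ExpGap", "ClkGap", "ClkTimestamp"]

COLUMNS = _BASE + [p + s for p in _PREFIXES for s in _SUFFIXES]
assert len(COLUMNS) == 53


def parse_line(line):
    """Map one CSV line (one item) to a dict keyed by column name.

    Assumption: comma-separated values, no header row.
    """
    fields = next(csv.reader(io.StringIO(line)))
    if len(fields) != len(COLUMNS):
        raise ValueError(
            "expected %d columns, got %d" % (len(COLUMNS), len(fields))
        )
    return dict(zip(COLUMNS, fields))
```

Building the last 30 names from prefixes and suffixes keeps the list short and makes a miscounted column easier to spot than in a hand-typed list of 53 names.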

Each sample includes 44 items, so lines 1 through 44 belong to the first sample, lines 45 through 88 belong to the second, and so on. Within each sample, the first 4 lines are the items finally selected by the online rerank module, and the following 40 lines are all the candidate items (including the 4 selected items). All samples are ordered by RequestTime.
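The 44-lines-per-sample layout described above can be read with a simple chunking loop. This is a minimal sketch under that description, not code from the repository; `iter_samples` and the two constants are hypothetical names, and the input is any iterable of lines (for example an open file object).

```python
from itertools import islice

ITEMS_PER_SAMPLE = 44  # lines per sample, as described above
NUM_SELECTED = 4       # leading lines chosen by the online rerank module


def iter_samples(lines):
    """Yield (selected, candidates) pairs, one pair per 44-line sample.

    selected   -> the first 4 lines of the sample
    candidates -> the following 40 lines (these include the 4 selected items)
    """
    it = iter(lines)
    while True:
        block = list(islice(it, ITEMS_PER_SAMPLE))
        if not block:
            return
        if len(block) != ITEMS_PER_SAMPLE:
            raise ValueError(
                "incomplete sample: got %d lines, expected %d"
                % (len(block), ITEMS_PER_SAMPLE)
            )
        yield block[:NUM_SELECTED], block[NUM_SELECTED:]
```

Because the function is a generator, a file can be streamed sample by sample without loading it fully into memory, for example `for selected, candidates in iter_samples(open(path)): ...`, where `path` points at one of the downloaded CSV files.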

If the CSV format is still unclear, you can use the one-sample-per-line JSON format instead. The two formats contain exactly the same data, except that the JSON format also includes column information and sample-granularity structure. To download the JSON dataset, replace every 'csv' in the links with 'json', for example: http://storage.jd.com/jdrec-json/train_0.json.
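The link substitution described above is a plain string replacement. A minimal sketch follows; `csv_url_to_json` is a hypothetical helper, and the CSV URL in the usage note is an assumption inferred from the JSON example (the actual links are the ones in get_data.sh).

```python
def csv_url_to_json(url):
    """Replace every 'csv' in a dataset link with 'json'."""
    return url.replace("csv", "json")
```

Under the assumed CSV link pattern, `csv_url_to_json("http://storage.jd.com/jdrec-csv/train_0.csv")` gives `http://storage.jd.com/jdrec-json/train_0.json`, which matches the example above.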
