About Tenrec

Tenrec is a large-scale benchmark dataset for recommender systems. It was collected from two different feed-recommendation apps within Tencent and covers four recommendation scenarios. Tenrec has five characteristics: (1) it is large-scale, containing around 5 million users and 140 million interactions; (2) it contains not only positive user feedback but also true negative feedback (in contrast to one-class recommendation); (3) it contains overlapping users and items across the four scenarios; (4) it contains various types of positive user feedback, such as clicks, likes, shares, and follows; (5) it contains additional features beyond user IDs and item IDs.
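For illustration, below is a minimal sketch of how such an interaction log might be loaded and inspected with pandas. The file name and column names (user_id, item_id, click, like, share, follow) are placeholders, not the dataset's exact schema; check the dataset documentation for the real field names.

    import pandas as pd

    # Hypothetical file and column names, used only for illustration.
    df = pd.read_csv("tenrec_interactions.csv")

    # 'click' = 1 for a click and 0 for an exposed-but-not-clicked impression,
    # i.e. true negative feedback rather than merely missing feedback.
    print(df[["user_id", "item_id", "click", "like", "share", "follow"]].head())

    # Rough scale check: distinct users, distinct items, total interactions.
    print(df["user_id"].nunique(), df["item_id"].nunique(), len(df))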

If you use this dataset for your research, please cite the following paper:

@article{yuan2022tenrec,
  title={Tenrec: A Large-scale Multipurpose Benchmark Dataset for Recommender Systems},
  author={Yuan, Guanghu and Yuan, Fajie and Li, Yudong and Kong, Beibei and Li, Shujie and Chen, Lei and Yang, Min and Yu, Chenyun and Hu, Bo and Li, Zang and others},
  journal={arXiv preprint arXiv:2210.10629},
  year={2022}
}

* Email Fajie and Guanghu if you want to launch a new leaderboard for an important RS task using Tenrec.

Guanghu Yuan, Fajie Yuan, Yudong Li, Beibei Kong, Shujie Li, Lei Chen, Min Yang, Chenyun Yu, Bo Hu, Zang Li, Yu Xu, Xiaohu Qie. Tenrec: A Large-Scale Multipurpose Benchmark Dataset for Recommender Systems. NeurIPS 2022.


Download

The Tenrec dataset can be used for research purposes and downloaded via the Tenrec Dataset link; the accompanying code is available via the Code link. Before you download the dataset, please read these terms.

Note

If you use Tenrec (with our training, validation, and test sets) and obtain new SOTA results, we are happy to add them to the leaderboard. In that case, please provide (1) your algorithm code; (2) all of your hyper-parameters; and (3) a readme file that tells other researchers how to run your code. We will post them on the leaderboard website and make sure your models are evaluated under a fair comparison and common practice. We are also happy to create a new leaderboard if you use Tenrec for new tasks; just email us.

Leaderboard

CTR-1M (Separate Embedding)

Rank  Date            Model        AUC
1     Aug. 07, 2017   NFM          0.7957
2     Jun. 03, 2021   DCN-v2       0.7932
3     Jul. 19, 2018   xDeepFM      0.7931
4     Mar. 13, 2017   DeepFM       0.7930
5     Aug. 15, 2017   AFM          0.7928
6     Aug. 14, 2017   DCN          0.7927
7     Sept. 15, 2016  Wide & Deep  0.7919

CTR-1M (Shared Embedding)

Rank  Date            Model        AUC
1     Aug. 07, 2017   NFM          0.7924
2     Jun. 03, 2021   DCN-v2       0.7922
3     Jul. 19, 2018   xDeepFM      0.7922
4     Aug. 14, 2017   AFM          0.7921
5     Mar. 13, 2017   DeepFM       0.7920
6     --              DIEN         0.7918
7     Aug. 14, 2017   DCN          0.7911
8     --              DIN          0.7910
9     Sept. 15, 2016  Wide & Deep  0.7910

CTR-5M

Rank  Date            Model        AUC
1     Jul. 19, 2018   xDeepFM      0.8235
2     Mar. 13, 2017   DeepFM       0.8235
3     Sept. 15, 2016  Wide & Deep  0.8234
4     Aug. 07, 2017   NFM          0.8231
5     Aug. 15, 2017   AFM          0.8226
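The CTR leaderboards above rank models by AUC on the held-out test split. As a quick reminder of the metric, here is a hedged sketch of how AUC is typically computed from predicted click probabilities and binary click labels, e.g. with scikit-learn; the arrays below are made-up toy values, not Tenrec results.

    from sklearn.metrics import roc_auc_score

    # y_true: 1 = clicked, 0 = exposed but not clicked (true negative feedback).
    # y_score: the model's predicted click probabilities on the test set.
    y_true = [1, 0, 0, 1, 0]
    y_score = [0.82, 0.41, 0.23, 0.64, 0.71]
    print(roc_auc_score(y_true, y_score))  # 0.8333... for this toy example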

Session-based Recommendation 1M

Rank  Date           Model      NDCG@20
1     Jan. 30, 2019  NextItNet  0.0199
2     Dec. 30, 2018  SASRec     0.0194
3     Aug. 27, 2017  GRU4Rec    0.0192
4     Nov. 03, 2019  BERT4Rec   0.0185

Session-based Recommendation 5M

Rank  Date           Model      NDCG@20
1     Jan. 30, 2019  NextItNet  0.0214
2     Dec. 30, 2018  SASRec     0.0201
3     Nov. 03, 2019  BERT4Rec   0.0191
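The session-based recommendation leaderboards (and the other ranking tasks below) report NDCG@20. In the common leave-one-out setting with a single ground-truth next item, NDCG@k reduces to 1/log2(rank + 1) if the target item appears in the model's top-k list and 0 otherwise. A minimal sketch under that assumption (function and variable names are illustrative, not taken from the Tenrec code):

    import math

    def ndcg_at_k(ranked_item_ids, target_item_id, k=20):
        # Single relevant item, so IDCG = 1 and NDCG@k is the discounted
        # gain at the target's 1-based rank, or 0 if it misses the top k.
        top_k = ranked_item_ids[:k]
        if target_item_id not in top_k:
            return 0.0
        rank = top_k.index(target_item_id) + 1
        return 1.0 / math.log2(rank + 1)

    # Example: the ground-truth next item is ranked 3rd by the model.
    print(ndcg_at_k([42, 7, 99, 13], target_item_id=99))  # 1 / log2(4) = 0.5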

Multi-task Learning

Rank  Date            Model  click-AUC  like-AUC
1     Jun. 27, 2018   ESMM   0.7940     0.9110
2     Sept. 22, 2020  PLE    0.7822     0.9103
3     Jul. 19, 2018   MMOE   0.7900     0.9020

Transfer Learning

Rank  Date           Model      NDCG@20
1     Jan. 30, 2019  NextItNet  0.0489
2     Dec. 30, 2018  SASRec     0.0479

User Profile Prediction

Rank  Date           Model     Age-ACC  Gender-ACC
1     Nov. 03, 2019  BERT4Rec  0.69903  0.90082
2     Jul. 25, 2020  PeterRec  0.69712  0.90036
3     --             DNN       0.67875  0.88531

Cold-Start

Rank  Date           Model     NDCG@20
1     Nov. 03, 2019  BERT4Rec  0.0239
2     Jul. 25, 2020  PeterRec  0.0221

Cold-Start 0.3

Rank  Date           Model     NDCG@20
1     Nov. 03, 2019  BERT4Rec  0.0137
2     Jul. 25, 2020  PeterRec  0.0133

Cold-Start 0.7

Rank  Date           Model     NDCG@20
1     Nov. 03, 2019  BERT4Rec  0.0134
2     Jul. 25, 2020  PeterRec  0.0132

Cold-Start 1

Rank  Date           Model     NDCG@20
1     Nov. 03, 2019  BERT4Rec  0.0166
2     Jul. 25, 2020  PeterRec  0.0165

Lifelong Learning

Rank  Date           Model             Task1-NDCG@20  Task2-NDCG@20  Task3-NDCG@20  Task4-NDCG@20
1     Jul. 11, 2021  Conure-NextItNet  0.0177         0.0095         0.0167         0.1074
2     Jul. 11, 2021  Conure-SASRec     0.0172         0.0086         0.0166         0.0959

Model Compression

Rank  Date           Model         Compress Para.  NDCG@20
1     Jul. 25, 2020  Cp-NextItNet  33.1%           0.0195
2     Jul. 25, 2020  Cp-SASRec     30.1%           0.0191

Model Training Speedup

Rank  Date           Model            Time Speedup  NDCG@20
1     Jul. 11, 2021  Stack-NextItNet  63.6%         0.0202
2     Jul. 11, 2021  Stack-SASRec     30%           0.0196

Model Inference Speedup

Rank  Date          Model           Time Speedup  NDCG@20
1     May 18, 2021  Skip-NextItNet  23.4%         0.0472
2     May 18, 2021  Skip-SASRec     32.5%         0.0431

Top-N Recommendation with the random negative sampler

Rank  Date           Model     NDCG@20
1     Jul. 02, 2020  LightGCN  0.0542
2     Aug. 07, 2009  NGCF      0.0455
3     Jul. 18, 2019  MF        0.0437
4     Apr. 03, 2017  NCF       0.0403

Top-N Recommendation with the popularity negative sampler

Rank  Date           Model     NDCG@20
1     Jul. 02, 2020  LightGCN  0.0617
2     Aug. 07, 2009  NGCF      0.0476
3     Jul. 18, 2019  MF        0.0467
4     Apr. 03, 2017  NCF       0.0405