LGCM

This is the official repository for the Findings of EACL 2024 paper "Local and Global Contexts for Conversation".

Environment

python==3.6.8
torch==1.4.0
transformers==3.0.2

Usage

Data preparation

Tokenizer

We use the GPT-2 vocabulary in our experiments. To prepare the vocabulary files:

download gpt2-vocab.json from here, rename it to vocab.json, and move it to the folder ./gpt2_vocab/
download gpt2-merges.txt from here, rename it to merges.txt, and move it to the folder ./gpt2_vocab/
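
The two rename-and-move steps above can also be scripted. The following is a minimal sketch, not part of the repository: the function name `install_vocab_files` and the `download_dir` argument are hypothetical, and it assumes both files have already been downloaded into one local folder.

```python
import shutil
from pathlib import Path

# Downloaded file name -> name expected in ./gpt2_vocab/ (both from the steps above).
RENAMES = {
    "gpt2-vocab.json": "vocab.json",
    "gpt2-merges.txt": "merges.txt",
}


def install_vocab_files(download_dir, vocab_dir="./gpt2_vocab/"):
    """Move the downloaded GPT-2 files into vocab_dir under their new names.

    Hypothetical helper for illustration; download_dir is wherever the two
    files were saved.
    """
    download_dir = Path(download_dir)
    vocab_dir = Path(vocab_dir)
    vocab_dir.mkdir(parents=True, exist_ok=True)

    installed = []
    for source_name, target_name in RENAMES.items():
        source = download_dir / source_name
        if not source.is_file():
            raise FileNotFoundError("expected downloaded file: {}".format(source))
        target = vocab_dir / target_name
        # Rename and move in one step.
        shutil.move(str(source), str(target))
        installed.append(target)
    return installed
```

For example, `install_vocab_files("~/Downloads")` would need the path expanded first (`Path("~/Downloads").expanduser()`), since the sketch does not expand `~` itself.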

Datasets

We have trained LGCM on three publicly available dialogue datasets: PersonaChat, DailyDialog, and MultiWOZ.

After downloading the raw data, run the scripts in ./prepare_data/ to preprocess it.

Training

PersonaChat: bash scripts/train_personachat.sh
DailyDialog: bash scripts/train_dailydialog.sh
MultiWOZ: bash scripts/train_multiwoz.sh

Evaluation

PersonaChat: bash scripts/evaluate_personachat.sh
DailyDialog: bash scripts/evaluate_dailydialog.sh
MultiWOZ: bash scripts/evaluate_multiwoz.sh
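
To train and then evaluate on one dataset in a single step, the commands above can be chained from Python. This is a minimal sketch under stated assumptions: `script_command` and `run_pipeline` are hypothetical helpers, not part of the repository, and the script paths are only those listed above, run from the repository root.

```python
import subprocess

# Dataset and stage names as they appear in the script file names above.
DATASETS = ("personachat", "dailydialog", "multiwoz")
STAGES = ("train", "evaluate")


def script_command(stage, dataset):
    """Return the bash command for one stage and dataset."""
    if stage not in STAGES:
        raise ValueError("unknown stage: {}".format(stage))
    if dataset not in DATASETS:
        raise ValueError("unknown dataset: {}".format(dataset))
    return ["bash", "scripts/{}_{}.sh".format(stage, dataset)]


def run_pipeline(dataset, runner=subprocess.run):
    """Train, then evaluate, on one dataset; stop at the first failing script.

    runner is injectable so the command sequence can be inspected without
    launching a real training job.
    """
    for stage in STAGES:
        # check=True raises CalledProcessError if the script exits non-zero.
        runner(script_command(stage, dataset), check=True)
```

Calling `run_pipeline("dailydialog")` runs `bash scripts/train_dailydialog.sh` followed by `bash scripts/evaluate_dailydialog.sh`.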

