These are the three datasets we constructed for sentence transfer tasks.

Yelp-dm is extracted from Yelp, a sentiment-domain corpus whose examples come from the business review website Yelp.

Wiki-dm is extracted from WikiText-103 (Merity et al., 2016), an open-domain corpus that consists of Wikipedia articles.

Book-dm is extracted from BookCorpus (Zhu et al., 2015), another open-domain corpus, which consists of text from unpublished novels.

| Dataset | Train | Dev | Test |
|---------|-------|-----|------|
| Yelp-dm | 10K   | 1K  | 1K   |
| Wiki-dm | 80K   | 5K  | 3K   |
| Book-dm | 400K  | 30K | 30K  |
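
The split sizes above can be turned into totals and per-split proportions with a few lines of Python. This is only a minimal sketch: it assumes that "K" means thousands of examples and that the counts are rounded, and the names `SPLIT_SIZES` and `split_fractions` are hypothetical helpers, not part of this repository.

```python
# Approximate split sizes copied from the table, reading "K" as 1,000 examples.
# Assumption: the table reports rounded counts, so these are not exact.
SPLIT_SIZES = {
    "Yelp-dm": {"train": 10_000, "dev": 1_000, "test": 1_000},
    "Wiki-dm": {"train": 80_000, "dev": 5_000, "test": 3_000},
    "Book-dm": {"train": 400_000, "dev": 30_000, "test": 30_000},
}


def split_fractions(sizes):
    """Return each split's share of the dataset total."""
    total = sum(sizes.values())
    return {split: count / total for split, count in sizes.items()}


if __name__ == "__main__":
    for name, sizes in SPLIT_SIZES.items():
        fractions = split_fractions(sizes)
        summary = ", ".join(f"{split} {share:.1%}" for split, share in fractions.items())
        print(f"{name}: total {sum(sizes.values()):,} ({summary})")
```

Under these assumptions, the training split is the large majority of each dataset, and the dev and test shares differ somewhat between the three.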