## Seq2Seq (TensorFlow)

Seq2seq is a basic model for single-turn dialog. Here, we implement seq2seq with an attention mechanism. Refer to the following papers for details:

Sutskever, I., Vinyals, O., & Le, Q. V. (2014). Sequence to sequence learning with neural networks. In Advances in Neural Information Processing Systems.

Bahdanau, D., Cho, K., & Bengio, Y. (2015). Neural machine translation by jointly learning to align and translate. In International Conference on Learning Representations.

### Required Packages

* cotk
* TensorFlow == 1.13.1
* TensorBoardX >= 1.4
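
A quick way to confirm the pinned versions before training is to compare the installed version strings against the requirements above. The sketch below is only illustrative: `parse_version` and `satisfies` are hypothetical helpers, not part of cotk, and they handle plain dotted numeric versions only.

```python
def parse_version(text):
    """Turn a dotted version string such as '1.13.1' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def satisfies(installed, op, required):
    """Check an installed version against an '==' or '>=' requirement."""
    have, want = parse_version(installed), parse_version(required)
    # Pad the shorter tuple with zeros so '1.4' compares equal to '1.4.0'.
    width = max(len(have), len(want))
    have += (0,) * (width - len(have))
    want += (0,) * (width - len(want))
    if op == "==":
        return have == want
    if op == ">=":
        return have >= want
    raise ValueError("unsupported operator: " + op)


# Requirements copied from the list above.
REQUIREMENTS = {"tensorflow": ("==", "1.13.1"), "tensorboardX": (">=", "1.4")}

# Example strings (assumed values); replace them with e.g. tensorflow.__version__.
example_installed = {"tensorflow": "1.13.1", "tensorboardX": "1.6"}
for name, (op, required) in REQUIREMENTS.items():
    ok = satisfies(example_installed[name], op, required)
    print(name, op, required, "ok" if ok else "MISMATCH")
```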

### Quick Start

* Download the dataset and save it to ``./data``. (The dataset will be released soon.)
* Execute ``python run.py`` to train the model.
    - The default dataset is ``OpenSubtitles``. Use ``--dataset`` to specify another dataloader class.
    - Pretrained word vectors are not used by default. Use ``--wvclass`` to specify a wordvector class, for example ``--wvclass gloves``.
    - If you don't have a GPU, add ``--cpu`` to switch to CPU, but training may take a very long time.
* You can monitor training with TensorBoard; the log is written to ``./tensorboard``.
    - For example, ``tensorboard --logdir=./tensorboard``. (You have to install TensorBoard first.)
* After training, execute ``python run.py --mode test --restore best`` to run the test.
    - Use ``--restore filename`` to specify a checkpoint file; checkpoints are stored in ``./model``.
    - ``--restore last`` loads the last checkpoint; ``--restore best`` loads the best checkpoint on dev.
* Find the results in ``./output``.
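
The train and test steps above can also be scripted. The following is a minimal sketch, assuming `run.py` sits in the current directory; `build_command` is a hypothetical helper that only assembles the argument list from flags documented in this README, and nothing is executed unless you pass the result to `subprocess.run`.

```python
import sys


def build_command(mode="train", restore=None, dataset=None, wvclass=None, cpu=False):
    """Assemble the argument list for run.py from documented flags."""
    cmd = [sys.executable, "run.py"]
    if mode != "train":          # "train" is the documented default
        cmd += ["--mode", mode]
    if restore is not None:      # "last", "best", or a checkpoint name
        cmd += ["--restore", restore]
    if dataset is not None:
        cmd += ["--dataset", dataset]
    if wvclass is not None:
        cmd += ["--wvclass", wvclass]
    if cpu:
        cmd.append("--cpu")
    return cmd


train_cmd = build_command(cpu=True)
test_cmd = build_command(mode="test", restore="best")
print(" ".join(train_cmd[1:]))
print(" ".join(test_cmd[1:]))
# To actually launch a step: subprocess.run(train_cmd, check=True)
```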

### Arguments

    usage: run.py [-h] [--name NAME] [--restore RESTORE] [--mode MODE]
                  [--dataset DATASET] [--datapath DATAPATH] [--epoch EPOCH]
                  [--wvclass WVCLASS] [--wvpath WVPATH] [--out_dir OUT_DIR]
                  [--log_dir LOG_DIR] [--model_dir MODEL_DIR]
                  [--cache_dir CACHE_DIR] [--cpu] [--debug] [--cache]

    optional arguments:
      -h, --help            show this help message and exit

    useful arguments:
      --name NAME           The name of your model, used for variable scope and
                            tensorboard, etc.
                            Default: runXXXXXX_XXXXXX (initialized by current time)
      --restore RESTORE     Checkpoints name to load. "last" for last checkpoints,
                            "best" for best checkpoints on dev. Attention: "last"
                            and "best" will cause unexpected behaviour when run 2
                            models in the same dir at the same time. Default: None
                            (do not load anything)
      --mode MODE           "train" or "test". Default: train
      --dataset DATASET     Dataloader class. Default: OpenSubtitles
      --datapath DATAPATH   Directory for data set. Default: ./data
      --epoch EPOCH         Epoch for training. Default: 100
      --wvclass WVCLASS     Wordvector class, none for not using pretrained
                            wordvec. Default: None
      --wvpath WVPATH       Directory for pretrained wordvector. Default:
                            ./wordvec

    advanced arguments:
      --out_dir OUT_DIR     Output directory for test output. Default: ./output
      --log_dir LOG_DIR     Log directory for tensorboard. Default: ./tensorboard
      --model_dir MODEL_DIR
                            Checkpoints directory for model. Default: ./model
      --cache_dir CACHE_DIR
                            Checkpoints directory for cache. Default: ./cache
      --cpu                 Use cpu.
      --debug               Enter debug mode (using ptvsd).
      --cache               Use cache for speeding up load data and wordvec. (It
                            may cause problems when you switch dataset.)
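
For readers who want to see how such an interface is typically declared, here is a minimal `argparse` sketch of the arguments listed above. It is a reconstruction from the help text, not the actual contents of `run.py`; the function name `build_parser` is hypothetical, and the time-based default for `--name` is an assumption based on the `runXXXXXX_XXXXXX` pattern.

```python
import argparse
import time


def build_parser():
    """Declare the documented flags with the defaults shown in the help text."""
    parser = argparse.ArgumentParser(prog="run.py")

    useful = parser.add_argument_group("useful arguments")
    # Assumed format: the help text only says the name is initialized by current time.
    useful.add_argument("--name", default=time.strftime("run%y%m%d_%H%M%S"))
    useful.add_argument("--restore", default=None)
    useful.add_argument("--mode", default="train")
    useful.add_argument("--dataset", default="OpenSubtitles")
    useful.add_argument("--datapath", default="./data")
    useful.add_argument("--epoch", type=int, default=100)
    useful.add_argument("--wvclass", default=None)
    useful.add_argument("--wvpath", default="./wordvec")

    advanced = parser.add_argument_group("advanced arguments")
    advanced.add_argument("--out_dir", default="./output")
    advanced.add_argument("--log_dir", default="./tensorboard")
    advanced.add_argument("--model_dir", default="./model")
    advanced.add_argument("--cache_dir", default="./cache")
    advanced.add_argument("--cpu", action="store_true")
    advanced.add_argument("--debug", action="store_true")
    advanced.add_argument("--cache", action="store_true")
    return parser


args = build_parser().parse_args(["--mode", "test", "--restore", "best"])
print(args.mode, args.restore, args.dataset, args.epoch)
```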

### TensorBoard Example

Execute ``tensorboard --logdir=./tensorboard`` to see the plots on the TensorBoard pages:

![tensorboard_plot_example](.seq2seq-tensorflow/images/tensorflow-plot-example.png)

The following plots are shown for this model:

* train/loss
* train/perplexity
* dev/loss
* dev/perplexity
* test/loss
* test/perplexity
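
Loss and perplexity are logged as separate curves, but they are usually two views of the same quantity: perplexity is the exponential of the average per-token cross-entropy (in nats). The sketch below assumes that convention; whether this model computes its perplexity scalar exactly this way is an assumption, and `perplexity_from_loss` is a hypothetical helper.

```python
import math


def perplexity_from_loss(total_nll, num_tokens):
    """Perplexity = exp(mean negative log-likelihood per token, in nats)."""
    if num_tokens <= 0:
        raise ValueError("num_tokens must be positive")
    return math.exp(total_nll / num_tokens)


# A model that assigns probability 1/4 to each of 10 tokens:
total_nll = -10 * math.log(0.25)
print(perplexity_from_loss(total_nll, 10))  # approximately 4.0
```

Under this convention, a perplexity of 40.42 corresponds to an average loss of about 3.70 nats per token.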

And text output:

![tensorboard_text_example](.seq2seq-tensorflow/images/tensorflow-text-example.png)

The following text is shown for this model:

* args

### Case Study of Model Results

Execute ``python run.py --mode test --restore best``

The following output will be in `./output/[name]_[dev|test].txt`:

```none
bleu:  0.186838
perplexity:    40.417562
post:  if it were anyone but <unk> s son .
resp:  <unk> is a great fighter .
gen:   i dont know what to do .
post:  in the fortress , you will face more than the <unk> .
resp:  you will face the beast , who is their leader .
gen:   the ss s going to be crushed .
post:  in a cave on the highest peak .
resp:  without the <unk> , you will never be able to reach <unk> .
gen:   when the boys s out , then we started .
```
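
If you want to post-process such a file, the layout shown above (two metric lines followed by repeating `post`/`resp`/`gen` triples) is simple to parse. This is a minimal sketch that assumes every line has the form `key: value` exactly as in the sample; `parse_output` is a hypothetical helper, not part of cotk.

```python
def parse_output(text):
    """Split result text into a metrics dict and a list of post/resp/gen cases."""
    metrics, cases, current = {}, [], {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # skip blank or unexpected lines
        key, value = line.split(":", 1)
        key, value = key.strip(), value.strip()
        if key in ("post", "resp", "gen"):
            current[key] = value
            if key == "gen":  # "gen" closes one case in the sample layout
                cases.append(current)
                current = {}
        else:
            metrics[key] = float(value)  # assumes metric values are numeric
    return metrics, cases


sample = """bleu:  0.186838
perplexity:    40.417562
post:  in a cave on the highest peak .
resp:  without the <unk> , you will never be able to reach <unk> .
gen:   when the boys s out , then we started .
"""
metrics, cases = parse_output(sample)
print(metrics["bleu"], len(cases), cases[0]["gen"])
```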

### Performance

|               | Perplexity | BLEU  |
| ------------- | ---------- | ----- |
| OpenSubtitles | 40.42      | 0.187 |

### Author

KE Pei