GitHub - opensourceware/Seq-Domain-Adaptation

A domain adaptation system using adversarial learning. This project performs domain adaptation over sequential data for a sequence labeling task: part-of-speech (POS) tagging, adapted from the news domain to the medical domain.

The tagged news data comes from the Wall Street Journal (WSJ) portion of the Penn Treebank (PTB) corpus, and the tagged medical data comes from MedPost.

Three networks are trained: a source LSTM for news data, a target LSTM for medical data, and a common classifier that can tag sequences from both the source and target LSTM distributions.

Domain adaptation is performed in three steps:
1. Run python pretrain.py to train the POS tagger on the WSJ dataset.
   Pretrained weights can be found in the source_blstm_crf and source_model_only_embeddings folders.
2. Run python adv_train.py to perform adversarial training between the TargetLSTM and the discriminator.
3. Run python run_trained_model.py to evaluate the trained TargetLSTM on the MedPost dataset.
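
The three steps can also be chained from one small driver. The following is only a sketch: the script names come from the list above, but it assumes that each script runs without extra command-line arguments and is launched from the repository root. The names PIPELINE, build_commands and run_pipeline are hypothetical helpers, not part of the repository.

```python
import subprocess
import sys

# Script names are taken from the README. Assumption: each script runs
# without additional command-line arguments.
PIPELINE = [
    ("pretrain source tagger on WSJ", "pretrain.py"),
    ("adversarial training of TargetLSTM", "adv_train.py"),
    ("evaluate TargetLSTM on MedPost", "run_trained_model.py"),
]


def build_commands(python=sys.executable):
    """Return one argv list per step, in the order given in the README."""
    return [[python, script] for _, script in PIPELINE]


def run_pipeline(runner=subprocess.run):
    """Run the steps in order, stopping at the first failing step.

    `runner` is injectable so the sequence can be dry-run without
    launching any training.
    """
    for (label, _), cmd in zip(PIPELINE, build_commands()):
        print(f"[{label}] {' '.join(cmd)}")
        # check=True makes subprocess.run raise CalledProcessError
        # when a script exits with a non-zero status.
        runner(cmd, check=True)
```

Calling run_pipeline() from the repository root would launch the three scripts one after another. Passing a custom runner, for example one that only records the commands, gives a dry run.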

Steps 2 and 3 can alternate: to track the F1 score on the MedPost dataset during adversarial training, run ./script.sh simultaneously with adv_train.py.
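
For reference, an F1 score over tagged tokens can be computed as sketched below. This README does not say how the repository computes or averages F1, so the macro-averaging over gold tags is an assumption, and macro_f1 is a hypothetical helper rather than a function from utils.py or run_trained_model.py.

```python
from collections import Counter


def macro_f1(gold, pred):
    """Macro-averaged F1 over the tags that occur in the gold sequence.

    Assumption: one predicted tag per gold token, compared position by
    position. The repository's own metric may differ.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # tag p was predicted where it did not belong
            fn[g] += 1  # an occurrence of gold tag g was missed
    scores = []
    for tag in set(gold):
        denom = 2 * tp[tag] + fp[tag] + fn[tag]
        scores.append(2 * tp[tag] / denom if denom else 0.0)
    return sum(scores) / len(scores) if scores else 0.0


# Illustrative tags only; they are not drawn from the MedPost data.
gold_tags = ["DT", "NN", "VBZ", "NN"]
pred_tags = ["DT", "NN", "VBZ", "JJ"]
print(f"macro F1: {macro_f1(gold_tags, pred_tags):.3f}")
```

Macro-averaging weights every tag equally. For single-label tagging, micro-averaged F1 over all tags reduces to plain token accuracy, which is why a per-tag average is often the more informative number when comparing domains.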