tatk.nlu.svm.camrest package
Submodules
tatk.nlu.svm.camrest.evaluate module
Evaluate SVMNLU models on the Camrest test dataset.
- Metric:
dataset-level precision/recall/F1
- Usage:
PYTHONPATH=../../../.. python evaluate.py [usr|sys|all]
tatk.nlu.svm.camrest.evaluate.da2triples(dialog_act)
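The dataset-level precision/recall/F1 metric named above can be sketched as follows. This is a minimal illustration, not the code in evaluate.py: it assumes each dialog act has already been flattened into hashable (intent, slot, value) triples, and it reads "dataset level" as pooling true-positive, false-positive, and false-negative counts over all utterances before computing the scores. The function name dataset_prf and the sample triples are hypothetical.

```python
def dataset_prf(predicted, gold):
    """Pool counts over the whole dataset, then compute P/R/F1.

    predicted, gold: parallel lists with one entry per utterance,
    each entry an iterable of hashable triples (assumed layout:
    (intent, slot, value)).
    """
    tp = fp = fn = 0
    for pred_triples, gold_triples in zip(predicted, gold):
        pred_set, gold_set = set(pred_triples), set(gold_triples)
        tp += len(pred_set & gold_set)   # predicted and correct
        fp += len(pred_set - gold_set)   # predicted but not in gold
        fn += len(gold_set - pred_set)   # in gold but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Hypothetical data: two utterances, one correct, one spurious, one missed triple.
gold = [[("inform", "food", "chinese")], [("request", "phone", "?")]]
pred = [[("inform", "food", "chinese"), ("inform", "area", "north")], []]
print(dataset_prf(pred, gold))
```

Pooling counts before dividing means long utterances with many triples weigh more than short ones; averaging per-utterance scores would be a different metric.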
tatk.nlu.svm.camrest.nlu module
SVMNLU builds a classifier for each semantic tuple (intent-slot-value) based on n-gram features. The approach was first proposed by Mairesse et al. (2009). We adapt the implementation from pydial.
For more information, please refer to tatk/nlu/svm/camrest/README.md
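To make the idea of one classifier per semantic tuple over n-gram features more concrete, here is a rough sketch of the feature and label side only. It does not train an SVM and is not the pydial-derived implementation: whitespace tokenization, lowercasing, the max_n default, and the helper names ngram_features and binary_labels are all assumptions made for illustration.

```python
def ngram_features(utterance, max_n=3):
    """Return the set of word n-grams (n = 1..max_n) in an utterance.

    Tokenization is a plain lowercase whitespace split (an assumption).
    """
    tokens = utterance.lower().split()
    feats = set()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            feats.add(" ".join(tokens[i:i + n]))
    return feats


def binary_labels(utterance_tuples, all_tuples):
    """Build one yes/no target per semantic tuple.

    Each tuple in all_tuples would get its own binary classifier;
    the target is True when the tuple occurs in this utterance.
    """
    present = set(utterance_tuples)
    return {t: (t in present) for t in all_tuples}


# Hypothetical ontology of two tuples and one annotated utterance.
ontology = [("inform", "food", "chinese"), ("inform", "pricerange", "cheap")]
feats = ngram_features("cheap chinese food", max_n=2)
labels = binary_labels([("inform", "food", "chinese")], ontology)
print(sorted(feats))
print(labels)
```

Each binary target would then be paired with the same n-gram feature vector to train a separate classifier for that tuple.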
Trained models can be downloaded from:
https://tatk-data.s3-ap-northeast-1.amazonaws.com/svm_camrest_all.zip
https://tatk-data.s3-ap-northeast-1.amazonaws.com/svm_camrest_sys.zip
https://tatk-data.s3-ap-northeast-1.amazonaws.com/svm_camrest_usr.zip
Reference:
Mairesse, F., Gasic, M., Jurcicek, F., Keizer, S., Thomson, B., Yu, K., & Young, S. (2009, April). Spoken language understanding from unaligned data using discriminative classification models. In 2009 IEEE International Conference on Acoustics, Speech and Signal Processing (pp. 4749-4752). IEEE.
class tatk.nlu.svm.camrest.nlu.SVMNLU(mode)
Bases: tatk.nlu.nlu.NLU
__init__(mode)
SVM NLU initialization.
- Args:
- mode (str):
can be either ‘usr’, ‘sys’ or ‘all’, representing which side of data the model was trained on.
- Example:
nlu = SVMNLU(mode='usr')
predict(utterance, context=[])
Predict the dialog act of a natural language utterance.
- Args:
- utterance (str):
A natural language utterance.
- Returns:
- output (dict):
The dialog act of utterance.
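A short usage sketch combining the constructor and predict. It assumes the tatk package is installed and that the trained model for the chosen mode is available; the wrapper predict_user_act and the sample utterance are illustrative and not part of tatk. The wrapper returns None when tatk cannot be imported so the snippet stays runnable outside a tatk environment.

```python
def predict_user_act(utterance):
    """Predict the dialog act of a user utterance with SVMNLU.

    Returns None if tatk is not importable (illustrative fallback only).
    """
    try:
        from tatk.nlu.svm.camrest.nlu import SVMNLU
    except ImportError:
        return None
    nlu = SVMNLU(mode='usr')       # model trained on the user side
    return nlu.predict(utterance)  # documented to return a dict


result = predict_user_act("I want a cheap restaurant")
print(result)
```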
tatk.nlu.svm.camrest.preprocess module
Preprocess camrest data for SVMNLU.
- Usage:
python preprocess [mode=all|usr|sys]
mode: which side of the data will be used
- Require:
../../../../data/camrest/[train|val|test].json.zip: data file
../../../../data/camrest/db: database dir
- Output:
configs/ontology_camrest_[mode].json: ontology file
data/[mode]_data/: processed data dir
tatk.nlu.svm.camrest.preprocess.read_zipped_json(filepath, filename)
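The function is listed without a description. Judging only from its name and signature, it presumably loads a JSON member from a zip archive such as the [train|val|test].json.zip files listed above. A minimal sketch under that assumption follows; the name read_zipped_json_sketch, the UTF-8 decoding, and the sample file are hypothetical, and the real function may behave differently.

```python
import json
import os
import tempfile
import zipfile


def read_zipped_json_sketch(filepath, filename):
    """Load the JSON member `filename` from the zip archive at `filepath`.

    Assumes the member is UTF-8 encoded JSON.
    """
    with zipfile.ZipFile(filepath) as archive:
        return json.loads(archive.read(filename).decode("utf-8"))


# Round trip on a throwaway archive to show the intended usage.
with tempfile.TemporaryDirectory() as tmp_dir:
    zip_path = os.path.join(tmp_dir, "sample.json.zip")
    with zipfile.ZipFile(zip_path, "w") as archive:
        archive.writestr("sample.json", json.dumps([{"dial": []}]))
    loaded = read_zipped_json_sketch(zip_path, "sample.json")
print(loaded)
```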