tatk.policy.mdrg.multiwoz package

Submodules

tatk.policy.mdrg.multiwoz.auto_download module

tatk.policy.mdrg.multiwoz.auto_download.auto_download()

tatk.policy.mdrg.multiwoz.create_delex_data module

tatk.policy.mdrg.multiwoz.create_delex_data.addBookingPointer(task, turn, pointer_vector)

Add information about the availability of the booking option.

tatk.policy.mdrg.multiwoz.create_delex_data.addDBPointer(turn)

Create database pointer for all related domains.

tatk.policy.mdrg.multiwoz.create_delex_data.analyze_dialogue(dialogue, maxlen)

Cleaning procedure for all kinds of errors in text and annotation.

tatk.policy.mdrg.multiwoz.create_delex_data.buildDictionaries(word_freqs_usr, word_freqs_sys)

Build dictionaries for both the user and system sides. The size of the dictionary can be specified through the DICT_SIZE variable.
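As a rough illustration of a frequency-capped dictionary of this kind, the sketch below keeps only the most frequent words up to a cap (the `create_dict` helper, the special tokens, and the cap value are illustrative assumptions, not the script's actual implementation):

```python
from collections import Counter

def create_dict(word_freqs, dict_size):
    """Keep the dict_size most frequent words; everything else would map to <unk>.
    Special tokens here are an assumption for illustration."""
    specials = ['<pad>', '<unk>', '<sos>', '<eos>']
    most_common = [w for w, _ in Counter(word_freqs).most_common(dict_size - len(specials))]
    return {w: i for i, w in enumerate(specials + most_common)}

usr_freqs = {'i': 40, 'want': 25, 'a': 30, 'hotel': 12, 'cheap': 7}
usr_dict = create_dict(usr_freqs, dict_size=6)  # room for only 2 real words
```

With a cap of 6, only the two most frequent words survive alongside the four special tokens.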

tatk.policy.mdrg.multiwoz.create_delex_data.createDelexData()

Main function of the script: loads the delexical dictionary, goes through each dialogue, and 1) normalizes the data, 2) delexicalizes it, 3) adds the database pointer, and 4) saves the delexicalized data.
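The normalize-then-delexicalize part of this pipeline can be sketched as below (the `normalize`/`delexicalise` helpers and the sample dictionary entries are hypothetical, not the script's code):

```python
import re

# hypothetical delexical dictionary: surface value -> slot placeholder
delex_dict = {
    'saint johns chop house': '[restaurant_name]',
    '01223 353110': '[restaurant_phone]',
}

def normalize(text):
    # minimal normalization sketch: lowercase and collapse whitespace
    return re.sub(r'\s+', ' ', text.lower()).strip()

def delexicalise(text, dictionary):
    # replace each known surface value with its placeholder
    for value, placeholder in dictionary.items():
        text = text.replace(value, placeholder)
    return text

utterance = 'Saint Johns  Chop House, phone 01223 353110.'
delex = delexicalise(normalize(utterance), delex_dict)
```

After these two steps, the database pointer would be attached to the turn and the result written out.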

tatk.policy.mdrg.multiwoz.create_delex_data.createDict(word_freqs)
tatk.policy.mdrg.multiwoz.create_delex_data.delexicaliseReferenceNumber(sent, turn)

Based on the belief state, find the reference number that was generated randomly during data collection.
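Since the reference string is known from the belief state, the substitution itself is a simple replace; a hedged sketch (function name and placeholder token are assumptions for illustration):

```python
import re

def delexicalise_reference(sent, booked_reference):
    """Replace a booking reference taken from the belief state with a
    domain placeholder ('hotel' is assumed here for illustration)."""
    return re.sub(re.escape(booked_reference), '[hotel_reference]', sent)

sent = 'Your booking is confirmed, reference number is 7GAWK763.'
out = delexicalise_reference(sent, '7GAWK763')
```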

tatk.policy.mdrg.multiwoz.create_delex_data.divideData(data)

Given the test and validation set lists, divide the data into three different sets.
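A minimal sketch of such a split, assuming the data is a dict keyed by dialogue filename and the test/validation membership lists are given:

```python
def divide_data(data, test_list, val_list):
    """Split dialogues into train/val/test given test and validation filename lists."""
    test = {k: v for k, v in data.items() if k in test_list}
    val = {k: v for k, v in data.items() if k in val_list}
    train = {k: v for k, v in data.items() if k not in test_list and k not in val_list}
    return train, val, test

data = {'a.json': 1, 'b.json': 2, 'c.json': 3, 'd.json': 4}
train, val, test = divide_data(data, test_list={'c.json'}, val_list={'d.json'})
```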

tatk.policy.mdrg.multiwoz.create_delex_data.fixDelex(filename, data, data2, idx, idx_acts)

Given the system dialogue acts, fix the automatic delexicalization.

tatk.policy.mdrg.multiwoz.create_delex_data.get_dial(dialogue)

Extract a dialogue from the file

tatk.policy.mdrg.multiwoz.create_delex_data.get_summary_bstate(bstate)

Based on the MTurk annotations, we form the multi-domain belief state.

tatk.policy.mdrg.multiwoz.create_delex_data.is_ascii(s)
tatk.policy.mdrg.multiwoz.create_delex_data.loadData()
tatk.policy.mdrg.multiwoz.create_delex_data.main()

tatk.policy.mdrg.multiwoz.default_policy module

class tatk.policy.mdrg.multiwoz.default_policy.DefaultPolicy(hidden_size_pol, hidden_size, db_size, bs_size)

Bases: torch.nn.modules.module.Module

__init__(hidden_size_pol, hidden_size, db_size, bs_size)

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(encodings, db_tensor, bs_tensor, act_tensor=None)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
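The note above applies to every `nn.Module` in this package. As a pure-Python toy (not torch) that mimics the mechanism, `__call__` runs registered hooks around `forward()`, which a bare `forward()` call skips:

```python
class MiniModule:
    """Toy stand-in illustrating why calling the Module instance is preferred:
    __call__ runs registered forward hooks; forward() alone silently skips them."""
    def __init__(self):
        self._forward_hooks = []

    def register_forward_hook(self, hook):
        self._forward_hooks.append(hook)

    def forward(self, x):
        return x * 2

    def __call__(self, x):
        out = self.forward(x)
        for hook in self._forward_hooks:
            hook(self, x, out)  # hooks see module, input, output
        return out

calls = []
m = MiniModule()
m.register_forward_hook(lambda mod, inp, out: calls.append(out))
y1 = m(3)          # hook fires
y2 = m.forward(3)  # same result, but the hook is skipped
```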

tatk.policy.mdrg.multiwoz.evaluator module

tatk.policy.mdrg.multiwoz.evaluator.evaluateGeneratedDialogue(dialog, goal, realDialogue, real_requestables)

Evaluates the dialogue created by the model. First we load the user goal of the dialogue; then, for each turn generated by the system, we look for keywords. For the Inform rate we check whether the entity was proposed; for the Success rate we look for requestable slots.
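The Inform/Success bookkeeping can be sketched roughly as below (the real evaluator queries the database and tracks per-domain state; this `inform_success` helper is an illustrative simplification):

```python
def inform_success(system_turns, goal_entities, requestables):
    """Inform: was a goal entity offered anywhere in the system output?
    Success: additionally, did every requestable slot placeholder appear?"""
    text = ' '.join(system_turns)
    informed = any(entity in text for entity in goal_entities)
    success = informed and all(f'[{slot}]' in text for slot in requestables)
    return informed, success

turns = ['the [restaurant_name] is a nice place',
         'its phone is [restaurant_phone]']
informed, success = inform_success(turns, {'[restaurant_name]'}, {'restaurant_phone'})
```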

tatk.policy.mdrg.multiwoz.evaluator.evaluateModel(dialogues, val_dials, mode='valid')

Gathers statistics for the whole set.

tatk.policy.mdrg.multiwoz.evaluator.evaluateRealDialogue(dialog, filename)

Evaluation of the real dialogue. First we load the user goal, then go through the dialogue history. Similar to evaluateGeneratedDialogue above.

tatk.policy.mdrg.multiwoz.evaluator.parseGoal(goal, d, domain)

Parses user goal into dictionary format.

tatk.policy.mdrg.multiwoz.mdrg_model module

class tatk.policy.mdrg.multiwoz.mdrg_model.Attn(method, hidden_size)

Bases: torch.nn.modules.module.Module

__init__(method, hidden_size)

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(hidden, encoder_outputs)
Parameters
  • hidden – previous hidden state of the decoder, in shape (layers*directions,B,H)

  • encoder_outputs – encoder outputs from Encoder, in shape (T,B,H)

Returns

attention energies in shape (B,T)
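As a shape check, a dot-product score (one possible `method`) maps these inputs to (B,T) energies; the numpy sketch below only illustrates the tensor contraction, not the module's actual implementation:

```python
import numpy as np

T, B, H = 5, 2, 4  # time steps, batch size, hidden size (toy values)
rng = np.random.default_rng(0)

hidden = rng.random((1, B, H))           # (layers*directions, B, H)
encoder_outputs = rng.random((T, B, H))  # (T, B, H)

# dot-product score: contract the hidden dim, leaving (B, T) energies
energies = np.einsum('bh,tbh->bt', hidden[-1], encoder_outputs)
```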

score(hidden, encoder_outputs)
class tatk.policy.mdrg.multiwoz.mdrg_model.BeamSearchNode(h, prevNode, wordid, logp, leng)

Bases: object

__init__(h, prevNode, wordid, logp, leng)

Initialize self. See help(type(self)) for accurate signature.

eval(repeatPenalty, tokenReward, scoreTable, alpha=1.0)
class tatk.policy.mdrg.multiwoz.mdrg_model.DecoderRNN(embedding_size, hidden_size, output_size, cell_type, dropout=0.1)

Bases: torch.nn.modules.module.Module

__init__(embedding_size, hidden_size, output_size, cell_type, dropout=0.1)

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(input, hidden, not_used)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class tatk.policy.mdrg.multiwoz.mdrg_model.EncoderRNN(input_size, embedding_size, hidden_size, cell_type, depth, dropout)

Bases: torch.nn.modules.module.Module

__init__(input_size, embedding_size, hidden_size, cell_type, depth, dropout)

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(input_seqs, input_lens, hidden=None)

Forward procedure. Inputs do not need to be sorted.

Parameters
  • input_seqs – Variable of shape (T,B)

  • input_lens – numpy array of lengths for each input sequence

  • hidden – initial hidden state (optional)
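Handling unsorted batches typically means sorting by descending length for packing and remembering the inverse permutation to restore the original order; a sketch of that trick (the helper name is an assumption):

```python
import numpy as np

def sort_and_restore(input_lens):
    """Sort indices by descending length and keep the inverse permutation
    so batch order can be restored after the RNN pass."""
    sort_idx = np.argsort(input_lens)[::-1]  # descending lengths
    unsort_idx = np.argsort(sort_idx)        # inverse permutation
    return sort_idx, unsort_idx

lens = np.array([3, 7, 5])
sort_idx, unsort_idx = sort_and_restore(lens)
restored = lens[sort_idx][unsort_idx]  # back to the original order
```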

class tatk.policy.mdrg.multiwoz.mdrg_model.Model(args, input_lang_index2word, output_lang_index2word, input_lang_word2index, output_lang_word2index)

Bases: torch.nn.modules.module.Module

__init__(args, input_lang_index2word, output_lang_index2word, input_lang_word2index, output_lang_word2index)

Initializes internal Module state, shared by both nn.Module and ScriptModule.

build_model()
clipGradients()
cuda_(var)
decode(target_tensor, decoder_hidden, encoder_outputs)
forward(input_tensor, input_lengths, target_tensor, target_lengths, db_tensor, bs_tensor)

Given the user sentence, user belief state and database pointer, encode the sentence, decide which policy vector to construct, and feed it as the first hidden state to the decoder.
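The "policy vector as first decoder hidden state" idea can be sketched in numpy as a concatenation of the encoder's final state with the db and bs vectors, pushed through a nonlinearity (the sizes, weight matrix, and tanh are toy assumptions, not the model's actual layers):

```python
import numpy as np

H, DB, BS = 4, 6, 5  # hidden, db-pointer, belief-state sizes (toy)
rng = np.random.default_rng(0)

encoder_final = rng.random((1, H))  # last encoder hidden state
db_tensor = rng.random((1, DB))     # database pointer
bs_tensor = rng.random((1, BS))     # belief-state summary

# combine all three and produce the decoder's first hidden state
W = rng.random((H + DB + BS, H))
policy_input = np.concatenate([encoder_final, db_tensor, bs_tensor], axis=1)
decoder_h0 = np.tanh(policy_input @ W)
```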

getCount()
greedy_decode(decoder_hidden, encoder_outputs, target_tensor)
input_index2word(index)
input_word2index(index)
loadModel(iter=0)
output_index2word(index)
output_word2index(index)
predict(input_tensor, input_lengths, target_tensor, target_lengths, db_tensor, bs_tensor)
printGrad()
saveModel(iter)
setOptimizers()
train(input_tensor, input_lengths, target_tensor, target_lengths, db_tensor, bs_tensor, dial_name=None)

Sets the module in training mode.

This has an effect only on certain modules. See the documentation of particular modules for details of their behavior in training/evaluation mode, if they are affected, e.g. Dropout, BatchNorm, etc.

Args:
mode (bool): whether to set training mode (True) or evaluation mode (False). Default: True.

Returns:
Module: self

class tatk.policy.mdrg.multiwoz.mdrg_model.SeqAttnDecoderRNN(embedding_size, hidden_size, output_size, cell_type, dropout_p=0.1, max_length=30)

Bases: torch.nn.modules.module.Module

__init__(embedding_size, hidden_size, output_size, cell_type, dropout_p=0.1, max_length=30)

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(input, hidden, encoder_outputs)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

tatk.policy.mdrg.multiwoz.mdrg_model.init_gru(gru, gain=1)
tatk.policy.mdrg.multiwoz.mdrg_model.init_lstm(cell, gain=1)
tatk.policy.mdrg.multiwoz.mdrg_model.whatCellType(input_size, hidden_size, cell_type, dropout_rate)

tatk.policy.mdrg.multiwoz.model module

class tatk.policy.mdrg.multiwoz.model.Attn(method, hidden_size)

Bases: torch.nn.modules.module.Module

__init__(method, hidden_size)

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(hidden, encoder_outputs)
Parameters
  • hidden – previous hidden state of the decoder, in shape (layers*directions,B,H)

  • encoder_outputs – encoder outputs from Encoder, in shape (T,B,H)

Returns

attention energies in shape (B,T)

score(hidden, encoder_outputs)
class tatk.policy.mdrg.multiwoz.model.BeamSearchNode(h, prevNode, wordid, logp, leng)

Bases: object

__init__(h, prevNode, wordid, logp, leng)

Initialize self. See help(type(self)) for accurate signature.

eval(repeatPenalty, tokenReward, scoreTable, alpha=1.0)
class tatk.policy.mdrg.multiwoz.model.DecoderRNN(embedding_size, hidden_size, output_size, cell_type, dropout=0.1)

Bases: torch.nn.modules.module.Module

__init__(embedding_size, hidden_size, output_size, cell_type, dropout=0.1)

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(input, hidden, not_used)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class tatk.policy.mdrg.multiwoz.model.EncoderRNN(input_size, embedding_size, hidden_size, cell_type, depth, dropout)

Bases: torch.nn.modules.module.Module

__init__(input_size, embedding_size, hidden_size, cell_type, depth, dropout)

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(input_seqs, input_lens, hidden=None)

Forward procedure. Inputs do not need to be sorted.

Parameters
  • input_seqs – Variable of shape (T,B)

  • input_lens – numpy array of lengths for each input sequence

  • hidden – initial hidden state (optional)

class tatk.policy.mdrg.multiwoz.model.Model(args, input_lang_index2word, output_lang_index2word, input_lang_word2index, output_lang_word2index)

Bases: torch.nn.modules.module.Module

__init__(args, input_lang_index2word, output_lang_index2word, input_lang_word2index, output_lang_word2index)

Initializes internal Module state, shared by both nn.Module and ScriptModule.

build_model()
clipGradients()
cuda_(var)
decode(target_tensor, decoder_hidden, encoder_outputs)
forward(input_tensor, input_lengths, target_tensor, target_lengths, db_tensor, bs_tensor)

Given the user sentence, user belief state and database pointer, encode the sentence, decide which policy vector to construct, and feed it as the first hidden state to the decoder.

getCount()
greedy_decode(decoder_hidden, encoder_outputs, target_tensor)
input_index2word(index)
input_word2index(index)
loadModel(iter=0)
output_index2word(index)
output_word2index(index)
predict(input_tensor, input_lengths, target_tensor, target_lengths, db_tensor, bs_tensor)
printGrad()
saveModel(iter)
setOptimizers()
train(input_tensor, input_lengths, target_tensor, target_lengths, db_tensor, bs_tensor, dial_name=None)

Sets the module in training mode.

This has an effect only on certain modules. See the documentation of particular modules for details of their behavior in training/evaluation mode, if they are affected, e.g. Dropout, BatchNorm, etc.

Args:
mode (bool): whether to set training mode (True) or evaluation mode (False). Default: True.

Returns:
Module: self

class tatk.policy.mdrg.multiwoz.model.SeqAttnDecoderRNN(embedding_size, hidden_size, output_size, cell_type, dropout_p=0.1, max_length=30)

Bases: torch.nn.modules.module.Module

__init__(embedding_size, hidden_size, output_size, cell_type, dropout_p=0.1, max_length=30)

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(input, hidden, encoder_outputs)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

tatk.policy.mdrg.multiwoz.model.init_gru(gru, gain=1)
tatk.policy.mdrg.multiwoz.model.init_lstm(cell, gain=1)
tatk.policy.mdrg.multiwoz.model.whatCellType(input_size, hidden_size, cell_type, dropout_rate)

tatk.policy.mdrg.multiwoz.policy module

class tatk.policy.mdrg.multiwoz.policy.MDRGWordPolicy(num=1)

Bases: tatk.policy.policy.Policy

__init__(num=1)

Initialize self. See help(type(self)) for accurate signature.

init_session()

Init the class variables for a new session.

predict(state)

Predict the next agent action given the dialog state, and update state['system_action'] with the predicted system action.

Args:
state (tuple or dict): when the DST and Policy modules are separated, state is a tuple; when they are aggregated together, state is a dict (dialog act).

Returns:
action (list of list): the next dialog action.

tatk.policy.mdrg.multiwoz.policy.addBookingPointer(task, turn, pointer_vector)

Add information about the availability of the booking option.

tatk.policy.mdrg.multiwoz.policy.addDBPointer(state)

Create database pointer for all related domains:

    domains = ['restaurant', 'hotel', 'attraction', 'train']
    pointer_vector = np.zeros(6 * len(domains))
    for domain in domains:
        num_entities = dbPointer.queryResult(domain, turn)
        pointer_vector = dbPointer.oneHotVector(num_entities, domain, pointer_vector)
    return pointer_vector
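Each domain thus gets a 6-slot one-hot block encoding its match count. A self-contained sketch of such a `oneHotVector`-style encoder follows; the bucket boundaries (0, 1, 2, 3, 4, >=5) are an assumption for illustration, not necessarily the library's exact thresholds:

```python
import numpy as np

DOMAINS = ['restaurant', 'hotel', 'attraction', 'train']

def one_hot_vector(num_entities, domain, pointer_vector):
    """Encode the match count for a domain as a 6-way one-hot block.
    Buckets 0..4 are exact counts; the last bucket is assumed to mean >=5."""
    idx = DOMAINS.index(domain) * 6
    bucket = min(num_entities, 5)
    pointer_vector[idx + bucket] = 1.0
    return pointer_vector

vec = np.zeros(6 * len(DOMAINS))
vec = one_hot_vector(12, 'hotel', vec)  # hotel block, '>=5' bucket
```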

tatk.policy.mdrg.multiwoz.policy.createDelexData(dialogue)

Main function of the script: loads the delexical dictionary, goes through each dialogue, and 1) normalizes the data, 2) delexicalizes it, 3) adds the database pointer, and 4) saves the delexicalized data.

tatk.policy.mdrg.multiwoz.policy.decode(data, model, device)
tatk.policy.mdrg.multiwoz.policy.decodeWrapper(args)
tatk.policy.mdrg.multiwoz.policy.get_active_domain(prev_active_domain, prev_state, state)
tatk.policy.mdrg.multiwoz.policy.loadModel(num, args)
tatk.policy.mdrg.multiwoz.policy.loadModelAndData(num, args)
tatk.policy.mdrg.multiwoz.policy.load_config(args)
tatk.policy.mdrg.multiwoz.policy.main()
tatk.policy.mdrg.multiwoz.policy.populate_template(template, top_results, num_results, state)

tatk.policy.mdrg.multiwoz.test module

tatk.policy.mdrg.multiwoz.train module