EduNLP.ModelZoo¶

base_model¶

class EduNLP.ModelZoo.base_model.BaseModel[source]¶

base_model_prefix = ''¶

forward(*input)[source]¶

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for the forward pass must be defined within this function, call the Module instance rather than this method directly: calling the instance runs the registered hooks, while calling forward() silently ignores them.
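The note above can be checked with a small, self-contained sketch. The module `Doubler` and the list `calls` are hypothetical names used only for illustration and are not part of EduNLP. A forward hook registered on the module fires when the instance is called, but not when forward() is invoked directly.

```python
import torch
from torch import nn


class Doubler(nn.Module):
    """Toy module (hypothetical) whose forward pass doubles its input."""

    def forward(self, x):
        return x * 2


calls = []  # filled by the hook each time it fires
model = Doubler()
# The hook returns None, so the module output is left unchanged.
handle = model.register_forward_hook(
    lambda module, inputs, output: calls.append(output)
)

x = torch.ones(2)
via_call = model(x)             # goes through __call__, so the hook runs
via_forward = model.forward(x)  # bypasses __call__, so the hook is skipped
handle.remove()
```

Both calls return the same tensor; only the first one is recorded in `calls`.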

save_pretrained(output_dir)[source]¶

classmethod from_pretrained(pretrained_model_path, *args, **kwargs)[source]¶

save_config(config_dir)[source]¶

classmethod from_config(config_path, *args, **kwargs)[source]¶

training: bool¶

rnn¶

class EduNLP.ModelZoo.rnn.ElmoLM(vocab_size: int, embedding_dim: int, hidden_size: int, num_layers: int = 2, dropout_rate: float = 0.5, use_pack_pad=False, **kwargs)[source]¶

base_model_prefix = 'elmo'¶

forward(seq_idx=None, seq_len=None) → ModelOutput[source]¶

Parameters

seq_idx (Tensor, of shape (batch_size, sequence_length)) – token indices of the padded input sequences
seq_len (Tensor, of shape (batch_size)) – valid (unpadded) length of each sequence

Returns

pred_forward: of shape (batch_size, sequence_length)
pred_backward: of shape (batch_size, sequence_length)
forward_output: of shape (batch_size, sequence_length, hidden_size)
backward_output: of shape (batch_size, sequence_length, hidden_size)

Return type

ElmoLMOutput

classmethod from_config(config_path, **kwargs)[source]¶

training: bool¶

class EduNLP.ModelZoo.rnn.ElmoLMForKnowledgePrediction(vocab_size: int, embedding_dim: int, hidden_size: int, num_classes_list: List[int], num_total_classes: int, dropout_rate: float = 0.5, batch_first=True, head_dropout: Optional[float] = 0.5, flat_cls_weight: Optional[float] = 0.5, attention_unit_size: Optional[int] = 256, fc_hidden_size: Optional[int] = 512, beta: Optional[float] = 0.5, **kwargs)[source]¶

base_model_prefix = 'elmo'¶

training: bool¶

forward(seq_idx=None, seq_len=None, labels=None) → ModelOutput[source]¶

classmethod from_config(config_path, **kwargs)[source]¶

class EduNLP.ModelZoo.rnn.ElmoLMForPreTraining(vocab_size: int, embedding_dim: int, hidden_size: int, dropout_rate: float = 0.5, batch_first=True, use_pack_pad=False, **kwargs)[source]¶

base_model_prefix = 'elmo'¶

forward(seq_idx=None, seq_len=None) → ModelOutput[source]¶

Parameters

seq_idx (Tensor, of shape (batch_size, sequence_length)) – token indices of the padded input sequences
seq_len (Tensor, of shape (batch_size)) – valid (unpadded) length of each sequence
pred_mask (Tensor, of shape (batch_size, sequence_length)) –
idx_mask (Tensor, of shape (batch_size, sequence_length)) –

Returns

loss
pred_forward: of shape (batch_size, sequence_length)
pred_backward: of shape (batch_size, sequence_length)
forward_output: of shape (batch_size, sequence_length, hidden_size)
backward_output: of shape (batch_size, sequence_length, hidden_size)

Return type

ElmoLMForPreTrainingOutput

classmethod from_config(config_path, **kwargs)[source]¶

training: bool¶

class EduNLP.ModelZoo.rnn.ElmoLMForPropertyPrediction(vocab_size: int, embedding_dim: int, hidden_size: int, dropout_rate: float = 0.5, batch_first=True, head_dropout=0.5, **kwargs)[source]¶

base_model_prefix = 'elmo'¶

forward(seq_idx=None, seq_len=None, labels=None) → ModelOutput[source]¶

classmethod from_config(config_path, **kwargs)[source]¶

training: bool¶

class EduNLP.ModelZoo.rnn.LM(rnn_type: str, vocab_size: int, embedding_dim: int, hidden_size: int, num_layers=1, bidirectional=False, embedding=None, model_params=None, use_pack_pad=True, **kwargs)[source]¶

Parameters

rnn_type (str) – Supported types: RNN, LSTM, GRU, BiLSTM
vocab_size (int) –
embedding_dim (int) –
hidden_size (int) –
num_layers –
bidirectional –
embedding –
model_params –
kwargs –

Examples

>>> import torch
>>> seq_idx = torch.LongTensor([[1, 2, 3], [1, 2, 0], [3, 0, 0]])
>>> seq_len = torch.LongTensor([3, 2, 1])
>>> lm = LM("RNN", 4, 3, 2)
>>> output, hn = lm(seq_idx, seq_len)
>>> output.shape
torch.Size([3, 3, 2])
>>> hn.shape
torch.Size([1, 3, 2])
>>> lm = LM("RNN", 4, 3, 2, num_layers=2)
>>> output, hn = lm(seq_idx, seq_len)
>>> output.shape
torch.Size([3, 3, 2])
>>> hn.shape
torch.Size([2, 3, 2])

forward(seq_idx, seq_len)[source]¶

Parameters

seq_idx (Tensor) – token indices of the padded input sequences
seq_len (Tensor) – valid (unpadded) length of each sequence

Returns

a PackedSequence object

Return type

sequence
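As a rough illustration of what a packed-sequence forward pass involves, the sketch below runs an embedding layer and a single-layer nn.RNN over the same padded batch as in the example, using PyTorch's pack_padded_sequence and pad_packed_sequence. It assumes batch-first inputs with lengths sorted in descending order. The layer names are illustrative, and this is not necessarily how LM is implemented internally.

```python
import torch
from torch import nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

# Same toy batch as in the LM example; index 0 acts as padding.
seq_idx = torch.LongTensor([[1, 2, 3], [1, 2, 0], [3, 0, 0]])
seq_len = torch.LongTensor([3, 2, 1])  # valid lengths, sorted descending

# Illustrative layers (assumed sizes match LM("RNN", 4, 3, 2)).
embedding = nn.Embedding(num_embeddings=4, embedding_dim=3)
rnn = nn.RNN(input_size=3, hidden_size=2, num_layers=1, batch_first=True)

# Pack the embedded batch so the RNN skips the padded positions.
packed = pack_padded_sequence(embedding(seq_idx), seq_len, batch_first=True)
packed_output, hn = rnn(packed)

# Unpack to a padded tensor of shape (batch_size, max_len, hidden_size).
output, output_len = pad_packed_sequence(packed_output, batch_first=True)
```

Under these assumptions `output` and `hn` have the same shapes as in the first example above, and the padded time steps of `output` are filled with zeros.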

training: bool¶

disenqnet¶

class EduNLP.ModelZoo.disenqnet.DisenQNet(vocab_size: int, hidden_size: int, dropout_rate: float, wv=None, **kwargs)[source]¶

base_model_prefix = 'disenq'¶

DisenQNet question representation model

Parameters

vocab_size (int) – size of vocabulary
hidden_size (int) – size of word and question embedding
dropout_rate (float) – dropout rate
wv (torch.Tensor) – Tensor of (vocab_size, hidden_size) or None, initial word embedding, default = None

forward(seq_idx=None, seq_len=None, get_vk=True, get_vi=True) → ModelOutput[source]¶

Parameters

seq_idx (Tensor of (batch_size, seq_len)) – word index
seq_len (Tensor of (batch_size)) – valid length of each sequence in the batch
get_vk (bool) – whether to return vk
get_vi (bool) – whether to return vi

Returns

embed: Tensor of (batch_size, seq_len, hidden_size), word embedding
k_hidden: Tensor of (batch_size, hidden_size) or None, concept representation of question
i_hidden: Tensor of (batch_size, hidden_size) or None, individual representation of question

Return type

DisenQNetOutput

classmethod from_config(config_path, **kwargs)[source]¶

training: bool¶

class EduNLP.ModelZoo.disenqnet.DisenQNetForPreTraining(vocab_size, concept_size, hidden_size, dropout_rate, pos_weight, w_cp, w_mi, w_dis, warmup, n_adversarial, wv=None, **kwargs)[source]¶

base_model_prefix = 'disenq'¶

training: bool¶

forward(seq_idx=None, seq_len=None, concept=None) → ModelOutput[source]¶

classmethod from_config(config_path, **kwargs)[source]¶

quesnet¶

class EduNLP.ModelZoo.quesnet.QuesNet(_stoi=None, meta='know_name', pretrained_embs: Optional[ndarray] = None, pretrained_image: Optional[Module] = None, pretrained_meta: Optional[Module] = None, lambda_input=None, feat_size=256, emb_size=256, rnn_type='LSTM', layers=4, **kwargs)[source]¶

base_model_prefix = 'quesnet'¶

init_h(batch_size)[source]¶

load_emb(emb)[source]¶

load_img(img_layer: Module)[source]¶

load_meta(meta_layer: Module)[source]¶

make_batch(data, device, pretrain=False)[source]¶

Returns embeddings.

forward(inputs: SeqBatch)[source]¶

classmethod from_config(config_path, **kwargs)[source]¶

training: bool¶

class EduNLP.ModelZoo.quesnet.QuesNetForPreTraining(_stoi=None, pretrained_embs: Optional[ndarray] = None, pretrained_image: Optional[Module] = None, pretrained_meta: Optional[Module] = None, meta='know_name', emb_size=256, feat_size=512, rnn_type='LSTM', lambda_input=None, lambda_loss=None, layers=4, **kwargs)[source]¶

base_model_prefix = 'quesnet'¶

Sequence-to-sequence feature extractor based on RNN. Supports different input forms and different RNN types (LSTM/GRU).

training: bool¶

forward(batch)[source]¶

classmethod from_config(config_path, **kwargs)[source]¶

class EduNLP.ModelZoo.quesnet.AE[source]¶

factor = 1¶

enc(item, *args, **kwargs)[source]¶

dec(item, *args, **kwargs)[source]¶

loss(item, emb=None)[source]¶

forward(item)[source]¶

training: bool¶

class EduNLP.ModelZoo.quesnet.ImageAE(emb_size)[source]¶

encoder(item, detach_tensor=False)[source]¶

decoder(emb, detach_tensor=False)[source]¶

training: bool¶

class EduNLP.ModelZoo.quesnet.MetaAE(meta_size, emb_size)[source]¶

training: bool¶

utils¶

class EduNLP.ModelZoo.utils.PadSequence(length, pad_val=0, clip=True)[source]¶

Pad the sequence.

Pad the sequence to the given length by appending pad_val. If clip is set, a sequence longer than length is clipped to length.

Parameters

length (int) – The maximum length to pad/clip the sequence
pad_val (number) – The pad value. Default 0
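clip (bool) – Whether to clip sequences longer than length. Default True

Returns

ret – the padded (and, if clip is set, clipped) sequence

Return type

list of number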

EduNLP.ModelZoo.utils.pad_sequence(sequence: list, max_length=None, pad_val=0, clip=True)[source]¶

Parameters

sequence –
max_length –
pad_val –
clip –

Returns

Modified list – the sequences padded to the same length.

Return type

list

Examples

>>> seq = [[4, 3, 3], [2], [3, 3, 2]]
>>> pad_sequence(seq)
[[4, 3, 3], [2, 0, 0], [3, 3, 2]]
>>> pad_sequence(seq, pad_val=1)
[[4, 3, 3], [2, 1, 1], [3, 3, 2]]
>>> pad_sequence(seq, max_length=2)
[[4, 3], [2, 0], [3, 3]]
>>> pad_sequence(seq, max_length=2, clip=False)
[[4, 3, 3], [2, 0], [3, 3, 2]]
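The behaviour shown in these examples can be reproduced with a few lines of plain Python. The function below, `pad_sequence_sketch`, is a hypothetical re-implementation written only to make the padding and clipping rules explicit; it is not the library's source code.

```python
def pad_sequence_sketch(sequence, max_length=None, pad_val=0, clip=True):
    """Pad every inner list to a common length (illustrative sketch)."""
    if max_length is None:
        # Default to the longest inner list, so nothing gets clipped.
        max_length = max(len(seq) for seq in sequence)
    padded = []
    for seq in sequence:
        if len(seq) < max_length:
            # Too short: append pad_val until max_length is reached.
            padded.append(list(seq) + [pad_val] * (max_length - len(seq)))
        elif clip:
            # Too long (or exact) and clipping enabled: keep the first items.
            padded.append(list(seq[:max_length]))
        else:
            # Too long and clipping disabled: leave the sequence as is.
            padded.append(list(seq))
    return padded


seq = [[4, 3, 3], [2], [3, 3, 2]]
default_padded = pad_sequence_sketch(seq)
custom_pad = pad_sequence_sketch(seq, pad_val=1)
clipped = pad_sequence_sketch(seq, max_length=2)
unclipped = pad_sequence_sketch(seq, max_length=2, clip=False)
```

Note that with clip=False the result is not rectangular when some sequences exceed max_length, so it cannot be turned into a tensor directly.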

EduNLP.ModelZoo.utils.set_device(_net, ctx, *args, **kwargs)[source]¶

Code from longling v1.3.26.

class EduNLP.ModelZoo.utils.Masker(mask: (int, str, ...) = 0, per=0.2, seed=None)[source]¶

Parameters

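mask (int, str) – the value that replaces masked elements. Default 0
per (float) – the proportion of each sequence's valid positions to mask. Default 0.2
seed – random seed, for reproducible masking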

Examples

>>> masker = Masker(per=0.5, seed=10)
>>> items = [[1, 1, 3, 4, 6], [2], [5, 9, 1, 4]]
>>> masked_seq, mask_label = masker(items)
>>> masked_seq
[[1, 1, 0, 0, 6], [2], [0, 9, 0, 4]]
>>> mask_label
[[0, 0, 1, 1, 0], [0], [1, 0, 1, 0]]
>>> items = [[1, 2, 3], [1, 1, 0], [2, 0, 0]]
>>> masked_seq, mask_label = masker(items, [3, 2, 1])
>>> masked_seq
[[1, 0, 3], [0, 1, 0], [2, 0, 0]]
>>> mask_label
[[0, 1, 0], [1, 0, 0], [0, 0, 0]]
>>> masker = Masker(mask="[MASK]", per=0.5, seed=10)
>>> items = [["a", "b", "c"], ["d", "[PAD]", "[PAD]"], ["hello", "world", "[PAD]"]]
>>> masked_seq, mask_label = masker(items, length=[3, 1, 2])
>>> masked_seq
[['a', '[MASK]', 'c'], ['d', '[PAD]', '[PAD]'], ['hello', '[MASK]', '[PAD]']]
>>> mask_label
[[0, 1, 0], [0, 0, 0], [0, 1, 0]]

Returns

list of masked_seq and list of mask_label

Return type

list
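The masking scheme can be sketched in plain Python. The function `mask_items` below is hypothetical and only approximates Masker: it assumes that int(per * valid_length) positions are masked per sequence (which matches the counts in the examples above) and that positions beyond the given length are padding and are never masked. Because it uses Python's random module, the positions it picks for a given seed will generally differ from those shown in the examples.

```python
import random


def mask_items(items, mask=0, per=0.2, seed=None, length=None):
    """Mask int(per * valid_length) random positions in each sequence."""
    rng = random.Random(seed)
    if length is None:
        # Without explicit lengths, every position is considered valid.
        length = [len(item) for item in items]
    masked_seq, mask_label = [], []
    for item, valid_len in zip(items, length):
        # Only positions before valid_len are candidates for masking.
        chosen = set(rng.sample(range(valid_len), int(per * valid_len)))
        masked_seq.append(
            [mask if i in chosen else tok for i, tok in enumerate(item)]
        )
        mask_label.append([1 if i in chosen else 0 for i in range(len(item))])
    return masked_seq, mask_label


items = [["a", "b", "c"], ["d", "[PAD]", "[PAD]"], ["hello", "world", "[PAD]"]]
masked_seq, mask_label = mask_items(
    items, mask="[MASK]", per=0.5, seed=10, length=[3, 1, 2]
)
```

The label list marks exactly the positions that were replaced, so it can serve as the prediction target in a masked-language-model style objective.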

EduNLP.ModelZoo.utils.load_items(data_path)[source]¶

class EduNLP.ModelZoo.utils.MLP(in_dim, n_classes, hidden_dim, dropout, n_layers=2, act=<function leaky_relu>)[source]¶

forward(input)[source]¶

training: bool¶

class EduNLP.ModelZoo.utils.TextCNN(embed_dim, hidden_dim)[source]¶

forward(embed)[source]¶

training: bool¶

EduNLP.ModelZoo.utils.gather_nd(params, indices)[source]¶

Gathers slices from params at the positions given by indices.

Parameters

params (Tensor) – the tensor to gather values from
indices (Tensor) – the index tensor

Returns

Tensor – the gathered slices

Examples

>>> gather_nd(
...     params=torch.tensor([[1, 2, 3],
...                          [4, 5, 6]]),
...     indices=torch.tensor([[1],
...                           [0]]))
tensor([[4, 5, 6],
        [1, 2, 3]])
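One way to obtain this behaviour in PyTorch is advanced indexing with the coordinate columns of indices. The sketch below assumes tf.gather_nd-like semantics, which is an inference from the function name and the example rather than something the documentation states; `gather_nd_sketch` is a hypothetical name, not the library function.

```python
import torch


def gather_nd_sketch(params, indices):
    """Index params with the coordinate tuples in the last axis of indices."""
    # indices[..., k] holds the coordinates along dimension k of params.
    coords = tuple(indices.unbind(dim=-1))
    return params[coords]


params = torch.tensor([[1, 2, 3],
                       [4, 5, 6]])
# One coordinate per entry selects whole rows.
rows = gather_nd_sketch(params, torch.tensor([[1], [0]]))
# Two coordinates per entry select single elements.
elements = gather_nd_sketch(params, torch.tensor([[0, 1], [1, 2]]))
```

With one coordinate per entry the result contains whole rows of params; with as many coordinates as params has dimensions it contains individual elements.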

EduNLP.ModelZoo.utils.sequence_mask(lengths, max_len=None)[source]¶

Same as tf.sequence_mask. Returns a mask tensor representing the first N positions of each cell.

Parameters

lengths (Tensor) – integer tensor, all its values <= max_len.
max_len (int, optional) – scalar integer, size of the last dimension of the returned tensor. Default is the maximum value in lengths.

Returns

Tensor – a mask tensor of shape lengths.shape + (max_len,)

Examples

>>> sequence_mask(torch.tensor([1, 3, 2]), 5)
tensor([[ True, False, False, False, False],
        [ True,  True,  True, False, False],
        [ True,  True, False, False, False]])
>>> sequence_mask(torch.tensor([[1, 3],[2,0]]))
tensor([[[ True, False, False],
         [ True,  True,  True]],
<BLANKLINE>
        [[ True,  True, False],
         [False, False, False]]])
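The mask can be built with a single broadcast comparison. The function `sequence_mask_sketch` below is a hypothetical minimal version written for illustration, assuming lengths is an integer tensor; it is not necessarily how the library implements sequence_mask.

```python
import torch


def sequence_mask_sketch(lengths, max_len=None):
    """Return a boolean mask of shape lengths.shape + (max_len,)."""
    if max_len is None:
        # Default to the largest length present in the input.
        max_len = int(lengths.max())
    positions = torch.arange(max_len, device=lengths.device)
    # Broadcasting compares every position index with every length.
    return positions < lengths.unsqueeze(-1)


mask_1d = sequence_mask_sketch(torch.tensor([1, 3, 2]), 5)
mask_2d = sequence_mask_sketch(torch.tensor([[1, 3], [2, 0]]))
```

Such a mask is typically used to exclude padded positions, for example when averaging token embeddings over the valid part of each sequence.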