EduNLP.ModelZoo¶
base_model¶
- class EduNLP.ModelZoo.base_model.BaseModel[source]¶
- base_model_prefix = ''¶
- forward(*input)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
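The note above can be illustrated with a minimal pure-Python sketch. `TinyModule` and `Doubler` are hypothetical toy classes that only mimic the forward-hook behaviour of a PyTorch Module; they are not part of EduNLP or PyTorch, and real hooks can do more (for example, replace the output).

```python
class TinyModule:
    """Toy stand-in for a PyTorch Module; it only mimics forward hooks."""

    def __init__(self):
        self._forward_hooks = []

    def register_forward_hook(self, hook):
        # Store a callable that is run after every call to the instance.
        self._forward_hooks.append(hook)

    def forward(self, x):
        raise NotImplementedError("subclasses override forward")

    def __call__(self, *args):
        # Calling the instance runs forward and then every registered hook.
        output = self.forward(*args)
        for hook in self._forward_hooks:
            hook(self, args, output)
        return output


class Doubler(TinyModule):
    def forward(self, x):
        return 2 * x


calls = []
model = Doubler()
model.register_forward_hook(lambda module, inputs, output: calls.append(output))

direct = model.forward(3)  # bypasses __call__, so the hook is skipped
via_call = model(3)        # goes through __call__, so the hook runs
```

Both calls return the same value, but only the call through the instance is recorded by the hook.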
rnn¶
- class EduNLP.ModelZoo.rnn.ElmoLM(vocab_size: int, embedding_dim: int, hidden_size: int, num_layers: int = 2, dropout_rate: float = 0.5, use_pack_pad=False, **kwargs)[source]¶
- base_model_prefix = 'elmo'¶
- forward(seq_idx=None, seq_len=None) ModelOutput[source]¶
- Parameters:
seq_idx (Tensor, of shape (batch_size, sequence_length)) – a list of indices
seq_len (Tensor, of shape (batch_size)) – length
- Returns:
pred_forward: of shape (batch_size, sequence_length)
pred_backward: of shape (batch_size, sequence_length)
forward_output: of shape (batch_size, sequence_length, hidden_size)
backward_output: of shape (batch_size, sequence_length, hidden_size)
- Return type:
ElmoLMOutput
- training: bool¶
- class EduNLP.ModelZoo.rnn.ElmoLMForKnowledgePrediction(vocab_size: int, embedding_dim: int, hidden_size: int, num_classes_list: List[int], num_total_classes: int, dropout_rate: float = 0.5, batch_first=True, head_dropout: Optional[float] = 0.5, flat_cls_weight: Optional[float] = 0.5, attention_unit_size: Optional[int] = 256, fc_hidden_size: Optional[int] = 512, beta: Optional[float] = 0.5, **kwargs)[source]¶
- base_model_prefix = 'elmo'¶
- training: bool¶
- forward(seq_idx=None, seq_len=None, labels=None) ModelOutput[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- class EduNLP.ModelZoo.rnn.ElmoLMForPreTraining(vocab_size: int, embedding_dim: int, hidden_size: int, dropout_rate: float = 0.5, batch_first=True, use_pack_pad=False, **kwargs)[source]¶
- base_model_prefix = 'elmo'¶
- forward(seq_idx=None, seq_len=None) ModelOutput[source]¶
- Parameters:
seq_idx (Tensor, of shape (batch_size, sequence_length)) – a list of indices
seq_len (Tensor, of shape (batch_size)) – length
pred_mask (Tensor, of shape(batch_size, sequence_length)) –
idx_mask (Tensor, of shape (batch_size, sequence_length)) –
- Returns:
loss
pred_forward: of shape (batch_size, sequence_length)
pred_backward: of shape (batch_size, sequence_length)
forward_output: of shape (batch_size, sequence_length, hidden_size)
backward_output: of shape (batch_size, sequence_length, hidden_size)
- Return type:
ElmoLMForPreTrainingOutput
- training: bool¶
- class EduNLP.ModelZoo.rnn.ElmoLMForPropertyPrediction(vocab_size: int, embedding_dim: int, hidden_size: int, dropout_rate: float = 0.5, batch_first=True, head_dropout=0.5, **kwargs)[source]¶
- base_model_prefix = 'elmo'¶
- forward(seq_idx=None, seq_len=None, labels=None) ModelOutput[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class EduNLP.ModelZoo.rnn.HAM(num_classes_list: List[int], num_total_classes: int, sequence_model_hidden_size: int, attention_unit_size: Optional[int] = 256, fc_hidden_size: Optional[int] = 512, beta: Optional[float] = 0.5, dropout_rate=None)[source]¶
- forward(sequential_embeddings)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class EduNLP.ModelZoo.rnn.LM(rnn_type: str, vocab_size: int, embedding_dim: int, hidden_size: int, num_layers=1, bidirectional=False, embedding=None, model_params=None, use_pack_pad=True, **kwargs)[source]¶
- Parameters:
rnn_type (str) – supported types include RNN, LSTM, GRU, and BiLSTM
vocab_size (int) –
embedding_dim (int) –
hidden_size (int) –
num_layers –
bidirectional –
embedding –
model_params –
kwargs –
Examples
>>> import torch
>>> seq_idx = torch.LongTensor([[1, 2, 3], [1, 2, 0], [3, 0, 0]])
>>> seq_len = torch.LongTensor([3, 2, 1])
>>> lm = LM("RNN", 4, 3, 2)
>>> output, hn = lm(seq_idx, seq_len)
>>> output.shape
torch.Size([3, 3, 2])
>>> hn.shape
torch.Size([1, 3, 2])
>>> lm = LM("RNN", 4, 3, 2, num_layers=2)
>>> output, hn = lm(seq_idx, seq_len)
>>> output.shape
torch.Size([3, 3, 2])
>>> hn.shape
torch.Size([2, 3, 2])
- forward(seq_idx, seq_len)[source]¶
- Parameters:
seq_idx (Tensor) – a list of indices
seq_len (Tensor) – length
- Returns:
a PackedSequence object
- Return type:
sequence
- training: bool¶
disenqnet¶
- class EduNLP.ModelZoo.disenqnet.DisenQNet(vocab_size: int, hidden_size: int, dropout_rate: float, wv=None, **kwargs)[source]¶
- base_model_prefix = 'disenq'¶
DisenQNet question representation model
- Parameters:
vocab_size (int) – size of vocabulary
hidden_size (int) – size of word and question embedding
dropout_rate (float) – dropout rate
wv (torch.Tensor) – Tensor of (vocab_size, hidden_size) or None, initial word embedding, default = None
- forward(seq_idx=None, seq_len=None, get_vk=True, get_vi=True) ModelOutput[source]¶
- Parameters:
seq_idx (Tensor of (batch_size, seq_len)) – word index
seq_len (Tensor of (batch_size)) – valid sequence length of each batch
get_vk (bool) – whether to return vk
get_vi (bool) – whether to return vi
- Returns:
embed: Tensor of (batch_size, seq_len, hidden_size), word embedding
k_hidden: Tensor of (batch_size, hidden_size) or None, concept representation of question
i_hidden: Tensor of (batch_size, hidden_size) or None, individual representation of question
- Return type:
DisenQNetOutput
- training: bool¶
- class EduNLP.ModelZoo.disenqnet.DisenQNetForPreTraining(vocab_size, concept_size, hidden_size, dropout_rate, pos_weight, w_cp, w_mi, w_dis, warmup, n_adversarial, wv=None, **kwargs)[source]¶
- base_model_prefix = 'disenq'¶
- forward(seq_idx=None, seq_len=None, concept=None) ModelOutput[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class EduNLP.ModelZoo.disenqnet.DisenQNetForPropertyPrediction(vocab_size: int, hidden_size: int, dropout_rate: float, wv=None, head_dropout=0.5, **kwargs)[source]¶
- base_model_prefix = 'disenq'¶
- forward(seq_idx=None, seq_len=None, labels=None, vector_type='i') ModelOutput[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class EduNLP.ModelZoo.disenqnet.DisenQNetForKnowledgePrediction(vocab_size: int, hidden_size: int, dropout_rate: float, num_classes_list: List[int], num_total_classes: int, wv=None, head_dropout: Optional[float] = 0.5, flat_cls_weight: Optional[float] = 0.5, attention_unit_size: Optional[int] = 256, fc_hidden_size: Optional[int] = 512, beta: Optional[float] = 0.5, **kwargs)[source]¶
- base_model_prefix = 'disenq'¶
- training: bool¶
- forward(seq_idx=None, seq_len=None, labels=None, vector_type='i') ModelOutput[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
quesnet¶
- class EduNLP.ModelZoo.quesnet.QuesNet(_stoi=None, meta='know_name', pretrained_embs: Optional[ndarray] = None, pretrained_image: Optional[Module] = None, pretrained_meta: Optional[Module] = None, lambda_input=None, feat_size=256, emb_size=256, rnn_type='LSTM', layers=4, **kwargs)[source]¶
- base_model_prefix = 'quesnet'¶
- forward(inputs: SeqBatch)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class EduNLP.ModelZoo.quesnet.QuesNetForPreTraining(_stoi=None, pretrained_embs: Optional[ndarray] = None, pretrained_image: Optional[Module] = None, pretrained_meta: Optional[Module] = None, meta='know_name', emb_size=256, feat_size=512, rnn_type='LSTM', lambda_input=None, lambda_loss=None, layers=4, **kwargs)[source]¶
- base_model_prefix = 'quesnet'¶
Sequence-to-sequence feature extractor based on RNN. Supports different input forms and different RNN types (LSTM/GRU).
- training: bool¶
- forward(batch)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- class EduNLP.ModelZoo.quesnet.AE[source]¶
- factor = 1¶
- forward(item)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
utils¶
- class EduNLP.ModelZoo.utils.PadSequence(length, pad_val=0, clip=True)[source]¶
Pad the sequence.
Pad the sequence to the given length by appending pad_val. If clip is set, a sequence longer than length is clipped to length.
- Parameters:
length (int) – The maximum length to pad/clip the sequence
pad_val (number) – The pad value. Default 0
clip (bool) –
- Returns:
ret – list of numbers
- Return type:
list
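The behaviour described above can be sketched in pure Python. `PadSequenceSketch` is a hypothetical stand-in written from the description only; it assumes the object is called on one sequence at a time and that padding is appended at the end, neither of which is confirmed by the source.

```python
class PadSequenceSketch:
    """Illustrative re-implementation; not the EduNLP class itself."""

    def __init__(self, length, pad_val=0, clip=True):
        self.length = length
        self.pad_val = pad_val
        self.clip = clip

    def __call__(self, sample):
        sample = list(sample)
        if len(sample) > self.length:
            # Too long: clip only when requested, otherwise leave unchanged.
            return sample[:self.length] if self.clip else sample
        # Too short (or exact): append pad_val up to the target length.
        return sample + [self.pad_val] * (self.length - len(sample))


pad_to_four = PadSequenceSketch(4)
padded = pad_to_four([1, 2])            # shorter than 4, so it is padded
clipped = PadSequenceSketch(2)([1, 2, 3])  # longer than 2, so it is clipped
```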
- EduNLP.ModelZoo.utils.pad_sequence(sequence: list, max_length=None, pad_val=0, clip=True)[source]¶
- Parameters:
sequence –
max_length –
pad_val –
clip –
- Returns:
Modified list – the sequences padded (and, if clip is set, clipped) to the same length.
- Return type:
list
Examples
>>> seq = [[4, 3, 3], [2], [3, 3, 2]]
>>> pad_sequence(seq)
[[4, 3, 3], [2, 0, 0], [3, 3, 2]]
>>> pad_sequence(seq, pad_val=1)
[[4, 3, 3], [2, 1, 1], [3, 3, 2]]
>>> pad_sequence(seq, max_length=2)
[[4, 3], [2, 0], [3, 3]]
>>> pad_sequence(seq, max_length=2, clip=False)
[[4, 3, 3], [2, 0], [3, 3, 2]]
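For readers who want to see the logic behind these examples, the following is a minimal sketch that reproduces them. `pad_sequence_sketch` is a hypothetical name; the actual implementation in EduNLP may differ, and the default of padding to the longest sequence is inferred from the first example.

```python
def pad_sequence_sketch(sequence, max_length=None, pad_val=0, clip=True):
    """Pad (and optionally clip) every inner list to a common length."""
    if max_length is None:
        # Assumed default: pad to the longest sequence in the batch.
        max_length = max((len(s) for s in sequence), default=0)
    padded = []
    for s in sequence:
        s = list(s)
        if len(s) > max_length:
            # Over-long sequences are cut only when clip is set.
            padded.append(s[:max_length] if clip else s)
        else:
            padded.append(s + [pad_val] * (max_length - len(s)))
    return padded


seq = [[4, 3, 3], [2], [3, 3, 2]]
default_padded = pad_sequence_sketch(seq)
unclipped = pad_sequence_sketch(seq, max_length=2, clip=False)
```

The input lists are copied rather than modified in place, which keeps the sketch safe to call repeatedly on the same batch.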
- class EduNLP.ModelZoo.utils.Masker(mask: (<class 'int'>, <class 'str'>, Ellipsis) = 0, per=0.2, seed=None)[source]¶
- Parameters:
mask (int, str) –
per –
seed –
Examples
>>> masker = Masker(per=0.5, seed=10)
>>> items = [[1, 1, 3, 4, 6], [2], [5, 9, 1, 4]]
>>> masked_seq, mask_label = masker(items)
>>> masked_seq
[[1, 1, 0, 0, 6], [2], [0, 9, 0, 4]]
>>> mask_label
[[0, 0, 1, 1, 0], [0], [1, 0, 1, 0]]
>>> items = [[1, 2, 3], [1, 1, 0], [2, 0, 0]]
>>> masked_seq, mask_label = masker(items, [3, 2, 1])
>>> masked_seq
[[1, 0, 3], [0, 1, 0], [2, 0, 0]]
>>> mask_label
[[0, 1, 0], [1, 0, 0], [0, 0, 0]]
>>> masker = Masker(mask="[MASK]", per=0.5, seed=10)
>>> items = [["a", "b", "c"], ["d", "[PAD]", "[PAD]"], ["hello", "world", "[PAD]"]]
>>> masked_seq, mask_label = masker(items, length=[3, 1, 2])
>>> masked_seq
[['a', '[MASK]', 'c'], ['d', '[PAD]', '[PAD]'], ['hello', '[MASK]', '[PAD]']]
>>> mask_label
[[0, 1, 0], [0, 0, 0], [0, 1, 0]]
- Returns:
list of masked_seq and list of mask_label
- Return type:
list
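The masking scheme shown in the examples can be sketched as follows. `mask_batch` is a hypothetical function, not the EduNLP implementation. It assumes that `int(length * per)` positions are masked in each sequence (which is consistent with the counts in the examples above but not confirmed by the source) and it uses Python's `random` module, so the exact positions chosen for a given seed will generally differ from those of `Masker`.

```python
import random


def mask_batch(items, length=None, mask=0, per=0.2, seed=None):
    """Randomly replace a fraction of the valid tokens in each sequence."""
    rng = random.Random(seed)
    if length is None:
        # Without explicit lengths, every position counts as valid.
        length = [len(seq) for seq in items]
    masked_seq, mask_label = [], []
    for seq, valid_len in zip(items, length):
        # Choose positions only inside the valid (non-padding) prefix.
        chosen = set(rng.sample(range(valid_len), int(valid_len * per)))
        masked_seq.append(
            [mask if i in chosen else tok for i, tok in enumerate(seq)]
        )
        mask_label.append([1 if i in chosen else 0 for i in range(len(seq))])
    return masked_seq, mask_label


items = [["a", "b", "c"], ["d", "[PAD]", "[PAD]"], ["hello", "world", "[PAD]"]]
masked, labels = mask_batch(items, length=[3, 1, 2], mask="[MASK]", per=0.5, seed=10)
```

Restricting the sampled positions to the first `valid_len` entries is what keeps padding tokens from being masked.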
- class EduNLP.ModelZoo.utils.MLP(in_dim, n_classes, hidden_dim, dropout, n_layers=2, act=<function leaky_relu>)[source]¶
- forward(input)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class EduNLP.ModelZoo.utils.TextCNN(embed_dim, hidden_dim)[source]¶
- forward(embed)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class EduNLP.ModelZoo.utils.KnowledgePredictionOutput[source]¶
- loss: FloatTensor = None¶
- logits: FloatTensor = None¶
- class EduNLP.ModelZoo.utils.ModelOutput[source]¶
Base class for all model outputs as dataclass. Has a __getitem__ that allows indexing by integer or slice (like a tuple) or strings (like a dictionary) that will ignore the None attributes. Otherwise behaves like a regular python dictionary.
Warning
You can’t unpack a ModelOutput directly. Use the to_tuple() method to convert it to a tuple first.
- setdefault(*args, **kwargs)[source]¶
Insert key with a value of default if key is not in the dictionary.
Return the value for key if key is in the dictionary, else default.
- pop(k[, d])[source]¶
Remove the specified key and return the corresponding value. If the key is not found, d is returned if given; otherwise KeyError is raised.
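The indexing behaviour described for ModelOutput can be illustrated with a small toy container. `ToyOutput` is a hypothetical class built on `OrderedDict` purely for illustration; it is not the EduNLP class and it omits the dataclass machinery. It assumes that fields set to None are simply left out of the container.

```python
from collections import OrderedDict


class ToyOutput(OrderedDict):
    """Toy stand-in for a ModelOutput-like container (illustration only)."""

    def __init__(self, **fields):
        # Fields set to None are dropped, so indexing skips them.
        super().__init__((k, v) for k, v in fields.items() if v is not None)

    def to_tuple(self):
        # Values in insertion order, without the None fields.
        return tuple(dict.__getitem__(self, k) for k in self.keys())

    def __getitem__(self, key):
        if isinstance(key, str):
            return dict.__getitem__(self, key)  # dictionary-style access
        return self.to_tuple()[key]             # tuple-style access (int or slice)


out = ToyOutput(loss=None, logits=[0.1, 0.9])
by_name = out["logits"]
by_position = out[0]            # loss is None, so logits is the first item
keys_when_unpacked = tuple(out)  # iterating a mapping yields keys, not values
values = out.to_tuple()
```

Because iterating over a mapping yields its keys, unpacking such an object directly does not give the values, which is why converting with `to_tuple()` first is the safer pattern.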
- class EduNLP.ModelZoo.utils.PropertyPredictionOutput[source]¶
- loss: FloatTensor = None¶
- logits: FloatTensor = None¶
- EduNLP.ModelZoo.utils.gather_nd(params, indices)[source]¶
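Gather slices from params at the positions given by indices.
- Parameters:
params (Tensor) – the tensor to gather slices from
indices (Tensor) – integer index tensor; each entry along its last dimension selects a slice of params
- Returns:
Tensor – the gathered slices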
Examples
>>> gather_nd(
...     params=torch.tensor([[1, 2, 3],
...                          [4, 5, 6]]),
...     indices=torch.tensor([[1],
...                           [0]]))
tensor([[4, 5, 6],
        [1, 2, 3]])
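The indexing rule behind the example above can be sketched on plain nested lists. `gather_nd_lists` is a hypothetical helper, not the EduNLP function. It assumes the same semantics as `tf.gather_nd`, where each innermost list of `indices` is a (possibly partial) coordinate into `params`; the example is consistent with that reading, but the source does not state it. Empty index lists are not handled.

```python
def gather_nd_lists(params, indices):
    """Gather slices of a nested list using nested index lists."""
    if indices and isinstance(indices[0], int):
        # Innermost level: walk into params one coordinate at a time.
        out = params
        for i in indices:
            out = out[i]
        return out
    # Outer levels: keep the shape of indices and recurse.
    return [gather_nd_lists(params, idx) for idx in indices]


params = [[1, 2, 3],
          [4, 5, 6]]
rows = gather_nd_lists(params, [[1], [0]])         # one coordinate selects whole rows
cells = gather_nd_lists(params, [[0, 1], [1, 2]])  # two coordinates select single cells
```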
- EduNLP.ModelZoo.utils.sequence_mask(lengths, max_len=None)[source]¶
Same as tf.sequence_mask. Returns a mask tensor representing the first N positions of each cell.
- Parameters:
lengths (Tensor) – integer tensor, all its values <= max_len.
max_len (optional) – scalar integer, size of the last dimension of the returned tensor. Default is the maximum value in lengths.
- Returns:
Tensor – a mask tensor of shape lengths.shape + (max_len,)
Examples
>>> sequence_mask(torch.tensor([1, 3, 2]), 5)
tensor([[ True, False, False, False, False],
        [ True,  True,  True, False, False],
        [ True,  True, False, False, False]])
>>> sequence_mask(torch.tensor([[1, 3],[2,0]]))
tensor([[[ True, False, False],
         [ True,  True,  True]],
<BLANKLINE>
        [[ True,  True, False],
         [False, False, False]]])
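The rule that produces these masks (position p is True when p is less than the corresponding length) can be sketched on nested lists. `sequence_mask_lists` is a hypothetical pure-Python helper that mirrors the documented behaviour; it is not the EduNLP function and works on lists of ints rather than tensors.

```python
def sequence_mask_lists(lengths, max_len=None):
    """Build boolean masks marking the first N positions for each length."""

    def flatten(x):
        if isinstance(x, int):
            yield x
        else:
            for item in x:
                yield from flatten(item)

    if max_len is None:
        # Default documented above: the maximum value in lengths.
        max_len = max(flatten(lengths))

    def build(x):
        if isinstance(x, int):
            # Position pos is valid when it lies before the sequence length.
            return [pos < x for pos in range(max_len)]
        return [build(item) for item in x]

    return build(lengths)


flat_mask = sequence_mask_lists([1, 3, 2], 5)
nested_mask = sequence_mask_lists([[1, 3], [2, 0]])
```

The result has one extra trailing dimension of size max_len, matching the documented shape lengths.shape + (max_len,).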