flambe.nlp.transformers
Submodules
Package Contents
class flambe.nlp.transformers.BERTTextField(vocab_file: str, sos_token: str = '[CLS]', eos_token: str = '[SEP]', do_lower_case: bool = False, max_len_truncate: int = 100, **kwargs)
Bases: flambe.field.TextField, pytorch_transformers.BertTokenizer
Perform WordPiece tokenization.
Inspired by: https://github.com/huggingface/pytorch-pretrained-BERT/blob/master/pytorch_pretrained_bert/tokenization.py
Note that this object requires a pretrained vocabulary.
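As an illustration of what WordPiece tokenization does with a pretrained vocabulary, the following is a minimal sketch of the greedy longest-match-first lookup applied to a single word. It is not the code used by BERTTextField or pytorch_transformers; the function name, the toy vocabulary, and the "[UNK]" fallback token are assumptions made for the example.

```python
from typing import List, Set


def wordpiece_tokenize(word: str, vocab: Set[str], unk_token: str = "[UNK]") -> List[str]:
    """Split one word into sub-word pieces by greedy longest-match-first lookup."""
    pieces: List[str] = []
    start = 0
    while start < len(word):
        end = len(word)
        current = None
        # Shrink the candidate substring from the right until it is in the vocabulary.
        while start < end:
            candidate = word[start:end]
            if start > 0:
                # Pieces that continue a word carry a "##" prefix.
                candidate = "##" + candidate
            if candidate in vocab:
                current = candidate
                break
            end -= 1
        if current is None:
            # No piece matches at this position: treat the whole word as unknown.
            return [unk_token]
        pieces.append(current)
        start = end
    return pieces


# Toy vocabulary, invented for this example (a real one comes from vocab_file).
toy_vocab = {"un", "##aff", "##able", "play", "##ing"}

print(wordpiece_tokenize("unaffable", toy_vocab))
print(wordpiece_tokenize("playing", toy_vocab))
print(wordpiece_tokenize("xyz", toy_vocab))
```

Because every lookup goes through the vocabulary, a word that cannot be covered by known pieces collapses to the unknown token, which is why a pretrained vocabulary is required.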
classmethod from_alias(cls, path: str = 'bert-base-cased', cache_dir: Optional[str] = None, do_lower_case: bool = False, max_len_truncate: int = 100, **kwargs)
Initialize from a pretrained tokenizer.
Parameters: path (str) – Path to a pretrained model, or one of the following string aliases currently available:
- bert-base-uncased
- bert-large-uncased
- bert-base-cased
- bert-large-cased
- bert-base-multilingual-uncased
- bert-base-multilingual-cased
- bert-base-chinese
process(self, example: str)
Process an example, and create a Tensor.
Parameters: example (str) – The example to process, as a single string
Returns: The processed example, tokenized and numericalized
Return type: torch.Tensor
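The steps implied by "tokenized and numericalized" can be sketched as follows. This is an illustration under stated assumptions, not the field's actual implementation: it assumes truncation to max_len_truncate happens before the start and end tokens are added, it takes already-tokenized input, and it returns a plain list of ids instead of a torch.Tensor so that it runs without PyTorch. The function name and the toy vocabulary are hypothetical.

```python
from typing import Dict, List, Optional


def process_example(
    tokens: List[str],
    token_to_id: Dict[str, int],
    sos_token: Optional[str] = "[CLS]",
    eos_token: Optional[str] = "[SEP]",
    max_len_truncate: Optional[int] = 100,
    unk_token: str = "[UNK]",
) -> List[int]:
    """Truncate, add the start/end tokens, and map every token to its id."""
    # Assumption: truncation is applied before the special tokens are added.
    if max_len_truncate is not None:
        tokens = tokens[:max_len_truncate]
    if sos_token is not None:
        tokens = [sos_token] + tokens
    if eos_token is not None:
        tokens = tokens + [eos_token]
    # Tokens missing from the vocabulary fall back to the unknown-token id.
    unk_id = token_to_id[unk_token]
    return [token_to_id.get(token, unk_id) for token in tokens]


# Toy vocabulary, invented for this example.
toy_ids = {"[UNK]": 0, "[CLS]": 1, "[SEP]": 2, "hello": 3, "world": 4}

print(process_example(["hello", "world"], toy_ids))
print(process_example(["hello", "world"], toy_ids, max_len_truncate=1))
```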
class flambe.nlp.transformers.BERTEmbeddings(input_size_or_config: Union[int, pt.BertConfig], embedding_size: int = 768, embedding_dropout: float = 0.1, embedding_freeze: bool = False, pad_index: int = 0, max_position_embeddings: int = 512, type_vocab_size: int = 2, **kwargs)
Bases: flambe.nn.Module, pytorch_transformers.modeling_bert.BertPreTrainedModel
Integrate the pytorch_pretrained_bert BERT word embedding model.
This module can be used as any normal encoder, or it can be loaded with the official pretrained BERT models. Simply use the from_pretrained class method when initializing the model.
Currently available:
- bert-base-uncased
- bert-large-uncased
- bert-base-cased
- bert-large-cased
- bert-base-multilingual-uncased
- bert-base-multilingual-cased
- bert-base-chinese
classmethod from_alias(cls, path: str = 'bert-base-cased', cache_dir: Optional[str] = None, **kwargs)
Initialize from a pretrained model.
Parameters: path (str) – Path to a pretrained model, or one of the following string aliases currently available:
- bert-base-uncased
- bert-large-uncased
- bert-base-cased
- bert-large-cased
- bert-base-multilingual-uncased
- bert-base-multilingual-cased
- bert-base-chinese
forward(self, data: Tensor)
Performs a forward pass through the network.
Parameters: data (torch.Tensor) – The input data, as a float tensor, batch first
Returns:
- torch.Tensor – The encoded output, as a float tensor, batch first
- torch.Tensor, optional – The padding mask if a pad index was given
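For readers unfamiliar with padding masks, the sketch below shows one common way such a mask is derived from a pad index. It uses nested Python lists instead of tensors so that it runs without PyTorch, and the convention that True marks a real token is an assumption; the mask returned by this module may use a different dtype or polarity. The function name is hypothetical.

```python
from typing import List


def padding_mask(batch: List[List[int]], pad_index: int = 0) -> List[List[bool]]:
    """Mark every position that holds a real token (True) or padding (False)."""
    return [[token != pad_index for token in row] for row in batch]


# Two sequences padded to the same length with pad_index = 0 (ids are made up).
batch = [
    [101, 7592, 2088, 102, 0],
    [101, 7592, 102, 0, 0],
]

for row in padding_mask(batch):
    print(row)
```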
class flambe.nlp.transformers.BERTEncoder(input_size_or_config: Union[int, pt.modeling_bert.BertConfig], hidden_size: int = 768, num_hidden_layers: int = 12, num_attention_heads: int = 12, intermediate_size: int = 3072, hidden_act: str = 'gelu', hidden_dropout_prob: float = 0.1, attention_probs_dropout_prob: float = 0.1, max_position_embeddings: int = 512, type_vocab_size: int = 2, initializer_range: float = 0.02, pool_last: bool = False, **kwargs)
Bases: flambe.nn.Module, pytorch_transformers.modeling_bert.BertPreTrainedModel
Integrate the pytorch_pretrained_bert BERT encoder model.
This module can be used as any normal encoder, or it can be loaded with the official pretrained BERT models. Simply use the from_pretrained class method when initializing the model.
Currently available:
- bert-base-uncased
- bert-large-uncased
- bert-base-cased
- bert-large-cased
- bert-base-multilingual-uncased
- bert-base-multilingual-cased
- bert-base-chinese
classmethod from_alias(cls, path: str = 'bert-base-cased', cache_dir: Optional[str] = None, pool_last: bool = False, **kwargs)
Initialize from a pretrained model.
Parameters: path (str) – Path to a pretrained model, or one of the following string aliases currently available:
- bert-base-uncased
- bert-large-uncased
- bert-base-cased
- bert-large-cased
- bert-base-multilingual-uncased
- bert-base-multilingual-cased
- bert-base-chinese
forward(self, data: Tensor, mask: Optional[Tensor] = None)
Performs a forward pass through the network.
Parameters: data (torch.Tensor) – The input data, as a long tensor
Returns: The encoded output, as a float tensor or the pooled output
Return type: torch.Tensor
class flambe.nlp.transformers.OpenAIGPTTextField(vocab_file: str, merges_file: str, max_len: int = 100, lower: bool = False)
Bases: flambe.field.TextField, pytorch_transformers.OpenAIGPTTokenizer
Perform byte-pair encoding (BPE) tokenization.
Inspired by: https://github.com/huggingface/pytorch-pretrained-BERT/blob/master/pytorch_pretrained_bert/tokenization_openai.py
Note that this object requires a pretrained vocabulary.
classmethod from_alias(cls, path: str = 'openai-gpt', cache_dir: Optional[str] = None)
Initialize from a pretrained tokenizer.
process(self, example: str)
Process an example, and create a Tensor.
Parameters: example (str) – The example to process, as a single string
Returns: The processed example, tokenized and numericalized
Return type: torch.Tensor
class flambe.nlp.transformers.OpenAIGPTEmbeddings(input_size_or_config: Union[int, pt.OpenAIGPTConfig] = 40478, embedding_size: int = 768, embedding_dropout: float = 0.1, embedding_freeze: bool = False, pad_index: int = 0, n_special: int = 0, n_positions: int = 512, initializer_range=0.02)
Bases: flambe.nn.Module, pytorch_transformers.modeling_openai.OpenAIGPTPreTrainedModel
Integrate the pytorch_pretrained_bert OpenAI embedding model.
This module can be used as any normal encoder, or it can be loaded with the official pretrained OpenAI models. Simply use the from_pretrained class method when initializing the model.
classmethod from_alias(cls, path: str = 'openai-gpt', cache_dir: Optional[str] = None)
Initialize from a pretrained model.
Parameters: path (str) – Path to a pretrained model, or one of the following string aliases currently available:
- openai-gpt
set_num_special_tokens(self, num_special_tokens)
Update the input embeddings with a new embedding matrix if needed.
forward(self, data: Tensor)
Performs a forward pass through the network.
Parameters: data (torch.Tensor) – The input data, as a float tensor, batch first
Returns:
- torch.Tensor – The encoded output, as a float tensor, batch first
- torch.Tensor, optional – The padding mask if a pad index was given
class flambe.nlp.transformers.OpenAIGPTEncoder(input_size_or_config: Union[int, pt.OpenAIGPTConfig] = 768, n_ctx: int = 512, n_layer: int = 12, n_head: int = 12, afn: Union[str, nn.Module] = 'gelu', resid_pdrop: float = 0.1, embd_pdrop: float = 0.1, attn_pdrop: float = 0.1, layer_norm_epsilon: float = 1e-05, initializer_range=0.02)
Bases: flambe.nn.Module, pytorch_transformers.modeling_openai.OpenAIGPTPreTrainedModel
Integrate the pytorch_pretrained_bert OpenAIGPT encoder model.
This module can be used as any normal encoder, or it can be loaded with the official pretrained OpenAI GPT models. Simply use the from_pretrained class method when initializing the model.
Currently available:
- openai-gpt
classmethod from_alias(cls, path: str = 'openai-gpt', cache_dir: Optional[str] = None)
Initialize from a pretrained model.
Parameters: path (str) – Path to a pretrained model, or one of the following string aliases currently available:
- openai-gpt
forward(self, data: Tensor, mask: Optional[Tensor] = None)
Performs a forward pass through the network.
Parameters: data (torch.Tensor) – The input data, as a long tensor
Returns: The encoded output, as a float tensor or the pooled output
Return type: torch.Tensor
class flambe.nlp.transformers.AdamW
Bases: flambe.Component, pytorch_transformers.optimization.Optimizer
class flambe.nlp.transformers.ConstantLRSchedule
Bases: flambe.Component, pytorch_transformers.ConstantLRSchedule
class flambe.nlp.transformers.WarmupConstantSchedule
Bases: flambe.Component, pytorch_transformers.WarmupConstantSchedule
class flambe.nlp.transformers.WarmupLinearSchedule
Bases: flambe.Component, pytorch_transformers.WarmupLinearSchedule
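This page lists no arguments for the schedule classes, so the following is only a sketch of the learning-rate multipliers that warmup schedules of this kind usually compute. The function names are hypothetical, and the parameter names warmup_steps and t_total are assumptions modelled on the upstream pytorch_transformers schedules; the wrappers here may expose them differently. The multiplier is the factor applied to the optimizer's base learning rate at a given step.

```python
def warmup_constant_factor(step: int, warmup_steps: int) -> float:
    """Rise linearly from 0 to 1 during warmup, then stay at 1."""
    if step < warmup_steps:
        return float(step) / float(max(1, warmup_steps))
    return 1.0


def warmup_linear_factor(step: int, warmup_steps: int, t_total: int) -> float:
    """Rise linearly from 0 to 1 during warmup, then decay linearly to 0 at t_total."""
    if step < warmup_steps:
        return float(step) / float(max(1, warmup_steps))
    remaining = float(t_total - step)
    decay_span = float(max(1, t_total - warmup_steps))
    # Clamp at 0 so the factor never goes negative past t_total.
    return max(0.0, remaining / decay_span)


# Print both multipliers at a few steps of a hypothetical 100-step run.
for step in (0, 5, 10, 55, 100):
    constant = warmup_constant_factor(step, warmup_steps=10)
    linear = warmup_linear_factor(step, warmup_steps=10, t_total=100)
    print(step, constant, linear)
```

Both functions share the same warmup phase; they differ only in what happens once warmup_steps is reached.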