dlpy.layers.MultiHeadAttention
class dlpy.layers.MultiHeadAttention(n, n_attn_heads, name=None, act='AUTO', init=None, std=None, mean=None, truncation_factor=None, dropout=None, attn_dropout=None, include_bias=True, src_layers=None, **kwargs)
Multi-head attention layer from “Attention Is All You Need” (Vaswani et al., NIPS 2017).
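The computation described in the cited paper can be sketched in NumPy as follows. This is a minimal illustration, not DLPy's or CAS's implementation: it assumes that `n` is the total layer width and is split evenly across `n_attn_heads` heads, and it omits bias, dropout, masking, and the activation function. The helper names (`softmax`, `multi_head_attention`, `split_heads`) and the weight matrices are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the maximum along the axis for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(x, w_q, w_k, w_v, w_o, n_attn_heads):
    """Self-attention over x of shape (seq_len, n); no masking or dropout."""
    seq_len, n = x.shape
    if n % n_attn_heads != 0:
        raise ValueError("n must be divisible by n_attn_heads")
    d_head = n // n_attn_heads  # assumed per-head width

    def split_heads(t):
        # (seq_len, n) -> (n_attn_heads, seq_len, d_head)
        return t.reshape(seq_len, n_attn_heads, d_head).transpose(1, 0, 2)

    q = split_heads(x @ w_q)
    k = split_heads(x @ w_k)
    v = split_heads(x @ w_v)

    # Scaled dot-product attention, computed independently for each head.
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)  # (heads, seq, seq)
    weights = softmax(scores, axis=-1)
    context = weights @ v                                 # (heads, seq, d_head)

    # Concatenate the heads back to (seq_len, n), then project the result.
    merged = context.transpose(1, 0, 2).reshape(seq_len, n)
    return merged @ w_o, weights

rng = np.random.default_rng(0)
n, n_attn_heads, seq_len = 8, 2, 5
x = rng.normal(size=(seq_len, n))
w_q, w_k, w_v, w_o = (rng.normal(scale=0.1, size=(n, n)) for _ in range(4))

out, weights = multi_head_attention(x, w_q, w_k, w_v, w_o, n_attn_heads)
print(out.shape, weights.shape)
```

Under these assumptions the output keeps the input width `n`, and each head produces its own `seq_len` by `seq_len` matrix of attention weights whose rows sum to 1.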
- Parameters
- n : int
Specifies the number of neurons.
- n_attn_heads : int
Specifies the number of attention heads.
- name : string, optional
Specifies the name of the layer.
- act : string, optional
Specifies the activation function.
Valid Values: AUTO, IDENTITY, LOGISTIC, SIGMOID, EXP, TANH, RECTIFIER, RELU, GELU
Default: AUTO
- init : string, optional
Specifies the initialization scheme for the layer.
Valid Values: XAVIER, UNIFORM, NORMAL, CAUCHY, XAVIER1, XAVIER2, MSRA, MSRA1, MSRA2
Default: XAVIER
- std : float, optional
Specifies the standard deviation value when the init parameter is set to NORMAL.
- mean : float, optional
Specifies the mean value when the init parameter is set to NORMAL.
- truncation_factor : float, optional
Specifies the truncation threshold (truncation_factor × std) when the init parameter is set to NORMAL.
- dropout : float, optional
Specifies the dropout rate.
Default: 0
- attn_dropout : float, optional
Specifies the attention dropout rate.
Default: 0
- include_bias : bool, optional
Includes bias neurons.
Default: True
- src_layers : iter-of-Layers, optional
Specifies the layers directed to this layer.
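To make the relationship between `mean`, `std`, and `truncation_factor` concrete, here is a small sketch of truncated normal initialization. It assumes that values farther than truncation_factor × std from the mean are redrawn; this page does not say how the server enforces the threshold, so the redraw strategy and the helper name `truncated_normal` are illustrative only.

```python
import random

def truncated_normal(mean, std, truncation_factor, rng):
    """Draw one value, redrawing until it lies within the threshold of the mean."""
    if std < 0 or truncation_factor <= 0:
        raise ValueError("std must be non-negative and truncation_factor positive")
    threshold = truncation_factor * std  # the threshold described above
    while True:
        value = rng.gauss(mean, std)
        # Assumption: out-of-range draws are rejected and redrawn, not clipped.
        if abs(value - mean) <= threshold:
            return value

rng = random.Random(0)
mean, std, truncation_factor = 0.0, 0.02, 2.0
samples = [truncated_normal(mean, std, truncation_factor, rng) for _ in range(1000)]

# Every sample lies within truncation_factor * std of the mean.
print(max(abs(s - mean) for s in samples) <= truncation_factor * std)
```

With `std=0.02` and `truncation_factor=2.0`, the threshold is 0.04, so no initial weight in this sketch falls outside [-0.04, 0.04].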
- Returns
__init__(n, n_attn_heads, name=None, act='AUTO', init=None, std=None, mean=None, truncation_factor=None, dropout=None, attn_dropout=None, include_bias=True, src_layers=None, **kwargs)
Initialize self. See help(type(self)) for accurate signature.
Methods
- __init__(n, n_attn_heads[, name, act, init, …])
Initialize self.
- count_instances()
- format_name([block_num, local_count])
Format the name of the layer.
- get_number_of_instances()
- to_model_params()
Convert the model configuration to CAS action parameters.
Attributes
- can_be_last_layer
- kernel_size
- layer_id
- num_bias
- num_features
- num_weights
- number_of_instances
- output_size
- rnn_summary
Return a DataFrame containing the layer information for RNN models.
- summary
Return a DataFrame containing the layer information.
- type
- type_desc
- type_label