dlpy.applications.Faster_RCNN

dlpy.applications.Faster_RCNN(conn, model_table='Faster_RCNN', n_channels=3, width=1000, height=496, scale=1, norm_stds=None, offsets=(102.9801, 115.9465, 122.7717), random_mutation=None, n_classes=20, anchor_num_to_sample=256, anchor_ratio=[0.5, 1, 2], anchor_scale=[8, 16, 32], base_anchor_size=16, coord_type='coco', max_label_per_image=200, proposed_roi_num_train=2000, proposed_roi_num_score=300, roi_train_sample_num=128, roi_pooling_height=7, roi_pooling_width=7, nms_iou_threshold=0.3, detection_threshold=0.5, max_object_num=50, number_of_neurons_in_fc=4096, backbone='vgg16', random_flip=None, random_crop=None)

Generates a deep learning model with the Faster R-CNN architecture.
Parameters:
- conn : CAS
Specifies the CAS connection object.
- model_table : string, optional
Specifies the name of the CAS table to store the model in.
- n_channels : int, optional
Specifies the number of channels (i.e., depth) of the input layer.
Default: 3
- width : int, optional
Specifies the width of the input layer.
Default: 1000
- height : int, optional
Specifies the height of the input layer.
Default: 496
- scale : double, optional
Specifies a scaling factor to be applied to each pixel intensity value.
Default: 1
- norm_stds : double or iter-of-doubles, optional
Specifies a standard deviation for each channel in the input data. The final input data is normalized with the specified means and standard deviations.
- offsets : double or iter-of-doubles, optional
Specifies an offset for each channel in the input data. The final input data is produced by applying the scaling and subtracting the specified offsets.
- random_mutation : string, optional
Specifies how to apply data augmentations/mutations to the data in the input layer.
Valid Values: 'none', 'random'
- n_classes : int, optional
Specifies the number of classes. If None is assigned, the model automatically detects the number of classes based on the training set.
Default: 20
- anchor_num_to_sample : int, optional
Specifies the number of anchors to sample for training the region proposal network.
Default: 256
- anchor_ratio : iter-of-float
Specifies the anchor height-to-width ratios (h/w) used.
- anchor_scale : iter-of-float
Specifies the anchor scales used, relative to base_anchor_size.
- base_anchor_size : int, optional
Specifies the basic anchor size in width and height (in pixels), in the original input image dimensions.
Default: 16
- coord_type : string, optional
Specifies the coordinate format type used in the input label and detection result.
Valid Values: RECT, COCO, YOLO
Default: COCO
- proposed_roi_num_score : int, optional
Specifies the number of regions of interest (ROIs) to propose in the scoring phase.
Default: 300
- proposed_roi_num_train : int, optional
Specifies the number of regions of interest (ROIs) to propose for RPN training; this is also the pool that ROIs are sampled from for Fast R-CNN training in the training phase.
Default: 2000
- roi_train_sample_num : int, optional
Specifies the number of ROIs to sample after non-maximum suppression (NMS) is performed in the training phase.
Default: 128
- roi_pooling_height : int, optional
Specifies the output height of the region pooling layer.
Default: 7
- roi_pooling_width : int, optional
Specifies the output width of the region pooling layer.
Default: 7
- max_label_per_image : int, optional
Specifies the maximum number of labels per image in training.
Default: 200
- nms_iou_threshold : float, optional
Specifies the IOU threshold used by non-maximum suppression in object detection.
Default: 0.3
- detection_threshold : float, optional
Specifies the confidence threshold for object detection.
Default: 0.5
- max_object_num : int, optional
Specifies the maximum number of objects to detect.
Default: 50
- number_of_neurons_in_fc : int or list of int, optional
Specifies the number of neurons in the last two fully connected layers. If a single int is given, both layers have that many neurons. If a list is given, the layers get different numbers of neurons.
Default: 4096
- backbone : string, optional
Specifies the architecture to be used as the feature extractor.
Valid Values: vgg16, resnet50, resnet18, resnet34, mobilenetv1, mobilenetv2
Default: vgg16
- random_flip : string, optional
Specifies how to flip the data in the input layer when image data is used. Approximately half of the input data is subject to flipping.
Valid Values: 'h', 'hv', 'v', 'none'
- random_crop : string, optional
Specifies how to crop the data in the input layer when image data is used. Images are cropped to the values specified in the width and height parameters. Only images with one or both dimensions larger than those sizes are cropped.
Valid Values: 'none', 'unique', 'randomresized', 'resizethencrop'
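The scale, offsets, and norm_stds parameters above compose into a simple per-channel preprocessing step. The sketch below illustrates one plausible ordering (scale the pixel, subtract the channel offset, then divide by the channel standard deviation when norm_stds is given); it mirrors the parameter descriptions, not DLPy's internal implementation, and preprocess_pixel is a hypothetical helper name.

```python
def preprocess_pixel(rgb, scale=1.0,
                     offsets=(102.9801, 115.9465, 122.7717),
                     norm_stds=None):
    """Illustrative sketch of the input-layer preprocessing described
    above: scale each channel, subtract its offset, and, if norm_stds
    is given, divide by the per-channel standard deviation.
    (Assumed ordering; not taken from the DLPy source.)
    """
    out = []
    for channel, offset in zip(rgb, offsets):
        value = channel * scale - offset
        if norm_stds is not None:
            # len(out) is the index of the channel being processed
            value /= norm_stds[len(out)]
        out.append(value)
    return out

# With the default offsets, a pixel equal to the offsets maps to zero.
print(preprocess_pixel((102.9801, 115.9465, 122.7717)))  # [0.0, 0.0, 0.0]
```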
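To show how anchor_ratio, anchor_scale, and base_anchor_size interact, the sketch below enumerates the (width, height) anchor shapes that a region proposal network would tile at each feature-map position, following the common Faster R-CNN convention in which each anchor preserves the area of a (base_anchor_size * scale) square while its h/w ratio varies. This is an illustrative sketch of the convention, not DLPy's internal implementation, and enumerate_anchors is a hypothetical helper name.

```python
import math

def enumerate_anchors(base_size=16, ratios=(0.5, 1, 2), scales=(8, 16, 32)):
    """Enumerate (width, height) anchor shapes in input-image pixels.

    Each anchor keeps the area of a (base_size * scale) square while
    its height-to-width ratio takes each value in `ratios`.
    """
    anchors = []
    for ratio in ratios:          # ratio = h / w
        for scale in scales:      # multiplier on base_size
            area = (base_size * scale) ** 2
            w = math.sqrt(area / ratio)
            h = w * ratio
            anchors.append((round(w), round(h)))
    return anchors

# The default settings yield 3 ratios x 3 scales = 9 anchors per position.
print(len(enumerate_anchors()))  # 9
```

Anchors that score highly in the RPN become the ROI proposals that proposed_roi_num_train and proposed_roi_num_score cap.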
Returns:

References