Computer Vision News - August 2022

Detect Graphic Intensity and Power in Videos

```python
img_size_flat = img_size * img_size * num_channels
# Number of classes for classification (Violence / No Violence)
num_classes = 2
# Number of files to train
_num_files_train = 1
# Number of frames per video
_images_per_file = 20
# Number of frames per training set
_num_images_train = _num_files_train * _images_per_file
# Video extension
video_exts = ".avi"
```

Plot a video frame to see if the data is correct

First, get the names and labels of all the videos:

```python
names, labels = label_video_names(in_dir)
```

Then we are going to load 20 frames of one video, for example names[12]:

```python
names[12]
```
```
'fi191_xvid.avi'
```

The video contains violence: look at the name of the video, which starts with 'fi'.

```python
frames = get_frames(in_dir, names[12])
```

Convert the frames back to uint8 pixel format to plot them:

```python
visible_frame = (frames * 255).astype('uint8')
plt.imshow(visible_frame[3])
```
```
<matplotlib.image.AxesImage at 0x7f37ef72fef0>
```
```python
plt.imshow(visible_frame[15])
```

Pre-Trained Model: VGG16

The following creates an instance of the pre-trained VGG16 model using the Keras API. This automatically downloads the required files if you don't have them already. The VGG16 model contains a convolutional part and a fully-connected (or dense) part, which is used for classification. If include_top=True, then the whole VGG16 model is downloaded, which is about 528 MB. If include_top=False, then only the convolutional part of the VGG16 model is downloaded, which is just 57 MB.

```python
image_model = VGG16(include_top=True, weights='imagenet')
image_model.summary()
```

We can observe the shape of the tensors expected as input by the pre-trained VGG16 model: in this case, images of shape 224 x 224 x 3. Note that we have defined the frame size as 224x224x3; the video frames will be the input of the VGG16 net.

```python
input_shape = image_model.layers[0].output_shape[1:3]
input_shape
```
```
(224, 224)
```
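The helpers label_video_names and get_frames used above are not defined in this excerpt. A minimal sketch of the labeling helper, assuming (as the filenames in the dataset suggest) that fight videos start with 'fi' and non-fight videos with 'no':

```python
import os

def label_video_names(in_dir):
    """Hypothetical sketch: list .avi files in in_dir and label them
    by filename prefix ('fi' = violence, 'no' = no violence)."""
    names = []
    labels = []
    for filename in sorted(os.listdir(in_dir)):
        if not filename.endswith('.avi'):
            continue
        if filename.startswith('fi'):
            labels.append([1, 0])   # violence
        elif filename.startswith('no'):
            labels.append([0, 1])   # no violence
        else:
            continue                # skip files with other prefixes
        names.append(filename)
    return names, labels
```

The one-hot label layout ([1, 0] vs [0, 1]) matches the two-class setup (num_classes = 2) defined above; the actual article code may differ in details.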
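The multiplication by 255 before plotting implies that get_frames returns frames as floats scaled to [0, 1]. A small NumPy sketch of that conversion, using random data in place of real video frames (the 20 x 224 x 224 x 3 shape follows the constants above):

```python
import numpy as np

# Stand-in for get_frames output: 20 RGB frames, pixel values in [0, 1]
frames = np.random.rand(20, 224, 224, 3).astype('float32')

# Convert back to uint8 pixel values so plt.imshow renders them as images
visible_frame = (frames * 255).astype('uint8')
```

Keeping the frames in [0, 1] float format is convenient for feeding the network, while uint8 in [0, 255] is the conventional format for displaying images.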
