VGGNet

The VGGNets are a series of convolutional neural networks (CNNs) developed by the Visual Geometry Group (VGG) at the University of Oxford.

The VGG family includes various configurations with different depths, denoted by "VGG" followed by the number of weight layers.[1]

The VGG family has been widely applied in various areas of computer vision.[2] An ensemble model of VGGNets achieved state-of-the-art results in the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) in 2014.[1][3] It was used as a baseline comparison in the ResNet paper for image classification,[4] as the backbone network in the Fast Region-based CNN (Fast R-CNN) for object detection, and as a base network in neural style transfer.[5]

The series was historically important as an early influential model designed by composing generic modules, whereas AlexNet (2012) was designed "from scratch". It was also instrumental in changing the standard convolutional kernel size in CNNs from large (up to 11×11 in AlexNet) to just 3×3, a decision that was only revised in ConvNeXt (2022).[6][7]

VGGNets have been largely superseded by Inception, ResNet, and DenseNet.[8]

The key architectural principle of VGG models is the consistent use of small 3×3 convolutional filters throughout the network. This contrasts with earlier CNN architectures that employed larger filters, such as the 11×11 filters of AlexNet. A stack of two 3×3 convolutions has the same receptive field as a single 5×5 convolution, but uses fewer parameters and interposes an extra non-linearity. The original publication showed that deep and narrow CNNs significantly outperform their shallow and wide counterparts.[7]
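The parameter saving behind this design choice can be checked with a short calculation (a sketch; the channel width C below is an arbitrary example, and biases are omitted for simplicity):

```python
def conv_weights(k, c_in, c_out):
    """Number of weights in a k x k convolution layer (biases omitted)."""
    return k * k * c_in * c_out

C = 64  # example channel width; any value gives the same ratio
stacked_3x3 = 2 * conv_weights(3, C, C)  # two stacked 3x3 convolutions
single_5x5 = conv_weights(5, C, C)       # one 5x5 convolution
print(stacked_3x3, single_5x5)           # 73728 102400 -> ~28% fewer weights

def receptive_field(n_layers, k=3):
    """Receptive field of n stacked k x k, stride-1 convolutions."""
    return n_layers * (k - 1) + 1

print(receptive_field(2))  # 5 -> same as a single 5x5 kernel
```

The ratio 2·3²/5² = 18/25 holds for any channel width, so the stacked design saves parameters while adding an extra ReLU between the two convolutions.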

The VGG series of models are deep neural networks composed of generic modules, available in configurations of different depths. The most common configurations are VGG-16 (13 convolutional layers + 3 fully connected layers) and VGG-19 (16 + 3), denoted as configurations D and E in the original paper.

In these configurations, each convolutional layer is a 3×3 convolution with stride 1 and a specified number of output channels, followed by a ReLU activation, and each down-sampling layer is a 2×2 max-pooling with stride 2.
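As an illustration of how these modules compose, the layer layout of configuration D (VGG-16) from the original paper's Table 1 can be written as a flat list, with numbers giving conv output channels and "M" marking a max-pooling layer (a sketch; the counting helper is not from the paper):

```python
# Configuration D (VGG-16): numbers are conv output channels, "M" = 2x2 max-pooling.
VGG16_D = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
           512, 512, 512, "M", 512, 512, 512, "M"]

def count_layers(config, fc_layers=3):
    """Count convolutional, pooling, and total weight layers in a config."""
    convs = [c for c in config if c != "M"]
    return {
        "conv_layers": len(convs),
        "pool_layers": config.count("M"),
        "weight_layers": len(convs) + fc_layers,  # pooling layers carry no weights
    }

print(count_layers(VGG16_D))
# {'conv_layers': 13, 'pool_layers': 5, 'weight_layers': 16}
```

Each "M" halves the spatial resolution while the channel count doubles from 64 up to 512, the deep-and-narrow pattern described above.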

Figure: VGG module architecture compared to the AlexNet architecture.
Figure: Network-in-Network architecture compared to the VGG architecture.

The Network in Network architecture (2013)[9] was an earlier CNN. It changed the AlexNet architecture by adding 1×1 convolutions and using global average pooling after the last convolution.
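The two NiN ideas mentioned here are easy to state in array terms: a 1×1 convolution is a per-pixel linear map across channels, and global average pooling reduces each channel map to a single number. A minimal NumPy sketch (shapes and names are illustrative, not from the paper):

```python
import numpy as np

def conv1x1(x, w):
    """1x1 convolution: the same linear map over channels applied at every pixel.
    x: (c_in, h, w) feature map; w: (c_out, c_in) weight matrix."""
    return np.einsum("oc,chw->ohw", w, x)

def global_avg_pool(x):
    """Reduce each (h, w) channel map to its mean: one value per channel."""
    return x.mean(axis=(1, 2))

x = np.ones((3, 8, 8))            # toy 3-channel, 8x8 feature map
w = np.arange(6.0).reshape(2, 3)  # maps 3 channels -> 2 channels
y = conv1x1(x, w)                 # shape (2, 8, 8)
z = global_avg_pool(y)            # shape (2,), in place of a fully connected head
```

With all-ones input, each output channel is the corresponding row sum of w, so z is [3.0, 12.0]; in NiN this pooled vector feeds the classifier directly instead of large fully connected layers.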