{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# VGG16预训练模型\n", "\n", "*Author: Pytorch Team*\n", "\n", "**Award winning ConvNets from 2014 Imagenet ILSVRC challenge**\n", "\n", "\n", "\n", "\n", "https://pytorch.org/hub/pytorch_vision_vgg/\n", "\n", "https://pytorch.org/docs/stable/torchvision/models.html" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Run it in colab\n", "\n", "https://colab.research.google.com/drive/1epVRmNLeoAenypwM1ffGeHv9pk1xtEek\n", "\n", "This notebook is optionally accelerated with a GPU runtime.\n", "\n", "If you would like to use this acceleration, please \n", "- select the menu option \"Runtime\" -> \"Change runtime type\", \n", "- select \"Hardware Accelerator\" -> \"GPU\" and click \"SAVE\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Load Pretrained Models" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "ExecuteTime": { "start_time": "2021-07-23T12:32:02.893Z" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Downloading: \"https://github.com/pytorch/vision/archive/v0.6.0.zip\" to /Users/datalab/.cache/torch/hub/v0.6.0.zip\n", "Downloading: \"https://download.pytorch.org/models/vgg16-397923af.pth\" to /Users/datalab/.cache/torch/hub/checkpoints/vgg16-397923af.pth\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "f7d7289bf8db4e6fb3ce17bf9b0901f4", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(FloatProgress(value=0.0, max=553433881.0), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "import torch\n", "model = torch.hub.load('pytorch/vision:v0.6.0', 'vgg16', pretrained=True)\n", "# or any of these variants\n", "# model = torch.hub.load('pytorch/vision:v0.6.0', 'vgg11_bn', pretrained=True)\n", "# model = torch.hub.load('pytorch/vision:v0.6.0', 'vgg13', pretrained=True)\n", "# model = torch.hub.load('pytorch/vision:v0.6.0', 'vgg13_bn', 
pretrained=True)\n", "# model = torch.hub.load('pytorch/vision:v0.6.0', 'vgg16', pretrained=True)\n", "# model = torch.hub.load('pytorch/vision:v0.6.0', 'vgg16_bn', pretrained=True)\n", "# model = torch.hub.load('pytorch/vision:v0.6.0', 'vgg19', pretrained=True)\n", "# model = torch.hub.load('pytorch/vision:v0.6.0', 'vgg19_bn', pretrained=True)\n", "model.eval()" ] }, { "cell_type": "markdown", "metadata": { "ExecuteTime": { "end_time": "2020-12-04T08:55:55.405309Z", "start_time": "2020-12-04T08:55:55.386578Z" } }, "source": [ "- Downloading: \"https://github.com/pytorch/vision/archive/v0.6.0.zip\" to /root/.cache/torch/hub/v0.6.0.zip\n", "- Downloading: \"https://download.pytorch.org/models/vgg16-397923af.pth\" to /root/.cache/torch/hub/checkpoints/vgg16-397923af.pth\n", "\n", "```\n", "100% 528M/528M [00:02<00:00, 223MB/s]\n", "```\n", "\n", "```\n", "VGG(\n", " (features): Sequential(\n", " (0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))\n", " (1): ReLU(inplace=True)\n", " (2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))\n", " (3): ReLU(inplace=True)\n", " (4): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)\n", " (5): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))\n", " (6): ReLU(inplace=True)\n", " (7): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))\n", " (8): ReLU(inplace=True)\n", " (9): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)\n", " (10): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))\n", " (11): ReLU(inplace=True)\n", " (12): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))\n", " (13): ReLU(inplace=True)\n", " (14): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))\n", " (15): ReLU(inplace=True)\n", " (16): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)\n", " (17): Conv2d(256, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 
1))\n", " (18): ReLU(inplace=True)\n", " (19): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))\n", " (20): ReLU(inplace=True)\n", " (21): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))\n", " (22): ReLU(inplace=True)\n", " (23): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)\n", " (24): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))\n", " (25): ReLU(inplace=True)\n", " (26): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))\n", " (27): ReLU(inplace=True)\n", " (28): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))\n", " (29): ReLU(inplace=True)\n", " (30): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)\n", " )\n", " (avgpool): AdaptiveAvgPool2d(output_size=(7, 7))\n", " (classifier): Sequential(\n", " (0): Linear(in_features=25088, out_features=4096, bias=True)\n", " (1): ReLU(inplace=True)\n", " (2): Dropout(p=0.5, inplace=False)\n", " (3): Linear(in_features=4096, out_features=4096, bias=True)\n", " (4): ReLU(inplace=True)\n", " (5): Dropout(p=0.5, inplace=False)\n", " (6): Linear(in_features=4096, out_features=1000, bias=True)\n", " )\n", ")\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "All pre-trained models expect input images normalized in the same way,\n", "i.e. mini-batches of 3-channel RGB images of shape `(3 x H x W)`, where `H` and `W` are expected to be at least `224`.\n", "\n", "The images have to be loaded in to a range of `[0, 1]` and then normalized using `mean = [0.485, 0.456, 0.406]`\n", "and `std = [0.229, 0.224, 0.225]`.\n", "\n", "Here's a sample execution." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Download an example image from the pytorch website\n", "import urllib\n", "url, filename = (\"https://github.com/pytorch/hub/raw/master/images/dog.jpg\", \"dog.jpg\")\n", "try: urllib.URLopener().retrieve(url, filename)\n", "except: urllib.request.urlretrieve(url, filename)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "https://github.com/pytorch/hub/raw/master/images/dog.jpg" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from PIL import Image\n", "input_image = Image.open(filename)\n", "input_image" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# sample execution (requires torchvision)\n", "from torchvision import transforms\n", "\n", "preprocess = transforms.Compose([\n", " transforms.Resize(256),\n", " transforms.CenterCrop(224),\n", " transforms.ToTensor(),\n", " transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),\n", "])\n", "input_tensor = preprocess(input_image)\n", "input_batch = input_tensor.unsqueeze(0) # create a mini-batch as expected by the model\n", "\n", "# move the input and model to GPU for speed if available\n", "if torch.cuda.is_available():\n", " input_batch = input_batch.to('cuda')\n", " model.to('cuda')\n", "\n", "with torch.no_grad():\n", " output = model(input_batch)\n", "# Tensor of shape 1000, with confidence scores over Imagenet's 1000 classes\n", "print(output[0])" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# The output has unnormalized scores. 
To get probabilities, you can run a softmax on it.\n", "\n", "input_prob = torch.nn.functional.softmax(output[0], dim=0)\n", "torch.argmax(input_prob)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "```\n", "tensor(258, device='cuda:0')\n", "```" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "ExecuteTime": { "end_time": "2021-07-23T12:37:11.364843Z", "start_time": "2021-07-23T12:37:09.577544Z" } }, "outputs": [], "source": [ "# The ImageNet challenge uses a 'trimmed' list of 1000 non-overlapping classes\n", "import pandas as pd\n", "\n", "url = 'https://s3.amazonaws.com/deep-learning-models/image-models/imagenet_class_index.json'\n", "imagenet_df = pd.read_json(url).T" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "ExecuteTime": { "end_time": "2021-07-23T12:37:18.548085Z", "start_time": "2021-07-23T12:37:18.530787Z" } }, "outputs": [ { "data": { "text/html": [ "
\n", " | 0 | \n", "1 | \n", "
---|---|---|
0 | \n", "n01440764 | \n", "tench | \n", "
1 | \n", "n01443537 | \n", "goldfish | \n", "
2 | \n", "n01484850 | \n", "great_white_shark | \n", "
3 | \n", "n01491361 | \n", "tiger_shark | \n", "
4 | \n", "n01494475 | \n", "hammerhead | \n", "
... | \n", "... | \n", "... | \n", "
995 | \n", "n13044778 | \n", "earthstar | \n", "
996 | \n", "n13052670 | \n", "hen-of-the-woods | \n", "
997 | \n", "n13054560 | \n", "bolete | \n", "
998 | \n", "n13133613 | \n", "ear | \n", "
999 | \n", "n15075141 | \n", "toilet_tissue | \n", "
1000 rows × 2 columns
\n", "