Automated recognition of Spotted-wing drosophila

Copyright (c) 2020 BioDatarium (Authors: Dalila Rendon, Ewald Enzinger)

For this project, we fine-tune a pretrained Faster R-CNN model to accurately identify males and females of the spotted-wing drosophila (Drosophila suzukii), as well as individuals of Drosophila melanogaster, in mixed images.

Spotted-wing drosophila is an invasive fly pest of berries, and monitoring for its presence in crop fields is an important element of management. Spotted-wing drosophila are often collected with drowning cup traps. One downside is that these traps often collect a lot of bycatch, including other drosophila species. Trap samples are then inspected under a dissecting microscope to detect and count spotted-wing drosophila.

For this project, we aim to build an automated species recognition model that can identify and count males and females of spotted-wing drosophila in microscope images of mixed samples. The training data consists of 353 images of mixed flies, with a total of 1719 male and 2390 female spotted-wing drosophila and 3381 Drosophila melanogaster individuals labeled.

This script is an overview of one of many approaches to solving this problem; optimizing models for different types of images is an ongoing work in progress. Some of the code used here was adapted from https://pytorch.org/tutorials/intermediate/torchvision_tutorial.html

Image copyright: Dalila Rendon, 2020

[Image: Capture1.JPG]

In [ ]:
!pip install pycocotools
Requirement already satisfied: pycocotools in /usr/local/lib/python3.6/dist-packages (2.0.2)
Requirement already satisfied: matplotlib>=2.1.0 in /usr/local/lib/python3.6/dist-packages (from pycocotools) (3.2.2)
Requirement already satisfied: setuptools>=18.0 in /usr/local/lib/python3.6/dist-packages (from pycocotools) (50.3.0)
Requirement already satisfied: cython>=0.27.3 in /usr/local/lib/python3.6/dist-packages (from pycocotools) (0.29.21)
Requirement already satisfied: kiwisolver>=1.0.1 in /usr/local/lib/python3.6/dist-packages (from matplotlib>=2.1.0->pycocotools) (1.2.0)
Requirement already satisfied: numpy>=1.11 in /usr/local/lib/python3.6/dist-packages (from matplotlib>=2.1.0->pycocotools) (1.18.5)
Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.1 in /usr/local/lib/python3.6/dist-packages (from matplotlib>=2.1.0->pycocotools) (2.4.7)
Requirement already satisfied: python-dateutil>=2.1 in /usr/local/lib/python3.6/dist-packages (from matplotlib>=2.1.0->pycocotools) (2.8.1)
Requirement already satisfied: cycler>=0.10 in /usr/local/lib/python3.6/dist-packages (from matplotlib>=2.1.0->pycocotools) (0.10.0)
Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.6/dist-packages (from python-dateutil>=2.1->matplotlib>=2.1.0->pycocotools) (1.15.0)
In [ ]:
#######################################################################
# Importing python packages
#######################################################################

# os, used for listing files from directories, checking if files exist, etc.
import os
# Numpy for linear algebra and n-dimension arrays
import numpy as np
# Pytorch, our deep learning framework
import torch
# PIL (pillow) for image loading/resizing
from PIL import Image
# Path is used similar to "os" for listing files from directories, but with
# a pattern, e.g. "*.xml" to only list XML files
from pathlib import Path
# XML library for parsing the annotation XML files
import xml.etree.ElementTree as ET
# Pre-existing object detection model (Faster-RCNN)
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
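
The ToTensor, RandomHorizontalFlip, and Compose helpers used in the next cell, as well as train_one_epoch and collate_fn used later, come from the detection reference scripts that accompany the torchvision tutorial linked above. The cell below is a minimal sketch for fetching them in Colab; the repository branch and file paths are assumptions and may need adjusting to the installed torchvision version.

In [ ]:
# Download the torchvision detection reference utilities used by the tutorial
# (branch/paths are assumptions; adjust to the installed torchvision version)
!wget -q https://raw.githubusercontent.com/pytorch/vision/main/references/detection/engine.py
!wget -q https://raw.githubusercontent.com/pytorch/vision/main/references/detection/utils.py
!wget -q https://raw.githubusercontent.com/pytorch/vision/main/references/detection/transforms.py
!wget -q https://raw.githubusercontent.com/pytorch/vision/main/references/detection/coco_eval.py
!wget -q https://raw.githubusercontent.com/pytorch/vision/main/references/detection/coco_utils.py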
In [ ]:
# ToTensor, RandomHorizontalFlip, and Compose come from the torchvision detection
# reference scripts (references/detection/transforms.py) fetched in the cell above
from transforms import Compose, RandomHorizontalFlip, ToTensor

def get_transform(train):
    # Convert the PIL image to a tensor; during training, also randomly flip it
    transforms = []
    transforms.append(ToTensor())
    if train:
        transforms.append(RandomHorizontalFlip(0.5))
    return Compose(transforms)
In [ ]:
#######################################################################
# Define a "CustomDataset" class that defines __init__, __getitem__,
# and __len__ methods (functions). These are later used by Pytorch
# during training when data batches are being requested.
#######################################################################
class CustomDataset(object):
    # In the __init__ method, we receive as input the data folder.
    # Optionally, we can also define "transforms", which is a list
    # of functions that "transform" the data, e.g. for augmentation.
    # For example, one could use a RandomHorizontalFlip function from
    # torchvision to randomly flip images horizontally to create augmented
    # training data.
    def __init__(self, data_folder, transforms = None):
        self.data_folder = data_folder
        self.transforms = transforms
        # create a list of all image files, sorting them to
        # ensure that they are aligned with the XML files
        # NOTE: This requires that there is exactly one XML file for each
        # image file, with the same name!!
        self.image_file_names = list(sorted(Path(data_folder).glob("*.JPG")))
        self.xml_file_names = list(sorted(Path(data_folder).glob("*.xml")))
        print(f"{len(self.image_file_names)} JPG files")
        print(f"{len(self.xml_file_names)} xml files")

    def __getitem__(self, idx):
        # Create full image file path from data_folder and the current file name
        # referenced by idx
        image_path = Path(self.data_folder).joinpath(self.image_file_names[idx])
        # Load actual image data
        image = Image.open(image_path).convert("RGB")

        # Create full XML file path from data_folder and the current file name
        # referenced by idx
        label_path = Path(self.data_folder).joinpath(self.xml_file_names[idx])
        # Get root element of XML file (named "annotation" in the file)
        xml_tree = ET.parse(label_path).getroot()
        # Find all the "<object>..</object>" tags in the XML file
        objects = xml_tree.findall("object")

        boxes = []
        labels = []
        for entry in objects:
            # Get "<bndbox>..</bndbox>" tag inside the "<object>"
            bounding_box = entry.find("bndbox")
            # Get coordinates for bounding boxes
            xmin = int(bounding_box.find("xmin").text)
            xmax = int(bounding_box.find("xmax").text)
            ymin = int(bounding_box.find("ymin").text)
            ymax = int(bounding_box.find("ymax").text)
            # Convert the label text to a numeric class index:
            # 1 for female (D. suzukii)
            # 2 for male (D. suzukii)
            # 3 for drosophila (D. melanogaster)
            label_text = entry.find("name").text
            if label_text == "f":
                label = 1
            elif label_text == "m":
                label = 2
            elif label_text == "d":
                label = 3
            else:
                # Unexpected label text (annotation mistake); report it and skip this object
                print(f"Label text for image {image_path}: {label_text}")
                continue
            # Add bounding box and label for this object
            boxes.append([xmin, ymin, xmax, ymax])
            labels.append(label)

        # convert everything into a torch.Tensor
        boxes = torch.as_tensor(boxes, dtype=torch.float32)
        labels = torch.as_tensor(labels, dtype=torch.int64)
        image_id = torch.tensor([idx])
        area = (boxes[:, 3] - boxes[:, 1]) * (boxes[:, 2] - boxes[:, 0])

        # Build the target dictionary
        target = {}
        target["boxes"] = boxes
        target["labels"] = labels
        target["image_id"] = image_id
        target["area"] = area

        # Apply transforms, if there are any
        if self.transforms is not None:
            image, target = self.transforms(image, target)

        # Return image and target data
        return image, target

    # The __len__ method returns how many images/XML file pairs
    # there are in the dataset
    def __len__(self):
        return len(self.image_file_names)
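
As a quick sanity check (a sketch, assuming the data lives under /content/data/SWD as in the cells below), a single sample can be pulled from the dataset and inspected:

In [ ]:
# Pull one sample from the dataset and inspect its contents
sample_dataset = CustomDataset('/content/data/SWD', transforms=get_transform(train=False))
sample_image, sample_target = sample_dataset[0]
print(sample_image.shape)             # image tensor, e.g. torch.Size([3, H, W])
print(sample_target["boxes"].shape)   # (number of objects, 4) in [xmin, ymin, xmax, ymax]
print(sample_target["labels"])        # 1 = female, 2 = male, 3 = D. melanogaster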
In [ ]:
# replace the classifier head with a new one that has a
# user-defined number of classes (num_classes)

# 1 for female (Drosophila suzukii)
# 2 for male (Drosophila suzukii)
# 3 for drosophila (Drosophila melanogaster)
num_classes = 4  # 3 object classes (female, male, drosophila) + background

# load a Faster R-CNN object detection model pre-trained on COCO
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)

# get number of input features for the classifier
in_features = model.roi_heads.box_predictor.cls_score.in_features
# replace the pre-trained head with a new one
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)
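
FastRCNNPredictor adds two small linear layers on top of the box features: one producing a score for each of the num_classes classes (including background) and one producing four box-regression values per class. Printing the replaced head is a quick way to confirm the change:

In [ ]:
# Inspect the new box predictor head (should show 4 output classes)
print(model.roi_heads.box_predictor)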
In [ ]:
# Do training and evaluation:

# train on the GPU or on the CPU, if a GPU is not available
device = torch.device('cuda') if torch.cuda.is_available() else torch.device('cpu')

# use our dataset and the defined transformations
# (train=False here means no random-flip augmentation is applied)
dataset = CustomDataset('/content/data/SWD', get_transform(train=False))

# collate_fn comes from the torchvision detection reference scripts
# (references/detection/utils.py); it collects images/targets of varying sizes into lists
from utils import collate_fn

# define the training data loader
data_loader = torch.utils.data.DataLoader(
    dataset, batch_size=2, shuffle=True, num_workers=4,
    collate_fn=collate_fn)
353 JPG files
353 xml files
In [ ]:
# Sanity check: run one batch through the model in training and inference modes
images, targets = next(iter(data_loader))
images = list(image for image in images)
targets = [{k: v for k, v in t.items()} for t in targets]
output = model(images, targets)  # In training mode, returns a dict of losses
# For inference
model.eval()
x = [torch.rand(3, 300, 400), torch.rand(3, 500, 400)]
predictions = model(x)           # In eval mode, returns a list of detections
Downloading: "https://download.pytorch.org/models/fasterrcnn_resnet50_fpn_coco-258fb6c6.pth" to /root/.cache/torch/hub/checkpoints/fasterrcnn_resnet50_fpn_coco-258fb6c6.pth

/usr/local/lib/python3.6/dist-packages/torchvision/ops/boxes.py:101: UserWarning: This overload of nonzero is deprecated:
	nonzero()
Consider using one of the following signatures instead:
	nonzero(*, bool as_tuple) (Triggered internally at  /pytorch/torch/csrc/utils/python_arg_parser.cpp:766.)
  keep = keep.nonzero().squeeze(1)
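
In eval mode, torchvision detection models return one dictionary per input image with boxes, labels, and scores entries; the sketch below just inspects that structure for the dummy predictions above:

In [ ]:
# Each prediction is a dict with 'boxes' (N x 4), 'labels' (N), and 'scores' (N)
print(predictions[0].keys())
print(predictions[0]['boxes'].shape, predictions[0]['labels'].shape, predictions[0]['scores'].shape)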
In [ ]:
# move model to the right device
model.to(device)

# construct an optimizer
params = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.SGD(params, lr=0.005,
                            momentum=0.9, weight_decay=0.0005)
# and a learning rate scheduler
lr_scheduler = torch.optim.lr_scheduler.StepLR(optimizer,
                                               step_size=3,
                                               gamma=0.1)

# train_one_epoch comes from the torchvision detection reference scripts
# (references/detection/engine.py) that accompany the tutorial linked above
from engine import train_one_epoch

# let's train it for 10 epochs
num_epochs = 10

for epoch in range(num_epochs):
    # train for one epoch, printing every 10 iterations
    train_one_epoch(model, optimizer, data_loader, device, epoch, print_freq=10)
    # update the learning rate
    lr_scheduler.step()

print("That's it!")
Label text for image /content/data/SWD/DMS11204.JPG: \
Label text for image /content/data/SWD/DMS11204.JPG: \
Epoch: [0]  [  0/164]  eta: 0:09:47  lr: 0.000036  loss: 3.2520 (3.2520)  loss_classifier: 1.5379 (1.5379)  loss_box_reg: 0.9433 (0.9433)  loss_objectness: 0.6340 (0.6340)  loss_rpn_box_reg: 0.1369 (0.1369)  time: 3.5816  data: 1.5889  max mem: 2224
Epoch: [0]  [ 10/164]  eta: 0:02:06  lr: 0.000342  loss: 2.6526 (2.5904)  loss_classifier: 1.3684 (1.3017)  loss_box_reg: 0.9291 (0.9173)  loss_objectness: 0.1582 (0.2586)  loss_rpn_box_reg: 0.1208 (0.1126)  time: 0.8205  data: 0.1627  max mem: 2492
Epoch: [0]  [ 20/164]  eta: 0:01:39  lr: 0.000649  loss: 1.9042 (2.2086)  loss_classifier: 0.8742 (1.0600)  loss_box_reg: 0.9010 (0.8998)  loss_objectness: 0.0497 (0.1555)  loss_rpn_box_reg: 0.0809 (0.0933)  time: 0.5445  data: 0.0191  max mem: 2492
Epoch: [0]  [ 30/164]  eta: 0:01:26  lr: 0.000955  loss: 1.6057 (1.9991)  loss_classifier: 0.6632 (0.9215)  loss_box_reg: 0.8579 (0.8821)  loss_objectness: 0.0289 (0.1122)  loss_rpn_box_reg: 0.0619 (0.0833)  time: 0.5442  data: 0.0178  max mem: 2493
Epoch: [0]  [ 40/164]  eta: 0:01:16  lr: 0.001261  loss: 1.5293 (1.8648)  loss_classifier: 0.6140 (0.8350)  loss_box_reg: 0.8241 (0.8638)  loss_objectness: 0.0162 (0.0880)  loss_rpn_box_reg: 0.0599 (0.0780)  time: 0.5453  data: 0.0175  max mem: 2493
Epoch: [0]  [ 50/164]  eta: 0:01:08  lr: 0.001568  loss: 1.3462 (1.7480)  loss_classifier: 0.5134 (0.7686)  loss_box_reg: 0.7680 (0.8328)  loss_objectness: 0.0083 (0.0746)  loss_rpn_box_reg: 0.0526 (0.0721)  time: 0.5469  data: 0.0176  max mem: 2493
Epoch: [0]  [ 60/164]  eta: 0:01:01  lr: 0.001874  loss: 1.1742 (1.6520)  loss_classifier: 0.5008 (0.7235)  loss_box_reg: 0.6371 (0.7945)  loss_objectness: 0.0085 (0.0650)  loss_rpn_box_reg: 0.0475 (0.0690)  time: 0.5490  data: 0.0176  max mem: 2493
Epoch: [0]  [ 70/164]  eta: 0:00:55  lr: 0.002181  loss: 1.0983 (1.5605)  loss_classifier: 0.4785 (0.6872)  loss_box_reg: 0.5301 (0.7492)  loss_objectness: 0.0130 (0.0581)  loss_rpn_box_reg: 0.0475 (0.0660)  time: 0.5519  data: 0.0178  max mem: 2493
Label text for image /content/data/SWD/DMS11252.JPG: mw
Label text for image /content/data/SWD/DMS11252.JPG: mw
Epoch: [0]  [ 80/164]  eta: 0:00:49  lr: 0.002487  loss: 1.0258 (1.4938)  loss_classifier: 0.4672 (0.6592)  loss_box_reg: 0.4981 (0.7178)  loss_objectness: 0.0146 (0.0527)  loss_rpn_box_reg: 0.0439 (0.0641)  time: 0.5531  data: 0.0178  max mem: 2493
Epoch: [0]  [ 90/164]  eta: 0:00:43  lr: 0.002794  loss: 0.9719 (1.4292)  loss_classifier: 0.4374 (0.6344)  loss_box_reg: 0.4496 (0.6859)  loss_objectness: 0.0103 (0.0481)  loss_rpn_box_reg: 0.0411 (0.0607)  time: 0.5523  data: 0.0178  max mem: 2493
Epoch: [0]  [100/164]  eta: 0:00:37  lr: 0.003100  loss: 0.8946 (1.3742)  loss_classifier: 0.4374 (0.6153)  loss_box_reg: 0.4166 (0.6561)  loss_objectness: 0.0067 (0.0447)  loss_rpn_box_reg: 0.0345 (0.0581)  time: 0.5514  data: 0.0176  max mem: 2493
Epoch: [0]  [110/164]  eta: 0:00:31  lr: 0.003407  loss: 0.8429 (1.3246)  loss_classifier: 0.4241 (0.5980)  loss_box_reg: 0.3720 (0.6296)  loss_objectness: 0.0063 (0.0413)  loss_rpn_box_reg: 0.0320 (0.0557)  time: 0.5532  data: 0.0174  max mem: 2493
Epoch: [0]  [120/164]  eta: 0:00:25  lr: 0.003713  loss: 0.8444 (1.2842)  loss_classifier: 0.4217 (0.5826)  loss_box_reg: 0.3775 (0.6089)  loss_objectness: 0.0032 (0.0391)  loss_rpn_box_reg: 0.0306 (0.0536)  time: 0.5564  data: 0.0179  max mem: 2493
Epoch: [0]  [130/164]  eta: 0:00:19  lr: 0.004019  loss: 0.8444 (1.2497)  loss_classifier: 0.4045 (0.5693)  loss_box_reg: 0.3788 (0.5907)  loss_objectness: 0.0070 (0.0370)  loss_rpn_box_reg: 0.0362 (0.0527)  time: 0.5573  data: 0.0177  max mem: 2493
Epoch: [0]  [140/164]  eta: 0:00:13  lr: 0.004326  loss: 0.8157 (1.2194)  loss_classifier: 0.3972 (0.5571)  loss_box_reg: 0.3533 (0.5751)  loss_objectness: 0.0081 (0.0354)  loss_rpn_box_reg: 0.0420 (0.0517)  time: 0.5577  data: 0.0169  max mem: 2493
Epoch: [0]  [150/164]  eta: 0:00:08  lr: 0.004632  loss: 0.7894 (1.1912)  loss_classifier: 0.3816 (0.5461)  loss_box_reg: 0.3748 (0.5613)  loss_objectness: 0.0050 (0.0335)  loss_rpn_box_reg: 0.0358 (0.0503)  time: 0.5573  data: 0.0170  max mem: 2493
Epoch: [0]  [160/164]  eta: 0:00:02  lr: 0.004939  loss: 0.7516 (1.1622)  loss_classifier: 0.3505 (0.5331)  loss_box_reg: 0.3608 (0.5478)  loss_objectness: 0.0045 (0.0319)  loss_rpn_box_reg: 0.0341 (0.0493)  time: 0.5552  data: 0.0175  max mem: 2493
Epoch: [0]  [163/164]  eta: 0:00:00  lr: 0.005000  loss: 0.7539 (1.1542)  loss_classifier: 0.3505 (0.5297)  loss_box_reg: 0.3659 (0.5439)  loss_objectness: 0.0046 (0.0315)  loss_rpn_box_reg: 0.0351 (0.0491)  time: 0.5545  data: 0.0176  max mem: 2493
Epoch: [0] Total time: 0:01:33 (0.5712 s / it)
Epoch: [1]  [  0/164]  eta: 0:06:38  lr: 0.005000  loss: 0.7856 (0.7856)  loss_classifier: 0.3671 (0.3671)  loss_box_reg: 0.3856 (0.3856)  loss_objectness: 0.0031 (0.0031)  loss_rpn_box_reg: 0.0298 (0.0298)  time: 2.4297  data: 1.6495  max mem: 2493
Epoch: [1]  [ 10/164]  eta: 0:01:58  lr: 0.005000  loss: 0.6834 (0.6966)  loss_classifier: 0.3175 (0.3169)  loss_box_reg: 0.3395 (0.3402)  loss_objectness: 0.0055 (0.0069)  loss_rpn_box_reg: 0.0333 (0.0325)  time: 0.7664  data: 0.1747  max mem: 2493
Epoch: [1]  [ 20/164]  eta: 0:01:36  lr: 0.005000  loss: 0.6834 (0.7064)  loss_classifier: 0.3126 (0.3245)  loss_box_reg: 0.3332 (0.3380)  loss_objectness: 0.0070 (0.0089)  loss_rpn_box_reg: 0.0333 (0.0351)  time: 0.5813  data: 0.0220  max mem: 2493
Epoch: [1]  [ 30/164]  eta: 0:01:25  lr: 0.005000  loss: 0.7218 (0.7166)  loss_classifier: 0.3156 (0.3243)  loss_box_reg: 0.3634 (0.3502)  loss_objectness: 0.0078 (0.0086)  loss_rpn_box_reg: 0.0303 (0.0335)  time: 0.5628  data: 0.0170  max mem: 2493
Epoch: [1]  [ 40/164]  eta: 0:01:16  lr: 0.005000  loss: 0.7075 (0.6964)  loss_classifier: 0.3062 (0.3136)  loss_box_reg: 0.3549 (0.3440)  loss_objectness: 0.0042 (0.0075)  loss_rpn_box_reg: 0.0224 (0.0312)  time: 0.5631  data: 0.0173  max mem: 2493
Epoch: [1]  [ 50/164]  eta: 0:01:09  lr: 0.005000  loss: 0.6048 (0.6826)  loss_classifier: 0.2668 (0.3066)  loss_box_reg: 0.3211 (0.3389)  loss_objectness: 0.0025 (0.0070)  loss_rpn_box_reg: 0.0237 (0.0301)  time: 0.5644  data: 0.0173  max mem: 2493
Label text for image /content/data/SWD/DMS11204.JPG: \
Label text for image /content/data/SWD/DMS11204.JPG: \
Epoch: [1]  [ 60/164]  eta: 0:01:02  lr: 0.005000  loss: 0.6400 (0.6758)  loss_classifier: 0.2844 (0.3031)  loss_box_reg: 0.3103 (0.3356)  loss_objectness: 0.0030 (0.0068)  loss_rpn_box_reg: 0.0271 (0.0303)  time: 0.5669  data: 0.0174  max mem: 2493
Epoch: [1]  [ 70/164]  eta: 0:00:56  lr: 0.005000  loss: 0.6381 (0.6670)  loss_classifier: 0.2590 (0.2946)  loss_box_reg: 0.3123 (0.3354)  loss_objectness: 0.0039 (0.0068)  loss_rpn_box_reg: 0.0304 (0.0302)  time: 0.5689  data: 0.0172  max mem: 2493
Epoch: [1]  [ 80/164]  eta: 0:00:49  lr: 0.005000  loss: 0.5944 (0.6563)  loss_classifier: 0.2366 (0.2891)  loss_box_reg: 0.3136 (0.3310)  loss_objectness: 0.0039 (0.0067)  loss_rpn_box_reg: 0.0281 (0.0294)  time: 0.5683  data: 0.0178  max mem: 2493
Epoch: [1]  [ 90/164]  eta: 0:00:43  lr: 0.005000  loss: 0.5970 (0.6546)  loss_classifier: 0.2588 (0.2863)  loss_box_reg: 0.3136 (0.3314)  loss_objectness: 0.0071 (0.0072)  loss_rpn_box_reg: 0.0279 (0.0297)  time: 0.5683  data: 0.0178  max mem: 2493
Epoch: [1]  [100/164]  eta: 0:00:37  lr: 0.005000  loss: 0.6326 (0.6513)  loss_classifier: 0.2674 (0.2854)  loss_box_reg: 0.3245 (0.3296)  loss_objectness: 0.0049 (0.0069)  loss_rpn_box_reg: 0.0279 (0.0295)  time: 0.5696  data: 0.0173  max mem: 2493
Epoch: [1]  [110/164]  eta: 0:00:31  lr: 0.005000  loss: 0.5900 (0.6429)  loss_classifier: 0.2674 (0.2817)  loss_box_reg: 0.3044 (0.3257)  loss_objectness: 0.0027 (0.0067)  loss_rpn_box_reg: 0.0238 (0.0288)  time: 0.5699  data: 0.0180  max mem: 2493
Epoch: [1]  [120/164]  eta: 0:00:25  lr: 0.005000  loss: 0.5752 (0.6405)  loss_classifier: 0.2480 (0.2803)  loss_box_reg: 0.3045 (0.3247)  loss_objectness: 0.0032 (0.0066)  loss_rpn_box_reg: 0.0272 (0.0290)  time: 0.5704  data: 0.0181  max mem: 2493
Epoch: [1]  [130/164]  eta: 0:00:19  lr: 0.005000  loss: 0.5910 (0.6387)  loss_classifier: 0.2573 (0.2796)  loss_box_reg: 0.3057 (0.3237)  loss_objectness: 0.0032 (0.0064)  loss_rpn_box_reg: 0.0282 (0.0290)  time: 0.5701  data: 0.0177  max mem: 2493
Epoch: [1]  [140/164]  eta: 0:00:13  lr: 0.005000  loss: 0.6283 (0.6369)  loss_classifier: 0.2573 (0.2780)  loss_box_reg: 0.3058 (0.3233)  loss_objectness: 0.0034 (0.0064)  loss_rpn_box_reg: 0.0313 (0.0292)  time: 0.5693  data: 0.0175  max mem: 2493
Label text for image /content/data/SWD/DMS11252.JPG: mw
Label text for image /content/data/SWD/DMS11252.JPG: mw
Epoch: [1]  [150/164]  eta: 0:00:08  lr: 0.005000  loss: 0.6355 (0.6344)  loss_classifier: 0.2521 (0.2776)  loss_box_reg: 0.3058 (0.3215)  loss_objectness: 0.0036 (0.0063)  loss_rpn_box_reg: 0.0290 (0.0290)  time: 0.5719  data: 0.0172  max mem: 2493
Epoch: [1]  [160/164]  eta: 0:00:02  lr: 0.005000  loss: 0.5667 (0.6317)  loss_classifier: 0.2288 (0.2755)  loss_box_reg: 0.3117 (0.3210)  loss_objectness: 0.0038 (0.0062)  loss_rpn_box_reg: 0.0274 (0.0290)  time: 0.5719  data: 0.0171  max mem: 2493
Epoch: [1]  [163/164]  eta: 0:00:00  lr: 0.005000  loss: 0.5814 (0.6306)  loss_classifier: 0.2288 (0.2747)  loss_box_reg: 0.3130 (0.3207)  loss_objectness: 0.0038 (0.0062)  loss_rpn_box_reg: 0.0273 (0.0289)  time: 0.5701  data: 0.0172  max mem: 2493
Epoch: [1] Total time: 0:01:35 (0.5826 s / it)
Label text for image /content/data/SWD/DMS11252.JPG: mw
Label text for image /content/data/SWD/DMS11252.JPG: mw
Epoch: [2]  [  0/164]  eta: 0:06:45  lr: 0.005000  loss: 0.5739 (0.5739)  loss_classifier: 0.2494 (0.2494)  loss_box_reg: 0.2924 (0.2924)  loss_objectness: 0.0022 (0.0022)  loss_rpn_box_reg: 0.0298 (0.0298)  time: 2.4738  data: 1.6905  max mem: 2493
Epoch: [2]  [ 10/164]  eta: 0:02:00  lr: 0.005000  loss: 0.5077 (0.5195)  loss_classifier: 0.1930 (0.1921)  loss_box_reg: 0.2935 (0.2973)  loss_objectness: 0.0027 (0.0035)  loss_rpn_box_reg: 0.0234 (0.0266)  time: 0.7855  data: 0.1754  max mem: 2493
Epoch: [2]  [ 20/164]  eta: 0:01:38  lr: 0.005000  loss: 0.5077 (0.5217)  loss_classifier: 0.1942 (0.1977)  loss_box_reg: 0.2935 (0.2938)  loss_objectness: 0.0027 (0.0037)  loss_rpn_box_reg: 0.0233 (0.0266)  time: 0.5963  data: 0.0208  max mem: 2493
Epoch: [2]  [ 30/164]  eta: 0:01:27  lr: 0.005000  loss: 0.4972 (0.5135)  loss_classifier: 0.1893 (0.1939)  loss_box_reg: 0.2817 (0.2891)  loss_objectness: 0.0022 (0.0036)  loss_rpn_box_reg: 0.0261 (0.0269)  time: 0.5760  data: 0.0173  max mem: 2493
Epoch: [2]  [ 40/164]  eta: 0:01:18  lr: 0.005000  loss: 0.5030 (0.5133)  loss_classifier: 0.1826 (0.1923)  loss_box_reg: 0.2858 (0.2906)  loss_objectness: 0.0029 (0.0037)  loss_rpn_box_reg: 0.0268 (0.0266)  time: 0.5752  data: 0.0172  max mem: 2493
Epoch: [2]  [ 50/164]  eta: 0:01:10  lr: 0.005000  loss: 0.5187 (0.5117)  loss_classifier: 0.1907 (0.1927)  loss_box_reg: 0.2910 (0.2898)  loss_objectness: 0.0021 (0.0037)  loss_rpn_box_reg: 0.0218 (0.0255)  time: 0.5766  data: 0.0176  max mem: 2493
Epoch: [2]  [ 60/164]  eta: 0:01:03  lr: 0.005000  loss: 0.5133 (0.5164)  loss_classifier: 0.1907 (0.1938)  loss_box_reg: 0.2959 (0.2935)  loss_objectness: 0.0018 (0.0039)  loss_rpn_box_reg: 0.0215 (0.0252)  time: 0.5788  data: 0.0178  max mem: 2493
Epoch: [2]  [ 70/164]  eta: 0:00:57  lr: 0.005000  loss: 0.5092 (0.5126)  loss_classifier: 0.1866 (0.1920)  loss_box_reg: 0.2892 (0.2911)  loss_objectness: 0.0023 (0.0040)  loss_rpn_box_reg: 0.0241 (0.0255)  time: 0.5771  data: 0.0176  max mem: 2493
Epoch: [2]  [ 80/164]  eta: 0:00:50  lr: 0.005000  loss: 0.4691 (0.5064)  loss_classifier: 0.1792 (0.1901)  loss_box_reg: 0.2663 (0.2874)  loss_objectness: 0.0025 (0.0040)  loss_rpn_box_reg: 0.0241 (0.0249)  time: 0.5766  data: 0.0178  max mem: 2493
Epoch: [2]  [ 90/164]  eta: 0:00:44  lr: 0.005000  loss: 0.4772 (0.5053)  loss_classifier: 0.1860 (0.1901)  loss_box_reg: 0.2598 (0.2858)  loss_objectness: 0.0027 (0.0041)  loss_rpn_box_reg: 0.0221 (0.0252)  time: 0.5791  data: 0.0189  max mem: 2493
Epoch: [2]  [100/164]  eta: 0:00:38  lr: 0.005000  loss: 0.5009 (0.5082)  loss_classifier: 0.1943 (0.1917)  loss_box_reg: 0.2834 (0.2861)  loss_objectness: 0.0036 (0.0042)  loss_rpn_box_reg: 0.0305 (0.0262)  time: 0.5806  data: 0.0188  max mem: 2493
Label text for image /content/data/SWD/DMS11204.JPG: \
Label text for image /content/data/SWD/DMS11204.JPG: \
Epoch: [2]  [110/164]  eta: 0:00:32  lr: 0.005000  loss: 0.5169 (0.5102)  loss_classifier: 0.1943 (0.1937)  loss_box_reg: 0.2854 (0.2861)  loss_objectness: 0.0034 (0.0042)  loss_rpn_box_reg: 0.0293 (0.0262)  time: 0.5803  data: 0.0181  max mem: 2493
Epoch: [2]  [120/164]  eta: 0:00:26  lr: 0.005000  loss: 0.5094 (0.5105)  loss_classifier: 0.1867 (0.1934)  loss_box_reg: 0.2846 (0.2865)  loss_objectness: 0.0042 (0.0043)  loss_rpn_box_reg: 0.0264 (0.0264)  time: 0.5809  data: 0.0178  max mem: 2493
Epoch: [2]  [130/164]  eta: 0:00:20  lr: 0.005000  loss: 0.5161 (0.5118)  loss_classifier: 0.2063 (0.1946)  loss_box_reg: 0.2879 (0.2868)  loss_objectness: 0.0035 (0.0043)  loss_rpn_box_reg: 0.0244 (0.0261)  time: 0.5824  data: 0.0178  max mem: 2493
Epoch: [2]  [140/164]  eta: 0:00:14  lr: 0.005000  loss: 0.5059 (0.5100)  loss_classifier: 0.1958 (0.1938)  loss_box_reg: 0.2755 (0.2859)  loss_objectness: 0.0035 (0.0043)  loss_rpn_box_reg: 0.0239 (0.0261)  time: 0.5848  data: 0.0186  max mem: 2493
Epoch: [2]  [150/164]  eta: 0:00:08  lr: 0.005000  loss: 0.4775 (0.5069)  loss_classifier: 0.1646 (0.1920)  loss_box_reg: 0.2738 (0.2848)  loss_objectness: 0.0037 (0.0043)  loss_rpn_box_reg: 0.0238 (0.0258)  time: 0.5855  data: 0.0183  max mem: 2493
Epoch: [2]  [160/164]  eta: 0:00:02  lr: 0.005000  loss: 0.4740 (0.5050)  loss_classifier: 0.1638 (0.1907)  loss_box_reg: 0.2849 (0.2846)  loss_objectness: 0.0021 (0.0042)  loss_rpn_box_reg: 0.0199 (0.0255)  time: 0.5800  data: 0.0177  max mem: 2493
Epoch: [2]  [163/164]  eta: 0:00:00  lr: 0.005000  loss: 0.4740 (0.5047)  loss_classifier: 0.1691 (0.1902)  loss_box_reg: 0.2869 (0.2847)  loss_objectness: 0.0020 (0.0042)  loss_rpn_box_reg: 0.0226 (0.0255)  time: 0.5787  data: 0.0177  max mem: 2493
Epoch: [2] Total time: 0:01:37 (0.5944 s / it)
Epoch: [3]  [  0/164]  eta: 0:06:42  lr: 0.000500  loss: 0.4314 (0.4314)  loss_classifier: 0.1722 (0.1722)  loss_box_reg: 0.2368 (0.2368)  loss_objectness: 0.0028 (0.0028)  loss_rpn_box_reg: 0.0195 (0.0195)  time: 2.4521  data: 1.6377  max mem: 2493
Epoch: [3]  [ 10/164]  eta: 0:02:02  lr: 0.000500  loss: 0.4307 (0.4236)  loss_classifier: 0.1617 (0.1611)  loss_box_reg: 0.2368 (0.2416)  loss_objectness: 0.0013 (0.0029)  loss_rpn_box_reg: 0.0195 (0.0180)  time: 0.7932  data: 0.1778  max mem: 2493
Epoch: [3]  [ 20/164]  eta: 0:01:39  lr: 0.000500  loss: 0.4136 (0.4105)  loss_classifier: 0.1427 (0.1537)  loss_box_reg: 0.2344 (0.2355)  loss_objectness: 0.0012 (0.0027)  loss_rpn_box_reg: 0.0189 (0.0187)  time: 0.6050  data: 0.0247  max mem: 2493
Epoch: [3]  [ 30/164]  eta: 0:01:28  lr: 0.000500  loss: 0.3907 (0.4047)  loss_classifier: 0.1339 (0.1508)  loss_box_reg: 0.2223 (0.2318)  loss_objectness: 0.0010 (0.0030)  loss_rpn_box_reg: 0.0188 (0.0191)  time: 0.5836  data: 0.0175  max mem: 2493
Label text for image /content/data/SWD/DMS11204.JPG: \
Label text for image /content/data/SWD/DMS11204.JPG: \
Epoch: [3]  [ 40/164]  eta: 0:01:19  lr: 0.000500  loss: 0.3600 (0.4018)  loss_classifier: 0.1312 (0.1499)  loss_box_reg: 0.2172 (0.2304)  loss_objectness: 0.0014 (0.0029)  loss_rpn_box_reg: 0.0178 (0.0186)  time: 0.5830  data: 0.0173  max mem: 2493
Epoch: [3]  [ 50/164]  eta: 0:01:11  lr: 0.000500  loss: 0.3869 (0.4047)  loss_classifier: 0.1455 (0.1506)  loss_box_reg: 0.2204 (0.2324)  loss_objectness: 0.0015 (0.0029)  loss_rpn_box_reg: 0.0168 (0.0189)  time: 0.5815  data: 0.0173  max mem: 2493
Epoch: [3]  [ 60/164]  eta: 0:01:04  lr: 0.000500  loss: 0.3788 (0.3976)  loss_classifier: 0.1427 (0.1466)  loss_box_reg: 0.2198 (0.2294)  loss_objectness: 0.0018 (0.0032)  loss_rpn_box_reg: 0.0168 (0.0184)  time: 0.5810  data: 0.0174  max mem: 2493
Epoch: [3]  [ 70/164]  eta: 0:00:57  lr: 0.000500  loss: 0.3732 (0.3963)  loss_classifier: 0.1284 (0.1458)  loss_box_reg: 0.2173 (0.2290)  loss_objectness: 0.0009 (0.0033)  loss_rpn_box_reg: 0.0161 (0.0182)  time: 0.5801  data: 0.0173  max mem: 2493
Label text for image /content/data/SWD/DMS11252.JPG: mw
Label text for image /content/data/SWD/DMS11252.JPG: mw
Epoch: [3]  [ 80/164]  eta: 0:00:51  lr: 0.000500  loss: 0.3815 (0.3949)  loss_classifier: 0.1379 (0.1452)  loss_box_reg: 0.2184 (0.2284)  loss_objectness: 0.0013 (0.0033)  loss_rpn_box_reg: 0.0162 (0.0180)  time: 0.5805  data: 0.0173  max mem: 2493
Epoch: [3]  [ 90/164]  eta: 0:00:44  lr: 0.000500  loss: 0.3953 (0.3950)  loss_classifier: 0.1356 (0.1438)  loss_box_reg: 0.2316 (0.2296)  loss_objectness: 0.0017 (0.0034)  loss_rpn_box_reg: 0.0190 (0.0183)  time: 0.5825  data: 0.0176  max mem: 2493
Epoch: [3]  [100/164]  eta: 0:00:38  lr: 0.000500  loss: 0.3894 (0.3915)  loss_classifier: 0.1320 (0.1423)  loss_box_reg: 0.2316 (0.2277)  loss_objectness: 0.0022 (0.0034)  loss_rpn_box_reg: 0.0192 (0.0181)  time: 0.5845  data: 0.0175  max mem: 2493
Epoch: [3]  [110/164]  eta: 0:00:32  lr: 0.000500  loss: 0.3537 (0.3871)  loss_classifier: 0.1265 (0.1406)  loss_box_reg: 0.2042 (0.2254)  loss_objectness: 0.0016 (0.0033)  loss_rpn_box_reg: 0.0169 (0.0178)  time: 0.5871  data: 0.0177  max mem: 2493
Epoch: [3]  [120/164]  eta: 0:00:26  lr: 0.000500  loss: 0.3574 (0.3865)  loss_classifier: 0.1250 (0.1405)  loss_box_reg: 0.2111 (0.2250)  loss_objectness: 0.0016 (0.0034)  loss_rpn_box_reg: 0.0139 (0.0176)  time: 0.5894  data: 0.0179  max mem: 2493
Epoch: [3]  [130/164]  eta: 0:00:20  lr: 0.000500  loss: 0.3749 (0.3853)  loss_classifier: 0.1320 (0.1403)  loss_box_reg: 0.2172 (0.2242)  loss_objectness: 0.0020 (0.0033)  loss_rpn_box_reg: 0.0163 (0.0176)  time: 0.5913  data: 0.0178  max mem: 2493
Epoch: [3]  [140/164]  eta: 0:00:14  lr: 0.000500  loss: 0.3749 (0.3854)  loss_classifier: 0.1263 (0.1401)  loss_box_reg: 0.2195 (0.2245)  loss_objectness: 0.0015 (0.0032)  loss_rpn_box_reg: 0.0178 (0.0176)  time: 0.5918  data: 0.0177  max mem: 2493
Epoch: [3]  [150/164]  eta: 0:00:08  lr: 0.000500  loss: 0.3789 (0.3851)  loss_classifier: 0.1273 (0.1398)  loss_box_reg: 0.2226 (0.2246)  loss_objectness: 0.0015 (0.0031)  loss_rpn_box_reg: 0.0177 (0.0176)  time: 0.5889  data: 0.0176  max mem: 2493
Epoch: [3]  [160/164]  eta: 0:00:02  lr: 0.000500  loss: 0.3945 (0.3876)  loss_classifier: 0.1525 (0.1410)  loss_box_reg: 0.2304 (0.2257)  loss_objectness: 0.0016 (0.0031)  loss_rpn_box_reg: 0.0184 (0.0178)  time: 0.5861  data: 0.0179  max mem: 2493
Epoch: [3]  [163/164]  eta: 0:00:00  lr: 0.000500  loss: 0.3914 (0.3873)  loss_classifier: 0.1434 (0.1409)  loss_box_reg: 0.2319 (0.2256)  loss_objectness: 0.0016 (0.0031)  loss_rpn_box_reg: 0.0191 (0.0178)  time: 0.5839  data: 0.0185  max mem: 2493
Epoch: [3] Total time: 0:01:38 (0.6001 s / it)
Epoch: [4]  [  0/164]  eta: 0:06:46  lr: 0.000500  loss: 0.3813 (0.3813)  loss_classifier: 0.1299 (0.1299)  loss_box_reg: 0.2264 (0.2264)  loss_objectness: 0.0015 (0.0015)  loss_rpn_box_reg: 0.0235 (0.0235)  time: 2.4809  data: 1.6828  max mem: 2493
Label text for image /content/data/SWD/DMS11204.JPG: \
Label text for image /content/data/SWD/DMS11204.JPG: \
Epoch: [4]  [ 10/164]  eta: 0:02:02  lr: 0.000500  loss: 0.3437 (0.3761)  loss_classifier: 0.1200 (0.1307)  loss_box_reg: 0.2264 (0.2236)  loss_objectness: 0.0015 (0.0021)  loss_rpn_box_reg: 0.0173 (0.0196)  time: 0.7935  data: 0.1757  max mem: 2493
Epoch: [4]  [ 20/164]  eta: 0:01:39  lr: 0.000500  loss: 0.3630 (0.3713)  loss_classifier: 0.1218 (0.1301)  loss_box_reg: 0.2221 (0.2206)  loss_objectness: 0.0014 (0.0023)  loss_rpn_box_reg: 0.0172 (0.0182)  time: 0.6047  data: 0.0215  max mem: 2493
Epoch: [4]  [ 30/164]  eta: 0:01:28  lr: 0.000500  loss: 0.3860 (0.3771)  loss_classifier: 0.1351 (0.1344)  loss_box_reg: 0.2293 (0.2218)  loss_objectness: 0.0016 (0.0029)  loss_rpn_box_reg: 0.0180 (0.0181)  time: 0.5855  data: 0.0181  max mem: 2493
Epoch: [4]  [ 40/164]  eta: 0:01:19  lr: 0.000500  loss: 0.3808 (0.3786)  loss_classifier: 0.1428 (0.1363)  loss_box_reg: 0.2291 (0.2215)  loss_objectness: 0.0020 (0.0030)  loss_rpn_box_reg: 0.0180 (0.0179)  time: 0.5866  data: 0.0183  max mem: 2493
Epoch: [4]  [ 50/164]  eta: 0:01:11  lr: 0.000500  loss: 0.3801 (0.3767)  loss_classifier: 0.1427 (0.1367)  loss_box_reg: 0.2172 (0.2198)  loss_objectness: 0.0012 (0.0027)  loss_rpn_box_reg: 0.0169 (0.0175)  time: 0.5855  data: 0.0180  max mem: 2493
Epoch: [4]  [ 60/164]  eta: 0:01:04  lr: 0.000500  loss: 0.3794 (0.3780)  loss_classifier: 0.1333 (0.1358)  loss_box_reg: 0.2211 (0.2217)  loss_objectness: 0.0013 (0.0028)  loss_rpn_box_reg: 0.0167 (0.0177)  time: 0.5855  data: 0.0178  max mem: 2493
Epoch: [4]  [ 70/164]  eta: 0:00:58  lr: 0.000500  loss: 0.3794 (0.3769)  loss_classifier: 0.1259 (0.1352)  loss_box_reg: 0.2282 (0.2216)  loss_objectness: 0.0013 (0.0026)  loss_rpn_box_reg: 0.0171 (0.0175)  time: 0.5862  data: 0.0179  max mem: 2493
Epoch: [4]  [ 80/164]  eta: 0:00:51  lr: 0.000500  loss: 0.3794 (0.3794)  loss_classifier: 0.1284 (0.1369)  loss_box_reg: 0.2149 (0.2221)  loss_objectness: 0.0010 (0.0027)  loss_rpn_box_reg: 0.0171 (0.0178)  time: 0.5874  data: 0.0178  max mem: 2493
Epoch: [4]  [ 90/164]  eta: 0:00:45  lr: 0.000500  loss: 0.3796 (0.3786)  loss_classifier: 0.1456 (0.1369)  loss_box_reg: 0.2143 (0.2213)  loss_objectness: 0.0011 (0.0028)  loss_rpn_box_reg: 0.0191 (0.0177)  time: 0.5913  data: 0.0180  max mem: 2493
Epoch: [4]  [100/164]  eta: 0:00:39  lr: 0.000500  loss: 0.3585 (0.3742)  loss_classifier: 0.1247 (0.1352)  loss_box_reg: 0.2106 (0.2190)  loss_objectness: 0.0009 (0.0026)  loss_rpn_box_reg: 0.0145 (0.0173)  time: 0.5935  data: 0.0191  max mem: 2493
Label text for image /content/data/SWD/DMS11252.JPG: mw
Label text for image /content/data/SWD/DMS11252.JPG: mw
Epoch: [4]  [110/164]  eta: 0:00:32  lr: 0.000500  loss: 0.3232 (0.3707)  loss_classifier: 0.1191 (0.1344)  loss_box_reg: 0.1961 (0.2167)  loss_objectness: 0.0010 (0.0027)  loss_rpn_box_reg: 0.0128 (0.0170)  time: 0.5911  data: 0.0188  max mem: 2493
Epoch: [4]  [120/164]  eta: 0:00:26  lr: 0.000500  loss: 0.3426 (0.3706)  loss_classifier: 0.1164 (0.1342)  loss_box_reg: 0.2037 (0.2167)  loss_objectness: 0.0016 (0.0027)  loss_rpn_box_reg: 0.0137 (0.0170)  time: 0.5887  data: 0.0175  max mem: 2493
Epoch: [4]  [130/164]  eta: 0:00:20  lr: 0.000500  loss: 0.3347 (0.3687)  loss_classifier: 0.1164 (0.1329)  loss_box_reg: 0.2054 (0.2162)  loss_objectness: 0.0012 (0.0026)  loss_rpn_box_reg: 0.0175 (0.0169)  time: 0.5868  data: 0.0175  max mem: 2493
Epoch: [4]  [140/164]  eta: 0:00:14  lr: 0.000500  loss: 0.3332 (0.3690)  loss_classifier: 0.1219 (0.1334)  loss_box_reg: 0.1980 (0.2162)  loss_objectness: 0.0008 (0.0026)  loss_rpn_box_reg: 0.0151 (0.0169)  time: 0.5863  data: 0.0178  max mem: 2493
Epoch: [4]  [150/164]  eta: 0:00:08  lr: 0.000500  loss: 0.3512 (0.3689)  loss_classifier: 0.1230 (0.1331)  loss_box_reg: 0.2079 (0.2165)  loss_objectness: 0.0007 (0.0025)  loss_rpn_box_reg: 0.0146 (0.0168)  time: 0.5875  data: 0.0180  max mem: 2493
Epoch: [4]  [160/164]  eta: 0:00:02  lr: 0.000500  loss: 0.3512 (0.3692)  loss_classifier: 0.1207 (0.1333)  loss_box_reg: 0.2111 (0.2167)  loss_objectness: 0.0010 (0.0025)  loss_rpn_box_reg: 0.0140 (0.0168)  time: 0.5839  data: 0.0183  max mem: 2493
Epoch: [4]  [163/164]  eta: 0:00:00  lr: 0.000500  loss: 0.3567 (0.3699)  loss_classifier: 0.1208 (0.1336)  loss_box_reg: 0.2155 (0.2170)  loss_objectness: 0.0011 (0.0025)  loss_rpn_box_reg: 0.0146 (0.0168)  time: 0.5824  data: 0.0179  max mem: 2493
Epoch: [4] Total time: 0:01:38 (0.6022 s / it)
Epoch: [5]  [  0/164]  eta: 0:06:30  lr: 0.000500  loss: 0.2833 (0.2833)  loss_classifier: 0.1028 (0.1028)  loss_box_reg: 0.1662 (0.1662)  loss_objectness: 0.0006 (0.0006)  loss_rpn_box_reg: 0.0138 (0.0138)  time: 2.3790  data: 1.5833  max mem: 2493
Epoch: [5]  [ 10/164]  eta: 0:02:00  lr: 0.000500  loss: 0.3455 (0.3616)  loss_classifier: 0.1250 (0.1276)  loss_box_reg: 0.2195 (0.2123)  loss_objectness: 0.0019 (0.0040)  loss_rpn_box_reg: 0.0169 (0.0176)  time: 0.7851  data: 0.1658  max mem: 2493
Epoch: [5]  [ 20/164]  eta: 0:01:39  lr: 0.000500  loss: 0.3455 (0.3559)  loss_classifier: 0.1218 (0.1245)  loss_box_reg: 0.2157 (0.2105)  loss_objectness: 0.0020 (0.0034)  loss_rpn_box_reg: 0.0161 (0.0175)  time: 0.6057  data: 0.0208  max mem: 2493
Epoch: [5]  [ 30/164]  eta: 0:01:27  lr: 0.000500  loss: 0.3396 (0.3516)  loss_classifier: 0.1136 (0.1221)  loss_box_reg: 0.2086 (0.2099)  loss_objectness: 0.0010 (0.0030)  loss_rpn_box_reg: 0.0143 (0.0166)  time: 0.5849  data: 0.0172  max mem: 2493
Epoch: [5]  [ 40/164]  eta: 0:01:19  lr: 0.000500  loss: 0.3385 (0.3515)  loss_classifier: 0.1149 (0.1225)  loss_box_reg: 0.2059 (0.2098)  loss_objectness: 0.0008 (0.0026)  loss_rpn_box_reg: 0.0142 (0.0165)  time: 0.5842  data: 0.0167  max mem: 2493
Epoch: [5]  [ 50/164]  eta: 0:01:11  lr: 0.000500  loss: 0.3374 (0.3531)  loss_classifier: 0.1210 (0.1232)  loss_box_reg: 0.2088 (0.2109)  loss_objectness: 0.0008 (0.0025)  loss_rpn_box_reg: 0.0153 (0.0165)  time: 0.5857  data: 0.0169  max mem: 2493
Epoch: [5]  [ 60/164]  eta: 0:01:04  lr: 0.000500  loss: 0.3618 (0.3563)  loss_classifier: 0.1281 (0.1250)  loss_box_reg: 0.2102 (0.2121)  loss_objectness: 0.0008 (0.0023)  loss_rpn_box_reg: 0.0173 (0.0168)  time: 0.5882  data: 0.0170  max mem: 2493
Epoch: [5]  [ 70/164]  eta: 0:00:58  lr: 0.000500  loss: 0.3524 (0.3557)  loss_classifier: 0.1264 (0.1254)  loss_box_reg: 0.2102 (0.2112)  loss_objectness: 0.0012 (0.0023)  loss_rpn_box_reg: 0.0167 (0.0167)  time: 0.5887  data: 0.0169  max mem: 2493
Epoch: [5]  [ 80/164]  eta: 0:00:51  lr: 0.000500  loss: 0.3684 (0.3597)  loss_classifier: 0.1307 (0.1268)  loss_box_reg: 0.2191 (0.2131)  loss_objectness: 0.0017 (0.0028)  loss_rpn_box_reg: 0.0173 (0.0171)  time: 0.5890  data: 0.0168  max mem: 2493
Epoch: [5]  [ 90/164]  eta: 0:00:45  lr: 0.000500  loss: 0.3868 (0.3605)  loss_classifier: 0.1390 (0.1270)  loss_box_reg: 0.2238 (0.2138)  loss_objectness: 0.0017 (0.0027)  loss_rpn_box_reg: 0.0183 (0.0171)  time: 0.5906  data: 0.0170  max mem: 2493
Epoch: [5]  [100/164]  eta: 0:00:38  lr: 0.000500  loss: 0.3769 (0.3619)  loss_classifier: 0.1302 (0.1285)  loss_box_reg: 0.2173 (0.2137)  loss_objectness: 0.0011 (0.0028)  loss_rpn_box_reg: 0.0166 (0.0170)  time: 0.5913  data: 0.0174  max mem: 2493
Epoch: [5]  [110/164]  eta: 0:00:32  lr: 0.000500  loss: 0.3700 (0.3626)  loss_classifier: 0.1284 (0.1291)  loss_box_reg: 0.2131 (0.2137)  loss_objectness: 0.0016 (0.0027)  loss_rpn_box_reg: 0.0173 (0.0171)  time: 0.5923  data: 0.0184  max mem: 2493
Label text for image /content/data/SWD/DMS11204.JPG: \
Label text for image /content/data/SWD/DMS11204.JPG: \
Label text for image /content/data/SWD/DMS11252.JPG: mw
Label text for image /content/data/SWD/DMS11252.JPG: mw
Epoch: [5]  [120/164]  eta: 0:00:26  lr: 0.000500  loss: 0.3512 (0.3602)  loss_classifier: 0.1235 (0.1283)  loss_box_reg: 0.2048 (0.2121)  loss_objectness: 0.0018 (0.0030)  loss_rpn_box_reg: 0.0145 (0.0168)  time: 0.5912  data: 0.0185  max mem: 2493
Epoch: [5]  [130/164]  eta: 0:00:20  lr: 0.000500  loss: 0.3319 (0.3601)  loss_classifier: 0.1165 (0.1284)  loss_box_reg: 0.2082 (0.2121)  loss_objectness: 0.0024 (0.0029)  loss_rpn_box_reg: 0.0145 (0.0167)  time: 0.5874  data: 0.0177  max mem: 2493
Epoch: [5]  [140/164]  eta: 0:00:14  lr: 0.000500  loss: 0.3468 (0.3621)  loss_classifier: 0.1157 (0.1285)  loss_box_reg: 0.2146 (0.2139)  loss_objectness: 0.0016 (0.0029)  loss_rpn_box_reg: 0.0173 (0.0168)  time: 0.5854  data: 0.0171  max mem: 2493
Epoch: [5]  [150/164]  eta: 0:00:08  lr: 0.000500  loss: 0.3370 (0.3585)  loss_classifier: 0.1111 (0.1269)  loss_box_reg: 0.2065 (0.2123)  loss_objectness: 0.0009 (0.0028)  loss_rpn_box_reg: 0.0134 (0.0165)  time: 0.5855  data: 0.0176  max mem: 2493
Epoch: [5]  [160/164]  eta: 0:00:02  lr: 0.000500  loss: 0.3136 (0.3577)  loss_classifier: 0.1090 (0.1268)  loss_box_reg: 0.1883 (0.2116)  loss_objectness: 0.0014 (0.0029)  loss_rpn_box_reg: 0.0116 (0.0164)  time: 0.5823  data: 0.0178  max mem: 2493
Epoch: [5]  [163/164]  eta: 0:00:00  lr: 0.000500  loss: 0.3170 (0.3582)  loss_classifier: 0.1176 (0.1274)  loss_box_reg: 0.1984 (0.2116)  loss_objectness: 0.0017 (0.0028)  loss_rpn_box_reg: 0.0116 (0.0164)  time: 0.5799  data: 0.0169  max mem: 2493
Epoch: [5] Total time: 0:01:38 (0.6015 s / it)
Epoch: [6]  [  0/164]  eta: 0:04:54  lr: 0.000050  loss: 0.2558 (0.2558)  loss_classifier: 0.0958 (0.0958)  loss_box_reg: 0.1532 (0.1532)  loss_objectness: 0.0003 (0.0003)  loss_rpn_box_reg: 0.0065 (0.0065)  time: 1.7948  data: 1.0944  max mem: 2493
Epoch: [6]  [ 10/164]  eta: 0:01:55  lr: 0.000050  loss: 0.3560 (0.3422)  loss_classifier: 0.1197 (0.1206)  loss_box_reg: 0.2119 (0.2052)  loss_objectness: 0.0004 (0.0011)  loss_rpn_box_reg: 0.0185 (0.0153)  time: 0.7480  data: 0.1252  max mem: 2493
Epoch: [6]  [ 20/164]  eta: 0:01:36  lr: 0.000050  loss: 0.3549 (0.3447)  loss_classifier: 0.1208 (0.1227)  loss_box_reg: 0.2060 (0.2056)  loss_objectness: 0.0009 (0.0014)  loss_rpn_box_reg: 0.0148 (0.0149)  time: 0.6134  data: 0.0225  max mem: 2493
Epoch: [6]  [ 30/164]  eta: 0:01:26  lr: 0.000050  loss: 0.3465 (0.3411)  loss_classifier: 0.1149 (0.1212)  loss_box_reg: 0.1973 (0.2037)  loss_objectness: 0.0010 (0.0015)  loss_rpn_box_reg: 0.0142 (0.0146)  time: 0.5847  data: 0.0169  max mem: 2493
Epoch: [6]  [ 40/164]  eta: 0:01:17  lr: 0.000050  loss: 0.3151 (0.3354)  loss_classifier: 0.1093 (0.1196)  loss_box_reg: 0.1882 (0.2000)  loss_objectness: 0.0011 (0.0015)  loss_rpn_box_reg: 0.0129 (0.0143)  time: 0.5861  data: 0.0167  max mem: 2493
Epoch: [6]  [ 50/164]  eta: 0:01:10  lr: 0.000050  loss: 0.3408 (0.3430)  loss_classifier: 0.1275 (0.1222)  loss_box_reg: 0.2064 (0.2046)  loss_objectness: 0.0011 (0.0016)  loss_rpn_box_reg: 0.0148 (0.0147)  time: 0.5878  data: 0.0168  max mem: 2493
Epoch: [6]  [ 60/164]  eta: 0:01:04  lr: 0.000050  loss: 0.3722 (0.3488)  loss_classifier: 0.1317 (0.1252)  loss_box_reg: 0.2223 (0.2067)  loss_objectness: 0.0011 (0.0016)  loss_rpn_box_reg: 0.0167 (0.0152)  time: 0.5909  data: 0.0171  max mem: 2493
Epoch: [6]  [ 70/164]  eta: 0:00:57  lr: 0.000050  loss: 0.3262 (0.3423)  loss_classifier: 0.1097 (0.1228)  loss_box_reg: 0.1952 (0.2030)  loss_objectness: 0.0009 (0.0016)  loss_rpn_box_reg: 0.0150 (0.0149)  time: 0.5901  data: 0.0171  max mem: 2493
Epoch: [6]  [ 80/164]  eta: 0:00:51  lr: 0.000050  loss: 0.3082 (0.3406)  loss_classifier: 0.1087 (0.1214)  loss_box_reg: 0.1794 (0.2026)  loss_objectness: 0.0011 (0.0016)  loss_rpn_box_reg: 0.0143 (0.0150)  time: 0.5888  data: 0.0168  max mem: 2493
Epoch: [6]  [ 90/164]  eta: 0:00:44  lr: 0.000050  loss: 0.3511 (0.3430)  loss_classifier: 0.1118 (0.1225)  loss_box_reg: 0.1946 (0.2036)  loss_objectness: 0.0013 (0.0017)  loss_rpn_box_reg: 0.0166 (0.0152)  time: 0.5901  data: 0.0171  max mem: 2493
Epoch: [6]  [100/164]  eta: 0:00:38  lr: 0.000050  loss: 0.3309 (0.3400)  loss_classifier: 0.1195 (0.1213)  loss_box_reg: 0.1934 (0.2019)  loss_objectness: 0.0008 (0.0017)  loss_rpn_box_reg: 0.0161 (0.0151)  time: 0.5886  data: 0.0176  max mem: 2493
Label text for image /content/data/SWD/DMS11252.JPG: mw
Label text for image /content/data/SWD/DMS11252.JPG: mw
Epoch: [6]  [110/164]  eta: 0:00:32  lr: 0.000050  loss: 0.3309 (0.3419)  loss_classifier: 0.1228 (0.1229)  loss_box_reg: 0.1917 (0.2020)  loss_objectness: 0.0007 (0.0018)  loss_rpn_box_reg: 0.0136 (0.0152)  time: 0.5891  data: 0.0177  max mem: 2493
Epoch: [6]  [120/164]  eta: 0:00:26  lr: 0.000050  loss: 0.3435 (0.3422)  loss_classifier: 0.1296 (0.1229)  loss_box_reg: 0.1965 (0.2022)  loss_objectness: 0.0006 (0.0018)  loss_rpn_box_reg: 0.0151 (0.0153)  time: 0.5911  data: 0.0173  max mem: 2493
Label text for image /content/data/SWD/DMS11204.JPG: \
Label text for image /content/data/SWD/DMS11204.JPG: \
Epoch: [6]  [130/164]  eta: 0:00:20  lr: 0.000050  loss: 0.3331 (0.3428)  loss_classifier: 0.1167 (0.1238)  loss_box_reg: 0.1926 (0.2019)  loss_objectness: 0.0008 (0.0018)  loss_rpn_box_reg: 0.0151 (0.0154)  time: 0.5890  data: 0.0171  max mem: 2493
Epoch: [6]  [140/164]  eta: 0:00:14  lr: 0.000050  loss: 0.3333 (0.3413)  loss_classifier: 0.1086 (0.1220)  loss_box_reg: 0.1979 (0.2020)  loss_objectness: 0.0008 (0.0018)  loss_rpn_box_reg: 0.0156 (0.0154)  time: 0.5868  data: 0.0173  max mem: 2493
Epoch: [6]  [150/164]  eta: 0:00:08  lr: 0.000050  loss: 0.3351 (0.3415)  loss_classifier: 0.1052 (0.1222)  loss_box_reg: 0.2018 (0.2021)  loss_objectness: 0.0011 (0.0018)  loss_rpn_box_reg: 0.0163 (0.0155)  time: 0.5886  data: 0.0174  max mem: 2493
Epoch: [6]  [160/164]  eta: 0:00:02  lr: 0.000050  loss: 0.3382 (0.3413)  loss_classifier: 0.1144 (0.1214)  loss_box_reg: 0.2041 (0.2025)  loss_objectness: 0.0011 (0.0018)  loss_rpn_box_reg: 0.0163 (0.0156)  time: 0.5854  data: 0.0173  max mem: 2493
Epoch: [6]  [163/164]  eta: 0:00:00  lr: 0.000050  loss: 0.3382 (0.3417)  loss_classifier: 0.1123 (0.1213)  loss_box_reg: 0.2089 (0.2029)  loss_objectness: 0.0012 (0.0018)  loss_rpn_box_reg: 0.0163 (0.0157)  time: 0.5839  data: 0.0174  max mem: 2493
Epoch: [6] Total time: 0:01:38 (0.5998 s / it)
Epoch: [7]  [  0/164]  eta: 0:06:05  lr: 0.000050  loss: 0.3945 (0.3945)  loss_classifier: 0.1377 (0.1377)  loss_box_reg: 0.2364 (0.2364)  loss_objectness: 0.0021 (0.0021)  loss_rpn_box_reg: 0.0183 (0.0183)  time: 2.2305  data: 1.4879  max mem: 2493
Epoch: [7]  [ 10/164]  eta: 0:01:58  lr: 0.000050  loss: 0.3708 (0.3563)  loss_classifier: 0.1189 (0.1208)  loss_box_reg: 0.2238 (0.2156)  loss_objectness: 0.0015 (0.0036)  loss_rpn_box_reg: 0.0160 (0.0164)  time: 0.7707  data: 0.1570  max mem: 2493
Epoch: [7]  [ 20/164]  eta: 0:01:38  lr: 0.000050  loss: 0.3299 (0.3434)  loss_classifier: 0.1186 (0.1218)  loss_box_reg: 0.1991 (0.2039)  loss_objectness: 0.0010 (0.0023)  loss_rpn_box_reg: 0.0137 (0.0155)  time: 0.6047  data: 0.0202  max mem: 2493
Epoch: [7]  [ 30/164]  eta: 0:01:27  lr: 0.000050  loss: 0.3381 (0.3524)  loss_classifier: 0.1238 (0.1290)  loss_box_reg: 0.1965 (0.2047)  loss_objectness: 0.0008 (0.0024)  loss_rpn_box_reg: 0.0149 (0.0163)  time: 0.5865  data: 0.0167  max mem: 2493
Epoch: [7]  [ 40/164]  eta: 0:01:18  lr: 0.000050  loss: 0.3605 (0.3516)  loss_classifier: 0.1231 (0.1275)  loss_box_reg: 0.2148 (0.2057)  loss_objectness: 0.0008 (0.0024)  loss_rpn_box_reg: 0.0151 (0.0160)  time: 0.5886  data: 0.0173  max mem: 2493
Epoch: [7]  [ 50/164]  eta: 0:01:11  lr: 0.000050  loss: 0.3315 (0.3475)  loss_classifier: 0.1160 (0.1259)  loss_box_reg: 0.1958 (0.2035)  loss_objectness: 0.0008 (0.0025)  loss_rpn_box_reg: 0.0131 (0.0156)  time: 0.5885  data: 0.0173  max mem: 2493
Epoch: [7]  [ 60/164]  eta: 0:01:04  lr: 0.000050  loss: 0.3197 (0.3430)  loss_classifier: 0.1042 (0.1239)  loss_box_reg: 0.1907 (0.2013)  loss_objectness: 0.0018 (0.0025)  loss_rpn_box_reg: 0.0132 (0.0153)  time: 0.5883  data: 0.0170  max mem: 2493
Epoch: [7]  [ 70/164]  eta: 0:00:57  lr: 0.000050  loss: 0.3121 (0.3408)  loss_classifier: 0.1035 (0.1224)  loss_box_reg: 0.1882 (0.2006)  loss_objectness: 0.0018 (0.0025)  loss_rpn_box_reg: 0.0145 (0.0154)  time: 0.5881  data: 0.0170  max mem: 2493
Epoch: [7]  [ 80/164]  eta: 0:00:51  lr: 0.000050  loss: 0.3242 (0.3442)  loss_classifier: 0.1088 (0.1236)  loss_box_reg: 0.1996 (0.2024)  loss_objectness: 0.0007 (0.0026)  loss_rpn_box_reg: 0.0165 (0.0156)  time: 0.5888  data: 0.0166  max mem: 2493
Epoch: [7]  [ 90/164]  eta: 0:00:45  lr: 0.000050  loss: 0.3371 (0.3435)  loss_classifier: 0.1124 (0.1225)  loss_box_reg: 0.2011 (0.2026)  loss_objectness: 0.0010 (0.0025)  loss_rpn_box_reg: 0.0165 (0.0159)  time: 0.5892  data: 0.0166  max mem: 2493
Epoch: [7]  [100/164]  eta: 0:00:38  lr: 0.000050  loss: 0.3282 (0.3419)  loss_classifier: 0.1120 (0.1218)  loss_box_reg: 0.1973 (0.2020)  loss_objectness: 0.0015 (0.0024)  loss_rpn_box_reg: 0.0139 (0.0157)  time: 0.5880  data: 0.0167  max mem: 2493
Label text for image /content/data/SWD/DMS11252.JPG: mw
Label text for image /content/data/SWD/DMS11252.JPG: mw
Epoch: [7]  [110/164]  eta: 0:00:32  lr: 0.000050  loss: 0.3225 (0.3407)  loss_classifier: 0.1083 (0.1211)  loss_box_reg: 0.1899 (0.2016)  loss_objectness: 0.0009 (0.0023)  loss_rpn_box_reg: 0.0136 (0.0156)  time: 0.5861  data: 0.0167  max mem: 2493
Epoch: [7]  [120/164]  eta: 0:00:26  lr: 0.000050  loss: 0.3367 (0.3426)  loss_classifier: 0.1172 (0.1222)  loss_box_reg: 0.1983 (0.2023)  loss_objectness: 0.0009 (0.0024)  loss_rpn_box_reg: 0.0148 (0.0156)  time: 0.5851  data: 0.0167  max mem: 2493
Label text for image /content/data/SWD/DMS11204.JPG: \
Label text for image /content/data/SWD/DMS11204.JPG: \
Epoch: [7]  [130/164]  eta: 0:00:20  lr: 0.000050  loss: 0.3473 (0.3438)  loss_classifier: 0.1227 (0.1223)  loss_box_reg: 0.2070 (0.2034)  loss_objectness: 0.0015 (0.0023)  loss_rpn_box_reg: 0.0152 (0.0158)  time: 0.5853  data: 0.0167  max mem: 2493
Epoch: [7]  [140/164]  eta: 0:00:14  lr: 0.000050  loss: 0.3459 (0.3429)  loss_classifier: 0.1174 (0.1218)  loss_box_reg: 0.2011 (0.2031)  loss_objectness: 0.0010 (0.0022)  loss_rpn_box_reg: 0.0170 (0.0157)  time: 0.5861  data: 0.0169  max mem: 2493
Epoch: [7]  [150/164]  eta: 0:00:08  lr: 0.000050  loss: 0.3268 (0.3429)  loss_classifier: 0.1081 (0.1217)  loss_box_reg: 0.2002 (0.2032)  loss_objectness: 0.0010 (0.0023)  loss_rpn_box_reg: 0.0164 (0.0158)  time: 0.5873  data: 0.0168  max mem: 2493
Epoch: [7]  [160/164]  eta: 0:00:02  lr: 0.000050  loss: 0.3268 (0.3416)  loss_classifier: 0.1081 (0.1211)  loss_box_reg: 0.1918 (0.2027)  loss_objectness: 0.0011 (0.0022)  loss_rpn_box_reg: 0.0130 (0.0156)  time: 0.5823  data: 0.0167  max mem: 2493
Epoch: [7]  [163/164]  eta: 0:00:00  lr: 0.000050  loss: 0.3305 (0.3417)  loss_classifier: 0.1223 (0.1213)  loss_box_reg: 0.1939 (0.2026)  loss_objectness: 0.0011 (0.0022)  loss_rpn_box_reg: 0.0138 (0.0156)  time: 0.5803  data: 0.0167  max mem: 2493
Epoch: [7] Total time: 0:01:38 (0.6000 s / it)
Epoch: [8]  [  0/164]  eta: 0:06:01  lr: 0.000050  loss: 0.2657 (0.2657)  loss_classifier: 0.0893 (0.0893)  loss_box_reg: 0.1650 (0.1650)  loss_objectness: 0.0018 (0.0018)  loss_rpn_box_reg: 0.0095 (0.0095)  time: 2.2031  data: 1.4218  max mem: 2493
Epoch: [8]  [ 10/164]  eta: 0:01:58  lr: 0.000050  loss: 0.3532 (0.3489)  loss_classifier: 0.1289 (0.1256)  loss_box_reg: 0.1989 (0.2043)  loss_objectness: 0.0013 (0.0022)  loss_rpn_box_reg: 0.0165 (0.0167)  time: 0.7719  data: 0.1497  max mem: 2493
Epoch: [8]  [ 20/164]  eta: 0:01:38  lr: 0.000050  loss: 0.3545 (0.3586)  loss_classifier: 0.1295 (0.1311)  loss_box_reg: 0.2073 (0.2079)  loss_objectness: 0.0011 (0.0024)  loss_rpn_box_reg: 0.0165 (0.0171)  time: 0.6071  data: 0.0198  max mem: 2493
Epoch: [8]  [ 30/164]  eta: 0:01:27  lr: 0.000050  loss: 0.3595 (0.3587)  loss_classifier: 0.1365 (0.1321)  loss_box_reg: 0.2070 (0.2077)  loss_objectness: 0.0011 (0.0022)  loss_rpn_box_reg: 0.0159 (0.0168)  time: 0.5864  data: 0.0171  max mem: 2493
Label text for image /content/data/SWD/DMS11252.JPG: mw
Label text for image /content/data/SWD/DMS11252.JPG: mw
Epoch: [8]  [ 40/164]  eta: 0:01:18  lr: 0.000050  loss: 0.3366 (0.3501)  loss_classifier: 0.1230 (0.1289)  loss_box_reg: 0.1959 (0.2029)  loss_objectness: 0.0008 (0.0022)  loss_rpn_box_reg: 0.0151 (0.0162)  time: 0.5872  data: 0.0170  max mem: 2493
Label text for image /content/data/SWD/DMS11204.JPG: \
Label text for image /content/data/SWD/DMS11204.JPG: \
Epoch: [8]  [ 50/164]  eta: 0:01:11  lr: 0.000050  loss: 0.3143 (0.3458)  loss_classifier: 0.1131 (0.1260)  loss_box_reg: 0.1904 (0.2016)  loss_objectness: 0.0008 (0.0022)  loss_rpn_box_reg: 0.0138 (0.0159)  time: 0.5866  data: 0.0167  max mem: 2493
Epoch: [8]  [ 60/164]  eta: 0:01:04  lr: 0.000050  loss: 0.3206 (0.3421)  loss_classifier: 0.1142 (0.1242)  loss_box_reg: 0.1943 (0.1999)  loss_objectness: 0.0012 (0.0023)  loss_rpn_box_reg: 0.0150 (0.0157)  time: 0.5869  data: 0.0166  max mem: 2493
Epoch: [8]  [ 70/164]  eta: 0:00:57  lr: 0.000050  loss: 0.3319 (0.3435)  loss_classifier: 0.1142 (0.1249)  loss_box_reg: 0.1965 (0.2008)  loss_objectness: 0.0008 (0.0021)  loss_rpn_box_reg: 0.0158 (0.0156)  time: 0.5890  data: 0.0171  max mem: 2493
Epoch: [8]  [ 80/164]  eta: 0:00:51  lr: 0.000050  loss: 0.3364 (0.3439)  loss_classifier: 0.1132 (0.1245)  loss_box_reg: 0.2102 (0.2015)  loss_objectness: 0.0010 (0.0020)  loss_rpn_box_reg: 0.0173 (0.0159)  time: 0.5907  data: 0.0172  max mem: 2493
Epoch: [8]  [ 90/164]  eta: 0:00:45  lr: 0.000050  loss: 0.3559 (0.3441)  loss_classifier: 0.1172 (0.1242)  loss_box_reg: 0.2117 (0.2017)  loss_objectness: 0.0011 (0.0023)  loss_rpn_box_reg: 0.0178 (0.0159)  time: 0.5889  data: 0.0168  max mem: 2493
Epoch: [8]  [100/164]  eta: 0:00:38  lr: 0.000050  loss: 0.3476 (0.3440)  loss_classifier: 0.1129 (0.1234)  loss_box_reg: 0.2123 (0.2021)  loss_objectness: 0.0013 (0.0024)  loss_rpn_box_reg: 0.0177 (0.0160)  time: 0.5882  data: 0.0168  max mem: 2493
Epoch: [8]  [110/164]  eta: 0:00:32  lr: 0.000050  loss: 0.3156 (0.3405)  loss_classifier: 0.1040 (0.1220)  loss_box_reg: 0.1873 (0.2003)  loss_objectness: 0.0011 (0.0024)  loss_rpn_box_reg: 0.0136 (0.0158)  time: 0.5878  data: 0.0169  max mem: 2493
Epoch: [8]  [120/164]  eta: 0:00:26  lr: 0.000050  loss: 0.3245 (0.3421)  loss_classifier: 0.1113 (0.1224)  loss_box_reg: 0.1876 (0.2015)  loss_objectness: 0.0009 (0.0023)  loss_rpn_box_reg: 0.0127 (0.0159)  time: 0.5869  data: 0.0175  max mem: 2493
Epoch: [8]  [130/164]  eta: 0:00:20  lr: 0.000050  loss: 0.3510 (0.3428)  loss_classifier: 0.1221 (0.1229)  loss_box_reg: 0.2019 (0.2017)  loss_objectness: 0.0010 (0.0023)  loss_rpn_box_reg: 0.0168 (0.0160)  time: 0.5881  data: 0.0179  max mem: 2493
Epoch: [8]  [140/164]  eta: 0:00:14  lr: 0.000050  loss: 0.3350 (0.3421)  loss_classifier: 0.1089 (0.1221)  loss_box_reg: 0.1941 (0.2020)  loss_objectness: 0.0010 (0.0022)  loss_rpn_box_reg: 0.0157 (0.0158)  time: 0.5882  data: 0.0184  max mem: 2493
Epoch: [8]  [150/164]  eta: 0:00:08  lr: 0.000050  loss: 0.3321 (0.3414)  loss_classifier: 0.1011 (0.1217)  loss_box_reg: 0.2050 (0.2018)  loss_objectness: 0.0010 (0.0023)  loss_rpn_box_reg: 0.0125 (0.0157)  time: 0.5895  data: 0.0187  max mem: 2493
Epoch: [8]  [160/164]  eta: 0:00:02  lr: 0.000050  loss: 0.3185 (0.3406)  loss_classifier: 0.1119 (0.1216)  loss_box_reg: 0.1908 (0.2013)  loss_objectness: 0.0009 (0.0022)  loss_rpn_box_reg: 0.0136 (0.0155)  time: 0.5859  data: 0.0181  max mem: 2493
Epoch: [8]  [163/164]  eta: 0:00:00  lr: 0.000050  loss: 0.3185 (0.3407)  loss_classifier: 0.1142 (0.1217)  loss_box_reg: 0.1908 (0.2013)  loss_objectness: 0.0008 (0.0022)  loss_rpn_box_reg: 0.0120 (0.0155)  time: 0.5827  data: 0.0180  max mem: 2493
Epoch: [8] Total time: 0:01:38 (0.6010 s / it)
Epoch: [9]  [  0/164]  eta: 0:06:42  lr: 0.000005  loss: 0.2969 (0.2969)  loss_classifier: 0.0917 (0.0917)  loss_box_reg: 0.1846 (0.1846)  loss_objectness: 0.0051 (0.0051)  loss_rpn_box_reg: 0.0155 (0.0155)  time: 2.4547  data: 1.6290  max mem: 2493
Label text for image /content/data/SWD/DMS11252.JPG: mw
Label text for image /content/data/SWD/DMS11252.JPG: mw
Epoch: [9]  [ 10/164]  eta: 0:02:01  lr: 0.000005  loss: 0.3432 (0.3600)  loss_classifier: 0.1132 (0.1267)  loss_box_reg: 0.2095 (0.2139)  loss_objectness: 0.0011 (0.0014)  loss_rpn_box_reg: 0.0181 (0.0180)  time: 0.7909  data: 0.1710  max mem: 2493
Epoch: [9]  [ 20/164]  eta: 0:01:40  lr: 0.000005  loss: 0.3432 (0.3502)  loss_classifier: 0.1132 (0.1258)  loss_box_reg: 0.2025 (0.2063)  loss_objectness: 0.0007 (0.0014)  loss_rpn_box_reg: 0.0158 (0.0167)  time: 0.6068  data: 0.0215  max mem: 2493
Epoch: [9]  [ 30/164]  eta: 0:01:28  lr: 0.000005  loss: 0.3571 (0.3513)  loss_classifier: 0.1268 (0.1267)  loss_box_reg: 0.2015 (0.2063)  loss_objectness: 0.0014 (0.0021)  loss_rpn_box_reg: 0.0150 (0.0162)  time: 0.5883  data: 0.0174  max mem: 2493
Epoch: [9]  [ 40/164]  eta: 0:01:19  lr: 0.000005  loss: 0.3314 (0.3479)  loss_classifier: 0.1176 (0.1248)  loss_box_reg: 0.1978 (0.2046)  loss_objectness: 0.0013 (0.0020)  loss_rpn_box_reg: 0.0154 (0.0164)  time: 0.5880  data: 0.0171  max mem: 2493
Epoch: [9]  [ 50/164]  eta: 0:01:12  lr: 0.000005  loss: 0.3307 (0.3443)  loss_classifier: 0.1125 (0.1227)  loss_box_reg: 0.1969 (0.2033)  loss_objectness: 0.0008 (0.0021)  loss_rpn_box_reg: 0.0159 (0.0162)  time: 0.5878  data: 0.0170  max mem: 2493
Epoch: [9]  [ 60/164]  eta: 0:01:04  lr: 0.000005  loss: 0.3275 (0.3411)  loss_classifier: 0.1125 (0.1212)  loss_box_reg: 0.1967 (0.2021)  loss_objectness: 0.0008 (0.0020)  loss_rpn_box_reg: 0.0134 (0.0158)  time: 0.5884  data: 0.0173  max mem: 2493
Label text for image /content/data/SWD/DMS11204.JPG: \
Label text for image /content/data/SWD/DMS11204.JPG: \
Epoch: [9]  [ 70/164]  eta: 0:00:58  lr: 0.000005  loss: 0.3174 (0.3391)  loss_classifier: 0.1123 (0.1207)  loss_box_reg: 0.1962 (0.2010)  loss_objectness: 0.0009 (0.0019)  loss_rpn_box_reg: 0.0131 (0.0156)  time: 0.5895  data: 0.0176  max mem: 2493
Epoch: [9]  [ 80/164]  eta: 0:00:51  lr: 0.000005  loss: 0.3302 (0.3405)  loss_classifier: 0.1146 (0.1215)  loss_box_reg: 0.2054 (0.2016)  loss_objectness: 0.0011 (0.0020)  loss_rpn_box_reg: 0.0142 (0.0155)  time: 0.5891  data: 0.0177  max mem: 2493
Epoch: [9]  [ 90/164]  eta: 0:00:45  lr: 0.000005  loss: 0.3358 (0.3383)  loss_classifier: 0.1146 (0.1206)  loss_box_reg: 0.1921 (0.2001)  loss_objectness: 0.0014 (0.0022)  loss_rpn_box_reg: 0.0143 (0.0154)  time: 0.5887  data: 0.0175  max mem: 2493
Epoch: [9]  [100/164]  eta: 0:00:39  lr: 0.000005  loss: 0.3245 (0.3388)  loss_classifier: 0.1050 (0.1207)  loss_box_reg: 0.1920 (0.2005)  loss_objectness: 0.0012 (0.0022)  loss_rpn_box_reg: 0.0135 (0.0154)  time: 0.5887  data: 0.0176  max mem: 2493
Epoch: [9]  [110/164]  eta: 0:00:32  lr: 0.000005  loss: 0.3194 (0.3371)  loss_classifier: 0.1113 (0.1196)  loss_box_reg: 0.1984 (0.2001)  loss_objectness: 0.0008 (0.0021)  loss_rpn_box_reg: 0.0127 (0.0153)  time: 0.5882  data: 0.0176  max mem: 2493
Epoch: [9]  [120/164]  eta: 0:00:26  lr: 0.000005  loss: 0.3257 (0.3373)  loss_classifier: 0.1042 (0.1193)  loss_box_reg: 0.2054 (0.2007)  loss_objectness: 0.0008 (0.0021)  loss_rpn_box_reg: 0.0127 (0.0153)  time: 0.5893  data: 0.0175  max mem: 2493
Epoch: [9]  [130/164]  eta: 0:00:20  lr: 0.000005  loss: 0.3326 (0.3360)  loss_classifier: 0.1138 (0.1191)  loss_box_reg: 0.1906 (0.1996)  loss_objectness: 0.0008 (0.0022)  loss_rpn_box_reg: 0.0144 (0.0151)  time: 0.5895  data: 0.0177  max mem: 2493
Epoch: [9]  [140/164]  eta: 0:00:14  lr: 0.000005  loss: 0.3419 (0.3382)  loss_classifier: 0.1196 (0.1203)  loss_box_reg: 0.1993 (0.2004)  loss_objectness: 0.0010 (0.0022)  loss_rpn_box_reg: 0.0147 (0.0153)  time: 0.5902  data: 0.0173  max mem: 2493
Epoch: [9]  [150/164]  eta: 0:00:08  lr: 0.000005  loss: 0.3419 (0.3386)  loss_classifier: 0.1160 (0.1201)  loss_box_reg: 0.2012 (0.2009)  loss_objectness: 0.0015 (0.0022)  loss_rpn_box_reg: 0.0187 (0.0154)  time: 0.5913  data: 0.0178  max mem: 2493
Epoch: [9]  [160/164]  eta: 0:00:02  lr: 0.000005  loss: 0.3454 (0.3407)  loss_classifier: 0.1104 (0.1210)  loss_box_reg: 0.2125 (0.2019)  loss_objectness: 0.0016 (0.0023)  loss_rpn_box_reg: 0.0155 (0.0155)  time: 0.5858  data: 0.0178  max mem: 2493
Epoch: [9]  [163/164]  eta: 0:00:00  lr: 0.000005  loss: 0.3410 (0.3400)  loss_classifier: 0.1104 (0.1207)  loss_box_reg: 0.2096 (0.2016)  loss_objectness: 0.0016 (0.0023)  loss_rpn_box_reg: 0.0133 (0.0155)  time: 0.5822  data: 0.0177  max mem: 2493
Epoch: [9] Total time: 0:01:38 (0.6030 s / it)
That's it!

The output above also shows some mistakes in the labels (a few were accidentally entered as "\" or "mw"); these entries are reported and discarded when the data is loaded.
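
A quick way to locate those mislabeled annotations (a sketch, reusing the same folder and the f/m/d label convention from the dataset class above) is to scan the XML files for unexpected name tags:

In [ ]:
# Report annotation entries whose label text is not one of the expected f/m/d classes
for xml_path in sorted(Path('/content/data/SWD').glob("*.xml")):
    for obj in ET.parse(xml_path).getroot().findall("object"):
        name = obj.find("name").text
        if name not in ("f", "m", "d"):
            print(f"{xml_path.name}: unexpected label {name!r}")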

In [ ]:
# Save the model weights (creating the output directory first if it does not exist)
os.makedirs("/content/models", exist_ok=True)
torch.save(model.state_dict(), "/content/models/swd.pth")
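
To reuse the saved weights later, the same architecture must be rebuilt with the same num_classes before loading the state dict (a minimal sketch):

In [ ]:
# Rebuild the same architecture and load the fine-tuned weights
reloaded_model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=False)
in_features = reloaded_model.roi_heads.box_predictor.cls_score.in_features
reloaded_model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)
reloaded_model.load_state_dict(torch.load("/content/models/swd.pth", map_location=device))
reloaded_model.to(device)
reloaded_model.eval()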

Testing the model

Here we test the trained model on an image that it has never seen, and we see how good it is at identifying and counting objects.

In [ ]:
# Run the model on an evaluation image
from torchvision.transforms.functional import to_tensor
image_path = '/content/data/SWD/DMS10001.JPG'
image = Image.open(image_path)
# Put the model in "eval" mode for prediction
model.eval()
image_tensor = to_tensor(image).to(device)
# Get predictions
predictions = model([image_tensor])
# Draw bounding boxes
import cv2
# cv2.imread's second argument is an imread flag, not a colour-conversion code,
# so read the image first and then convert it from BGR to RGB
img = cv2.imread(image_path)
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

from collections import Counter
label_counter = Counter()

for i in range(len(predictions[0]['boxes'])):
    # Boxes are in [xmin, ymin, xmax, ymax] format
    x1, y1, x2, y2 = map(int, predictions[0]['boxes'][i].tolist())
    target = predictions[0]['labels'][i]
    score = predictions[0]['scores'][i].item()
    # Threshold predictions to remove low-confidence detections
    if score < 0.5:
        continue
    if target == 1:
        label_text = "f"
    elif target == 2:
        label_text = "m"
    elif target == 3:
        label_text = "d"
    label_counter[label_text] += 1

    image = cv2.rectangle(img, (x1, y1), (x2, y2), (255, 0, 0), 1)
    image = cv2.putText(image, f"{label_text} ({score:.2f})", (x1, y1), cv2.FONT_HERSHEY_SIMPLEX, 1, (255, 0, 0), 2, cv2.LINE_AA)

print(f'There are {label_counter["f"]} Drosophila suzukii females, {label_counter["m"]} Drosophila suzukii males, and {label_counter["d"]} Drosophila melanogaster.')
# cv2.imshow opens a native window and does not work inside a notebook such as Colab;
# display the annotated (RGB) image with matplotlib instead
import matplotlib.pyplot as plt
plt.figure(figsize=(12, 8))
plt.imshow(image)
plt.axis('off')
plt.show()
There are 5 Drosophila suzukii females, 4 Drosophila suzukii males, and 3 Drosophila melanogaster.

The model correctly identified and counted most of the flies in this image, with the exception of one blocked by the file details.
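
The same thresholding and counting logic can be wrapped into a small helper for scoring other images (a sketch, reusing the label convention and the 0.5 score threshold from the cell above):

In [ ]:
# Count predicted flies per class in a single image (hypothetical helper)
def count_flies(image_path, model, device, score_threshold=0.5):
    label_names = {1: "f", 2: "m", 3: "d"}
    image_tensor = to_tensor(Image.open(image_path)).to(device)
    model.eval()
    with torch.no_grad():
        prediction = model([image_tensor])[0]
    counts = Counter()
    for label, score in zip(prediction["labels"].tolist(), prediction["scores"].tolist()):
        if score >= score_threshold and label in label_names:
            counts[label_names[label]] += 1
    return counts

print(count_flies('/content/data/SWD/DMS10001.JPG', model, device))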