From: hu-po

MMDetection is an open-source object detection toolbox built on PyTorch [00:01:04]. It is part of the OpenMMLab project [00:01:07], an open-source community that contributes to various AI projects [00:01:39]. MMDetection is open source [00:08:09] and released under the Apache 2.0 license, a permissive license that allows broad use [00:01:55].

Key Features

MMDetection offers several notable features:

  • Modular Design: It decomposes the detection framework into different components, allowing users to easily construct custom object detection frameworks by combining various modules [00:07:14].
  • Toolbox Support: It directly supports popular and contemporary detection frameworks such as Faster R-CNN, Mask R-CNN, and RetinaNet [00:07:23]. Mask R-CNN, which extends Faster R-CNN with a mask branch, also covers instance segmentation [00:07:29].
  • GPU Acceleration: All basic bounding box and mask operations run on GPUs, leading to training speeds that are faster than or comparable to other detection codebases, including Detectron2 [00:07:45].
  • Award-Winning Origin: The toolbox stems from a codebase developed by the MMDetection team, who won the COCO detection challenge in 2018 [00:08:03].
  • State-of-the-Art Performance: Newer models like RTMDet achieve state-of-the-art performance on instance segmentation and rotated object detection tasks [00:09:51], including rotated detection on aerial images and real-time instance segmentation on MS COCO [01:00:03].
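
The modular design described above can be illustrated with a toy registry. This is a minimal sketch, not MMDetection's actual code: the names REGISTRY, register, build, ToyBackbone, and ToyHead are hypothetical. The only detail borrowed from MMDetection is the convention that a config dict names its component through a `type` key.

```python
from typing import Callable, Dict

# Hypothetical registry: maps a component name to a factory function.
REGISTRY: Dict[str, Callable[..., dict]] = {}


def register(name: str):
    """Decorator that records a factory under the given name."""
    def decorator(factory):
        REGISTRY[name] = factory
        return factory
    return decorator


@register("ToyBackbone")
def toy_backbone(depth: int = 50) -> dict:
    return {"kind": "backbone", "depth": depth}


@register("ToyHead")
def toy_head(num_classes: int = 80) -> dict:
    return {"kind": "head", "num_classes": num_classes}


def build(cfg: dict) -> dict:
    """Build one component from a config dict whose 'type' key names the factory."""
    args = dict(cfg)  # copy so the caller's config is not mutated
    factory = REGISTRY[args.pop("type")]
    return factory(**args)


# Swapping a module means editing the config, not the building code.
detector_cfg = {
    "backbone": {"type": "ToyBackbone", "depth": 101},
    "head": {"type": "ToyHead", "num_classes": 80},
}
detector = {part: build(cfg) for part, cfg in detector_cfg.items()}
print(detector)
```

The point of the pattern is that a different backbone or head is selected by changing a string in the config while the assembly code stays the same.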

Setting Up and Installing MMDetection Environment

The process of installing MMDetection involves several steps to manage dependencies and ensure compatibility.

Initial Installation and Dependencies

  1. Create a Virtual Environment: It is recommended to create a new Python virtual environment for MMDetection [00:03:41].
    pyenv virtualenv mmdet
    pyenv activate mmdet
  2. Install MMDetection via Pip:
    pip install mmdet
    This command installs packages such as SciPy, pycocotools, NumPy, Matplotlib, and Pillow [00:04:53], but notably not PyTorch or CUDA [00:05:02].
  3. Install PyTorch: MMDetection’s master branch is designed to work with PyTorch 1.5+ [00:07:05]. Install a PyTorch build that matches the CUDA version on the machine (e.g., CUDA 11.6) [00:05:52].
    pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu116
  4. Install MMCV: MMDetection requires MMCV. The openmim tool can be used to install mmcv-full which includes comprehensive CPU and CUDA operations [00:13:19].
    pip install openmim
    mim install mmcv-full
    This step also installs addict and opencv-python [00:14:13]. addict is a Python dictionary subclass that allows items to be accessed and set like attributes using dot notation [00:15:12].
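
The dot-notation access that addict provides can be sketched with a small dict subclass. This is a toy stand-in written for illustration, not addict's implementation; the class name AttrDict and the config keys are hypothetical. With the real library the equivalent starting point would be `from addict import Dict`.

```python
class AttrDict(dict):
    """Toy stand-in for addict's Dict: keys can be read and set as attributes."""

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, so dict methods still work.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        # Store attribute assignments as ordinary dictionary items.
        self[name] = value


cfg = AttrDict()
cfg.model = AttrDict()
cfg.model.type = "YOLOV3"
cfg.model.score_thr = 0.3

# Attribute access and key access refer to the same data.
print(cfg.model.type, cfg["model"]["score_thr"])
```

Unlike this sketch, addict also creates missing nested dictionaries on the fly, so intermediate levels do not have to be assigned by hand.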

Troubleshooting and Reinstallation

  • Module Not Found Errors: If mmdet or mmcv modules are not found, it might be due to incorrect installation or an outdated API [00:12:52].
  • Cloning the Repository: If pip installation from PyPI is problematic, cloning the MMDetection GitHub repository and installing in editable mode (development mode) is an alternative [00:23:52].
    git clone https://github.com/open-mmlab/mmdetection.git
    cd mmdetection
    pip install -v -e .
  • distutils Compatibility Issues: A common issue was setup.py failing due to distutils being replaced [00:21:51]. Downgrading setuptools to an older version can resolve this [00:35:30].
    pip install setuptools==65.5.0 # Example older version
  • Checking Python and PyTorch Versions: Verify that the installed Python version (e.g., 3.8.0) and PyTorch version (e.g., 1.13) are compatible with the MMDetection version being used [00:29:15].
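
A quick way to check which versions are actually installed is to query package metadata from Python. The sketch below assumes Python 3.8+ (for importlib.metadata) and assumes the distributions are named torch, torchvision, mmcv-full, mmdet, and setuptools; the helper name installed_versions is made up for this example. It does not import the packages, so it also works when an import would fail.

```python
import sys
from importlib import metadata


def installed_versions(packages):
    """Return {package: version or None} without importing the packages."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed in this environment
    return versions


report = installed_versions(
    ["torch", "torchvision", "mmcv-full", "mmdet", "setuptools"]
)
print("python", sys.version.split()[0])
for name, version in report.items():
    print(name, version if version is not None else "NOT INSTALLED")
```

Running this inside the activated virtual environment shows at a glance whether a "module not found" error comes from a missing package or from a version mismatch.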

Usage and Inference

After installation, MMDetection can be used for object detection inference.

Basic Inference Steps

  1. Download Config and Checkpoint Files: For inference, a configuration file (e.g., yolov3_mobilenet_v2_320_coco.py) and its corresponding checkpoint file (model weights) are required [00:20:30]. These can be downloaded using the mim command or directly from the model zoo [00:24:31].

    from mim import download
    # Arguments: package name, list of config names, destination directory
    download('mmdet', ['yolov3_mobilenet_v2_320_coco'], dest_root='.')

    API change: the destination argument of mim.download was renamed from destination_dir to dest_root [00:38:45]. This highlights a common challenge in maintaining open-source projects, where documentation can lag behind code changes [00:39:19].

  2. Initialize Detector: Use the init_detector function from mmdet.apis to load the model [00:16:38].

  3. Perform Inference: Use inference_detector to get detection results [00:17:09].

  4. Visualize Results: The show_result_pyplot function can visualize the detection results on an image [00:47:20].
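
A keyword rename like the destination_dir to dest_root change noted in step 1 can be handled defensively by inspecting the function signature before calling it. The following is a minimal sketch: call_with_dest is a hypothetical helper, and old_download and new_download are stand-ins written for this example, not the real mim functions.

```python
import inspect


def call_with_dest(download_fn, *args, dest="."):
    """Call download_fn, passing dest under whichever keyword it accepts."""
    params = inspect.signature(download_fn).parameters
    for keyword in ("dest_root", "destination_dir"):
        if keyword in params:
            return download_fn(*args, **{keyword: dest})
    raise TypeError("download function accepts neither dest_root nor destination_dir")


# Hypothetical stand-ins for an older and a newer version of a download API.
def old_download(package, configs, destination_dir=None):
    return ("old", destination_dir)


def new_download(package, configs, dest_root=None):
    return ("new", dest_root)


print(call_with_dest(old_download, "mmdet", ["some_config"], dest="/tmp"))
print(call_with_dest(new_download, "mmdet", ["some_config"], dest="/tmp"))
```

Pinning the package version in a requirements file is the simpler fix when it is an option; a shim like this is mainly useful for scripts that must run against several releases.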

Example Code Snippet

import os
from mmdet.apis import inference_detector, init_detector, show_result_pyplot
 
# Define model configuration and checkpoint paths
# Example using YOLOv3 MobileNetV2
config_file = 'mmdetection/configs/yolov3/yolov3_mobilenet_v2_320_coco.py'
checkpoint_file = 'yolov3_mobilenet_v2_320_coco_20210719_215349-d1703272.pth'
 
# Example using Faster R-CNN ResNet101 (larger model)
# config_file = 'mmdetection/configs/faster_rcnn/faster_rcnn_r101_fpn_1x_coco.py'
# checkpoint_file = 'faster_rcnn_r101_fpn_1x_coco_20200130-f7051d33.pth'
 
# Initialize the detector
# device='cuda:0' for GPU, device='cpu' for CPU
model = init_detector(config_file, checkpoint_file, device='cuda:0')  # [00:47:06]
 
# Prepare input images
image_paths = [
    '/tmp/AI_astronaut.png',
    '/tmp/AI_cowboy.png',
    '/tmp/people.png'
]
 
# Loop through images and perform inference
for img_path in image_paths:
    # Perform inference
    result = inference_detector(model, img_path)  # [00:48:31]
 
    # Visualize the results and save to a file
    output_filename = f"result_{os.path.basename(img_path)}"
    model.show_result(img_path, result, out_file=output_filename) # Use model.show_result for saving to file
    print(f"Results saved to {output_filename}")

Inference Performance

Testing with different models demonstrates varying performance:

  • YOLOv3 MobileNetV2: A smaller model that might produce lower confidence scores or miss some objects, especially in crowded scenes [01:00:13]. For example, it identified a “person” and “horse” in an AI-generated image with decent confidence [00:54:53], but struggled with AI-generated textures, misclassifying objects as “dog,” “bottle,” or “teddy bear” [00:36:56].
  • Faster R-CNN ResNet101: A larger model that generally provides higher confidence scores and better detection capabilities, especially in dense environments [01:10:00]. It detected significantly more people in a crowded stadium photo compared to the smaller model [01:11:06] and also identified additional objects like cell phones [01:11:33].
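
Comparisons like these can be made concrete by counting detections above a confidence threshold. The sketch below assumes the MMDetection 2.x result format for box-only detectors: a list with one entry per class, each holding rows of [x1, y1, x2, y2, score]. The helper count_detections, the toy data, and the 0.3 threshold are made up for illustration; models with masks return a different structure and would need adapting.

```python
from collections import Counter


def count_detections(result, class_names, score_thr=0.3):
    """Count detections per class whose score is at least score_thr."""
    counts = Counter()
    for class_id, detections in enumerate(result):
        for det in detections:
            if det[4] >= score_thr:  # the last column is the confidence score
                counts[class_names[class_id]] += 1
    return counts


# Toy data standing in for real model output (plain lists instead of arrays).
toy_classes = ["person", "horse"]
toy_result = [
    [[10, 10, 50, 90, 0.92], [60, 12, 95, 88, 0.21]],  # two "person" boxes
    [[5, 40, 120, 110, 0.55]],                          # one "horse" box
]
counts = count_detections(toy_result, toy_classes)
print(dict(counts))
```

With real output, the class names would come from the loaded model (in MMDetection 2.x this is assumed to be model.CLASSES), and running the same count for two models on one image gives a simple side-by-side comparison.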

AI-Generated Image Challenges

AI-generated images with unusual textures can cause detection models to produce unexpected classifications (e.g., a “teddy bear” or “toothbrush” in an astronaut image) [01:10:10]. Even applying super-resolution techniques (like “crispying” an image) may not improve detection accuracy, and can sometimes worsen it [01:14:55].

Model Zoo and Benchmarks

MMDetection provides an extensive model zoo with various pre-trained models. These can be found at the OpenMMLab website under the “Benchmark and Model Zoo” section [01:01:04]. The model zoo includes diverse architectures like YOLOv3 and Faster R-CNN with different backbones (e.g., MobileNetV2, ResNet101) [00:58:12]. Users can download specific config and checkpoint files for their desired models [01:06:18].

Project Health and Community

MMDetection appears to be a healthy open-source project with consistent maintenance and development:

  • Contribution Graph: Contributions have been steady since around 2018, showing continuous additions over time [01:16:09]. While some initial major contributors might have moved on, new maintainers have taken over [01:16:40].
  • Code Coverage: Tests cover about 60% of the code, which is considered acceptable for an open-source project [00:09:27]. The tests check that the software behaves properly across various edge cases [00:08:59].
  • Build Status: The build status, though sometimes failing (indicated by a red badge), reflects continuous integration (CI) where packages are built upon code pushes [00:08:39].
  • Releases: MMDetection has 45 releases in total, and the most recent ones, published two and three weeks before the recording, indicate active development [01:18:23].
  • Issue Resolution: The project actively closes issues, demonstrating responsiveness to user questions and bug reports [01:18:09].

Conclusion

MMDetection is a robust and actively maintained open-source object detection framework based on PyTorch [01:18:56]. It offers a wide range of models in its model zoo, supporting both large and small models that work with GPUs and CPUs [01:19:05]. Its modular design and comprehensive features make it a valuable toolkit for various object detection projects [01:19:11].