MVTec AD: A Comprehensive Real-World Dataset for Unsupervised Anomaly Detection

2020/07/19
Ho Seong Lee (hoya012)
Cognex Deep Learning Lab
PR-263 | MVTec AD: A Comprehensive Real-World Dataset for Unsupervised Anomaly Detection

Contents
• Introduction
• Related Work
• Datasets for Anomaly Detection in natural images
• Methods
• MVTec-AD Dataset
• Benchmark
• Discussion
• Conclusion

Introduction
What is Anomaly Detection?
• Detecting anomalous regions in images or videos or time-series data
• Today, we will focus on anomaly detection in industrial images
Reference: https://hoya012.github.io/blog/anomaly-detection-overview-1/

Introduction
Definition of Anomaly Detection
• I wrote a technical blog post about anomaly detection (in Korean only)
• I recommend reading it if you want to know more about anomaly detection

Introduction
Anomaly Detection in PR-12
• Unsupervised Anomaly Detection: 1, One-Class Anomaly Detection: 1, Out-of-distribution Detection: 2
• A minor topic: 5/263 = 1.9%
• PR-115, PR-148, PR-190: by Minguk Kang (강민국)
• PR-235: by DoYup Lee (이도엽)

Related Work
Datasets for Anomaly Detection in natural images
• One-Class Anomaly Detection: MNIST, CIFAR-10
[Figure: Normal Class vs. Anomaly Class]
• Provide a large amount of train/test data
• Anomaly samples differ significantly from normal samples

Related Work
Datasets for Anomaly Detection in natural images
• Unsupervised Anomaly Detection (Segmentation): DAGM 2007, NanoTWICE
• Only focus on the inspection of textured surfaces
• Only 5 defect-free images can be used for training
• Defects were generated by similar texture models

Related Work
Methods - Generative Adversarial Networks (AnoGAN) → See PR-115!
• Train a GAN using only normal samples, then freeze the generator and the discriminator
• Search for a latent sample that reproduces a given input image and manages to fool the discriminator
Reference: “Unsupervised Anomaly Detection with Generative Adversarial Networks to Guide Marker Discovery”
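The latent search can be sketched on a toy problem. This is a minimal sketch under loud assumptions: `toy_generator` is a hypothetical stand-in for a frozen, trained generator, the gradient is taken by finite differences instead of backpropagation, and only the residual (reconstruction) term is minimized. AnoGAN additionally uses a discriminator-based term, which is omitted here.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def residual(x: Sequence[float], generated: Sequence[float]) -> float:
    """Squared L2 distance between the query image and the generated image."""
    return sum((a - b) ** 2 for a, b in zip(x, generated))


def search_latent(
    generator: Callable[[Vector], Vector],
    x: Sequence[float],
    z_dim: int,
    steps: int = 200,
    lr: float = 0.05,
    eps: float = 1e-5,
) -> Tuple[Vector, float]:
    """Gradient descent on z with the generator held fixed (residual term only)."""
    z = [0.0] * z_dim
    for _ in range(steps):
        base = residual(x, generator(z))
        grad = []
        for i in range(z_dim):
            shifted = list(z)
            shifted[i] += eps
            # Forward finite difference as a stand-in for backpropagation.
            grad.append((residual(x, generator(shifted)) - base) / eps)
        z = [zi - lr * gi for zi, gi in zip(z, grad)]
    return z, residual(x, generator(z))


def toy_generator(z: Vector) -> Vector:
    """Hypothetical frozen generator: it can only produce images on one line."""
    return [z[0], 2.0 * z[0], 3.0 * z[0]]


# A "normal" image lies on the generator's manifold, an "anomalous" one does not.
_, normal_residual = search_latent(toy_generator, [1.0, 2.0, 3.0], z_dim=1)
_, anomalous_residual = search_latent(toy_generator, [1.0, 2.0, 0.0], z_dim=1)
```

The residual that remains after the search serves as the anomaly score: it stays near zero for the image the generator can reproduce and stays large for the one it cannot.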

Related Work
Methods – Deep Convolutional Autoencoders and Variational Autoencoders (VAE)
• Reconstruct normal training samples through a bottleneck (latent space)
• During testing, they fail to reproduce images that differ from the data that was observed during training
• Many papers provide further evidence that probabilities obtained from VAEs and other deep generative models might fail to model the true likelihood of the training data
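The reconstruction-based idea can be illustrated with a per-pixel error map. This is a minimal sketch, assuming grayscale images stored as nested lists and a reconstruction already produced by some trained autoencoder; the function names are hypothetical, and taking the maximum as an image-level score is just one simple choice.

```python
from typing import List

Image = List[List[float]]


def l2_anomaly_map(image: Image, reconstruction: Image) -> Image:
    """Per-pixel squared difference between the input and its reconstruction."""
    if len(image) != len(reconstruction) or any(
        len(a) != len(b) for a, b in zip(image, reconstruction)
    ):
        raise ValueError("image and reconstruction must have the same shape")
    return [
        [(p - q) ** 2 for p, q in zip(row_in, row_out)]
        for row_in, row_out in zip(image, reconstruction)
    ]


def image_score(anomaly_map: Image) -> float:
    """Image-level score: the largest per-pixel error (an assumed choice)."""
    return max(max(row) for row in anomaly_map)


# The autoencoder reproduces the normal background but not the bright defect pixel.
query = [[0.5, 0.5], [0.5, 1.0]]
reconstructed = [[0.5, 0.5], [0.5, 0.5]]
error_map = l2_anomaly_map(query, reconstructed)
```

Pixels the model reconstructs well get an error near zero, while the defect pixel stands out in `error_map`.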

Related Work
Methods – Features of pre-trained CNN (Feature Dictionary)
• Use feature descriptors obtained from ImageNet pre-trained CNN
• Sliding-window approach → very slow for large images
Reference: “Anomaly Detection in Nanofibrous Materials by CNN-Based Self Similarity”
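A rough sketch of why the sliding-window approach is slow, and of how a patch could be scored against a dictionary. The patch size, stride, and image size below are illustrative assumptions, and the feature vectors are toy values rather than real CNN descriptors; scoring by distance to the nearest dictionary entry is one plausible reading of the method, not a confirmed detail.

```python
import math
from typing import Iterator, Sequence, Tuple


def window_origins(
    height: int, width: int, patch: int, stride: int
) -> Iterator[Tuple[int, int]]:
    """Top-left corners of every patch visited by a sliding window."""
    for top in range(0, height - patch + 1, stride):
        for left in range(0, width - patch + 1, stride):
            yield top, left


def dictionary_distance(
    feature: Sequence[float], dictionary: Sequence[Sequence[float]]
) -> float:
    """Anomaly score of one patch: distance to its nearest dictionary entry."""
    return min(math.dist(feature, entry) for entry in dictionary)


# Each window needs its own CNN forward pass, so the cost grows with this count.
num_windows = sum(1 for _ in window_origins(256, 256, patch=16, stride=4))

# Toy 2-D "features": the dictionary holds descriptors of defect-free patches.
normal_dictionary = [[3.0, 4.0], [6.0, 8.0]]
score = dictionary_distance([0.0, 0.0], normal_dictionary)
```

Even for a 256 x 256 image, these assumed settings already give thousands of windows, which is why the approach scales poorly to large images.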

Related Work
Methods – Traditional Methods
• Use hand-crafted feature descriptors from defect-free texture images
• Gaussian Mixture Model (GMM)-based Texture Inspection Model for textured objects
• Variation Model based on shape-based matching for non-textured objects

MVTec-AD Dataset
MVTec-AD Dataset Description
• 15 categories with 3629 images for training and validation and 1725 images for testing
• The defects were generated with the aim of producing realistic anomalies as they would occur in real-world industrial inspection scenarios
[Figure: sample images, labeled OK and OK / NG]

MVTec-AD Dataset
Grid, Screw, and Zipper are grayscale (1 channel)

MVTec-AD Dataset
MVTec-AD Dataset Description
• All images were acquired using a 2048 x 2048 pixel high-resolution industrial RGB sensor
• Afterwards, the images were cropped to a suitable output size (between 700 x 700 and 1024 x 1024)
• The images were acquired under highly controlled illumination conditions. For some object classes, however, the illumination was altered intentionally to increase variability
• Pixel-precise ground-truth labels are provided for each defective image region

Benchmark
Evaluated Methods
• AnoGAN: uses the publicly available implementation by DoYup Lee (a PR-12 presenter) → www.github.com/LeeDoYup/AnoGAN-tf
• L2 and SSIM Autoencoder: same CAE architecture as in MVTec’s previous paper
• CNN Feature Dictionary: 512-dimensional avgpool-layer features of a ResNet-18 pretrained on ImageNet
• GMM-Based Texture Inspection Model / Variation Model: use the HALCON machine vision library from MVTec
[Figure: MVTec CAE]

Benchmark
Data Augmentation
• Texture images: randomly crop rotated rectangular patches of fixed size
• Object images: random translation and rotation, plus an additional random flip where possible
[Figure: augmentation examples for Texture and Object]

Benchmark
Evaluation
• The output score is a one-channel spatial map. To obtain a final segmentation result, a threshold must be determined. Defect-free validation images are used to estimate it
• An ideal threshold exists, but we do not know it in advance
[Figure panels: Original | Reconstructed | Anomaly Map]
Reference: figure by Park of the SUALAB Research Team

Benchmark
Evaluation
• For every category, define a minimum defect area that a connected component in the thresholded anomaly map must have to be classified as a defective region
• Successively segment the anomaly maps of the defect-free validation images with increasing thresholds
• Stop when the largest anomalous region on the validation set is just below the minimum defect area. The threshold that yielded this segmentation is used for further evaluation
Presenter's notes: the minimum defect area is a human prior; a connected component here probably means a blob
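The threshold search above can be sketched as follows. This is a minimal sketch, not the authors' code (none was released): the 4-connectivity, the candidate threshold grid, and the stopping rule (accept the first threshold at which the largest region over all validation maps falls below the minimum defect area) are assumptions about how the procedure is meant to work.

```python
from collections import deque
from typing import List, Optional, Sequence

Map = List[List[float]]


def largest_region_area(amap: Map, threshold: float) -> int:
    """Area of the largest 4-connected region of pixels scoring above threshold."""
    rows, cols = len(amap), len(amap[0])
    seen = [[False] * cols for _ in range(rows)]
    best = 0
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or amap[r][c] <= threshold:
                continue
            # Breadth-first flood fill over one connected component.
            seen[r][c] = True
            queue = deque([(r, c)])
            area = 0
            while queue:
                y, x = queue.popleft()
                area += 1
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (
                        0 <= ny < rows
                        and 0 <= nx < cols
                        and not seen[ny][nx]
                        and amap[ny][nx] > threshold
                    ):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            best = max(best, area)
    return best


def estimate_threshold(
    validation_maps: Sequence[Map],
    min_defect_area: int,
    candidates: Sequence[float],
) -> Optional[float]:
    """Smallest candidate at which no defect-free map contains a large enough region."""
    for t in sorted(candidates):
        if max(largest_region_area(m, t) for m in validation_maps) < min_defect_area:
            return t
    return None  # no candidate suppresses all regions on the validation maps


# Toy anomaly map of a defect-free validation image with some residual noise.
validation_map = [[0.1, 0.2, 0.1], [0.6, 0.7, 0.1], [0.1, 0.1, 0.1]]
threshold = estimate_threshold(
    [validation_map], min_defect_area=2, candidates=[0.0, 0.15, 0.3, 0.5, 0.65, 0.8]
)
```

The result depends directly on the hand-chosen `min_defect_area`, which is the human prior in this procedure.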

Benchmark
Evaluation Metric
• For classification, use accuracy (the fraction of correctly classified test images)
• For segmentation, use relative per-region overlap (IoU) and AUROC (which is independent of the threshold)
Figure legend: red line = ground truth (GT); green map = prediction
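The two segmentation metrics can be sketched on flattened pixel lists. This is a minimal sketch with assumed function names: the overlap is computed as a plain IoU over the whole mask, following the slide's wording (the paper's per-region definition may differ), and AUROC uses the pairwise-ranking definition, which is quadratic in the number of pixels and only suitable for small examples.

```python
from typing import Sequence


def iou(pred_mask: Sequence[int], gt_mask: Sequence[int]) -> float:
    """Intersection over union of two binary masks (needs a threshold)."""
    inter = sum(1 for p, g in zip(pred_mask, gt_mask) if p and g)
    union = sum(1 for p, g in zip(pred_mask, gt_mask) if p or g)
    return inter / union if union else 1.0


def auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Probability that a defective pixel outscores a normal one (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("AUROC needs both defective and normal pixels")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# IoU is computed on thresholded masks, AUROC directly on the raw scores.
overlap = iou([1, 1, 0, 0], [0, 1, 1, 0])
area_under_curve = auroc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
```

Because `auroc` never sees a threshold, it sidesteps the threshold-selection ambiguity that affects the overlap metric.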

Benchmark
Results
• Classification table: top row = OK, bottom row = NG
• Segmentation table: top row = IoU, bottom row = AUROC

Benchmark
Results
• For texture images, none of the methods emerges as a clear winner
• Autoencoder and CNN Feature Dictionary perform well

Benchmark
Results
• For object images, the autoencoders achieve the best results
• The L2 AE achieves better per-region overlap values

Discussion
AnoGAN
• GAN training tends to result in mode collapse
• AnoGAN has great difficulties with object categories for which the objects appear in various shapes or orientations in the dataset

Discussion
Autoencoder
• Both the SSIM AE and the L2 AE train stably across all dataset categories
• For some categories, however, they fail to model small details, which results in rather blurry reconstructions
• This is especially the case for high-frequency textures, which appear in tile and zipper
→ False positives!

Discussion
CNN Feature Dictionary
• As a method proposed for the detection of anomalous regions in textured surfaces, the CNN Feature Dictionary achieves satisfactory results for all textures except grid
• Its performance degrades when evaluated on object categories
→ False negatives!

Discussion
Others (Texture Inspection Model and Variation Model)
• Both good and bad results are observed
[Figure rows: Texture Inspection Model | Variation Model]

Conclusion
• Introduces the MVTec Anomaly Detection dataset (MVTec-AD), a novel dataset for unsupervised anomaly detection mimicking real-world industrial inspection scenarios
• Several methods were thoroughly evaluated on this dataset
• The evaluations provide a first benchmark on this dataset and show that there is still considerable room for improvement
→ The first real-world industrial unsupervised anomaly detection dataset!
→ The evaluation metric is somewhat ambiguous (threshold selection), and no source code was provided
