
Our Motivation


Traditional Vision System

Traditional vision systems use well-established, hand-crafted feature descriptors (SIFT, SURF, BRIEF, etc.) for downstream tasks such as image classification, object detection, and 3D vision.

traditional_edited.jpg

Data-Driven Vision Approach

Data-driven vision (Deep Learning) approaches rely entirely on images and labels, learning the underlying patterns end to end.


Performance: Deep Learning >> Traditional Vision Approach

Data driven.png

However, the performance of the DL approach depends entirely on the distribution of the data!
If the data has group shifts, the model can discriminate severely against the minor group. In other words, the model may overfit to the major groups!

black box.png

Below is an example of a group-shifted dataset, the Celebrity Faces dataset.
The label denotes the class category. In Celebrity Faces, it is male or female.
Attributes denote features of each sample that are not captured in the label space. In Celebrity Faces, an attribute could be the hair color: black or blonde.
The groups, which we call the Uncertainty Set, are the combinations of labels and attributes. In Celebrity Faces, this gives four groups of images.
Clearly, males with blonde hair are the minor group in this case.
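
As a minimal sketch of how the groups are formed, the snippet below pairs every label with every attribute value and counts the samples in each group. The toy samples and the helper name `group_of` are hypothetical and chosen only for illustration; they are not taken from the Celebrity Faces dataset.

```python
from collections import Counter
from itertools import product

LABELS = ("female", "male")        # class categories (label space)
ATTRIBUTES = ("black", "blonde")   # hair color, not part of the label space

# Every (label, attribute) combination is one group: 2 x 2 = 4 groups.
GROUPS = list(product(LABELS, ATTRIBUTES))


def group_of(label, attribute):
    """Return the index of the group that a sample belongs to."""
    return GROUPS.index((label, attribute))


# Hypothetical toy samples as (label, attribute); the counts are made up.
samples = (
    [("female", "black")] * 5
    + [("female", "blonde")] * 4
    + [("male", "black")] * 5
    + [("male", "blonde")] * 1
)

# Count samples per group and pick the group with the fewest samples.
counts = Counter(group_of(label, attr) for label, attr in samples)
minor_group = GROUPS[min(range(len(GROUPS)), key=lambda g: counts[g])]
print(minor_group)
```

With these toy counts, the smallest group is the (male, blonde) combination, mirroring the imbalance described above.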

group shifts.png

In fact, group shifts (which we also call Bias) are quite common in ML.
Here are the classification results on the Celebrity Faces dataset. The minor group is clearly dominated by the major groups and therefore receives low accuracy!

minor suffer.png

Therefore, our goal is to improve the worst-case group classification accuracy while preserving the major groups' accuracy. On the next page, we'll introduce some state-of-the-art methods and their limitations.
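
To make the worst-case group accuracy concrete, here is a minimal sketch that computes accuracy separately within each group and takes the minimum. The function names and the toy predictions are assumptions for illustration only, not results from this project.

```python
from collections import defaultdict


def group_accuracies(y_true, y_pred, groups):
    """Return a dict mapping each group to the accuracy within that group."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p, g in zip(y_true, y_pred, groups):
        total[g] += 1
        correct[g] += int(t == p)
    return {g: correct[g] / total[g] for g in total}


def worst_group_accuracy(y_true, y_pred, groups):
    """Return the lowest per-group accuracy."""
    return min(group_accuracies(y_true, y_pred, groups).values())


# Hypothetical toy data: six samples from a major group, two from a minor one.
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 0, 1, 1, 1, 0]
groups = ["major"] * 6 + ["minor"] * 2

average = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
worst = worst_group_accuracy(y_true, y_pred, groups)
print(average, worst)
```

In this toy case the average accuracy looks high while the minor group's accuracy is much lower, which is why the worst-case group accuracy is the quantity to improve.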
