Computer Vision Seminar

On Clothing Object Detection from Images and Videos

Joy Tang (CEO) and Suren Kumar (CSO), Markable

This is a "Tech Talk". Markable representatives will be available for discussions about jobs and hiring after the talk.
Clothing product detection from images and videos paves the way for visual fashion understanding. Clothing detection is an important step for retrieving similar clothing items, organizing fashion photos, creating artificial-intelligence-powered shopping assistants, and automatically labeling large catalogues. Training a deep-learning-based clothing detector requires pre-defined categories (dresses, pants, etc.) and a high volume of annotated image data for each category. However, fashion evolves, and new categories are constantly introduced in the marketplace. For example, consider jeggings, a combination of jeans and leggings. Retraining a network to handle the jeggings category would require adding annotated data specific to that class and subsequently relearning the weights of the deep network. In this talk, we demonstrate a method that can handle novel category detection without the need to obtain new labeled data or retrain the network. Our approach learns the visual similarities between various clothing categories and predicts a hierarchical tree for the categories. The resulting framework significantly improves the detector's ability to generalize to novel clothing products.

This talk will also explore Markable's work on exploiting similarity between multiple detections to form clothing product tracklets across a video.

Markable is currently hiring for the following positions: training data specialists and computer vision interns.


Joy Tang, CEO of Markable:

Joy Tang is a former high-frequency algorithmic trader turned computer vision entrepreneur. She left a seven-figure salary in high-frequency trading to create Markable, the leader in fashion image recognition. Originally from China, Joy attended MIT on a full scholarship, with majors in math and economics. She also holds a Gold Medal in Math Olympics in China and as a teen was the anchor of a Chinese television program. She understood how to build algorithms from her work in high-frequency trading, and upon noticing a huge gap in the market between e-commerce and digital content, she was inspired to develop the AI technology to make all actors' and actresses' wardrobes immediately shoppable.

Suren Kumar, CSO of Markable:

Suren currently serves as Chief Science Officer at Markable, where he leads the computer vision effort to extract semantic content from visual data. Previously, he was a research fellow at the University of Michigan – Ann Arbor, where he tackled challenges in behaviour modelling and prediction as applied to autonomous driving. During his PhD, he worked on representation, modelling and estimation of articulated bodies from sensor data using concepts from computer vision, machine learning and robotics. While interning at Mitsubishi Electric Research Labs, he invented a patented system to track humans using RGB and infrared cameras. He also worked on a Toyota-funded project for which he developed methods for long-term prediction of vehicles on the highway. He has over 100 citations in Google Scholar; for more details about his publications, please see his Google Scholar profile.

Faculty Host

Jason Corso