REC-MV: REconstructing 3D Dynamic Cloth
from Monocular Videos
CVPR 2023



Reconstructing dynamic 3D garment surfaces with open boundaries from monocular videos is an important problem, as it provides a practical and low-cost solution for clothes digitization. Recent neural rendering methods achieve high-quality dynamic clothed human reconstruction from monocular video, but they cannot separate the garment surface from the body. Moreover, although existing garment reconstruction methods based on feature curve representations demonstrate impressive results from a single image, they struggle to generate temporally consistent surfaces for video input. To address these limitations, we formulate the task as an optimization problem of 3D garment feature curves and surface reconstruction from monocular video. We introduce a novel approach, called REC-MV, to jointly optimize the explicit feature curves and the implicit signed distance field (SDF) of the garments. The open garment meshes can then be extracted via garment template registration in the canonical space. Experiments on multiple casually captured datasets show that our approach outperforms existing methods and can produce high-quality dynamic garment surfaces.



Given a monocular video with N_i frames depicting a moving person, {I_t | t = 1, ..., N_i}, REC-MV aims to reconstruct high-fidelity and space-time coherent open garment meshes. This is a challenging problem, as it requires a method to simultaneously capture the shape contours, local surface details, and motion of the garment. Observing that feature curves (e.g., necklines, hemlines) provide critical cues for determining the shape contours of a garment, and that an implicit signed distance function (SDF) can represent a detailed closed surface well, we propose to first optimize the explicit 3D feature curves and implicit garment surfaces from the video, and then apply non-rigid clothing template registration to extract the open garment meshes.
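To make the "closed surface" property of an SDF concrete, here is a minimal sketch using an analytic sphere in place of a learned garment SDF. The function name `sphere_sdf`, the unit sphere, and the sign convention (negative inside, positive outside) are assumptions for illustration only; they are not taken from REC-MV.

```python
import math

def sphere_sdf(point, center=(0.0, 0.0, 0.0), radius=1.0):
    """Signed distance from `point` to a sphere.

    Assumed convention: negative inside, zero on the surface, positive outside.
    """
    return math.dist(point, center) - radius

# Sample one point inside, one on, and one outside the surface.
inside = sphere_sdf((0.0, 0.0, 0.0))
on_surface = sphere_sdf((0.0, 1.0, 0.0))
outside = sphere_sdf((2.0, 0.0, 0.0))
print(inside, on_surface, outside)
```

The zero level set of such a function always encloses a volume, so on its own it cannot express an open boundary such as a neckline or hemline. This is one way to read why the text pairs the implicit surface with explicit feature curves.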


Our method can be divided into four parts:
(a) Starting from a surface template, we initialize the canonical curves by solving Eq. (3), and apply a handle-based deformation to initialize the canonical implicit surface.
(b) Given the i-th frame, the canonical curves are deformed to the camera view space to compute the projection loss based on the surface-aware visibility estimation.
(c) Similarly, the canonical surface is deformed to the camera view to compute the photometric loss by differentiable rendering. The curves and surface are jointly optimized to enable a progressive co-evolution.
(d) Last, the open garment meshes can be extracted by template registration in the canonical space.
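Step (b) can be sketched in a few lines as a toy example. Everything here is a simplifying assumption rather than the REC-MV implementation: the "deformation" is a plain translation (`translate`), the camera is an ideal pinhole (`project`), visibility is a hand-set list of flags instead of a surface-aware estimate, and each 3D curve point is assumed to have a known 2D target.

```python
def translate(points, offset):
    """Stand-in deformation: shift every canonical point by one offset."""
    return [tuple(p + o for p, o in zip(pt, offset)) for pt in points]

def project(point, focal=1.0):
    """Pinhole projection of a camera-space point (x, y, z) with z > 0."""
    x, y, z = point
    return (focal * x / z, focal * y / z)

def projection_loss(curve_3d, target_2d, visible, focal=1.0):
    """Mean squared 2D distance, counted over visible curve points only."""
    total, count = 0.0, 0
    for pt, tgt, vis in zip(curve_3d, target_2d, visible):
        if not vis:
            continue  # occluded points contribute nothing to the loss
        u, v = project(pt, focal)
        total += (u - tgt[0]) ** 2 + (v - tgt[1]) ** 2
        count += 1
    return total / count if count else 0.0

# Hypothetical canonical curve, moved two units in front of the camera.
canonical_curve = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
posed_curve = translate(canonical_curve, (0.0, 0.0, 2.0))

# Hypothetical 2D curve annotations; the third one disagrees slightly.
target_2d = [(0.0, 0.0), (0.5, 0.0), (0.0, 0.6)]

loss_all = projection_loss(posed_curve, target_2d, [True, True, True])
loss_masked = projection_loss(posed_curve, target_2d, [True, True, False])
print(loss_all, loss_masked)
```

Masking the third point removes its error from the loss, which illustrates why a visibility estimate matters: an occluded curve point should not be pulled toward a 2D observation it does not correspond to.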

Qualitative Results


Qualitative comparison on real datasets between BCNet, ClothWild, ReEF, and our method


Reconstruction results on a large pose sequence


Dynamic garment reconstruction results of our method



The website template was borrowed from Michaël Gharbi.