Deep learning models for handling specularity in face image intrinsic decomposition | |
Author | Sirisilp Kongsilp |
Subject(s) | Machine learning;Human face recognition (Computer science) |
Note | A dissertation submitted in partial fulfillment of the requirements for the Degree of Doctor of Philosophy in Computer Science |
Publisher | Asian Institute of Technology |
Abstract | The human face plays a very important role in this world. The face is the key by which we can recognize people. When we see a person, his or her face is apparently represented in our brains in a very robust form that enables later recognition under a variety of conditions. Clearly, capturing face images, storing robust representations of those faces, and identifying them is a very important task for the human brain. If computer systems are to match human performance, they must capture human face images, process them, and store them in memory in such a way that they can be used later for re-identifying people. However, it is not simple to build such a system. Sometimes, faces may not be captured properly, due to high specularity or low illumination. If so, we may not recognize them accurately later. If we could convert low-quality images to high-quality images prior to storage, we would be better prepared to recognize them accurately later in applications such as face detection or recognition. To perform this conversion through machine learning models, we need real-world training datasets that contain faces corrupted by specularity, along with ground truth diffuse faces. Unfortunately, today, there is no real-world face dataset that contains faces with specularity and corresponding ground truth diffuse images. Therefore, in this dissertation, one of the focal points is creating datasets that contain faces with specularity and corresponding ground truth diffuse, reflectance/albedo, and shading images. I have created three datasets. The Spec-Face dataset consists of face images with specularity and corresponding ground truth diffuse images. Spec-Face-IID consists of face images with specularity and corresponding ground truth reflectance and shading images. Synth-Spec consists of synthetic images of simple rendered objects with corresponding ground truth specularity, diffuse, reflectance, and shading images.
Another focal point of the dissertation is the design of a pipeline that first removes specularity from faces and then splits them into reflectance and shading components. For specularity removal, I develop an optimized intensity-based method and compare its results with those of state-of-the-art methods. The optimized method produces better results than existing state-of-the-art methods. I also introduce two deep learning models, namely Spec-Net and Spec-CGAN, to remove specularity from face images. Spec-CGAN is the first use of GANs for removing specularity from images, specifically face images. I test different variations of the GAN approach, namely Simple-GAN, Spec-CGAN, Cycle-GAN-SSIM, Cycle-GAN-and-SSIM, and Cycle-GAN-Per-loss. I find that Spec-CGAN performs better than the other variations, so I consider it for further evaluation. I conclude from the experimental results that Spec-Net and Spec-CGAN produce better results than state-of-the-art methods, both qualitatively and quantitatively, on Spec-Face and on sample images from the LFW dataset. The new pipeline also produces better specular-free reflectance results for the Synth-Spec and Spec-Face-IID datasets. The knowledge obtained from this research may help in many computer vision applications, such as face detection, recognition, matching, image content editing, edge detection, color constancy or segmentation, and photometric stereo. I apply Spec-CGAN to the face recognition problem on LFW test data using a model pre-trained on VGGFace2. This increases the recognition rate by 1.007% compared to using the original LFW test data. In the future, I plan to apply the knowledge gained from the research to practical applications such as face detection, recognition on other benchmarks, and segmentation, or in other fields such as dermatology (skin cancer diagnosis, for example). Previous work on illumination robustness has exploited many images of the same object under different lighting conditions.
In a similar way, I plan to create datasets containing faces under many different lighting conditions. This will enable fine-tuning or re-training of current state-of-the-art models. The resulting models will improve our ability to remove specularity robustly from faces and may help computer vision applications in which diffuse images are required for further decomposition, such as into reflectance and shading components. |
Year | 2020 |
Type | Dissertation |
School | School of Engineering and Technology (SET) |
Department | Department of Information and Communications Technologies (DICT) |
Academic Program/FoS | Computer Science (CS) |
Chairperson(s) | Dailey, Matthew N.; |
Examination Committee(s) | Mongkol Ekpanyapong;Chutiporn Anutariya; |
Scholarship Donor(s) | Shaheed Benazir Bhutto University, Sheringal, Dir(U), KP, Pakistan;Asian Institute of Technology Fellowship |
Degree | Thesis (Ph.D.) - Asian Institute of Technology, 2020 |