|Description (include details on usage, files and paper references)||This is the largest real-world face clustering dataset. We used it in our AAAI 2018 paper "Merge or Not? Learning to Group Faces via Imitation Learning". We collected 60 real users’ albums, with permission, from a Chinese social network portal. Album size varies from 120 to 3,600 faces, with at most 321 identities in a single album. In total, the dataset contains 84,200 images with 78,000 faces of 3,132 different identities. We annotated all detections with identity/noise labels. The images are unconstrained and were taken in various indoor and outdoor scenes. Faces appear in natural poses with spontaneous expressions. In addition, faces can be severely occluded, motion-blurred, or illuminated differently from scene to scene.