Observation:
I randomly checked about 1000 images from cc_lmdb, and it turns out that about 20% of them are the word "alamy".
Questions/Concerns:
1. Is this expected behavior for cc_lmdb?
2. Could this skew model training?
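
For anyone who wants to reproduce the spot check from the observation, here is a minimal sketch of the counting step. It assumes the labels have already been read out of cc_lmdb into a plain Python list; how to do that depends on the key layout of the LMDB, which is not described here, so that part is left out. The function name `estimate_label_share` and its parameters are hypothetical and only for illustration.

```python
import random
from collections import Counter


def estimate_label_share(labels, target, sample_size=1000, seed=0):
    """Estimate the fraction of labels equal to `target` from a random sample.

    `labels` is assumed to be a list of label strings already loaded from
    the dataset (hypothetical input, loading from LMDB is not shown).
    """
    rng = random.Random(seed)
    # Never ask for more samples than there are labels.
    k = min(sample_size, len(labels))
    if k == 0:
        return 0.0
    sample = rng.sample(labels, k)
    # Compare case-insensitively and ignore surrounding whitespace.
    counts = Counter(label.strip().lower() for label in sample)
    return counts[target.strip().lower()] / k
```

Calling `estimate_label_share(labels, "alamy")` returns a value between 0 and 1; a result near 0.2 would match the roughly 20% seen in the manual check. Changing `seed` and re-running gives a rough idea of how stable the estimate is.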