NVIDIA recently launched a new website that lets anyone test the power of artificial intelligence (AI) applied to the generation of portraits of people, backed by the considerable power of its GPUs (graphics processing units). To build it, NVIDIA fed its AI system thousands of pictures of real people taken from Instagram. Many people said that NVIDIA overstepped its rights, because nobody was asked for permission to use their pictures. NVIDIA replied that Instagram's terms of use, accepted by every user of the platform, imply that users give up their rights to the pictures they upload. That is not today's subject, but it was worth mentioning.
In this tutorial, we will see a simple way to automatically download generated portraits from the NVIDIA website https://thispersondoesnotexist.com/ while following a polite crawling policy. In other words, we introduce a crawl delay so as not to harm the website. We decline any responsibility for how these scripts are used, whether or not you modify them, and in particular if you ignore the basic rules of web crawling, which include not disturbing web servers and respecting work that has been kindly made available to everyone. We found that a crawl delay of 2 seconds between fetches is a good compromise: it does not harm the remote website and, more importantly, generating a picture seems to take more than one second, so crawling too fast apparently results in duplicate pictures.
If you already run a Linux operating system such as Ubuntu or Debian, you almost certainly have all the tools needed to fetch the remote pictures. We used the wget command-line tool, but the work could also be done with curl, for instance.
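As a rough sketch of the curl alternative mentioned above, a single fetch could look like the following. The function name fetch_with_curl and its default user agent are hypothetical and are not part of the script presented in this tutorial; only standard curl options are used.

```shell
#!/bin/bash
# Hypothetical helper: fetch one picture with curl instead of wget.
# Usage: fetch_with_curl URL DESTINATION [USER_AGENT]
fetch_with_curl() {
    local url="$1"
    local dest="$2"
    local user_agent="${3:-Mozilla/5.0}"   # assumed default, change as needed

    # --fail        : return a non-zero status on HTTP errors (4xx/5xx)
    # --silent      : hide the progress meter
    # --show-error  : still print error messages
    # --location    : follow redirects
    # --user-agent  : send the given User-Agent string
    # --output      : write the response body to the destination file
    curl --fail --silent --show-error --location \
         --user-agent "$user_agent" \
         --output "$dest" \
         "$url"
}
```

A call such as fetch_with_curl "https://thispersondoesnotexist.com/image" portrait.jpg, followed by sleep 2, would then play the same role as one iteration of a wget-based loop.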
If wget is not installed on your Debian-based system (e.g. Ubuntu or Debian), install it this way:
foo@bar:~$ sudo apt-get install wget
For other Linux-based operating systems, refer to the distribution's documentation for the proper package installation command.
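To illustrate what this looks like on other distributions, here is a small sketch that prints the usual wget install command for a few common package managers. The function names are hypothetical, the list is not exhaustive, and the presence of sudo is an assumption; your distribution's documentation remains the reference.

```shell
#!/bin/bash
# Hypothetical helper: print the usual wget install command
# for a given package manager name.
install_command_for() {
    case "$1" in
        apt-get) echo "sudo apt-get install wget" ;;   # Debian, Ubuntu
        dnf)     echo "sudo dnf install wget" ;;       # Fedora, recent RHEL
        yum)     echo "sudo yum install wget" ;;       # older RHEL, CentOS
        pacman)  echo "sudo pacman -S wget" ;;         # Arch Linux
        zypper)  echo "sudo zypper install wget" ;;    # openSUSE
        apk)     echo "sudo apk add wget" ;;           # Alpine (assumes sudo is present)
        *)       return 1 ;;                           # unknown package manager
    esac
}

# Print the install command for the first package manager found in PATH.
suggest_wget_install() {
    local pm
    for pm in apt-get dnf yum pacman zypper apk; do
        if command -v "$pm" > /dev/null 2>&1; then
            install_command_for "$pm"
            return 0
        fi
    done
    echo "No known package manager found; check your distribution's documentation." >&2
    return 1
}
```

Calling suggest_wget_install only prints a suggestion; it deliberately does not install anything by itself.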
Here is the script we wrote to do the hard work:
#!/bin/bash
#---------------------------------------------------------------------------------------------
#PARAMETERS THAT CAN BE CHANGED TO SUIT YOUR NEEDS
#---------------------------------------------------------------------------------------------
OUTPUT_DIR="$PWD/output" #DEFAULT storage path of downloaded portraits
WAIT_TIME_SECONDS=2 #Number of seconds to wait between each fetching of pictures
MAX_DOWNLOADS=20 #Work will stop after having downloaded MAX_DOWNLOADS pictures
FILENAME_PREFIX="thispersondoesnotexist"
#---------------------------------------------------------------------------------------------
#---------------------------------------------------------------------------------------------
#OTHER PARAMETERS
#---------------------------------------------------------------------------------------------
USER_AGENT="Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.12) Gecko/20101026 Firefox/3.6.12"
URL="https://thispersondoesnotexist.com/image"
#---------------------------------------------------------------------------------------------
DOWNLOAD_TOOL="wget"
echo "Check if ${DOWNLOAD_TOOL} is installed"
# command -v works on any distribution, whereas dpkg only exists on Debian-based systems
if command -v "${DOWNLOAD_TOOL}" &> /dev/null; then
    echo "${DOWNLOAD_TOOL} is installed!"
else
    echo "${DOWNLOAD_TOOL} is NOT installed and is required!"
    exit 1
fi
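# Each run stores its pictures in its own sub-directory, named after the process ID ($$).
# mkdir -p also creates OUTPUT_DIR itself when it does not exist yet.
mkdir -p "$OUTPUT_DIR/extract.$$"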
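for (( c=1; c<=$MAX_DOWNLOADS; c++ ))
do
    fullfilename="$OUTPUT_DIR/extract.$$/${FILENAME_PREFIX}.$c.jpg"
    # --user-agent expects the bare agent string, without a "User-Agent:" prefix
    wget --user-agent="${USER_AGENT}" "${URL}" -O "$fullfilename"
    # Polite crawl delay between two fetches
    sleep "${WAIT_TIME_SECONDS}"
done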
echo "Extraction finished."
Here is an excerpt of the log produced by running the script:
The following screenshot shows the output directory, containing the downloaded files.
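Since crawling too fast apparently yields duplicate pictures, it can be useful to check the output directory for identical files. The following is a minimal sketch that compares SHA-256 checksums; the function name list_duplicates is hypothetical, and the sketch assumes the sha256sum tool from GNU coreutils and file names without newlines or backslashes.

```shell
#!/bin/bash
# Hypothetical helper: report .jpg files whose content is identical
# to a file listed earlier in the same directory.
# Usage: list_duplicates DIRECTORY
list_duplicates() {
    local dir="$1"

    # sha256sum prints one line per file: a 64-character checksum,
    # two separator characters, then the file name (from column 67).
    sha256sum "$dir"/*.jpg 2> /dev/null | awk '
        {
            sum  = $1
            name = substr($0, 67)
            if (sum in first)
                print name " duplicates " first[sum]
            else
                first[sum] = name   # remember the first file with this checksum
        }'
}
```

Running list_duplicates on one of the extract sub-directories prints one line per duplicate found, and nothing when all pictures are distinct.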
Anyone can find a good use for these pictures. For instance, they make a good basis for artificial intelligence work such as developing and testing facial recognition. They can also be used when building and delivering web template designs, which almost always include a testimonials page or a page presenting each member of a team. If you plan to use them in a commercial project, however, check the licensing model chosen by NVIDIA for the use and redistribution of its generated pictures.