Flip images in multiple directories for data augmentation with no effort

When preparing a dataset for training a neural network, you want to get as much out of that dataset as possible to increase your network's accuracy.
Many papers report that data augmentation improves a network's recognition ability.
Two common augmentation methods are flipping the images horizontally and rescaling them, all while keeping the original images untouched.


Picture from the video of Oppenheimer's famous words.



Assuming you are using Linux or macOS, first you will need to install GraphicsMagick. On Debian or Ubuntu this is super simple:
    sudo apt-get install graphicsmagick
On macOS, a package manager such as Homebrew provides the same package.
We will use the mogrify command from GraphicsMagick (gm mogrify) to flip and resize images.
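Before running anything that modifies images, it can help to confirm that the gm executable is actually on the PATH. The following is a minimal sketch; the helper name graphicsmagick_version is hypothetical and not part of the original script.

```python
import shutil
import subprocess


def graphicsmagick_version(executable="gm"):
    """Return the first line printed by `gm version`, or None if gm is not found.

    `graphicsmagick_version` is a hypothetical helper used only for illustration.
    """
    path = shutil.which(executable)
    if path is None:
        # The executable is not on the PATH, so mogrify cannot be used yet.
        return None
    result = subprocess.run([path, "version"], capture_output=True, text=True)
    lines = result.stdout.splitlines()
    return lines[0] if lines else ""


if __name__ == "__main__":
    version = graphicsmagick_version()
    if version is None:
        print("GraphicsMagick not found, install it first")
    else:
        print(version)
```

If the function returns None, install GraphicsMagick before calling prepare_dataset.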

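import random
import subprocess


def prepare_dataset():
    # Parent directories; each one contains a train/ and a validate/ folder.
    DIR_PARENTS = ['dir1/',
                   'dir2/',
                   'dir3/']
    DIR_TRAIN = 'train/'
    DIR_VALIDATE = 'validate/'
    for DIR_PARENT in DIR_PARENTS:
        # One random size per parent directory, chosen when the list is built.
        size = '{}x{}'.format(random.randint(360, 640), random.randint(360, 640))
        # Each entry is [directory to run the command in, shell command].
        # ${file%.} only strips a trailing dot, so ordinary file names are kept as they are.
        bashCommands = [
            # Back up the training originals, flip in place, rename, restore the originals.
            [DIR_PARENT, 'mkdir orig'],
            [DIR_PARENT, 'cp -r ' + DIR_TRAIN + '. ' + 'orig'],
            [DIR_PARENT + DIR_TRAIN, 'gm mogrify -flop *'],
            [DIR_PARENT + DIR_TRAIN, 'for file in *; do mv "$file" "flipped_${file%.}" ; done'],
            [DIR_PARENT, 'mv orig/* ' + DIR_TRAIN + '.'],
            # Same flip for the validation images.
            [DIR_PARENT, 'cp -r ' + DIR_VALIDATE + '. ' + 'orig'],
            [DIR_PARENT + DIR_VALIDATE, 'gm mogrify -flop *'],
            [DIR_PARENT + DIR_VALIDATE, 'for file in *; do mv "$file" "flipped_${file%.}" ; done'],
            # Copy the originals back next to the flipped files, then resize.
            # The * glob also matches the flipped_ files at this point.
            [DIR_PARENT, 'cp -r ' + 'orig/. ' + DIR_VALIDATE],
            [DIR_PARENT + DIR_VALIDATE, 'gm mogrify -resize ' + size + ' *'],
            [DIR_PARENT + DIR_VALIDATE, 'for file in *; do mv "$file" "resized_${file%.}" ; done'],
            # Restore the untouched originals and remove the backup folder.
            [DIR_PARENT, 'mv orig/* ' + DIR_VALIDATE + '.'],
            [DIR_PARENT, 'rm -R orig'],
        ]

        # Run the commands one after the other, each in its own directory.
        for command in bashCommands:
            process = subprocess.Popen(command[1], stdout=subprocess.PIPE, cwd=command[0], shell=True)
            output, error = process.communicate()
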
DIR_PARENTS lists all the directories that contain the images.
I had two folders inside each directory, each containing images of its own: DIR_TRAIN and DIR_VALIDATE.
I then construct a list of lists holding all the commands relevant to a directory, so they can be executed in one go.
Each inner list holds the directory in which to execute the command, followed by the command itself.
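The directory-plus-command pairs described above can be sketched as a small helper that builds the list for one sub-folder and prints it as a dry run, without touching any files. The names Step and flip_steps are hypothetical, and the rename loop is simplified to "flipped_$file".

```python
from collections import namedtuple

# Hypothetical record type: where to run a command, and the command itself.
Step = namedtuple("Step", ["cwd", "command"])


def flip_steps(parent, subdir, backup="orig"):
    """Build the (cwd, command) pairs that flip every image in parent/subdir."""
    target = parent + subdir
    return [
        Step(parent, "mkdir " + backup),                   # backup folder
        Step(parent, "cp -r " + subdir + ". " + backup),   # copy the originals
        Step(target, "gm mogrify -flop *"),                # flip in place
        Step(target, 'for file in *; do mv "$file" "flipped_$file"; done'),
        Step(parent, "mv " + backup + "/* " + subdir + "."),  # restore originals
        Step(parent, "rm -R " + backup),                   # remove the backup
    ]


# Dry run: show what would be executed and where, without running anything.
for step in flip_steps("dir1/", "train/"):
    print(step.cwd, "->", step.command)
```

Printing the list first is a cheap way to check the paths before any image is overwritten.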
First we make a folder named orig (mkdir orig) and copy all the images inside DIR_TRAIN into it recursively.
Then, starting the shell inside the training directory (DIR_PARENT + DIR_TRAIN), we flip them all horizontally with gm mogrify -flop *. Note that mogrify replaces the original images automatically, which is why we made the copy first.
Next we rename the flipped images by putting flipped_ before the original name of each file: for file in *; do mv "$file" "flipped_${file%.}" ; done
We then move the originals from orig back into DIR_TRAIN and repeat the same flip for DIR_VALIDATE.
For the resize, which the listing applies to DIR_VALIDATE, we copy the original images back into the directory again and run gm mogrify -resize {}x{} *, where the width and height are random values between 360 and 640 (random.randint(360,640), random.randint(360,640)). The size is drawn once per parent directory, when the command list is built, and because * matches every file, the flipped_ copies in that directory are resized as well.
Next we rename the resized images to distinguish them from the flipped and the original images: for file in *; do mv "$file" "resized_${file%.}" ; done
Then we empty the orig folder by moving all the original images inside back to their directory, and remove the folder itself.
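The renaming steps above use a shell loop. As a rough Python equivalent, here is a minimal sketch using pathlib; the helper name add_prefix is hypothetical. Unlike the shell loop, it skips files that already carry the prefix, an added assumption so that running it twice does not produce names like flipped_flipped_.

```python
from pathlib import Path


def add_prefix(directory, prefix):
    """Rename every visible regular file in `directory` to prefix + name.

    Hypothetical helper: hidden files are skipped (like the shell's * glob),
    and so are files that already start with `prefix`.
    Returns the new file names in sorted order.
    """
    renamed = []
    # sorted() materialises the listing before any file is renamed.
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue
        if path.name.startswith(".") or path.name.startswith(prefix):
            continue
        target = path.with_name(prefix + path.name)
        path.rename(target)
        renamed.append(target.name)
    return renamed
```

For example, add_prefix('dir1/train/', 'flipped_') would play the role of the first rename loop.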
And last we iterate over the created commands. For each one we open a process with subprocess.Popen, passing the command itself (command[1]), the directory to execute it in (cwd = command[0]), and shell=True, which spawns a shell so that the file globs get expanded. Calling communicate() then waits for the command to finish before the next one starts.
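The loop above ignores failures, so a mistyped path would go unnoticed while later commands keep running. Here is a minimal sketch of a stricter runner, assuming a POSIX shell; the name run_steps is hypothetical, and it uses subprocess.run instead of the Popen call in the original script.

```python
import subprocess


def run_steps(steps):
    """Run (cwd, command) pairs in order and stop at the first failure.

    Hypothetical helper: returns the captured stdout of each command,
    and raises RuntimeError with the captured stderr if a command fails.
    """
    outputs = []
    for cwd, command in steps:
        result = subprocess.run(command, cwd=cwd, shell=True,
                                capture_output=True, text=True)
        if result.returncode != 0:
            raise RuntimeError("{!r} failed in {!r}: {}".format(
                command, cwd, result.stderr.strip()))
        outputs.append(result.stdout)
    return outputs
```

Stopping early matters here because the later steps delete the orig backup; if the copy step failed, continuing could lose the original images.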

