December 2019 - January 2020
Visual Studio Code | Microsoft Excel
The brief of this project was to write a script in Python which could fill spaces on a grid with tetrominoes (Tetris) pieces, as quickly and completely as possible. The grid size could be up to 1000 x 1000 squares large, with a certain area of the grid defined as a target to be covered. The grid was represented as a two dimensional array, with the squares to be covered being 1 and squares to leave blank being 0.
Key algorithm developments made throughout the project
Initial greedy algorithm
My basic greedy algorithm worked through the grid left to right, top to bottom and placed the first piece that fitted the target area as it scans. Initially, accuracy was 60-80% on average, depending of the grid size and target area density.
Shape rank optimisation
I realised that using this left to right top to bottom scan pattern meant that it was better to place some pieces before others. For example, it would be better to place a right angled piece with its corner in the top left than the bottom right, as the latter piece might now block off some squares to it's left. Using this observation, I redefined the origin of each shape to be the top left-most point. I set the algorithm to check through the shapes in a predetermined order, starting with those occupying more space in the upper left. This increased the potential coverage accuracy to about 90% with some cases still solving with about 70% accuracy.
Edge case optimisation
So far I have only placed pieces which fitted completely in the target area. However, the conditions of the problem stated that accuracy should be lost for every target square left unfilled and every square filled outside of the target. This meant it was beneficial to place a piece even if one of its four squares was outside the target, as not placing would leave three of the target squares unfilled, losing more accuracy. Therefore I wrote functions to test pieces that fitted perfectly in the target, protruded by one square, and protruded by two square. I prioritised number of target squares covered and then how highly the shape ranked in my predetermined list to determine which shape should be placed.
I then ran three tests at different sizes and accuracies to judge the strengths and weaknesses of my algorithm. I found above around 50x50, performance was consistent with lows of 84 in middle densities. Smaller grids were still an issue and inconsistent.
Another optimisation was implemented to solve the grid upside down, right to left, and all other combinations of orientations. I found that for smaller grids especially it could make a difference which way the grid was scanned. My method involved making transformed copies of the grid and solving them, then reversing the transform to obtain the original grid, albeit solved differently. While this didn't increase the maximum accuracy, it did greatly improve the consistency of results, ensuring no solve was worse than 80%. Above a certain size the orientation became negligible.
From the optimisations made, I was now solving each grid 12 times and choosing the best result (all orientations x all edge protrusion values). This was necessary for smaller grids to find the best solve and could be done in milliseconds, but for larger grids was unnecessary as found in the assessment. I implemented a check stage where the grid size and density was measured. Following further tests, I defined threshold values at which to stop checking multiple orientations, and stop checking multiple edge protrusion values.
Despite these optimisations, small grids were still unpredictable, sometimes solved perfectly other times with <70% accuracy. Therefore my final implementation was a backtracker, designed to perfectly solve small grids. The principle was to still use the ranked list of shapes to check through and only place ones with no protrusion. When the solver had finished, the grid was checked, if not 100% covered, the most recently placed piece was removed and the next piece in the list tried there. If no pieces work, then the next most recently placed piece removed, and so on, working back and forth through the grid until 100% filled. To enable the solver to progress and not loop through the same possibilities repeatedly, a cache was implemented to store failed grid layouts.
The backtracker was optimised one last time with the addition of a breadth-first-search method based on a geometric observation. If the scanner stepped past a target square in the grid without placing a shape then a mistake must already have been made, as there would be no left target space in a 100% covered grid. With this check running at each step, as soon as a target square was left blank the backtracking method was triggered, saving the time from running all the way to the end of the grid.
Greedy / Backtracking selector
After optimisation, I ran thousands of tests of different grid sizes and densities to find the worst cases my backtracking algorithm could still solve in a reasonable time. I used excel to plot maximum grid size against grid density, which showed the largest grid solvable for each density. A curve of best fit could be found along with the equation. The equation was then used in my code to determine when to use the greedy algorithm and when to use the backtracker. This method proved to be reliable, and completed my Python solution.
My final algorithm delivered perfect covers up to the grid size / density cut off point, and from there on solved from 88-96% accuracy, with time never exceeding 10 seconds.