Preface

An appreciated facet of my life is the variety of happenstance connections I've made with a wide range of individuals. Despite having studied computer science for so long, most of my friendships and acquaintances were formed outside of the domain. This has been a strong theme in many of my writings, which often revolve around experiencing moments with this ever-growing, assorted group of individuals. During my graduate years, extra effort was spent casting a wider net.

For a time, the casting of the social net landed upon the university's bouldering wall. One thing that impressed me about the wall was the people who attended it. If one were to walk through the recreation center, a glance to the right would show isolated individuals trying their best to ignore each other. This awkwardness would persist while they took silent turns at equipment made for individual use. A glance to the left was a stark contrast: a climbing wall with a vibrant splay of holds and groups of people accompanied by a positive murmur of discussion. Highlights of discussion included group strategy for approaching a given route and cheerful encouragement of someone who was about to finish one.

Discussion would typically occur amongst pockets of individuals who were college aged - early to mid 20s. Fortunately, this wasn't a rule. Being a non-traditional student in my early 30s, I found it easy to integrate. Considering social norms, I could understand how it would be a bit daunting to step into - especially if one were to look at the opposite side of the rec center. For this reason, I was sure to put forth effort to pass along the ease with which I had experienced the social perks of the climbing wall.

One individual met through these circumstances seemed to be from a generation or two beyond me: an older gentleman who was learning the art of bouldering whilst enjoying the benefits of both the body-weight exercise and the associated skills of body awareness. Conversation was spurred by discussing how to climb a particular route.

Turns out this individual was a physics professor. Continued discussion would reveal that he was looking for a software engineer to make a user interface for an image processing script he had finished making. Providence would have it that I'm a computer scientist with a good amount of experience in both software development and user interface implementation.

The script would reprocess an image to one that had the appearance of being produced using the aquatint printmaking technique. The product would be the same image with a more old-time aesthetic to it; the image would be more grainy with a warm temperature reminiscent of early print making.

The professor's script, through the web app I would end up building, would be used to produce three pieces of building art for the Physics Research Center at the University of Chicago. It's fitting that a serendipitous connection at a climbing wall would manifest on a wall at an educational institution elsewhere.

The main body of this page discusses three primary facets of the web app that I find interesting: handling of the original script, upload security, and supplying feedback to a user.

To use the web app, follow this link: Aquatint Image Processor


Web Development: Aquatint Image Processor

Interpretation of the original Python script

It seems to be typical for other academic domains to produce code within a Jupyter notebook. The advantage here is that the output of running code is interspersed within the script itself, rather than being produced in a separate environment (such as stdout or some log file). The contrast makes sense given that a computer scientist is moulded to think like a machine - one that is able to easily parse through such output.

An advantage to being provided a script within a Jupyter notebook is that it's easier to discern the sections the developer finds important. The script received from Professor Meurice took advantage of matplotlib to display the reprocessed images. Within the notebook, each significant step was capped by a display of the image as processed so far. For example, the first significant step of the algorithm was to apply a grey-scale conversion to each individual pixel. A given image would be read in using the imageio library then processed as such:

im2 = imageio.imread(filename)
Nix=im2.shape[0]
Niy=im2.shape[1]
grayimage=np.zeros([Nix,Niy])
for i in range(0,Nix):
        for j in range(0,Niy):
            blueComponent = im2[i][j][0]
            greenComponent = im2[i][j][1]
            redComponent = im2[i][j][2]
            grayValue = 0.07 * blueComponent + 0.72 * greenComponent + 0.21 * redComponent
            grayimage[i][j] = grayValue
            pass
dsqin=1-grayimage/255.0
hsimage=plt.imshow(dsqin,cmap='Greys',aspect=1,interpolation='none')
plt.colorbar(hsimage)
plt.show(hsimage)
                            

An import of matplotlib.pyplot as plt preceded this block. Knowing this, take note of the usage of pyplot's show method near the end of the code snippet.
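The arithmetic inside that loop can also be read in isolation. The following is a minimal sketch, not the professor's code: a pure-Python helper that applies the same three weights by channel index and the same inversion to a 0 to 1 darkness value. The name `darkness` is hypothetical, and whether index 0 holds blue or red depends on the channel order returned by the image library, which is worth confirming for the library in use.

```python
def darkness(pixel):
    """Return a value in [0, 1] where 1 is black, mirroring 1 - gray/255."""
    c0, c1, c2 = pixel[0], pixel[1], pixel[2]
    # Same weights, applied by channel index, as in the snippet above.
    gray = 0.07 * c0 + 0.72 * c1 + 0.21 * c2
    return 1 - gray / 255.0


# A tiny 2x2 stand-in for an image read from disk (hypothetical data).
image = [
    [(255, 255, 255), (0, 0, 0)],
    [(128, 128, 128), (0, 255, 0)],
]
dark = [[darkness(pixel) for pixel in row] for row in image]
```

Because the three weights sum to one, a pure white pixel maps to a darkness of roughly zero and a pure black pixel maps to one.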

The set of images produced in the notebook, along with the various textual/comment blocks, made it easy to discover a set of variables that can be set by a user to tune the appearance of an image that has been processed by the Aquatint script. These variables are a greycut, a temperature, and the number of sweeps to be applied.

Greycut was well defined within the documentation provided in the notebook:

The output image will have only black and white pixels so it is a good exercise to convert the original one to this form. Provide a greycut (contrast) number between 0 and 1. This converts the grey pixels into black (above greycut) and white (below greycut).
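Read literally, that description is a per-pixel threshold. As a hedged sketch only, assuming the threshold is compared against a darkness value between 0 and 1 and that a value exactly equal to the greycut counts as white (the notebook text does not say), it could look like the following. The function name `apply_greycut` is hypothetical.

```python
def apply_greycut(darkness_rows, greycut):
    """Map each darkness value to 1 (black) or 0 (white) around the greycut."""
    if not 0 <= greycut <= 1:
        raise ValueError("greycut must lie between 0 and 1")
    # Assumption: strictly above the greycut is black, everything else white.
    return [[1 if value > greycut else 0 for value in row] for row in darkness_rows]


rows = [[0.1, 0.5, 0.9]]
black_and_white = apply_greycut(rows, 0.5)
```

A lower greycut turns more of the image black, which is why it behaves like a contrast control.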

The other values weren't clearly defined in the notebook. Since the product of the script is visual in nature, this provided an opportunity to produce something that visually conveys the effect that varying these values has. This would require a combinatoric production of the same image across a valid range of values. I refactored the code from the Jupyter notebook into an external Python file and ran the following bash script:

sweeps=1
while [ $sweeps -le 5 ]
do
    greycut=1
    while [ $greycut -le 9 ]
    do
        greycut_float=`bc <<< "scale=2; ${greycut}/10"`
        temperature=1
        while [ $temperature -le 9 ]
        do
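            # Run the script directly; wrapping it in backticks would try to execute its output as a command.
            python3 "aquatintScript.py" "cycle.png" "$greycut_float" "$temperature" "$sweeps"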
            temperature=`expr $temperature + 2`
        done
        greycut=`expr $greycut + 2`
    done
    sweeps=`expr $sweeps + 1`
done
                            

The ranges of the loops were restricted to keep the runtime of this process reasonable. A lot of the image processing is dependent on the number of pixels contained in a given image, so the input image was kept small whilst ensuring enough pixels were available to gauge the differences the other input parameters bring to the process.
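The same sweep over parameters can be enumerated without nested loops. This is a sketch under the assumption that the script takes its arguments in the order shown in the bash loop; it only builds the argument lists and does not run anything. The greycut is formatted as `0.10` here, whereas `bc` prints `.10`.

```python
from itertools import product

sweeps_values = range(1, 6)                         # 1, 2, 3, 4, 5
greycut_values = [g / 10 for g in range(1, 10, 2)]  # 0.1, 0.3, 0.5, 0.7, 0.9
temperature_values = range(1, 10, 2)                # 1, 3, 5, 7, 9

# One argument list per combination, in the same nesting order as the bash loops.
commands = [
    ["python3", "aquatintScript.py", "cycle.png", f"{g:.2f}", str(t), str(s)]
    for s, g, t in product(sweeps_values, greycut_values, temperature_values)
]
```

Five values for each of the three parameters gives 5 x 5 x 5 = 125 combinations.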

One hundred and twenty-five images in total were produced by running the bash script. Those who have Javascript enabled for this page can view the effect of each variable (with respect to the values set for the others) within the following figure:

The view provided here helps establish the set of user controls needed to actually implement the web app.


The Web App

What's been discussed so far has involved Jupyter notebooks, Python, and Bash scripting. These are technologies not often associated with the core of web development. A utilitarian product needed to be produced with a hint of time constraint. Thus, I opted for using the Bootstrap framework to handle the front-end styling. The view provided by the previous figure implies that Javascript is also at play for the web app. Finally, an engine was needed to process the uploads.

My experience using Python as a web server back-end is minimal. It is likely that using Python here would lend well to the situation considering the Aquatint scripts were written in the language. At the time, I had no experience handling file uploads. The back-end most familiar to me was PHP. Thus I decided to take the opportunity to shore up that gap in my experience with the language.

Upload Security

The upload form consists of three sliders and a file upload box. It needs to be ensured that the selections for the sliders are numbers that reside in a specific range. It also needs to be ensured that the file being uploaded is indeed an image of an applicable type. It is not sufficient to rely on the constraints set forth by the front-end. Any individual can change the restrictions enforced by these mechanisms, either through the element inspector or by circumventing the browser entirely with an HTTP POST from some headless program.

The assurance of valid slider input is trivial through the back-end logic. Each slider has an id associated with the control. On submission post, PHP will check whether the posted values are numeric and whether they exist in the expected range. Assurance that a given file is indeed an image is another story.
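To make the check concrete, here is a minimal Python sketch of the same idea (the web app itself does this in PHP). The ranges mirror those used by the validation code later in this section; the function name `validate_sliders` and the choice to return `None` on any failure are assumptions made for illustration.

```python
import math

# Inclusive ranges, taken from the PHP validation shown later in this section.
RANGES = {
    "greycut": (0.0, 1.0),
    "temperature": (0.1, 10.0),
    "totalsweeps": (1.0, 10.0),
}


def validate_sliders(form):
    """Return cleaned float values, or None if any value is missing or invalid."""
    cleaned = {}
    for name, (low, high) in RANGES.items():
        try:
            value = float(form[name])
        except (KeyError, TypeError, ValueError):
            return None  # missing field or not a number at all
        if not math.isfinite(value) or not low <= value <= high:
            return None  # NaN, infinity, or outside the allowed range
        cleaned[name] = value
    return cleaned
```

Rejecting non-numeric text explicitly matters because a bare cast can silently turn garbage into a number that happens to fall inside a range.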

An individual who is used to operating in a computing environment that abstracts away file information may think it's sufficient to simply check the extension as it is given within the file name. Any power user, (or any user of a linux distribution), will know this is lacking.

The key methods used to ensure image upload are basename, pathinfo, exif_imagetype, and image_type_to_mime_type.

The basename function is used to truncate any attempts to submit a filename that tries to traverse the server's file system. Consider a post variable with the identifier of uploadImage:

$origin_file = $target_dir . basename($_FILES['uploadImage']['name']);
                            

The pathinfo function is used to isolate the extension of the filename string. Contrary to sentiment posed in paragraphs prior, this is still worthy of checking to provide useful feedback for those who are making sincere attempts at using the application.

$imageFileType = strtolower(pathinfo($target_dir . $origin_file,PATHINFO_EXTENSION));
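
For readers more at home in Python, the two steps above (dropping any directory components, then isolating a lower-cased extension) can be sketched as follows. This is an analogue rather than a port: `posixpath.basename` differs from PHP's `basename` in edge cases such as trailing slashes, and the helper name is hypothetical. The allowed extensions are the ones the web app checks for.

```python
import posixpath

ALLOWED_EXTENSIONS = {"jpg", "jpeg", "png", "gif"}


def safe_name_and_extension(client_name):
    """Strip directory components and return (file name, lower-cased extension)."""
    name = posixpath.basename(client_name)
    extension = posixpath.splitext(name)[1].lstrip(".").lower()
    return name, extension


name, extension = safe_name_and_extension("../../holiday.photo.JPG")
is_allowed = extension in ALLOWED_EXTENSIONS
```

A traversal attempt such as `../../etc/passwd` collapses to `passwd` with no extension, so it fails the extension check before any byte-level inspection happens.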
                            

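The exif_imagetype function provides a means within PHP to drill down to byte-level: it reads the first bytes of the file and checks them against known image signatures. The constant it returns is then passed to image_type_to_mime_type, which translates it into the corresponding MIME string. That second call is a lookup on the detected type rather than another inspection of the file, so the MIME comparison restates the byte-level result in a form that is convenient to check against an allow-list.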

$target_file = $target_dir . $file_name  . "." . $imageFileType;
$check = exif_imagetype($_FILES['uploadImage']['tmp_name']);
$mimeType = image_type_to_mime_type($check);
                            

These functions are used in conjunction with safe system administration procedure. Within the Linux environment, proper(ly strict) permissions are granted to the upload folder while Apache configuration restricts file-type access to the folder to only allow access to what is relevant.
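The byte-level check described above boils down to comparing the start of the file against a few well-known signatures. The sketch below illustrates the idea in Python for the three formats the web app accepts; it is not how PHP implements `exif_imagetype`, and the function name is hypothetical. A matching signature only says the file starts like an image, which is one reason the permission and configuration measures still matter.

```python
# Leading byte signatures for the accepted formats.
SIGNATURES = {
    "image/png": (b"\x89PNG\r\n\x1a\n",),
    "image/jpeg": (b"\xff\xd8\xff",),
    "image/gif": (b"GIF87a", b"GIF89a"),
}


def sniff_image_mime(data):
    """Return the MIME type implied by the leading bytes, or None if unknown."""
    for mime, prefixes in SIGNATURES.items():
        if data.startswith(prefixes):  # startswith accepts a tuple of prefixes
            return mime
    return None


# A PHP script renamed to .png does not start with a PNG signature.
disguised = sniff_image_mime(b"<?php echo 'hello'; ?>")
```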

Slider and file selection input has been validated. A keen observer will discover a hidden input form. A decision was made to assign a random name to the uploaded file as it is placed into the upload folder. This is an attempt to decouple any malicious attempts at file system traversal and malicious script execution vectors that the previous measures may have missed. The back-end generates this random string as the template for the submission page is built.

$file_name = '';
for($i = 0; $i <= rand(10,20); $i++){
    $new_ord = rand(87,122);
    if($new_ord >= 97){
        $file_name = $file_name . chr($new_ord);
    }else{
        $file_name = $file_name . $new_ord;
    }
}
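
A comparable generator in Python might look like the sketch below. Note the swap: it uses the standard library's `secrets` module instead of a general-purpose random number generator, on the assumption that an unguessable name is preferable for something an attacker might try to predict. The length bounds and the function name are illustrative assumptions, not values taken from the web app.

```python
import re
import secrets
import string

ALPHABET = string.ascii_lowercase + string.digits


def random_file_name(min_len=11, max_len=21):
    """Return a random name made only of lower-case letters and digits."""
    length = min_len + secrets.randbelow(max_len - min_len + 1)
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


name = random_file_name()
# The same shape of check the back-end later applies to the hidden input.
looks_valid = re.fullmatch(r"[a-z0-9]+", name) is not None
```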
                            

A hidden input was chosen instead of a query string to embed this information. One reason for this was to keep the submission URL clean. Another reason is the necessity of knowing the filename before any submission is made! This relates to giving the user feedback on progress once they've made a submission.


$target_dir = 'uploads/';
$uploadOk = 1;

//Validate string form controls:
if(isset($_POST['hidden_file_name']) && isset($_FILES['uploadImage']['name'])){
    if(strlen($_FILES['uploadImage']['name']) <= 0 || strlen($_POST['hidden_file_name']) <= 0){
        $uploadOk = 0;
    }else{
        $preg_result = preg_match("/\A([a-z0-9]+)\z/",$_POST['hidden_file_name']);
        if($preg_result == 0){
            echo '<div class="alert alert-danger"><strong>Warning!</strong> Please refrain from altering hidden form.</div>';
            $uploadOk = 0;
        }
    }

    if($uploadOk == 1){
        $file_name = $_POST['hidden_file_name'];
        $origin_file = $target_dir . basename($_FILES['uploadImage']['name']);
        $imageFileType = strtolower(pathinfo($target_dir . $origin_file,PATHINFO_EXTENSION));
        $target_file = $target_dir . $file_name  . "." . $imageFileType;
        $check = exif_imagetype($_FILES['uploadImage']['tmp_name']);
        $mimeType = image_type_to_mime_type($check);
    }
}else{
    $uploadOk = 0;
}
                                
//Validate filetype:
if($uploadOk == 1 ){
    if($check !== false){
        //$uploadOk = 1;
        if($_FILES['uploadImage']['size'] > 1048576){
            echo '<div class="alert alert-danger"><strong>Warning!</strong> Sorry, your file is too large.</div>';
            $uploadOk = 0;
        }
    }else{
        echo '<div class="alert alert-danger"><strong>Warning!</strong> File is not an image.</div>';
        $uploadOk = 0;
    }

    if( ($imageFileType != 'jpg') && ($imageFileType != 'png') && ($imageFileType != 'jpeg') && ($imageFileType != 'gif') ){
        echo '<div class="alert alert-danger"><strong>Warning!</strong> Only jpg, jpeg, png, and gif files are allowed.</div>';
        $uploadOk = 0;
    }else

    if( ($mimeType != 'image/gif') && ($mimeType != 'image/jpeg') && ($mimeType != 'image/png') ){
        echo '<div class="alert alert-danger"><strong>Warning!</strong> Only jpg, jpeg, png, and gif files are allowed.</div>';
        $uploadOk = 0;
    }

}
                                
//Validate numeric form controls:
if($uploadOk == 1){

    //Greycut:
    try{
        $greycut = (float) $_POST['greycut'];
        if($greycut < 0 || $greycut > 1){
            echo '<div class="alert alert-danger"><strong>Warning!</strong> Please refrain from changing form values with the element inspector.</div>';
            $uploadOk = 0;
        }else{
            $greycut = (string) $greycut;
        }
    }catch (Exception $ex){
        echo '<div class="alert alert-danger"><strong>Warning!</strong> Please refrain from changing form values with the element inspector.</div>';
        $uploadOk = 0;
    }

    //Temperature:
    try{
        $temperature = (float) $_POST['temperature'];
        if($temperature < 0.1 || $temperature > 10){
            echo '<div class="alert alert-danger"><strong>Warning!</strong> Please refrain from changing form values with the element inspector.</div>';
            $uploadOk = 0;
        }else{
            $temperature = (string) $temperature;
        }
    }catch (Exception $ex){
        echo '<div class="alert alert-danger"><strong>Warning!</strong> Please refrain from changing form values with the element inspector.</div>';
        $uploadOk = 0;
    }

    //Total Sweeps:
    try{
        $totalsweeps = (float) $_POST['totalsweeps'];
        if($totalsweeps < 1 || $totalsweeps > 10){
            echo '<div class="alert alert-danger"><strong>Warning!</strong> Please refrain from changing form values with the element inspector.</div>';
            $uploadOk = 0;
        }else{
            $totalsweeps = (string) $totalsweeps;
        }
    }catch (Exception $ex){
        echo '<div class="alert alert-danger"><strong>Warning!</strong> Please refrain from changing form values with the element inspector.</div>';
        $uploadOk = 0;
    }

}
                                

Providing feedback

Once the validation checks pass, move_uploaded_file is called such that the receiving file is given the random name carried by the hidden input. This new file's name is then concatenated into the string used for an exec call:

$script = 'python3 aquatintScript.py "'.$target_file.'" '.$greycut.' '.$temperature.' '.$totalsweeps;
exec($script,$output,$result);
                            

All the user has to do now is wait for the Aquatint script to complete, and wait they shall! Depending on resolution, it may take a good amount of time for an image to process. The algorithm itself is at least quadratic in runtime with respect to the number of pixels an image has. This is said ignoring the unknown runtime of any library method calls embedded within the iteration over these pixels. This potential for slowness is compounded by the measly 1 gigabyte of RAM and 2 GHz single-core CPU that my LAMP server has access to. For these reasons, it is necessary to let the user know how far along the Aquatint conversion process is.

Some would rightfully claim that using exec is a code smell. It took a bit to convince myself that the steps described in the previous section are adequate to scrub input of malicious intent. In terms of running the Aquatint script, a string is being passed that has been stripped of any POSIX compliant command. The same can be said in terms of the Python code embedded within the script itself.
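As a complement to input scrubbing, each argument can also be quoted before it reaches a shell, so that a stray metacharacter is treated as literal text. The sketch below shows the idea in Python with the standard `shlex` module (PHP offers `escapeshellarg` for the same purpose). It is an illustration of the technique, not the code the web app runs, and `build_command` is a hypothetical name.

```python
import shlex


def build_command(target_file, greycut, temperature, totalsweeps):
    """Return the argument list and an equivalent, safely quoted shell string."""
    args = [
        "python3", "aquatintScript.py", target_file,
        str(greycut), str(temperature), str(totalsweeps),
    ]
    # shlex.join quotes every argument that contains shell-special characters.
    return args, shlex.join(args)


args, line = build_command("uploads/abc123.png", 0.5, 2.0, 3)
_, hostile_line = build_command("x.png; rm -rf /", 0.5, 2.0, 3)
```

With well-formed input the quoted string is identical to the naive concatenation; with hostile input the whole file name ends up inside single quotes and is passed to the script as one argument.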

Initial intuition for supplying a user feedback on progress was to leverage print statements in the Aquatint script to expose progress through stdout. These print statements would occur upon the completion of significant steps within the script, such as when grey-scaling is finished. The problem here is that the exec call is an atomic operation in the eyes of PHP. One could skirt around this by instead leveraging a call to proc_open, but programming intuition has me believing that this is an even greater code smell than using exec in the first place!

The propensity to lean towards leveraging stdout is likely related to the opening discussion of this article; it is influenced by experiences of studying computer science which has been heavily guided by interpretation of the stdout environment. Program state can still be relayed through other mechanisms. The solution to this problem was leveraging the ability to write to file instead of standard output.

The decision to write state to some file, which can then be read by the application, aligns more with the intuition of a web developer. Implementation of a RESTful API was the cornerstone of one significant project during my studies. This was by means of a PHP project whilst attending community college. The span of time since then once again warranted practice for the sake of refilling a gap of knowledge.

The process is as follows: When the submission form is loaded, the back-end automatically generates a filename. This has already been discussed in the section prior. It's necessary to know the file-name before the submission is posted to the server. Write this string to the hidden form and write it to a JSON file that only the back-end may access. This file will serve as a map for an API to use.

$file_name = '';
for($i = 0; $i <= rand(10,20); $i++){
    $new_ord = rand(87,122);
    if($new_ord >= 97){
        $file_name = $file_name . chr($new_ord);
    }else{
        $file_name = $file_name . $new_ord;
    }
}
$json = file_get_contents("map.json");
$json_data = json_decode($json,true);
$json_data[$file_name] = array("status" => 0, "time" => time());
file_put_contents("map.json",json_encode($json_data));
                            

Once the submit button is pressed, PHP will only serve the resulting page once all the program statements are completed. This includes the execution of the exec statement. This is why the filename must be pre-generated. Thus, a Javascript event listener is added to the submission button as a means to know when it is pressed. This event listener must know the string that represents the filename.

<div class='form-group'>
    <input type='submit' value='Upload Image' name='submit' onclick="submit_process('<?php echo $file_name; ?>');" />
    <p id='wait' style='visibility:hidden;'><b>Please wait...</b></p>
</div>
                            

Once submit is pressed, toggle a visual prompt for the user that they should wait. Trigger an interval loop which runs an AJAX query to the server's API to query for the status of an upload based on the randomly generated filename. This query occurs once every three seconds.

submit_process = function(filestring){
    document.getElementById("wait").style.visibility = "visible";
    setInterval(function(){
        // Use the name handed to submit_process rather than echoing it a second time.
        query(filestring);
    },3000);
}
                            

Once submit is pressed, the Aquatint script will be run via the PHP script's exec command. Within the Aquatint script, create a JSON file that resides in the uploads folder. This JSON file is prefixed with the name of the file to be written. It will contain entry points to indicate when a certain step of the algorithm is completed. It will also have a spot to indicate the progress of the current step being executed.

status_dict = {"origin":False,"greycut":False,"temperature":False,"sweeps":dict(),"finished":0,"total":3+totalsweeps,"progress":0}

for i in range(0,totalsweeps):
    status_dict['sweeps']["sweep"+str(i)] = False

write_to_json(filename.split('.')[-2]+'-status.json',json.dumps(status_dict))
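
The `write_to_json` helper is not shown in this article. One plausible shape for it, offered purely as an assumption, writes to a temporary file and then swaps it into place, so that a status query arriving mid-write never reads a half-written file. The signature below matches how the helper is called above: a path and an already serialised JSON string.

```python
import json
import os
import tempfile


def write_to_json(path, text):
    """Write text to path so that readers see either the old or the new file."""
    directory = os.path.dirname(path) or "."
    # mkstemp creates the file readable by its owner only, which suits a
    # script launched by the same user that later reads the status.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(text)
        os.replace(tmp_path, path)  # atomic rename on the same filesystem
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```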
                            

As the Aquatint program is running, it will write to the file once a significant step has been completed. Within each significant step (usually embedded in an outer loop), it will write a value indicating the percentage of the step completed. This is only written for every 3% completed to reduce the number of times the progress is written to this file.

# !!! This is the same loop described earlier in the reading.
#     It has been expanded to allow progress reporting.
#     Note that this is only a subsection of the Aquatint script.

im2 = imageio.imread(filename)
Nix=im2.shape[0]
Niy=im2.shape[1]
grayimage=np.zeros([Nix,Niy])

rewrite_switch = True
for i in range(0,Nix):
        for j in range(0,Niy):
            blueComponent = im2[i][j][0]
            greenComponent = im2[i][j][1]
            redComponent = im2[i][j][2]
            grayValue = 0.07 * blueComponent + 0.72 * greenComponent + 0.21 * redComponent
            grayimage[i][j] = grayValue
            pass
        status_dict['progress'] = i / Nix
        if round((i * 100) / Nix) % 3 == 0:
            if rewrite_switch == True:
                write_to_json(filename.split('.')[-2]+'-status.json',json.dumps(status_dict))
                rewrite_switch = False
        else:
            rewrite_switch = True
status_dict["progress"] = 0

dsqin=1-grayimage/255.0
hsimage=plt.imshow(dsqin,cmap='Greys',aspect=1,interpolation='none')
#cb = plt.colorbar(hsimage)
plt.savefig(filename.split('.')[-2]+'-origin.jpg',dpi=300)

status_dict["origin"] = True
status_dict['finished'] += 1
write_to_json(filename.split('.')[-2]+'-status.json',json.dumps(status_dict))
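
The throttling in the loop above can also be isolated into a small helper. The sketch below is a variant rather than the script's own logic: instead of a rewrite switch it remembers the last 3% bucket that triggered a write. The names are hypothetical, and the counter stands in for the actual file write.

```python
def should_write(step_index, total_steps, last_bucket, bucket_size=3):
    """Return (write_now, bucket) where bucket is the current 3% band."""
    percent = (step_index * 100) // total_steps
    bucket = percent // bucket_size
    return bucket != last_bucket, bucket


writes = 0
last = None
for i in range(1000):  # stand-in for iterating over 1000 image rows
    write_now, last = should_write(i, 1000, last)
    if write_now:
        writes += 1  # a real script would rewrite the status file here
```

For a thousand rows this results in a few dozen writes instead of a thousand, which keeps disk activity low on a modest server.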
                            

As the user's browser is waiting for a response of the post, the AJAX method will be querying the API and receiving new state written by the Python script. The AJAX call will work through a set of states which represent the completion of a certain step of the Aquatint conversion process. Percentage of a given step will be reported, and once a step is completed, a progress bar will be filled in.

query = function(filestring){
    var xmlhttp = new XMLHttpRequest();
    xmlhttp.onreadystatechange = function(){
        if (this.readyState == 4 && this.status == 200){
            result = this.responseText;
            finished = JSON.parse(result)[0];
            total = JSON.parse(result)[1];
            ratio = (finished/total) * 100;
            new_width = '' + ratio + "%";
            document.getElementById('progress-bar').style.width = new_width;
            progress_text_object = document.getElementById('progress-text');
            if(finished == 0){
                progress = JSON.parse(result)[2];
                progress_text_object.innerHTML = 'Step 1/'+total+'; Resizing and applying greyscale to original image: ' + Math.ceil(progress * 100) + '% complete.';
            }else if(finished == 1){
                progress_text_object.innerHTML = 'Step 2/'+total+'; Original image resized - Applying greycut...';
            }else if(finished == 2){
                progress_text_object.innerHTML = 'Step 3/'+total+'; Greycut applied - Applying temperature...';
            }else if(finished == 3){
                progress = JSON.parse(result)[2];
                progress_text_object.innerHTML = 'Step 4/'+total+'; Greycut and Temperature applied - Applying first sweep: ' + Math.ceil(progress * 100) + '% complete.';
            }else if(finished >= 4){
                progress = JSON.parse(result)[2];
                progress_text_object.innerHTML = 'Step '+(finished+1)+'/'+total+'; Applying sweep: ' + Math.ceil(progress * 100) + '% complete.';
            }
        }
    };
    xmlhttp.open("GET","status.php?id="+filestring,true);
    xmlhttp.send();
}
                            

<div class="progress">
    <div id="progress-bar" class="progress-bar progress-bar-striped progress-bar-animated bg-info" role="progressbar" style="width: 0%" >
    </div>
</div>
<div id='progress-text' class='alert alert-light'>
</div>
                            

For every query to the API, the PHP script will then look up the relevant JSON status file and report the relevant status. The AJAX query will make use of the returned information to fill in the relevant HTML elements to give the user a sense of progress.

<?php
$not_ready = json_encode(array(0,0));
$valid = 0; // initialised up front so the final check never reads an undefined variable

if(isset($_GET['id'])){
    $result = preg_match("/\A([a-z0-9]+)\z/",$_GET['id']);
    if($result == 1){
        $json = file_get_contents("map.json");
        $json_data = json_decode($json,true);
        if(isset($json_data[$_GET['id']])){
            $valid = 1;
        }else{
            echo $not_ready;
        }
    }else{
        echo $not_ready;
    }
}else{
    echo $not_ready;
}

if($valid == 1){
    try{
        $json = file_get_contents("uploads/".$_GET['id']."-status.json");
        $json_data = json_decode($json,true);
        $return = array($json_data["finished"],$json_data["total"],$json_data['progress']);
        echo json_encode($return);
    }catch(Exception $ex){
        echo $not_ready;
    }
}
?>
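
The lookup performed by this endpoint can be summarised in a few lines. The following Python sketch mirrors the flow (validate the identifier, confirm it is in the map, then read the status file) under the assumption that the files are laid out as described above; the function name and parameters are hypothetical.

```python
import json
import os
import re

NOT_READY = (0, 0)
ID_PATTERN = re.compile(r"[a-z0-9]+")


def lookup_status(upload_id, map_path, uploads_dir):
    """Return [finished, total, progress], or [0, 0] when nothing can be reported."""
    if not isinstance(upload_id, str) or not ID_PATTERN.fullmatch(upload_id):
        return list(NOT_READY)  # rejects anything that could traverse directories
    try:
        with open(map_path) as handle:
            known = json.load(handle)
        if upload_id not in known:
            return list(NOT_READY)
        status_path = os.path.join(uploads_dir, upload_id + "-status.json")
        with open(status_path) as handle:
            status = json.load(handle)
        return [status["finished"], status["total"], status["progress"]]
    except (OSError, ValueError, KeyError, TypeError):
        # Missing file, malformed JSON, or an incomplete status document.
        return list(NOT_READY)
```

Returning the same "not ready" answer for every failure keeps the client simple and avoids leaking which identifiers exist.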
                            

To clean things up on the server side, every time the submission page is accessed, the back end takes a look at the mapping JSON file and keeps only the items that are less than 30 minutes old. A cron job also runs on the server and deletes all files within the uploads folder that have passed the same 30-minute age limit.
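The pruning of the map amounts to a filter on the stored timestamps. A minimal sketch, assuming each entry keeps the `time` field written when the name was generated; the function name is hypothetical and the web app itself performs this step in PHP.

```python
import time

MAX_AGE_SECONDS = 30 * 60


def prune_map(entries, now=None, max_age=MAX_AGE_SECONDS):
    """Keep only the entries that are younger than max_age seconds."""
    if now is None:
        now = time.time()
    return {
        name: info
        for name, info in entries.items()
        if now - info["time"] < max_age
    }


entries = {
    "old": {"status": 0, "time": 1000},
    "new": {"status": 0, "time": 2500},
}
kept = prune_map(entries, now=3000)
```

Passing `now` explicitly makes the behaviour easy to check without waiting half an hour.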

Finished product

The rest of the web app isn't worth elaborating upon. If a reader has been able to track what's been said thus far, the remaining details are both trivial and intuitive.

A working version of the web app can be viewed here: Aquatint Image Processor. To label this as a finished project is a mischaracterization. Future work includes a cancel button that allows a user to discontinue waiting for an image to process within an arbitrary sweep stage which could then forward the user to a page that displays the most recent applied sweep.

Another potential vector of future work is to add a mechanism that allows a user to bookmark a completion page and refer to it later. This would be a trivial endeavor: implement a query string that the API can use to look up and return the relevant images in the uploads folder. There is hesitance in implementing this, as it is at odds with the scheduled cleanup of the uploads folder. This scheduled cleanup is a security necessity from both a systems perspective and a social perspective - I cannot verify, in real time, the contents of an image and thus must assume the worst. The regularly scheduled deletion of the images helps moderate this.


Concluding notes

Credit for the Aquatint script goes to Professor Yannick Meurice at the University of Iowa. Literature pertaining to the script can be read through the American Journal of Physics Volume 90, Issue 2.


Images created by the University of Chicago in which the Aquatint Image Processor was leveraged.