Secure Code Review Best Practices

Jared Ablon
8 min readJul 14, 2020

--

Originally posted here: https://blog.hackedu.com/secure-code-review-best-practices

Introduction

Static code analysis testing with automated tools can enable analyzing large codebases in minutes and identify a wide range of vulnerabilities. But static analysis tools limitations, especially with business logic vulnerabilities. Since no one tool or strategy is 100% in removing vulnerabilities from software, developers should also review their code for security flaws manually.

This article is a short introduction that outlines the most important things to look for in a secure code review.

What is a Secure Code Review?

A secure code review is a part of the code review process to identify missing best practices early in the Software Development Lifecycle (SDLC), resulting in fewer vulnerabilities in the production. Over time, software engineers have defined various security best practices that can protect an application against common web vulnerabilities such as those listed in the OWASP Top 10 or CWE/SANS Top 25.

Let’s suppose one of your colleagues sent you code to review. What should you look for?

  • Broken Authentication / Broken Access Control
  • Database communication security
  • Data encryption
  • Data protection
  • Error handling
  • File management
  • Hardcoded credentials
  • Input validation
  • Language-specific issues (e.g., type juggling in PHP)
  • Memory management
  • Output sanitization
  • Security through obscurity
  • Transport Layer Security

That’s quite a list. Of course, you won’t be able to tackle all of these without spending weeks or more. So we need to narrow down the list and focus only on the most important vulnerabilities and the ones that are best to be caught using secure code reviews. Keep in mind that this stage is not meant to necessarily find every security flaw in an application, but to offer developers insights into what classes of vulnerabilities exist. Fortunately, some of the security best practices from above are more important and effective than others. So, we can shrink that list to five items based on the severity and the number of flaws that each best practice prevents:

  • Broken Authentication / Broken Access Control
  • Database communication security
  • Data encryption
  • Input validation
  • Output sanitization

Now let’s discuss each of these.

Broken Authentication / Broken Access Control

Broken Authentication and Broken Access Control are two types of logic vulnerabilities that cannot be easily identified using automated tools as they require an understanding of the application behavior.

A Broken Authentication vulnerability occurs when the application fails to check whether the user is logged in before permitting access to a specific resource/feature. Here is a short example:

@app.route(‘/internal-dashboard/users’, methods=[‘GET’])

def showUsers():

users = db.getAllUsers()

return render_template(“dashboard/users.html”, users=users)

In this case, anyone who knows the route can view the details of all registered users without being logged in.

A Broken Access Control vulnerability occurs when the web application fails to check whether the user is authorized to access a specific resource. As a result an attacker can view, modify, or delete data belonging to other users. For instance, let’s consider the following code snippet from a Flask app:

@app.route(‘/api/delete-invoice’, methods=[‘GET’])

def deleteInvoice():

user = users.get_user()

if not user[“authenticated”]:

return redirect(“/login”)

invoice_id = request.args.get(“invoice_id”)

return ‘success’

This code only verifies that the user is logged, but not if the user is the owner of that invoice. As a result an attacker may be able to delete any invoice from the database.A secure version should look like this:

@app.route(‘/api/delete-invoice’, methods=[‘GET’])

def deleteInvoice():

user = users.get_user()

if not user[“authenticated”]:

return redirect(“/login”)

if not user.hasPermission():

return redirect(‘/forbidden’)

invoice_id = request.args.get(“invoice_id”)

return ‘success’

Database communication security

Most modern web applications use a Database Management System (DBMS) to store and easily retrieve information from a database. However, it is crucial to make sure that user supplied data is not introduced directly into queries executed by the database engine. Otherwise, the application will be vulnerable to SQL Injection and as a result, an attacker could view/modify/delete sensitive data of other users or even get unauthorized access to the entire system. Here is an example of a NodeJS vulnerable code:

connection.query(“SELECT * FROM user WHERE username = ‘“+ req.body.username +”’ and password = ‘“ + req.body.password + “‘“, function (error, results, fields) {

if (error) throw error;

}

As you can see, the user-provided data is embedded directly in the SQL query. If the user inserts admin and a’ or ‘1’=‘1, the user will be able to log into the admin account without knowing the password because the SQL statement has been altered.

The most efficient way to prevent SQL Injections is by using prepared statements:

connection.query(“SELECT * FROM user WHERE username = ? and password = ?”, [username, password], function (error, results, fields) {

if (error) throw errror;

}

If you want to learn more about how prepared statements work, check out this blog post.

Data encryption

Web applications often involve encryption to keep sensitive data confidential. However, there are several common pitfalls that you should be aware of.

Hardcoded credentials

Web developers sometimes use hardcoded credentials/secrets for quick tests and easy access when needed. However, sometimes developers forget to remove the secrets before deploying the application to production and even publish them to git platforms. This practice poses a significant security risk that can allow attackers to bypass authentication mechanisms or to increase the severity of a vulnerability they already found. In 2016, Uber had a data breach that exposed information of 57 million customers due to some hardcoded credentials publicly available in one of their Github repositories.

Here is a quick example of a vulnerable code snippet:

$link = mysqli_connect(“75.77.1.133”, “root”, “supersecretpassword19231231ASD5675”, “prod_db”);

if (!$link) {

echo “Error: Unable to connect to MySQL.” . PHP_EOL;

echo “Debugging errno: “ . mysqli_connect_errno() . PHP_EOL;

echo “Debugging error: “ . mysqli_connect_error() . PHP_EOL;

exit;

}

Ideally, sensitive data such as credentials or secrets should be stored in a separate file (e.g., encrypted creds.env) and use placeholders instead of actual data.

Using user passwords in plain-text

Users’ passwords must be hashed and salted before storing them in a database. Therefore, if you see a code snippet as part of the authentication mechanism without it, that’s a red flag.

$username = $_GET[‘username’];

$password = $_GET[‘password’];

if($users->find($username)->password == $password)

// valid credentials

A secure alternative of this code would compare the hashed password from the database with the hashed version of the user-supplied password:

$username = $_GET[‘username’];

$password = $_GET[‘password’];

if($users->find($username)->password == Hash::make($password))

// valid credentials

Usage of weak cryptographic algorithms for cryptographic purposes

A secure cryptographic hash algorithm is defined as an algorithm that generates a unique, fixed-size hash for any variable-length input. One of the conditions for a hash function to be considered secure is to be collision resistance. This means the algorithm should have an extremely low chance to produce the same hash for two different inputs.

In the last years, researchers managed to identify and demonstrate vulnerabilities in two of the most used hashing algorithms (SHA1 and MD5). Therefore, when reviewing code, make sure the application does not use SHA1, MD5.

In addition, it is important to ensure that insecure cryptographic algorithms are not used for other purposes as well (e.g. communication, secure storage of data). For example, DES is no longer considered a secure algorithm, where as AES is considered the symmetric encryption algorithm to use.

Input validation

Input validation is the most fundamental defense mechanism and can be used to prevent many web vulnerabilities, including Injections, Cross-Site Scripting, Server Side Request Forgery, Local File Inclusion, and others. This best practice has its roots in a simple principle: never trust user input. For instance, if your web application asks the user for their age, then you should only allow numbers for that variable. Asking for a phone number, an email address, or a URL?

Make sure user input matches a specific pattern of allowed characters.

Example:

$phonenum = $_GET[‘phonenum’];

//Format: 000–0000–0000

if(preg_match(“/^[0–9]{3}-[0–9]{4}-[0–9]{4}$/”, $phonenum)) {

// $phonenum is valid

} else {

// $phonenum is invalid

}

If we run the above validation example against several payloads, we will see the following results:

Attack TypeUser InputResultNone123–4567–8910Valid phone numberNone1234Invalid phone numberNoneloremipsumInvalid phone numberSQL Injection‘ OR 1=1Invalid phone numberXSS<script>alert(1)</script>Invalid phone numberLFI../../etc/passwdInvalid phone number

As you can see, a simple validation rule can prevent various vulnerabilities.

Denylists are a red flag

A common mistake when validating user input is to use a denylist instead of an allowlist.

Denylisting means that the user input is valid if it does not contain unacceptable data, while allowlisting means that the user input is valid if it contains acceptable data. In other words, denylists allow everything that is not denylisted, while allowlists denies everything that is not allowlisted.

This may sound confusing, so let’s look at a real-world example. You have a key to your house door to unlock and enter. However, you don’t have a lock that has a list of every wrong key to not let people in (that’s denylisting). Instead, you give a key to every family member and they are the only ones who can unlock and enter (that’s allowlisting).

Now let’s suppose you are reviewing an avatar upload feature:

$ext = $avatar->ext;

$extensions = [‘7z’,’asm’,’asp’,’aspx’,’bak’,’bat’,’bin’,’bz2',’c’,’cc’,’cfg’,’cfm’,’cgi’,’class’,’cnf’,’conf’,’config’,’cpp’,’cs’,’csv’,’dat’,’db’,’dll’,’do’,’doc’,’dump’,’ep’,’err’,’error’,’exe’,’gz’,’htm’,’html’,’inc’,’ini’,’java’,’jhtml’,’js’,’jsf’,’jsp’,’key’,’lib’,’log’,’lst’,’manifest’,’mdb’,’meta’,’msg’,’nsf’,’o’,’old’,’ora’,’orig’,’out’,’part’,’pdf’,’php’,’php3',’phtml’,’pl’,’pm’,’ppt’,’properties’,’py’,’rar’,’rss’,’rtf’,’save’,’sh’,’shtml’,’so’,’sql’,’stackdump’,’swf’,’tar’,’tar.bz2',’tar.gz’,’temp’,’test’,’tgz’,’tmp’,’trace’,’txt’,’vb’,’vbs’,’ws’,’xls’,’xml’,’xsl’,’zip’];

if(in_array($ext, $extensions)) {

// invalid extension

} else {

// valid extension

}

The problem with denylists is that it relies on the developer to anticipate any edge case of possible unacceptable data. In most cases, this is not possible and it’s just a matter of time until an attacker finds a bypass. So a safer version of the above code snippet would be:

$ext = $avatar->ext;

$acceptedExtensions = [‘png’, ‘jpg’, ‘jpeg’];

if( ! in_array($ext, $extensions)) {

// invalid extension

} else {

// valid extension

}

As a rule of thumb, whenever you see a denylist filter, that’s a red flag that should be further investigated. Also, make sure user input is always validated before being used.

Output sanitization

Output sanitization is the process in which unsafe characters are encoded using different encoding schemes such as HTML entities or URL encoding. This practice relies on the same principle as input validation: never trust user-supplied data. This way, edge cases that cannot be stopped by input validation can be safely output (e.g. displayed when outputting HTML, avoiding Cross-Site Scripting vulnerabilities).

Example:

<form method=”POST” action=”/login”>

<div class=”form-group”>

<label for=”username”>Username:</label>

<input type=”username” class=”form-control” name=”username” autocomplete=”off”>

</div>

<div class=”form-group”>

<label for=”password”>Password:</label>

<input type=”password” class=”form-control” name=”password”>

</div>

<?php if($error) echo ‘User ‘.$username.’ is not in our database.’; ?>

<button type=”submit” class=”btn btn-primary”>Sign In</button>

</form>

In this case, if the user introduces <script>alert(1)</script> in the username field, the Javascript code will be executed. Here is the secure version of this code:

<form method=”POST” action=”/login”>

<div class=”form-group”>

<label for=”username”>Username:</label>

<input type=”username” class=”form-control” name=”username” autocomplete=”off”>

</div>

<div class=”form-group”>

<label for=”password”>Password:</label>

<input type=”password” class=”form-control” name=”password”>

</div>

<?php if($error) echo ‘User ‘.htmlentities($username, ENT_QUOTES).’ is not in our database.’; ?>

<button type=”submit” class=”btn btn-primary”>Sign In</button>

</form>

Most modern frameworks such as Angular, React, Vue.JS, Laravel, Flask, and others escape output by default, but you still need to be careful with the context in which data is introduced.

Conclusion

Secure code reviews are an important part of a secure software development lifecycle. When doing a code review focus on areas that are better suited for manually reviewing code and having a human in the loop such as Broken Authentication / Broken Access Control, Database communication security, Data encryption, Input validation, and Output sanitization.

Specifically, ensure that:

  • User input is validated correctly
  • Web applications use allowlist validation and not denylist validation
  • User-supplied data is encoded before displaying it to the user
  • Authentication and authorization mechanisms are in place, where required
  • There are no hardcoded credentials
  • The application doesn’t use broken cryptographic algorithms for cryptography purposes
  • The application uses prepared statements to communicate with databases

Read more here: https://blog.hackedu.com/secure-code-review-best-practices

--

--

Jared Ablon

Co-founder and CEO of HackEDU. Previously a CISO. 15 years in cybersecurity.