How to prevent SQL Injection vulnerabilities: How Prepared Statements Work

Jared Ablon
4 min readFeb 11, 2020

--

Read the full post here: https://blog.hackedu.com/how-to-prevent-sql-injection-vulnerabilities-how-prepared-statements-w

Introduction

SQL Injection is a software vulnerability that occurs when user-supplied data is used as part of a SQL query. Due to improper validation of data, an attacker can submit a valid SQL statement that changes the logic of the initial query used by the application. As a result, the attacker can view/modify/delete sensitive data of other users or even get unauthorized access to the entire system.

While easy to fix, SQL Injection vulnerabilities are still prevalent. In this article, we will discuss how to prevent these vulnerabilities through good coding practices. We will focus on prepared statements, how they work, and how you can implement them.

What are Prepared Statements?

A prepared statement is a parameterized and reusable SQL query which forces the developer to write the SQL command and the user-provided data separately. The SQL command is executed safely, preventing SQL Injection vulnerabilities.

Here is an example of an unsafe approach in PHP:

$query = “SELECT * FROM users WHERE user = ‘$username’ and password = ‘$password’”;

$result = mysql_query($query);

As you can see, the user-provided data is embedded directly in the SQL query. If the user inserts admin and a’ or ‘1’=’1, she will be able to login to the admin account without knowing the password because the SQL statement has been altered.

Here is an example of a prepared statement approach in PHP:

$stmt = $mysqli->prepare(“SELECT * FROM users WHERE user = ? AND password = ?”);

$stmt->bind_param(“ss”, $username, $password);

$stmt->execute();

The user-supplied data is not directly embedded in the SQL query in this example. Instead of the user’s data there is a ? symbol. That is a placeholder and temporarily takes the place of the data. The SQL query is pre-compiled with placeholders, and the user’s data is added later. If the user inserts admin and a’ or ‘1’=’1, the initial SQL query logic won’t be changed. Instead, the database will look for a user admin whose password is literally a’ or ‘1’=’1.

How Prepared Statements work?

Before discussing how prepared statement works, let’s have a look at the SQL query processing workflow:

Fig 1: Oversimplified representation of SQL query processing

As you can see, the process involves six steps:

  1. Parsing — The SQL query is broken into individual words (also called tokens). Syntax error and misspelling checks are performed to ensure the validity of the SQL query.
  2. Semantics Check — The Database Management System (DBSM) establishes the validity of the query. Does the specified columns and table exist? Does the user have privileges to execute this query?
  3. Binding — The query is converted into a format understandable by machines: byte code. Next, the query is compiled and sent to the database server for optimization and execution.
  4. Query Optimization — The DBSM chooses the best algorithm for executing the query, considering the cost.
  5. Cache — The best algorithm is saved in the cache, so next time when the same query is executed it will skip the first four steps and jump straight to the execution.
  6. Execution — The query is executed and the results are returned to the user.

But how does a prepared statement go through this process since they are different from a normal query?

The process is similar, but with a few differences:

  1. Parsing and Semantics Check are the same.
  2. With Binding, the database engine detects the placeholders, and the query is compiled with placeholders. The user-supplied data will be added later.
  3. The Cache step is the same. The query is stored in cache for further use.
  4. Between Cache and Execution, there is an additional step: Placeholder Replacement. At this point, the placeholders are replaced with the user’s data. However, the query is already pre-compiled (Binding), so the final query will not go through compilation phase again. For this reason, the user-provided data will always be interpreted as a simple string and cannot modify the original query’s logic. Thus, the query will be immune to SQL Injection vulnerabilities for that data.

Fig 2. Oversimplified representation of SQL prepared statements processing

See examples in different programming languages and read more here: https://blog.hackedu.com/how-to-prevent-sql-injection-vulnerabilities-how-prepared-statements-work

--

--

Jared Ablon

Co-founder and CEO of HackEDU. Previously a CISO. 15 years in cybersecurity.