Other Approaches
In this particular engagement, we obtained enough access that we did not feel the need to do much more, but other steps could have been taken. We'll touch on the ones that we can think of now, though we are quite certain that this is not comprehensive.
We are also aware that not all approaches work with all databases, and we can touch on some of them here.
- Use xp_cmdshell
- Microsoft's SQL Server supports a stored procedure xp_cmdshell that permits what amounts to arbitrary command execution, and if this is permitted to the web user, complete compromise of the webserver is inevitable.
- What we had done so far was limited to the web application and the underlying database, but if we can run commands, the webserver itself cannot help but be compromised. Access to xp_cmdshell is usually limited to administrative accounts, but it's possible to grant it to lesser users.
- Map out more database structure
- Though this particular application provided such a rich post-login environment that it didn't really seem necessary to dig further, in other more limited environments this may not have been sufficient.
- Being able to systematically map out the available schema, including tables and their field structure, can't help but provide more avenues for compromise of the application.
- One could probably gather more hints about the structure from other aspects of the website (e.g., is there a "leave a comment" page? Are there "support forums"?). Clearly, this is highly dependent on the application and it relies very much on making good guesses.
Mitigations
We believe that web application developers often simply do not think about "surprise inputs", but security people do (including the bad guys), so there are three broad approaches that can be applied here.
- Sanitize the input
- It's absolutely vital to sanitize user inputs to insure that they do not contain dangerous codes, whether to the SQL server or to HTML itself. One's first idea is to strip out "bad stuff", such as quotes or semicolons or escapes, but this is a misguided attempt. Though it's easy to point out some dangerous characters, it's harder to point to all of them.
- The language of the web is full of special characters and strange markup (including alternate ways of representing the same characters), and efforts to authoritatively identify all "bad stuff" are unlikely to be successful.
- Instead, rather than "remove known bad data", it's better to "remove everything but known good data": this distinction is crucial. Since - in our example - an email address can contain only these characters:
-
abcdefghijklmnopqrstuvwxyz
ABCDEFGHIJKLMNOPQRSTUVWXYZ
0123456789
@.-_+ - There is really no benefit in allowing characters that could not be valid, and rejecting them early - presumably with an error message - not only helps forestall SQL Injection, but also catches mere typos early rather than stores them into the database.
-
Sidebar on email addresses
It's important to note here that email addresses in particular are troublesome to validate programmatically, because everybody seems to have his own idea about what makes one "valid", and it's a shame to exclude a good email address because it contains a character you didn't think about.
The only real authority is RFC 2822 (which encompasses the more familiar RFC822), and it includes a fairly expansive definition of what's allowed. The truly pedantic may well wish to accept email addresses with ampersands and asterisks (among other things) as valid, but others - including this author - are satisfied with a reasonable subset that includes "most" email addresses.
Those taking a more restrictive approach ought to be fully aware of the consequences of excluding these addresses, especially considering that better techniques (prepare/execute, stored procedures) obviate the security concerns which those "odd" characters present.
- Be aware that "sanitizing the input" doesn't mean merely "remove the quotes", because even "regular" characters can be troublesome. In an example where an integer ID value is being compared against the user input (say, a numeric PIN):
-
SELECT fieldlist
FROM table
WHERE id = 23 OR 1=1; -- Boom! Always matches! - In practice, however, this approach is highly limited because there are so few fields for which it's possible to outright exclude many of the dangerous characters. For "dates" or "email addresses" or "integers" it may have merit, but for any kind of real application, one simply cannot avoid the other mitigations.
- Escape/Quotesafe the input
- Even if one might be able to sanitize a phone number or email address, one cannot take this approach with a "name" field lest one wishes to exclude the likes of Bill O'Reilly from one's application: a quote is simply a valid character for this field.
- One includes an actual single quote in an SQL string by putting two of them together, so this suggests the obvious - but wrong! - technique of preprocessing every string to replicate the single quotes:
-
SELECT fieldlist
FROM customers
WHERE name = 'Bill O''Reilly'; -- works OK - However, this naïve approach can be beaten because most databases support other string escape mechanisms. MySQL, for instance, also permits \' to escape a quote, so after input of \'; DROP TABLE users; -- is "protected" by doubling the quotes, we get:
-
SELECT fieldlist
FROM customers
WHERE name = '\''; DROP TABLE users; --'; -- Boom! - The expression '\'' is a complete string (containing just one single quote), and the usual SQL shenanigans follow. It doesn't stop with backslashes either: there is Unicode, other encodings, and parsing oddities all hiding in the weeds to trip up the application designer.
- Getting quotes right is notoriously difficult, which is why many database interface languages provide a function that does it for you. When the same internal code is used for "string quoting" and "string parsing", it's much more likely that the process will be done properly and safely.
- Some examples are the MySQL function mysql_real_escape_string() and perl DBD method $dbh->quote($value).
- These methods must be used.
- Use bound parameters (the PREPARE statement)
- Though quotesafing is a good mechanism, we're still in the area of "considering user input as SQL", and a much better approach exists: bound parameters, which are supported by essentially all database programming interfaces. In this technique, an SQL statement string is created with placeholders - a question mark for each parameter - and it's compiled ("prepared", in SQL parlance) into an internal form.
- Later, this prepared query is "executed" with a list of parameters:
- Example in perl
$sth = $dbh->prepare("SELECT email, userid FROM members WHERE email = ?;");
$sth->execute($email); - Thanks to Stefan Wagner, this demonstrates bound parameters in Java:
- Insecure version
Statement s = connection.createStatement();
ResultSet rs = s.executeQuery("SELECT email FROM member WHERE name = "
+ formField); // *boom* - Secure version
PreparedStatement ps = connection.prepareStatement(
"SELECT email FROM member WHERE name = ?");
ps.setString(1, formField);
ResultSet rs = ps.executeQuery(); - Here, $email is the data obtained from the user's form, and it is passed as positional parameter #1 (the first question mark), and at no point do the contents of this variable have anything to do with SQL statement parsing. Quotes, semicolons, backslashes, SQL comment notation - none of this has any impact, because it's "just data". There simply is nothing to subvert, so the application is be largely immune to SQL injection attacks.
- There also may be some performance benefits if this prepared query is reused multiple times (it only has to be parsed once), but this is minor compared to the enormous security benefits. This is probably the single most important step one can take to secure a web application.
- Limit database permissions and segregate users
- In the case at hand, we observed just two interactions that are made not in the context of a logged-in user: "log in" and "send me password". The web application ought to use a database connection with the most limited rights possible: query-only access to the members table, and no access to any other table.
- The effect here is that even a "successful" SQL injection attack is going to have much more limited success. Here, we'd not have been able to do the UPDATE request that ultimately granted us access, so we'd have had to resort to other avenues.
- Once the web application determined that a set of valid credentials had been passed via the login form, it would then switch that session to a database connection with more rights.
- It should go almost without saying that sa rights should never be used for any web-based application.
- Use stored procedures for database access
- When the database server supports them, use stored procedures for performing access on the application's behalf, which can eliminate SQL entirely (assuming the stored procedures themselves are written properly).
- By encapsulating the rules for a certain action - query, update, delete, etc. - into a single procedure, it can be tested and documented on a standalone basis and business rules enforced (for instance, the "add new order" procedure might reject that order if the customer were over his credit limit).
- For simple queries this might be only a minor benefit, but as the operations become more complicated (or are used in more than one place), having a single definition for the operation means it's going to be more robust and easier to maintain.
- Note: it's always possible to write a stored procedure that itself constructs a query dynamically: this provides no protection against SQL Injection - it's only proper binding with prepare/execute or direct SQL statements with bound variables that provide this protection.
- Isolate the webserver
- Even having taken all these mitigation steps, it's nevertheless still possible to miss something and leave the server open to compromise. One ought to design the network infrastructure to assume that the bad guy will have full administrator access to the machine, and then attempt to limit how that can be leveraged to compromise other things.
- For instance, putting the machine in a DMZ with extremely limited pinholes "inside" the network means that even getting complete control of the webserver doesn't automatically grant full access to everything else. This won't stop everything, of course, but it makes it a lot harder.
- Configure error reporting
- The default error reporting for some frameworks includes developer debugging information, and this cannot be shown to outside users. Imagine how much easier a time it makes for an attacker if the full query is shown, pointing to the syntax error involved.
- This information is useful to developers, but it should be restricted - if possible - to just internal users.
Note that not all databases are configured the same way, and not all even support the same dialect of SQL (the "S" stands for "Structured", not "Standard"). For instance, most versions of MySQL do not support subselects, nor do they usually allow multiple statements: these are substantially complicating factors when attempting to penetrate a network.
We'd like to emphasize that though we chose the "Forgotten password" link to attack in this particular case, it wasn't really because this particular web application feature is dangerous. It was simply one of several available features that might have been vulnerable, and it would be a mistake to focus on the "Forgotten password" aspect of the presentation.
This Tech Tip has not been intended to provide comprehensive coverage on SQL injection, or even a tutorial: it merely documents the process that evolved over several hours during a contracted engagement. We've seen other papers on SQL injection discuss the technical background, but still only provide the "money shot" that ultimately gained them access.
But that final statement required background knowledge to pull off, and the process of gathering that information has merit too. One doesn't always have access to source code for an application, and the ability to attack a custom application blindly has some value.