Minimizing the Attack Surface, Part 1

What was the first thing you learned about network security? There’s a good chance it had something to do with port scanning. After scanning a few boxes, you realized that modern operating systems have a lot of open ports by default, meaning a lot of services. Some had an obvious purpose, like telnet on tcp/23 or ftp on tcp/21. Others left you wondering: what the heck is listening on tcp/515 or tcp/7100? And remember, you couldn’t ask Google because it didn’t exist (well, maybe it did, depending on when you got into security).

Your first real lesson about locking down a host was how to reduce its attack surface. You learned how to disable services in /etc/inetd.conf. Then you learned about rc.d and how to prevent unnecessary services from being launched at startup. Next, maybe you configured the X server to disallow remote connections, or moved on to removing setuid permissions from files. As you worked, you’d periodically re-scan the box to gauge progress, asking yourself, “have I removed everything I don’t need?” The underlying motivation, of course, was that an attacker can’t hack something that isn’t there.

You learned how to extend those concepts to the network — configuring firewall rules, router ACLs, VLANs, etc. Segmenting the network. Creating a DMZ. No need to dwell on this, you get the idea.

Eventually, people realized that applications had an attack surface too. Web servers and application servers got a lot of attention, followed closely by custom web applications. “What do you mean you can execute SQL queries against my database? That’s impossible, I have a firewall!”

Some companies, the ones who could afford it anyway, started to build security into their development cycle. Doing threat modeling during the design phase made sense, because hey, it’s much cheaper to fix security holes in a whiteboard drawing than it is to rewrite your authorization module from scratch after it’s in production.

Let’s talk strictly about custom web applications now. What I’ve observed is that most development groups, even the ones who actively engage in threat modeling, do not understand their web application’s attack surface. The lead architect can whiteboard a high-level diagram of all the major components and how they interact. Individual developers can go a bit deeper, telling you which files they touch, what database permissions they need, or how various pieces of data are encrypted in storage. At the end of this exercise you have a complete picture of the processes, data flows, protocols, privilege boundaries, external entities, and so on, and you’re well on your way to understanding all of the potential attack vectors.

Or are you?

What often gets overlooked or glossed over is the impact of external libraries or packages. Nobody writes everything from scratch. A typical list of third-party libraries for a Java-based Web 2.0 application might include DWR, GWT, Axis, and Dojo, plus about 30 other libraries to do everything from logging to parsing to image manipulation. Nine times out of ten, the libraries will be installed in full, using the default configuration from page one of the README file.

Why is this relevant? Because just as those old Unix boxes exposed unnecessary services, libraries expose unnecessary code. Let’s say you installed Dojo to simplify the process of creating an HTML table with rows and columns that can be sorted on demand. Did you remember to remove all the .js files you didn’t need? Or maybe you installed Axis or DWR or anything else that has its own Servlet(s) for processing requests. Have you compared what that Servlet can do against what you need it to do?

A fictitious example may help illustrate further. Imagine you just downloaded a new library called WhizBang. You follow the installation instructions to define and map two servlets in your web.xml file, WhizServlet and BangServlet, and you configure it to integrate with your web app. After a bit of trial and error, it’s functional. Yay! This is where most developers stop.
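For the sake of illustration (everything here is invented, including the package and URL patterns), the README’s suggested web.xml additions might look like this:

    <!-- Both servlets defined and mapped, exactly as the README suggests. -->
    <servlet>
        <servlet-name>WhizServlet</servlet-name>
        <servlet-class>com.whizbang.WhizServlet</servlet-class>
    </servlet>
    <servlet>
        <servlet-name>BangServlet</servlet-name>
        <servlet-class>com.whizbang.BangServlet</servlet-class>
    </servlet>
    <servlet-mapping>
        <servlet-name>WhizServlet</servlet-name>
        <url-pattern>/whiz/*</url-pattern>
    </servlet-mapping>
    <servlet-mapping>
        <servlet-name>BangServlet</servlet-name>
        <url-pattern>/bang/*</url-pattern>
    </servlet-mapping>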

Nobody asks, “how much of this do I actually need?” Case in point, what if your application only uses WhizServlet? BangServlet is still exposed, and you don’t even use it! Similarly, what if WhizServlet takes an “action” parameter which can be either “view”, “edit”, or “delete”, and your application only uses “view”? You’re still exposing the other actions to anybody who knows the URL syntax (pretty trivial if it’s open source). You wouldn’t expose large chunks of your own code that you weren’t using, so why should it be any different with libraries?
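Continuing the made-up WhizBang example, the fix is straightforward: delete the BangServlet entries from web.xml entirely, and put a filter in front of WhizServlet that allows only the one action you actually use. A sketch of what that filter might look like (names invented):

    import java.io.IOException;
    import javax.servlet.Filter;
    import javax.servlet.FilterChain;
    import javax.servlet.FilterConfig;
    import javax.servlet.ServletException;
    import javax.servlet.ServletRequest;
    import javax.servlet.ServletResponse;
    import javax.servlet.http.HttpServletResponse;

    // Whitelist filter: our app only ever calls action=view, so reject
    // "edit", "delete", and anything else before it reaches the library.
    public class WhizActionFilter implements Filter {
        public void init(FilterConfig config) {}
        public void destroy() {}

        public void doFilter(ServletRequest req, ServletResponse resp,
                             FilterChain chain) throws IOException, ServletException {
            String action = req.getParameter("action");
            if (action == null || "view".equals(action)) {
                chain.doFilter(req, resp);  // the only action we need
            } else {
                ((HttpServletResponse) resp).sendError(HttpServletResponse.SC_FORBIDDEN);
            }
        }
    }

Map the filter to the same URL pattern as WhizServlet, and the unused actions disappear from your attack surface without touching the library’s code.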

This post is getting kind of long, so I’m going to split it up. In the next post, I’ll continue the discussion of attack surface minimization and cover some of the tradeoffs that go along with this approach.

Andre Gironda | June 24, 2008 4:29 pm

Chris,

I love these analogies, and I think I even see where you are going with this.

External libraries/packages/components are good in most situations, especially base class libraries and already-secure components such as OWASP ESAPI. Other times they’re not so great, but you’re right: there are a few ways to solve these problems, and development shops must be ready for them and aware of how to integrate them into their lifecycle.

Aspect Security (authors of OWASP ESAPI) recently gave some presentations at the OWASP AppSec EU 2008 conference in Belgium. Dave Wichers talked about Agile Security, and I’ve seen Jeff Williams write about it on The Register and possibly elsewhere. Among the Agile practices they claim benefit security are Test-Driven Development (TDD), Sprints (short cycles of iterative programming), User Stories (use cases written in a longer “story” format), and constant refactoring (to avoid the Big-Design-Up-Front anti-pattern).

I read this and immediately thought it was brilliant. However, I told them that “Agile” is a difficult word to use because many shops aren’t one thing (Agile) or the other (Waterfall), but very often “their own thing”. Still, the concepts of test-first development, constant refactoring, short Sprints, and User Stories are all top-notch ideas, and ideal for security purposes!

The BDUF argument is an interesting one (another that differs depending on the team or project), but the same concept of short Sprints can also be applied to testing and inspection, especially given the noisy, high error rate of static analysis tools such as FxCop or FindBugs (or worse, the popular commercial security review tools!). It is very nice to be able to refactor and add functionality, but I think it’s also smart to plan and prepare for that functionality from the start. Not to mention coding standards, which can be checked with much faster tools such as the style checkers StyleCop and PMD.

Last year, I was working with a development team and we went through a lot of these issues. I was very hyped about Continuous Integration and TDD, as well as build metrics such as code coverage and cyclomatic complexity. After seeing many of the results, I think McCabe complexity doesn’t work well for security purposes, and that code coverage is merely a tool to help improve test writing (not a goal or standard, per se).

The tests can certainly help a ton; however, the developers seemed faster to respond to issues raised in peer review using failed build reports (or just picking on the guy who never listens). It helped that they were using a framework that supported MVC (although there were many valid arguments that MVP models were better, especially for testing purposes), as well as dependency injection.

You could go on about Security Patterns and Design-by-Contract for a long time, or about how Model-driven development and Behavior-driven development differ from Test-driven development, but really I think you’re getting at a few specific areas of interest that I really appreciate.

Developers already have to minimize surface area to avoid compile-time dependencies; it’s already to their benefit to do this sort of thing in order to have maintainable code. Unfortunately, modern languages (especially web application languages of both the bytecode and scripting varieties) expose public interfaces by default, with methods effectively virtual. Note that last decade’s language of choice, C++, did not, so you might make the claim that Classic ASP has its benefits along with its drawbacks (and there are other examples of this).

In C# (for ASP.NET), this is typically done by marking your classes as “sealed”. In addition, you can clobber base class methods with “new”. There are better ways to do this, however, which I will get to in a second.
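The Java analogue, for what it’s worth, would be marking classes and methods final, so subclasses can’t widen the exposed surface by overriding. A throwaway sketch with an invented name:

    // Hypothetical helper: `final` prevents subclassing, so nobody can
    // override render() to expose behavior you never intended to ship.
    public final class WhizHelper {
        public String render(String input) {
            return escape(input);
        }

        private static String escape(String s) {
            return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
        }
    }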

If supported, you can limit dependencies between libraries with dependency injection (DI). DI works by trading compile-time dependencies for runtime dependencies through XML config files (although some DI frameworks, such as Google Guice, are XML-free). I’m most familiar with Spring (for JEE, which also has Guice and PicoContainer as DI options), but I’ve been looking at C#/.NET ones such as Castle Project MonoRail / Windsor, as well as the newer ones: Ninject and Microsoft’s very own Unity DI framework. Somebody smart is probably hitting up Wikipedia right now, but those are the only ones I’m familiar with or have read about.
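To make the idea concrete, here’s a minimal sketch of constructor-based injection in plain Java with invented names and no framework; Spring or Guice would just move the wiring into config or annotations:

    // The service depends only on an interface, never on the concrete
    // third-party class, so the dependency is decided at wiring time.
    interface ReportStore {
        void save(String name, byte[] data);
    }

    class ReportService {
        private final ReportStore store;

        ReportService(ReportStore store) {
            this.store = store;  // injected by the caller, not constructed here
        }

        void publish(String name, byte[] data) {
            store.save(name, data);
        }
    }

Swapping the third-party implementation (or a stub for testing) then touches one line of wiring instead of every caller.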

Additionally, this helps with testing in isolation, a huge win for security testing. Stephen de Vries wrote an excellent paper on this subject, “Security Testing Applications through Automated Software Tests” (http://research.corsaire.com/whitepapers/technical.html). In it, Stephen says:

“Unit tests should only test a single class and should not rely on helper or dependent classes. Since few classes exist in such a form of isolation it is usually necessary to create a “stub” or “mock” of the helper class that only does what is expected by the calling class and no more. Using this technique has the added benefit of allowing developers to complete modules in parallel without having to wait for dependent modules to be completed. To enable this form of testing it is important that the code is pluggable, this can be achieved by using the Inversion of Control (IoC) or Service Locator design patterns. Pluggable code using these patterns is a worthy goal in itself and the ease with which they allow tests to be performed is just one of their many advantages.”
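In JUnit terms, and reusing the invented names from my DI sketch above, a hand-rolled stub looks something like this:

    // Stub that records what the class under test did, and nothing more.
    class StubReportStore implements ReportStore {
        String lastSavedName;

        public void save(String name, byte[] data) {
            lastSavedName = name;
        }
    }

    public class ReportServiceTest {
        @org.junit.Test
        public void publishWritesToTheStore() {
            StubReportStore stub = new StubReportStore();
            new ReportService(stub).publish("q2-report", new byte[0]);
            org.junit.Assert.assertEquals("q2-report", stub.lastSavedName);
        }
    }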

I guess if you’re not going to talk about DI/IoC, then I certainly will at some point. There’s a lot more to be said (I’ve seen frameworks that combine AOP with DI), but I think this is probably enough for now. Suffice it to say, there is a lot of synergy among these development patterns for removing surface area, easing testing, and so on. Looking forward to the next post.

Chris Eng | June 24, 2008 5:07 pm

@Dre:

Your comments are always longer than my original posts. :)

I think you’ve forked off in a slightly different direction than where I was going, but it is an interesting direction nonetheless. Your focus seems to be on exploring coding practices that may make it more efficient for an organization to unit test its own code (correct me if I’ve misunderstood). Beyond unit testing though, IoC frameworks introduce some unique challenges to automated data/control flow analysis and they can be confusing for a code reviewer as well. That’s a totally separate discussion though.

My intent in the post is not to suggest that people should change their entire development practice. Rather, it is to illustrate that while an organization may be going to great lengths to incorporate security into their custom code, they’re often being extremely careless in allowing third-party interfaces to unnecessarily widen the scope of potential attack vectors. It’s almost a deployment-related task (e.g. removing all the sample apps from Tomcat), but not quite.

In a typical development shop, who would own the task of analyzing and reducing the attack surface of third-party libraries? I don’t think it has a clear owner, which is probably why it’s often overlooked.

Andre Gironda | June 25, 2008 4:40 am

“Your focus seems to be on exploring coding practices that may make it more efficient for an organization to unit test its own code (correct me if I’ve misunderstood).”

Yes, I did want to point out that these sorts of coding practices make testing more efficient (and more capable). I consider this a primary benefit to security, although using DI or IoC certainly will reduce the surface area of third-party code as a secondary benefit (or is this the primary one?). Lead developers will often integrate DI/IoC/TDD/plugin-pattern concepts into the mix of things their team needs to learn and use, along with the choice of third-party libraries and coding standards for base class libraries.

“In a typical development shop, who would own the task of analyzing and reducing the attack surface of third-party libraries?”

Maybe it would be the same person who is also responsible for solving / working around the issues with compile-time dependencies?

“Beyond unit testing though, IoC frameworks introduce some unique challenges to automated data/control flow analysis and they can be confusing for a code reviewer as well. That’s a totally separate discussion though.”

Actually, it’s really part of the same discussion. If I’m suggesting that DI or IoC frameworks be used for security purposes, then it makes sense to talk about the disadvantages as well. I wouldn’t say the challenges are unique to DI/IoC, but I would say that the majority of automated security review tools out there have a hard time supporting it.

“It’s almost a deployment-related task (e.g. removing all the sample apps from Tomcat), but not quite.”

To put it in your Unix README terminology: tarfiles sometimes shipped with MANIFEST files that listed every file in the archive. Assembly manifests (when provided in third-party packages) should contain the assembly’s metadata: its name and version, the files that make it up (including their names and hash values), its compile-time dependencies on other assemblies, the culture or language it supports, and the set of permissions it requires to run properly.
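On the Java side you can get a rough equivalent yourself; here’s a quick sketch (class name invented) that dumps a third-party JAR’s manifest and entry list so you can audit exactly what you’re deploying:

    import java.util.Enumeration;
    import java.util.jar.JarEntry;
    import java.util.jar.JarFile;
    import java.util.jar.Manifest;

    public class JarAudit {
        public static void main(String[] args) throws Exception {
            JarFile jar = new JarFile(args[0]);
            Manifest mf = jar.getManifest();  // META-INF/MANIFEST.MF, if present
            if (mf != null) {
                System.out.println(mf.getMainAttributes().entrySet());
            }
            // Every file the library actually ships, wanted or not.
            for (Enumeration<JarEntry> e = jar.entries(); e.hasMoreElements(); ) {
                System.out.println(e.nextElement().getName());
            }
            jar.close();
        }
    }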

Evan | June 25, 2008 8:22 am

Chris: I’ve always felt that this is one of the problems with development these days. Everyone is looking for the quick fix/integration technique. Developers are forgetting three things:

1) Never trust data coming into your application/class
2) Design with re-use in mind.
3) Keep it simple

While these frameworks allow companies to develop systems more rapidly, they tend to open up a lot of other issues. The Struts validation framework, for example, leads developers to forget about validating data as it comes into the class itself (who’s to say the web front end is the only place data will be coming from?).
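Point 1 is cheap to follow even when a framework validates upstream. A toy sketch of a class that refuses bad data no matter where it came from:

    // The class enforces its own invariants instead of trusting the caller
    // (or the web tier) to have validated the input already.
    public class Account {
        private final String username;

        public Account(String username) {
            if (username == null || !username.matches("[A-Za-z0-9_]{1,32}")) {
                throw new IllegalArgumentException("invalid username");
            }
            this.username = username;
        }

        public String getUsername() {
            return username;
        }
    }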

XML file usage has gotten way out of hand and adds to the complexity. Sure, someone could develop a front end or an Eclipse plugin to assist, but it’s still another area where manipulation and management need to occur (keep it simple).

While I’m not perfect at it, I was always taught in my undergrad program to know what you are using and what it does. Without that, you’re flying blind.

