Global Internet penetration started in the late 1980s and early 1990s, when an increasing number of research institutions from all over the world started to interconnect with each other and the first commercial Internet Service Providers (ISPs) began to emerge. At that time, the Internet was primarily used to exchange messages and news between hosts. By 1990, the number of interconnected hosts had grown to more than 300,000. In the same year, Tim Berners-Lee and Robert Cailliau from CERN started the World Wide Web (WWW) project to allow scientists to share research data, news and documentation in a simple manner.
Berners-Lee developed all the tools necessary for a working WWW, including an application protocol (HTTP), a language to create web pages (HTML), a Web browser to render and display web pages and a Web server to serve them. Because the WWW provided the infrastructure for publishing and obtaining information via the Internet, it simplified the use of the Internet. As a result, the Internet became tremendously popular among non-technical users, and the number of connected hosts kept increasing.
Over the past decade, affordable, high-speed and ‘always-on’ Internet connections have become the standard. Due to the ongoing investments in local cell infrastructure, the Internet can now be accessed from almost anywhere and on almost any device. People access the Internet using desktops, notebooks, tablet PCs and cell phones from home, the office, bars, restaurants, airports and other places.
Along with investments in Internet infrastructure, companies started to offer new types of services that helped to make the Web a more compelling experience. These services could not have been realized without a technological evolution of the Web. The Web has evolved from simple static pages to very sophisticated web applications, whose content is dynamically generated depending on the user’s input. Like static web pages, web applications can be accessed over a network such as the Internet or an intranet using a Web browser. The ubiquity of Web browsers, the ability to update and maintain web applications without distributing and installing software on potentially thousands of computers, and the cross-platform compatibility of web applications are factors that contributed to their popularity.
The technological evolution of the Web has dramatically changed the type of services that are offered on the Web. New services such as social networking have been introduced, and traditional services such as e-mail and online banking have been replaced by offerings based on Web technology. Notable in this context is the emerging strategy of software vendors to replace their traditional server or desktop application offerings with sophisticated web applications. An example is SAP’s Business ByDesign, an ERP solution offered by SAP as a web application.
The technological evolution has changed the way people use the Web. Today, people critically depend on the Web to perform transactions, to obtain information, to interact, to have fun and to socialize via social networking sites such as Facebook and Myspace. Search engines such as Google and Bing allow people to search for and obtain all kinds of information. The Web is also used for many different commercial purposes. These include purchasing airline tickets, do-it-yourself auctions via eBay and electronic marketplaces such as Amazon.com.
Over the years, the World Wide Web has attracted many malicious users, and attacks against web applications have become prevalent. Recent data from the SANS Institute suggest that up to 60% of Internet attacks target web applications [23]. The insecure state of the Web can be attributed to several factors.
First, the number of vulnerabilities in web applications has increased over the years. We observe that as of 2006, more than half of the reported vulnerabilities are web-related vulnerabilities. The situation has not improved in recent years. According to a report from WhiteHat Security based on an analysis of 3000 web sites in 2010, a web site contained on average 230 vulnerabilities [105]. Although not all web vulnerabilities pose a security risk, many vulnerabilities are exploited by attackers to compromise the integrity, availability or confidentiality of a web application. Second, attackers have a wide range of tools at their disposal to find web vulnerabilities and launch attacks against web applications. The advanced functionality of Google Search allows attackers to find security holes in the configuration and programming code of websites. This is also known as Google Hacking [12, 72]. Furthermore, there is a wide range of tools and frameworks available that allow attackers to launch attacks against web applications. Most notable in this context is the Metasploit framework [85]. This modular framework draws on the world’s largest database of quality-assured exploits, including hundreds of remote exploits, auxiliary modules, and payloads. Finally, attackers have clear motivations to perform attacks against web applications. These attacks can result in, among other things, data leakage, impersonation of innocent users and large-scale malware infections.
An increasing number of web applications store and process sensitive data such as user credentials, account records and credit card numbers. Vulnerabilities in web applications may lead to data breaches that allow attackers to collect this sensitive information. The attacker may use this information for identity theft or sell it on the underground market. Stealing large amounts of credentials and selling them on the underground market can be profitable for an attacker, as shown by several studies [124, 13, 39, 104]. Security researchers estimate that stolen credit card numbers can be sold for between $2 and $20 each [13, 39]. For bank accounts, the price per item ranges between $10 and $1000, while for e-mail passwords it ranges between $4 and $30 [39].
Vulnerable web applications can also be used by attackers to perform malicious actions on the victim’s behalf as part of a phishing attack. In these types of attacks, attackers masquerade as a trustworthy entity and use social engineering techniques to acquire sensitive information such as user credentials or credit card details, and/or to let the user perform unwanted actions. Certain flaws in web applications, such as cross-site scripting, make it easier for an attacker to perform a successful phishing attack, because in such an attack the user is directed to the bank’s or service’s own web page, where everything from the web address to the security certificates appears to be correct. The costs of phishing attacks are significant: RSA estimates that the worldwide losses from phishing attacks in the first half of 2011 amounted to more than 520 million dollars [94].
Legitimate web applications that are vulnerable can be compromised by attackers to install malware on the victim’s host as part of a drive-by download [84]. The installed malware can take full control of the victim’s machine, and the attacker uses the malware to make a financial profit. Typically, malware is used for purposes such as acting as a botnet node, harvesting sensitive information from the victim’s machine, or performing other malicious actions that can be monetized. Web-based malware is actively traded on the underground market [124]. While no reliable estimates exist of the total amount of money attackers earn by trading virtual assets such as malware on the underground market, some activities have been analyzed. A study performed by McAfee [64] shows that compromised machines are sold as anonymous proxy servers on the underground market for a price ranging between $35 and $550 a month, depending on the features of the proxy.
Attacks against web applications affect the availability, integrity and confidentiality of web applications and the data they process. Because our society heavily depends on web applications, attacks against web applications pose a serious threat. The increasing number of web applications that process more and more sensitive data has made the situation even worse.
To summarize, the current insecure state of the Web can be attributed to the prevalence of web vulnerabilities, the readily available tools for exploiting them and the (financial) motivations of attackers. Unfortunately, the growing popularity of the Web is likely to aggravate the situation: attacks can potentially affect a larger number of innocent users, which results in more profit for the attackers and thus motivates them further. The situation needs to be improved because the consequences of attacks are dramatic in terms of financial losses and the effort required to repair the damage.
To improve security on the Web, much effort has been spent in the past decade on making web applications more secure. Organizations such as MITRE [62], the SANS Institute [23] and OWASP [79] have emphasized the importance of improving security education and awareness among programmers, software customers, software managers and chief information officers. In addition, the security research community has worked on tools and techniques to improve the security of web applications. These tools and techniques mainly focus either on reducing the number of vulnerabilities in applications or on preventing the exploitation of vulnerabilities.
Although a considerable amount of effort has been spent by many different stakeholders on making web applications more secure, we lack quantitative evidence of whether this attention has improved the security of web applications. In this thesis, we study how web vulnerabilities have evolved in the past decade. We focus on SQL injection and cross-site scripting vulnerabilities, as these classes of web application vulnerabilities have the same root cause: improper sanitization of user-supplied input that results from invalid assumptions made by the developer about the input of the application. Moreover, these classes of vulnerabilities are prevalent, well-known and have been well-studied in the past decade.
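To make the shared root cause concrete, the following minimal sketch contrasts code that trusts user-supplied input with code that does not. It is an illustration only and is not taken from the thesis: the table `users`, the function names and the sample payloads are hypothetical, and Python's standard `sqlite3` and `html` modules stand in for whatever database and templating layer a real web application would use.

```python
import html
import sqlite3


def build_db():
    # Hypothetical in-memory table used only for this illustration.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "a-secret"), ("bob", "b-secret")],
    )
    return conn


def lookup_unsafe(conn, name):
    # SQL injection: the input is concatenated into the statement, so
    # quote characters in `name` can change the structure of the query.
    query = "SELECT secret FROM users WHERE name = '" + name + "'"
    return [row[0] for row in conn.execute(query)]


def lookup_safe(conn, name):
    # Parameterized query: the input is passed as data and never parsed as SQL.
    query = "SELECT secret FROM users WHERE name = ?"
    return [row[0] for row in conn.execute(query, (name,))]


def greet_unsafe(name):
    # Cross-site scripting: the input is copied verbatim into the HTML output.
    return "<p>Hello, " + name + "</p>"


def greet_safe(name):
    # Output encoding: HTML metacharacters are escaped before insertion.
    return "<p>Hello, " + html.escape(name) + "</p>"


if __name__ == "__main__":
    conn = build_db()
    sql_payload = "x' OR '1'='1"
    xss_payload = "<script>alert(1)</script>"
    print(lookup_unsafe(conn, sql_payload))
    print(lookup_safe(conn, sql_payload))
    print(greet_unsafe(xss_payload))
    print(greet_safe(xss_payload))
```

In both unsafe functions the developer implicitly assumes that the input contains no characters with special meaning in the target language (SQL or HTML). The safe variants drop that assumption by keeping data separate from code, through query parameters in one case and output encoding in the other.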
1 Introduction |