Storing your email address and telephone number in mailto: and tel: links, and the inherent drawbacks of these methods
Shortcomings of disguising email in markup to avoid spam and other malicious requests (disguises such as mail [at] mail [dot] com)
Pitfalls of CDNs (Content Delivery Networks)
A widely used code snippet that is subject to XSS attacks
Relying on the HTML markup for important application data (such as product prices)
A loose security mechanism in an Australian governmental website


Avoid giving sensitive information in a plainly visible way in the HTML markup

We all know the “Mail Me!” link built on mailto:. However, giving out your email like this makes it easy for bots to pick it out and store it in a database or file, leaving you open to spam and other malicious attacks. To illustrate, I have created a sample script that gets all results for "mailto:" and <a href="tel: xxx-xxx-xxxx"> from MeanPath and either stores them in a file or displays them in the browser. This script is just a sample and assumes that all results are on a single page. Furthermore, MeanPath shows only 100 rows of the results unless you pay to get all of them. Regular expressions keep only those results that contain valid email addresses shown directly in the anchor tag. For the telephone mining, the script only collects numbers in the xxx-xxx-xxxx format. The code also ensures that no duplicate emails or telephone numbers are entered into the list.
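The filtering and de-duplication steps described above can be sketched roughly as follows. This is a Python illustration, not the PHP script from the article: it works on an HTML string you already have rather than on MeanPath results, and the function name `mine_contacts` and both patterns are assumptions made for the example.

```python
import re

# Hypothetical patterns: an address inside a mailto: link, and a phone
# number in the xxx-xxx-xxxx format inside a tel: link.
MAILTO = re.compile(
    r'href\s*=\s*["\']mailto:([\w.+-]+@[\w-]+(?:\.[\w-]+)+)', re.IGNORECASE
)
TEL = re.compile(
    r'href\s*=\s*["\']tel:\s*(\d{3}-\d{3}-\d{4})["\']', re.IGNORECASE
)


def mine_contacts(html):
    """Return (emails, phones) found in html, without duplicates."""
    # dict.fromkeys() drops duplicates while keeping first-seen order.
    emails = list(dict.fromkeys(m.lower() for m in MAILTO.findall(html)))
    phones = list(dict.fromkeys(TEL.findall(html)))
    return emails, phones


if __name__ == "__main__":
    sample = (
        '<a href="mailto:someone@example.com">Mail Me!</a> '
        '<a href="tel:555-123-4567">Call Me!</a>'
    )
    print(mine_contacts(sample))
```

Even a sketch this small is enough to harvest every address on a page, which is why plainly visible contact links are so easy for bots to collect.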

Figure 1: A view of some of the collected emails and telephone numbers. A better way to mine such data would be through

Figure 2: First part of the code. It creates a MeanPath class with a function called mine_elements(), which gets all results from MeanPath and stores them in an array. The other function, filter_elements(), keeps only the elements that match the query given when the MeanPath object was instantiated and also makes sure there are no duplicate entries.

Figure 3: Second part of the code. The function display_data() shows the data in an ordered list in the browser. The function save_data_to_file() saves it to a file whose name is passed when calling the function. Lastly, the MeanPath class is instantiated, and the data is saved to a file and displayed in the browser.

Thus, it should be evident by now that giving out personal information should involve some safeguards. Of course, this is not always necessary. I also have to say that “encoding” the email in a format such as “sample [at] sample [dot] com” or “sample [at] sample . com” does not make it any more secure.

Here we have a short snippet of code. An HTML paragraph contains an email written in that format, and a PHP script reads the file, extracts the email with a simple regular expression, and saves the pieces into an array. The regular expression looks for any number of characters followed by [at] or @, followed by any number of characters, followed by one of a few top-level domains. Here is the result of the search as shown in the browser:

[php]
array(5) {
  [0]=> string(44) " Contact me at stereo [at] room [dot] com"
  [1]=> string(25) " Contact me at stereo "
  [2]=> string(4) "[at]"
  [3]=> string(12) " room [dot] "
  [4]=> string(3) "com"
}
[/php]

matches[0] is the full text that matched our search, matches[1] is the part matched by the first parenthesized group of the regex, matches[2] is the part matched by the second parenthesized group, and so on.
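To show how little protection this disguise offers, here is a minimal Python sketch that turns such text back into a usable address. It is not the PHP snippet from the article: the pattern, the `deobfuscate` name, and the choice to accept both "[dot]" and a spaced "." are assumptions made for illustration, and the pattern is deliberately loose.

```python
import re

# Hypothetical pattern: a local part, then "[at]" or "@", then domain
# labels separated by "[dot]" or ".", with optional spaces around each.
OBFUSCATED = re.compile(
    r"([\w.+-]+)\s*(?:\[at\]|@)\s*([\w-]+(?:\s*(?:\[dot\]|\.)\s*[\w-]+)+)",
    re.IGNORECASE,
)
SEPARATOR = re.compile(r"\s*(?:\[dot\]|\.)\s*", re.IGNORECASE)


def deobfuscate(text):
    """Return the email addresses hidden in text, in normal form."""
    emails = []
    for local, domain in OBFUSCATED.findall(text):
        # Collapse every "[dot]" or spaced "." into a plain dot.
        domain = SEPARATOR.sub(".", domain)
        emails.append(f"{local}@{domain}".lower())
    return emails


if __name__ == "__main__":
    print(deobfuscate("<p> Contact me at stereo [at] room [dot] com</p>"))
```

A harvester only needs one extra pattern like this per disguise style, so the obfuscation costs human readers more than it costs bots.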

Use CDNs but be aware of security implications

CDNs (Content Delivery Networks) are a great way to decrease page load time, both because they often host the script in numerous countries and serve the copy that is closest to the user, and because users may have already cached the script by visiting another site that uses that particular CDN. However, there are security risks in that you have no control over what is stored in the loaded file. If the CDN gets compromised, the code in the file you are loading may change, and that can lead to more than just cookies being stolen. Also, the script loaded from the CDN can become unavailable, temporarily or permanently, leading to a frustrating user experience. If the file were on your domain and your site went down, users would know there is a problem with the site. If a CDN file such as jQuery becomes unavailable and you rely on it heavily, however, they would not know what is happening: the site would be up, but it would look completely out of whack. First off, the attacker can change the source code of the delivered script, and:

Replace your design with whatever they want, if the delivered file is a .css file. In that case the attacker can also execute JavaScript snippets if the user is using IE or Firefox.

Replace the code to frustrate users, redirect your site to another one, steal users’ cookies, and load any kind of exploit code they want.

Gain control over everything the page can do in the visitor's browser if the delivered file is a JavaScript file.
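One way to notice that a CDN-hosted file has changed is to compare its digest against one recorded from a copy you trust. The sketch below is only an illustration in Python: the function names are hypothetical, and it assumes you have already downloaded the file's bytes by some other means. The digest is written in the `sha384-<base64>` form that browsers also accept in the Subresource Integrity `integrity` attribute.

```python
import base64
import hashlib


def sri_digest(content, algorithm="sha384"):
    """Return a digest string such as 'sha384-<base64 of the hash>'."""
    raw = hashlib.new(algorithm, content).digest()
    return f"{algorithm}-{base64.b64encode(raw).decode('ascii')}"


def is_unmodified(content, expected):
    """Check content against a digest recorded from a trusted copy."""
    # The algorithm name is the part before the first hyphen.
    algorithm = expected.split("-", 1)[0]
    return sri_digest(content, algorithm) == expected


if __name__ == "__main__":
    trusted_copy = b"console.log('library code');"
    pinned = sri_digest(trusted_copy)

    # Later: bytes fetched from the CDN (simulated here as tampered).
    fetched = trusted_copy + b"stealCookies();"
    print(is_unmodified(trusted_copy, pinned), is_unmodified(fetched, pinned))
```

A check like this only detects tampering. It does nothing for availability, so keeping a local fallback copy of the file is still worth considering.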

Figure 4: Executing arbitrary JavaScript within CSS

Expression() works in some modes of IE8 and in the versions below IE8 (particularly IE7 and IE5), which are still used nowadays. We see a very simple HTML 4.01 page which sets a cookie on each visit to the page without checking if it exists when the page loads. After that we have a