Tuesday, November 24, 2009

BotGraph: Large Scale Spamming Botnet Detection

Summary

This paper proposes BotGraph, a system for detecting and mapping botnet spamming attacks that target major Web email providers. BotGraph includes an efficient implementation for constructing and analyzing large graphs.

BotGraph has two main components: aggressive sign-up detection and stealthy bot-user detection. First, BotGraph detects aggressive account sign-up by looking for a sudden increase in sign-ups from a single IP address. BotGraph uses this information to limit the number of email accounts the botnet can control.
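
As a rough illustration of the first component, a sudden increase in sign-ups can be flagged by comparing each day's count for an IP address against a smoothed baseline. The sketch below uses an exponentially weighted moving average as that baseline. The function name, the input layout (daily counts per IP), and the `alpha` and `threshold` values are all assumptions made for this example, not values taken from the paper.

```python
from collections import defaultdict


def detect_signup_spikes(daily_counts, alpha=0.5, threshold=10):
    """Flag days on which an IP's sign-up count jumps above its baseline.

    daily_counts: dict mapping IP address -> list of sign-ups per day.
    alpha, threshold: illustrative assumptions, not values from the paper.
    Returns a dict mapping IP address -> list of flagged day indices.
    """
    flagged = defaultdict(list)
    for ip, counts in daily_counts.items():
        if not counts:
            continue
        # Start the baseline at the first day's count.
        forecast = float(counts[0])
        for day, observed in enumerate(counts[1:], start=1):
            if observed - forecast > threshold:
                # Sudden increase: flag the day and leave the baseline
                # untouched so the spike does not inflate it.
                flagged[ip].append(day)
            else:
                # Normal day: fold it into the moving average.
                forecast = alpha * observed + (1 - alpha) * forecast
    return dict(flagged)


# Hypothetical data: the first IP spikes on day 3, the second stays flat.
example_counts = {
    "10.0.0.1": [2, 3, 2, 40, 3],
    "10.0.0.2": [5, 6, 5, 6, 5],
}
spikes = detect_signup_spikes(example_counts)
```

Keeping flagged days out of the baseline is a design choice for this sketch: a spammer who signs up aggressively for several days in a row would otherwise raise the average and mask later spikes.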

The second component identifies the set of user accounts controlled by the botnet. To do this, BotGraph builds a user-user graph in which every node is a user and every edge carries a weight representing how similar the two users are. The reasoning is that all the accounts in a bot-user group are driven by the same toolkit and so should behave similarly. Therefore, with an appropriate similarity metric, a bot-user group should show up as a connected component in the user-user graph.

The similarity metric that BotGraph uses is the number of IP addresses from which both users have logged in. BotGraph relies on the following two properties of botnets:
  • The sharing of one IP address: the ratio of bot-users to bots is 50:1, so many bot-users log in through the same bot, resulting in a shared IP address.
  • The sharing of multiple IP addresses: botnets have a high churn rate, so bots are frequently added to and removed from the botnet. To maximize account utilization, the spammer reassigns accounts to different bots, so each account ends up being used from multiple IP addresses.
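
The graph construction described above can be sketched in a few lines. This is a minimal, single-machine illustration, not the authors' large-scale implementation: the `(user, ip)` login-record format, the function names, and the `min_weight` cutoff of 2 shared IP addresses are assumptions chosen for the example.

```python
from collections import defaultdict
from itertools import combinations


def build_user_graph(logins):
    """Return {(user_a, user_b): number of shared login IPs}.

    logins: iterable of (user, ip) pairs (an assumed log format).
    """
    users_by_ip = defaultdict(set)
    for user, ip in logins:
        users_by_ip[ip].add(user)
    weights = defaultdict(int)
    for users in users_by_ip.values():
        # Every pair of users seen on the same IP gains one unit of weight.
        for u, v in combinations(sorted(users), 2):
            weights[(u, v)] += 1
    return dict(weights)


def connected_components(weights, min_weight=2):
    """Group users linked by edges of weight >= min_weight (union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for (u, v), w in weights.items():
        if w >= min_weight:
            parent[find(u)] = find(v)

    groups = defaultdict(set)
    for node in list(parent):
        groups[find(node)].add(node)
    return [g for g in groups.values() if len(g) > 1]


# Hypothetical login log: a/b share two IPs, c/d share two IPs.
example_logins = [
    ("a", "ip1"), ("a", "ip2"), ("b", "ip1"), ("b", "ip2"),
    ("c", "ip2"), ("c", "ip3"), ("d", "ip2"), ("d", "ip3"),
    ("e", "ip1"),
]
groups = connected_components(build_user_graph(example_logins))
```

In this toy log, user `e` shares only a single IP address with anyone else, so it falls below the assumed cutoff and is left out of every group. Note that enumerating all user pairs per IP address is quadratic in the number of users on that address, which is one reason a real deployment at Hotmail scale needs a more efficient construction.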

The authors ran their graph construction algorithm on Hotmail logs from two one-month periods, one in 2007 and one in 2008. The data consisted of a sign-up log and a login log. After constructing the user-user graph, they found 26 million botnet-created user accounts with a false positive rate of 0.44%.

Criticisms & Questions

This is an interesting idea, and it seems to work under the assumptions the authors make. However, given that spammers constantly evolve their techniques, the basic assumption that bot-users share IP addresses might not remain valid in the future.
