Tuesday, October 27, 2009

On Inferring Autonomous System Relationships in the Internet

Summary

This paper proposes a graph representation of the AS relationships in the Internet at the time it was written. The authors characterize the relationships into customer-provider, peering and sibling relationships. They came up with heuristic algorithms that look at the BGP routing tables and infer the augmented AS graph.

They test the graph generated by their algorithm on AT&T's internal information and find that over 99% of their graph is accurate. They also test sibling-to-sibling relationships using the WHOIS lookup service and find that over half of their graph is accurate.

Their annotated AS graph consists of nodes connected by one of 3 kinds of edges: provider-to-customer, peer-to-peer and sibling-to-sibling.

They explain the basic 3-phase heuristic algorithm they use. The first phase parses the routing tables and calculates the degree of each AS. The next phase goes through all the entries in the routing table and parses it to find the top provider and consecutive AS pairs before and after the top provider. Finally, in the third phase, it assigns all those consecutive AS pairs with a transit relationship.

The basic algorithm assumes that there are no misconfiguration errors. They also propose a refined algorithm that accounts for any misconfiguration errors. Furthermore, they propose another algorithm that infers peering relationships in addition to the provider-customer and sibling relationships that the previous 2 infer.

Finally, they implement all 3 algorithms and using AT&T's internal data and the WHOIS service, they determine that their algorithms are fairly accurate.

Criticisms & Questions

I think this was an interesting paper. I think they did a decent job of explaining the algorithms and their experimental results. I would, however, have liked them to expand more on the motivation for their work. They spend a lot of time talking about how BGP works and what each of the different relationships are. However, they only briefly mention several use cases for the augmented graph representation they present. And, with those few examples, I find it hard to understand why any ISP would need to know the entire representation if they wanted to find out information about a specific ISP. If it's just one ISP, can't they just ask them?

2 comments:

  1. I think your critique is a little tough on them. What they were able to achieve given limited visibility on the whole Internet reachability structure is pretty amazing. The methodology, with support from AT&T, is also impressive in that it shows a high degree of accuracy in determining how ASs should be classified. They did even find a few routing misconfigurations!

    ReplyDelete
  2. I'm certainly not saying that what they did wasn't amazing - I think the fact that they were able to come up with a graph representation of the internet is very interesting.

    I guess my confusion was more about how this would be used by the ISPs. They mentioned that one use case is for an ISP who plans to go into business with another ISP and would use the graph to find out what the other ISP's reachability is. Being able to do this automatically is a huge save in timing, but I'm wondering if this can't be done manually for some reason (ie. can the ISP just talk to the other ISP to find out this information)? And, maybe they can't, maybe it's too time-consuming to get this information.

    ReplyDelete