BGP Part3. BGP Best Path Selection and Manipulation

[Figure: BGP topology]

We continue with the same topology as our previous two posts regarding BGP.

In this post we will play with the BGP best-path algorithm and look at some interesting situations.

So far we haven’t touched R7 (ASR920), but now we will add it to the configuration. We will also see how to configure a routing protocol using address families, which is the recommended way of configuring things nowadays.

R7(config)#router bgp 444
R7(config-bgp)#bgp router-id 7.7.7.7
R7(config-bgp)#neighbor 10.73.73.3 remote-as 222
R7(config-bgp)#neighbor 10.75.75.5 remote-as 222
R7(config-bgp)#neighbor 10.76.76.6 remote-as 333
R7(config-bgp)#address-family ipv4 unicast
R7(config-bgp-af)#neighbor 10.73.73.3 activate
R7(config-bgp-af)#neighbor 10.75.75.5 activate
R7(config-bgp-af)#neighbor 10.76.76.6 activate

So first you create the BGP process and give it a router-id. After that you define the neighbors in the global BGP configuration. Once a neighbor is defined, you have to “activate” it under the proper address-family sub-configuration mode.

BEST PATH SELECTION

BGP has a best-path selection algorithm that determines how traffic enters or leaves an AS. BGP automatically installs the first path it receives for a given network as the best path. Afterwards, when additional paths are received for that same network prefix, they are compared against the current best path. If there is a tie on one attribute, the comparison continues down the list of attributes in the best-path selection algorithm:

  1. Prefer the highest weight
  2. Prefer the highest local preference (well-known discretionary)
  3. Prefer the route originated by the local router
  4. Prefer the path with the lowest AIGP metric attribute (optional non-transitive)
  5. Prefer the shortest AS_Path (well-known mandatory)
  6. Prefer the best origin code (well-known mandatory)
  7. Prefer the lowest multi-exit discriminator (MED) (optional non-transitive)
  8. Prefer an external (eBGP) path over an internal (iBGP) path
  9. Prefer the path with the lowest IGP metric to the BGP next hop
  10. Prefer the oldest route for eBGP paths
  11. Prefer the path with the lowest neighbor BGP RID
  12. Prefer the path with the lowest neighbor IP address
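
The first few steps of the list can be sketched as a sort key. This is only a minimal Python sketch, not how IOS implements it: the `Path` class and the `best_path` function are made-up names, only steps 1, 2, 5 and 6 are modelled, and MED and the later tie-breakers are deliberately left out.

```python
from dataclasses import dataclass

# Lower rank is better: IGP (i) beats EGP (e) beats incomplete (?).
ORIGIN_RANK = {"i": 0, "e": 1, "?": 2}


@dataclass
class Path:
    neighbor: str           # hypothetical label for the advertising peer
    as_path: tuple          # e.g. (222, 111)
    origin: str = "i"
    weight: int = 0
    local_pref: int = 100   # default when the attribute is not set


def sort_key(p):
    # Higher weight and local preference win, so they are negated;
    # a shorter AS_Path and a lower origin rank win as they are.
    return (-p.weight, -p.local_pref, len(p.as_path), ORIGIN_RANK[p.origin])


def best_path(paths):
    # min() keeps the first of equal candidates, so the real tie-breakers
    # (MED, eBGP over iBGP, IGP metric, age, RID) are NOT modelled here.
    return min(paths, key=sort_key)


paths = [
    Path("A", (222, 111)),
    Path("B", (333, 444, 111), local_pref=200),
    Path("C", (222, 111), origin="?"),
]
print(best_path(paths).neighbor)
```

Path B wins here even though its AS_Path is longer, because local preference is checked before AS_Path length. The order of the steps matters more than any single attribute.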

The attributes in that list fall into four categories. Well-known mandatory attributes are supported by all BGP implementations and vendors and must be present in BGP update messages. Well-known discretionary attributes are also supported by all BGP implementations, but they don’t have to be present in the updates sent to neighbors. Optional transitive attributes are not understood by all BGP implementations, but because the transitive flag is set, a router that doesn’t recognize them still passes them on to its neighbors. Optional non-transitive attributes are optional as well, but a router that doesn’t recognize them does not pass them on.
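
On the wire, the category is carried in the flags byte of each path attribute. As a rough sketch based on RFC 4271 (the function name `category` is made up for this example), the two highest bits are enough to tell the optional categories apart, while the flags alone do not distinguish mandatory from discretionary:

```python
OPTIONAL = 0x80    # highest bit: 1 = optional, 0 = well-known
TRANSITIVE = 0x40  # next bit: 1 = transitive


def category(flags):
    optional = bool(flags & OPTIONAL)
    transitive = bool(flags & TRANSITIVE)
    if not optional:
        # Well-known attributes always have the transitive bit set;
        # mandatory vs. discretionary is defined per attribute, not by a flag.
        return "well-known"
    return "optional transitive" if transitive else "optional non-transitive"


print(category(0x40))  # flags used by ORIGIN, AS_PATH, LOCAL_PREF
print(category(0x80))  # flags used by MED
print(category(0xC0))  # flags used by e.g. COMMUNITIES
```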

We will look at some of the most common ones and how to influence traffic in a way we want.

Weight

Weight is defined by Cisco, and not all vendors support it. It is the first step in deciding the best path. Weight is not advertised to other routers, so it is only locally significant; with it we can choose which route should be the best one for a specific network prefix when we have multiple paths. In other words, it influences outbound traffic. Preference is given to the higher weight. Let’s see it in action and take a look at R7’s BGP table:

R7#show ip bgp
*    11.11.11.11/32   10.76.76.6                             0 333 111 i
 *                     10.75.75.5                             0 222 111 i
 *>                    10.73.73.3                             0 222 111 i
 *    150.0.0.0/24     10.76.76.6                             0 333 111 i
 *                     10.75.75.5                             0 222 111 i
 *>                    10.73.73.3                             0 222 111 i
 *    160.0.0.0/24     10.76.76.6                             0 333 111 i
 *                     10.75.75.5                             0 222 111 i
 *>                    10.73.73.3                             0 222 111 i

As we can see, R7 prefers R3 for all routes that were originated by R1. Why does it do that? If you follow the best-path list above, you will come to the conclusion that it is the oldest route in the BGP table. Simply coincidence; maybe R3 was the first neighborship we established. Let’s change that. Let’s say we want network 11.11.11.11/32 to go via R6; after all, we have fewer hops that way, right?

R7(config)#ip prefix-list NET11 permit 11.11.11.11/32
R7(config)#route-map AS444_WEIGHT_IN permit 10
R7(config-route-map)#match ip address prefix-list NET11
R7(config-route-map)#set weight 444
R7(config)#route-map AS444_WEIGHT_IN permit 20
R7(config)#router bgp 444
R7(config-router)#address-family ipv4 unicast
R7(config-router-af)#neighbor 10.76.76.6 route-map AS444_WEIGHT_IN in

Now let’s see the BGP table:

*>   11.11.11.11/32   10.76.76.6                           444 333 111 i
 *                     10.75.75.5                             0 222 111 i
 *                     10.73.73.3                             0 222 111 i
 *    150.0.0.0/24     10.75.75.5                             0 222 111 i
 *>                    10.73.73.3                             0 222 111 i
 *    160.0.0.0/24     10.75.75.5                             0 222 111 i
 *>                    10.73.73.3                             0 222 111 i

And as we can see, the weight for 11.11.11.11/32 is set to 444 for R6, and the other routes remained unchanged. So we influenced our outbound preference for that network, but we had to do it when the route was coming “in” to the BGP table, by using a route-map.
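
To see why the empty permit 20 entry matters, here is a small Python model of how an inbound route-map is evaluated. It is only a sketch: `apply_route_map` and the data layout are made up for illustration and cover just prefix matching and set weight.

```python
def apply_route_map(entries, route):
    """Return the (possibly modified) route, or None if it is denied.

    entries: list of (action, match_prefixes, set_values) in sequence order.
    match_prefixes=None means "no match clause", which matches everything.
    """
    for action, match_prefixes, set_values in entries:
        if match_prefixes is None or route["prefix"] in match_prefixes:
            if action == "deny":
                return None
            return {**route, **set_values}
    # Implicit deny at the end of every route-map.
    return None


NET11 = {"11.11.11.11/32"}
AS444_WEIGHT_IN = [
    ("permit", NET11, {"weight": 444}),  # route-map ... permit 10
    ("permit", None, {}),                # route-map ... permit 20
]

for prefix in ("11.11.11.11/32", "150.0.0.0/24"):
    print(apply_route_map(AS444_WEIGHT_IN, {"prefix": prefix, "weight": 0}))

# With only the permit 10 entry, every other prefix hits the implicit deny.
only_seq_10 = AS444_WEIGHT_IN[:1]
print(apply_route_map(only_seq_10, {"prefix": "150.0.0.0/24", "weight": 0}))
```

In this model the first prefix gets weight 444, the second passes through untouched, and without the second entry it would be filtered out entirely.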

Local Preference

Local preference (LOCAL_PREF) is a well-known discretionary path attribute and is included with path advertisements throughout an AS. It is not advertised between eBGP peers, and it is most often used to influence the next-hop address for outbound traffic leaving an AS. A higher value is preferred over a lower one. If the BGP router that sits on the edge does not set a local preference for the prefixes it receives, those prefixes get the default local preference of 100, and that value is passed on to the other iBGP peers.

There are many ways and scenarios in which you can manipulate the best-path traffic pattern. In our case we will use LOCAL_PREF on R2 to send an outbound update to R3 with a local preference of 222 for prefix 150.0.0.0/24. Remember that we previously configured R3/R4 as route reflectors, so this example also shows that they carry the best-path attributes along with the routes they reflect. When we set a local preference of 222 outbound to R3, R3 should reflect it to R5, and R5 should then prefer R3 over R4 to reach network 150.0.0.0/24. You can also apply local preference to inbound updates.

R2(config)#access-list 1 permit 150.0.0.0
R2(config)#route-map LPREF permit 10
R2(config-route-map)#match ip address 1
R2(config-route-map)#set local-preference 222
R2(config)#route-map LPREF permit 20
R2(config)#router bgp 222
R2(config-router)# neighbor 192.168.3.1 route-map LPREF out

Now we can skip checking R3 and go directly to R5’s BGP table to see whether R5 received the prefix with a local preference of 222 and whether it prefers R3 over R4.

R5#show ip bgp

*>i 11.11.11.11/32   192.168.2.1              0    100      0 111 i
 * i                  192.168.2.1              0    100      0 111 i
 *                    10.75.75.7                             0 444 333 111 i
 *>i 150.0.0.0/24     192.168.2.1              0    222      0 111 i
 * i                  192.168.2.1              0    100      0 111 i
 *>i 160.0.0.0/24     192.168.2.1              0    100      0 111 i
 * i                  192.168.2.1              0    100      0 111 i

And as we can see, it is indeed preferring the path with the higher local preference. In this case we cannot see R3 as the next hop, because route reflectors do not change the next hop of the routes they reflect; they just forward the received routes. You can try a different topology without route reflectors to see the effect.

Shortest AS_Path

AS_Path length is also one of the most common decision factors in the BGP best-path algorithm; you can look at it as an AS hop count. A shorter AS_Path is preferred over a longer one. You can use this to manipulate the best path for some routes with the prepend method: you prepend an AS number to the AS_Path of a prefix, adding additional hops. Best practice is usually to prepend your own AS and not somebody else’s.
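
As a quick illustration of the idea, here is a tiny Python sketch (the `prepend` helper is a made-up name). It only models the net effect on the AS_Path as the receiving neighbor sees it, using the 222 111 path from this topology:

```python
def prepend(as_path, asn, times=1):
    # Prepended AS numbers go on the left, closest to the receiving neighbor.
    return (asn,) * times + tuple(as_path)


via_r3 = (222, 111)
via_r5 = prepend((222, 111), 222)  # net effect of "set as-path prepend 222"

print(via_r5, len(via_r5))
print(min([via_r3, via_r5], key=len))  # the shorter AS_Path is preferred
```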

Let’s use this method to affect which path R7 takes to reach 150.0.0.0/24 network. At the moment this is the BGP table for R7:

R7#show ip bgp
*>   11.11.11.11/32   10.76.76.6                           444 333 111 i
 *                     10.73.73.3                             0 222 111 i
 *                     10.75.75.5                             0 222 111 i
 *    150.0.0.0/24     10.76.76.6                             0 333 111 i
 *                     10.73.73.3                             0 222 111 i
 *>                    10.75.75.5                             0 222 111 i
 *    160.0.0.0/24     10.76.76.6                             0 333 111 i
 *                     10.73.73.3                             0 222 111 i
 *>                    10.75.75.5                             0 222 111 i

As we can see, R7 prefers the path through R5. Let’s change that by prepending an additional AS to the AS_Path on R5, so that R7 switches to a better path.

R5(config)#access-list 1 permit 150.0.0.0
R5(config)#route-map PREPEND permit 10
R5(config-route-map)#match ip address 1
R5(config-route-map)#set as-path prepend 222
R5(config)#route-map PREPEND permit 20
R5(config-route-map)#router bgp 222
R5(config-router)#neighbor 10.75.75.7 route-map PREPEND out

Now we check the BGP table on R7. Sometimes BGP can be slow, so you can use clear ip bgp * soft in to force it to refresh inbound updates without resetting the peer sessions, which is quite useful at times.

R7#clear ip bgp * soft in
R7#show ip bgp  

*>   11.11.11.11/32   10.76.76.6                           444 333 111 i
 *                     10.73.73.3                             0 222 111 i
 *    150.0.0.0/24     10.76.76.6                             0 333 111 i
 *>                    10.73.73.3                             0 222 111 i
 *                     10.75.75.5                             0 222 222 111 i
 *>   160.0.0.0/24     10.76.76.6                             0 333 111 i
 *                     10.73.73.3                             0 222 111 i

And as we can see, R7 now prefers the path via R3 instead of R5 like before, and the path received from R5 now carries the extra prepended 222 in its AS_Path.

Origin Type

Next is the well-known mandatory BGP attribute named origin. By default, networks advertised on Cisco routers using the network statement are set with the i (IGP) origin, and redistributed networks are assigned the ? (incomplete) origin. The origin preference order is as follows:

  1. IGP origin (most preferred)
  2. Exterior Gateway Protocol (EGP) origin
  3. Incomplete origin (least preferred)

Let’s continue from the last example: R7 now prefers R3 for the 150.0.0.0/24 prefix. Let’s change that once again so it prefers R6. We will go to R3 and change the origin type of 150.0.0.0/24 to incomplete, meaning we will lie that it was redistributed somewhere along the way, so when R7 receives it, it will see it as incomplete (“?”).

R3(config)#access-list 1 permit 150.0.0.0
R3(config)#route-map ORIGIN permit 10
R3(config-route-map)#match ip address 1
R3(config-route-map)#set origin incomplete
R3(config)#route-map ORIGIN permit 20
R3(config-route-map)#router bgp 222
R3(config-router)#neighbor 10.73.73.7 route-map ORIGIN out

And now let’s check the R7 BGP table:

R7#show ip bgp

*    11.11.11.11/32   10.73.73.3                             0 222 111 i
 *                     10.75.75.5                             0 222 111 i
 *>                    10.76.76.6                           444 333 111 i
 *>   150.0.0.0/24     10.76.76.6                             0 333 111 i
 *                     10.73.73.3                             0 222 111 ?
 *                     10.75.75.5                             0 222 222 111 i
 *    160.0.0.0/24     10.73.73.3                             0 222 111 i
 *                     10.75.75.5                             0 222 111 i
 *>                    10.76.76.6                             0 333 111 i

And as we can see, the origin type is “?” (incomplete) for 150.0.0.0/24 coming from R3. So R7 switched to the only option it had left, which is R6: the path from R5 still has the longer, prepended AS_Path from before, and the path from R3 now has a worse origin code, so R6 wins.
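
The same decision can be replayed step by step. Below is a small Python sketch that takes the three 150.0.0.0/24 paths from the table above and eliminates them one step at a time. The `decide` function and the dictionary layout are made up for this example, and only weight, AS_Path length and origin are modelled (local preference is the same for all three paths here, so it is left out).

```python
ORIGIN_RANK = {"i": 0, "e": 1, "?": 2}

# (step name, key function); a lower key is better at every step.
STEPS = [
    ("weight", lambda p: -p["weight"]),
    ("AS_Path length", lambda p: len(p["as_path"])),
    ("origin", lambda p: ORIGIN_RANK[p["origin"]]),
]


def decide(paths):
    """Eliminate paths step by step; return (remaining paths, trace)."""
    candidates = list(paths)
    trace = []
    for name, key in STEPS:
        best = min(key(p) for p in candidates)
        dropped = [p["via"] for p in candidates if key(p) != best]
        candidates = [p for p in candidates if key(p) == best]
        if dropped:
            trace.append((name, dropped))
        if len(candidates) == 1:
            break
    return candidates, trace


paths_150 = [
    {"via": "R6", "weight": 0, "as_path": (333, 111), "origin": "i"},
    {"via": "R3", "weight": 0, "as_path": (222, 111), "origin": "?"},
    {"via": "R5", "weight": 0, "as_path": (222, 222, 111), "origin": "i"},
]

winners, trace = decide(paths_150)
for step, dropped in trace:
    print(f"{step}: dropped {', '.join(dropped)}")
print("best:", winners[0]["via"])
```

R5 falls out at the AS_Path length step, R3 at the origin step, and R6 is what remains.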

MED (Multi-Exit Discriminator) Attribute

The next best-path decision is based on an optional non-transitive attribute called MED.

As Cisco puts it, MED is a hint to external neighbors about the preferred path into an AS that has multiple entry points.

This means that if we have multiple peerings/entry points to our neighbor, with MED we can influence which path our neighbor takes to reach a specific prefix. A lower MED value is preferred over a higher one.

Note that when you redistribute routes into BGP, BGP automatically sets the MED to the IGP path metric. Additionally, a MED received over an eBGP session can be advertised to iBGP peers, but it should not be sent on outside the AS that received it.

According to the RFC, a prefix without a MED value should be treated as having a MED of 0, which gives it priority over any prefix with a higher MED.

Let’s see an example of MED in our topology. We will be looking at prefix 160.0.0.0/24 this time. It is being received from R3, R5 and R6, and at this point there is no metric and nothing is being manipulated for it. For our example I will take R6 out of the equation by shutting down the interface leading to it. Why? Because MED has a rule: for MED to be considered during best-path selection, the paths have to come from the same AS! We don’t want R6 to mess this up for us by being chosen as best path, because then we couldn’t test the MED.

Since R6 is in AS 333 and R3/R5 are in AS 222, for now we will look only at AS 222 and at how to use MED to influence which path is better between R3 and R5. Later we will bring R6 back into the equation. So let’s get going. At the moment the best path is through R3.

R7#show ip bgp
* >    160.0.0.0/24     10.73.73.3                             0 222 111 i
  *                     10.75.75.5                             0 222 111 i

Let’s change this so R5 is preferred. We will configure R5 with a lower MED value and R3 with a higher one and see what happens.

R3(config)#ip prefix-list MED_160 seq 5 permit 160.0.0.0/24
R3(config)#route-map MED permit 10
R3(config-route-map)#match ip address prefix-list MED_160
R3(config-route-map)#set metric 5000
R3(config)#route-map MED permit 20
R3(config)#router bgp 222
R3(config-router)# neighbor 10.73.73.7 route-map MED out
------------------------------------------------------------
R5(config)#ip prefix-list MED_160 seq 5 permit 160.0.0.0/24
R5(config)#route-map MED permit 10
R5(config-route-map)#match ip address prefix-list MED_160
R5(config-route-map)#set metric 3000
R5(config)#route-map MED permit 20
R5(config)#router bgp 222
R5(config-router)# neighbor 10.75.75.7 route-map MED out

After we’re done with configuration, let’s check the BGP table on R7:

R7#show ip bgp
*    160.0.0.0/24     10.73.73.3            5000             0 222 111 i
 *>                    10.75.75.5            3000             0 222 111 i

We can see that R5 is now chosen as best path, and we can see the metric (MED) values are added.

Still, this is not good enough for us. Why? Because we used MED to influence the path within a single neighboring autonomous system only, due to the rule that MED is compared only between routes coming from the same AS (remember, we removed R6 for this example). In real life you usually peer with two service providers for redundancy, and then you receive the same route twice from two totally different autonomous systems. What then? How can you use MED in that situation?

Let’s see an example of this situation (assuming best-path selection has come down to MED). Imagine the route updates come in this order:

  1. Route from (AS 1) comes in with MED 50 -> best path
  2. Route from (AS 2) comes in with MED 40 -> not compared at all
  3. Route from (AS 1) comes in with MED 30 -> not compared at all

Here we have two problems. The first is that MED is not compared for prefixes received from different ASes. The second is that the result depends on the order of the updates: for two paths from the same AS to have their MEDs compared, the updates must come in sequence, one after the other. As we can see, the third update from AS 1 has the lowest MED and should be the best path, but it was not even considered. So MED can be quite complicated.

Fortunately there are two options, bgp always-compare-med and bgp deterministic-med, which allow MED to be compared regardless of the neighboring AS and of the order in which the updates arrived.
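
The ordering problem with the three updates above can be reproduced with a toy model in Python. This is a sketch under assumptions, not IOS source code: it assumes all paths are eBGP and tied on every step before MED, that paths are compared pairwise from newest to oldest, that an unresolved comparison falls through to "oldest path wins" (step 10 in the list), and that bgp deterministic-med first picks a winner per neighboring AS. All function and field names are invented for the example.

```python
def compare(newer, older, always_compare_med=False):
    """Return the better of two paths that tie on all steps before MED."""
    same_as = newer["neighbor_as"] == older["neighbor_as"]
    if (same_as or always_compare_med) and newer["med"] != older["med"]:
        return newer if newer["med"] < older["med"] else older
    # MED skipped or equal: assume the older eBGP path wins.
    return older


def pairwise_best(paths_in_arrival_order, always_compare_med=False):
    # Walk from newest to oldest, carrying the current winner along.
    newest_first = list(reversed(paths_in_arrival_order))
    best = newest_first[0]
    for older in newest_first[1:]:
        best = compare(best, older, always_compare_med)
    return best


def deterministic_best(paths_in_arrival_order, always_compare_med=False):
    # Model of deterministic-med: pick one winner per neighboring AS first,
    # then compare those winners with each other.
    groups = {}
    for p in paths_in_arrival_order:
        groups.setdefault(p["neighbor_as"], []).append(p)
    winners = [pairwise_best(g, always_compare_med) for g in groups.values()]
    winners.sort(key=lambda p: p["seq"])  # restore arrival order
    return pairwise_best(winners, always_compare_med)


updates = [
    {"seq": 1, "neighbor_as": 1, "med": 50},
    {"seq": 2, "neighbor_as": 2, "med": 40},
    {"seq": 3, "neighbor_as": 1, "med": 30},
]

print("neither option:    ", pairwise_best(updates))
print("always-compare-med:", pairwise_best(updates, always_compare_med=True))
print("deterministic-med: ", deterministic_best(updates))
print("both options:      ", deterministic_best(updates, always_compare_med=True))
```

In this model the MED 30 path only becomes best once MEDs are compared across AS boundaries. Grouping by AS on its own just guarantees that the MED 50 path loses to the MED 30 path from the same AS; the comparison against the other AS is still decided by a later step.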

Always-Compare MED

Always-compare MED is exactly that: it allows MED to be compared regardless of the AS the update came from. Let’s bring R6 back into the equation, give it the lowest MED, and see what happens (remember, R5 was the best path before).

R6(config)#ip prefix-list MED_160 seq 5 permit 160.0.0.0/24
R6(config)#route-map MED permit 10
R6(config-route-map)#match ip address prefix-list MED_160
R6(config-route-map)#set metric 600
R6(config)#route-map MED permit 20
R6(config)#router bgp 333
R6(config-router)#neighbor 10.76.76.7 route-map MED out

Let’s check BGP table on R7 now:

R7#show ip bgp
*    160.0.0.0/24     10.76.76.6             600             0 333 111 i
 *                     10.73.73.3            5000             0 222 111 i
 *>                    10.75.75.5            3000             0 222 111 i

As we can see, nothing happened even though R6 has the lowest MED. Now let’s configure bgp always-compare-med.

R7(config)#router bgp 444
R7(config-router)#bgp always-compare-med

Check the R7 BGP table again and voilà: R6 is now the best path, with the lowest MED.

R7#show ip bgp
*>   160.0.0.0/24     10.76.76.6             600             0 333 111 i
 *                     10.73.73.3            5000             0 222 111 i
 *                     10.75.75.5            3000             0 222 111 i

BGP Deterministic-MED

Even though we dealt with one issue using always-compare-med, we still have the second problem left: if two updates from the same AS arrive out of order (not in sequence), they will NOT be compared on MED. To test this theory, we will decrease the MED on R5 from 3000 to 5, so it has the best MED. Now imagine you shut down all interfaces towards R3/R5/R6 and slowly bring them up one by one, starting with R3 (AS 222, MED 5000), then R6 (AS 333, MED 600). So far so good: R6 is the best path with MED 600 compared to R3’s MED 5000, right? Next we bring up R5 (AS 222, MED 5). R5 should become the best path, right? Well, it won’t, because the update didn’t come in sequence after the first AS 222 update from R3, so it won’t even be considered.

We deal with this issue by using bgp deterministic-med.

R7(config)#router bgp 444
R7(config-router)#bgp deterministic-med 

And now after using bgp deterministic-med R5 should be best path.

Now, I’ve found the following in Cisco’s documentation: “If bgp always-compare-med is enabled, BGP MED decisions are always deterministic.”

While researching and labbing, I found that on some IOS versions you can get both behaviors, comparing MED across different ASes (always-compare-med) and ignoring the arrival order within the same AS (deterministic-med), with the bgp always-compare-med command alone, which, given the quote above, makes bgp deterministic-med redundant.

Other times I had to use both commands, so yeah, MED is indeed an intriguing thing to lab up.

Also keep in mind that when you need both commands to get the desired result, you can end up with bgp deterministic-med enabled and bgp always-compare-med disabled. Be careful in that case: paths from two different ASes are not compared on MED, so MED won’t be considered even though you think it will, and the decision moves on to the next step, where for example one route may be external and the other internal (external routes are preferred over internal). Don’t let this trick you. Always think it through when working with MED.

Thanks for reading!

This post is licensed under CC BY 4.0 by the author.