Authored by: Support.com Tech Pro Team
How to Troubleshoot Duplicate IPsec SA Entries on a Netgate Router
In certain cases, an IPsec tunnel may show what appears to be duplicate IKE (phase 1) or Child (phase 2) security association (SA) entries.
Testing and research show that this mainly happens when both sides negotiate or renegotiate simultaneously. If both peers initiate, reauthenticate, or rekey phase 1 at the same time, the result can be duplicate IKE SAs. If both peers rekey phase 2 at the same time, the result can be duplicate child SAs.
Mitigating this problem involves ensuring that the chance of simultaneous negotiation is minimized or eliminated. The easiest way to reach that goal is to set higher phase 1 and phase 2 lifetimes on one peer, or at least make sure both sides are not set identically.
Specific values vary, but the settings below are the best general advice. Using the suggested values may not be possible due to peer constraints, such as third-party vendors which do not support these settings or insist upon other settings. The variations mentioned should accommodate most situations; use as many of these strategies as possible.
The values in this document take precedence over default recommendations in other IPsec examples and recipes.
Current version (2.5.0 and later)
The values in this section can be found in the current GUI. Older versions may have different values or some settings may not be available. Always run the most recent released version to ensure the best experience.
Peer A
Phase 1
Key Exchange Version
IKEv2 if supported by both peers.
Life Time
The total IKE SA lifetime as a hard upper limit (e.g. 28800).
Rekey Time
90% of total IKE SA lifetime (e.g. 25920).
Reauth Time
0 to disable reauthentication.
If the peer requires IKEv1 or only supports IKEv2 reauthentication, set this as mentioned in Rekey Time above and also enable Make Before Break on the Advanced Settings tab.
Rand Time
Defaults to 10% of IKE SA Life Time (e.g. 2880). A larger Rand Time will decrease the chances of both peers renegotiating simultaneously.
Child SA Close Action
Restart/Reconnect so that this side will reconnect child SA entries when they expire or fail.
Phase 2
Life Time
Total Child SA lifetime (e.g. 3600 for 1 hour). This peer will attempt to rekey the Child SA before it reaches this limit.
Rekey Time
Leave blank to automatically use 90% of the Life Time, or choose a lower amount.
Rand Time
Defaults to 10% of Child SA Life Time (e.g. 360). A larger Rand Time will decrease the chances of both peers renegotiating simultaneously.
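The example values above follow a simple pattern: the rekey time is 90% of the life time and the rand time is 10% of it. A minimal Python sketch of that arithmetic follows. The function name `derive_timers` is hypothetical and is not part of any router or IPsec daemon tooling; it only reproduces the rule of thumb from the text.

```python
def derive_timers(life_time: int) -> dict:
    """Return the rekey and rand times suggested in the text for a life time.

    Hypothetical helper for illustration only: it mirrors the 90% / 10%
    rule of thumb, using whole seconds.
    """
    if life_time <= 0:
        raise ValueError("life_time must be a positive number of seconds")
    return {
        "life_time": life_time,
        "rekey_time": life_time * 9 // 10,  # 90% of the life time
        "rand_time": life_time // 10,       # 10% of the life time
    }


print(derive_timers(28800))  # IKE SA (phase 1) example from the text
print(derive_timers(3600))   # Child SA (phase 2) example from the text
```

For 28800 this yields a rekey time of 25920 and a rand time of 2880, and for 3600 it yields 3240 and 360, matching the examples in the text.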
Peer B
Phase 1
Key Exchange Version
IKEv2 if supported by both peers.
Life Time
The total IKE SA lifetime as a hard upper limit, but use a higher lifetime than Peer A by at least 10% (e.g. 31680). With this peer set higher, Peer A will primarily manage IKE SA renegotiation, reducing the chance of conflicts.
If the remote peer insists their lifetime be set to a specific value, then set Peer A lower instead by a similar margin.
Rekey Time
90% of total IKE SA Life Time.
Reauth Time
Blank (disabled) to disable reauthentication.
If the peer requires IKEv1 or only supports IKEv2 reauthentication, set this as mentioned in Rekey Time above and also enable Make Before Break on the Advanced Settings tab.
Rand Time
Defaults to 10% of IKE SA Life Time (e.g. 3168). A larger Rand Time will decrease the chances of both peers renegotiating simultaneously. If using the same Life Time as Peer A, then increase this value further. If using a larger Life Time for Peer B, then leave this at the default or disable it (0).
Responder Only
Checked so that this side will not automatically initiate IKE SA negotiation.
This peer can still manually initiate a connection from Status > IPsec, but it won’t happen automatically.
Child SA Close Action
Close connection and clear SA so that when a Child SA expires, this side will remove the SA and not attempt to renegotiate a new entry.
Phase 2
Life Time
A larger value than the Life Time set on Peer A by at least 10%. For example, if Peer A is set to 3600, set this to 5400. That way Peer A will primarily manage Child SA renegotiation.
If the remote peer insists their lifetime be set to a specific value, then set Peer A lower instead by a similar margin.
Rekey Time
Leave blank to automatically use 90% of the Life Time, or choose a lower amount.
Rand Time
Defaults to 10% of Child SA Life Time (e.g. 540). A larger Rand Time will decrease the chances of both peers renegotiating simultaneously. If using the same Life Time as Peer A, then increase this value further. If using a larger Life Time for Peer B, then leave this at the default or disable it (0).
Advanced Settings
Make Before Break
Checked if phase 1 uses IKEv2 and reauthentication, not relevant otherwise.
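To illustrate why a higher Life Time on Peer B helps, the sketch below compares the time ranges in which each peer might start a Child SA rekey. It rests on an assumption that is not stated in this article: that the daemon starts a rekey at the Rekey Time minus a random amount between 0 and the Rand Time. All function names are hypothetical.

```python
def peer_b_life_time(peer_a_life_time: int) -> int:
    """Smallest whole number of seconds at least 10% above Peer A's life time."""
    return (peer_a_life_time * 11 + 9) // 10  # integer ceiling of 1.1 * value


def rekey_window(life_time: int, rand_time=None) -> tuple:
    """Earliest and latest rekey start, in seconds after the SA is established.

    Assumption: the rekey time is 90% of the life time and a random amount
    between 0 and rand_time is subtracted from it. rand_time=None uses the
    10% default described in the text; 0 disables the randomness.
    """
    rekey_time = life_time * 9 // 10
    if rand_time is None:
        rand_time = life_time // 10
    return (rekey_time - rand_time, rekey_time)


def overlap_seconds(window_a: tuple, window_b: tuple) -> int:
    """Length of the interval in which both peers could start a rekey."""
    return max(0, min(window_a[1], window_b[1]) - max(window_a[0], window_b[0]))


print(peer_b_life_time(28800))  # 31680, the phase 1 example for Peer B

peer_a = rekey_window(3600)  # Child SA example for Peer A
print(overlap_seconds(peer_a, rekey_window(3600)))  # identical life times
print(overlap_seconds(peer_a, rekey_window(5400)))  # Peer B set to 5400
```

Under these assumptions, identical Child SA life times of 3600 leave both peers choosing from the same 360-second range, while 3600 on Peer A and 5400 on Peer B leave no overlap at all.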
Version 2.4.5-p1 and older
The settings are almost all the same as in the section above, with a couple of changes. The primary difference is in the GUI settings for rekey and reauth. Only the differences are noted below, so follow the previous section except for the values noted here.
On 2.4.5-p1, setting Responder Only in the phase 1 options requires an extra patch which can be applied using the System Patches Package: 9a69dd4b8ff6eeeaf5779b7388a10743afae8e91
Peer A
Lifetime
The total time at which this peer will renegotiate the IKE SA (e.g. 28800).
Margin Time
An amount of time, in seconds, before the Life Time is reached when renegotiation begins. Defaults to 540. Due to the default behavior of the IPsec daemon, this time can be randomly increased up to twice its value to further help avoid both sides choosing the same time. A larger value helps reduce the chance of simultaneous renegotiation.
Disable Rekey
Unchecked
Disable Reauth
Checked
Peer B
Lifetime
A higher time than the same field on Peer A by at least 10% (e.g. 32400).
Margin Time
If using the same Life Time as Peer A, use a larger value to help avoid simultaneous renegotiation. If using a larger Life Time value, then leave this blank or set to the same value as Peer A.
Disable Rekey
Unchecked
Disable Reauth
Checked
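The Margin Time behavior described above can be sketched as a simple calculation. The sketch assumes, based only on the description in this section, that renegotiation starts somewhere between one and two Margin Times before the Lifetime expires. The function name `legacy_rekey_window` is hypothetical.

```python
def legacy_rekey_window(lifetime: int, margin_time: int = 540) -> tuple:
    """Earliest and latest renegotiation start, in seconds after SA setup.

    Assumption taken from the text: the margin can be randomly increased up
    to twice its value, so renegotiation starts between 2 * margin_time and
    1 * margin_time before the lifetime expires.
    """
    if margin_time <= 0 or margin_time * 2 >= lifetime:
        raise ValueError("margin_time must be positive and well below lifetime")
    return (lifetime - 2 * margin_time, lifetime - margin_time)


print(legacy_rekey_window(28800))  # Peer A example lifetime
print(legacy_rekey_window(32400))  # Peer B example lifetime
```

With the example lifetimes and the default Margin Time, Peer A would renegotiate between 27720 and 28260 seconds, and Peer B not before 31320 seconds, so the two ranges do not overlap.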
The strongSwan daemon introduces randomness into the renegotiation process which can help mitigate the problem, but still leaves it up to chance if both peers are using the exact same lifetime values. That is why setting one peer higher, beyond the randomness threshold, is a better practice. The randomness also explains why the problem can take a while to manifest in certain environments as the duplicates do not happen until both peers happen to land on the same random rekey time.
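The effect of chance can be illustrated with a small simulation. This is a rough sketch, not a model of the daemon itself: it assumes each peer picks a renegotiation time uniformly within a range, and it treats two picks within a few seconds of each other as a collision. The ranges used are for lifetimes of 28800 and 32400 with a Margin Time of 540, assuming renegotiation starts between one and two Margin Times before expiry. The function name and the `tolerance` parameter are hypothetical.

```python
import random


def estimate_collision_rate(window_a, window_b, tolerance=5, trials=10000, seed=1):
    """Estimate how often two peers pick times within `tolerance` seconds.

    Each window is an (earliest, latest) pair in seconds. Times are drawn
    uniformly, which is a simplifying assumption rather than a description
    of the daemon's actual random source.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch repeatable
    hits = 0
    for _ in range(trials):
        time_a = rng.uniform(*window_a)
        time_b = rng.uniform(*window_b)
        if abs(time_a - time_b) <= tolerance:
            hits += 1
    return hits / trials


same = estimate_collision_rate((27720, 28260), (27720, 28260))
staggered = estimate_collision_rate((27720, 28260), (31320, 31860))
print(f"identical lifetimes: {same:.4f}")
print(f"staggered lifetimes: {staggered:.4f}")
```

With identical lifetimes the estimated rate is small but not zero, which fits the observation that duplicates can take a while to appear. With the staggered lifetimes the ranges never meet, so the simulated rate is zero.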