[dhcwg] [v6ops] Intro to draft-patterson-intarea-ipoe-health-00

Discussion:

STARK, BARBARA H

2018-10-03 00:13:28 UTC

I've gone through the document, but not in extreme detail. Here are some comments based on my quick reading.

Name: IPoE Health Check sounds more like a marketing name than a name that gives me some sense of what the capability actually does. I think it might be easier for people to get a sense of what this does if its name better evoked its function, like "IPoE Connectivity Check". The "oE" is relevant since the function does place requirements on which MAC addresses to use. If it weren't for that, I would have said it could be used for IP over anything. "Health" is too vague a term.

Abstract/Introduction: Starting with a discussion of PPPoE and BFD Echo is very confusing. I much prefer documents to start by telling me what they're about. This document seems to be primarily about defining this IP connectivity check function. Info about PPPoE and BFD Echo is really just background info. I would recommend leading with something like "This document defines a mechanism for a CE router with an IP over Ethernet WAN interface to periodically test IP connectivity between itself and the first hop router. This mechanism is intended to be used by CE routers that do not implement the BFD Echo mechanism for testing IP connectivity." I don't think PPPoE needs to be mentioned in the abstract at all. In the Intro, the PPPoE background should probably be the last paragraph instead of the first.

On a related note...
"This document describes a mechanism for IP over Ethernet clients to
achieve connectivity validation, similar to that of PPP over
Ethernet, by using BFD Echo, or an alternative health check
mechanism."
... isn't an accurate description of this document. This document defines a connectivity check mechanism that can be used by CE routers that don't support BFD Echo.

CE is "Customer Edge" and not "Customer Equipment". It's not synonymous with "CPE". "CE Router" is a "Customer Edge Router". It's called this because it's at the edge of the customer premises network, as opposed to a router in the interior of the customer premises network or a router in an access network. "CPE" is properly a word that can be applied to any service-provider-supplied "Customer Premises Equipment", including set-top boxes, VoIP ATAs, ISP-supplied CE routers, etc. A CE router is not necessarily supplied by the ISP.

IPoE Client: I don't see a need to create yet another term for a CE router with an IPoE WAN interface. I find the term "client" in this case a little odd, since there isn't really a client/server relationship in the context of either IP or Ethernet or IP over Ethernet.

"Non-default static parameters SHOULD override any signalled via a
dynamic means (e.g, DHCP or TR-69)."
TR-069, like SNMP and netconf, is not considered a dynamic means of configuring devices. It's generally considered a means of supplying "static" configuration.

The relationship with BFD Echo is very confusing. I would suggest a relationship along the lines of: If BFD Echo and <name of this mechanism> are both supported, the CE router MUST attempt to use both mechanisms to test IP connectivity upon receiving a DHCP-assigned IP address or prefix. If BFD Echo is successful, the CE router MUST cease using <name of this mechanism> and continue only with BFD Echo. If <name of this mechanism> succeeds and BFD Echo does not, the CE router MUST cease using BFD Echo and continue only with <name of this mechanism>.
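
A rough sketch of the relationship suggested above, in Python. The names `Mechanism` and `select_mechanisms` are illustrative only and do not come from the draft or from TR-124. What to do when neither mechanism has succeeded is not covered by the suggestion, so the fallback here (keep trying both) is an assumption.

```python
from enum import Enum


class Mechanism(Enum):
    BFD_ECHO = "bfd-echo"
    # Placeholder for "<name of this mechanism>"; not a name from the draft.
    CONNECTIVITY_CHECK = "connectivity-check"


def select_mechanisms(active, bfd_ok, check_ok):
    """Return the set of mechanisms to keep running after one probe round.

    active   -- set of Mechanism values currently in use
    bfd_ok   -- True if BFD Echo succeeded in this round
    check_ok -- True if the alternative check succeeded in this round
    """
    if Mechanism.BFD_ECHO in active and bfd_ok:
        # BFD Echo works: continue only with BFD Echo.
        return {Mechanism.BFD_ECHO}
    if Mechanism.CONNECTIVITY_CHECK in active and check_ok:
        # Only the alternative check works: stop using BFD Echo.
        return {Mechanism.CONNECTIVITY_CHECK}
    # Neither has succeeded yet. Assumption: keep trying whatever is active.
    return set(active)
```

Starting with both mechanisms active, a round in which both succeed leaves only BFD Echo running, which matches the preference order in the suggested text.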

I'm not sure DHCP configuration is really needed. It seems like overkill.

Recovery behavior for BFD Echo (per TR-124i5) is "Unless overridden by configuration, by default after a failure of 3 successive BFD echo intervals, the RG MUST issue a DHCP renew message following a random jitter interval between 1 and 30 seconds." I think it would be good if we had consistent behavior, independent of mechanism. Or is there a good reason to have different behaviors for different connectivity checks?
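
To make the quoted TR-124i5 default concrete, here is a minimal sketch of the recovery rule: three successive failed intervals, then a DHCP renew after a random delay of 1 to 30 seconds. The class and method names are hypothetical, and resetting the counter after a renew is triggered is my assumption, not something the quoted requirement states.

```python
import random

FAILURE_THRESHOLD = 3         # successive failed intervals (default quoted above)
JITTER_RANGE_S = (1.0, 30.0)  # random delay before the DHCP renew, in seconds


class RecoveryTracker:
    """Counts successive failed checks and decides when to renew."""

    def __init__(self, threshold=FAILURE_THRESHOLD, rng=None):
        self.threshold = threshold
        self.failures = 0
        self.rng = rng or random.Random()

    def record(self, success):
        """Record one check result.

        Returns None while no action is needed, or the jitter delay in
        seconds to wait before issuing a DHCP renew.
        """
        if success:
            self.failures = 0
            return None
        self.failures += 1
        if self.failures < self.threshold:
            return None
        # Assumption: start counting afresh once a renew has been triggered.
        self.failures = 0
        return self.rng.uniform(*JITTER_RANGE_S)
```

The same tracker could sit behind either connectivity check, which is the consistency being asked for here.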

The version of TR-124 referenced in the references section doesn't actually include BFD echo requirements. The most recent version is TR-124i5, which is available at https://www.broadband-forum.org/technical/download/TR-124_Issue-5.pdf.

"If all DHCPv6 leases have expired, either naturally or proactively
with IPoE health checks, an IPoE client acting as a router, SHOULD
withdraw itself as a default router on the LAN, following requirement
G-5 of [RFC7084], Section 4.1."
Since the referenced RFC7084 requirement is a "MUST", I would make it a MUST here, too.
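
For illustration, G-5 has the router invalidate itself by sending Router Advertisements with the Router Lifetime field set to zero. A minimal sketch of that decision follows; the function name, the timestamp representation, and the 1800-second default are assumptions for the example, not text from the draft.

```python
def router_lifetime(lease_expiries, now, default_lifetime=1800):
    """Router Lifetime (seconds) to advertise on the LAN.

    lease_expiries -- expiry timestamps of the DHCPv6 leases held on the WAN;
                      a proactively expired lease is simply dropped from the list
    now            -- current timestamp, on the same clock as lease_expiries
    """
    has_valid_lease = any(expiry > now for expiry in lease_expiries)
    # A Router Lifetime of 0 tells hosts not to use this router as a default router.
    return default_lifetime if has_valid_lease else 0
```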

Barbara

> -----Original Message-----
> From: v6ops <v6ops-***@ietf.org> On Behalf Of Richard Patterson
> Sent: Tuesday, June 05, 2018 4:03 PM
> To: int-***@ietf.org; ***@ietf.org list <***@ietf.org>; ***@ietf.org
> Subject: [v6ops] Intro to draft-patterson-intarea-ipoe-health-00
>
> Hi All,
>
> The above new draft has been posted to address an operational problem
> that exists with current IPoE (non-PPPoE) fixed line broadband services.
> It can be found here:
> https://datatracker.ietf.org/doc/draft-patterson-intarea-ipoe-health/
>
> Hopefully the draft covers the problem statement sufficiently, but in
> summary, PPPoE makes use of LCP echo/replies to detect WAN failures,
> DHCPv4/6 currently has no such mechanism.
> After backhaul migrations or BNG maintenance/failure, an IPoE client can be
> left with a stale DHCP lease for up to the Valid Lifetime.
>
> This draft proposes the use of regular ARP and ND for link monitoring, to
> proactively expire DHCP leases early, and to trigger renewals or getting a new
> lease from scratch.
>
> The intarea WG was chosen as it doesn't neatly fit within the charters of
> either v6ops or dhc, and because it proposes new DHCPv4 and DHCPv6
> Options. Although we expect discussions in both of these WGs as well.
>
> Thanks for comments already received from Ian, and Bernie, that helped
> shape this -00.
>
> -Richard
>
> _______________________________________________
> v6ops mailing list
> ***@ietf.org
> https://www.ietf.org/mailman/listinfo/v6ops

Richard Patterson

2018-10-04 10:14:52 UTC

Hi Barbara,

Thanks for your review once again. Responses in-line:

On Wed, 3 Oct 2018 at 01:13, STARK, BARBARA H <***@att.com> wrote:

> Name: IPoE Health Check sounds more like a marketing name than a name that gives me some sense of what the capability actually does. I think it might be easier for people to get a sense of what this does if its name better evoked its function, like "IPoE Connectivity Check". The "oE" is relevant since the function does place requirements on which MAC addresses to use. If it weren't for that, I would have said it could be used for IP over anything. "Health" is too vague a term.

OK, we can work on the name.

> Abstract/Introduction: Starting with a discussion of PPPoE and BFD Echo is very confusing. I much prefer documents to start by telling me what they're about. This document seems to be primarily about defining this IP connectivity check function. Info about PPPoE and BFD Echo is really just background info. I would recommend leading with something like "This document defines a mechanism for a CE router with an IP over Ethernet WAN interface to periodically test IP connectivity between itself and the first hop router. This mechanism is intended to be used by CE routers that do not implement the BFD Echo mechanism for testing IP connectivity." I don't think PPPoE needs to be mentioned in the abstract at all. In the Intro, the PPPoE background should probably be the last paragraph instead of the first.

Totally agree with regard to the abstract. Are you OK with the
existing introduction section referencing PPPoE and BFD Echo to
provide the background, if we reword the abstract to focus on this
document?

>
> On a related note...
> "This document describes a mechanism for IP over Ethernet clients to
> achieve connectivity validation, similar to that of PPP over
> Ethernet, by using BFD Echo, or an alternative health check
> mechanism."
> ... isn't an accurate description of this document. This document defines a connectivity check mechanism that can be used by CE routers that don't support BFD Echo.
>
> CE is "Customer Edge" and not "Customer Equipment". It's not synonymous with "CPE". "CE Router" is a "Customer Edge Router". It's called this because it's at the edge of the customer premises network, as opposed to a router in the interior of the customer premises network or a router in an access network. "CPE" is properly a word that can be applied to any service-provider-supplied "Customer Premises Equipment", including set-top boxes, VoIP ATAs, ISP-supplied CE routers, etc. A CE router is not necessarily supplied by the ISP.

OK, we can standardise on using "CE router".

>
> IPoE Client: I don't see a need to create yet another term for a CE router with an IPoE WAN interface. I find the term "client" in this case a little odd, since there isn't really a client/server relationship in the context of either IP or Ethernet or IP over Ethernet.

Understood in the literal sense, but in this context "IPoE" is generally
synonymous with DHCP, in which there is a client/server relationship.
We can, however, avoid this term if it's deemed confusing.

>
> "Non-default static parameters SHOULD override any signalled via a
> dynamic means (e.g, DHCP or TR-69)."
> TR-069, like SNMP and netconf, is not considered a dynamic means of configuring devices. It's generally considered a means of supplying "static" configuration.
>
> The relationship with BFD Echo is very confusing. I would suggest a relationship along the lines of: If BFD Echo and <name of this mechanism> are both supported, the CE router MUST attempt to use both mechanisms to test IP connectivity upon receiving a DHCP-assigned IP address or prefix. If BFD Echo is successful, the CE router MUST cease using <name of this mechanism> and continue only with BFD Echo. If <name of this mechanism> succeeds and BFD Echo does not, the CE router MUST cease using BFD Echo and continue only with <name of this mechanism>.

OK, will try to clarify this.

>
> I'm not sure DHCP configuration is really needed. It seems like overkill.

Other reviewers/commenters previously felt that this was a useful
addition. This would allow Network Operators to trigger or influence
the connectivity check on CE routers that they do not control via
TR-069 or similar means.

>
> Recovery behavior for BFD Echo (per TR-124i5) is "Unless overridden by configuration, by default after a failure of 3 successive BFD echo intervals, the RG MUST issue a DHCP renew message following a random jitter interval between 1 and 30 seconds." I think it would be good if we had consistent behavior, independent of mechanism. Or is there a good reason to have different behaviors for different connectivity checks?

Our current defaults were simply chosen based on an existing
implementation of similar functionality. I agree that consistent
defaults would be good; I'll review the default values in TR-124i5
and adjust accordingly.

>
> The version of TR-124 referenced in the references section doesn't actually include BFD echo requirements. The most recent version is TR-124i5, which is available at https://www.broadband-forum.org/technical/download/TR-124_Issue-5.pdf.

Will update the reference.

>
> "If all DHCPv6 leases have expired, either naturally or proactively
> with IPoE health checks, an IPoE client acting as a router, SHOULD
> withdraw itself as a default router on the LAN, following requirement
> G-5 of [RFC7084], Section 4.1."
> Since the referenced RFC7084 requirement is a "MUST", I would make it a MUST here, too.

Good catch, will update.

Thanks again for taking time to review the document.

-Richard

Ole Troan

2018-10-04 10:33:42 UTC

>> I'm not sure DHCP configuration is really needed. It seems like overkill.
>
> Other reviewers/commenters previously felt that this was a useful
> addition. This would allow Network Operators to trigger or influence
> the connectivity check on CE routers that they do not control via
> TR-069 or similar means.

Requiring the DHCP option makes a big difference in deployability.
Without it, I can implement this feature today. With it, I have to wait until the DHCP option is standardised. And we would presumably get into a big debate about what CE routers should do in the cases where they don’t get a DHCP option. Should they try this mechanism anyway?

We have not required explicit configuration in the other cases where we have recommended this mechanism. Ref. RFC5969.

While the echo mechanism requires some special provisioning on the local system (ensuring that ingress filtering isn’t blocking packets with yourself as source), I am not aware of anything on the PE that would block this. If there is consensus on that, I think it’s perfectly fine to require this mechanism on by default in CE routers.
Although we might add some specifics to deal with a case where DHCP was successful, state in PE was correct, but health check still failed.

Cheers,
Ole

Richard Patterson

2018-10-04 10:48:11 UTC

On Thu, 4 Oct 2018 at 11:33, Ole Troan <***@employees.org> wrote:
> Requiring the DHCP option makes a big difference in deployability.
> Without it, I can implement this feature today. With it, I have to wait until the DHCP option is standardised. And we would presumably get into a big debate about what CE routers should do in the cases where they don’t get a DHCP option. Should they try this mechanism anyway?
>
> We have not required explicit configuration in the other cases where we have recommended this mechanism. Ref. RFC5969.
>
> While the echo mechanism requires some special provisioning on the local system (ensuring that ingress filtering isn’t blocking packets with yourself as source), I am not aware of anything on the PE that would block this. If there is consensus on that, I think it’s perfectly fine to require this mechanism on by default in CE routers.
> Although we might add some specifics to deal with a case where DHCP was successful, state in PE was correct, but health check still failed.

Perfectly valid reasoning. Personally I'm not too hung up on
requiring the DHCP option, but thought it was useful. If we think it's
going to be a large barrier to implementation, I'm happy to remove it
and then emphasise the warmup period concerns within the Startup
section.

As a side-note, is it really that challenging or slow to get a new
DHCP option assigned? Perhaps I'm showing my naivety here.

-Rich

Bernie Volz (volz)

2018-10-04 12:07:57 UTC

Your draft, if advanced, will get you the options defined.

You should update your draft to include the new items that are being added to the DHCPv6 options table maintained by IANA: whether the option needs to be in the ORO and whether it is a singleton option (I believe yes to both is applicable).
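
For readers less familiar with those two table columns, a small sketch may help. A DHCPv6 option is a 16-bit code, a 16-bit length, and then the data. The Option Request Option (ORO, code 6) carries the list of option codes the client asks the server for. A singleton option may appear at most once in a message. The option code 65000 below is a placeholder for illustration only; it is not an assigned or proposed value, and the helper names are hypothetical.

```python
import struct

OPTION_ORO = 6               # Option Request Option code
OPTION_HYPOTHETICAL = 65000  # placeholder code, NOT an assigned value


def encode_option(code, data):
    """Encode a DHCPv6 option: 16-bit code, 16-bit length, then the data."""
    return struct.pack("!HH", code, len(data)) + data


def encode_oro(requested_codes):
    """Encode an ORO listing the option codes the client wants from the server."""
    payload = b"".join(struct.pack("!H", c) for c in requested_codes)
    return encode_option(OPTION_ORO, payload)


def is_singleton_ok(option_codes, code):
    """True if `code` appears at most once among the option codes of a message."""
    return option_codes.count(code) <= 1
```

"Needs to be in ORO" then means a client would include the new option's code in the list passed to `encode_oro` when it wants the server to return that option.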

- Bernie

Ole Troan

2018-10-04 13:33:16 UTC

Richard,

> As a side-note, is it really that challenging or slow to get a new
> DHCP option assigned? Perhaps I'm showing my naivety here.

Getting the DHCP option itself isn’t hard. But then you need to get it deployed in DHCP servers, you need to get it deployed in DHCP clients.
And you need to add linkage between the health checker and DHCP.

Harder from the implementer's perspective and harder from the deployer's perspective.

Does the extra work add value? It is certainly going to add a significant amount of time.

Cheers,
Ole

Richard Patterson

2018-10-04 15:29:01 UTC

Thanks Ole, understood.