IPAE problems
Eric Fleischman <ericf@atc.boeing.com> Fri, 11 June 1993 02:46 UTC
Date: Thu, 10 Jun 1993 15:21:30 -0700
Sender: ietf-archive-request@IETF.CNRI.Reston.VA.US
From: Eric Fleischman <ericf@atc.boeing.com>
Message-Id: <9306102221.AA29171@atc.boeing.com>
To: ip-encaps@sunroof.eng.sun.com, sip@caldera.usc.edu
Subject: IPAE problems
Dear IPAE/SIP working group,

Several weeks ago I distributed a list of what I perceived to be the PROs and CONs of the various IPng approaches for feedback. I received a "fair bit" of feedback, which led me to modify my lists. One of the comments was a lengthy list of perceived IPAE flaws, given in response to my positive PRO IPAE statement. This response disturbed me since I am not able to evaluate the significance of these low-level criticisms -- and I want IPAE to work as advertised.

Because the author of the comments requested anonymity, I corresponded with Dave Crocker asking for advice. He suggested that I massage the criticisms (to protect the author's anonymity) and then send them to the SIP working group for evaluation and consideration. The purpose of this note is to do exactly that, in the hope that any problems may be expeditiously identified and resolved while the protocol is still new and pliable.

I wish you every success in resolving any existing technical problems.

Sincerely yours,

--Eric Fleischman

P.S. Please do me the favor of not trying to guess who the anonymous criticizer may be. Should you correctly guess his/her identity, you will embarrass me by pointing out my poor "massaging" ability. Should you incorrectly guess the person's identity, you may anger the misidentified person.

================ Anonymous IPAE Criticisms follow ================

The IPAE transition is *very* similar to the original transition scheme that was proposed as a predecessor to TUBA. However, that earlier scheme was eventually abandoned because of terminal complexity. Unfortunately, I don't know all of the problems that were found at the time. Some of the ones that I know of which apply to IPAE are listed below:

- How do IPAE hosts whose local routers are IP-only find the IPAE routers?
I think that this can be handled by having the IPAE routers announce an IP route to some fictitious IP network number, and having the IPAE hosts send a packet to that particular IP address. This will allow IPAE hosts to send packets to the nearest IPAE router.

- How do IPAE routers find other IPAE routers?
This will probably require a lot of manual configuration. I haven't seen this written down.

- How to deal with the "enormous branchiness" problem?
For a large IP network, there could be a very large number of IPAE routers which are all direct neighbors (in the IPAE sense). This causes a difficult routing problem. This might be handled by manually configuring only a subset of the possible encapsulated links, but this leads to extra hops.

- How to deal with the "lost ICMP" problem?
ICMP error reports sent via IP-only routers do not contain enough of the discarded IP packet to include all of the SIP header. This means that the error report cannot be translated into a SIP error report to be sent to the source. This in turn implies that ICMP error reports are lost. I have heard two possible solutions to this:

(i) update all IP routers to send more data in error reports (which defeats part of the main purpose of encapsulation);
(ii) have IPAE routers maintain a cache of recently received IP-ICMP error reports, and return a SIP-ICMP error report to the source of any packet which would have the same IP appearance over the next hop.

Naturally this latter solution is not real pleasant for a vendor which prides itself on its IP performance (since every SIP over IP packet would need to be looked up in the cache, and you might need to return error reports for packets which might not have been lost).

- How to make SIP MTU discovery work
One small nit is that SIP MTU discovery has to interwork with IP MTU discovery (complex but do-able). However, if ICMP error reports are lost, then MTU discovery doesn't work, and SIP relies on MTU discovery.
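
The "lost ICMP" point above can be made concrete with a little arithmetic. RFC 792 requires an ICMP error to quote only the offending datagram's IP header plus the first 64 bits (8 bytes) of its data. The sketch below is illustrative only: it assumes the SIP header follows the outer IP header directly, and it assumes a 24-byte SIP header whose 64-bit source address starts at byte offset 8. Those SIP sizes and all of the names are assumptions made for this example; they are not stated in this message.

```python
# Byte counts. Only ICMP_QUOTED_DATA comes from RFC 792; the SIP values
# are assumptions made for illustration.
ICMP_QUOTED_DATA = 8     # RFC 792: IP header + 64 bits of the original data
SIP_HEADER = 24          # assumed SIP header length
SIP_SOURCE_OFFSET = 8    # assumed offset of the SIP source address
SIP_ADDRESS_LEN = 8      # 64-bit SIP address


def sip_bytes_in_icmp_error(quoted_data=ICMP_QUOTED_DATA, sip_header=SIP_HEADER):
    """Bytes of the encapsulated SIP header quoted in an ICMP error report."""
    return min(quoted_data, sip_header)


def source_address_recoverable(quoted_data=ICMP_QUOTED_DATA,
                               source_offset=SIP_SOURCE_OFFSET,
                               address_len=SIP_ADDRESS_LEN):
    """True if the quoted data reaches the end of the SIP source address."""
    return quoted_data >= source_offset + address_len


if __name__ == "__main__":
    print(sip_bytes_in_icmp_error(), "of", SIP_HEADER, "SIP header bytes quoted")
    print("source address recoverable:", source_address_recoverable())
```

Under these assumptions an IP-only router quotes 8 of the 24 SIP header bytes and none of the SIP source address, so an IPAE router receiving the report cannot tell which SIP source to notify.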
- When translating from IP to SIP, how do you set the high order part of the address -- especially the C-bit?
The November 11th IPAE draft proposes using either a large table or a DNS lookup. However, if some hosts are old IP-only hosts, and some are updated SIP/IPAE hosts, then the first bit in the address (the C bit) will differ for adjacent hosts on the same subnet, implying that the mapping table has to be maintained on a per-host basis. This implies that the table is potentially HUGE, and cannot be maintained in any one place (i.e., it *must* be maintained in a distributed fashion via DNS).

If you use DNS lookup for this, then what do you do with the IP packets in the meantime? The choices would seem to be to discard them or cache them. However, several router vendors (no names mentioned) have been beaten up emphatically in the past for discarding IP packets while waiting for ARP replies, which implies that discarding is not acceptable to customers. Thus we need to cache them. However, as we are currently shipping routers which can handle more than 400,000 packets per second, and are working on higher speed routers, and given that DNS lookups tend to be slow, this would seem to imply a very large cache and/or very optimistic assumptions about how much of the traffic will be repeat (and how often the cache will need to be flushed due to routing changes).

Also, given that a complete flushing of the cache with this proposal would be a disaster (routers with 1,000,000 packet per second performance could drop hundreds of packets per second while trying to re-build their cache, with DNS straining mightily to try to keep up), it will be necessary to put in partial cache flushing when routes change, which will be complex to make work.

- When translating from SIP to IP, how to set the DU ID
SIP does not have a data unit ID field. Thus, when translating from SIP to IP, the router will have to pick a value.
This means that two packets which happen to take different paths may be translated at different IPAE routers, and could get the same data unit ID.

- Practicality of IP --> SIP/IPAE --> IP translation.
The IPAE and SIP documents both make a big deal about the fact that translation can be used to improve the routing problem in Internet backbones *before* any hosts are updated. However, as mentioned above, this runs into a problem with the C bit implying that mapping tables need to be on a per-host basis (i.e., enormous). Even if there weren't a C bit problem, the mapping tables would still probably need an entry for each customer of a regional. This implies that the mapping tables will be substantially larger than the associated routing tables (the ones which are allegedly too large to hold). This in turn will almost certainly require the tables to be held in DNS, which implies packet caching and rather slow lookups in the fast path in our fastest routers (the routers which go into backbones).

The alleged advantage of this approach is that the mapping tables will be very static (change only with administrative changes, not as a result of routing dynamics). Thus it is feasible to have a huge static table plus a small dynamic routing table, even when it is not feasible to have a huge dynamic routing table.

However, there is a better approach which can be done with straight IP packet forwarding. This approach has been discussed in the TUBA working group. Here the very static table maps from IP address to regional (and is thus some number of orders of magnitude smaller than the static table required for the SIP/IPAE proposed feature -- how many orders of magnitude smaller depends upon how the C bit problem is handled, and thus whether the IPAE table needs one entry per customer of a regional, or one entry per host). Dynamic routing is then done to the regional.
Given that this table is much smaller than the corresponding table in the IPAE plan, it can be maintained in the router, which does away with the need for a DNS lookup. Thus this alleged advantage of IPAE is really an order of magnitude worse than with the other approaches (i.e., worse than can be done with straight IP).

There is another minor problem with the IPAE transition scheme: it requires routers to reassemble (when we were doing the IP<-->CLNP transition with encapsulation a while ago we thought of this problem, and came up with a way to get around it -- however, this would not be feasible with SIP/IPAE, and in any case required routers to deal with options efficiently).

--------- the following is an "aside" in the message ----

Regarding the use of static tables to map from an IP address to a topologically significant address (which may be a SIP address, used with IP --> SIP translation; or may be a provider identifier, used with straight IP forwarding in order to extend the life of IP) please consider the following:

With the IPAE/SIP approach, if you ignore the C bit problem, then there is one entry in the static table for every customer of each regional. If CIDR is not religiously followed, then there will also need to be an entry for every IP network number that does not conform to CIDR. Thus if you ignore the C bit problem then the static table is exactly the same size as the Internet routing table would need to be (but is, of course, more static). However, due to the C-bit problem (unless something major gets changed in the proposal, which is still possible), there will need to be one entry in the table for each host.

However, for the straight IP solutions the table will be the exact same size as the IPAE table would be if there were no C bit problem. (Note: I was mistaken when I previously said that it would be better than this.)
Thus, the pure IP solution is still preferable to the IPAE solution for three reasons:

(i) It eliminates the C bit problem, thus the static table is two or three orders of magnitude smaller in the pure IP solution than in the IPAE solution;
(ii) It eliminates the need for packet translation (not something that we really want to do with every packet at OC3 rates!);
(iii) It maintains the independence of short term steps taken to extend the life of IP from long term steps taken to deploy a new protocol.

------- end of the aside -----

- Training costs for personnel
Given the above problems, any savings of training costs for personnel by an IPAE/SIP solution is highly doubtful. I think that it is probably possible to find some solutions (however complex) to each of the aforementioned IPAE problems, and I expect that any additional problems caused by these additional solutions will themselves be fixable (by adding even more complexity?). Thus my guess is that IPAE *can* be made to work with sufficient effort. However, getting all of this to actually work will be hard on network management personnel, which equates to a considerable expense.

- Dependence between old and new protocols
I understand that one thing which really hurt in the DECnet phase 4 to phase 5 transition is the very close coupling of the old and new protocol suite. Thus, for example, DECnet phase 4 hidden areas work fine as a way to extend phase 4; phase 4 to phase 5 packet translation works fine as a transition technique; the phase 5 assumption that addresses are globally unique (implicit in the name to address lookups) is also reasonable. However, these three facts in combination mean that phase 4 to phase 5 transition is very difficult for customers which are already using hidden areas.

IPAE similarly couples the old and new network layer protocols very closely (even more closely than the DECnet transition).
This is likely to constrain what we can do to try to keep IP going as long as possible. However, whether this turns out to be a problem is unclear until we see what folks do to try to keep IP going.

- Basic Motivation behind SIP
The basic motivation behind SIP is to make a protocol which is so simple that it is possible to build very high speed routers. However, the router vendors are not involved in any significant way with SIP development. I have not heard anyone from any router vendor suggest that SIP is really the right way to build high speed routers. Most of the work in forwarding at very high speed is outside of the network layer protocol in any case, and alternative means exist which will allow forwarding of protocols with more flexible addresses at the same high speeds.

- SIP Address
All of this of course ignores the problems of the SIP address space. Clearly the whole point of transitioning to a new network layer protocol is to come up with a network addressing scheme which works. It is the height of folly to use incorrect assumptions (about the need for small addresses to allow high speed forwarding) to force you to an address space which is not well thought out, is not flexible, and is of dubious long-term sufficiency.

In 1980 a paper was published as an NBS proposal which proposed a 64-bit address space for an International version of IP (this sort of became input to what later became CLNP). Comments were made that 64 bit addresses were too large (after all, 32 bit addresses are clearly large enough since IP uses them and they can support routing to 2**32 hosts, which is much more than could ever exist in the world), and comments were also made that 64 bits were not large enough. Thus, (to quote Yogi Berra) the current IP address space discussions are a case of "Deja vu all over again".
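
As a rough illustration of the packet-caching concern raised earlier under the C-bit item: the number of packets a router must hold while DNS lookups are outstanding is approximately the arrival rate times the lookup latency times the fraction of packets that miss the local mapping cache. The 400,000 packets per second figure is taken from the message; the 100 ms lookup latency and the miss percentages are illustrative assumptions only, and the function name is hypothetical.

```python
def packets_held(packets_per_second, dns_latency_ms, miss_percent):
    """Estimate packets buffered while DNS mapping lookups are in flight.

    All arguments are integers so the arithmetic stays exact:
      packets_per_second -- arrival rate at the translating router
      dns_latency_ms     -- assumed DNS round-trip time in milliseconds
      miss_percent       -- assumed share of packets with no cached mapping
    """
    return packets_per_second * dns_latency_ms * miss_percent // (1000 * 100)


if __name__ == "__main__":
    # Cold cache (e.g. just after a full flush): every packet needs a lookup.
    print(packets_held(400_000, 100, 100))
    # Warm cache: assume only 10 percent of packets miss.
    print(packets_held(400_000, 100, 10))
```

With these assumed inputs the estimate is 40,000 buffered packets after a full cache flush and 4,000 at a 10 percent miss rate, which gives a sense of why the message argues for partial rather than complete flushing.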