DNS Working Group
26 May, 2016,
At 2 p.m.:
DAVE KNIGHT: If everyone can take their seats. Hi, I am Dave Knight, welcome to the RIPE 72 session of the DNS Working Group. We have two sessions today, we have got lots of content and we are very grateful for that. Thank you to those who brought it. First, some agenda bashing: in the agenda we have a section for follow-up content from the plenary. It mentions specific presentations there, but if there are things that have come up in other groups that people want to bring up here, please feel welcome to do so. In the second session there is a proposal on a DNS privacy resolver. There is also going to be a brief proposal about the Yeti DNS project, and the RIPE NCC would like to take a moment to respond to these. On mic discipline, could you please state your name and affiliation; the session is being webcast. There are no outstanding actions from previous sessions and no other changes to the agenda. We have two sets of minutes to approve: we have recently published the RIPE 70 minutes and, a little while before that, the RIPE 71 minutes. Are there any comments on either of those? No. OK, I think we can consider those approved, thank you. And then, moving directly on, I would like to welcome Anand Buddhdev from the RIPE NCC to give us his update.
ANAND BUDDHDEV: Thanks, Dave. Good afternoon, Copenhagen. I am Anand from the RIPE NCC, and I am here this afternoon to give you a little update on the stuff that we have been doing, DNS-wise, at the RIPE NCC over the last several months. So let me dive straight in and talk first about K-root. This has been a big focus of our activity in the last few months, and I am happy to report that we have gone from our original 17 locations up to 39 locations around the world, and these are all active. We still have our five core sites in Amsterdam, London, Frankfurt, Miami and Tokyo, and these are our multi-server, high-capacity sites. The 34 other sites that we have are what we call our hosted sites; these are run with the support of local hosts in various locations, who provide the hardware while we run the service.
All these servers are now running the same operating system managed with the same configuration management system, and have more or less the same kind of hardware configuration, which makes it easy for us to support a much larger number of nodes.
Just before this RIPE meeting, we also refreshed the K-root website; it has now been integrated into the main ripe.net website, so please feel free to visit it. On the website we have a map showing all the new locations we are present at, we are showing DSC statistics so you can go and look at the query rates and RCODEs and things like that, and we also have a section of the website where potential K-root hosts can apply to become a host. The application process is all done through a web form, which makes it easier for us to interact with lots of people.
One interesting thing is that as part of the whole update and refresh of our K‑root sites, we were able to return seven /24 prefixes, IPv4 prefixes, we had a few prefixes from the APNIC region, one from the ARIN region and we were using these for management, but in our new model we don't need so many addresses so we have returned all this address space to the various registries.
(Applause)
One important change that we made when doing the refresh of K-root was that we no longer have the concept of a local K-root instance. In the past, we had nodes where we would announce the K-root prefixes and attach the no-export community to them, which means that a peer was not allowed to propagate this prefix to their transits or to any of their customers. We found that this was an artificial limit; in some cases it could cause black holes, and it was not making efficient use of the K-root server that we had in a location. We felt that the prefix could be announced to more peers and bring the K-root server closer to more users, so we now let the host of a K-root node decide how far they wish to propagate the prefix. If they have the network capacity, then they are free to announce the prefixes to their upstreams; if they wish to limit the reach of the K-root node, then they can limit the announcements to just their customers or their nearby peers, for example. Basically, we want to just let BGP do its job as much as possible. So this is an important change that we have made.
So I would like to show you some little graphs here to show the effects of this routing policy change. Here, we have a RIPE Atlas map showing RIPE Atlas anchors and probes reaching the K-root server in Athens. This is hosted by GRNET at the GR-IX Internet Exchange, and this is an example of a K-root node where the announcement is restricted to just the peers, and you can see that its footprint is fairly local; it's restricted mostly to Greece and nearby countries. Other examples of this kind of routing policy are Johannesburg and Montevideo. On this map, on the left-hand side, you have our K-root server in Montevideo, hosted by LACNIC, and as you can see, it's serving its local community rather well, in Uruguay and Argentina. On the right-hand side, we have a K-root server in Johannesburg, hosted by the Johannesburg Internet Exchange, and again it has a local footprint and is reaching lots and lots of customers in that region. We have another K-root node, hosted by 1&1, in Kansas City in the USA; this is a network operator that has a much bigger network and is announcing the prefix much further, so here you can see that they are reaching a wide variety of clients all over North America and as far down as Latin America as well, so this has a slightly bigger footprint. And then, finally, we have a host, Selectel, in St. Petersburg; they have a very well-connected network and this is their footprint. This is reaching clients almost all over the world, except for Latin America, so this is an example where the host has a fully open peering policy and will announce the prefixes to all their transits and will reach lots of clients all over the world.
And here we have an example, again showing data from RIPE Atlas, of a region that is actually rather poorly served. This is our neighbour, Belgium, and it seems that almost all the clients in Belgium are served by K-root instances in Miami, that is the green dots, and all the yellow ones are served by the K-root server in Iran, in Tehran, so this is really, really strange. So if there is anyone here from a Belgian ISP, please come and talk to us about hosting a K-root instance or peering with us at some of our other locations, because we don't think this is very good for customers in Belgium.
JAAP AKKERHUIS: It's not just Belgium, Amsterdam as well.
ANAND BUDDHDEV: It is true anywhere; in any given country, which instances clients reach depends on what their network operators are doing peering-wise, but this is a slightly extreme case.
Moving along: the RIPE NCC has provided secondary DNS services for various ccTLDs for a long time now, many, many years, but we have not had a proper policy about this, and so we have been supporting lots of these without any clear guidance on how to provide this service. At the moment we provide service to 77 ccTLDs around the world; many of these are developing and we are quite happy to support them, but we asked the RIPE community for guidance on how to keep providing this service, or perhaps how we would stop providing service to a ccTLD if it was no longer necessary. So last year in December a document, RIPE-663, was published, and this document has guidelines on how the RIPE NCC should provide secondary DNS service to ccTLDs and what criteria we can use to determine whether we should provide the service or not. Since the beginning of this month we have begun evaluating all the ccTLDs that we provide service to and are starting to apply the criteria from RIPE-663. This means that some of the ccTLDs that we provide service to will not qualify, and so we will start to contact them after this RIPE meeting and ask them to gracefully move the service away from our network. I would like to note that one ccTLD was really good to us: they saw the publication of RIPE-663, noticed they no longer qualified, and pre-emptively moved the service away. If you are a ccTLD operator getting service from the RIPE NCC and you know about this document, then any legwork that you do in advance would really help us.
The RIPE NCC also runs another DNS cluster, the authoritative DNS cluster. This is the cluster where we provide secondary DNS service, as well as running our main zone, ripe.net, and all the Reverse-DNS zones for all the IP address space that we allocate. One important change we have made there recently, for those of you who are eagle-eyed and have been keeping an eye on the name servers: you will notice that we have two interesting name servers, called c1.authdns and c2.authdns. We have done this because we are concerned about DDoS attacks against our DNS infrastructure. DDoS attacks have become more and more frequent of late, and if our authoritative DNS cluster is attacked, even if the zone or the domain that is being attacked is not ripe.net, our main zone ripe.net would suffer as a result of this DDoS and perhaps make services of the RIPE NCC less available, because all the various names for Whois and FTP and the website are all served out of this cluster. So we have decided to increase the resiliency of ripe.net by taking partial secondary DNS service from CloudFlare; these are the two name servers that are part of this name server set, and CloudFlare has many POPs around the world and lots of capacity to absorb traffic. So if there is a large DDoS against the name servers of ripe.net, then there is a good chance that at least these two name servers will keep answering even if the other, lower-capacity ones are not able to answer. That is a short-term arrangement that we have with CloudFlare at the moment, and we have committed to doing a full request for proposals where we will be inviting companies to present proposals to us for secondary DNS service for ripe.net.
One other service that I would like to focus on, which also runs on our authoritative DNS cluster, is secondary DNS for LIRs. Some of our LIRs have /16-sized IPv4 allocations, and some have /32-sized IPv6 allocations, and if they do, they are able to request automatic secondary DNS for these large Reverse-DNS zones by using a special arrangement. When they create their domain object in the RIPE Database, they can include an nserver attribute with the value NS.ripe.net, and if they do this we automatically slave their Reverse-DNS zone and provide service to it from our server NS.ripe.net, which is anycasted and has lots of capacity. This has been provided by the RIPE NCC for a long time, because in the past we wanted to have stable Reverse-DNS service, especially for large zones.
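As an illustration, an abridged domain object using this arrangement might look like the following; 16.193.in-addr.arpa and ns1.example.net are made-up names, and the mandatory contact and maintainer attributes are omitted here:

    domain:   16.193.in-addr.arpa
    nserver:  ns.ripe.net
    nserver:  ns1.example.net
    source:   RIPE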
As I explained, this service is only available if your Reverse-DNS zone is of size /16 for v4 or /32 for v6. One of the things we cannot do is pre-delegation checks when accepting this Reverse-DNS delegation, because the zone will not have been provisioned on our server yet at this point; so when doing pre-delegation checks we have to skip the checks for NS.ripe.net and accept the delegation into our system. After the domain object is created, the provisioning system at the RIPE NCC picks this up and goes and configures our hidden masters and NS.ripe.net. Unfortunately this doesn't always work, because sometimes we cannot transfer the zone: perhaps the operator has not allowed us to do the transfers, they have ACLs in place that block zone transfers, perhaps their hidden master is not open to the world, and things like that.
So I would like to come to some numbers about this service. If you look at this DSC graph here, you can see that this particular server is receiving up to 14,000 queries per second; during peak times this can go up to 20,000 to 25,000 queries per second, so that is a lot of queries. But if I switch to the RCODE graph for the same service, there is a rather depressing picture there: more than half the responses are ServFails. That is because a lot of the zones that are provisioned on NS.ripe.net have expired; we can no longer transfer them from the operators and they are sitting there ServFailing all the time. At the moment, or when I did the numbers last week, we had just over 4,000 such zones configured in our system. Of these, nearly half are expired. The query rate, as I mentioned, is up to 25,000 queries per second at peak time, and of the responses 38% are ServFail.
So this introduces some problems for us. Trying to refresh all these zones perpetually is a burden on our hidden masters as well as on the publication servers, and we have nine of them; all of these are trying to refresh all these zones all the time, and some of these are rather aggressive about refreshing, so they can introduce a heavy burden on the servers. There is a slight silver lining: we have been running BIND, Knot and NSD, and when we exposed these name servers to all these misconfigured zones it helped expose all kinds of timing and refresh problems and all sorts of issues in this software, so, you know, I see that as the silver lining; we helped fix the software in a way. But it's a problem nevertheless: our servers have to carry the burden of refreshing all these zones.
And a lot of the time users don't seem to understand that our NS.ripe.net is anycast, so we won't be doing transfers from the anycast IP address but from hidden masters; they don't understand this set-up and structure and we have to explain it to them and get them to open ACLs and all sorts of things. So we have been thinking about the future of this service, and we have two main options here. One is that the RIPE NCC could simply retire this service. We could make an announcement and then give people a grace period in which to stop using the service and time to migrate away. The reason for this is that these days people have very good DNS infrastructure; it's not like the Internet of the '90s, so we think that users should be able to do without the RIPE NCC secondary name service on NS.ripe.net. The other option is, if our community feels that we should continue to provide this service, then we as the RIPE NCC feel that the provisioning method of using domain objects and doing this configuration after the fact is not the right way of doing it, because we can't give feedback to the users on how to configure things. We think that we should integrate the provisioning of this into the RIPE NCC LIR Portal; this way we can provide feedback to the users, help them configure this service properly and then go and provision it, and then there is a higher chance of it working properly. So, if you have any feedback about this, we'd like to hear it.
Related to provisioning: at the RIPE NCC we do pre-delegation checks, which means that we check all the name servers and see that the reverse zone is configured properly on them, that the name server is answering over UDP and TCP and is properly set up; this is to ensure that when we do the delegation, the reverse zone will resolve correctly. We have been using software called DNSCheck that was developed at the Swedish registry, but this has been abandoned, and there is a successor, which is called Zonemaster. Zonemaster has been written from scratch, it has been properly designed with a lot of new tests in it, and the tests themselves have test cases to ensure that they work properly. It has a more modular architecture, which means it is easier to scale it horizontally, and there is a well-defined API for communicating with Zonemaster and getting results out of it. Later this year we are planning to switch to Zonemaster, after we have done our testing and that kind of stuff.
And then I would like to talk about DNSSEC. The RIPE NCC's forward zone ripe.net and all our reverse zones are DNSSEC signed. From the beginning, we were signing them with the SHA-1 signature algorithm, because back then that was the only one available to us, and last year we decided we should upgrade to SHA-2 (SHA-256). So we contacted our vendor, Secure64, and explained that we required them to provide software updates to make this possible. They did that. We did all the testing, and then at RIPE 71 we told you that we were going to do this algorithm rollover in November, just after the RIPE meeting, and I'm very happy to report that we performed this algorithm rollover successfully; we had no validation failures. During the algorithm rollover we discovered a few interesting issues, such as older versions of Unbound not being able to validate; we reported them to NLnet Labs, so there has been a lot more publicity about this and people are more aware of how to do algorithm rollovers. We wrote a RIPE Labs article and the article is still there; if you haven't seen it, please read it. If you have questions or comments or concerns about how to do an algorithm rollover for your own zones, we are happy to talk with you about this. That is it; I will take questions and comments now.
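For reference, one quick way to see the result of an algorithm rollover like this from the outside is to list the algorithm numbers in the zone's DNSKEY RRset. A minimal sketch using the dnspython library (version 2.x is assumed; any DNS query tool would do equally well):

    import dns.resolver

    # List the key flags and algorithms in the ripe.net DNSKEY RRset.
    # Algorithm 8 is RSA/SHA-256; 5 and 7 are the older SHA-1 based algorithms.
    answer = dns.resolver.resolve('ripe.net', 'DNSKEY')
    for key in answer:
        print(key.flags, key.algorithm)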
SHANE KERR: A quick response, since you are asking for feedback on the NS.ripe.net service: since it's only for reverse DNS, and people need forward DNS anyway, I don't see any real loss of value in turning down the service, and that would be my recommendation. Another thing: I looked at RIPE-663 really quickly, the ccTLD guidelines, and I notice it doesn't say anything about IPv6, so I may do something on the list to see if we can get that added as a reason that a ccTLD could get service from the RIPE NCC, because I think that is an important missing piece: for IPv6-only resolution there are a few dozen TLDs which still don't have IPv6 service and you guys do, so maybe we could add that.
ANAND BUDDHDEV: Thank you for that. Sure.
JIM REID: Speaking for myself. Anand, I remember some time ago I called out some crufty domains that were lying around the NCC's DNS infrastructure, things like ripe.int and ripe‑n.cc. Did they actually get removed?
ANAND BUDDHDEV: Oh, yes, yes, we deleted all those domains and sent an email about this to the DNS Working Group mailing list last year, so that has all been cleaned up.
AUDIENCE SPEAKER: Gaurab, from Limelight Networks. Regarding RIPE-663: a lot of the ccTLDs I have worked with for a long time, and I have worked quite a bit with them — I think one of the things to look at is that a lot of the smaller ccTLDs might have more than three servers, but in the same network, right next to each other. Under the document they would probably not qualify, but I think that should be considered as well; three servers is a very hard limit, and you might want to look at it more subjectively and say, hey, they may have three servers, but if I take the RIPE service out they might not have three secondaries in diverse locations. So please keep that in mind. Thanks.
ANAND BUDDHDEV: Thanks.
AUDIENCE SPEAKER: Blake Wisdale. Come and talk to me about Belgium later.
ANAND BUDDHDEV: I will come and find you.
DAVE KNIGHT: I assume you are looking for guidance from the Working Group regarding the NS.ripe.net issue. Do you have a timescale in mind for that?
ANAND BUDDHDEV: There is no rush as such. It's a problem that has been going on for quite a while; we can keep tolerating all the failures and refresh issues and such, but we really would like to improve this service for everyone in general. So we are looking at doing something about it starting from later this year into the first half of next year; that is roughly our timeline.
DAVE KNIGHT: I guess we can take that to the mailing list then.
(Applause)
We have Duane Wessels next, and he will be followed by a brief update from Paul Hoffman.
DUANE WESSELS: Yes, thank you. I am from Verisign, and this is about increasing the zone signing key size for the root zone. Here is the outline. If you are not familiar with some of this lingo, here are some of the abbreviations you might hear today. KSK is the key signing key, which is a function operated by IANA/ICANN. ZSK is the zone signing key, which is operated by Verisign. KSR stands for key signing request, and this is a bundle of XML stuff that Verisign sends to ICANN for keys to be signed during a key signing ceremony; and, just to be confusing, an SKR comes back — it's the response to that, the signed key response. So you will hear these terms.
So, you may have been hearing things about the KSK rollover; I am not here to talk about that, Paul will be talking about that right after I am done here. But just to say, and Paul would say the same thing, that ICANN and Verisign are working very closely to ensure these two activities do not overlap and interfere with each other, and things are going very smoothly. This slide shows the current DNSSEC parameters for the root zone: the algorithm is number 8, which is RSA/SHA-256; the KSK size is 2048 bits and the ZSK size is 1024 bits. You can read the rest of the table if you want to, but just to note: the only thing we are changing here is the ZSK size, and no other parameters will be changing during this activity.
So in terms of schedule, this shows that the check marks are things that have already happened; there was some testing that took place between ICANN and Verisign back in April. About a couple of weeks ago there was the 25th KSK ceremony, where the 2016 Q3 ZSKs were actually signed, so that is done. The next big event will be a similar ceremony in August, where the Q4 ZSKs will be signed. September 20th is the date on which the root zone will first contain a ZSK of 2048 bits; it will be pre-published in the zone for a period of about ten days. And then October 1st is the day on which there will first be a root zone which has been signed by the larger key.
This diagram sort of shows the same thing, how the ceremonies line up with the quarters, but one thing to point out in particular is, down in this time frame between September and October, the circled area there shows that pre published key that goes in the root zone and that is the thing which has just been signed at the ceremony a couple of weeks ago so that happened already.
So some more details about the rollover. If you don't know, the ZSK is rolled quarterly; this has been happening for the last six years or so. A calendar quarter is divided into nine slots of approximately ten days each; sometimes the last one has to be a little bit longer. So the DNSKEY RRSIG changes in each of those ten-day slots. A rollover uses the pre-publish technique, which means incoming keys are pre-published for one slot and outgoing keys are post-published for one slot. This diagram shows that: at the top you can see we have three quarters, and the smaller boxes at the top say slot 1, slots 2 through 8, and slot 9. You can see that in slots 9 and 1 we have two keys published, and that results in an increase in the DNSKEY response message size.
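To make the slot arithmetic concrete, here is a small illustrative sketch in Python; the real slot boundaries are set by the root zone signing process, so the dates below are only an example quarter:

    from datetime import date, timedelta

    def quarter_slots(start: date, end: date, slots: int = 9):
        """Split [start, end) into `slots` pieces of roughly equal length;
        leftover days go to the last slot, which is why it can be longer."""
        days = (end - start).days
        base, extra = divmod(days, slots)
        bounds, cur = [], start
        for i in range(slots):
            length = base + (extra if i == slots - 1 else 0)
            bounds.append((cur, cur + timedelta(days=length)))
            cur += timedelta(days=length)
        return bounds

    # Example: Q4 2016, the quarter in which the larger ZSK first signs the zone.
    for i, (s, e) in enumerate(quarter_slots(date(2016, 10, 1), date(2017, 1, 1)), 1):
        print(f"slot {i}: {s} .. {e}")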
So this is what things will look like when we execute the increase from 1024 to 2048. It's going to be very similar: on that second slot — the first one — is when we will first see the 2048-bit key. One difference is that, as the top blue line shows, instead of post-publishing for just one slot, the old key will be post-published for three slots, just to give us a little bit of extra time in case something goes wrong.
So that's what I just said there. Let me talk a little bit about what it means to have a larger ZSK. One of the things that we worry about a lot is the response message size, and this chart shows the sizes of signed DNSKEY response messages; they are in order here of how they will go during the transition. At the top we have the situation where there is a single 1024-bit ZSK, and that results in a DNSKEY response size of 736 octets. When we do a 1024-to-1024 rollover, which has been happening for years, that size jumps up to 883 octets. Then when we do the 1024-to-2048 transition, for that brief time, it's going to go to 1011 octets. Then it will go back down during the time when we have a single 2048-bit key. When we have to do a 2048-to-2048 rollover that goes up to 1149, and similarly, if there is a KSK rollover with a 2048-bit KSK and a 2048-bit ZSK, then the size is 1139. Same size.
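If you want to watch these sizes change during the transition, one way is to fetch the root DNSKEY RRset yourself and measure the wire-format response. A minimal sketch with dnspython (a.root-servers.net is used here as an example target; any root server or your own resolver would do):

    import dns.flags
    import dns.message
    import dns.query

    # Ask a root server for the DNSKEY RRset with the DO bit set and a
    # 4096-byte EDNS buffer, then report the response size and TC flag.
    query = dns.message.make_query('.', 'DNSKEY')
    query.use_edns(edns=0, ednsflags=dns.flags.DO, payload=4096)
    response = dns.query.udp(query, '198.41.0.4', timeout=5)  # a.root-servers.net
    print('response size:', len(response.to_wire()), 'octets')
    print('truncated:', bool(response.flags & dns.flags.TC))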
So at the start I said I wasn't going to talk about the KSK rollover, but here is a bit more detail about that. At such a time as there is a KSK rollover, this is what those responses would look like, and there we start to get worried about fragmentation and things like that, getting up to 1,400-plus octets for the size of this response. Still below a 1,500-octet MTU, but getting pretty close.
So that was just about the size of DNSKEY responses; we also need to think about how the other responses are going to change. A 2048-bit signature is 128 octets longer than a 1024-bit one, but it's not quite that simple because there are other factors at play. To understand this we did some simulations: we captured some data from the A-root server, then took a zone file and signed it in various configurations with different key sizes and numbers of keys, replayed that traffic to name servers serving those zones, and recorded the response sizes and whether the TC flag was set and so on. This was all done earlier this year, back in February, and here are some details about the data that was used: it was a ten-minute trace, and something like 37 million UDP queries were captured and then replayed.
So, we looked at fragmentation, at how many responses would exceed the limit; here the limit was 1472 octets, and there were essentially no responses that needed to be fragmented. The only ones that did were responses to ANY queries, and there were very few of those, just a handful.
Looking at just DNSKEY responses here, this shows how many of those would experience truncation, that is, be returned with the TC bit set. The green bar on the far left shows again the normal case where we have a single 1024-bit key, and there two and a half percent of DNSKEY responses get truncated. That goes up to 5.5% — well, it goes up already when we do a 1024-to-1024 roll — and stays at that level when we introduce the 2048-bit key; it doesn't go beyond that in any of the simulations.
It's a little bit different when we look at truncation for all responses overall. Here, the amount of truncation only depends on the size of the key that is used for signing. The normal level is about half a percent of all responses getting truncated, and after we sign with the larger key that goes up to about 1.4%.
This graph shows the distribution of packet sizes, and again the only thing this depends on is the size of the key used for signing. So even though there are lots of lines drawn on the graph, they are all overlaid and it looks like there are only two lines.
This graph shows the bandwidth that a root server letter — based on this simulation, A-root — would produce for all the different configurations. When the zone is signed with the 1024-bit key we are at about 250 megabits per second, and when it goes up to the larger key that will be about 350 megabits per second, so this is useful for operators for their capacity planning for this upcoming change.
So I will say a bit about fall-back plans. We fully expect everything to go smoothly and to occur without incident, but in case something unforeseen does come up, we have some plans to fall back should it become necessary. In that case we will go back to what we call a known good state, which will be a 1024-bit ZSK, and in fact it will be the exact same ZSK that was used in the quarter prior to the length change. In order to make this happen, ICANN will — and already has, at the first ceremony — sign two KSRs, and at the next ceremony they will do the same thing: sign two KSRs, one of which includes the 2048-bit ZSK and the other of which contains a fall-back 1024-bit ZSK. This shows graphically the fall-back options: at the top is the sequence for the way we expect things to go normally, with the introduction of the 2048-bit ZSK, while the bottom shows the fall-back KSR with just 1024-bit keys. So, as I said, in these two key signing requests it's the same key being used both in Q3 and Q4. These green boxes here show the keys that were just signed at the previous ceremony, and this shows the ones that will be signed at the next ceremony.
So, should it become necessary to do a fall-back during slot 9 of Q3, all we need to do is — well, I will talk about that in a minute. First, though, the criteria for fall-back: it would have to be something unforeseen and very serious, and something that could not be solved by individual operators temporarily disabling DNSSEC validation on their name servers. If it meets those criteria, then Verisign would consider executing the fall-back plan. The first time we have to worry about any of that will be in slot 9 of Q3, when the new key first appears in the pre-publish phase, and the next time is when it's used for signing in slot 1 of Q4. It may take a couple of days, obviously, for cached signatures and whatnot to expire from caches in the Internet. And the last big milestone will be the point at which the old 1024-bit ZSK is removed from the zone; remember that is a three-slot post-publish period, so that is about 30 days after the zone is first signed with the new key.
So again, in slot 9, if there is a problem all we have to do is unpublish the new key and continue signing with the 1024-bit key, and if that happens there will be no ZSK roll for the next calendar quarter; we will just continue with 1024 bits. If there is a problem in the slot 1 phase then we would revert to signing with the old 1024-bit key, and at some point we would need to remove the 2048-bit key; exactly when that happens will depend on the nature and severity of the particular problem that caused the fall-back.
So lastly, I want to encourage everyone to test their networks — not necessarily the RIPE network, because I have already tested that. When you go home, there is a tool: you can go to this website, called keysizetest.verisignlabs.com, and you should see a web page that looks like this. This web page issues a bunch of queries to a zone signed at Verisign which kind of mimics the root zone, and it's even a slightly better test because, due to the name, these responses are actually larger than they would be for the root zone. So you can go to this page and you should see something like this. If you do not see this, I would very much like you to contact me and tell us about it, and we can figure out what is going on. Please give that a try. And that brings me to the end. I am going to hold questions until after Paul has a chance to give his presentation, and we will take questions together.
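For those who prefer a script to a browser, a roughly equivalent check is to ask your own resolver for the test zone's DNSKEY RRset with the DO bit set and see whether a complete answer makes it back. A sketch with dnspython; the resolver address is a placeholder you would replace with your own:

    import dns.flags
    import dns.message
    import dns.query
    import dns.rcode

    RESOLVER = '192.0.2.53'  # placeholder: put your own resolver's address here

    query = dns.message.make_query('keysizetest.verisignlabs.com', 'DNSKEY')
    query.use_edns(edns=0, ednsflags=dns.flags.DO, payload=4096)
    response = dns.query.udp(query, RESOLVER, timeout=5)
    # NOERROR with the AD flag suggests the resolver validated the larger keys;
    # SERVFAIL or persistent truncation would be worth reporting.
    print(dns.rcode.to_text(response.rcode()),
          'AD' if response.flags & dns.flags.AD else 'no AD',
          len(response.to_wire()), 'octets')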
PAUL HOFFMAN: Hi. So again, everything that Duane has been talking about is the ZSK. Some of you have heard that the KSK is going to be changing too. I have got just two slides on this, and none of them with numbers or pretty pictures like Duane's. The upcoming rollover is going to start after the ZSK is successfully changed; that is because, as you saw in one of Duane's slides, we are going to be adding even a little bit more size, and we want to make sure that the 2048-bit ZSK has worked just fine. So, you heard Verisign's plans for the ZSK size increase. ICANN and Verisign are working together on the KSK roll — you guys came out to Los Angeles, we are coming out to DC — and all of that is going fairly well, and as soon as we have an agreed plan it will be made public for public review; that should be coming up soon. Just the same as you saw in Duane's last couple of slides, we are assuming that this is going to work, and we are also planning for if it doesn't work: how to do a good fall back, roll back — we actually haven't picked the word yet, so we have fall back, roll back and back out; we will come up with a word before we present this to you, hopefully just one. And the last one is very important: we are going to, as soon as we can, be presenting the DNS community — actually the entire community — with as much technical information as we can, with both the plans and the fall-back plans and things like that. Some of that is already out there now; for those of you who have been following along, there was a design team document. We will be doing a lot more; we want to hear from the community, we want the community to look through it. As we have been doing these things we have found some interesting stuff already, and we are hoping to flesh that out sooner. The URL up there is where you can always follow that. There is not that much information there now, but every time we update anything it will appear there, and for those of you who like Twitter, we will have a hashtag and we will be putting up things on the hashtag. That is it from me; if there are questions for either of us — come on back, Duane.
SHANE KERR: Shane Kerr, Beijing Internet Institute. It's cool — I saw an earlier version of the ZSK stuff, and I think there is more data here, which is nice to see. A quick note that on the Yeti root server test-bed we're actually testing a roll to 2048 bits for our own ZSK right now. We are not doing it with the elaborate timing stuff; we wanted to do a basic test to see if it works, and that will be done in a few weeks. I had a question about the bandwidth increase: that seemed really high to me, because my understanding is most of the traffic to the root is crap. Was this only for good queries, and why was there such a big increase?
DUANE WESSELS: I think it's such a big increase because, really, so many queries come in with the DO bit set. It is all queries, it's not just good ones; but whether or not those DO-bit queries are junk or not, I guess that is another study.
SHANE KERR: Right. If only we could know who is validating and what they are using to do that.
PAUL HOFFMAN: Note that DO doesn't mean validating. DO means someone told me to set this in the query.
SHANE KERR: Are there going to be specific feedback channels so that you know that there are problems with the ZSK roll or are you relying on Twitter for that or what is the plan?
DUANE WESSELS: There will be some specific feedback channels, we haven't published those yet, but Twitter for sure, an e‑mail address, mailing list, the usual kind of things.
SHANE KERR: That is good enough, thank you.
DAVE KNIGHT: I am going to be rude and jump the queue. During the initial signing there was the root-dnssec.org website; that doesn't look like it's been updated since then, and you both have different URLs. Do you plan to come back together and collaborate there again, or is that dead and just to do with the original signing?
DUANE WESSELS: Do you want me to answer that? So we have talked about that. At this point I think that website is going to become sort of historical reference and probably not going to be updated for these activities. That is my sense of it. But, I mean, there is still time and things could change so I don't know.
PAUL HOFFMAN: One thought we had was to leave it there for historical reasons and put a big flashing thing at the top: if you are looking for new stuff, come over here. These are sequential, separate activities, even though we are working together — and again, we are not expecting to start the KSK roll until the successful completion of the ZSK size increase.
JIM REID: Just an observation first of all, and then a question. First of all, thanks to both of you for communicating this information. It's good we are getting very clear guidelines about the plan; it is very well thought out, with fall-back mechanisms as far as can reasonably be anticipated. It's great this communication is being done and I want to commend you both, and everyone else involved, for all the hard work. Duane, can you tell us about the rationale for moving from 1024-bit to 2048-bit keys for signing the zone — is it that 1024-bit keys used for a three-month duration are no longer considered good enough, or is it to align things with what you are doing in .com and .net?
DUANE WESSELS: So it's a couple of things. As you may be aware, there is a NIST recommendation on key lengths, so that is part of it; their recommendation had some exceptions for DNSSEC, but those exceptions have expired. So that is one big motivation. And there was also, I would say, a pretty big mailing list brouhaha a couple of years ago where this was discussed, and it's in response to that community input that the key should be bigger.
JIM REID: Thank you.
PAUL HOFFMAN: One salient bit of the brouhaha, at least that I knew of — and this was before I was at ICANN, but people were paying attention to it — is that a lot of people have heard about DANE and want it, but there is no ability to use DANE in your browsers. That is because the vendors have said: we made all of our CAs go to 2048 bits and above, and we don't want the weak link to be DNSSEC. Whether you agree with that or not, they pretty much said we are not going to consider DANE until both of the keys are 2048 bits, so that helped the decision as well.
PETER KOCH: I was just going to ask the same question that Jim asked; I went away and came back again. So that helps all the people who get their domains signed directly under the root?
PAUL HOFFMAN: Yes.
PETER KOCH: How many is that? A handful. I haven't seen a general recommendation to move to 2048, or some intermediate size, for any of the intermediate levels in the DNS tree. And the considerations and measurements that were done during the experiments that Duane described, which can be read elsewhere, may or may not apply to other levels in the DNS tree, so the takeaway that, oh, everybody should now run to 2048 for the ZSK is probably a bit premature.
DUANE WESSELS: I would agree with that. But I guess my question to you is, you said there is no recommendation. Where would you see such a recommendation coming from? I mean, who would be in a position to make that recommendation?
AUDIENCE SPEAKER: DNS Working Group.
PETER KOCH: There are so many of them, I don't think they can agree on the size. I would definitely not refer to the NIST recommendations, for a variety of reasons — not because they are bad, but for governance reasons.
PAUL HOFFMAN: I would emphasise that you don't necessarily want to hear either of us, or the organisations on stage, recommending it. Many national bodies have been recommending going past 1024, not just NIST; I don't believe I have heard, in the last five years, any regional or national organisation recommending staying at 1024 bits. The difference between 1024 and 2048 bits is completely magic; if you went from 1024 to 1596 that would be an amazing increase, but people like powers of two that they can recognise. That, literally, is the only reason that everyone is going to 2048 bits.
AUDIENCE SPEAKER: Phil, speaking for myself. When the thing was announced about breaking 1024 bits, the same people also published details on how to break 1024-bit RSA, and basically, if you have, say, 100 million in spare cash, you can build a machine and factor it. I wouldn't assume that this is safe for any purpose at this moment.
PAUL HOFFMAN: That is not a correct interpretation of the paper. For one thing, the 1024-bit key that they broke was a special 1024-bit key which only had an equivalent strength of 750 bits.
AUDIENCE SPEAKER: The thing is, nobody has used 100 million in spare cash to demonstrate that you can do it, so their calculation is: if you had this kind of pocket change, this is how you would build it.
PAUL HOFFMAN: How you might build it. I know there are hardware people in here, especially some network hardware people: building a die that is this large and having the heat work on it is theoretical at this point. It's certainly not impossible, and we assume that there are designs going on for that, but no one has demonstrated the hardware configuration that could actually do it. And by the way, that stops at 1024; as soon as you go larger than that, the calculations go to a die this large.
AUDIENCE SPEAKER: Olafur Guðmundsson, CloudFlare. I am not picking on you guys, because you are doing the good stuff. I am going to pick on my colleague, Peter.
PAUL HOFFMAN: He is doing good stuff too.
AUDIENCE SPEAKER: The weakest link decides whether you trust the answer or not. If the web browsers say we are not going to trust anything that is less than X, that means if anybody on the way there is less, the chain is not trusted; it doesn't matter what the lower levels do. This change means everybody who is using larger keys below can be trusted. I just checked: there are in excess of 85 domains that only use 2048-bit RSA or larger keys, so instantly the browsers can start saying those are good. So the next step is to get the rest of the TLDs to start moving up, or to switch to a better algorithm, which has smaller keys.
PAUL HOFFMAN: Significantly smaller keys. We will come back in five years to talk about that.
KAVEH RANJBAR: A comment from a different perspective, and just food for thought. During this week I have heard a lot of stuff which is about standards — this is inspired by the comment you made about browser vendors and that stuff. We already have strong links between this community and the IETF and ICANN, but I hear a lot of things which are inspired or designed by the W3C, and this community has very, very few links — we don't have that many links to that SDO. So I think it's food for thought; maybe we need more communication with that line of standardisation, because there is also DNS over HTTP, and I know they are talking about this stuff. Just food for thought.
PAUL HOFFMAN: Sure. And once we have the 2048 bit ZSK I think more of those groups might be willing to initiate conversations, the ones who have said, oh, no, they don't understand, they are using too small of a key, I believe that we will have more discussions, certainly with the browser vendors, who I didn't discuss with in the W3C, I talked with them in the IETF through the TLS Working Group.
DAVE KNIGHT: If there are no more questions, thank you, Duane, thank you, Paul.
(Applause)
Next up we have Ralph Dolmans, who is going to be talking about QNAME minimisation in Unbound.
RALPH DOLMANS: Hi all. I work for NLnet Labs. This talk is about QNAME minimisation in Unbound. We are a not-for-profit foundation; we do research and development on Internet technologies. We develop open source software, and one of our products is Unbound, which is our caching, validating resolver.
So, it's all about privacy, and privacy has always been important, but since the Snowden revelations a lot more attention has been paid to it, and one of the things that happened is the publication of RFC 7258, stating that pervasive monitoring is a technical attack that should be mitigated in the design of IETF protocols. For most protocols we have a secured way of communicating; for DNS we do not. Almost all DNS traffic in the world is sent in plain text. We have DNSSEC to authenticate the validity of the data, not to hide any data. So the question is: should we care, since DNS data is public anyway? And that's right, I want everybody to know where they can find my mail server, but I do not want everybody to see who is sending me e-mails — I do not want everybody to see who is resolving the MX record. With the trend of storing more data in the DNS, we store more privacy-sensitive data there; for example, if somebody can see a query for an OPENPGPKEY record, they do not only know the domain we are sending to but the complete e-mail address.
So something has to be done. Some of the things we can do are stated in RFC 6973. Two of those are data minimisation and security. The security part is mainly about hiding the data using encryption, and data minimisation is about not exposing more data than needed to execute a task. The simple idea behind it is: the less data you expose, the less data can be abused.
I have an example here of a stub resolver wanting to resolve a name; the resolver has an empty cache, so we have to send queries to the authoritative name servers. We go to the root and ask for it; the root doesn't have it, but it has a delegation for the NL zone, so we contact the NL zone and ask for the A record of nlnetlabs.nl; it's not there, but there is a delegation, and finally we send a query to the NLnet Labs name servers, get the answer and send it back to the client. All those DNS transactions are in plain text.
The DPRIVE Working Group has published a specification for DNS over TLS, and another thing we can do is minimise the data we are exposing here, because between the resolver and the authoritative name servers we are exposing more data than is needed to execute this task.
That is written down in the QNAME minimisation RFC (RFC 7816), and what it says is: where possible, use QTYPE NS and set the QNAME to the name of the zone we are going to contact plus one label. So if we apply that to the example, we see here that the queries to the name servers are changed: instead of sending the full QNAME and original QTYPE, we only ask for the delegation at the root, and we only ask for nlnetlabs.nl in the NL zone, and after that we send the full query to the NLnet Labs name servers. Doing this really makes sense, because the only thing we want to know from the root is the delegation for the NL zone; there is no need to send more data there. We have implemented this in Unbound; it's in there since version 1.5.7. For now it's off by default, but it's really easy to enable: you just set qname-minimisation: yes in your configuration file and you have it running.
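The rule itself fits in a few lines. A toy sketch of "the zone being contacted plus one label", using dnspython's name handling (this is an illustration, not Unbound's actual code):

    import dns.name

    def minimised_qname(full_qname: str, zone: str) -> dns.name.Name:
        """Return the QNAME a minimising resolver would send to the servers
        for `zone`: the zone's own name plus one more label of the query."""
        full = dns.name.from_text(full_qname)
        cut = dns.name.from_text(zone)
        keep = len(cut.labels) + 1            # zone labels (incl. root) plus one
        return dns.name.Name(full.labels[-keep:])

    print(minimised_qname('www.nlnetlabs.nl', '.'))    # nl.
    print(minimised_qname('www.nlnetlabs.nl', 'nl'))   # nlnetlabs.nl.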
So if you do that and have a look in your log file, you have proof that it is really working: here we see that to the root we sent a query for the NS records of just nl; to the NL zone we sent a query for the NS records of nlnetlabs.nl; and to the NLnet Labs zone we sent the full QNAME and the original QTYPE. So now that we know what QNAME minimisation is and how you can enable it in Unbound, we can have a look at the way it's implemented. A query will come to the resolver and reach the minimisation stage. When you have QNAME minimisation disabled, you are in the "do not minimise" state, and what we do in this state is just send out the original QTYPE we got from the client and the full QNAME — really the same behaviour as before we had the implementation. If you do enable it, you will start in the "init minimise" state, where we set the QNAME to the name of the zone we are going to contact; for a completely empty cache that will be the root, so we set the QNAME to the root. After that we go to the "minimise" state, which is the heart of the implementation: here we add one label to the query and send it out, wait for a response, add another label and send it out, and we continue doing that until we are at the last label. Then we go to the "do not minimise" state, and once again we send the full QNAME and QTYPE. It's also possible we get a CNAME, or need to resolve the name of a name server. And there is one more state, the "skip minimise" state; we go to that state if we need to send the same QNAME again, for example if we get a timeout from a name server and need to send the same QNAME again.
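Restated as a small sketch (Python, purely an illustration of the description above, not Unbound's source):

    from enum import Enum, auto

    class MinimiseState(Enum):
        DONOT_MINIMISE = auto()  # send the full QNAME with the original QTYPE
        INIT_MINIMISE = auto()   # start: set the QNAME to the zone being contacted
        MINIMISE = auto()        # add one label per query until the last label
        SKIP_MINIMISE = auto()   # resend the same QNAME, e.g. after a timeout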
Now the question is: when can we stop sending those queries? One thing we could do is just keep on adding labels and sending queries out until the number of labels is the same as the number of labels in the incoming QNAME, but doing so only makes sense as long as the name exists: if my longest existing domain name has four labels and the resolver gets a query for a name with 100 labels, then the resolver will send out more than 100 queries, and that is not only wasting traffic but also opens the possibility of abusing this in DoS attacks. So what we should do is stop sending those queries as soon as we know that the domain does not exist. Now, in DNS there is a way of signalling that a domain does not exist: sending NXDOMAIN, which means the domain does not exist and therefore there are no children of this domain. But not all name servers handle NXDOMAIN the way they should. I have an example here: if you query for gov.edgekey.net you get NXDOMAIN, yet there is something under this domain; because they do it wrong, we cannot trust the NXDOMAIN RCODE any more. And NXDOMAIN isn't the only RCODE we cannot trust: there are also servers that give us ServFail, or other strange RCODEs, when we query for the NS QTYPE. So we have a problem, because we cannot keep on sending those queries and ignoring the RCODEs, but we can also not trust them because they can be wrong. What we do in Unbound is: as soon as we see something other than NOERROR, we give up, stop doing QNAME minimisation, go to "do not minimise" and send out the original QNAME with the original QTYPE. This behaviour does not conform to the RFC, but if we didn't do it this way a lot of domain names would not be reachable any more, and no operator would enable this feature.
So this fall back clearly has some impact on the privacy, because NXDOMAIN queries will become visible at the servers, and you can actively attack this fall back: if you are running a name server and you want to know the full QNAME, you can trigger the fall back and get the full query sent to you. It's getting slightly better in Unbound 1.5.9: we do not do the fall back any more but just accept the NXDOMAIN and give it back to the client. We do this based on the assumption that name servers serving DNSSEC-signed zones will handle NXDOMAIN the way they should, and because the root is signed and a lot of TLDs are signed, for answers coming from those zones we don't need to do the fall back any more, so we don't expose more data there than needed.
Something else: the number of queries when you enable QNAME minimisation. Without QNAME minimisation, if I want to have the A record for nlnetlabs.nl, I need to contact the root, the NL zone and the NLnet Labs zone — the number of queries is the number of zones that I need to contact. If I do enable QNAME minimisation, for this example I need to send out one more query, because I first ask with the NS QTYPE and only after that can I send a query with the A QTYPE; we do not know in advance that there is no further delegation inside the NLnet Labs zone, so we cannot expose all the data yet. So, as I said, without QNAME minimisation the number of queries is the number of zones you need to contact; with QNAME minimisation it's the number of labels in the QNAME, because for each label we are sending a query, plus a query for the original QTYPE, and we need to do this not only for the query we got from the client but also for CNAMEs we have to follow or name server names we need to resolve. I have an example here: if I want to query the AAAA record for ietf.org I need to contact three zones; I also need to contact three zones for the name of the name server I need to resolve; and in the end I get a CNAME, for which I also need to contact three zones. So to resolve this name without QNAME minimisation I have to send out nine queries. With QNAME minimisation, the first name needs a few more queries, the name of the name server has four labels plus an extra query for the AAAA type, and the CNAME has six labels; so instead of sending nine I send 15 queries. That almost doubles the number of queries, but it's still a relatively small number. It changes, for example, with these kinds of domain names: reverse IPv6 addresses. This is the reverse address for an NLnet Labs IPv6 address, and I know that to resolve this one I have to contact four zones, so without QNAME minimisation that would be four queries. But because this name consists of 34 labels, if I want to resolve the PTR record with QNAME minimisation I have to send out 35 queries. So what we did in Unbound 1.5.7 to limit the number of queries we need to send out is: when we see that the domain we are resolving is a child of ip6.arpa, we add eight labels at a time.
So, yes, this works for the ip6.arpa example, but there are more of these kinds of domain names — the blacklisting stuff where you prepend an IP address, for example — and wildcards: a wildcard can match names with a lot of labels, so you can generate a lot of traffic to a server. Therefore, what we do starting from Unbound 1.5.9 is limit the number of QNAME minimisation iterations to 10, and we do it as follows: the first four queries will always get one label appended before we send them out, and the remaining labels will be divided over the six queries we have left. So, as an example, I have a QNAME with 18 labels: the first four queries will have one label appended; then I have 14 labels left and six queries, and 14 divided by 6 is 2-point-something, so the rest of the queries will have two labels appended, and the two extra labels go to the last two queries.
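A toy sketch of that grouping (an illustration of the description above, not Unbound's actual code): at most ten minimised queries, the first four adding one label each, the remaining labels spread over the last six queries, with the leftovers going to the final ones:

    def label_increments(num_labels: int, max_queries: int = 10) -> list:
        """How many labels each successive minimised query adds."""
        if num_labels <= max_queries:
            return [1] * num_labels
        increments = [1, 1, 1, 1]                  # first four: one label each
        remaining_labels = num_labels - 4
        remaining_queries = max_queries - 4
        base, extra = divmod(remaining_labels, remaining_queries)
        # the last `extra` queries each carry one additional label
        increments += [base] * (remaining_queries - extra) + [base + 1] * extra
        return increments

    print(label_increments(18))  # [1, 1, 1, 1, 2, 2, 2, 2, 3, 3] -> 10 queries, 18 labels
    print(label_increments(34))  # the 34-label ip6.arpa PTR case also stays at 10 queries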
Now, the main reason to enable QNAME minimisation is obviously the enhanced privacy, but there is another nice benefit, and that is the synergy you get when you combine QNAME minimisation with the harden-below-nxdomain feature, which is based on the NXDOMAIN cache. The idea behind it is that because NXDOMAIN means the name doesn't exist and doesn't have any children, if you have an NXDOMAIN in your cache and get a query for a name below it, there is no need to contact the authoritative servers any more. Consider doing this NXDOMAIN cache stuff without QNAME minimisation: I have three example queries here, and they all contain a label that does not exist, so they will all result in NXDOMAIN answers. The first query is not in my cache, so I go out to the authoritative servers and cache the NXDOMAIN. When I then get a query for a.b below that name, it's a child of something in my cache, so I can answer it from the cache; there is no need to contact the name servers again. But the query for c under the non-existent label is not a child of anything in my cache, so I would need to go out again. If we enable QNAME minimisation, we just send a query for the non-existent label itself, and the NXDOMAIN we cache for it means that even the third query can be answered from the cache.
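A toy illustration of that cache logic (plain Python with dnspython's name type; a sketch of the idea with made-up names, not Unbound's implementation): a cached NXDOMAIN covers every name below it, and with minimised queries the cache is populated with exactly the short names that cover later queries:

    import dns.name

    # Cached from an earlier minimised query that got NXDOMAIN for nonexistent.nl
    nxdomain_cache = {dns.name.from_text('nonexistent.nl')}

    def answerable_from_nxdomain_cache(qname: str) -> bool:
        """True if a cached NXDOMAIN already proves this name cannot exist."""
        name = dns.name.from_text(qname)
        return any(name.is_subdomain(nx) for nx in nxdomain_cache)

    print(answerable_from_nxdomain_cache('a.b.nonexistent.nl'))  # True: from cache
    print(answerable_from_nxdomain_cache('c.nonexistent.nl'))    # True: from cache
    print(answerable_from_nxdomain_cache('nlnetlabs.nl'))        # False: ask the servers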
Now, if you want to know whether your resolver has QNAME minimisation enabled, I made a simple test: you can query a name for TXT records and you get a message telling you whether you have QNAME minimisation enabled or not. So, to wrap it up: QNAME minimisation limits the amount of data we are exposing to the authoritative name servers, but it doesn't help at the resolvers — they still see all the data — and it doesn't help on the wire, although DNS over TLS seems promising there. But there is still a lot to be done. Are there any questions?
DAVE KNIGHT: No questions for Ralph then. Thank you very much.
(Applause)
Now we have a few minutes left at the end of the session to do any follow-up on other DNS content that has happened elsewhere in the meeting. In the plenary we had Paul Ebersman speaking about DNSSEC experiences at Comcast, and in the MAT Working Group Maurice spoke about measuring the DNS configuration of upstream resolvers with Atlas. No pressure, but if there are any questions about those that you would like to follow up on, now would be a good time to do that. Also, Jen did one on DNS64. Or perhaps, Paul, you had suggested you wanted to add something, or... OK. Well, I guess then you get a few minutes back; we have ended in a nicely prompt manner, so I assume that means everybody will be back nice and quickly after the break. Thank you very much.