24 May, 2016
At 4 p.m.:
DANIEL KARRENBERG: Hello. Hi. Welcome back everyone. Please take a seat. As you know, we have been calling for nominations for the Programme Committee, and we had two candidates, so I would like each of them to come up and say a little something about themselves.
SPEAKER: Hello, Peter Hessler, I am a member of ‑‑ I work for Lierhost server in Germany, and I am also a developer with the OpenBSD project and the OpenBGPD project. I wanted to run for the Programme Committee so I can talk about, or at least help with, more of a focus on the alternate methods that people can use for networking beyond the huge Google‑style or Cisco‑style networking. Thank you.
SPEAKER: Hi, I am one of the many Andres at cz.nic and I mainly do DNS and as you might know me, and I am also developer to top up the open BSD, and I also serve on some ICANN things and I am one of the RIPE arbiters and well, I would just like to help the ‑‑ to make the choices of the presentation the best possible because I like RIPE meetings and I have been going for ten years now, OK.
DANIEL KARRENBERG: And since we got two nominations for two spots, both of the candidates have been automatically elected. So let's give them a round of applause.
And along with that, I would like to invite Andres Arcia‑Moret from the University of Cambridge, talking about network deployments.
ANDRES ARCIA‑MORET: Hello. So, good afternoon everyone. First of all I would like to thank the RIPE 72 organisation for having invited me to this event; especially I would like to thank Gergana, who has taken care of all the details for having me here.
So, today I am going to be talking about alternative network deployment. And I am working at the network for ‑‑ network development laboratory which is a small group, recently created within the computer lab at the University of Cambridge.
So this work is a joint work that has been around for a couple of years now and it's been treated within the IRTF, specifically within the Global Access to the Internet for All research group. These are the co‑authors, it's a long list plus other authors in the next slide, who have been working together since 2014, people coming from different universities and centres for network and research and actually working on the deployment of alternative networks. So this is the entire list of authors.
We are intending to have an informational RFC, soon to be, I expect. We currently have version 6 online, so you can go for it and read it and criticise as much as you can. Actually, this document has been developed within the objectives of the group. The objectives are, again, summarised in this: "to document and share deployment experiences and research results to the wider community through scholarly publications, white papers, informational and experimental RFCs". Ours is the first of its kind, trying to summarise what is going on in the world for free communications and alternative ways of deploying a network in under‑served areas.
So the motivation, as registered in the document and in this slide, is that it's the first time a taxonomy of emerging models has been made at the IRTF. It is addressed but not limited to independent community organisations bringing alternative topology, infrastructure and business models to get connected to the Internet, setting a precedent for civil society as well, and members of under‑served areas, developing regions mostly; it's not limited to that, but that is the use case in which we intend to make the major impact in the future. So actually, it's also being developed having in mind to provide an essential piece of information for more than 4 billion people who are actually disconnected from the Internet.
So, the spirit of deploying an alternative network goes into freedom, and this is the freedom of communication. However, we haven't gone from what freedom is to us to actually defining it, but rather, we have approached this with a practical perspective: we have seen where it's already deployed, we have paid attention to what people are defining as freedom and then we have tried to summarise this into the document. So first, the [unclear] definition of free communication; it belongs to a nonprofit organisation, based in the US and mainly supported by and advised by ‑‑ three main principles: the first one, freedom number 0, is the freedom to communicate for any purpose without discrimination, interference or interception, and this concerns security aspects. Number 1 is the freedom to grow, improve, communicate across and connect to the whole network; and number 2, the freedom to study, use, remix, and share any network communication mechanisms in the most reusable form. However, this is the other extreme, these are the principles that are stated by guifi.net, which is the largest community network in the world. So alternatively, I am going to be using during the talk the names community network and alternative network; actually, what we have come up with lately is to have this generic term 'alternative network', and within the alternative network there are many possibilities, and the one that actually triggered the term was the community network. That is why sometimes I switch terms, right, but it's going to be, I hope, clear at the end of the presentation. So guifi.net is the largest one providing free, open network communications; now it accounts for more than 30,000 nodes installed by people for the people, and it also has more than 50,000 kilometres of wireless links.
It's been around for ten years and it covers the whole region of Catalonia in Spain. Whenever someone wants to join the network, they define the principles: the freedom to use the network for whatever purpose; neutrality, they are not trying to ‑‑ the right to fully understand the network and its components, as well as to spread out and gain knowledge from users; the right to offer public and private services, including for commercial purposes; and the right to join the network and extend the inherent rights to anyone else, who in turn can be a client of the client. So, the typical levers found in this sort of organisation are do‑it‑yourself and maker communities, people that want to build it themselves, working on software, hardware and organisations; low‑cost commodity and open source technologies; and the public institutions, such as universities, public universities mainly, around the world, and independent organisations. However, so far, and this is a state‑of‑the‑art statement I am going to make, the biggest challenge is about governance: how are we going to solve conflicts when they arise within the community networks, since the governance tends to be distributed?
So these are examples of community networks coming from many projects summarised in the document in more detail. The first one is the installation of a 12 metre pole for a 5 gigahertz link in Micronesia, this is people from ICTP in Italy ‑‑ this here, figure 3: A is a super node in the guifi.net community; B is in India, a wireless service provider which is actually a private initiative trying to provide service to rural areas; and the third one is a research project from Spain at the University of Carlos which is intended to provide 3G coverage in the Peruvian jungle for small communities.
What are alternative networks? By definition, they are networks that do not share the characteristics of mainstream networks, but what are the mainstream network characteristics? So alternative networks are not top‑down controlled networks, have no infrastructure with substantial investment, have no exclusive participation of an elite of network and technology designers and have no central authorities allowed to intervene in communications.
There are other aspects as well, relevant for alternative networks, such as that they can be used in developed regions, to fill the gap of under‑served areas, which account for a small percentage of the population left without Internet connection; but the biggest impact is in bridging the digital divide by increasing availability and affordability of the network infrastructure, tackling digital literacy, adapting the regulatory framework for the masses, and popularising content and services. This is a simple example of how we can find an alternative network deployment and what it actually looks like. This is the mobile coverage in Johannesburg, and when we get away from the urban and suburban area we immediately find areas without service. In the right figure you can see where an alternative network has to be or is likely to appear, as is the case for 30% of the networks that we have surveyed in the past, 30% of the existing networks worldwide. So, actually, the problem is how we are going to provide the backhaul communication for these networks. In our lab we have been working for some years now on several projects, and the most recent one is trying to convey backhaul communication through satellite during off‑peak hours. As is well known, satellite communications are very expensive and at times inconvenient, but in this case we are exploring the possibility of providing satellite communication in off‑peak hours and time‑shifting requests during the day, using specific technologies for providing better service within the community network. And these services are listed in this slide. So we are trying to leverage information‑centric networking, massive caching and locally managed services, software‑defined networking, delay‑tolerant networking and white spaces, so we can convey a better service within the community. So the aim is to lower the costs, which are the main block to getting these communities connected to the Internet.
Why is ICT different in non‑developed regions? Because we don't have proper national and international bandwidth; the affordability of services and devices required to access information and communication technologies; the instability and/or lack of power supply, which is sometimes unpredictable, so not having a proper schedule of when we can count on proper power supply, for example. The scarcity of qualified staff, and the added problem that once the staff is qualified and trained, they are willing to leave their communities. The existence of policy and regulatory frameworks that hinder the development of alternative models, since they are thought for companies that have enough money and resources to invest in getting people connected.
And another interesting aspect is the rural area: what is a rural area? It's something on which we have just lately come to a consensus, on what, or at least what the common characteristics of, rural areas are, since it varies from country to country. So let's say that we are talking about rural areas in this respect, as in the slide: low per capita income, making communications non‑affordable for communities where the price of connecting to the Internet exceeds 5% of their income; the scarcity or absence of basic infrastructure; the low population density and the distance between population clusters, so it is even more difficult to get these communities to collaborate so they can share their backhaul, which actually is high in cost.
Under‑developed social services such as healthcare and education; lack of adequately educated and trained technicians, who leave once they are trained, and training is a process that takes years. Harsh environments leading to failure in electronic communication devices. So striking the sweet spot in providing the proper hardware in these conditions is difficult because it increases costs.
Then, considering these five characteristics, promoters, purpose, governance, technologies employed and scenarios, in the next slides I am going to show you what the main tendencies of these networks are and how they can be grouped into different types of network, if I may say.
So the promoters range over public stakeholders and academic entities; the purpose is to reduce costs, both in capital expenditure and operational expenditure. In governance we have seen both centralised and distributed; however, distributed is the preferred one, and nowadays we can count on good publications and research experience on the largest community network, hosted in Spain, in which they have passed from a fully distributed governance model to nowadays having a small, centralised authority that is in charge of solving conflicts between the different business scenarios and different communities, since, as you may imagine, 30,000 nodes is a considerably large network nowadays. So, many different cases for extending the network are nowadays present in this specific network. And we expect that all the networks will follow the same example in the near future. So, technologies employed: most of these networks count on wireless technologies such as wi‑fi in the ISM band, but they are not limited to that, and we have found other examples such as 802.16, 802.22 in white spaces, and low‑cost optical fibre. It's not limited to rural areas; we can find alternative deployments in urban areas, but the most common one is in rural. So this is how it looks: we have three main actors in the alternative networks movement. These are the community of users, providing community networks; the private companies, wireless Internet service providers that are actually focused on providing service to rural areas; and public stakeholders such as universities, executing projects that allow us to understand how people willing to connect in these under‑served areas behave and how we can make things better.
When they mix, say, the most interesting one is the first one here: we have two different organisations. The first one is cooperative; depending on the size of the network, they may report economic benefits to a private company, and if they don't they are called cooperatives. So the private company can take the management of this pool of resources contributed by people, but when they provide some benefit, then they pass ‑‑ we have named them shared infrastructure. And having the collaboration of people, maybe including wireless Internet access through wi‑fi boxes, has also been the subject of academic projects and of interest to private companies, and that is what we call the crowd‑shared approach.
So these are the main characteristics on the whole set of type of networks within this paradigm, if I may call it like that, of alternative networks, and then we can say that the salient characteristics are that these are to serve under served areas, and mainly distributed in governance, wireless is the desired technology for connecting the nodes and they are present, all of them are present in rural areas, some of them are not present in urban areas.
So I would like to thank you for your attention and just to end my presentation with this quote from Professor Ermanno, who holds the world record for long distance wi‑fi transmission, which is 384 kilometres with wi‑fi, and he also holds the Postel Service Award on behalf of the Latin American networking school. When I asked him what the main obstacle was in conveying Internet to under‑served areas, he said it's the fear of failure of these people that makes it more difficult to connect them. So I thank you very much for your attention and I am willing to take your questions.
MARCO HOGEWONING: Wonderful overview of what is going on. It's a bit off from what you're presenting but I am curious if you have thought about it ‑‑ Marco, RIPE NCC ‑‑ you mentioned somewhere early on in your slides that scalability is one of the fundamental problems you are searching for, and most of these technologies you built are using unlicensed spectrum and regular wi‑fi. Now, given the trouble I already have at home to get proper wi‑fi with all my neighbours crowding that, what would you say is the ‑‑ unlicensed spectrum seems to be a really scarce resource; would that have an impact on these types of networks, the fact that you are really overcrowding a relatively small piece of the spectrum?
ANDRES ARCIA‑MORET: Yes, OK, thank you for your question. I mean, definitely it's going to have a good impact; congestion, especially, the ISM band is congested, and not only the ISM band. That's the purpose nowadays of ‑‑ or that's what we would like to do by investigating and promoting white spaces, and there is a whole movement around trying to claim this; for example, a lot of spectrum in the UHF band is simply not being used, right? Regarding the wi‑fi question, we have conducted several studies, so I can point you to specific papers that we came up with, in which we have studied the impact of the congestion from the nomadic perspective: users going into low populated areas, or not that crowded areas in terms of access points, and very congested ones. Yeah, there will be one point of saturation, because we can characterise this behaviour as chaotic; there is no order in deploying new equipment. Yes, definitely it's going to have an impact. There is research going on trying to tackle this; there are systems intended to store the spectrum activity and try to decide what is best for the users and which channel to use, but this is under research, there is no specific product I can point out.
BENNO OVEREINDER: Not a technical question, actually. So I understand you get some funding from the EU, but how does it work in practice, so if ‑‑ is it ‑‑ is it the community asks for, we want to have interconnection and they come with a request? How do you find funding for that and get expertise at that location? Do you depend on ‑‑ maybe it's a call for help here, so do you need community input to help?
ANDRES ARCIA‑MORET: Yes ‑‑
BENNO OVEREINDER: To help to develop these kind of community networks?
ANDRES ARCIA‑MORET: Yes. So there are ‑‑ possible answers to your ‑‑ I am trying to think of the most effective. Actually the project I am involved in, the main project, is rearchitecturing the Internet for everyone; it is trying to put these many technologies into one box, right? And to make it as low cost as possible so that eventually we can spin off a company and then sell this product, right? However, this is still a research aspect; we are developing the technology, the project has just been running for one year so there are two more years to go. But we go to places, specific places, for example in Africa, in Latin America, and then, somehow, we get in contact with communities, and then we help them build their communities; say, we can go for two weeks and provide the needed education and make them independent communities so that they can build their network themselves, and eventually they are going to establish that relationship with us and then we can advise on what to do. But the intention is to provide them with their own tools to build the networks.
DANIEL KARRENBERG: Time for one last question.
SPEAKER: Vesna. I read the paper and there is a very short mention of IPv6, and most of these networks do not use IPv6. Can you say a little bit more about that?
ANDRES ARCIA‑MORET: IPv6 is a challenge, I am not an IPv6 expert at this time. It's a challenge. It's something that has been considered, as it should, in guifi.net for example, but that is not the case for all the networks; there are so many obstacles just to get people connected that we try to avoid the extra complexities of having the network connected. It's what I perceive from this. OK. So thank you very much.
CHAIR: Thank you. The next presentation is another research project, I don't see the presenter. OK. Perfect. So the other research project made with crowdsourcing, the stage is yours.
ANNA MARIA MANDALARI: Hello, I am from university Madrid. Today I would like to present my work, informing protocol design through crowdsourcing. This is part of a European project called METRICS, an Initial Training Network, and this work was done together with my advisor Marcelo and Andra. And first of all ‑‑ it doesn't work. So the Internet access has successfully enabled multiple waves of innovation: mobility, heterogeneity of devices, video communication and VoIP, etc. And when I started my PhD my advisor showed me this. OK, what do I need to do? And he told me, you have to measure this. And what is that? This is the Internet in 2010. And I said, OK, I will go back to Italy. So, the Internet is changing dramatically in terms of number of devices, and this is a map of the Internet in 2010; as you can see, each link is a connection between autonomous systems, each star is an autonomous system, and it's a mess, nobody knows what is happening in the middle. So the goal of my research is understanding whether the Internet is actually ossified, because today many aspects seem to be set in stone. Actually, personally I love a paper called "Why the Internet just works", and it's true, because the Internet just works and nobody knows what is happening in the middle, and the most criticised are middle boxes. So middle boxes might filter the traffic that does not conform to this behaviour.
So the question in my research is: how will the Internet react to a new protocol? And whether it's possible to understand the interaction of middle boxes active along the path. So I said, OK, it's easy: you just have a client, you have a server, the middle, you have data, you process the data, and it's done. But the problem is how to measure a thousand end users. So you have two possibilities: the first one is to have a friend at Google or any other large Internet service provider, or you can get your code to run on 1,000 users' machines, and in this case you have a lot of existing large‑scale measurement platforms like, for example, RIPE, SamKnows, BISmark or PlanetLab, but again you have some limitations. For example, the limited and often special position of test‑bed nodes, no possibility to deploy your own test, fixed line only, and also the access to the results.
So I would like to present this new approach to measurements in the Internet, that is, a crowdsourcing platform. A crowdsourcing platform is a web page on which everybody can be employer and worker. An employer pays workers to perform some task. So you can perform measurements by paying people to perform your measurements. And I would like ‑‑ for example, this one is a campaign that I set up, and I pay in this case 25 cents per task, and people needed to go to my web page, answer some questions and run measurements from their computer.
So, I would like to present to you these case studies to understand how to use this methodology to do Internet measurements, and I prepared three case studies: the case of pervasive encryption, TCP fast open and HTTP/2.
Let's start with the case of pervasive encryption. Why this topic? Facebook, Google move from HTTP to HTTPS; increasing TLS traffic in the Internet; the challenge is to provide encrypted communication by default, right? So, I would like to understand the feasibility of pervasive encryption in the Internet and understand the interaction of middle boxes along the path with the ports that are normally used by plain text protocols.
So we did these measurements using http and TLS connections to 68 different ports and set up measurement agents that are the worker and measurement server over 68 different ports.
So we use a simple LAMP model and we also perform packet capture server side. But OK, with a crowdsourcing platform you can perform your measurements but there are some limitations: you can choose the country in which you want the task performed, but you don't know if users are using a mobile phone or a computer. So in this case, we ask the workers to answer some questions; for example, users connected from a fixed line indicate the place from where they are connecting, hot spot, home, and users connected from a mobile line indicate the technology they are using, 2G, 3G, 4G. In the end we also collect and maintain data, so all the data from users, and in the background a TLS connection is performed to the server over 68 different ports.
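As a rough illustration of the background measurement described here, the following is a minimal Python sketch, not the campaign's actual code (the workers ran their measurements from a web page). The host name in the trailing comment, the short port list and the function names `probe_tls` and `failure_rate` are assumptions made for the example. It attempts a TLS handshake on a given port and summarises how many attempts did not complete.

```python
import socket
import ssl


def probe_tls(host, port, timeout=5.0):
    """Attempt a TLS handshake to host:port.

    Returns 'ok', 'tcp_failed' (no TCP connection) or 'tls_failed'
    (TCP connected but the TLS handshake did not complete).
    """
    context = ssl.create_default_context()
    # Certificate checks are switched off on purpose: the question is whether
    # the handshake survives the path, not whether the test server is trusted.
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    try:
        raw = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return "tcp_failed"
    try:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            tls.version()  # handshake has completed at this point
        return "ok"
    except OSError:  # ssl.SSLError is a subclass of OSError
        return "tls_failed"
    finally:
        raw.close()


def failure_rate(outcomes):
    """Fraction of probes that did not end in a completed TLS handshake."""
    if not outcomes:
        return 0.0
    failed = sum(1 for outcome in outcomes if outcome != "ok")
    return failed / len(outcomes)


# Hypothetical usage (the study probed 68 ports on its own server):
#   outcomes = [probe_tls("measurement.example.org", p) for p in (25, 80, 443)]
#   print(failure_rate(outcomes))
```

Separating the probe from the summary keeps the networking part replaceable; a browser-based client, as used in the campaign, would only need to report the same three outcome labels.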
So, we collected in total for fixed line more than 1,000 workers from 53 different countries; we had 286 autonomous systems. For mobile networks we checked from the user agent that they were connected using a mobile phone, and we collected almost 1,000 workers from 45 different countries in 183 autonomous systems. And the data is freely available on my personal web page, so if you need the data you can go there.
And for this case study we have the following results. We split the results for fixed line and mobile network, and we see 25% of users are not able to perform a TLS connection over port 80, and we would like to understand why. We see that users that are using a proxy are not able to perform a TLS connection on port 80, and we also process the packets and see the SYN is missing 90% of the time.
Let's see how to apply this methodology to another case study, measuring HTTP/2. It's a new protocol, it's an evolution of HTTP and it has the following features: it is binary instead of textual and it is fully multiplexed, it allows you to use one connection for parallelism, it also allows compression, and it has new features that are push and priority. Nobody knows how to use them but they exist.
So the discussion about HTTP/2 during its deployment was about to encrypt or not to encrypt HTTP. Encryption can come for two reasons: the first reason is security, that's OK. The second reason is middle box compatibility. In that, the Working Group no ‑‑ you have HTTP/2 encrypted using ALPN and HTTP/2 in clear using HTTP upgrade. But browsers like Google Chrome or Firefox use ‑‑ We did this experimental set‑up, a campaign using this crowdsourcing platform, using a browser and an Android app, so you can create your Android app, put it in the Play Store and on Microworkers, and you can say to workers, download this app and perform the measurements.
So we perform HTTP/2 encrypted using a normal browser, and we use the Android app to test HTTP/2 in clear and HTTP encrypted without ALPN and without upgrade. And we also check that workers are not using wi‑fi when doing the test for mobile networks.
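For readers unfamiliar with the "HTTP upgrade" path mentioned here, the sketch below shows what a cleartext HTTP/2 upgrade request looks like on the wire, following the mechanism in RFC 7540 (section 3.2). It is an illustration only: the function names and the example host are assumptions, and this is not the code of the Android app used in the study.

```python
import base64
import struct


def settings_payload(settings):
    """Encode HTTP/2 SETTINGS as pairs of 16-bit identifier and 32-bit value."""
    return b"".join(
        struct.pack("!HI", ident, value) for ident, value in sorted(settings.items())
    )


def h2c_upgrade_request(host, path="/", settings=None):
    """Build the HTTP/1.1 request that asks a server to switch to cleartext HTTP/2."""
    payload = settings_payload(settings or {})
    # HTTP2-Settings carries the SETTINGS payload as base64url without padding.
    token = base64.urlsafe_b64encode(payload).rstrip(b"=").decode("ascii")
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: Upgrade, HTTP2-Settings",
        "Upgrade: h2c",
        f"HTTP2-Settings: {token}",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def upgrade_accepted(response_head):
    """True if the server answered 101 Switching Protocols."""
    status_line = response_head.split(b"\r\n", 1)[0]
    parts = status_line.split()
    return len(parts) >= 2 and parts[1] == b"101"


# Hypothetical usage: SETTINGS_MAX_CONCURRENT_STREAMS (identifier 0x3) set to 100.
#   request = h2c_upgrade_request("example.org", settings={0x3: 100})
```

A middle box that rewrites or drops the `Upgrade` and `HTTP2-Settings` headers would make the server answer with plain HTTP/1.1 instead of 101, which is one way an "error" in the cleartext tests could show up.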
So this is the set‑up, it is really simple, and workers put the campaign ID and the worker ID that gets directly from the crowdsourcing platform and then go to the code to be paid and they copy and paste the code to the crowdsourcing platform.
So this is an example of the campaign. In the case, so the cost depends on the ‑‑ how long is the task. So for example in this case in which people needed to download the app, I pay 40 cents for workers. So the data set is the following: We recruit more than 600 workers from 38 different countries and we tested 40 different Internet service providers. Again if you would like to have the data you can go to my personal web page. And the results are the following:
So for the browser, of course, you cannot perform HTTP/2 encrypted on port 80; we have an error rate of 5%, and it's not possible to deploy HTTP/2 in clear on port 80, because with upgrade we have an error rate of 7% and without upgrade we have an error rate of 5%. So it seems that, ironically, you need to encrypt HTTP/2 to currently make it work.
Another case study, which is a little more complicated to develop on Microworkers but you can do it, is TCP fast open. It's an extension of TCP that allows you to send data directly in the SYN, so you can gain an entire round‑trip time. In the first connection, which is a simple handshake connection, the client sends the SYN plus the TCP cookie request; then the server sends you the cookie, which you can use in the next connection to send data directly in the SYN. In the next connection the client sends, in the SYN, the cookie that it saved before plus application data, and you receive the data directly.
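The two-connection cookie exchange described above can be modelled in a few lines. This is a toy simulation of the server-side decision only, under stated assumptions: the class and method names are hypothetical, and an HMAC over the client address stands in for whatever cookie function a real TCP stack uses. It sends no packets.

```python
import hashlib
import hmac
import os


class TfoServer:
    """Toy model of a TCP Fast Open server's cookie handling (no real sockets)."""

    def __init__(self, secret=None):
        # Secret known only to the server; random unless supplied for testing.
        self._secret = secret or os.urandom(16)

    def _cookie_for(self, client_ip):
        # Assumption: an HMAC truncated to 8 bytes stands in for the real cookie function.
        mac = hmac.new(self._secret, client_ip.encode("ascii"), hashlib.sha256)
        return mac.digest()[:8]

    def handle_syn(self, client_ip, cookie=None, data=b""):
        """Return (cookie sent back in the SYN ACK, data accepted from the SYN)."""
        expected = self._cookie_for(client_ip)
        if cookie is not None and hmac.compare_digest(cookie, expected):
            # Valid cookie: data carried in the SYN is passed to the application.
            return expected, data
        # Cookie request or invalid cookie: grant a cookie, ignore any SYN data.
        return expected, b""


# First connection: plain handshake, the client only asks for a cookie.
#   cookie, accepted = server.handle_syn("192.0.2.1")
# Second connection: the client presents the cookie with data in the SYN.
#   _, accepted = server.handle_syn("192.0.2.1", cookie, b"GET / HTTP/1.1")
```

Binding the cookie to the client address is what lets the server accept data before the handshake completes without remembering each client individually.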
So for this case study, we created a tool called TCP fast open explorer: a TCP fast open client and server and a script, and workers need to download the script and run the code on their computer, and they did it. So we pay a little more, and we perform TCP fast open connections over 68 different ports. In this case you need administrator privileges, and we use a simple LAMP model and perform packet capture server side and client side. And unfortunately, workers on this platform are normal people so they use Windows, so we recruited only 50 users that actually have Linux, and this is the data set. Anyway, we performed it from 18 different countries, with 22 different Internet service providers. And you have the data set freely available.
So the result: we looked for the following results. The first one is that the TCP fast open client, the worker, is able to perform a TCP fast open connection normally, so everything went fine, so you have the SYN, the SYN ACK and everything. Another result may be that middle boxes strip the unknown TCP option and we receive the SYN without the option. Another situation is that middle boxes drop packets with an unknown TCP option and we do not receive the SYN at all. Another situation is that middle boxes remove the data from the SYN packet and we receive the SYN without data. So the results are the following: 39% of users are not able to perform a TFO connection over port 80. We also test the case in which we send data directly in the SYN without the TCP fast open option, and we see 50% are not able to perform a TFO connection over port 80.
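The four situations listed above map naturally onto a small classifier over what the server-side packet capture saw. The sketch below is an assumption-laden illustration: the `ServerSyn` record and the outcome labels are made up for the example, and it presumes the client sent a SYN carrying both the fast open option and data.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ServerSyn:
    """What the server-side capture saw for one client SYN (hypothetical record)."""
    has_tfo_option: bool
    payload_len: int


def classify_tfo(observed: Optional[ServerSyn]) -> str:
    """Map a server-side observation to one of the four outcomes.

    Assumes the client sent a SYN with the TCP fast open option and data.
    """
    if observed is None:
        return "syn_dropped"       # the SYN never reached the server
    if not observed.has_tfo_option:
        return "option_stripped"   # SYN arrived, option removed on the path
    if observed.payload_len == 0:
        return "data_stripped"     # SYN and option arrived, data removed
    return "ok"                    # SYN arrived intact
```

Comparing the client-side capture with the server-side one is what makes this classification possible; with only one side, a stripped option and an option that was never sent look the same.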
You can perform every kind of test but you have some limitations, of course. For example, you can perform tasks using a browser and you can get thousands of users in a few days, and you can pay from 5 cents to 25 cents of a dollar, but you have the limitation that you are using a browser, so you can test, for example in the case of HTTP/2, only HTTP/2 encrypted. Or you can create your Android app but pay more, get 100 users in a few days, and you pay, depending on how long the task is, 40 cents or one dollar. And you can create a Linux app, but in this case you can get only dozens of users in some weeks, and you need to pay one dollar minimum or 80 cents. But you can ‑‑ you can use ‑‑ in this case.
To conclude: to have people download code, you can use Google Play and it's fine because people are comfortable with it. For Windows, I tried some tests but people don't download code for Windows. On Linux we got only 50 users. The best moment to set up a campaign is during the weekend, I don't know why, maybe people are off their normal work and they work hard for Microworkers. And watch out, because people can lie. For example, when we set up a survey, we asked the same question three times with different orders of the questions, and we found 60% of workers lie. Also, short tasks are better, of course, because you pay less and workers do not get bored doing your task.
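The repeated-question check mentioned here can be sketched as follows. This is a minimal Python illustration with made-up worker IDs and answers; the function names and the normalisation of answers (trimming and lower-casing) are assumptions, not the survey's actual logic.

```python
def is_consistent(answers):
    """True if all answers to the repeated question agree after normalisation."""
    normalised = {answer.strip().lower() for answer in answers}
    return len(normalised) == 1


def inconsistent_share(responses):
    """Fraction of workers whose repeated answers disagree.

    `responses` maps a worker ID to the list of answers that worker gave
    to the same question asked several times.
    """
    if not responses:
        return 0.0
    flagged = [worker for worker, answers in responses.items()
               if not is_consistent(answers)]
    return len(flagged) / len(responses)


# Hypothetical data: the same connectivity question asked three times.
#   responses = {"w1": ["wifi", "WiFi", "wifi"], "w2": ["3G", "wifi", "3G"]}
#   inconsistent_share(responses)
```

Workers flagged this way can be excluded before analysis, at the cost of a slightly longer task for everyone.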
So, we overcome several limitations of the crowdsourcing platform to perform network measurements. It is probably feasible to roll out TLS connections except on port 80. New protocols at the application layer need to be encrypted, in particular in mobile networks, and we need more vantage points for the TCP fast open connection.
So this is some references. This is my web page and this is the web page on my project, metrics. And thank you. Thank you for your attention. I prepare also some ‑‑
CHAIR: Thank you.
AUDIENCE SPEAKER: Thank you, interesting presentation. Chris. You said 60% of users lie, what about
ANNA MARIA MANDALARI: When we ask to ‑‑ they just put random answers. When we set up the survey to understand if they are actually using wi‑fi or 2G or 3G, we set up three different questions with the same meaning but in different orders, and we found results in which we have different answers for the same question. So this is ‑‑
AUDIENCE SPEAKER: Tom. Thank you very much, I was wondering whether or not you were able to work out from the data where middle boxes were actually interfering with some of these options, whether or not you could infer from the ISP or find out from a press release to find out which pieces of hardware were doing what?
ANNA MARIA MANDALARI: No. I think that with this kind of experimental set‑up you cannot understand which kind of middle box is actually responsible for that. We are working on understanding it; for example, I detected a proxy in the TLS campaign, so I saw that what was responsible for blocking the TLS connection was a proxy. But with this kind of methodology I actually don't know how we can understand whether or not a proxy is responsible. We are working on that.
AUDIENCE SPEAKER: Well, that was the question I was going to ask. Martin Levy from CloudFlare. Is there, in that, trying to understand the middle box ‑‑ and I am going to focus on port 80 encryption, HTTP/2, because that I think is the ‑‑
ANNA MARIA MANDALARI: Hot topic.
MARTIN LEVY: It's the future but I am biased. Is there any understanding, you think, on the part of those middle box providers, the mobile providers, that they are creating a problem? Has this ‑‑ have you done any work in that area?
ANNA MARIA MANDALARI: OK. Actually, no, I think that this is a well‑known problem, right? Because it was amply discussed in the httpbis Working Group. So everybody knows that middle boxes are a problem, but they are necessary, right? So, what we can do is try to understand what the problem is, who is responsible for this blocking, and try to see if it's possible to create or modify the middle boxes to deploy a new protocol in the Internet today.
MARTIN LEVY: It was discussed in httpbis, but no one had done the measurements the way you have, so thank you very much, and please continue.
ANNA MARIA MANDALARI: Thank you.
CHAIR: No more questions, thank you very much.
Going to do the lightning talks slot and the first in the list...
RANDY BUSH: This work was done with a large list of people, excuse me, this is going to be very fast. What tools are popular, what measurements are made; you need to remember these four classes. The built‑ins, the probes towards the DNS roots and anchors, and that is one measurement. When we count measurements, remember that. System users, DNSMON, etc., and then privileged users, which are internal RIPE experiments, and then all the rest of us. The Ops and the researchers, we try to separate them a little, and no personal data were actually used here.
What tools are popular? Here we go from 2015 through 2016‑04, this is one year. DNS dominates in number of results, OK. Followed by ping, followed by traceroute. But if we look at how many users used each, we see whoops ‑‑ we see ping dominates, followed by traceroute; DNS, because it's ‑‑ remember the built‑ins are one user, DNS is down here. So you and I are doing this. How many pings and traceroutes? OK. The built‑ins dominate in pings, OK, followed by system users, followed by normal users. System users dominate the traceroutes, OK, followed by built‑ins, followed by the privileged users, etc.
Can we tell an operator from a researcher? We said, hey, operators tend to either look at their own network or maybe connectivity to another network, etc., so we called them shooters. They had a source measurement to or from a single AS or single number. Sprayers are researchers and go looking: I want to do traceroute to the whole bloody Internet, OK. And if that is not enough, I want to do it to the whole bloody Internet and from. So here are some numbers. Now, remember the built‑ins are a single user and we are counting users here, so the built‑ins are here, and as you get up in this region you are talking about shooters, you are talking about operators. Here are sprayers, here is the number of measurements, that is this scale, so you are talking about, you know, most of the users on the shoot to spray metric are in here, with a tail of operators. And that is because ‑‑ pardon me, I can't say that is because; we conjecture that is because operators make a measurement.
Then, we also looked at the diversity geographically and topologically, and the diversity of measurements. And it's all in our lovely paper which you can dig out.
But what can we do only using the built‑ins? So, the problem with traceroute is, you can't differentiate these, but you can pick them out because in actuality this delay is independent of this delay; you can cheat and use something called the central limit theorem, where you have enough things passing through you can ‑‑ they converge towards a normal distribution, and so we have a band of confidence of what we expected here, and this stuff out ‑‑ lay outside it, and we all remember the end of November. The root attack. But why ‑‑ so we have a similar technique for forwarding changes, why do we look at that particular brutally?
Here is Telekom Malaysia, similarly, so we have a per AS alarm, so you can actually see, oh, this AS was in trouble, and then drill down to the links, same for forwarding, and here is an example where Telekom Malaysia was 10,000 kilometres that way from London, and here we see Level 3 in trouble between London and New York. And the point of the story is research can be operationally useful. Questions? Answers?
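[Editorial illustration, not part of the spoken record.] The confidence band mentioned in this talk relies on the central limit theorem: averages of many independent delay samples are approximately normally distributed, so an average far outside the expected range stands out. The sketch below is a deliberately simplified version of that idea using plain means and standard deviations; it is not the estimator from the paper, and the bin contents, the function names and the threshold `z=3.0` are assumptions chosen for the example.

```python
import statistics


def bin_means(bins):
    """Average the delay samples (in ms) within each time bin."""
    return [statistics.fmean(samples) for samples in bins]


def confidence_band(reference_means, z=3.0):
    """Band of expected bin averages: mean +/- z standard deviations."""
    centre = statistics.fmean(reference_means)
    spread = statistics.stdev(reference_means)
    return centre - z * spread, centre + z * spread


def anomalous_bins(reference_bins, observed_bins, z=3.0):
    """Indices of observed bins whose average falls outside the band."""
    low, high = confidence_band(bin_means(reference_bins), z)
    return [
        index
        for index, mean in enumerate(bin_means(observed_bins))
        if not (low <= mean <= high)
    ]


# Made-up delays: a quiet reference period, then one disturbed bin.
reference = [[10, 11, 9], [10, 10, 10], [11, 9, 10], [10, 12, 9]]
observed = [[10, 10, 11], [30, 28, 35], [9, 10, 11]]
outliers = anomalous_bins(reference, observed)
```

With so few samples per bin the normal approximation is rough; the point of the sketch is only the shape of the technique, a reference band plus a check for values outside it.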
AUDIENCE SPEAKER: Alexander Isavnin, the open Net. I followed the link and read the article; actually it's not as interesting as your presentations because the article doesn't contain Comic Sans. But my question is about studying geographical diversity. When the Atlas project was started, it was stated that with 10,000 probes we would have a good mapping of the Internet. Well, did your study confirm this or deny it, or what about the coverage of the global Internet by Atlas probes?
RANDY BUSH: I didn't cover that here, that is a little deep. If you read the paper, you will ‑‑ the second paper, not the first one ‑‑ no, the first paper, the first one, you will see an analysis of geocoverage; actually we looked at probe diversity both topologically and geographically, and as we all know, they are big ‑‑ there are big geographic gaps, OK, and as we find ways to get more and more useful data out of these probes, I think we are going to find we want many more of them than we thought we would. And like, I know somebody in southeast Asia, not in some place we think of as Internet rich, they have a serious network, they just asked for 200 probes because they want to put one in every province and on every network. And you can ‑‑ if you start seeing results like this, being able to actually see your network performance and see anomalies, they have my sympathies.
AUDIENCE SPEAKER: Thank you.
AUDIENCE SPEAKER: Hi, Brian Trammell, ETH Zurich, unrepentant sprayer. Did you look and maybe if you take this off‑line, on the trace routes look at the difference between ICMP UDP ‑‑
RANDY BUSH: No, we have all looked at that before of course for other things, it's great fun.
AUDIENCE SPEAKER: Thanks.
RANDY BUSH: But we never went past the speed of light. That was a snide remark, I know some of the results of his experiment.
AUDIENCE SPEAKER: Indeed it was, thanks.
CHAIR: Thank you, Randy.
The next one is Anycast DNS, welcome, Dave.
DAVE KNIGHT: Hello, I am Dave Knight, I do DNS at Dyn. So I am going to talk about some work that I did earlier this month with a thing I came up with a while ago, but it never occurred to me to try and measure it until recently. Here is the problem: measuring ‑‑ specifically for me, measuring DNS services provided with Anycast is hard; a single node in the topology can't deterministically probe all of your points of service delivery. The monitoring node is always going to, because of the way routing works, see the topologically closest point of delivery. So when we want to measure Anycast services, or at least to monitor them, we have to employ a bunch of different compromises. One of those is that we could expose the service on a definitively routable address, the management address of the node, but when we do that we are not sharing the same experience that eyeballs would have: a different address, perhaps the routing goes through different equipment. Another compromise: if we have many monitors in the topology, with Atlas or real user measurement, we can potentially hit all of the Anycast service instances, but it's possible, where certain techniques are used which constrain the routing of a particular delivery of the service, that it can be invisible even to the most widely distributed monitoring platform. I have a new, or another, compromise, which is to source the queries towards the service local to the service: so on a node which is running an Anycast service, send the queries to that locally but source the packets from the address of a remote monitoring station. So now the responses coming from the Anycast name server are hitting its proper service address, but the responses will go back to my central monitoring box.
This kind of sounds like what it is, spoofing, but it is taking place across the loopback on the box; no spoofing is happening in the network, no intended or even unintentional violations of BCP 38 or MANRS or anything like that. Spoofing a query: what I am doing is sending the query and prefixing the name with the time in milliseconds, and setting the NSID option, which tells the name server to include its own name so I can identify which name server generated the response, because when I look at the responses I receive, if they all have the same Anycast address in them, I can't otherwise tell which one it came from. I have a collector on a specific port, and the spoofed queries look like they are coming from that port, so they all come back to that collector. And I am guarding against this: if I run the agent on a node which for a time isn't running the particular Anycast service that I want to probe, those queries would then leave the box and hit a different node, which might give me confusing results, so I guard against that so they can't leave the box and go anywhere else. I originally wrote this bit a long time ago, written in Perl. So, deconstructing the responses: I mentioned the NSID, I can see which of my nodes generated the response, and I have encoded the time in the QNAME so I can pull that out, and when I look at the SOA record I can see what the zone was that I was actually testing for and what its serial is. And so the use of this is, at the very least, I now have a heartbeat: I don't need to reach out and probe things, I have a heartbeat that tells me unbidden what is actually working at the edge. I can see changes in the SOA serial for zones as that happens, and most importantly I can subtract the query generation time that I put into the query from the current time when I received the response and get ‑‑ and I can reckon a single trip time between all my nodes and the monitoring station.
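[Editorial illustration, not part of the spoken record.] The timing part of this technique, putting the send time into the query name and subtracting it from the receive time at the collector, can be sketched as below. The speaker's tool was written in Perl and had not been published at the time of the talk, so everything here is an assumption: the label layout (`<milliseconds>.<zone>.`), the zone `example.com` and the function names. The DNS packet handling, the NSID option and the loopback source-address spoofing are left out.

```python
import time


def make_qname(zone, now_ms=None):
    """Build a query name whose first label is the send time in ms.

    The `<milliseconds>.<zone>.` layout is hypothetical.
    """
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    return f"{now_ms}.{zone.rstrip('.')}."


def single_trip_ms(qname, received_ms):
    """Receive time minus the send time carried in the first label.

    The result is only meaningful if the clocks of the sending node and
    the collector are synchronised.
    """
    sent_ms = int(qname.split(".", 1)[0])
    return received_ms - sent_ms


# Fixed timestamps so the example is reproducible.
qname = make_qname("example.com", now_ms=1464098400000)
latency = single_trip_ms(qname, received_ms=1464098400042)
```

Because the timestamp travels inside the name, the collector needs no per-query state and no timeout to compute the latency of whatever response eventually arrives.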
Now, this assumes the NTP synchronization is very good, or at least that it's not fluctuating ‑‑ if I want to see changes in behaviour. In the very limited experience I have had in the last few days, it seems great and works really well without having to make any special effort around NTP. So, scaling this up: as I have described it so far, it's about getting responses to a single monitoring station. Of course I have lots of nodes, so rather than sending the responses all to one place, I have all of the nodes send the responses to all of the other nodes and to themselves. In the experiment that I did, I picked ten of our nodes around the world and switched it on there. And the responses that I get back, it's just the single trip latency time that I am sending into Graphite, and I can graph this in Grafana, and I made a very simple dashboard so that I can select which collector I want to look at, which originating node I want to look at and which particular service running on those I want to look at, and get some graphs. This one is graphing the latency of a node sending responses to itself, and as I hoped and expected the latency hovers around one millisecond, and that seems normal. I cherry‑picked one hour in time to put in these graphs, which represents nothing surprising; I have other pictures which look quite weird, which I don't really understand yet and don't have time to go through in this lightning talk. Here is a picture with all of the nodes sending to one collector, this is in Virginia, so the yellow line right at the bottom there is the node sending to itself, the one above that is another on the east coast of the US, the big clump in the middle is western US and Europe, and the outlier at the top is Hong Kong, so this is exactly what I would have expected to see. I am not sure yet where these periods of bursted jitter are coming from, but this is what we hoped to see and it turned out to be.
This one is inverting that, so it's that node sending responses to all of the others, and it's quite a similar picture, and the interesting things here are, oh, the time going in that direction between that box and Hong Kong is about 25% more than it was the other way around. So that was kind of interesting. And there are two nodes in there which have these big regular spikes, and those are not actually nodes that run this particular service; they provide distribution, they do zone transfers for it, they don't actually receive queries, so these are periods when they do a mass burst of zone transfers out to all of the other nodes, so we can see that other things running on the box definitely impact the latency ‑‑ its ability to respond to queries.
So some thoughts on this. It's only useful for UDP. So far, I have only implemented IPv4 and there is no authentication in this. This is very early stuff I put together just in the last few weeks. So I would like to take this forward and compare it to our more traditional measurements and address some of the limitations and publish the tools. I think it has some advantages; it's yet another of the compromises we put together to get a better idea of how Anycast services work. Because we are not probing anything, if we package the agent with a service, when we turn it up we will automatically get the responses where we are monitoring. And the full mesh case isn't going to scale forever, but to a constrained number of collectors this would scale quite nicely. And one of the interesting things, which I don't have time to show the graphs for, is because there is no state. Typically when you monitor, you want to probe a name server with dig, you don't want to keep lots of digs open forever, so you have timeouts. There are no timeouts here, so in some of the graphs I didn't show we see some responses which arrived several minutes after the queries were sent, so this ‑‑ I think this may be an example of the kind of zombie behaviour that Geoff has spoken about earlier, but these are responses, not queries. And we are not seeing them everywhere; I haven't looked into this in a great amount of detail, but I am reasonably sure this isn't measurement error: the same response that was sent from a node in Singapore to all of the others is sent again five minutes later. Questions?
AUDIENCE SPEAKER: SIDN. You already mentioned Geoff and I cannot wait to see his presentation on the next RIPE about fallouts from this behaviour. But this is very interesting, very good idea and I think I am going to try to convince our people to do something like this as well, but you are still only effectively measuring whether you can send answers and not whether you can receive the queries, right?
DAVE KNIGHT: Right absolutely. Like I said, this is just one more compromise that gets us a little bit closer to a better picture.
AUDIENCE SPEAKER: Blake Willis. Thanks a lot for this and looking forward to seeing it on GitHub.
BENNO OVEREINDER: NLnet Labs. Neat trick indeed, I like it. I have one question: can you go to slide 3, I guess, it's one of your compromises. This one, yeah. So you say here, well, I am also interested: if you used administrative IP addresses, you follow a different path. And maybe in your other slides you ‑‑ next slide ‑‑ and then I think, okay, you are also interested in the catchment of your Anycast node, and if you go to the ‑‑ your final solution, you actually spoof the address, and then from the different instances of your Anycast network you send the answer back to the monitoring or the measurement, and indeed it follows the path to that instance, but these instances have different catchments, so the eyeball is not necessarily in the catchment of one of these Anycast instances, or did I miss some nuance ‑‑
DAVE KNIGHT: That is exactly right. This is a weird behaviour, that you wouldn't ever expect to see that these responses get back to this disparate, well distributed range of eyeballs.
BENNO OVEREINDER: Yes, but you're monitoring the nodes and not necessarily measuring the path?
DAVE KNIGHT: Right. That is ‑‑ to be honest, I had this slide at the end talking about how useful is this, it's kind of a question as well as list of my own answers, I would be interested to see what other people think.
BENNO OVEREINDER: I like the idea, thanks.
SHANE KERR: Beijing Internet Institute. I find this interesting. I did have a question, though: do you have any plans to look at fragmentation, send really big answers and see if it's affected? I would expect there wouldn't be any impact, but it would be interesting for me to know if that was true.
DAVE KNIGHT: Right now this was proving the concept and other people mentioned a couple of things that we could start doing as well so I am kind of keen to gather those up and start putting that into the tools and seeing what other things we can find out.
AUDIENCE SPEAKER: I think it's a good project and we use it with Anycast. Can you please see ‑‑ show me the next slide, and I ask you: what is so proprietary, special ‑‑ the one after that ‑‑ so, nothing really Dyn‑specific, can you put it on GitHub at some point?
DAVE KNIGHT: Absolutely, that is the intention.
AUDIENCE SPEAKER: And the v6, is it because Perl doesn't support v6?
DAVE KNIGHT: No because I literally did this about two‑and‑a‑half weeks ago, and I was in a hurry to put something together and I just didn't do v6 yet.
AUDIENCE SPEAKER: Please share, thanks.
AUDIENCE SPEAKER: This is Dimitry from Greece. You mentioned that you do this source address spoofing. I am wondering whether RPF works? I mean, typically RPF catches packets which do not have a routing entry in the upstream router.
DAVE KNIGHT: What I had to do, in order to be able ‑‑ in order for the collector to receive the responses, I had to set reverse path filtering to loose mode and set accept_local to true.
AUDIENCE SPEAKER: Thank you.
CHAIR: Thank you very much, Dave.
The last presentation in this session is called preparing for DDoS attack, by Ronan Mullally.
RONAN MULLALLY: From Akamai; I was previously ‑‑ until Akamai acquired us about two years ago. I am aware this is all standing between you and a nice Copenhagen evening. This is largely a number of observations based on a few years in Prolexic and a daily litany of DDoS attacks. It is effectively another talk about the Cloud, I'm afraid. Isn't it lovely and white, look how fluffy it is, it's wonderful, but some days this is what the Cloud will do to you. Have you got one of these? So, if you are going to put yourself in a situation where you need to defend yourself against a DDoS attack, the first thing you have got to do is have a plan and be prepared. You need to know what is available, what tools you can use, who you can call for help; your upstreams and your peers are likely to be a key component in that. You need to know what you can do yourself, in terms of ACLs and routing and stuff like that. It's important to be nimble, to be agile: if you put low DNS TTLs on likely targets, you can update and send traffic somewhere else pretty quickly. If you put likely targets on IP ranges that are independently routable, you can modify your routing such that the attack traffic will come in on a particular path or not. And you should put together routing policies that enable you to adapt your routing very quickly, so, a set of communities that you implement across your network, and that may trigger things like AS path prepending or suppression or no‑export announcements and stuff like that. You need to be aware of what is going on in your network. Everything from the packets in and bits in on interfaces; you can pull stats from NetFlow, sFlow, and many peering portals offer statistics on a per peer basis. You can run span ports, fibre taps, that kind of thing, to get per‑packet visibility of the traffic as it flows through your network. It's important you know what is going on in your network, otherwise you are basically blind and trying to defend yourself in the dark.
It's also useful to know what is going on downstream: is there a reason why your customer is getting 43 gigs of NTP, but ‑‑ if that is the case, it becomes very simple to block traffic that is anything else.
Another key consideration that sometimes is overlooked is your DNS infrastructure; if your DNS is broken, so are you. There are many networks, many customers out there that take the approach here on the left, which is, you know, a very simple set of authoritative name servers; they may well be Anycast, but it doesn't take all that much effort to knock those name servers off‑line, at least on a regional basis. The ‑‑ results on the right: ten or twelve authoritative name servers, all Anycast as well, so you have hundreds if not thousands of servers there that need to be taken off‑line before you have any impact on DNS resolution for that zone.
Thinking a firewall will save you isn't going to help you, because anything that keeps state is going to fall over; it's trivial to generate enough different flows with different source IP addresses that you will fill any state table in seconds. A similar slant on that: rate limiting isn't particularly going to work either, because you end up contending for the capacity available, and any legitimate traffic that is in that bottleneck is going to suffer just as well as the malicious traffic. If you can put yourself in a situation where you have multiple paths into your network, you have many more options in terms of how you defend; you give yourself a broader attack surface, you can use selective route announcements to bring traffic in on transit or not, and instead on peering. Depending on who you are or where you are and who the target is, you may find that withdrawing the route from transit, or not routing the route upstream, and maintaining normal peering connectivity may actually avoid most of the impact of an attack.
Null routing is a ‑‑ null routing is something of a double‑edged sword. From the ISP's point of view, you do it to save yourself and defend your network, to sink large flows and to maintain service for everybody else. If you are the customer who is on the receiving end, it's game over; you have accomplished the attacker's intention. It amounts, basically, to capitulation rather than mitigation. You can source and install mitigation appliances; there are many vendors that offer them, you can build your own using many tools, and they will have a certain capacity. Some of them come with Cloud capabilities so you can leverage resources upstream. They can be useful, they can be expensive, you still need to have capacity in place to bring the traffic to the appliance in the first place, and not every attack will suit every box, and ultimately if you are going to mitigate an attack you need people who know how to use them and how to deploy them. Some day you are going to need a bigger boat: if it's good for ten gigs and it's an eleven gig attack, you are out of luck. So there are options you can use to basically pass the effort of mitigation to somebody else, you are trying to ‑‑ transit providers may have solutions, or you can use alternatives like CDNs or DDoS mitigation services. A CDN basically pushes your content out to a vast server footprint, which ultimately absorbs a lot of the traffic and you never even see it. They don't, however, suit all types of traffic. DDoS protection services place themselves between the malicious traffic and the end user, so the only way to get to the end user is through the protection service. And they will generally pass clean traffic back across the Internet or via a direct path to the origin.
Both of these are more, you can look at them as over the top services rather than in line services, they are typically things that the targets of an attack will avail of directly from a third party provider rather than something a network might offer directly to their customers. So, that's it. In summary here of ultimately what has been discussed. Are there any questions?
AUDIENCE SPEAKER: Hi, Jim Reid, speaking for myself. I think it's a very good checklist. One thing I think might have been missed was the fact that DNS itself can be affected by these kinds of DDoS attacks, particularly with DNSSEC‑enabled responses, so having solid DNS is all very well, but what are you going to do if your DNS servers are the target of these attacks?
RONAN MULLALLY: Yes, that is a common factor and it's one we see quite a bit. That gets you into a scenario where you are doing more active mitigation, and that is where you potentially need to look at either having the hardware yourself or a service that can do it for you.
JIM REID: As I am sure you are well aware, there have been lots of these DDoS attacks against important name servers, root servers, TLDs, but if these start moving further down the food chain into end users' networks, or my domain.com equivalent, the structure might not be as robust.
RONAN MULLALLY: Yes.
CHAIR: Any more questions? Thank you.
All right. And before you go, I just want to let everyone know, first off, please rate all the talks, that is how we can continue to make an even better RIPE every time. Secondly, RACI is going to be in this room at 6 p.m. with more awesome research presentations so if you are not there, you should instead be at the BoF that is happening at 6 p.m., which is talking about middle box behaviour. So all right. Bye.