
PyCon 2008 Wireless Network

By Sean Reifschneider, March 19, 2008

Also see my related articles on the networking I set up for PyCon 2007, 2008, 2010, and 2012.

The network at PyCon this year reminds me of a lawyer joke that Van Lindberg told. Question: What do you call a ten-fold increase in bandwidth at PyCon? Answer: A good start.

This year I really pushed the idea that the hotel upstream bandwidth was just not even worth considering, which is good because this hotel had a third less bandwidth than last year's: only two T1s providing 3mbps (and that's shared with the guest rooms as well).

I pushed for DS-3 level bandwidth at 45mbps, and we were able to get nearly that at 40mbps or around 5MB/sec.

The bad news is that we outsourced the wireless and it just really wasn't up to the challenge. By Saturday morning we were able to get the network to a point where it was "serviceable", but the first two days were pretty rough and it never was really what I'd call good.

Last Year

The networking for PyCon 2007 I consider to have been good, though usually when I say that someone who was there corrects me that it was great. Maybe I'm just being humble, but I think it could have been improved a bit -- largely through the use of more APs.

Just as an overview, we had a dozen dual-radio APs, running at very low power, spread out around the area. The big pain point we had last year was not enough bandwidth.

This Year

I pushed really hard for 45mbps-level service, and Carl Karsten was able to pull off a last-minute wireless connection at 40mbps. It was rock solid, just great. This was probably the greatest success we had this year.

The wireless was outsourced, and the company doing it and the equipment they had selected really weren't up to the task. The first two days were very rough. By the third day, things did stabilize a fair bit, to the point where the users who were still carrying their laptops (I know some had given up) were living with what they had. Really, during the tutorials and the first day it was largely unusable.

We had few if any wired drops because of the design of the wireless network, so people didn't have the option of plugging in except in the atrium (where there were no talks). So if you really had to get on, you could go out there and get rockin' networking, but otherwise it was passable at best.

To be fair, during the sprints when we only had 235 users total, and they threw all the hardware at covering these users, it worked fairly well. However, that's easy to do -- we were all spread out and had very few users.

The wireless was outsourced to an external provider who, as far as I know, really had no experience running wireless networks, let alone something this big. They are a vendor of wireless hardware, so their job is primarily selling hardware to people who do have experience running this sort of network. But they really didn't do their homework on it: it wasn't until Thursday night that they realized the equipment they had put the 400 tutorial attendees on could only handle 200 users.

Statistics

Well, sadly, one of the problems with the wireless this year was that it (or the company we outsourced it to) was unable to provide us with any statistics. We don't know how many users we had on at peak, or the breakdown per protocol, or any of the other useful statistics we got last year.

This really hurts because one of the important things we could have gotten this year was the trending information to allow us to project for next year. In particular, we wondered if we might have had increased penetration with more users carrying a device and more users carrying two devices (like a laptop and a wifi-capable phone or PDA).

My impression was lots of people had two laptops: XO+laptop, Nokia tablet+laptop, Eee PC+Laptop, or those freaks like me with laptop+laptop. :-)

The Upstream Network

We had a great 40mbps (5MB/sec) link from the hotel, provided via a long-haul terrestrial wireless link from Business Only Broadband. It worked great. Here's a graph, with samples taken at, apparently, half hour intervals:

The top of this graph is our peak bandwidth. The bright red line towards the bottom is the peak of the bandwidth we had last year -- clearly not enough even if we imagine last year was nearly half the usage. We had plenty of headroom.

Note that on Thursday we had only 400-ish people, and on Monday we had around 250. The sprints saw pretty heavy usage of the network.

Now, part of this may have been that we were limited by the wireless network. Lots of users had problems getting on, particularly on Thursday and Friday. Note the keynote on Friday morning compared to Saturday. Usually these are the heaviest use periods, but Friday was not particularly heavy. Also realize that a single user on the wireless is limited to around half the upstream bandwidth, because 802.11G and A only provide around 2.5MB/sec of real-world throughput. So each "cell" is limited to at most 2.5MB/sec, and probably much less because of contention.
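
Just to spell out the arithmetic behind that (rough numbers for illustration, not measurements):

    # Back-of-the-envelope figures for the 2008 network.  The per-cell
    # throughput is the rough real-world estimate used above, not a
    # measured value.
    UPSTREAM_MBIT = 40                      # terrestrial wireless uplink
    UPSTREAM_MBYTE = UPSTREAM_MBIT / 8.0    # ~5 MB/sec
    CELL_MBYTE = 2.5                        # realistic 802.11G/A cell throughput

    print("Uplink: %.1f MB/sec" % UPSTREAM_MBYTE)
    print("One busy cell tops out at %.1f MB/sec (~%.0f%% of the uplink)" % (
        CELL_MBYTE, 100.0 * CELL_MBYTE / UPSTREAM_MBYTE))
    print("It takes ~%.0f saturated cells to fill the uplink" % (
        UPSTREAM_MBYTE / CELL_MBYTE))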

We seem to be leap-frogging. The first year we had bad wireless so we couldn't use the upstream. Then we fixed wireless and realized we didn't have nearly enough upstream. Now we fixed the upstream and the wireless was problematic again...

The Wireless Problem

The company that we outsourced this to was using a "wireless switch" solution from a vendor named Extricom. The wireless switch approach is the "darling" of the wireless industry right now, and for low densities it's probably pretty nice. However, for handling high densities of heavy users it apparently has some serious problems.

The Extricom system basically acts like a single giant AP. There is a switch, similar to a regular Ethernet switch, with a number of ports (8 and 12 in the models we used) that connect to "dumb" APs, offloading the intelligence to the switch. The switch can then coordinate things like allowing only one radio to transmit at a given time, reducing the contention you would otherwise get between multiple independent APs.

Here are some of the problems I believe we ran into. Some of this information is second-hand from the reseller who was installing and deploying the system; some of it is conjecture that comes from just thinking through the design.

Wireless Problem: Cost

One thing about these systems is that they are relatively expensive. If they had just worked then they might have been worth it. However, my understanding is that the hardware for this solution costs well over $1,000 per AP drop. We paid for 18 APs connected to 2 switches, with another "loaner" switch from the vendor and a fourth finally brought in to just stop our network from falling over. Total cost for this hardware was something around $20,000.

In comparison, the wireless we used last year to handle 650-ish attendees cost us $2,500 for a dozen APs. We ended up purchasing another 20 of these APs this year as a contingency plan when the network was still having serious problems on Friday (if we didn't get them then, we had no option for the rest of the conference). This year the APs were nearly half the cost of last year's, so the additional 20 APs cost just $2,550, for a total of 32 APs.
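
The rough per-AP arithmetic from those figures (approximate, not invoiced amounts):

    # Per-AP cost comparison using the approximate figures above.
    extricom_total = 20000.0          # ~18 APs plus the Extricom switches
    extricom_aps = 18
    own_ap_total = 2500.0 + 2550.0    # 12 APs in 2007 plus 20 more this year
    own_aps = 12 + 20

    print("Extricom system: ~$%.0f per AP" % (extricom_total / extricom_aps))
    print("Independent APs: ~$%.0f per AP" % (own_ap_total / own_aps))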

So we could have gotten way more APs for a small fraction of the price. This will become important when I discuss contention in a bit...

Wireless Problem: Expectations

The wireless switched network is the latest and greatest thing. Apparently the company we outsourced to brought in reps from Extricom, the hardware vendor, to review the space and plans. I'm wondering if these were sales people and not the engineering staff.

Since leaving GWU, where we held the first 3 PyCons, we've run into an expectation problem every time we've outsourced the networking. GWU had no problem because we were much smaller, because a university environment sees usage very similar to ours, and because we were much more spread out.

When we outsource the networking we've tried to be extremely clear about our demands. "We pretty much all have a computer, we're computer geeks, and a huge percentage of our users will have them on. We're probably way more than anything you've ever experienced before." That sort of thing.

However, the vendors have always been like "No problem!" We've tried to make it clear that we're likely to smoke test their network, but they always seem to think we're BSing them. I remember 2 years ago saying "Compared to any other 600 people you've had here, we'll be way more demanding on the network." "No problem!"

I think the short form of this is that we just can never outsource the wireless, certainly not without them providing extensive architecture documentation as part of the proposal to demonstrate why they think they can handle it.

In the future I'd propose that we literally say "Every time we've outsourced, the vendor has said No Problem (tm) but every time their gear has practically started smouldering."

Wireless Problem: Spectrum

The company we outsourced to told us that the Extricom gear requires all the APs on a given switch to be on a single channel. This is relatively fine for 802.11B+G when running with 3 switches (at $8k each), because 802.11B+G only has 3 non-overlapping channels.

However, 802.11A has 12 non-overlapping channels. We were only using a quarter of the available spectrum in 802.11A.

The outsourced tech who was doing the wireless network argued with me that the Extricom system was designed to minimize contention, but I pushed back pretty hard on that assertion. We didn't have more than 10 APs on either floor, so with independent APs instead of the Extricom units we could have had zero co-channel contention on the AP side in 802.11A. In this case the Extricom system actually promoted contention, because all the 802.11A laptops had to share 3 channels instead of 12.
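
To make that concrete, here's a toy sketch (my own illustration, nothing to do with the vendor's tooling) of the co-channel contention you get when a floor's worth of APs has to share 3 channels versus 12:

    # Toy channel plan: spread N APs round-robin across the available
    # non-overlapping channels, then count how many other APs each one
    # shares a channel with (potential co-channel contention).
    def cochannel_counts(num_aps, channels):
        assignment = [channels[i % len(channels)] for i in range(num_aps)]
        return [assignment.count(ch) - 1 for ch in assignment]

    BG_CHANNELS = [1, 6, 11]                       # 2.4GHz non-overlapping
    A_CHANNELS = [36, 40, 44, 48, 52, 56, 60, 64,  # 5GHz non-overlapping (US)
                  149, 153, 157, 161]

    for name, chans in (("802.11B+G", BG_CHANNELS), ("802.11A", A_CHANNELS)):
        worst = max(cochannel_counts(10, chans))   # 10 APs on one floor
        print("%s: the worst-off AP shares its channel with %d others" % (name, worst))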

Last year I reported that I didn't have a single 802.11A user who reported problems. This year, only a handful of 802.11A users reported any success using it at all.

In short, last year we had 802.11A available so that users who had that ability could get out of the way and not contend for the much more precious 802.11B+G spectrum. This year, despite almost certainly having more users with 802.11A-capable devices (I couldn't do A last year and could this year, for example), it feels like we had fewer users on it, leading to more contention on the B+G channels.

Again, we had no statistics on the utilization of the different channels or protocols, so I don't have a good read on this at all.

Wireless Problem: Contention

Our vendor also told us that the Extricom units couldn't have their power adjusted; they were all set at 50mW. Last year we ran the 802.11B+G radios at between 5 and 15mW, and spread them around all over, creating lots of little "cells" of communication.

It's kind of like trying to hold conversations in a big room... You could have someone with a bullhorn talking at the whole audience, but then you really only get one conversation at a time. This is a high-contention environment: only one person can talk and everyone else has to wait.

On the other hand, you could have a big room with a bunch of round tables and people talking quietly among the people at their table. Even better, imagine that there are now sound-absorbing partitions put up between the tables... You can now have a whole lot of conversations going on at once.

Because the Extricom units were running at 50mW, it's more like the bullhorn situation.

Last year the architecture I developed ran the APs at very low power in the 802.11B+G spectrum, and placed them physically low so that the attendees' bodies would help to dissipate the signal. We then put a lot of APs around, creating small cells with little interference between them. As more people crowded into an area, contention between APs would be reduced because of these tiny cell sizes.
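
As a very rough illustration of why the power level matters so much, here's a free-space estimate of cell radius at different transmit powers (free space only; walls, bodies, and antenna gain are ignored, so the absolute numbers are optimistic, but the scaling is the point):

    import math

    def cell_radius_m(tx_mw, freq_mhz=2437, usable_signal_dbm=-65):
        """Rough free-space distance at which a client still sees a usable
        signal from the AP.  Relative sizes matter more than the absolute
        numbers."""
        tx_dbm = 10 * math.log10(tx_mw)
        # Free-space path loss at d meters: 20*log10(d) + 20*log10(f_MHz) - 27.55
        allowed_loss_db = tx_dbm - usable_signal_dbm
        return 10 ** ((allowed_loss_db - 20 * math.log10(freq_mhz) + 27.55) / 20.0)

    for mw in (5, 15, 50):
        print("%2d mW -> roughly %3.0f m free-space radius" % (mw, cell_radius_m(mw)))

Covered area scales roughly linearly with transmit power, so a 50mW cell blankets on the order of ten times the floor area of a 5mW cell -- exactly the opposite of what you want when you're trying to build lots of small, independent cells.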

With the Extricom units as provisioned by our vendor, I suspect we had much more contention. If nothing else, we had only 18 total radios, 10 on one floor and 8 on the other, instead of the 32 or 40 we could have had with the independent radios.

Wireless Problem: Proprietary Wiring Runs

This was a huge problem with the Extricom units. Wiring runs in this hotel were fairly scarce; I think we were using all of them for the AP-to-switch runs. Because these were "special" runs, we couldn't put other gear, like switches, on them.

Last year we had a switch available in around half the rooms, so if you had problems with wireless you could get in early and get a wired connection. This year if we did that, we'd lose an AP for each normal wired run we'd set up, which would have just made the wireless issue worse.

Plus, presenters who really could have benefited from wired connections only had them on Saturday, and tutorial presenters didn't have them at all.

WEP

Last year we enabled WEP with a trivial WEP key. This was to try to keep non-conference users from swamping our (rather limited) upstream capacity. With only 3.5mbps available, it would have been fairly easy for a single user to make the network unusable for everyone.
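
For what it's worth, on a Linux-based AP this kind of trivial WEP setup is only a few lines of hostapd configuration. Here's a sketch that generates one stanza per AP; the interface name, SSID, channel, and key are placeholders for illustration, not the values we actually used:

    # Emit a minimal hostapd-style stanza with a trivial shared WEP key.
    def ap_config(iface, ssid, channel, key="pycon"):
        # A 40-bit WEP key given as quoted text must be exactly 5 characters.
        assert len(key) == 5
        lines = [
            "interface=%s" % iface,
            "ssid=%s" % ssid,
            "hw_mode=g",
            "channel=%d" % channel,
            "auth_algs=1",            # open-system authentication
            "wep_default_key=0",
            'wep_key0="%s"' % key,    # the trivial shared key
        ]
        return "\n".join(lines) + "\n"

    print(ap_config("wlan0", "pycon-example", channel=1))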

This year we didn't have WEP enabled, and things seemed fine. However, part of that was because we had the whole venue to ourselves. Last year there were other guests in the hotel the whole time, and even other functions using a number of the meeting rooms while we were there. No such problem this year: we had all available guest rooms in the conference hotel, plus a lot of people in the overflow hotel.

Shaping

Last year I tried to set up shaping, and while it did prevent anyone from hammering the network, it apparently didn't let users borrow unused bandwidth when the link wasn't fully utilized. This year, through a miscommunication, we didn't have a Linux router set up, so we didn't really have the shaping option.

Because we had so much upstream bandwidth, this never seemed to be a problem.
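
If we do put a Linux router back in the path next year, the fix for that earlier shaping problem is to let traffic classes borrow: with HTB each class gets a guaranteed rate but a ceiling of the whole link. A minimal sketch follows; the interface name, rates, and class layout are placeholders, and the filter rules that actually sort traffic into the classes are omitted:

    # Sketch of HTB shaping that allows borrowing of unused bandwidth.
    import subprocess

    DEV = "eth0"        # upstream-facing interface (placeholder)
    LINK = "40mbit"     # total uplink

    def tc(*args):
        subprocess.check_call(("tc",) + args)

    # Root HTB qdisc; unclassified traffic falls into class 1:20.
    tc("qdisc", "add", "dev", DEV, "root", "handle", "1:", "htb", "default", "20")
    tc("class", "add", "dev", DEV, "parent", "1:", "classid", "1:1",
       "htb", "rate", LINK, "ceil", LINK)
    # Each child class is guaranteed its "rate" but can borrow up to the
    # full link ("ceil") whenever the other class isn't using its share.
    tc("class", "add", "dev", DEV, "parent", "1:1", "classid", "1:10",
       "htb", "rate", "30mbit", "ceil", LINK)    # e.g. attendee wireless
    tc("class", "add", "dev", DEV, "parent", "1:1", "classid", "1:20",
       "htb", "rate", "10mbit", "ceil", LINK)    # everything else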

Going Forward

I've been asked by several people if I'd be willing to be paid to do the networking next year. I explained that I volunteered to do it at no charge this year. However, I guess that that was part of the problem. I get the impression that some of the organizers felt they were taking advantage of me by having me do all that work and they wanted to give me the opportunity to enjoy the conference. So next year we may have to charge them. :-)

After careful consideration, I believe that the network architecture I came up with for last year would have been up to handling the task. I had planned on putting out many more APs -- around 3 times as many to handle only twice as many users. The many small cells running at very low power served us well last year, and from my research they have served other conferences well too.

Lessons Learned

Nobody really understands our networking needs like we do. Despite our trying to communicate them very clearly to providers, they always seem to underestimate us. Unless we can find a vendor that can say something like "We designed the network for OSCON/ACM/IEEE with 1500 attendees", we probably shouldn't believe that they can handle the networking.

Using many small independent AP devices is more work to set up, but it provides much more flexibility at a dramatically reduced cost and with likely better performance than the Extricom network that was deployed this year.

With the latest-generation 802.11N enterprise APs now available, the enterprise APs we used last year are even more attractive from a cost standpoint, allowing us to deploy many of them and get smaller cells. I still think this is the way to go.

Shameless Plug

tummy.com has smart people who can bring a diverse set of knowledge to augment your Linux system administration and managed hosting needs. See the menu on the upper left of this page for more information about our services.
