PyCon 2008 Wireless Network
Also see my related article on networking I set up for PyCon 2007.
The network at PyCon this year reminds me of a lawyer joke that Van
Lindberg told. Question: What do you call a ten-fold increase in bandwidth
at PyCon? Answer: A good start.
This year I really pushed the idea that the hotel upstream bandwidth
was just not even worth considering. That turned out to be a good call,
because this hotel had a third less bandwidth than last year's, with only
two T1s providing 3mbps (and that's shared with the guest rooms as well).
I pushed for DS-3 level bandwidth at 45mbps, and we were able to get
nearly that at 40mbps or around 5MB/sec.
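The unit conversions behind those figures are easy to check. This is a
minimal sketch that assumes the standard nominal line rates for a T1
(1.544 Mbps) and a DS-3 (44.736 Mbps), which are not quoted in this
article, and ignores protocol overhead. The function name is just for
illustration.

```python
# Nominal line rates in megabits per second. These are the standard
# published figures (an assumption here), not measurements from the venue.
T1_MBPS = 1.544
DS3_MBPS = 44.736


def mbps_to_mbytes_per_sec(mbps):
    """Convert megabits/sec to megabytes/sec (8 bits per byte, no overhead)."""
    return mbps / 8.0


if __name__ == "__main__":
    links = [
        ("two T1s (hotel)", 2 * T1_MBPS),
        ("wireless uplink we got", 40.0),
        ("full DS-3", DS3_MBPS),
    ]
    for name, mbps in links:
        mbytes = mbps_to_mbytes_per_sec(mbps)
        print(f"{name}: {mbps:.3f} Mbps = {mbytes:.2f} MB/sec")
```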
The bad news is that we outsourced the wireless and it just really
wasn't up to the challenge. By Saturday morning we were able to get the
network to a point where it was "serviceable", but the first two days were
pretty rough and it never was really what I'd call good.
Last Year
The networking for PyCon 2007 I consider to have been good, though
usually when I say that someone who was there corrects me that it was
great. Maybe I'm just being humble, but I think it could have been
improved a bit -- largely through the use of more APs.
Just as an overview, we had a dozen dual-radio APs, running at very
low power, spread out around the area. The big pain point we had last year
was not enough bandwidth.
This Year
I pushed really hard for 45-mbps level service, and Carl Karsten was
able to pull out a last minute wireless connection at 40mbps. It was rock
solid, just great. This was probably the greatest success we had this
year.
The wireless was outsourced, and the company doing it and the
equipment they had selected really weren't up to the task. The first 2 days
were very rough. By the third day, things did stabilize a fair bit, to the
point where the users who were still carrying their laptops (I know some
had given up) were living with what they had. Really, during the tutorials
and the first day it was largely unusable.
We had few if any wired drops, because of the design of the wireless
network, so people didn't have the option of plugging in except in the
atrium (where there were no talks). So if you had to get on, you could
get rockin' networking in the atrium, but everywhere else it was passable
at best.
To be fair, during the sprints when we only had 235 users total,
and they threw all the hardware at covering these users, it worked
fairly well. However, that's easy to do -- we were all spread out and
had very few users.
The wireless was outsourced to an external provider who, as far as I
know, really had no experience running wireless networks, let alone
something this big. They are a vendor of wireless hardware, so their job
is primarily selling hardware to people who have experience running this
sort of network. But they really didn't do their homework on it. It
wasn't until Thursday night that they knew that the equipment they had put
the 400 tutorial attendees on was only able to handle 200 users.
Statistics
Well, sadly, one of the problems with the wireless this year was that
it (or the company we outsourced it to) was unable to provide us with
any statistics. We don't know how many users we had on at peak, or per
protocol, or any of the other useful statistics we got last year.
This really hurts because one of the important things we could have
gotten this year was the trending information to allow us to project for
next year. In particular, we wondered if we might have had increased
penetration with more users carrying a device and more users carrying two
devices (like a laptop and a wifi-capable phone or PDA).
My impression was lots of people had two laptops: XO+laptop, Nokia
tablet+laptop, Eee PC+Laptop, or those freaks like me with laptop+laptop.
:-)
The Upstream Network
We had a great 40mbps (5MB/sec) link from the hotel, provided via a
long-haul terrestrial wireless link from Business Only Broadband. It
worked great. Here's a graph, with samples taken at, apparently, half hour
intervals:
The top of this graph is our peak bandwidth. The bright red line
towards the bottom is the peak of the bandwidth we had last year -- clearly
not enough even if we imagine last year was nearly half the usage. We had
plenty of headroom.
Note that on Thursday we had only 400-ish people, and Monday we had
around 250. The sprints had pretty heavy usage of the network.
Now, part of this may have been that we were limited by the wireless
network. Lots of users had problems getting on, particularly on Thursday
and Friday. Note the keynote in the morning on Friday compared to
Saturday. Usually these are the heaviest use periods, but Friday it was
not particularly heavy. Also realize that a single user on the wireless is
limited to around half the upstream bandwidth because 802.11G and A only
provide around 2.5MB/sec. So each "cell" is limited to at most 2.5MB/sec,
and probably much less because of contention.
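To put rough numbers on that per-cell ceiling, here is a
back-of-the-envelope sketch. It uses the 2.5MB/sec per-cell and 5MB/sec
upstream figures from above and assumes, purely for illustration, that
the active users in a cell split it evenly. Real contention makes things
worse than this. The function name is made up for the example.

```python
UPSTREAM_MBYTES = 5.0  # 40mbps uplink, from the article
CELL_MBYTES = 2.5      # rough usable 802.11A/G throughput per cell, from the article


def per_user_ceiling(active_users_in_cell, cell_mbytes=CELL_MBYTES):
    """Best-case MB/sec per user if a cell is split evenly (an idealization)."""
    if active_users_in_cell < 1:
        raise ValueError("need at least one active user")
    return cell_mbytes / active_users_in_cell


if __name__ == "__main__":
    alone = per_user_ceiling(1) / UPSTREAM_MBYTES
    print(f"one user alone in a cell: {alone:.0%} of the uplink")
    for users in (5, 10, 25):
        share = per_user_ceiling(users)
        print(f"{users} active users in a cell: {share:.2f} MB/sec each")
```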
We seem to be leap-frogging. The first year we had bad wireless so we
couldn't use the upstream. Then we fixed wireless and realized we didn't
have nearly enough upstream. Now we fixed the upstream and the wireless
was problematic again...
The Wireless Problem
The company that we outsourced this to was using a "wireless switch"
solution by a vendor named Extricom. The wireless switch mechanism is the
"darling" of the wireless industry right now, and for low densities it's
probably pretty nice. However, for handling high densities of serious
users it apparently has some serious problems.
The Extricom system basically acts like a single giant AP. There is a
switch similar to regular Ethernet switches, but it has a number of ports
(8 and 12 in the models we used) which can be connected to "dumb" APs,
moving the intelligence into the switch. The switch can then coordinate
things like allowing only one radio to talk at a given time, which reduces
the contention you would otherwise get between multiple independent APs.
Here are some of the problems I believe we ran into. Some of this
information is second-hand from the reseller who was installing and
deploying the system; some of it is conjecture that comes from just
thinking through the design.
Wireless Problem: Cost
One thing about these systems is that they are relatively expensive.
If they had just worked then they might have been worth it. However, my
understanding is that the hardware for this solution costs well over $1,000
per AP drop. We paid for 18 APs connected to 2 switches, with another
"loaner" switch from the vendor and a fourth finally brought in to just
stop our network from falling over. Total cost for this hardware was
something around $20,000.
In comparison, the wireless we used last year to handle 650-ish
attendees cost us $2,500 for a dozen APs. We ended up purchasing another
20 of these APs this year as a contingency plan when the network was still
having serious problems on Friday (if we didn't get them then, we had no
option for the rest of the conference). This year the APs were nearly half
the cost of last year, so the additional 20 APs cost an additional $2,550,
for a total of 32 APs.
So we could have gotten way more APs for a small fraction of the
price. This will become important when I discuss contention in a bit...
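The per-AP arithmetic can be laid out explicitly. The sketch below uses
only the dollar figures quoted in this section (all of them approximate)
and lumps the Extricom switches into the per-AP cost. The dictionary
labels and function name are just for illustration.

```python
def cost_per_ap(total_cost, ap_count):
    """Average hardware cost per AP in dollars."""
    return total_cost / ap_count


# (total cost in dollars, number of APs), as quoted in the article.
purchases = {
    "Extricom, switches included": (20000, 18),
    "independent APs bought last year": (2500, 12),
    "independent APs bought this year": (2550, 20),
}

if __name__ == "__main__":
    for label, (total, count) in purchases.items():
        print(f"{label}: ${cost_per_ap(total, count):,.2f} per AP")

    # How many of this year's independent APs the Extricom budget would buy.
    extricom_total = purchases["Extricom, switches included"][0]
    unit_price = cost_per_ap(*purchases["independent APs bought this year"])
    print(f"${extricom_total:,} buys {int(extricom_total // unit_price)} independent APs")
```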
Wireless Problem: Expectations
The wireless switched network is the latest and greatest thing.
Apparently the company we outsourced to brought in reps from Extricom, the
hardware vendor, to review the space and plans. I'm wondering if these were
sales people and not the engineering staff.
Since leaving GWU where we held the first 3 PyCons, whenever we've
outsourced the networking we've always run into an expectation problem.
GWU had no problem because we were much smaller, we were much more
spread out, and GWU's network was built for a University environment with
usage very similar to ours.
When we outsource the networking we've tried to be extremely clear
about our demands. "We pretty much all have a computer, we're computer
geeks, and a huge percentage of our users will have them on. We're
probably way more than anything you've ever experienced before." That
sort of thing.
However, the vendors have always been like "No problem!" We've tried
to make it clear that we're likely to smoke test their network, but they
always seem to think we're BSing them. I remember 2 years ago saying
"Compared to any other 600 people you've had here, we'll be way more
demanding on the network." "No problem!"
I think the short form of this is that we just can never outsource the
wireless, certainly not without them providing extensive architecture
documentation as part of the proposal to demonstrate why they think they
can handle it.
In the future I'd propose that we literally say "Every time we've
outsourced, the vendor has said No Problem (tm) but every time their gear
has practically started smouldering."
Wireless Problem: Spectrum
The company we outsourced to told us that the Extricom gear has to
have all the APs on a switch on a single channel. This is relatively fine
for 802.11B+G when running with 3 switches (at $8k each) because 802.11B+G
only has 3 non-overlapping channels.
However, 802.11A has 12 non-overlapping channels. We were only using
a quarter of the available spectrum in 802.11A.
The outsourced tech that was doing the wireless network was arguing
with me that the Extricom system was designed to minimize contention, but I
pushed back pretty hard on that assertion. We didn't have more than 10 APs
on either floor, so we could have had absolutely 0 contention on the AP
side using independent APs instead of the Extricom unit. In this case, the
Extricom unit promoted contention because all the 802.11A laptops had to
share 3 channels instead of 12.
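The channel arithmetic can be sketched as a simple round-robin
assignment. This is an illustration only: it assumes independent APs
handed non-overlapping channels in turn, and it ignores client behaviour,
signal bleed between floors, and the coordination the Extricom switch
does among its same-channel APs. The function names are hypothetical.

```python
from collections import Counter


def aps_per_channel(ap_count, channel_count):
    """Hand out channels round-robin and count the APs landing on each one."""
    if ap_count < 1 or channel_count < 1:
        raise ValueError("need at least one AP and one channel")
    return Counter(ap % channel_count for ap in range(ap_count))


def worst_case_sharing(ap_count, channel_count):
    """Largest number of APs that end up sharing any single channel."""
    return max(aps_per_channel(ap_count, channel_count).values())


if __name__ == "__main__":
    floor_aps = 10  # "We didn't have more than 10 APs on either floor"
    for channels in (12, 3):
        shared = worst_case_sharing(floor_aps, channels)
        print(f"{floor_aps} APs over {channels} channels: "
              f"up to {shared} AP(s) per channel")
```

With 12 channels every AP on a floor gets a channel to itself, which is
the "0 contention on the AP side" case. With only 3 channels some of them
have to carry several APs.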
Last year I reported that I didn't have a single 802.11A user who
reported problems. This year, I had only a handful of 802.11A users who
had any success using it at all.
In short, last year we had 802.11A available so that users who had
that ability could get out of the way and not contend with the much more
precious 802.11B+G spectrum. This year, despite almost certainly having
more users with 802.11A-capable devices (I couldn't do A last year and
could this year, for example), it feels like we had fewer users on it
leading to more contention on the B+G channels.
Again, we had no statistics on the utilization of the different
channels or protocols, so I don't have a good read of this at all.
Wireless Problem: Contention
Our vendor also told us that the Extricom units couldn't have their
power adjusted; they were all set at 50mw. Last year we ran the 802.11B+G
radios at between 5 and 15mw, and spread them around all over, creating lots
of little "cells" of communication.
It's kind of like being in a big room and having a conversation...
You could have one conversation with the entire audience by having someone
with a bullhorn talking at the whole audience. But you really only get
one conversation at a time... This is a high-contention environment --
really only one person can talk and others have to wait.
On the other hand, you could have a big room with a bunch of round
tables and people talking quietly among the people at their table. Even
better, imagine that there are now sound-absorbing partitions put up
between the tables... You can now have a whole lot of conversations going
on at once.
Because the Extricom units were running 50mw, it's more like the
bullhorn situation.
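To get a feel for how much "louder" 50mw is than 5 or 15mw, the sketch
below converts transmit power to dBm and estimates the relative reach of
a cell. It assumes a simple log-distance path-loss model with a free-space
exponent of 2, which is an idealization: a room full of people and walls
attenuates more, so real differences in range will be smaller. Nothing
here was measured at the conference.

```python
import math


def mw_to_dbm(milliwatts):
    """Convert transmit power from milliwatts to dBm."""
    return 10.0 * math.log10(milliwatts)


def range_ratio(high_mw, low_mw, path_loss_exponent=2.0):
    """How much farther the stronger signal reaches at equal received power.

    Assumes received power falls off as distance ** -path_loss_exponent
    (2.0 is free space; indoor values are typically higher).
    """
    return (high_mw / low_mw) ** (1.0 / path_loss_exponent)


if __name__ == "__main__":
    for mw in (5, 15, 50):
        print(f"{mw} mW = {mw_to_dbm(mw):.1f} dBm")
    for low in (5, 15):
        reach = range_ratio(50, low)
        print(f"50 mW vs {low} mW: about {reach:.1f}x the radius, "
              f"{reach ** 2:.1f}x the floor area (free-space assumption)")
```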
Last year the architecture I developed had the APs running very low
power in the 802.11B+G spectrum, and kept the APs low so that the attendees
would help to dissipate the signal. Then put a lot of APs around, creating
small cells with less interference between them. As more people crowded in
an area, contention between APs would be reduced because of these tiny cell
sizes.
With the Extricom units as provisioned by our vendor, I suspect we had
much more contention. If nothing else, we had only 18 total radios, 10 on
one floor and 8 on the other, instead of the 32 or 40 we could have had
with the independent radios.
Wireless Problem: Proprietary Wiring Runs
This was a huge problem with the Extricom units. Wiring runs in this
hotel were fairly scarce, and I think we were using all of them for the
AP-to-switch runs. Because these were "special" runs, we couldn't put other
gear on them, like switches.
Last year we had a switch available in around half the rooms, so if
you had problems with wireless you could get in early and get a wired
connection. This year if we did that, we'd lose an AP for each normal
wired run we'd set up, which would have just made the wireless issue worse.
Plus, presenters who really could have benefited from wired
connections only had them on Saturday and tutorial presenters didn't have
them at all.
WEP
Last year we enabled WEP with a trivial WEP key. This was to try to
keep non-conference users from swamping our (rather limited) upstream
capacity. With only 3.5mbps available, it would have been fairly easy for
a single user to make the network unusable for everyone.
This year we didn't have WEP enabled, and things seemed fine.
However, part of that was because we had the whole venue the whole time.
Last year there were other guests in the hotel the whole time, and even
other functions using a number of the meeting rooms while we were there.
No such problem this year, we had all available guest rooms in the
conference hotel and a lot of people in the overflow hotel.
Shaping
Last year I tried to set up shaping, and while it did prevent anyone
from hammering the network, it apparently didn't allow users to borrow
unused bandwidth when the link wasn't fully utilized. This year, through a
miscommunication, we didn't have a Linux router set up, and so we didn't
really have the shaping option.
Because we had so much upstream bandwidth, this never seemed to be a
problem.
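One common way to get both properties from a shaper (nobody can hog the
link, but idle capacity is not wasted) is max-min fair sharing. The sketch
below illustrates that idea only. It is not the configuration that was
run at PyCon, and the function name and demand figures are invented for
the example.

```python
def max_min_fair(capacity, demands):
    """Split capacity so nobody gets more than they ask for, and bandwidth
    left over by light users is shared equally among those wanting more."""
    allocation = [0.0] * len(demands)
    unsatisfied = [i for i, demand in enumerate(demands) if demand > 0]
    remaining = float(capacity)
    while unsatisfied and remaining > 1e-9:
        share = remaining / len(unsatisfied)
        still_hungry = []
        for i in unsatisfied:
            grant = min(share, demands[i] - allocation[i])
            allocation[i] += grant
            remaining -= grant
            if demands[i] - allocation[i] > 1e-9:
                still_hungry.append(i)
        unsatisfied = still_hungry
    return allocation


if __name__ == "__main__":
    # Hypothetical demands in mbps on a 40mbps link: two light users, one hog.
    print(max_min_fair(40, [1, 2, 100]))
    # Two hogs simply split the link.
    print(max_min_fair(40, [100, 100]))
```

The light users get everything they ask for, and the heavy user gets
whatever is left rather than being pinned to a fixed cap.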
Going Forward
I've been asked by several people if I'd be willing to be paid to do
the networking next year. I explained that I volunteered to do it at no
charge this year. However, I guess that that was part of the problem. I
get the impression that some of the organizers felt they were taking
advantage of me by having me do all that work and they wanted to give me
the opportunity to enjoy the conference. So next year we may have to
charge them. :-)
After careful consideration, I believe that the network architecture
that I came up with for last year would have been up to handling the task.
I had planned on putting out many more APs -- around 3 times as many to
handle only twice as many users. The many small cells running at very low
power served us and other conferences (I found in my research) well.
Lessons Learned
- There will be some users who always have problems with wireless.
  We will never have 100% success, because of users accidentally
  publishing and connecting to Ad-Hoc/Computer-to-Computer networks,
  users with driver or card issues, and that sort of thing...
- We probably shouldn't outsource the networking again, and
  definitely not without a concrete architecture that's reviewed by
  me. This year the only information I was given the opportunity to
  review was along the lines of "We will be using Extricom wireless
  switch networking gear". I went to their web-site, but that's no
  substitute for a real architecture like I created last year.
- T1s really aren't sufficient to handle our conference. Half a
  DS-3 is probably the minimum I'd want to see.
- Avoid proprietary AP runs unless we have lots of wiring drops.
- Get wired switches in every room.
- The Extricom gear as deployed by our vendor really was not up to
  the task.
- We can't live without the statistics.
In Conclusion
Nobody really understands our networking needs like we do. Despite
trying to communicate it very clearly with providers, they always seem to
underestimate it. Unless we can find a vendor that can say something like
"We designed the network for OSCON/ACM/IEEE with 1500 attendees", we
probably shouldn't believe them that they can handle the networking.
Using many small independent AP devices is more work to set up, but it
provides much more flexibility at a dramatically reduced cost and with
likely better performance than the Extricom network that was deployed this
year.
With the latest generation of 802.11N enterprise APs now available, the
older enterprise APs we used last year are even more attractive from a cost
standpoint, allowing us to deploy many APs giving us smaller cells. I
still think this is the way to go.