WHOA!

Wow!

I won the poster competition for my poster entitled Power Management for BitTorrent.

That was completely unexpected!!!

My buddy Paolo Aguilar from Mexico won second place with his poster Trajectory Control and Reinforcement Learning (robot soccer!).

Clayton Gandy and Nicole Weber tied for 3rd place with their posters.

Congratulations to everyone in the program. All the posters looked amazing and everyone did a great job!

As always, thanks a ridiculous amount to Prof. Ken Christensen for his guidance. Also, Prof. Miguel Labrador and Daladier Jabba Molinares ran the program to perfection and were instrumental in making this experience great!

Diagramming Tools

An important part of any paper, presentation, or poster is the associated figures and diagrams.

While there are numerous tools for creating graphs and charts from numerical data (and I’ve already professed my current love of scipy), creating diagrams that aren’t based on data is pretty important too.

PowerPoint is probably the most commonly used tool to accomplish this goal. It’s dead simple to use, pretty much everyone knows it, and you can create images of pretty damn good quality.

PowerPoint, however, is a presentation tool, and its weaknesses start to show when you want to do fancier things. AutoShapes are great, and there are connectors and whatnot, but, at least for the version I use (PowerPoint:mac 2004), there is a lot left to be desired. A really simple example that bugs the hell out of me quite often is alignment. Yes, there is snap-to, and yes, you can have a grid overlaid, but what if you want to do something like ensure that two elements on either side of a middle element are the same distance from that element?

If there is an easy way to do that, I don’t know what it is…

However, for OS X there is a sweet app called OmniGraffle.

For the example above, OmniGraffle does neat stuff like:

See how it shows you when your two elements have the same horizontal distance from the middle element? That’s pretty cool…

OmniGraffle has a lot of other neat features, and the diagrams it creates look pretty slick right out of the box.

Here is a diagram I made with OmniGraffle giving a bird’s-eye view of what a BitTorrent swarm looks like. I think it turned out pretty nice, and creating it in PowerPoint would have been extremely difficult.

I’m sure that there are other applications out there for Windows, Mac, and Unix alike that would produce the same quality of diagram, but the ones I can think of off the top of my head (Photoshop, Gimp, and Visio) are generally designed for another purpose, with perhaps Visio as the exception. (But my dislike for Visio could span multiple posts :) )

LOL Tiling…

So, I have to calculate some percent differences and get averages etc etc etc…

Lately I’ve been using scipy for most/all of my data analysis, but I decided that I didn’t feel like figuring out how to organize the data in a script and would just do these particular calculations “by hand” with Excel.

So, there are 28 total data files that I’m working through, and I open them all in Excel through Finder (ya, I’m a Mac guy at the moment, but a post about how I feel on that whole thing will come later).

I actually laughed out loud at the result.

I would assume that it’s Excel’s tiling behavior, and I’m using Office 2004 for Mac, so I’m sure there are issues there, but still, I thought it was funny.

(Oh, and I also determined that it’s probably just a better idea to write a script heh…)
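For what it’s worth, the script version doesn’t end up being much work. Something along these lines would do it (the filenames, column layout, and choice of percent-difference formula below are made up for illustration):

import glob
from numpy import loadtxt, mean   # numpy, which scipy builds on

def percent_diff(before, after):
    # one common definition; swap in whichever one you actually need
    return (after - before) / before * 100.0

paths = sorted(glob.glob("results/*.dat"))   # the 28 data files
per_file = []
for path in paths:
    # hypothetical layout: two columns, "before" and "after"
    before, after = loadtxt(path, unpack=True)
    per_file.append(mean(percent_diff(before, after)))
    print "%s: %.2f%%" % (path, per_file[-1])

print "overall average: %.2f%%" % mean(per_file)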

The Wrong Paradigm

So, I’ve finally reached the point in my latest research endeavor at which numbers need to be crunched. Because I’m a programmer, and thus inherently lazy, instead of using a program like Excel, which requires me to click at least 3 buttons to get anything useful, I decided to take a look at SciPy.

While I’ve messed with Ruby plenty of times in the past and am actually quite fond of it, SciPy was my first real experience with Python beyond reading other people’s code or glancing at examples.

Python is some pretty nice stuff. Really nice stuff.

It has the typical map-reduce-filter array functions that seem to be finding their way into more and more of my code, but after writing:


def f(x):
    # clamp negative values to zero
    if x < 0:
        return 0
    else:
        return x

new_array = map(f, some_array)

for the umpteenth time I realized I was probably doing it wrong and took a look at list comprehensions. List comprehension is a pretty cool guy. Eh makes working with arrays easy and doesn’t afraid of anything.
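For the record, the comprehension version of the clamp above is a one-liner:

# same effect as map(f, some_array): clamp negatives to zero
new_array = [max(x, 0) for x in some_array]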

It’s amazing how easy it is to fall into the wrong paradigm in a multi-paradigm language.

Although, I’m not sure if


y_scale_dl = reduce(max, [reduce(max, [y[4] for y in x[1][1]]) for x in experiments])

is really that readable/understandable either :(

Maybe I’m still doing it wrong?
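If nothing else, flattening it into a single generator expression reads a little better (assuming experiments really is nested the way the reduce implies and nothing in it is empty):

# the largest y[4] across every experiment's x[1][1] list
y_scale_dl = max(y[4] for x in experiments for y in x[1][1])

Same result in one pass, and at least the intent (“the biggest y[4] anywhere”) is a bit more obvious.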

Undefined symbols: vtable for

gcc said:

Undefined symbols:
“vtable for GreenBitTorrentApp”, referenced from:
__ZTV18GreenBitTorrentApp$non_lazy_ptr in greenbittorrent_app.o
ld: symbol(s) not found
collect2: ld returned 1 exit status

Note to self:

Do not declare a destructor without defining one. (As I understand it, gcc emits a class’s vtable in the file that defines its first non-inline virtual function; if that function is a destructor that was declared but never defined, the vtable never gets emitted, and the linker complains about the vtable instead of the missing destructor.)

BitTorrent Protocol Overhead

I’ve been doing some investigation into the energy consumption of second generation P2P systems, specifically BitTorrent.

There is a pretty decent push towards using set-top boxes in consumers’ homes for legal content distribution. In fact, this type of system is being proposed by Pablo Rodriguez at Telefonica here. The benefits to content providers are tremendous. The most obvious benefit is that the ISP no longer has to pay for a data center; in effect, the data center is pushed out into the subscribers’ homes, and the subscribers now pay the associated power and cooling costs.

Unfortunately, we’ve theorized that the reduction in power costs to the ISP will lead to an overall increase in power consumption. The essential concept is that, while ideally we would expect power consumption to increase linearly with utilization, in reality this is not the case.

Utilization versus power use for IT equipment

The graph above shows that power consumption increases dramatically from 0 to about 15% utilization. Additionally, increasing utilization past about 15% results in a negligible increase in power consumption.

The conclusion we can draw from this is that we want our IT equipment to be heavily utilized to get the greatest energy efficiency out of our hardware: since a lightly loaded box draws nearly as much power as a busy one, the energy consumed per unit of useful work drops sharply as utilization increases.

For set-top boxes to be truly effective peers in a swarm, they should be available to other peers pretty much 100% of the time; however, they are unlikely to be fully utilized 100% of the time, which in turn means a large reduction in energy efficiency. Ideally, we’d like to put set-top boxes that are doing nothing but waiting for peers to connect to sleep; however, this is currently not feasible, as a peer must be awake to respond to requests from other peers! In fact, there seems to be a lot of protocol traffic outside of pure data transfer that a peer must be available for.

All traffic vs BitTorrent traffic vs BitTorrent request traffic

In the above graph, the black line is all traffic, the red line is all BitTorrent protocol traffic, and the green line is BitTorrent piece request traffic. (The y-axis is packets per second.)

The graph was created from a trace on a reasonably popular swarm (about 12,000 peers, composed of roughly 50% seeds and 50% leechers).

The question remains as to how much of the protocol traffic is directly related to data transfer, how much (if any) might be pushed off to a low-power proxy that would wake the high-power machine for data transfer, and what, if anything, might be done to ensure that only highly utilized peers are members of the swarm (i.e., how to reduce the energy footprint of the swarm as a whole while still maintaining its usefulness).

Google snubs USF?

It’s funny.

There are a lot of people upset about USF‘s BCS ranking. There are a lot of people that think those people are snubbing USF.

I had noticed this at some point in the past, but both maps.google.com and local.google.com have no idea (Denver?!? Colorado?!?!?!?) where USF is.

They seem to want “university of s florida tampa” (note the “s”, not “south”) and the city before they are sure of what you are asking for. In fact, the search breaks completely if you try “university of south florida tampa”.

(screenshot for posterity)

Perhaps the single computer ranking that didn’t have us at #1 is in cahoots with maps.google.com (or, more likely, usf.edu needs to do some SEO!)

Oh well…

Go Bulls.

Socket buffering is my bane (pt 2)

I ran several more tests and have come up with some results.

First, I modified the toy app a bit. In pseudocode:

wait for client to connect
do forever
  do n times
    generate large_message // (1400 bytes)
    send(large_message)
    if error on send
      // looks like a disconnect happened!
      exit
    end
  end
  sleep
end

Note that the above pseudocode is a bit inefficient: generating the data on every iteration will hurt our throughput. However, in the actual implementation the generated data is a counter that helps distinguish which packets make it to the client, which is why it is generated on each iteration of the loop. The point of the code is to try to max out whatever buffering is happening at lower levels of the networking stack; we want to keep the buffer as full as possible at all times.
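For reference, here is a rough Python sketch of that loop (the real toy app isn’t this exact code, and the port, burst size, and sleep interval below are arbitrary):

import socket
import time

MSG_SIZE = 1400     # bytes per message, matching the pseudocode
BURST = 20          # "n" in the pseudocode
PORT = 9999         # arbitrary

# wait for client to connect
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
listener.bind(("", PORT))
listener.listen(1)
conn, addr = listener.accept()

counter = 0
try:
    while True:
        for _ in range(BURST):
            # generate large_message: a numbered payload padded to 1400 bytes
            counter += 1
            msg = ("%d " % counter).ljust(MSG_SIZE)
            conn.sendall(msg)    # blocks once the kernel's send buffer is full
        time.sleep(1)
except socket.error:
    # looks like a disconnect happened!
    pass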

I ran tests with n = 20, 5, 2

The test was really simple:
1. Start the server
2. Start an Ethereal capture
3. Connect to the server with the client
4. Wait for a couple of messages to come in from the server
5. Yank the network cable from the client
6. Wait a bit and plug the network cable back into the client
7. Yank the network cable from the client and wait until the server socket times out

Here is what I found out:

A: send() will block if the internal socket buffer is full. The server stopped generating messages as it waited for the send() call to return.

B: The socket buffer appears to be about 12600 bytes (12 KB or so?). I came up with this number by taking the number of the last message the server attempted to send() before send() blocked [SM], subtracting the number of the last message the client had received before being removed from the network [RM] (each message was numbered), and multiplying by the size of a message [1400 B]. E.g., for one of my traces, SM = 23 and RM = 14, so (SM - RM) * 1400 B = (23 - 14) * 1400 B = 12600 B, and 12600 / 1024 ≈ 12.3 KB. (Sorry about my notation, and hopefully my math isn’t horribly wrong…)

C: The server will stop sending retransmissions after a few tries and resort to looking for the disconnected client with ARP requests.

D: If reconnected, the server will flush its buffer and the client will receive all messages. In particular, if the server has ceased retransmitting but is successfully able to find the client via ARP, it will send another retransmission, which then triggers the flush.

E: Eventually, the server will time out completely; it seems to take something in the range of 15 minutes or so.

Questions I still have:

What happens if we use the gserver model (have the server application queue data) and fill the socket buffer while disconnected from the client? Will that (socket-buffered) data disappear when we reconnect after a timeout, or will it be flushed to the client? (I need to run the tests on a modified version of gserver1.) I’m thinking that it will be gone forever, but if it isn’t, I think I might be able to make a minor change to the disconnect detection code to make things work how we want.

Is there any way to get the state of the socket so that we might be able to reconcile what’s in the socket buffer with what’s in our application’s queue?

Is there any way to get unflushed data from the socket buffer back into our application? If there is, then we could wait for the normal timeout to occur (which ought to trigger the disconnect/reconnect mechanism on the server) and transfer all the data in the socket buffer back into our application’s queue.
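One possibility for the “state of the socket” question (untested on my part, and Linux-specific) is the SIOCOUTQ ioctl, which reports how many bytes are still sitting unsent in the kernel’s send queue. It doesn’t get the data back, but it would at least tell us how much of the queue never made it out. A rough sketch:

import fcntl
import socket
import struct

SIOCOUTQ = 0x5411   # from <linux/sockios.h>: bytes of unsent data in the send queue

def unsent_bytes(sock):
    # how much of what we wrote is still sitting in the kernel's send buffer
    raw = fcntl.ioctl(sock.fileno(), SIOCOUTQ, struct.pack("i", 0))
    return struct.unpack("i", raw)[0]

def send_buffer_size(sock):
    # the kernel's send buffer size for this socket
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)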

It still feels like I’m being a bit naive about this, but I guess progress is being made :)

Socket buffering is my bane (pt 1)

I mentioned in a previous post that my gtelnetd implementation was failing to recognize a non-explicit disconnect.

By non-explicit I mean that a client becomes unavailable but does not send notification of its intent to be unavailable.

The first question I asked is whether or not a FIN packet is sent when a client goes into hibernate/suspend.

I wrote a little toy app and was able to get it to recognize a disconnect after about 15 minutes…

From what I can tell from the trace, the server sent 8 retransmissions over about 50 seconds. After that, it began sending out ARP requests looking for the client.

It sent ARP requests for about 4.5 minutes, then did nothing for a minute or two, then sent out another group of ARP requests for about 6.5 minutes.

Then, the server *FINALLY* recognized that it failed to send() data…

The entire time up to this point, the server was send()ing data and everything looked fine as far as it was concerned.

At least we confirmed that sending the client to hibernate/suspend does *NOT* send a FIN.

In fact, it would appear that as far as a server is concerned, a client going to hibernate/suspend is indistinguishable from a physical connection loss.

Great Success!

I successfully ported the Minix telnetd server to run under Linux.

The majority of changes had to do with pty and tty handling. Specifically, it is important to note that gettyent() and its associated files and functions are not used on Linux, and there were some minor changes to the paths for getty and login.

Once the port was done and I had a working telnetd, I began greenifying it.

The nice thing about the Minix telnetd implementation, and the reason I chose to port it to Linux, is that the inbound and outbound data is read and written at a single point.

Since I am currently only concerned with queuing data that is to be written to the client, I only had to change a single line:

(void) write(fdout, buf, len); was substituted with (void) g_write(fdout, buf, len);

A few other things to note:

A specialized queue node type was created to help us mimic the call to write(3):


struct gqnode {
    int fd;                /* descriptor the data should be written to */
    char buf[BUF_SIZE];    /* the queued data itself */
    size_t nbytes;         /* how many bytes of buf are valid */
};

I also had to change some of the code that caused the telnetd server to terminate if we were unable to read from a socket. Now, only the forked process that was handling reads from the client terminates, and a new process is forked off to handle any reconnects.

A (hard to follow) demo video is available.

So, I now have a telnet server that is aware that a client may want to disconnect but not terminate its session.

A new problem has arisen, however: the server can only properly handle an explicit disconnect. I.e., if the connection just goes away (the client goes to sleep, the network cable is yanked from client or server, &c) things break.