Lab 2 Auto-Testing

I have been working on getting a Lab 2 tester set up, both for grading and for the bake-off. It’s done. More importantly, it’s checked in, so you can test it yourself.

The code is under test/blackbox/reliable_stream, and the file you want is test_throughput.py. It’s fairly well automated, but you need to understand how it works, along with a few of its “glitches”.

The tester needs three Playground installations, even though they can all be the same installation. One is used for the Chaperone, the other two are used for the PTCL peers that communicate with each other.

The Chaperone’s playground is taken from the current python path. If you run this from the test/blackbox/reliable_stream directory, however, you won’t have playground in your path. Solve this problem by setting the path right before launching the program:

 PYTHONPATH=~/Applications/Playground/src/ python test_throughput.py

This will put ~/Applications/Playground/src at the front of the python search path, so that’s the playground it will use.

The other two Playground paths are just given as two arguments. Again, you can use the same playground install for all three, so in my example it would look like this:

 PYTHONPATH=~/Applications/Playground/src/ python test_throughput.py ~/Applications/Playground/src/ ~/Applications/Playground/src/
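If you end up running this a lot, a small launcher can save some typing. The following is only a sketch of the command line above: build_invocation is a hypothetical helper, not something in the repository, and it assumes you start it from the test/blackbox/reliable_stream directory.

```python
import os
import sys


def build_invocation(chaperone_src, peer_a_src, peer_b_src):
    """Return (argv, env) for one run of test_throughput.py.

    chaperone_src goes at the front of PYTHONPATH; the two peer paths
    are passed through as the two command-line arguments.
    """
    env = dict(os.environ)
    existing = env.get("PYTHONPATH")
    if existing:
        # Keep whatever was already there, but search our path first.
        env["PYTHONPATH"] = chaperone_src + os.pathsep + existing
    else:
        env["PYTHONPATH"] = chaperone_src
    argv = [sys.executable, "test_throughput.py", peer_a_src, peer_b_src]
    return argv, env
```

To actually launch the tester you would pass the result to subprocess.call(argv, env=env). Note that the shell expands ~ for you but Python does not, so run each path through os.path.expanduser first.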

What this does is set up a programmatic Chaperone running on localhost, port 9090. Please note that you can’t have a Chaperone already running, or you’ll get a conflict over the port. It then launches the two peers, also programmatically.
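If you aren’t sure whether an old Chaperone is still hanging around, you can probe the port before starting the test. This is a minimal sketch that assumes the Chaperone listens on a TCP port; port_is_free is a hypothetical helper and is not part of the tester.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if we can bind (host, port), i.e. nothing else holds it."""
    probe = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        probe.bind((host, port))
    except OSError:
        # Typically "address already in use": something is on the port.
        return False
    finally:
        probe.close()
    return True
```

Calling port_is_free(9090) before launching tells you whether the port conflict described above is likely. The check is conservative: it can also report a port as busy for a short while after the previous owner has exited.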

I’m using Playground’s multiprocessing module to run these tests. It works like the threading module, but each “thread” runs in its own subprocess. This means you really do get the full use of your system’s multiple processors and cores, which makes the test run more smoothly. It also means that I can load a different Playground in each process. So, before I launch each peer, I take the path that was given as an argument and throw it into the Python path for that process only! That’s how I’m able to have two different Playground imports.
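To see why a per-process path works, here is a small stand-alone sketch. It does not use Playground’s multiprocessing module; it uses the standard library’s subprocess as a stand-in, and the module name fakeplayground and the two “installs” are made up for the demonstration.

```python
import os
import subprocess
import sys
import tempfile


def make_fake_install(root, label):
    """Create root/<label>/fakeplayground.py, tagged with its own label."""
    src = os.path.join(root, label)
    os.makedirs(src)
    with open(os.path.join(src, "fakeplayground.py"), "w") as handle:
        handle.write("INSTALL = %r\n" % label)
    return src


# Code run by each child: put the given path first, then import.
CHILD_CODE = (
    "import sys\n"
    "sys.path.insert(0, sys.argv[1])  # changes this process only\n"
    "import fakeplayground\n"
    "print(fakeplayground.INSTALL)\n"
)


def import_in_child(src_path):
    """Start a fresh interpreter that imports fakeplayground from src_path."""
    out = subprocess.check_output([sys.executable, "-c", CHILD_CODE, src_path])
    return out.decode().strip()


def demo():
    """Import the 'same' module name from two different installs."""
    with tempfile.TemporaryDirectory() as root:
        peer_a = make_fake_install(root, "peer_a")
        peer_b = make_fake_install(root, "peer_b")
        return import_in_child(peer_a), import_in_child(peer_b)
```

Each child resolves the same module name to a different directory, and the parent’s own sys.path is never touched. That is the same idea the tester relies on to load two Playground installs side by side.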

This does occasionally cause weird problems that I can’t debug. Shared data is sent back and forth as pickled objects, and if something goes wrong, it prints out weird messages. If you see output complaining about sockets, reactors, or other miscellany, that’s what it is. Most of the time things work just fine, though.

Now, the really critical part for doing your own testing is this: you need to have your Playground path set up to use your PTCL implementation for RELIABLE_STREAM.

The test makes about 10 separate runs. Each run sends more or less the same data, but with increasing levels of error on the wire.

BUT HERE’S THE PART YOU NEED TO READ AND UNDERSTAND: In my own testing, I found out I had a lot of bugs. It took about 12 hours to completely debug the system (not really 12 hours on my own code, because a lot of those hours also went to debugging the testing framework). I expect that you guys will find you’ve got errors too. Let me talk about the most common ones.

  1. YOU MUST MAKE SURE YOU SEND ALL DATA EVEN AFTER THE HIGHER LAYER CALLS loseConnection. When sending large amounts of data, there will almost always be a backlog. But the application layer expects that everything it sent (or tried to send) before calling loseConnection will arrive on the other side. Don’t forget to process acks and resends.
  2. On the other hand, if you receive a FIN, you do not need to keep sending data. HOWEVER, you may still have some data received that’s waiting to be popped to the higher layer. Watch out for race conditions.
  3. You need to handle the situation where both sides send a FIN to each other at the same time.
  4. You need to make sure you resend SYNs, ACKs, and SYNACKs until they’re received. In error-prone conditions, it won’t do you any good to ACK data packets if your connection never gets started.
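To make the first point concrete, here is a toy model of the sending side. It is not PTCL and not the Playground API: ToySender, its method names, the packet tuples, and the window size are all made up for illustration. The idea it sketches is that loseConnection only records the request, and the FIN goes out once the backlog is empty and everything has been ACKed.

```python
class ToySender:
    """Toy sending half of a reliable stream (illustration only)."""

    def __init__(self, transmit, window=2):
        self.transmit = transmit   # callable that puts a packet on the wire
        self.window = window       # max packets in flight at once
        self.backlog = []          # accepted from the app, not yet sent
        self.unacked = {}          # seq -> packet, sent but not yet ACKed
        self.next_seq = 0
        self.closing = False       # the app has called lose_connection
        self.fin_sent = False

    def write(self, data):
        if self.closing:
            raise RuntimeError("write after lose_connection")
        self.backlog.append(data)
        self._pump()

    def lose_connection(self):
        # Do NOT send FIN here; just remember that the app wants to close.
        self.closing = True
        self._pump()

    def on_ack(self, seq):
        self.unacked.pop(seq, None)
        self._pump()

    def on_timeout(self):
        # Resend everything still unacknowledged, in sequence order.
        for seq in sorted(self.unacked):
            self.transmit(self.unacked[seq])

    def _pump(self):
        # Move backlog onto the wire while the window has room.
        while self.backlog and len(self.unacked) < self.window:
            packet = ("DATA", self.next_seq, self.backlog.pop(0))
            self.unacked[self.next_seq] = packet
            self.next_seq += 1
            self.transmit(packet)
        # FIN only after the backlog is drained and fully acknowledged.
        drained = not self.backlog and not self.unacked
        if self.closing and drained and not self.fin_sent:
            self.fin_sent = True
            self.transmit(("FIN", self.next_seq, b""))
```

Notice that on_ack and on_timeout keep working after lose_connection has been called, which is the “don’t forget to process acks and resends” part. The toy leaves out retransmitting the FIN itself and the handshake packets; a real implementation needs the same resend-until-acknowledged treatment for those.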

One more note about the FIN/loseConnection/shutdown thing. I’ve written these tests to send application-layer messages when they’re done. In other words, the application layer won’t try to call loseConnection until the other side has told it that it’s finished. This should mean that we don’t have to worry about exactly how you handle data received and sent AFTER the FIN. But it’s still critical that you clear out any data that should have been sent or should have been received. If you don’t understand this, please ask.

I know I promised some of you that I’d test out your code for you, but please try this test on your own first and then get back to me. In fact, when you’re ready for me to test it, please send me one of two things:

  1. A link to a repository that is a COMPLETE PLAYGROUND REPO that works out of the box
  2. OR, if it’s a link to a repository for just your code, make sure that it has an installation file:
    1. The file must be named install2Playground
    2. It can be shell script or python or anything else as long as it works as an executable
    3. It must take a single argument that represents a Playground installation
    4. It must install any code necessary for PTCL AND a playground.conf file with RELIABLE_LAYER set
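For those going the install-file route, here is a rough sketch of what install2Playground could look like in Python. Everything about the layout is an assumption you will need to adjust: I’m pretending your repository has a ptcl directory and a ready-made playground.conf next to the script, and that both belong at the top of the installation path you are given. Put your code and config wherever your Playground actually expects them.

```python
#!/usr/bin/env python
import os
import shutil
import sys

CODE_DIR = "ptcl"               # hypothetical: directory holding your PTCL code
CONF_FILE = "playground.conf"   # your config with the reliable layer set


def install(repo_dir, playground_dir):
    """Copy the PTCL code and playground.conf into playground_dir."""
    if not os.path.isdir(playground_dir):
        raise SystemExit("not a directory: %s" % playground_dir)
    dest_code = os.path.join(playground_dir, CODE_DIR)
    if os.path.isdir(dest_code):
        shutil.rmtree(dest_code)  # replace any earlier copy
    shutil.copytree(os.path.join(repo_dir, CODE_DIR), dest_code)
    shutil.copy(os.path.join(repo_dir, CONF_FILE),
                os.path.join(playground_dir, CONF_FILE))
    return dest_code


def main(argv):
    """Entry point: exactly one argument, the Playground installation."""
    if len(argv) != 2:
        raise SystemExit("usage: install2Playground <playground installation>")
    repo_dir = os.path.dirname(os.path.abspath(argv[0]))
    install(repo_dir, argv[1])

# As an executable script, finish the file with:  main(sys.argv)
```

Remember to mark the file executable (chmod +x install2Playground) so it can be run directly with the installation path as its only argument.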
