Thursday, May 5, 2011

Summary: From OpenGL and NITE to OpenNI

When we first began this project, we knew nothing about the Kinect.  So, to get started, we followed the instructions on this website to configure our computers to take in data from the Kinect and render it as a visual representation.  This visual representation relied on OpenGL, GLUT, and NITE.  It gave us pretty pictures (like you saw in our earlier posts) that would show the environment, as well as draw a skeleton on individuals.  This was great, and we thought it was exactly what we needed.  We were able to modify this code so that, when an individual was recognized, it would print the x, y, and z coordinates of the center of mass of each person the program was tracking (as seen again below).



Again, this was perfect, and we thought our project was almost done before it even started.  Then, we tried to put everything on to the BeagleBoard.

Well, to make a long story short, the program that used OpenGL, GLUT, and NITE wouldn't work on the BeagleBoard.  The first problem was that when the program ran, OpenGL had to open a new window (in other words, you would run the program from the terminal, and then another window would open).  It was in this window that the program did all of its image rendering, creating the visual representation of what the Kinect was seeing (in the above image, the terminal is on the left with the x, y, and z coordinates, and the image rendering window is on the right).  Now, the BeagleBoard, when running Ubuntu, is essentially headless: there is no display server, so it can only provide terminal sessions, not graphical windows.  Thus, if you tried to run the code, it would crash the BeagleBoard.  So, we figured we would just need to change the code so that the extra window wouldn't open.  We used the divide-and-conquer technique here: Tim worked on modifying the code so that OpenGL wouldn't open that new window, and Matt worked on configuring NITE to run on the BeagleBoard.

Eventually, after spending many hours learning the inner workings of GLUT and NITE respectively, we realized we weren't going to get this program to run on the BeagleBoard.  The OpenGL main loop will not run without opening a window.  You would think this wouldn't be a problem, since OpenGL just handles the image rendering.  However, we discovered that the OpenGL code also handled some of the skeleton tracking, and thus, without the OpenGL component, the program was useless.  Right around the same time, we realized NITE would not run on the BeagleBoard either.  You see, NITE is built for the x86 platform, and thus is not compatible with ARM, the processor architecture of the BeagleBoard.  When it comes to the "trial and error" process, we certainly were trying, and we sure had a lot of error.

So, we scrapped NITE, and went back to our old friend OpenNI.  OpenNI had a C program that would return the z-position of the pixel at the exact center of the Kinect's field of vision (the resolution of the Kinect is 640x480).  So, we took this code and ran with it.  We took this z-data for the exact center, and translated it into robot commands.  Thus, a person would stand exactly in front of the robot in the dead center of its view.  The z-data for the person would then be processed.  If the person was greater than a certain distance away, the robot was told to move towards them.  If the person was less than a certain distance away, the robot would move away from them.  And, if the person was in the "sweet spot" then the robot would remain still.

This code worked well, but it was very basic.  We added to it extensively, writing code to process the z-data and to translate that data into robot commands that were then transmitted via ethernet to the robot.  However, we thought that if we stopped there, we wouldn't really be earning our pay.  So, we decided to go a bit further.

We created code that would search for a person, no matter where they were in the field of vision, and would recognize them.  Once they were seen, the robot would turn towards them until they were directly in front of the robot, while simultaneously moving closer or farther from the person.

To do this, we created our own code based on the "middle of the screen z data" code.  We broke the Kinect's field of vision into a grid.  The code then cycles through the grid cells and gets the z data for whatever is in each cell.  This z data corresponds to the distance between the robot and whatever is in the frame.  Our initial code is only for an obstacle-free environment, so we work under the assumption that whatever is closest to the robot is the human target.  So, the code cycles through the cells and stores the z value for each.  Whichever z value is the smallest is determined to be the target, and the robot turns to get that target into its center of vision (in the y plane).  As the robot turns, it also moves closer to or farther from the target, until the robot is about four feet away (in the z plane).

And there you have it.  Future work for this code will include improving the robot's reaction time and adding PD control.

Summary: Hardware Configuration

So, how does one attach a Kinect to an X-RHex?  And once it's on there, how do you power it?  Well, that's what this post is going to address.

The Base
First of all, you need some kind of plate or platform to hold everything.  We designed our plate in SolidWorks and then cut it out using the laser cutter.  The base is simply a piece of acrylic with the necessary holes cut in it: four holes to attach the Kinect, four to attach the BeagleBoard (via standoffs), and two sets of holes to hold the Picatinny rail mounts.  For anyone who is interested, the necessary SolidWorks files can be found here.

The Cables
Now that you've got the Kinect and BeagleBoard anchored to the robot, you need to power them, and allow the Kinect to communicate with the robot.  Let's begin by solving the power problem.

First of all, this website was helpful, because its authors actually took apart a Kinect cable and put up pictures of the guts, so we knew what we were getting into before we started slicing and dicing.  Below is what the standard Kinect power cable looks like:


As you can see, the 12V wall plug and the USB data cable meet at a special proprietary connector, which the Kinect then plugs into.  To begin, we cut off the 12V wall plug and soldered a standard 5.5mm male connector onto it, as you can see in the photo below:


To the proprietary connector, we soldered a standard 5.5mm female connector, as shown below:


Next, we made the below cable:


On the left edge is a shielded power connector that connects to the robot's 12V power supply.  On the right side is a standard 5.5mm male connector.  Thus, it is easy for us to switch between powering the Kinect from a standard wall socket and powering it from the robot.  We made another, similar cable to connect to the robot's 5V power supply; this cable connects to and powers the BeagleBoard.

With our power issues solved, the only remaining cable issues were for data.  The BeagleBoard has multiple USB ports, so data from the Kinect was simply sent to the BeagleBoard over USB, as shown in the picture below:



Finally, to communicate between the BeagleBoard and the robot, we simply connected the BeagleBoard's ethernet port to the ethernet port on the robot, using the ethernet cable below:


And that's pretty much it.

Here is a photo with everything put together and attached to the robot:



Below is a video explanation of all of our connections:



Parts List
  1. BeagleBoard xM- http://beagleboard.org/hardware-xM
  2. Standard 5.5mm female power connector- DigiKey P/N: CP3-1000-N
  3. Shielded power connector- DigiKey P/N: CP-1380-N
  4. Standard male 5.5mm power receptacle