8/10/2019 Akhil Sean
Leveraging the Kinect SDK to Control a Remote Device
Akhil Acharya and Sean Freemerman
Summer Ventures in Math and Science 2013
Visualization and Image Processing
Mentors: Dr. Rahman Tashakkori, Luke Rice, Bahar Akhtar, Dan Thyer
Appalachian State University
Abstract: The Microsoft Kinect sensor detects human motion, allowing for hands-free interaction with
computer interfaces. This paper examines how to utilize this functionality in conjunction with software
written to control AirSwimmers, remote controlled indoor blimps that receive movement commands
through infrared signals. The results suggest that this technology is a potentially useful and powerful
control technique.
1.0 Introduction
Image processing and computer vision are important fields of computer science, each with a multitude of
real-world applications. An example of a mainstream device capable of computationally intensive
real-time image processing is the Microsoft Kinect. Introduced in 2010, the Kinect can detect
human motion in 3D space, providing a novel method of human-computer interaction by leveraging an
array of audio and video sensors. The project was soon recognized for its potential
applications beyond game control: this new way of interfacing with technology provided a relatively
cheap means of 3D imaging and tracking in real time [1]. This research aims to demonstrate the potential
of Microsoft's Kinect by exploring how the device can be used to control a variety of infrared (IR)
devices. For the purposes of this paper, a remote control (RC) blimp was used.
1.1 Real-time Image Processing on Kinect
The Kinect can track three-dimensional movement. It does this using an infrared system that
emits many beams of infrared light, each distinguishable from the others. The Kinect maps the way the
light bounces back, which is its main method of determining object depth. The Kinect then uses a
multi-node randomized decision tree, trained on predefined images, to determine the location of the user.
Its final skeletal estimation is projected to two dimensions on the person's coronal plane and set
relative to a coordinate system defined by the Kinect's field of view [2].
Figure 1 - Kinect Toolkit Browser
The Kinect Developer Toolkit Browser is a set of projects that Microsoft has created for people
developing software for the Kinect. It contains a variety of sample code provided as a base for
application development, including the samples shown in Figure 1. Example projects range from those
leveraging the Kinect's IR sensor for skeletal and facial tracking to voice recognition projects. These
widely varying projects suggest the possibility of using the Kinect to interact with technology in many
innovative and novel ways.
2.0 Methods
The Skeleton Basics WPF example project was used as a base for our application, as it includes all
of the setup needed to initialize the Kinect, track skeletons, and display tracked skeletons to an image
in real time. This example project was renamed, and all further additions to the project were coded in
C# using the Software Development Kit (SDK) and libraries developed by Microsoft.
2.1 Development
The first modification to the code was to disable tracking of body parts that were not used to
control the IR device: their data was irrelevant to the code, wasted computing cycles, and
slowed the application's execution. Additionally, it was decided early in the project that all code
would be written in a modular, object-oriented fashion. Consequently, major tasks were separated into
objects to take full advantage of C#'s object-oriented design. All code was also maintained both on
an external flash drive and on a Git-compliant version control host. The codebase was hosted on
bitbucket.org, and new work was committed and pushed after each day of work.
The joint location data provided by the Kinect was used to programmatically determine the right hand's
position relative to the right shoulder in the Kinect's coordinate system. This was achieved
by obtaining the X/Y coordinate pair for each joint and subtracting the shoulder's coordinates from the
hand's. The resulting values indicate whether the hand is above, below, left of, or right of the
shoulder, as shown in Figure 2.
Figure 2 - Visualization of the Kinect's interpretation
of subject movement
Figure 3 - An AirSwimmer Shark Variant, used in
testing [3]
The IR device used in this paper's studies is a commercially available RC blimp known as an
AirSwimmer. The AirSwimmer has one motor that moves a weight for pitch control and another that pivots
the tail of this fish-shaped blimp back and forth, providing both a mechanism for turning and thrust.
To allow the AirSwimmer's settings to remain constant, a null space was implemented for the distance
between the hand and the shoulder. Without it, the device would be much harder to control: one could
not change only one motor at a time, because every input would be interpreted as a request both to go
left or right and to change pitch. The null space in the vertical and horizontal directions makes it
possible to change only one control parameter at a time.
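As a sketch of the relative-position and null-space logic described above (not the paper's actual code), the check might look like the following C#. The class and method names here are hypothetical, and the 0.2 threshold matches the bounds used in Figure 7.

```csharp
using System;
using System.Collections.Generic;

public static class DirectionClassifier
{
    // Returns the directions implied by the hand's position relative to the
    // shoulder. Displacements inside the +/-0.2 null space produce no command,
    // so only one control parameter changes per axis.
    public static List<string> Classify(double handX, double handY,
                                        double shoulderX, double shoulderY)
    {
        double dx = handX - shoulderX;   // positive: hand right of shoulder
        double dy = handY - shoulderY;   // positive: hand above shoulder

        var directions = new List<string>();
        if (dy > 0.2) directions.Add("Up");
        else if (dy < -0.2) directions.Add("Down");
        if (dx > 0.2) directions.Add("Right");
        else if (dx < -0.2) directions.Add("Left");
        return directions;               // empty list: hand in null space
    }
}
```

A hand held 0.5 units to the right of the shoulder but only 0.1 units above it would therefore yield only a "Right" command, leaving the pitch motor untouched.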
2.2 Control System
Figure 4 - IR Transmitter (Dangerous Prototypes IR Toy) [4]
The AirSwimmers are controlled using a two-application system. First, the Kinect detects the locations
of the joints on the operator's body. Next, the Kinect sends this information to the computer, which
runs a program developed to interpret this data (as described in 2.3). The program passes these signals
on to a second application, WinLIRC, through the Transmit.exe application, distributed as a part of the
WinLIRC package. Transmit.exe works by taking arguments given to it by the developed program and
passing them on to the main WinLIRC program. WinLIRC is a Windows port of the Linux Infrared
Remote Control (LIRC) application that provides an interface between the IR Toy and the computer.
Figure 5 - Data Flow
WinLIRC takes Transmit.exe's input and then reads a pre-specified configuration file to ascertain
whether it contains the command given by Transmit.exe. This configuration file, which is user provided,
should contain a list of all the commands one might wish to send, along with the actual data to be sent
when each command is invoked. If the command and its arguments given through Transmit.exe match one
specified in the configuration file, WinLIRC takes the data the file associates with that command and
sends it to the USB device shown in Figure 4, called an IR Toy. This apparatus is an infrared
transmitter that parses the serial data transmitted by WinLIRC and emits it as a series of infrared
pulses. These pulses are detected by the infrared receiver on the AirSwimmer and decoded by the onboard
microcontroller. If the unit determines that the received pulses constitute a valid command, the small
servo motors are given power, causing them to move. In turn, this motor movement propels the fish and
can be used to change direction. The entire process is illustrated in Figure 5, and is repeated as long
as someone is in the Kinect's view.
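The user-provided configuration file follows the standard LIRC format. As an illustration only, a minimal raw-codes entry for the AirSwimmer remote might look like the fragment below; the timing parameters and pulse/space values are placeholders, not the actual captured signal.

```
begin remote
  name   AirSwimmer
  flags  RAW_CODES
  eps            30
  aeps          100
  gap         90000

  begin raw_codes
      name SL
          1000   500  1000   500  2000
  end raw_codes
end remote
```

Each named entry (such as SL here) is the command Transmit.exe refers to by name, and the numbers are the pulse and space durations, in microseconds, that the IR Toy replays.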
Figure 6 - Screenshots of the application tracking relevant joints in multiple directions
(panels: Right, Left, Down, Up)
2.3 Directional Detection
The application ascertains direction by determining whether the relative position of the hand joint
satisfies the bounds for left, right, up, down, or center. These directions are then passed to a
function that interfaces with the WinLIRC package to send the appropriate IR signals. Each command is
sent using the format: {Transmit.exe Location} {WinLIRC Configuration Location} {Fish Type and Direction}
{Command repeat}. Since the application is configured to include the file path to WinLIRC, only the
names of the executable and configuration file are required. Table 1 provides example formats for each
direction.
Table 1 - Command Templates (r is a variable repeat value)
Direction     Conditional    Arguments
Turn Left     x < -0.2       Transmit.exe AirSwimmer SL r  or  Transmit.exe AirSwimmer CL r
Turn Right    x > 0.2        Transmit.exe AirSwimmer SR r  or  Transmit.exe AirSwimmer CR r
Pitch up      y > 0.2        Transmit.exe AirSwimmer SU r  or  Transmit.exe AirSwimmer CU r
Pitch down    y < -0.2       Transmit.exe AirSwimmer SD r  or  Transmit.exe AirSwimmer CD r
This can be translated into C# code with a series of if and else-if statements, as illustrated in the
code segment in Figure 7.
//Detect Vertical Direction
if (handY > .2)
{
    if (doSend)
    {
        control.turnUp();
    }
}
else if (handY < -.2)
{
    if (doSend)
    {
        control.turnDown();
    }
}

//Detect Horizontal Direction
if (handX > .2)
{
    if (doSend)
    {
        control.turnRight();
    }
}
else if (handX < -.2)
{
    if (doSend)
    {
        control.turnLeft();
    }
}
Figure 7 - Code segment showing the simple algorithm used to determine bounds.
If the hand's position falls outside the null space on the x or y axis, a command is sent to WinLIRC
using the SendMoveSignal method. This method is contained within the AirSwimmerControl class and is
shown in Figure 8.
private string SendMoveSignal(String direction, int repeats)
{
    String repeat = repeats.ToString();

    //fishType is a variable instantiated in the AirSwimmerControl constructor
    String arguments = "AirSwimmer " + fishType + direction + " " + repeat;

    ProcessStartInfo startInfo = new ProcessStartInfo();
    startInfo.FileName = Constants.FileLocations.lircLocation;
    startInfo.Arguments = arguments;

    try
    {
        Process.Start(startInfo);
        return "Moved";
    }
    catch (Exception e)
    {
        return "Exception: " + e.ToString();
    }
}
Figure 8 - Sending IR Data in the AirSwimmerControl class
3.0 Results
After development of the preliminary control application, the system's real-world effectiveness was
tested. Commands sent to the AirSwimmer executed, but latency was a major issue. Although the Kinect
processes images quickly on the local machine, testing revealed significant discrepancies between the
operator's movement and the movement of the fish. Furthermore, the fish kept moving even after the
application was halted and the operator was out of the frame. It was therefore decided to document lag
as a function of the repeat parameter passed to Transmit.exe on startup. To do this, five experiments
were conducted, each with four trials. In each trial, sixteen commands were sent: four sets of Up,
Down, Left, and Right. After the motions associated with the commands had finished, and input was no
longer being actively given to the system, the time it took for the AirSwimmer to stop moving was
measured. The first four experiments used the Kinect application, while the last served as a control,
using the AirSwimmer's default control mechanism, an IR remote.
Table 2 - Input Lag Trials
Figure 9 - Repeats compared to Input Latency
3.1 Analysis
According to the data collected, the number of command repeats is strongly correlated with the delay
between the termination of commands and the time the balloon takes to stop moving. As such, altering
the application to send each command only once, without repeats, is the best way to minimize input lag,
although doing so curtails the system's built-in mechanisms for redundancy.
4.0 Conclusion
The Kinect is a powerful tool for developers, providing an interface to the world of computing through
simple body motion. This paper explores just one such application. Using the Kinect with the
AirSwimmers required considerable problem solving, as the initial approaches failed to work properly.
That said, once properly configured, the Kinect proved itself a viable means of controlling the
AirSwimmers. The Kinect offers an entirely new form of input to computer systems and opens an intuitive
form of interaction with the technological world.
In addition to the motion control implementation documented in this paper, the Kinect also includes a
built-in microphone for speech analysis, something a future project may wish to take advantage of.
This paper outlines moving a fish balloon, but the potential applications stretch much further. One may
already have a Kinect in one's living room, and changing the commands to control a home entertainment
system over infrared would be possible using the techniques outlined in this paper.
References
[1] Rob Knies, http://research.microsoft.com/en-us/news/features/kinectforwindowssdk-022111.aspx
[2] John MacCormick, "How does the Kinect work?", presented at the Department of Mathematics and
Computer Science, Dickinson College, Carlisle, PA, September 6, 2011.
[3] GadgetsIn.com, http://gadgetsin.com/uploads/2011/09/airswimmer_remote_controlled_flying_shark_and_clownfish_3.jpg
[4] Dangerous Prototypes, http://dangerousprototypes.com/docs/images/thumb/b/b8/Irtoy-cover.jpg/250px-Irtoy-cover.jpg