
Leveraging the Kinect SDK to Control a Remote Device

    Akhil Acharya and Sean Freemerman

    Summer Ventures in Math and Science 2013

    Visualization and Image Processing

    Mentors: Dr. Rahman Tashakkori, Luke Rice, Bahar Akhtar, Dan Thyer

    Appalachian State University

Abstract: The Microsoft Kinect sensor detects human motion, allowing for hands-free interaction with computer interfaces. This paper examines how to utilize this functionality in conjunction with software written to control AirSwimmers, remote-controlled indoor blimps that receive movement commands through infrared signals. The results suggest that this technology is a potentially useful and powerful control technique.

    1.0 Introduction

Image processing and computer vision are important fields of computer science, each with a multitude of real-world applications. An example of a mainstream device capable of computationally intensive real-time image processing is the Microsoft Kinect. Introduced in 2010, the Kinect can detect human motion in 3D space, providing a novel method of human-computer interaction by leveraging an array of audio and video sensors. The device was soon recognized for its potential applications beyond game control, as it provided a relatively cheap new way to do 3D imaging and tracking in real time [1]. This research aims to demonstrate the potential of Microsoft's Kinect by exploring how the device can be used to control a variety of infrared (IR) devices. For the purposes of this paper, a remote-controlled (RC) blimp was used.

    1.1 Real-time Image Processing on Kinect

The Kinect has the ability to track three-dimensional movement. It does this using an infrared system that emits many beams of infrared light, each distinguishable from the others. The Kinect maps the way the light bounces back, which is its main method of determining object depth. The Kinect then uses a multi-node randomized decision tree, trained on predefined images, to determine the location of the user. Its final skeletal estimation is corrected to two dimensions on the person's coronal plane and set relative to a coordinate system defined by the Kinect's field of view [2].


Figure 1 - Kinect Toolkit Browser

The Kinect Developer Toolkit Browser is a set of projects that Microsoft has created for people developing software for the Kinect. It contains a variety of sample code provided as a base for application development, including the projects shown in Figure 1. Examples include projects leveraging the Kinect's IR sensor for skeletal and facial tracking, as well as voice recognition projects. These widely varying projects suggest the possibilities of using the Kinect to interact with technology in many innovative and novel ways.

    2.0 Methods

The Skeleton Basics WPF example project was used as a base for our application, as it includes all proper setup methods to initialize the Kinect, track skeletons, and display tracked skeletons to an image in real time. This example project was renamed, and all further additions to the project were coded in C# using the Microsoft Software Development Kit (SDK) and its libraries.
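As an illustration, the following is a minimal sketch of the sensor setup the Skeleton Basics sample performs, assuming the Kinect for Windows SDK v1 API; the handler name OnSkeletonFrameReady is a placeholder for this application's frame-processing code, not a name from the paper.

    using System.Linq;
    using Microsoft.Kinect;

    // Find the first connected Kinect sensor on this machine.
    KinectSensor sensor = KinectSensor.KinectSensors
        .FirstOrDefault(s => s.Status == KinectStatus.Connected);

    if (sensor != null)
    {
        // Enable skeletal tracking and handle each frame as it arrives.
        sensor.SkeletonStream.Enable();
        sensor.SkeletonFrameReady += OnSkeletonFrameReady;
        sensor.Start();
    }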


    2.1 Development

The first action taken to modify the code was to disable tracking of body parts that were not used to control the IR device, as they were irrelevant to the application, wasted computing cycles, and slowed the application's execution. Additionally, it was decided early in the project that all code written would be produced in a modular, object-oriented fashion. Consequently, major tasks were separated into objects to take full advantage of C#'s object-oriented design. All code was also maintained both on an external flash drive and on a Git-compliant version control host; the codebase was hosted on bitbucket.org, and new work was committed and pushed there after each day of work.

The joint location data provided by the Kinect was used to programmatically determine the right hand's position relative to the right shoulder using the Kinect's provided coordinate system. This was achieved by obtaining the X/Y coordinate pair for both the right shoulder and the right hand, and subtracting one pair from the other componentwise. The resulting values indicate whether the hand is above, below, to the left, or to the right of the shoulder, as shown in Figure 2; a sketch of this calculation follows the figure captions below.

Figure 2 - Visualization of the Kinect's interpretation of subject movement

Figure 3 - An AirSwimmer shark variant, used in testing [3]
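A minimal sketch of this calculation, assuming the SDK v1 Skeleton type and a tracked skeleton named skeleton; handX and handY are the relative offsets that the direction logic in Figure 7 tests against its bounds.

    // Joints of interest from the tracked skeleton.
    Joint hand = skeleton.Joints[JointType.HandRight];
    Joint shoulder = skeleton.Joints[JointType.ShoulderRight];

    // Subtract the coordinate pairs componentwise: a positive handY means
    // the hand is above the shoulder, and a positive handX means it is to
    // one side of it.
    float handX = hand.Position.X - shoulder.Position.X;
    float handY = hand.Position.Y - shoulder.Position.Y;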

The IR device used in this paper's studies is a commercially available RC blimp known as an AirSwimmer. The AirSwimmer has one motor that moves a weight for pitch control and another that pivots the tail of this fish-shaped blimp back and forth, providing a mechanism for both turning and thrust.


To allow the AirSwimmer's settings to remain constant, a null space for the distance between the hand and shoulder was implemented. Without it, the device would be much harder to control: one could not change only one motor at a time, because every input would be interpreted as a request both to go left or right and to change pitch. The null space in the vertical and horizontal directions makes it possible to change only one control parameter at a time, as the sketch below illustrates.
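A minimal sketch of the null-space check, assuming the 0.2 bound that appears later in Figure 7; an offset inside the zone on a given axis produces no command on that axis.

    // Bound assumed from Figure 7; offsets are in the Kinect's skeleton space.
    const float NullSpace = 0.2f;

    // A command is sent on an axis only when the hand leaves the null space,
    // so small, incidental movements change neither motor.
    bool sendHorizontal = Math.Abs(handX) > NullSpace;
    bool sendVertical = Math.Abs(handY) > NullSpace;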

    2.2 Control System

    Figure 4 - IR Transmitter (Dangerous Prototypes IR Toy) [4]

The AirSwimmers are controlled using a two-application system. First, the Kinect detects the locations of the joints on the body of the controller. Next, the Kinect sends this information to the computer, which runs a program developed to interpret this data (as described in Section 2.3). The program passes these signals on to the second application, WinLIRC, through the Transmit.exe application, distributed as a part of the WinLIRC package. Transmit.exe works by taking arguments given to it by the developed program and passing them on to the main WinLIRC program. WinLIRC is a Windows port of the Linux Infrared Remote Control (LIRC) application that provides an interface between the IR Toy and the computer.


    Figure 5 - Data Flow

WinLIRC takes Transmit.exe's input and then reads a pre-specified configuration file to ascertain whether it contains the command given by Transmit.exe. This configuration file, which is user-provided, contains a list of all the commands one might wish to send, along with the actual data to be sent when one of the specified commands is invoked. If the command and its arguments given through Transmit.exe match one specified in the configuration file, WinLIRC takes the data the file associates with that command and sends it to the USB device shown in Figure 4, called an IR Toy. This apparatus is an infrared transmitter that parses serial data transmitted by WinLIRC and emits it as a series of infrared pulses. These pulses are detected by the infrared receiver on the AirSwimmer and decoded by the onboard microcontroller. If the unit determines that the received pulses constitute a valid command, the small servo motors are given power, causing them to move. In turn, this motor movement propels the fish and can be used to change direction. This entire process is illustrated in Figure 5 and is repeated as long as someone is in the Kinect's view.
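As an illustration, a user-provided configuration file in LIRC's raw-codes format might look like the sketch below. The remote name AirSwimmer and the command names match Table 1, but the pulse/space timings are placeholders for illustration, not the AirSwimmer's actual IR protocol.

    begin remote
      name  AirSwimmer
      flags RAW_CODES
      gap   90000

      begin raw_codes
        # Shark Left: alternating pulse/space durations in microseconds
        # (placeholder values, not measured from the real remote)
        name SL
          1200  400  1200  400  400  1200  1200
        # Shark Right
        name SR
          1200  400  400  1200  1200  400  1200
      end raw_codes
    end remote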


Figure 6 - Screenshots of the application tracking relevant joints in multiple directions (right, left, down, and up)

    2.3 Directional Detection

The application ascertains direction by determining whether the relative position of the hand joint satisfies the bounds for left, right, up, down, or center. These directions are then passed to a function that interfaces with the WinLIRC package to send the appropriate IR signals. A custom command is sent each time, using the format: {Transmit.exe Location} {WinLIRC Configuration Location} {Fish Type and Direction} {Command Repeat}. Since the application is configured to include the file path to WinLIRC, only the names of the executable and configuration file are required. Table 1 provides example formats for each direction.

Table 1 - Command Templates (r is a variable repeat value)

    Direction     Conditional    Arguments
    Turn Left     x < -0.2       Transmit.exe AirSwimmer SL r  or  Transmit.exe AirSwimmer CL r
    Turn Right    x > 0.2        Transmit.exe AirSwimmer SR r  or  Transmit.exe AirSwimmer CR r
    Pitch Up      y > 0.2        Transmit.exe AirSwimmer SU r  or  Transmit.exe AirSwimmer CU r
    Pitch Down    y < -0.2       Transmit.exe AirSwimmer SD r  or  Transmit.exe AirSwimmer CD r
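For example, a hypothetical invocation that pitches the shark variant down and repeats the command twice would be: Transmit.exe AirSwimmer SD 2.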

  • 8/10/2019 Akhil Sean

    7/10

    7

This logic can be translated into C# code by implementing a series of if and else-if statements, as illustrated in the code segment in Figure 7.

    // Detect vertical direction
    if (handY > .2)
    {
        if (doSend)
        {
            control.turnUp();
        }
    }
    else if (handY < -.2)
    {
        if (doSend)
        {
            control.turnDown();
        }
    }

    // Detect horizontal direction
    if (handX > .2)
    {
        if (doSend)
        {
            control.turnRight();
        }
    }
    else if (handX < -.2)
    {
        if (doSend)
        {
            control.turnLeft();
        }
    }

    Figure 7 - Code segment showing the simple algorithm used to determine bounds.

If the hand's position falls outside of the null space on the x or y axis, a command is sent to WinLIRC using the SendMoveSignal method. This method is contained within the AirSwimmerControl class and is shown in Figure 8.


    private string SendMoveSignal(String direction, int repeats)
    {
        String repeat = repeats.ToString();

        // fishType is a variable instantiated in the AirSwimmerControl constructor
        String arguments = "AirSwimmer " + fishType + direction + " " + repeat;

        ProcessStartInfo startInfo = new ProcessStartInfo();
        startInfo.FileName = Constants.FileLocations.lircLocation;
        startInfo.Arguments = arguments;

        try
        {
            Process.Start(startInfo);
            return "Moved";
        }
        catch (Exception e)
        {
            return "Exception: " + e.ToString();
        }
    }

    Figure 8 - Sending IR Data in the AirSwimmerControl class
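For context, a hypothetical sketch of how the turn helpers called in Figure 7 might wrap SendMoveSignal; the single-letter direction codes combine with fishType to form the commands in Table 1, and repeatCount is an assumed field holding the repeat value.

    // Each helper forwards a direction letter to SendMoveSignal, which
    // prepends the fish type ("S" or "C") to form commands such as "SL".
    public string turnUp()    { return SendMoveSignal("U", repeatCount); }
    public string turnDown()  { return SendMoveSignal("D", repeatCount); }
    public string turnLeft()  { return SendMoveSignal("L", repeatCount); }
    public string turnRight() { return SendMoveSignal("R", repeatCount); }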

    3.0 Results

After development of the preliminary control application, the system's real-world effectiveness was tested. While commands sent to the AirSwimmer executed, latency was a major issue. Although the Kinect processes images quickly on the local machine, our testing revealed significant discrepancies between our movement and the movement of the fish. Furthermore, the fish kept moving even after the application was halted and the controller was out of the frame. It was then decided to document lag as compared to the repeat parameter programmatically passed to Transmit.exe on startup. To do this, five experiments were conducted, each with four trials. Sixteen commands were sent, or four sets of Up, Down, Left, and Right. After the motions associated with the commands were finished, and input was no longer being actively given to the system, the amount of time it took for the AirSwimmer to stop moving was measured. The first four experiments were done using the Kinect application, while the last was done as a control to check the response of the AirSwimmer's default control mechanism, an IR remote.


    Table 2 - Input Lag Trials

    Figure 9 - Repeats compared to Input Latency

    3.1 Analysis

According to the data collected, the delay between the termination of commands and the amount of time the balloon takes to stop moving is strongly correlated with the repeat value. As such, altering the application to send each command only once, without repeats, is the best way to minimize input lag, although doing so curtails the system's built-in mechanism for redundancy.

    4.0 Conclusion

The Kinect is a powerful tool for developers. It provides an interface to the world of computing through simple body motion, and this paper explores just one such application. Using the Kinect with the AirSwimmers required a great deal of problem-solving, as the initial approaches taken failed to work properly. That said, once properly configured, the Kinect proved itself a viable means of controlling the AirSwimmers. The Kinect brings an entirely new form of input to computer systems and opens an intuitive form of interaction with the technological world.


In addition to the motion control implementation documented in this paper, the Kinect can also make use of a built-in microphone for speech analysis, something a future project may wish to take advantage of. This paper outlines moving a fish balloon, but the potential applications stretch much further. One may already have a Kinect in one's living room, and changing the commands to control a home entertainment system over infrared would be possible using the techniques outlined in this paper.

References

[1] Rob Knies, http://research.microsoft.com/en-us/news/features/kinectforwindowssdk-022111.aspx

[2] John MacCormick, "How does the Kinect work?", presented at the Department of Mathematics and Computer Science, Dickinson College, Carlisle, PA, September 6, 2011.

[3] GadgetSin.com, http://gadgetsin.com/uploads/2011/09/airswimmer_remote_controlled_flying_shark_and_clownfish_3.jpg

[4] Dangerous Prototypes, http://dangerousprototypes.com/docs/images/thumb/b/b8/Irtoy-cover.jpg/250px-Irtoy-cover.jpg