Elan Microelectronics Corporation v. Apple, Inc.

Filing 236

Declaration of Derek Walter IN SUPPORT OF APPLE INC'S OPPOSITION TO ELAN MICROELECTRONICS CORPORATION'S MOTION FOR PARTIAL SUMMARY JUDGMENT OF INFRINGEMENT OF U.S. PATENT 5,825,352 filed by Apple, Inc. (Attachments: # 1 Exhibit B, # 2 Exhibit C, # 3 Exhibit E, # 4 Exhibit F, # 5 Exhibit I, # 6 Exhibit J, # 7 Exhibit K)(Greenblatt, Nathan) (Filed on 6/2/2011)

Exhibit F

HAND TRACKING, FINGER IDENTIFICATION, AND CHORDIC MANIPULATION ON A MULTI-TOUCH SURFACE

by Wayne Westerman

A dissertation submitted to the Faculty of the University of Delaware in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Electrical Engineering

Spring 1999

© 1999 Wayne Westerman
All Rights Reserved

HAND TRACKING, FINGER IDENTIFICATION, AND CHORDIC MANIPULATION ON A MULTI-TOUCH SURFACE

by Wayne Westerman

Approved: Neal Gallagher, Ph.D.
Chair of the Department of Electrical Engineering

Approved: Andras Z. Szeri, Ph.D.
Interim Dean of the College of Engineering

Approved: John C. Cavanaugh, Ph.D.
Vice Provost for Academic Programs and Planning

I certify that I have read this dissertation and that in my opinion it meets the academic and professional standard required by the University as a dissertation for the degree of Doctor of Philosophy.
Signed: John Elias, Ph.D.
Professor in charge of dissertation

I certify that I have read this dissertation and that in my opinion it meets the academic and professional standard required by the University as a dissertation for the degree of Doctor of Philosophy.
Signed: Charles Boncelet, Ph.D.
Member of dissertation committee

I certify that I have read this dissertation and that in my opinion it meets the academic and professional standard required by the University as a dissertation for the degree of Doctor of Philosophy.
Signed: Phillip Christie, Ph.D.
Member of dissertation committee

I certify that I have read this dissertation and that in my opinion it meets the academic and professional standard required by the University as a dissertation for the degree of Doctor of Philosophy.
Signed: Kenneth Barner, Ph.D.
Member of dissertation committee

I certify that I have read this dissertation and that in my opinion it meets the academic and professional standard required by the University as a dissertation for the degree of Doctor of Philosophy.
Signed: John Scholz, Ph.D.
Member of dissertation committee

ACKNOWLEDGMENTS

Abundant thanks go to my adviser, John Elias, whose fond support, daily teamwork, and unfathomable hardware know-how gave me a unique foundation upon which to compose a dissertation.

Dr. Neal Gallagher, for inviting me to Delaware, ensuring my research in the Electrical Engineering Department and other parts of campus was always fully supported, offering weekly spiritual advice, and challenging me with proclamations of what could and could not be done. May he continue to carry his wisdom all across the country.

Dr. Charles Boncelet, Dr. Kenneth Barner, and Dr. John Scholz, for suggesting many helpful revisions to this manuscript.

Dr. Phillip Christie, for many fascinating lectures, and for challenging me to find the principles behind my inventions.

Dr. Rakesh, for making me write and understand mathematical proofs until they all look trivial.

Chris Thomas, for a most enduring friendship.

My piano teachers Beverly Stephenson and Ruth Anne Rich, for inspiring me with what the hands can do on a properly responsive instrument.

My typing assistants, Sarah Ruth Budd, Sara Levin, Mark Parsia, Denise Lemon, Barbara Westog, my sister Ellen, and my mother Bessie, for helping me to continue to program and publish for the past four years as I perfected less fatiguing methods for data entry.

Samuel Audet, for generously writing HotScroll, OS/2's only continuous scrolling software.

Brian Hall at Microedge, Inc., for adding a keystroke-saving variable-name-completion feature to Visual Slickedit at my behest.

My fellow residents of Lovett Graduate House, for sharing the television when my mind was too weary for anything else.

I am so lucky to have such a loving, patient, and stable family, who always welcome me home twice a year even though I moved so far away.
I thank my father for drawing me back to the farm for refreshing manual labor yet giving me time to develop one crazy project after another on my vacations, and also for enforcing the pragmatism and ethics of the frontier. I thank my mother for spicing my vacation diet with wholesome home-grown foods and swimming. I thank my grandmother Edna and her family for bringing poetry, history, and gentleness to my summers; may she always whisper from above how her family almost got transplanted to California. I thank my grandfather Walt for introducing electricity to our home town with only a fifth grade education; that was only the beginning.

This work was partially funded by a National Science Foundation Graduate Fellowship for Wayne Westerman.

This manuscript is dedicated to: My mother, Bessie, who taught herself to fight chronic pain in numerous and clever ways, and taught me to do the same.

TABLE OF CONTENTS

LIST OF FIGURES
LIST OF TABLES
GLOSSARY
ABSTRACT

Chapter

1 INTRODUCTION
  1.1 The State of Hand-Computer Interaction in 1998
  1.2 Summary of Final Device Operation
    1.2.1 Typing
      1.2.1.1 Default Key Layout
      1.2.1.2 Key Activation
    1.2.2 Chordic Manipulations
      1.2.2.1 Pointing
      1.2.2.2 Dragging
      1.2.2.3 Scrolling
      1.2.2.4 Text Editing
      1.2.2.5 Menu Commands such as Cut, Copy and Paste
  1.3 Hardware Summary
    1.3.1 Sensing Hardware
    1.3.2 Signal Processing Hardware
  1.4 Summary of Contributions
  1.5 What is Not Covered
  1.6 On the Design of Ergonomic Input Devices
    1.6.1 What is the Role of Ergonomic Device Design?
    1.6.2 Ergonomic Design Objectives
      1.6.2.1 Minimize device activation force
      1.6.2.2 Minimize repetitive action of the same muscles
      1.6.2.3 Encourage neutral postures
      1.6.2.4 Allow variation of posture
      1.6.2.5 Minimize user anticipation
      1.6.2.6 Do not discourage rest breaks
    1.6.3 Can so many ergonomic objectives be met at once?
  1.7 Outline

2 PROXIMITY IMAGE FORMATION AND TOPOLOGY
  2.1 Related Methods for Hand Motion Sensing
    2.1.1 Free-Space Gestures
    2.1.2 Data Gloves
    2.1.3 Video Gesture Recognition
    2.1.4 Benefits of Surface Contact
    2.1.5 Sensing Finger Presence
    2.1.6 Tactile Imaging
    2.1.7 Capacitance-Sensing Electrode Arrays
    2.1.8 The MTS's Parallelogram Electrode Array
    2.1.9 No Motion Blur on MTS
  2.2 Tactile Image Formation and Background Removal
    2.2.1 Optical Image Segmentation
    2.2.2 Methods for Proximity Image Formation
      2.2.2.1 Binary Tree Scanning
      2.2.2.2 Brute Array Scanning
      2.2.2.3 Sensor Offset Adaptation
      2.2.2.4 Proximity Image Filtering
  2.3 Topology of Hand Proximity Images
    2.3.1 Flattened Hand Image Properties
    2.3.2 Properties of Hands in the Neutral Posture
    2.3.3 Partially Closed Hand Image Properties
    2.3.4 Pen Grip Image Properties
    2.3.5 Comfortable Ranges of Hand Motion
  2.4 Conclusion

3 HAND CONTACT SEGMENTATION AND PATH TRACKING
  3.1 Notation and Major Variable Types
  3.2 Contact Segmentation
    3.2.1 Introduction to the Contact Segmentation Problem
    3.2.2 Overview of the Segmentation Process
    3.2.3 Proximity Image Smoothing
    3.2.4 Segmentation Strictness Regions
    3.2.5 Segmentation Search Pattern
    3.2.6 Segmentation Boundary Tests
      3.2.6.1 Proximity Significance Tests
      3.2.6.2 Strict Segmentation Region Partial Minima Tests
      3.2.6.3 Flattened Finger Segmentation
      3.2.6.4 Contact Height Limitation Test
      3.2.6.5 Sloppy Segmentation Region Palm Heel Crease Test
    3.2.7 Combining Overlapping Groups
    3.2.8 Extracting Group Parameters
      3.2.8.1 Centroid Computation
      3.2.8.2 Ellipse Fitting
    3.2.9 Performance of the Segmentation Methods
  3.3 Persistent Path Tracking
    3.3.1 Introduction to the Path Tracking Problem
    3.3.2 Prediction of Contact Location
    3.3.3 Mutually Closest Pairing Rule
    3.3.4 Path Parameters
    3.3.5 Path Tracking Results
  3.4 Summary

4 FINGER IDENTIFICATION AND HAND POSITION ESTIMATION
  4.1 Hand Gesture Recognition
    4.1.1 Communicative Gestures versus Manipulative Gestures
    4.1.2 Locating Fingers within Remote Optical Images
    4.1.3 The Feasibility of Identification from Proximity Images
      4.1.3.1 Rubine's Encounter with Finger Identification
      4.1.3.2 Summary of Constraints on Contact Identity
      4.1.3.3 Underconstrained Cases
    4.1.4 Pooling of Fingertip Combinations
  4.2 Overview of the Hand Tracking and Identification System
  4.3 Hand Position Estimation
    4.3.1 Measuring Current Hand Position
    4.3.2 Identification Confidence and Filter Delay
    4.3.3 The Filter Equations
    4.3.4 Enforcing Hand Separation
    4.3.5 Interactions with Segmentation and Identification Modules
  4.4 Finger Identification
    4.4.1 The Basic Attractor Ring
    4.4.2 Voronoi Diagram for Single Contact Identification
    4.4.3 Multiple Contacts Compete for Voronoi Cells
    4.4.4 The Assignment Problem
      4.4.4.1 Localized Combinatorial Search
      4.4.4.2 Choosing Initial Assignments
      4.4.4.3 The Swapping Condition
      4.4.4.4 The k-exchange Sequence
    4.4.5 Geometric Interpretations of the Swapping Condition
      4.4.5.1 Geometric Interpretation of Single Contact Swapping
      4.4.5.2 Geometric Interpretation of Contact Pair Swapping
      4.4.5.3 Summary of Swapping Behavior using Distance-Squared Metrics
      4.4.5.4 Contact Pair Swapping Behavior with Other Metrics
      4.4.5.5 Distance-Squared Assignment as Sorting
      4.4.5.6 Analyzing Swaps on the Attractor Ring
    4.4.6 Tuning the Attractor Ring with Weighted Voronoi Diagrams
      4.4.6.1 Constant Additive Weighting to the Distance Matrix
      4.4.6.2 Static Palm Heel Weightings
      4.4.6.3 Dynamic Feature Weightings
      4.4.6.4 Thumb and Inner Palm Orientation Factor
      4.4.6.5 Thumb Size Factor
      4.4.6.6 Palm Heel Size Factor
      4.4.6.7 Palm Heel Separation Factor
      4.4.6.8 Forepalm Attractors and Weightings
      4.4.6.9 The Fully Weighted Assignment Cost Matrix
      4.4.6.10 Tolerance of Different Hand Sizes
    4.4.7 Thumb Verification
      4.4.7.1 Inner Finger Separation Factor
      4.4.7.2 Inner Finger Angle Factor
      4.4.7.3 Thumb-Fingertip Expansion Factor
      4.4.7.4 Thumb-Fingertip Rotation Factor
      4.4.7.5 Combining and Testing the Thumb Factors
    4.4.8 Ratcheting Identification Accuracy
    4.4.9 Finger Identification Results
  4.5 Hand Identification
    4.5.1 Checking for Contact Stabilization
    4.5.2 Placing Left and Right Attractor Rings
    4.5.3 Generating Plausible Partition Hypotheses
    4.5.4 The Optimization Search Loop
    4.5.5 Partition Cost Modifiers
      4.5.5.1 Clutching Direction Factor
      4.5.5.2 Handedness Factor
      4.5.5.3 Palm Cohesion Factor
      4.5.5.4 Inter-Hand Separation Factor
    4.5.6 Hand Identification Results
  4.6 Conclusions

5 CHORDIC MANIPULATION
  5.1 Related Input Devices
    5.1.1 Fitts' Law and Pointing Performance
      5.1.1.1 Tracking Delay
    5.1.2 Integrating Typing and Pointing
      5.1.2.1 Embedding Pointing Devices in Mechanical Keyboards
      5.1.2.2 Detecting Pointing Gestures Above a Keyboard
      5.1.2.3 One Hand Points, the Other Types
      5.1.2.4 Touch Pads and Screens
    5.1.3 Manipulation in more than Two Degrees of Freedom
      5.1.3.1 Integrality vs. Separability
      5.1.3.2 Bimanual Manipulation
    5.1.4 Channel Selection
  5.2 Synchronization and Typing Detection
    5.2.1 Keypress Registration
    5.2.2 The Synchronization Detector
      5.2.2.1 Sorting Paths by Press and Release Times
      5.2.2.2 Searching for Synchronized Finger Subsets
      5.2.2.3 Synchronization Detector Decisions and Actions
      5.2.2.4 Issuing Chord Taps
      5.2.2.5 Avoiding Accidental Mouse Clicks
    5.2.3 Keypress Acceptance and Transmission
      5.2.3.1 Handling Modifier Keys
      5.2.3.2 Alternatives to Full Taps from Suspended Hands
      5.2.3.3 Potential Typing Speeds
    5.2.4 Typing Summary
  5.3 Hand Motion Extraction
    5.3.1 Inputs to the Extraction Algorithm
    5.3.2 Scaling and Rotation Component Extraction
    5.3.3 Translation Component Extraction
    5.3.4 Dead Zone Filtering
    5.3.5 Motion Extraction Results
    5.3.6 Motion Extraction Conclusions
  5.4 Chord Motion Recognition
    5.4.1 Channel Selection
      5.4.1.1 Channels Follow Finger Combinations
      5.4.1.2 Initial Finger Combination Sets Channel
    5.4.2 MTS Chord Motion State Machine
      5.4.2.1 State C: Channel Selection
      5.4.2.2 State SC: Synced Subset Channel Selection
      5.4.2.3 State M: Manipulation
    5.4.3 Chord Mappings
  5.5 Conclusions

6 PRELIMINARY EVALUATION, FUTURE DIRECTIONS, AND CONCLUSIONS
  6.1 Testimonial and Case Study of the Author
    6.1.1 My Fitness as an Evaluator
    6.1.2 Equipment and Methods
    6.1.3 Typing
    6.1.4 Weekly Symptoms
      6.1.4.1 First Two Weeks
      6.1.4.2 Third and Fourth Weeks
      6.1.4.3 Fifth and Sixth Weeks
      6.1.4.4 Seventh and Eighth Weeks
      6.1.4.5 Ninth and Tenth Weeks
      6.1.4.6 Conclusions
    6.1.5 Recognition Errors and Accidental Activations
      6.1.5.1 Benefits of Higher Frame Rates
    6.1.6 Chordic Manipulation Performance
  6.2 Future Evaluations
    6.2.1 Usability Trials
    6.2.2 RSI Case Studies
    6.2.3 Typing Fatigue Studies
  6.3 Future Directions for MTS Development
    6.3.1 Increased Array Resolution
    6.3.2 Handwriting Recognition
    6.3.3 Universal Access
    6.3.4 Fault Tolerant Segmentation
    6.3.5 Upgrading Operating Systems for High-DOF, Bimanual Manipulation
  6.4 Conclusion

Appendix

A ERGONOMICS FOR ENGINEERS
  A.1 Risk Factors for RSI
  A.2 The Role of Force Repetition in Soft Tissue Damage
  A.3 Activation Forces of Input Devices
  A.4 Relevance to the MTS

B VERTICAL INTERPOLATION BIASES ON PARALLELOGRAM ELECTRODE ARRAYS
  B.1 Nonlinear Vertical Centroid for Parallelogram Interpolation

C CONVERGENCE TRAPS FOR LOCALIZED COMBINATORIAL SEARCH ON AN ATTRACTOR RING

BIBLIOGRAPHY

Chapter 2

PROXIMITY IMAGE FORMATION AND TOPOLOGY

Limited hand and finger tracking experiments have previously been conducted with a variety of sensing technologies. This chapter begins with a review of these sensing technologies and explains why proximity sensing arrays are particularly well-suited for everyday applications of hand tracking. Then the chapter discusses proximity image pre-processing such as background object removal, sensor offset adaptation, and electrical noise filtering. The chapter concludes with a sampling of proximity images which illustrate the typical features and arrangements of hand contacts. This hand topology section is particularly important to the understanding of the contact segmentation and identification algorithms in Chapters 3 and 4, which rely heavily on relative contact shape and position constraints.

2.1 Related Methods for Hand Motion Sensing

Hand position and motion can conceivably be detected with mechanical or electromagnetic sensors attached to the hand, with remote optical or acoustical sensors, or with proximity or pressure sensors mounted on an object in the user's environment. At first glance the attached sensor methods seem advantageous because they can capture three-dimensional hand activity in free space, unconstrained by the physical form factor of an interfacing object. Data gloves and computer vision systems have been popular in virtual reality experiments for this reason. Such systems are clearly appropriate for capturing the free-space hand gestures and sign language as they appear in communication between humans, but several factors make them impractical for everyday human-computer interaction.

2.1.1 Free-Space Gestures

The first problem lies with holding or slowly adjusting hand position in free space. The quick, relative motions of sign language may be easy to perform, but holding the unsupported hands out in front of the body for extended periods is very tiring [152, 153]. In such postures fingertip positions are also somewhat unstable, so considerably less precision is possible than when some part of the hand or arm rests against a firm object. Also, it is very difficult for a computer to distinguish motions intended to be instructions for the computer from postural adjustments or gestures to co-workers. This is known as the gesture saliency problem. To appreciate the difficulty of this problem, consider how often we humans mistakenly think someone is gesturing at us when the gesture is actually intended for someone behind us or no one at all. If the direction of gaze of the sender is not known, determining the intended recipient of gestures is even more troublesome.

2.1.2 Data Gloves

Free-space motion sensing technologies have limitations as well.
Though DataGloves [148] can potentially capture the entire range of finger flexion and extension, in practice the flexion sensors are imprecise yet expensive and cumbersome to wear. Furthermore, as a bodily attachment, gloves must often be removed when the user resumes non-computer tasks. This is both a practical disadvantage and an ergonomic disadvantage because it discourages users from taking rest breaks and mixing in non-computer tasks which rely on other muscle groups. FakeSpace, Inc. [36] markets pinch or chord gloves for virtual reality systems which detect contact between electrically conducting fingertip pads rather than general flexion and extension of the fingers. The lack of flexion sensors reduces cost, and consistent with the design philosophy of this dissertation, such physical fingertip contact turns out to be more reliable and easier to learn than free-space finger motion gestures [60].

2.1.3 Video Gesture Recognition

Computer vision technologies avoid the encumbrance of wearing gloves but cannot always infer fingertip location. Assuming decent lighting is available, much of the luminosity information that a video camera supplies is unnecessary for finger tracking, and must be filtered out with computationally intensive algorithms [115]. The body of the hand can occlude the fingertips at some camera and hand angles. Occlusion and limited camera resolution also make it very difficult to determine exactly when the fingers touch a surface.

2.1.4 Benefits of Surface Contact

Most importantly, the emphasis on hand tracking in three-dimensional free space ignores the long history of manipulating hand tools and musical instruments which provide rich haptic feedback as the tool is acquired.
While economics may preclude customizing the shapes of general-purpose input devices as much as hand tools are customized, detection of contact with a physical surface provides, at the bare minimum, a clear demarcation between motions on the surface that the computer is intended to recognize and motions away from the surface that the computer should ignore. Though individual finger activity on a surface is constrained to two-and-a-half dimensions, Chapter 5 will demonstrate that extra degrees of freedom can be extracted from rotational and scaling motions of multiple fingers on a surface. For many applications the improved clarity of user intent and tactile feedback that surface contact imparts will more than make up for the slight reduction in movement freedom.

2.1.5 Sensing Finger Presence

Figure 2.1: The two basic multi-touch proximity sensor arrangements. In a), "projective" row and column spanning sensors integrate across each row and column electrode and only need connections at the edges of the matrix. Touching fingertips can be counted by counting the maxima in the column signals assuming the fingertips lie in a roughly horizontal row unobstructed by thumb or palms. The square sensors in b) only integrate over the local square. The exact locations of any number of fingertip-sized contacts can be interpolated from the 2D array of square sensors, but a connection matrix must be run underneath the sensor array to connect the sensors to signal processing circuitry.

Technologies which have been applied to detecting finger or stylus contact include resistive membranes, surface acoustic wave, active optics and finger capacitance sensing (see Lee's 1984 Master's Thesis [88] for an early review). Most implementations are limited to unambiguous location of a single finger because they rely on what Lee calls "projective" sensor matrices. In a projective matrix (Figure 2.1a), one sensor element is allocated to each row and column at the edge of the active area.
Finger presence anywhere along a row will register on that row's sensor, so that a finger affects roughly one row and one column sensor. While the total number of sensors needed is related to only the square root of the active area, multiple finger contacts can confuse these systems [88]. As was true in 1984, the surface acoustic wave and infrared touchscreens as well as capacitive touchpads on the market still suffer from this limitation.

Some devices on the market partially utilize multiple fingers despite the ambiguities of projective sensing. For example, touchpads manufactured by Logitech, Inc. [15, 78] for laptop computers are able to detect the presence of up to three fingertips. The patent to Bisset and Kasser [15] explains that this is done by assuming the fingers lie in a row and counting the number of maxima in the column projection. However, as will be seen in Figures 2.2 and 2.3 below, this projection maxima counting method becomes ambiguous for larger touch surfaces in which one hand part can intersect the same column as another, such as when both fingers and palms touch the sensing area or the hand rotates so fingers lie diagonally or in a column.

Figures 2.2 and 2.3 demonstrate the limitations of this projection approach compared to the two-dimensional arrays of sensors (Figure 2.1b) to be discussed in Section 2.1.7. Fingertip, thumb, and palm heel surface contacts are simulated with two-dimensional Gaussians of varying widths on the 2D square grid. The grid samples the Gaussians at 2.5 mm intervals such as would occur in a capacitive sensing array with moderate spatial resolution. The darkness of the squares is proportional to the finger capacitance or proximity sampled at the square.
The projective signals which would be measured from the row and column spanning electrodes of Bisset and Kasser [15] are simulated by integrating over each row of the 2D array to obtain the horizontal bar plots to the left of each grid and by integrating over each column to obtain the vertical bar plots under each grid.

Figure 2.2 shows the projection sensing ambiguities which can occur when the fingertip row is not horizontal, but lies diagonally instead due to various hand rotations. In Figure 2.2a-c four maxima appear in both the row and column projections (bar plots), indicating at least four objects are touching the surface, but the projections are the same in each case even though the fingertip arrangements (grid) differ. The same projections could be obtained from a 4 × 4 array of 16 fingertips also, though most human operators will not have that many fingertips. In Figure 2.2d the fingertips are so close together in their diagonal row that the projection maxima merge, though local maxima are still clearly separated by diagonal partial minima in the sampled 2D array.

Figure 2.2: Projection sensor ambiguities for various diagonal arrangements of fingertips. The different fingertip contact arrangements shown on the square sensor grid in a)-c) all produce the same row and column projections (horizontal and vertical bar plots), preventing the projection method from determining the hand rotation, though it can still count the fingertip maxima. In d) the fingertips are so close together that the projection minima between fingertips disappear, preventing fingertip counting, though the diagonal minima are still discernible in the square sensor grid.

Figure 2.3 shows how fingertip counting from projection sensors is occluded by the presence of thumb and palms in a neutral hand position. In Figure 2.3a four fingertips lie in a slight arc, producing four maxima in the column projections and one in the row projection.
Figure 2.3b includes the thumb in nearly the same column as the index fingertip, causing an additional maximum in the row projection (horizontal bars) only. The index fingertip is removed in Figure 2.3c; because the thumb is still in the same columns, the number of projection maxima does not change, though the amplitudes change somewhat. Because the amplitudes also depend on how lightly each finger touches the surface, the change in projection amplitudes cannot reliably resolve this ambiguity; the amplitude changes could also be a result of a lightening in hand pressure. In Figure 2.3d the palms touch as well, leaving three maxima in the row projection but causing the column projection maxima to merge into just two. Therefore from the row projection one could surmise that some palms, the thumb, and some fingertips are touching, but one can no longer tell how many fingertips are touching because the palm column projections get integrated with and obscure the fingertip column signals.

Figure 2.3: Ambiguities in projective sensing caused by presence of the thumb and palms in the same columns as fingertips. a) simply contains a slightly arched row of fingertips producing four column projection maxima (vertical bars at bottom) and one row projection maximum in the horizontal bars. Adding a thumb contact in b) adds a row maximum but not a column maximum because the thumb intersects nearly the same columns as the index fingertip. Removing the index fingertip in c) does not change the number of projection maxima, meaning fingertips cannot be counted reliably in the presence of the thumb. Adding the palms in d) further obscures the fingertip row projection maxima, which get merged with those of the palms.

As Lee points out, measuring projections from additional angles such as diagonals can help disambiguate multiple contacts, as is done in tomography systems, but details inside concave contacts will still be undetectable [88].
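
The projection ambiguities described for Figures 2.2 and 2.3 can be reproduced with a small simulation. The following is a minimal sketch rather than the simulation used for the figures: the grid size, the Gaussian width, the contact coordinates, and the relative threshold used when counting maxima are all assumptions chosen for illustration; only the 2.5 mm sample spacing comes from the text.

```python
import math

GRID = 24        # sensors per side (assumed for illustration)
PITCH_MM = 2.5   # sample spacing used in the simulated grid described above
SIGMA_MM = 4.0   # assumed Gaussian width of a fingertip-sized contact


def proximity_image(contacts_mm):
    """Sample one 2D Gaussian per (x, y) contact on a square sensor grid."""
    image = [[0.0] * GRID for _ in range(GRID)]
    for r in range(GRID):
        for c in range(GRID):
            x, y = c * PITCH_MM, r * PITCH_MM
            image[r][c] = sum(
                math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * SIGMA_MM ** 2))
                for cx, cy in contacts_mm
            )
    return image


def projections(image):
    """Integrate the 2D image along rows and columns, as edge sensors would."""
    row_proj = [sum(row) for row in image]
    col_proj = [sum(col) for col in zip(*image)]
    return row_proj, col_proj


def count_maxima(signal, rel_threshold=0.1):
    """Count local maxima that exceed a fraction of the signal's peak."""
    floor = rel_threshold * max(signal)
    return sum(
        1
        for i in range(1, len(signal) - 1)
        if signal[i] > floor and signal[i - 1] < signal[i] >= signal[i + 1]
    )


def same_signal(a, b, tol=1e-9):
    """True when two projections agree to within floating-point tolerance."""
    return all(math.isclose(p, q, abs_tol=tol) for p, q in zip(a, b))


positions = [10.0, 22.5, 35.0, 47.5]

# Two different diagonal fingertip rows (compare Figure 2.2a-c).
rising = list(zip(positions, positions))
falling = list(zip(positions, reversed(positions)))
rows_a, cols_a = projections(proximity_image(rising))
rows_b, cols_b = projections(proximity_image(falling))
print("same projections:", same_signal(rows_a, rows_b) and same_signal(cols_a, cols_b))
print("same 2D images:  ", proximity_image(rising) == proximity_image(falling))

# A thumb in the index fingertip's column (compare Figure 2.3b-c).
fingertips = [(x, 10.0) for x in positions]
thumb = (10.0, 40.0)
_, cols_with_thumb = projections(proximity_image(fingertips + [thumb]))
_, cols_without_index = projections(proximity_image(fingertips[1:] + [thumb]))
print("column maxima:", count_maxima(cols_with_thumb), count_maxima(cols_without_index))
```

In this sketch the two diagonal arrangements are mirror images of each other, so their row and column sums coincide even though the 2D images differ, and lifting the index fingertip leaves the column maxima count unchanged because the thumb still occupies that column.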
The number of unambiguously locatable contacts is generally one less than the number of projection angles utilized [88]. McAvinney's "Sensor Frame" [107, 108, 129], an attachment to the screen of a computer monitor which senses intersection of fingers with infrared beams from four directions, utilizes this tomography approach to unambiguously locate up to three fingers.

2.1.6 Tactile Imaging

This complex tomography approach can be avoided with a regular two-dimensional array of individually addressable sensors (Figure 2.1b), in which each sensor corresponds to a pixel in a "tactile image." Layered resistive-membrane pressure sensors can be constructed economically in this configuration, but their substantial activation force is ergonomically inferior to zero-activation-force proximity sensing. Another approach is to place a camera under a translucent tabletop and image the shadow of the hands [81, 110]. Unfortunately the bulky optics under the table will limit portability and leg room, and such systems cannot differentiate finger pressure [88]. Active optical imaging with an array of infrared transmitters and receivers on the surface could easily detect finger proximity, but would be prohibitively expensive and power consumptive.

2.1.7 Capacitance-Sensing Electrode Arrays

The remaining option is to measure the capacitance between the fingers and an insulated array of metal electrodes. The presence of a finger effectively increases the electrode capacitance to ground since the capacitance between the conductive fingertip flesh and an electrode plate is typically a few pF but the capacitance of the human body with respect to earth ground is relatively large (about 100 pF) [88]. Since the capacitance between parallel plates drops quickly in inverse proportion to the distance between the plates, this technique can only detect fingers within a few millimeters of the electrodes. Spatial resolution increases dramatically as the fingers approach the electrodes.
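As a rough worked estimate of these magnitudes, assume (as an illustration, not a model taken from the cited work) that the fingertip-to-electrode capacitance $C_f$ and the body-to-ground capacitance $C_b$ act in series, and that $C_f$ follows the ideal parallel-plate law with overlap area $A$, gap $d$ and permittivity $\varepsilon$. Taking $C_f = 2$ pF as a representative "few pF" value and $C_b = 100$ pF:

```latex
\[
C_{\text{added}} = \frac{C_f\,C_b}{C_f + C_b}
  = \frac{2 \times 100}{2 + 100}\ \text{pF} \approx 1.96\ \text{pF},
\qquad
C_f = \frac{\varepsilon A}{d}.
\]
```

Because $C_b \gg C_f$, the electrode sees nearly the full fingertip capacitance, and under the parallel-plate assumption halving the gap $d$ doubles $C_f$, which is consistent with the short sensing range described above.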
Precision of 0.2 mm can easily be obtained with 4 mm electrode spacings by computing a finger centroid, i.e., interpolating between neighboring electrodes. The capacitive technique also indicates finger force up to a couple Newtons because the effective capacitor area increases as the fingertip pulp flattens against the surface [134]. While the limited proximity sensing range of electrode arrays ensures fingertip proximity information is clear and uncluttered, it also prevents detection of the finger joints and palms unless the whole hand is flattened against the surface.

Lee built the first such array in 1984 with 7 mm by 4 mm metal electrodes arranged in 32 rows and 64 columns. The "Fast Multiple-Touch-Sensitive Input Device (FMTSID)" total active area measured 12" by 16", with a 0.075 mm Mylar dielectric to insulate fingers from electrodes. Each electrode had one diode connected to a row charging line and a second diode connected to a column discharging line. Electrode capacitance changes were measured singly or in rectangular groups by raising the voltage on one or more row lines, selectively charging the electrodes in those rows, and then timing the discharge of selected columns to ground through a discharge resistor. The principal disadvantage of Lee's design was that the column diode reverse bias capacitances allowed interference between electrodes in the same column. Even with 2048 electrodes and suitable interpolation between electrodes, the electrode spacing was probably too coarse to reproduce the fine mouse positioning achieved with current single-finger touchpads [46–48, 50, 51, 111]. Though its scanning rate depended irregularly on the number of and positions of surface contacts, for ten fingers it would have only been able to achieve 1-5 fps, which is much too slow for either typing or gesture applications.

Rubine [129, 130] reports seeing another multi-touch tablet demonstrated at AT&T in 1988 by Robert Boie which could detect all ten fingers.
It boasted a 30 fps frame rate and resolution of 1 mil (0.025 mm) in lateral position and 10 bits in pressure. Possibly it measured sensor capacitance with the synchronous detection technology in a 1995 patent by Boie et al. [17] that briefly mentions multi-touch tablets as an application.

2.1.8 The MTS's Parallelogram Electrode Array

The MTS contains a 16 × 96 electrode array (Figure 2.4) much like those in the above multi-touch tablets. It employs a special wedge electrode geometry to reduce the number of rows necessary by a factor of three without causing serious non-uniformities in vertical position interpolation. This reduction in electrode count speeds fabrication of research prototype arrays by lowering the discrete part count, but would not necessarily be beneficial for volume manufacturing techniques. Rectangular electrodes (Figure 2.5) like those used by Lee [88] are more sensitive to vertical position changes near the top and bottom of the electrodes, where it is possible to interpolate between two electrodes, than in the middle of an electrode. If a finger is in the middle, the electrode is so tall that the electrodes above and below do not register enough signal to get a reliable interpolation. In contrast, the vertically interleaved parallelogram electrodes interpolate via their physical geometry. The ratio of the horizontal cross-sections between electrodes in a column varies continuously with vertical location of an object (Figure 2.6a-d). Though this improves uniformity of vertical interpolation compared to rectangular electrodes of similar height, it also has the effect of vertically smearing signals, making it difficult to distinguish objects which appear in the same electrode column less than one row spacing apart. For research prototyping purposes this is tolerable because the fingers tend to lie in a row, no more than one per column.
However, once in a while the thumb or pinky passes behind and intersects columns of the other fingertips, becoming indistinguishable from the fingertip in front of it (see Section 2.3.3). Also, as is discussed in Appendix B, vertical interpolation biases do arise for small contacts which are not centered on or between columns of the parallelogram electrode array. Thus a commercial product, especially one which attempts to recognize a handwriting grip or stylus, would have to abandon the electrode count savings of this scheme for traditional square electrodes and a smaller row spacing.

Figure 2.4: Diagram of electrode layout for the entire 16 × 96 parallelogram electrode array. Row pitch is 1.2 cm and column pitch is 0.4 cm, but electrodes are only 0.25 cm wide. (Axes: horizontal position on surface, X axis, cm; vertical position on surface, Y axis, cm.)

Figure 2.5 (panels a-e): A 3 × 3 section a) of rectangular electrode array. Vertical interpolation between top and bottom electrodes works in b)-c) but not in d)-e).

Figure 2.6 (panels a-d): Vertical interpolation on the parallelogram electrode array is uniform in a)-d) since ratio of hatched cross sections on top and bottom electrodes changes gradually.

2.1.9 No Motion Blur on MTS

Another important characteristic of the MTS is that the sensing array multiplexes much of the integration, buffering and quantization circuitry. Therefore the capacitance of each electrode is measured over a relatively short period of a few hundred microseconds compared to the total array scanning period of ten to twenty milliseconds. This contrasts with the CCD arrays typically used in video cameras which integrate incoming photons at each pixel over most of the period between readouts. An advantage of the MTS's relatively short integration time is that MTS proximity images do not exhibit motion blur.
However, if the scanning rate is not fast enough, quick finger taps over an electrode can occur entirely between measurements of that electrode and be completely missed. When tapping key regions during touch typing, fingers usually remain on the surface for at least 50 ms, but the scan period must be somewhat smaller than this for reliable detection. During the experiments conducted for this dissertation, the array scan frequency or frame rate has been set to 50 fps (corresponding to a period of 20 ms), which ensures that each finger tap shows up in at least one scan. However, at this rate the peak finger pressure as the fingertip bottoms out onto the surface in the middle of the tap cannot be measured accurately because the single scan detecting the tap might occur near the beginning or end of the tap cycle when the finger is barely touching the surface. Minor changes to the scanning hardware can easily push the frame rate to 100 fps, which will allow peak finger pressure to be measured fairly accurately even for extremely quick taps.

2.2 Tactile Image Formation and Background Removal

While designing a tactile sensor array for robotic fingertips nearly 20 years ago, Danny Hillis [59] realized how much easier touch imaging is than computer vision:

    ... analyzing a tactile image is like analyzing a visual image with controlled background, illumination, and point of view ... the properties that we actually measure are very close, in kind, to the properties that we wish to infer.

Comparing background segmentation techniques in vision-based and tactile hand imaging systems will verify his insight.

2.2.1 Optical Image Segmentation

Ahmad's real-time 3D hand tracker [3] segments the background by matching image patches to known skin color histograms, but to keep up with frame rates (30 frames per second) it must limit the skin search region and adaptively subsample the image. Finger positions are obtained by fitting ellipses to the segmented hand patches.
The total hand patch area weighted with a centered Gaussian roughly indicates the distance between hand and camera. Ahmad also tries to recover finger joint angles, information which data gloves give directly, by finding fingertips and learning an inverse mapping from fingertip and palm position to intermediate joint angle. This feature of the tracker becomes unstable due to fingertip detection failure if the hand is not roughly normal to the camera.

The Digital Desk [154–157] is a system pioneered at Xerox for combining interaction with paper and digital documents. The system contains both a computer screen projector and zoomable cameras mounted high above the user's desk. The cameras both track hands and recognize text from paper documents lying on the desk. Since the vision system cannot determine exactly when fingers actually touch the desk surface, a microphone is placed under the desk to "hear" finger taps and thus emulate mouse clicks.

Crowley and Coutaz [30] consider color, correlation tracking, principal components and active contours for following a pointing object on a digital desk. In the correlation method, a previous image of a fingertip is used as a reference template for correlations with the next image. The new finger position is indicated by the amount of template image shift which minimizes the sum of squared differences between template and image. Again, the computational costs of the correlation limit the template search region and thus the maximum trackable finger speed.

2.2.2 Methods for Proximity Image Formation

Background segmentation of proximity images from electrode arrays is much easier because extraneous objects are not expected to be visible in the background. Paper or plastic left over the electrodes do not register on capacitive proximity sensors, nor do small metal objects unless they are deliberately grounded.
However, spatial non-uniformities in the parasitic capacitances of discrete components and signal lines may cause background measurements at each electrode to differ. Unlike background signals caused by extraneous external objects, such background non-uniformities are not expected to change over time. A local offset calibration or adaptive thresholding scheme can cancel these fixed sensor disparities. Once these sensor offsets are taken into account and electrical noise is filtered, the proximity image can simply be thresholded to identify regions of fleshy contact. Note that single-finger projective touchpads do utilize offset adaptation but do not have to segment the image into fleshy contact regions; they simply compute a global centroid from measurements of all row and column electrodes.

2.2.2.1 Binary Tree Scanning

Lee's binary tree scanning algorithm [88] combines noise filtering and thresholding in hardware by analog grouping and summation of electrode capacitance measurements. The array is recursively subdivided into rectangular electrode groups of decreasing size via bisection starting with the whole array. Thresholds are calibrated during device initialization for each electrode group at each size, or level, in the recursion. During subsequent scanning, subrectangles are scanned only if the parent rectangle's threshold is exceeded. Once the recursion reaches a measurement which passes threshold at the single electrode level, a finger position is computed as the centroid of the recursed electrode capacitance and its eight neighboring electrode capacitances. Advantages of Lee's scheme are: not every electrode in the array need be separately scanned each pass, and grouping of many electrodes at the beginning of the scan tends to average out noise. The disadvantage is that small, light contacts can be lost among the large electrode groups if the large group thresholds are marginally too high.
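The recursive scan described above can be sketched in software. This is a simplified illustration rather than Lee's hardware: the analog group measurement is replaced by a numerical sum over a stored image, a single constant threshold stands in for the per-group calibrated thresholds, and the function names and the sample contact are hypothetical.

```python
import numpy as np

def tree_scan(image, threshold, r0, r1, c0, c1, hits, stats):
    """Recursively scan the half-open electrode rectangle [r0, r1) x [c0, c1)."""
    stats["measurements"] += 1
    group_sum = image[r0:r1, c0:c1].sum()  # stands in for the analog group measurement
    if group_sum <= threshold:
        return  # below threshold, so the sub-rectangles are never scanned
    height, width = r1 - r0, c1 - c0
    if height == 1 and width == 1:
        hits.append((r0, c0))  # a single electrode passed threshold
        return
    if height >= width:  # bisect the longer side
        mid = (r0 + r1) // 2
        tree_scan(image, threshold, r0, mid, c0, c1, hits, stats)
        tree_scan(image, threshold, mid, r1, c0, c1, hits, stats)
    else:
        mid = (c0 + c1) // 2
        tree_scan(image, threshold, r0, r1, c0, mid, hits, stats)
        tree_scan(image, threshold, r0, r1, mid, c1, hits, stats)

def centroid_3x3(image, r, c):
    """Centroid of electrode (r, c) and its (up to) eight neighbours."""
    rows, cols = image.shape
    rs = slice(max(r - 1, 0), min(r + 2, rows))
    cs = slice(max(c - 1, 0), min(c + 2, cols))
    patch = image[rs, cs]
    rr, cc = np.mgrid[rs, cs]
    total = patch.sum()
    return (rr * patch).sum() / total, (cc * patch).sum() / total

# Hypothetical 8 x 8 array with one light contact, slightly right of column 5.
image = np.zeros((8, 8))
image[2, 5] = 1.0
image[1, 5] = image[3, 5] = image[2, 4] = 0.3
image[2, 6] = 0.4

hits, stats = [], {"measurements": 0}
tree_scan(image, 0.5, 0, 8, 0, 8, hits, stats)
positions = [centroid_3x3(image, r, c) for r, c in hits]
```

In this example the scan locates the contact with far fewer group measurements than the 64 needed to read every electrode, which is the advantage attributed to the scheme; raising the constant threshold slightly would cause the whole contact to be pruned at a coarse level, which is the stated disadvantage.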
2.2.2.2 Brute Array Scanning

Both digital and analog processing speeds have increased enough since Lee's prototype was built that the scanning overhead concerns have become negligible, especially in light of the additional finger tracking and gesture recognition algorithms which the MTS must execute. Keep in mind that though the number of discrete components necessary for an electrode array may make it seem large, the number of "pixels" is still small compared to even a low-resolution digital camera image. For this reason, and to ensure even brief, light finger contacts are captured, the MTS employs a brute force electrode scan to form a complete proximity image before applying standard digital filtering techniques.

2.2.2.3 Sensor Offset Adaptation

Sensor offset calibration will fail during device initialization if the user's hands are already on the board. Since there may not be a time when the fingers are known to be absent, the MTS continuously updates each electrode offset with the minimum of readings from that electrode. Suppose $A_{ij}[n]$ is the raw tactile proximity measured from the electrode at row $i$, column $j$ during scan cycle $n$. Then the local offsets $O_{ij}$ can be updated as:

\[ O_{ij}[n] = \min\left(A_{ij}[n],\; O_{ij}[n-1]\right) \tag{2.1} \]

The offset-corrected image $E$ is then:

\[ E_{ij}[n] = A_{ij}[n] - O_{ij}[n] \quad \forall\, i, j : 0 \le i < E_{\text{rows}},\; 0 \le j < E_{\text{columns}} \tag{2.2} \]

Since capacitance measurements always return to baseline when fingers are removed, the offsets will correct themselves by decreasing as soon as fingers are lifted. The danger of this method is that negative electrical noise spikes can cause inadvertent lowering of the offsets. Local offsets which are too low lead to false positive proximity indications, just as offsets which are too high cause finger contacts to be missed.
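A minimal sketch of Equations 2.1 and 2.2 follows, assuming the offsets are initialized to infinity before the first scan (an assumption; the initialization is not specified above). The function names and the sample readings are hypothetical.

```python
import numpy as np

def update_offsets(raw, offsets):
    """Equation 2.1: per-electrode running minimum of the raw readings."""
    return np.minimum(raw, offsets)

def offset_corrected(raw, offsets):
    """Equation 2.2: subtract the local offsets from the raw image."""
    return raw - offsets

# Two electrodes with a true baseline of 10 (arbitrary units).
# Electrode 0 starts with a finger on it (reading 14), the finger lifts,
# and a single negative noise spike (reading 7) occurs later.
frames = np.array([
    [14.0, 10.0],
    [14.0, 10.0],
    [10.0, 10.0],   # finger lifted: the offset self-corrects to the baseline
    [10.0, 10.0],
    [ 7.0, 10.0],   # negative noise spike drags the offset too low
    [10.0, 10.0],
    [10.0, 10.0],
])

offsets = np.full(frames.shape[1], np.inf)  # assumed initial value
corrected = []
for raw in frames:
    offsets = update_offsets(raw, offsets)
    corrected.append(offset_corrected(raw, offsets))
corrected = np.array(corrected)
```

With these readings the offset of electrode 0 recovers from the bad initialization once the finger lifts, but after the spike it stays at 7, so the untouched electrode keeps reporting a proximity of 3. This is the false-positive behaviour that the pure running minimum is said to risk.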
The MTS compromises by decreasing offsets only when at least three low proximities are read consecutively and by allowing very slow recovery, over about a minute, should an offset get lowered too far:

\[ O_{ij}[n] = \min\left(\max\left(A_{ij}[n],\; A_{ij}[n-1],\; A_{ij}[n-2]\right),\; O_{ij}[n-1] + \epsilon\right) \tag{2.3} \]

where the max operation provides immunity to single negative noise spikes and a tiny $\epsilon$ gives a slow recovery rate. Even with a tiny $\epsilon$, hands which are left resting on the board a few minutes will appear to fade. To prevent this, $\epsilon$ is further decreased for those electrodes which the system confidently identifies as underlying a fleshy contact. These offsets quickly adapt to the minimum baseline capacitance so any readings above the offsets can be modeled as the flesh proximity magnitude plus minor Gaussian background noise.

2.2.2.4 Proximity Image Filtering

While Lee [88] electrically averaged the capacitances of entire rectangular groups of electrodes to combat noise before threshold testing, the MTS electrode array is much less noisy than Lee's device. Furthermore, to take full advantage of the electrode array resolution, groups should conform to finger contact shape electrode by electrode rather than be constrained to rectangular groups which poorly fit the oval shape of most hand contacts. Therefore, the MTS only employs slight spatial diffusion of each offset-corrected image to combat electrical noise. Then it applies significance threshold and local maximum tests to each diffused pixel to detect the center of each hand contact, as further described in Chapter 3.

2.3 Topology of Hand Proximity Images

To illustrate typical properties of hand contacts as they appear in proximity images, Figures 2.7–2.10 contain sample images captured by the prototype array of parallelogram-shaped electrodes. Shading of each electrode darkens to indicate heightened proximity signals as flesh gets closer to the surface, compresses against the surface due to hand pressure, and overlaps the parallelogram more completely.
Notice that the proximity images are totally uncluttered by background objects, unlike optical images: only conductive objects within a couple millimeters of the surface show up at all. Background sensor offsets have already been removed from each image, and background electrical noise levels are so low as to not be visible with the given grayscale intensity map. Certain applications such as handwriting recognition will clearly require finer electrode arrays than indicated by the electrode size in these sample images.

In the discussion that follows, the proximity data measured at one electrode during a particular scan cycle constitutes one "pixel" of the proximity image captured in that scan cycle. In this section and the rest of this dissertation, the term "proximity" will only be used in reference to the distance or pressure between a hand part and the surface, not in reference to the distance between adjacent fingers. "Horizontal" and "vertical" refer to x and y directional axes within the surface plane. Proximity measurements are then interpreted as pressure in a z axis normal to the surface. The direction "inner" means toward the thumb of a given hand, and the direction "outer" means towards the pinky finger of a given hand. For the purposes of this description, the thumb is considered a finger unless otherwise noted, but it does not count as a fingertip. "Contact" is used as a general term for a hand part when it touches the surface and appears in the current proximity image, and for the group and path data structures which will represent it in Chapter 3.

2.3.1 Flattened Hand Image Properties

Figure 2.7 shows a right hand flattened against the surface with fingers outstretched. This flattened hand image includes all of the hand parts which can touch the surface from the bottom of one hand, but in many instances only a few of these parts will be touching the surface, and the fingertips may roam widely in relation to the palms as fingers are flexed and extended.
At the far left is the oblong thumb which tends to slant at about 120°. The columnar blobs arranged in an arc across the top of the image are the index finger, middle finger, ring finger and pinky finger. Since the fingers are fully extended, the creases at finger joints cause slight undulations in proximity along each column, though smearing by the parallelogram electrodes obscures this effect somewhat. Flesh from the proximal finger joints, or proximal phalanges, appears as the particularly intense undulations at the bottom of the index, middle, and ring finger columns. Since the fingers are fully flattened, flesh from the forepalm calluses is also visible as small clusters below the proximal phalanges, near the vertical level of the thumb. The inner and outer palm heels cause the pair of very large contacts across the bottom of the image. These palm heels tend to be quite large, mildly oblong, and oriented diagonally. Unless the center of the palm is intentionally pushed against the surface, a large crease or proximity valley clearly separates the inner and outer palm heels.

Even though image resolution is fairly low, it is clear that the fleshy contacts from different parts of the hand have subtly contrasting geometric properties. All the hand contacts are roughly oval-shaped, but they differ in pressure, size, orientation, eccentricity and spacing relative to one another.

Figure 2.7: Offset-corrected proximity image of right hand flattened onto the surface with fingers outstretched and all hand parts labeled (thumb; index, middle, ring and pinky fingertips; proximal phalanges; forepalms; inner and outer palm heels). (Axes: horizontal position on surface, X axis, cm; vertical position on surface, Y axis, cm.)
2.3.2 Properties of Hands in the Neutral Posture

Figure 2.8 shows a proximity image for all fingers and palms of both hands resting in what will be known hereafter as their default positions. Since these positions correspond to the most neutral hand and finger postures, with wrist straight and fingers curled so fingernails are normal to the surface, gestures are most likely to start from this hand configuration. Note that since fingers are curled, the proximal phalanges and forepalms are far above the surface and not visible. Because the fingers are slightly spread in this neutral posture, all fleshy contacts are clearly separated by at least one electrode at the background or zero proximity level. Since only the tips rather than the lengths of the fingers are visible, the fingers appear much shorter than in Figure 2.7, and would appear circular if not for vertical smearing by the parallelogram electrodes. However, the finger widths remain fairly constant regardless of contact elongation. Also, the electrodes at the center of each fingertip do not appear as dark as the central thumb and palm heel electrodes because, in this case, the fingertip contacts are not tall enough to fully overlap any of the parallelograms, limiting the proximity signal regardless of their distance from the surface. The palm heels appear somewhat shorter than in Figure 2.7 since only the rear of the palm can touch the surface when fingers are flexed, but the separation between the palm heels is unchanged.

Figure 2.8: Proximity image of both hands resting on the surface in their respective neutral or default postures. (Axes: horizontal position on surface, X axis, cm; vertical position on surface, Y axis, cm.)

The fact that the intermediate finger joints connecting fingertips to palms, i.e., the lengths of the fingers, do not appear in this commonly occurring proximity image has further consequences.
While such lack of intermediate hand structure simplifies determination of the fingertip centroid, it is also the main shortcoming of capacitive proximity sensing in terms of hand gesture recognition. Reliably establishing finger or even hand identity when intervening hand structure is missing from the proximity images poses the most challenging problem of the work described in this dissertation. This challenge is the subject of Chapter 4.

2.3.3 Partially Closed Hand Image Properties

For a tracking system to support a wide range of hand gestures, it must tolerate contact shapes and juxtapositions which vary from the default. The two extremes to be considered in this work are the previously discussed flattened hand and the partially closed hand shown in Figure 2.9. Here the thumb is pushed directly behind the index finger, but vertical smearing by the wedge electrodes may cause thumb and index finger to appear as a single inseparable contact. Unlike the default hand posture in Figure 2.8, adjacent fingertips are so close together as to be distinguishable only by slight proximity valleys or saddle points between them. At the given horizontal electrode spacing, the saddle points between adjacent fingertips may be only a single column wide. Any segmentation algorithm must use the partial minima in the horizontal direction to distinguish these fingertips. In case the fingertip row is rotated, partial minima in diagonal directions must also be detected. This conflicts with the segmentation needs of palms, which may contain spurious partial minima due to minor variations in sensor gain or flesh proximity across their large areas. All partial minima within palm contacts should be ignored except the large crease between the palm heels.

Figure 2.9: Proximity image of a partially closed hand with fingertips squished together. (Axes: horizontal position on surface, X axis, cm; vertical position on surface, Y axis, cm.)
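The notion of a partial minimum in the horizontal direction can be illustrated with a short sketch. It is not the segmentation method of Chapter 3; it only assumes that a pixel counts as a horizontal partial minimum when it is lower than both its left and right neighbours and lies above a background level. The function name, the `floor` parameter and the sample image are hypothetical.

```python
import numpy as np

def horizontal_partial_minima(image, floor=0.0):
    """Mark pixels lower than both horizontal neighbours and above `floor`."""
    image = np.asarray(image, dtype=float)
    left, centre, right = image[:, :-2], image[:, 1:-1], image[:, 2:]
    mask = np.zeros(image.shape, dtype=bool)
    mask[:, 1:-1] = (centre < left) & (centre < right) & (centre > floor)
    return mask

def is_full_local_minimum(image, r, c):
    """True if pixel (r, c) is no greater than any pixel in its 3 x 3 neighbourhood."""
    image = np.asarray(image, dtype=float)
    patch = image[max(r - 1, 0):r + 2, max(c - 1, 0):c + 2]
    return bool(image[r, c] <= patch.min())

# Two vertically smeared fingertips, one column apart (hypothetical values).
touching = np.array([
    [0, 2, 3, 2, 3, 2, 0],
    [0, 5, 8, 6, 9, 5, 0],
    [0, 2, 3, 2, 3, 2, 0],
])

valley = horizontal_partial_minima(touching)
valley_pixels = np.argwhere(valley)
```

Here the single-column valley between the two fingertips is flagged in every row, even though its centre pixel is not a minimum with respect to its vertical neighbours. That is why a test restricted to one direction is needed to split such contacts, and why the same test would also fire on small ripples inside a palm.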
2.3.4 Pen Grip Image Properties

Figure 2.10 is a proximity image of a right hand in a pen grip configuration, which is particularly comfortable and dexterous for handwriting or freehand drawing. The thumb and index fingertip are pinched together as if they were holding a pen, but in this case they are touching the surface instead. Actually the thumb and index finger appear the same here as in Figure 2.9. However, the middle, ring, and pinky fingers are curled under as if making a fist, so the knuckles from the top of the fingers actually touch the surface instead of the finger tips. The curling under of the knuckles actually places them behind the pinched thumb and index fingertip, very close to the palm heels. The knuckles also appear larger than the curled fingertips of Figure 2.9 but the same size as the flattened fingertips in Figure 2.7. These differences in size and arrangement are sufficient to distinguish the pen grip configuration from the closed and flattened hand configurations. Though the contact segmentation and identification methods presented in this dissertation extend to the pen grip configuration with minimal modification, a higher resolution sensor array without vertically smearing parallelogram electrodes is needed to accurately discern the pinched fingers.

Figure 2.10: Proximity image of a hand with inner fingers pinched and outer fingers curled under towards the palm heels as if gripping a pen. (Axes: horizontal position on surface, X axis, cm; vertical position on surface, Y axis, cm.)

2.3.5 Comfortable Ranges of Hand Motion

Given that the MTS prototype has the form factor of a standard computer keyboard and is similarly placed on a desk, lap or workbench to operate from a sitting or standing posture, the ranges of hand position and rotation expected during normal operation are fairly limited.
When only one hand is on the surface, its maximum inward rotation can occur when it crosses to the opposite side of the surface, as shown in Figure 2.11. This situation maximizes the inward rotation of both the forearm about the elbow and the hand about the wrist. The maximum clockwise or outward rotation occurs from the default hand position with forearm parallel to the vertical surface axis, as shown for the right hand in Figure 2.12. Further rotations are only possible through contortions of the whole body or if the operator's torso is not facing the apparatus.

Figure 2.11: Proximity image of right hand at far left of sensing surface and rotated counter-clockwise to its biomechanical limit. (Axes: horizontal position on surface, X axis, cm; vertical position on surface, Y axis, cm.)

Figure 2.12: Proximity image of right hand at far right of sensing surface and rotated outward to its biomechanical limit. (Axes as in Figure 2.11.)

When both hands are on the surface, hand position is even further limited by the fact that operators are not expected to let the hands cross over or overlap one another. Figure 2.13 shows the maximum leftward position of the right hand when the left hand is in its default position.

Figure 2.13: Proximity image of left hand in default position and right hand up against it. (Axes as in Figure 2.11.)

For some operations only part of a hand may remain in the active sensing area, as shown for the row of right hand fingertips at the bottom middle of the surface in Figure 2.14.
Though it is hard to imagine how this would be useful, the fingertips can also lie over the top of the active sensing area as in Figure 2.15, so only the thumb and palms remain visible.

Figure 2.14: Proximity image of left hand in default position and right hand moved down so only fingertips remain in active sensing area. (Axes: horizontal position on surface, X axis, cm; vertical position on surface, Y axis, cm.)

Figure 2.15: Proximity image of left hand in default position and right hand moved up so only thumb and palms remain in active sensing area. (Axes as in Figure 2.14.)

2.4 Conclusion

Capacitance-based proximity sensing has many advantages over other hand motion sensing techniques. These advantages include precise detection of flesh contact with a surface, zero-force activation, avoidance of mechanical encumbrances, prevention of fingertip occlusion, and absence of background scene clutter. An array of a few thousand electrodes is sufficient to detect and uniquely determine the positions of any number of contacts from the undersides of both hands. Though each electrode has a constant sensor offset which must be removed, a large MTS can have signal-to-noise ratios as high as its tiny touchpad cousins.

The MTS offers a previously unexplored compromise between the rich tactile and force feedback of a mechanical keyboard or joystick and the feedback void of free space hand gestures. The proximity signals measured by the MTS correspond almost exactly to the operator's own sensations of engaging and sliding the hand across the surface. Even though hand proximity images contain ambiguities due to the lack of sharp edges between flesh contacts and the absence of intervening hand structure, the results of Chapters 3 and 4 will show that these ambiguities are surmountable.
Ultimately such a unique, close correspondence between the sensations of the operator and the proximity imaging system can support much faster and more accurate gesture recognition than video-based systems.
