3D Gesture Recognition and Tracking for Next Generation of Smart Devices: Theories, Concepts, and Implementations
Yousefi, Shahrouz (KTH, Media Technology and Interaction Design, MID). ORCID iD: 0000-0003-2203-5805
2014 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

The rapid development of mobile devices during the past decade has been driven largely by interaction and visualization technologies. Although touchscreens have significantly enhanced interaction technology, it is foreseeable that future mobile devices, e.g., augmented-reality glasses and smart watches, will demand more intuitive inputs such as free-hand interaction in 3D space. Specifically, 3D hand/body gestures will be essential for manipulating digital content in augmented environments. Therefore, 3D gesture recognition and tracking are highly desired features for interaction design in future smart environments. Due to the complexity of hand/body motions and the limited computational resources of mobile devices, 3D gesture analysis remains an extremely difficult problem to solve.

This thesis aims to introduce new concepts, theories and technologies for natural and intuitive interaction in future augmented environments. The contributions of this thesis support the concept of bare-hand 3D gestural interaction and interactive visualization on future smart devices. The introduced technical solutions enable effective interaction in the 3D space around the smart device. High-accuracy, robust 3D motion analysis of hand/body gestures is performed to facilitate 3D interaction in various application scenarios. The proposed technologies enable users to control, manipulate, and organize digital content in 3D space.

Place, publisher, year, edition, pages
Stockholm: KTH Royal Institute of Technology, 2014, p. xii, 101
Series
TRITA-CSC-A, ISSN 1653-5723 ; 14:02
Keywords [en]
3D gestural interaction, gesture recognition, gesture tracking, 3D visualization, 3D motion analysis, augmented environments
National subject category
Media Technology
Research subject
Media Technology
Identifiers
URN: urn:nbn:se:lnu:diva-40974, ISBN: 978-91-7595-031-0 (print), OAI: oai:DiVA.org:lnu-40974, DiVA id: diva2:796232
Public defence
2014-03-17, F3, Lindstedtsvägen 26, KTH, 13:15 (English)
Opponent
Supervisors
Note

QC 20140226

Available from: 2014-02-26. Created: 2015-03-18. Last updated: 2018-01-11. Bibliographically approved.
List of papers
1. Experiencing real 3D gestural interaction with mobile devices
2013 (English). In: Pattern Recognition Letters, ISSN 0167-8655, E-ISSN 1872-7344, vol. 34, no. 8, p. 912-921. Article in journal (Refereed), Published
Abstract [en]

The number of mobile devices such as smartphones and tablet PCs has increased dramatically in recent years. New mobile devices are equipped with integrated cameras and large displays, which make interaction with the device more efficient. Although most previous work on interaction between humans and mobile devices is based on 2D touch-screen displays, camera-based interaction opens a new way to manipulate content in the 3D space behind the device, in the camera's field of view. In this paper, our gestural interaction relies heavily on particular patterns in the local orientation of the image called rotational symmetries. The approach is based on finding, from a large set of rotational symmetries of different orders, the most suitable pattern that yields a reliable hand-gesture detector. Consequently, gesture detection and tracking can be employed as an efficient tool for 3D manipulation in various computer vision and augmented reality applications. The final output is rendered into color anaglyphs for 3D visualization; depending on the coding technology, different low-cost 3D glasses can be used by viewers. (C) 2013 Elsevier B.V. All rights reserved.
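The abstract does not spell out the detector itself; the snippet below is a minimal sketch of the general rotational-symmetry idea, assuming the classical double-angle orientation representation correlated with an n-th order symmetry basis. The function name, gradient operator, filter radius, and symmetry order are illustrative choices, not the authors' implementation.

```python
# Minimal sketch (assumptions noted above): rotational-symmetry response from
# local image orientation. Strong responses indicate pattern candidates.
import numpy as np
from scipy.ndimage import convolve

def rotational_symmetry_response(gray, order=2, radius=8):
    """Magnitude of the n-th order rotational-symmetry response at each pixel."""
    gy, gx = np.gradient(gray.astype(float))
    # Double-angle representation of local orientation.
    z = (gx + 1j * gy) ** 2
    # n-th order symmetry basis exp(i * n * phi) on a disc-shaped window.
    y, x = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    phi = np.arctan2(y, x)
    mask = (x ** 2 + y ** 2) <= radius ** 2
    basis = np.where(mask, np.exp(1j * order * phi), 0)
    # Complex correlation of the orientation image with the basis,
    # done with two real convolutions per component.
    real = convolve(z.real, basis.real) - convolve(z.imag, basis.imag)
    imag = convolve(z.real, basis.imag) + convolve(z.imag, basis.real)
    return np.abs(real + 1j * imag)
```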

Keywords
3D mobile interaction, Rotational symmetries, Gesture detection, SIFT, Gesture tracking, stereoscopic visualization
National subject category
Interaction Technologies
Research subject
Computer and Information Sciences, Media Technology
Identifiers
urn:nbn:se:lnu:diva-40988 (URN), 10.1016/j.patrec.2013.02.004 (DOI), 000318129200010 ()
Available from: 2013-06-05. Created: 2015-03-18. Last updated: 2017-12-04. Bibliographically approved.
2. 3D photo browsing for future mobile devices
2012 (English). In: MM 2012 - Proceedings of the 20th ACM International Conference on Multimedia, ACM Press, 2012, p. 1401-1404. Conference paper, Published paper (Refereed)
Abstract [en]

By introducing an interactive 3D photo/video browsing and exploration system, we propose novel approaches for handling the limitations of current 2D mobile technology from two aspects: interaction design and visualization. Our contributions feature an effective interaction that takes place in the 3D space behind the mobile device's camera. 3D motion analysis of the user's gestures, captured by the device's camera, is performed to facilitate interaction between users and multimedia collections in various applications. This approach addresses a wide range of problems with current input facilities such as miniature keyboards, tiny joysticks and 2D touch screens. The suggested interactive technology enables users to control, manipulate, organize, and rearrange their photo/video collections in 3D space using bare-hand, marker-less gestures. Moreover, with the proposed techniques we aim to visualize 2D photo collections in 3D on normal 2D displays. This is done automatically by retrieving the 3D structure from single images, finding stereo/multiple views of a scene, or using geo-tagged metadata from large photo collections. Through the design and implementation of the contributions of this work, we aim to achieve the following goals: solving the limitations of current 2D interaction facilities through 3D gestural interaction; increasing the usability of multimedia applications on mobile devices; and enhancing the quality of user experience with digital collections.
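As a rough illustration of the interaction idea (not the paper's implementation), the sketch below assumes a hand position already tracked in the 3D space behind the camera and maps it to panning and zooming of a photo grid; the GridView class, gains, and mapping are hypothetical.

```python
# Minimal sketch (hypothetical names and gains): mapping a tracked 3D hand
# position to navigation of a photo-collection grid.
from dataclasses import dataclass

@dataclass
class GridView:
    pan_x: float = 0.0   # horizontal offset in the photo grid
    pan_y: float = 0.0   # vertical offset
    zoom: float = 1.0    # scale factor for thumbnails

def update_view(view: GridView, hand_xyz, gain=(0.01, 0.01, 0.005)) -> GridView:
    """Map hand displacement (x, y) to panning and depth (z) to zoom."""
    x, y, z = hand_xyz
    view.pan_x += gain[0] * x
    view.pan_y += gain[1] * y
    view.zoom = max(0.1, view.zoom + gain[2] * z)  # clamp to avoid inversion
    return view
```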

Place, publisher, year, edition, pages
ACM Press, 2012
Keywords
3D gestural interaction, 3D visualization, motion analysis, photo browsing, quality of experience
National subject category
Human Computer Interaction (Interaction Design)
Research subject
Computer and Information Sciences, Media Technology
Identifiers
urn:nbn:se:lnu:diva-40978 (URN), 10.1145/2393347.2396503 (DOI), 978-1-4503-1089-5 (ISBN)
Conference
20th ACM International Conference on Multimedia, MM 2012, 29 October 2012 through 2 November 2012, Nara
Available from: 2013-01-02. Created: 2015-03-18. Last updated: 2018-01-11. Bibliographically approved.
3. Bare-hand Gesture Recognition and Tracking through the Large-scale Image Retrieval
2014 (English). Conference paper, Published paper (Refereed)
Place, publisher, year, edition, pages
SciTePress, 2014
National subject category
Signal Processing
Research subject
Computer and Information Sciences, Media Technology
Identifiers
urn:nbn:se:lnu:diva-40983 (URN)
Conference
9th International Conference on Computer Vision Theory and Applications (VISAPP)
Note

NQC 2014

Available from: 2014-02-25. Created: 2015-03-18. Last updated: 2017-04-19. Bibliographically approved.
4. Interactive 3D Visualization on a 4K Wall-Sized Display
2014 (English). In: Asia-Pacific Signal and Information Processing Association, 2014 Annual Summit and Conference (APSIPA), 2014, p. 1-4. Conference paper, Published paper (Refereed)
Keywords
computer vision, data visualisation, human computer interaction, image capture, motion measurement, object tracking, screens (display), three-dimensional displays, video cameras, video signal processing, 2D screen, 3D motion parameter retrieval, 3D space, 4K wall-sized display, digital window, head-mounted camera, interactive 3D display, interactive 3D visualization, motion capture system, real-time 3D interaction, user head motion measurement, user head motion tracking, video frame capture, vision-based approach, cameras, head, tracking, transmission line matrix methods, visualization
National subject category
Signal Processing
Research subject
Computer and Information Sciences, Media Technology
Identifiers
urn:nbn:se:lnu:diva-40991 (URN), 10.1109/APSIPA.2014.7041653 (DOI)
Conference
Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA 2014)
Note

NQC 2014

Available from: 2014-02-25. Created: 2015-03-18. Last updated: 2017-04-19. Bibliographically approved.
5. 3D Visualization of Single Images using Patch Level Depth
2011 (English). In: Signal Processing and Multimedia Applications (SIGMAP), 2011 Proceedings of the International Conference on, IEEE Press, 2011, p. 61-66. Conference paper, Published paper (Refereed)
Abstract [en]

In this paper we consider the task of 3D photo visualization using a single monocular image. The main idea is to take single photos from capture devices such as ordinary cameras, mobile phones, tablet PCs, etc. and visualize them in 3D on normal displays. A supervised learning approach is employed to retrieve depth information from single images. The algorithm is based on a hierarchical multi-scale Markov Random Field (MRF), which models depth from multi-scale global and local features, and the relations between them, in a monocular image. The estimated depth image is then used to assign depth parameters to each pixel in the 3D map. Accordingly, multi-level depth adjustment and coding into color anaglyphs is performed. Our system receives a single 2D image as input and produces an anaglyph-coded 3D image as output. Depending on the coding technology, suitable low-cost anaglyph glasses are used by viewers.
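The MRF depth-estimation step is beyond a short example, but the final anaglyph-coding stage described above can be sketched as follows, assuming a per-pixel depth map is already available. The disparity model, constants, and function name are illustrative, not the paper's exact pipeline.

```python
# Minimal sketch (assumptions noted above): turn an estimated depth map into a
# red/cyan anaglyph by disparity-shifting a second virtual view.
import numpy as np

def depth_to_anaglyph(rgb, depth, max_disparity=12):
    """rgb: HxWx3 uint8; depth: HxW in [0, 1], 0 = near. Returns a red/cyan anaglyph."""
    h, w, _ = rgb.shape
    # Nearer pixels get a larger horizontal shift between the two virtual views.
    disparity = ((1.0 - depth) * max_disparity).astype(int)
    right = np.zeros_like(rgb)
    cols = np.arange(w)
    for r in range(h):
        shifted = np.clip(cols + disparity[r], 0, w - 1)
        right[r, shifted] = rgb[r, cols]   # forward-warp the row; occlusion holes stay black
    anaglyph = right.copy()
    anaglyph[..., 0] = rgb[..., 0]         # red channel from the left (original) view
    return anaglyph                        # green/blue channels come from the right view
```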

Place, publisher, year, edition, pages
IEEE Press, 2011
Keywords
Cameras, Glass, Image color analysis, Stereo image processing, Three-dimensional displays, Vectors, Visualization, 3D Visualization, Color Anaglyph, Depth Map, MRF, Monocular Image
National subject category
Signal Processing
Research subject
Computer and Information Sciences, Media Technology
Identifiers
urn:nbn:se:lnu:diva-40980 (URN)
Conference
International Conference on Signal Processing and Multimedia Applications, 18-21 July, 2011, Seville, Spain
Note

QC 20140226

Available from: 2014-02-25. Created: 2015-03-18. Last updated: 2017-04-19. Bibliographically approved.
6. Stereoscopic visualization of monocular images in photo collections
2011 (English). In: Wireless Communications and Signal Processing (WCSP), 2011 International Conference on, IEEE Press, 2011, p. 1-5. Conference paper, Published paper (Refereed)
Abstract [en]

In this paper we propose a novel approach for 3D video/photo visualization using an ordinary digital camera. The idea is to turn any 2D camera into a 3D one based on data derived from a collection of captured photos or a recorded video. For a given monocular input, information retrieved from the overlapping photos provides what is needed to produce 3D output. Robust feature detection and matching between images is employed to find the transformation between overlapping frames. The transformation matrix maps the images to the same horizontal baseline; the projected images are then adjusted to the stereoscopic model. Finally, the stereo views are coded into 3D channels for visualization. This approach enables us to produce 3D output from randomly taken photos of a scene or from a recorded video. Our system receives 2D monocular input and produces double-layer-coded 3D output. Depending on the coding technology, different low-cost 3D glasses are used by viewers.
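As a minimal sketch of the pipeline described above (not the authors' code), the example below aligns two overlapping photos with feature matching and a RANSAC homography, then codes the pair as a red/cyan anaglyph. ORB is used as a stand-in feature detector, and the function name and thresholds are illustrative.

```python
# Minimal sketch (assumptions noted above): feature matching, alignment to a
# common frame, and red/cyan coding of the resulting stereo pair.
import cv2
import numpy as np

def stereo_anaglyph(left_bgr, right_bgr):
    g1 = cv2.cvtColor(left_bgr, cv2.COLOR_BGR2GRAY)
    g2 = cv2.cvtColor(right_bgr, cv2.COLOR_BGR2GRAY)
    orb = cv2.ORB_create(2000)
    k1, d1 = orb.detectAndCompute(g1, None)
    k2, d2 = orb.detectAndCompute(g2, None)
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(d1, d2)
    src = np.float32([k2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([k1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    # Homography maps the second photo onto the first, i.e. a common baseline.
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    h, w = g1.shape
    right_aligned = cv2.warpPerspective(right_bgr, H, (w, h))
    # Red channel from the left view, green/blue from the aligned right view (BGR order).
    anaglyph = right_aligned.copy()
    anaglyph[..., 2] = left_bgr[..., 2]
    return anaglyph
```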

Place, publisher, year, edition, pages
IEEE Press, 2011
Keywords
cameras, feature extraction, image matching, matrix algebra, stereo image processing, video coding, video retrieval, 3D channel, 3D glasses, 3D video-photo visualization, coding technology, digital camera, feature detection, information retrieval, monocular images, overlapping frames, overlapping photos, photo collections, stereoscopic visualization, transformation matrix, Image color analysis, Robustness, Three dimensional displays, Visualization
National subject category
Signal Processing
Research subject
Computer and Information Sciences, Media Technology
Identifiers
urn:nbn:se:lnu:diva-40995 (URN), 10.1109/WCSP.2011.6096688 (DOI), 2-s2.0-84555194972 (Scopus ID), 978-1-4577-1008-7 (ISBN)
Conference
WCSP 2011, 9-11 Nov 2011, Nanjing
Note

QC 20140226

Available from: 2014-02-25. Created: 2015-03-18. Last updated: 2017-04-19. Bibliographically approved.
7. Robust correction of 3D geo-metadata in photo collections by forming a photo grid
2011 (English). In: WCSP2011: IEEE International Conference on Wireless Communications and Signal Processing, IEEE Press, 2011, p. 1-5. Conference paper, Published paper (Refereed)
Abstract [en]

In this work, we present a technique for efficient and robust estimation of the exact location and orientation of a photo capture device in a large data set. The data set includes a set of photos and the associated information from GPS and orientation sensors. This attached metadata is noisy and lacks precision. Our strategy for correcting this uncertain data is based on fusing a measurement model, derived from the sensor data, with a signal model given by computer vision algorithms. Based on the information retrieved from multiple views of a scene, we form a grid of images. Robust feature detection and matching between images yields a reliable transformation, so the relative locations and orientations across the data set constitute the signal model. Information extracted from the single images, combined with the measurement data, forms the measurement model. Finally, a Kalman filter is used to fuse these two models iteratively and improve the estimate of the ground-truth (GT) location and orientation. In practice, this approach helps us design a photo browsing system over a huge collection of photos, enabling 3D navigation and exploration of the data set.
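The abstract does not give the exact state and noise models; the following is a minimal sketch of the fusion idea, assuming a 2D position state, vision-derived relative displacements driving the prediction step, and GPS fixes as measurements. All names and noise values are illustrative, not the paper's.

```python
# Minimal sketch (assumptions noted above): linear Kalman filter fusing
# vision-derived displacements (prediction) with noisy GPS fixes (update).
import numpy as np

def kalman_fuse(positions_gps, displacements_vision, gps_var=25.0, vis_var=1.0):
    """positions_gps, displacements_vision: lists of 2D arrays. Returns fused track."""
    x = np.array(positions_gps[0], dtype=float)   # state: 2D position
    P = np.eye(2) * gps_var                       # state covariance
    R = np.eye(2) * gps_var                       # GPS measurement noise
    Q = np.eye(2) * vis_var                       # vision (process) noise
    track = [x.copy()]
    for z, d in zip(positions_gps[1:], displacements_vision):
        # Predict: move by the vision-estimated displacement.
        x = x + np.asarray(d, dtype=float)
        P = P + Q
        # Update: correct with the GPS measurement (observation matrix H = I).
        K = P @ np.linalg.inv(P + R)              # Kalman gain
        x = x + K @ (np.asarray(z, dtype=float) - x)
        P = (np.eye(2) - K) @ P
        track.append(x.copy())
    return track
```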

Place, publisher, year, edition, pages
IEEE Press, 2011
National subject category
Media Technology
Research subject
Computer and Information Sciences, Media Technology
Identifiers
urn:nbn:se:lnu:diva-40994 (URN), 10.1109/WCSP.2011.6096689 (DOI), 978-1-4577-1008-7 (ISBN)
Conference
IEEE International Conference on Wireless Communications and Signal Processing (WCSP2011), Nanjing, China, 9-11 November 2011
Available from: 2012-03-02. Created: 2015-03-18. Last updated: 2017-04-19. Bibliographically approved.

Open Access in DiVA

Full text is not available in DiVA

Authority records

Yousefi, Shahrouz
