This paper presents a novel approach for performing intuitive gesture-based interaction using depth data acquired by Kinect. The main challenge in enabling immersive gestural interaction is dynamic gesture recognition. This problem can be formulated as a combination of two tasks: gesture recognition and gesture pose estimation. Incorporating a fast and robust pose estimation method would lessen this burden to a great extent. In this paper, we propose a direct method for real-time hand pose estimation. Based on the range images, a new version of the optical flow constraint equation is derived, which can be used to directly estimate 3D hand motion without imposing any additional constraints. Extensive experiments illustrate that the proposed approach runs in real time with high accuracy. As a proof of concept, we demonstrate the system's performance in 3D object manipulation on two different setups: desktop computing and a mobile platform. This reveals the system's capability to accommodate different interaction procedures. In addition, a user study is conducted to evaluate learnability, user experience, and interaction quality in 3D gestural interaction in comparison to 2D touch-screen interaction.