Massive open online courses (MOOCs) have introduced a new form of distance and blended education in recent years, bringing many opportunities to the educational sector. The availability of learning content to people across demographics and locations has opened up a plethora of possibilities for anyone to gain new knowledge through MOOCs. This, however, poses an immense challenge to content providers: with millions of video lectures produced daily, the manual effort required to properly structure and organize the content becomes prohibitive, motivating automatic approaches. This paper addresses this issue, as one component of our proposed personalized content management system, by exploiting the voice pattern of the lecturer to identify the speaker and classify video lectures into the correct speaker category. The use of Mel-frequency cepstral coefficients (MFCCs) as 2D input feature maps to a 2D convolutional neural network (2D-CNN) has shown promising results compared with other machine learning and deep learning classifiers, making text-independent speaker identification plausible in a MOOC setting for automatic video lecture categorization. The approach not only helps categorize educational videos efficiently for easy search and retrieval but also promotes effective utilization of micro-lectures and multimedia video learning objects (MLOs).
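To illustrate the kind of pipeline described here, the following is a minimal sketch of extracting an MFCC feature map from a lecture audio clip and passing it to a small 2D-CNN speaker classifier. The file name, sampling rate, number of MFCC coefficients, clip length, network depth, and number of speaker classes are illustrative assumptions, not the configuration reported in the paper.

```python
# Minimal sketch (assumes librosa and PyTorch are installed; 40 MFCCs,
# fixed 3 s clips at 16 kHz, and a hypothetical 10-speaker label set).
import librosa
import torch
import torch.nn as nn

def mfcc_feature_map(wav_path, sr=16000, n_mfcc=40, duration=3.0):
    """Load a lecture clip and return an MFCC 'image' shaped (1, n_mfcc, frames)."""
    y, _ = librosa.load(wav_path, sr=sr, duration=duration)
    y = librosa.util.fix_length(y, size=int(sr * duration))   # pad/trim to fixed length
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)    # (n_mfcc, frames)
    return torch.tensor(mfcc, dtype=torch.float32).unsqueeze(0)  # add channel dimension

class SpeakerCNN(nn.Module):
    """Small 2D-CNN over the MFCC map (channels=1, height=n_mfcc, width=frames)."""
    def __init__(self, n_speakers=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d((4, 4)),   # fixed output size regardless of clip length
        )
        self.classifier = nn.Linear(32 * 4 * 4, n_speakers)

    def forward(self, x):
        x = self.features(x)
        return self.classifier(x.flatten(1))

# Usage: assign one clip to a speaker category (weights here are untrained).
model = SpeakerCNN(n_speakers=10)
features = mfcc_feature_map("lecture_clip.wav").unsqueeze(0)  # (batch, 1, n_mfcc, frames)
logits = model(features)
predicted_speaker = logits.argmax(dim=1).item()
```

In such a setup the predicted speaker label would serve as the category under which the video lecture is indexed for search and retrieval.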