In this paper we discuss short-term traffic congestion prediction, more specifically, the prediction of the sudden speed drop that occurs when traffic is at the critical density point. We approach this problem using standard machine learning techniques that combine information from multiple sensors measuring density and average velocity. The model used for prediction is learned offline. Our goal is to implement (and possibly update) the predictive model in a multi-agent system in which each sensor is coupled with an agent that monitors the traffic condition, starts collecting data from nearby sensors when necessary, and predicts local sudden speed drops so that drivers can be warned ahead of time. We evaluate Gaussian processes, support vector machines and decision trees not only on predictive accuracy but also on the suitability of the learned model for the setup described above, i.e., keeping in mind that the warning system should be decentralized, scalable and robust.
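
To make the per-sensor agent concrete, the following is a minimal sketch of one possible reading of the setup, not the implementation evaluated in the paper. The class `SensorAgent`, the trigger margin `watch_fraction`, the hand-written rule `toy_tree` (a stand-in for a model learned offline) and all numeric values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

# One sensor reading: (density, average speed). Units are illustrative only.
Reading = Tuple[float, float]


@dataclass
class SensorAgent:
    """Hypothetical agent coupled with a single sensor."""
    name: str
    critical_density: float                 # assumed to be known per location
    model: Callable[[List[float]], bool]    # stand-in for the offline-learned model
    neighbours: List["SensorAgent"] = field(
        default_factory=list, repr=False, compare=False
    )
    watch_fraction: float = 0.8             # hypothetical trigger margin
    last_reading: Optional[Reading] = None

    def observe(self, density: float, speed: float) -> bool:
        """Store the local reading; return True if a speed drop is predicted."""
        self.last_reading = (density, speed)
        # Far below the critical density: no need to contact other sensors.
        if density < self.watch_fraction * self.critical_density:
            return False
        # Close to the critical density: collect readings from nearby sensors.
        features = [density, speed]
        for neighbour in self.neighbours:
            # A silent neighbour is skipped, so the agent keeps working.
            if neighbour.last_reading is not None:
                features.extend(neighbour.last_reading)
        return self.model(features)


def toy_tree(features: List[float]) -> bool:
    """Made-up decision rule; a real model would be learned from data."""
    densities = features[0::2]
    speeds = features[1::2]
    return max(densities) > 30.0 and min(speeds) < 70.0


upstream = SensorAgent("upstream", critical_density=28.0, model=toy_tree)
local = SensorAgent("local", critical_density=28.0, model=toy_tree,
                    neighbours=[upstream])

free_flow = local.observe(15.0, 105.0)   # well below the critical density
upstream.observe(33.0, 65.0)             # upstream sensor reports dense, slow traffic
warning = local.observe(26.0, 90.0)      # local density nears the critical point
```

In this sketch the local agent only contacts its neighbours once its own density approaches the critical point, which keeps communication local, and it tolerates neighbours that have not reported, which is one simple way to reflect the decentralization and robustness concerns mentioned above.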