Learning new pieces of vocal music is a central part of singing. It is a complex task that draws on several functions and modalities, including memory, language and motor skills, and auditory and visual perception. Memory functions are well studied, but it is unknown how memory theories apply to a multimodal activity such as singing. This study attempts to translate those theories to the applied field of singing by investigating the effectiveness of three formats for learning new song lyrics: auditory learning with image support (AI), auditory learning with text support (AT), and auditory learning only (A). Ninety-five participants were randomly assigned to one of the three experimental conditions. A univariate analysis of variance revealed a significant effect of condition on the lyric recall score, and post-hoc tests showed that participants performed significantly better in the AI condition than in either the AT or the A condition. No significant difference was found between AT and A. These findings shed light on how memory processes might work in learning song lyrics. Practical implications for practitioners such as music educators, conductors, and choir singers are discussed.