Detecting design pattern occurrences is part of several solutions to software engineering problems, and high detection accuracy is important for solving those problems. Improving the accuracy of design pattern occurrence detection requires some way of evaluating the various approaches. Currently, the community uses several different methods to evaluate accuracy. We show that these differences may greatly influence the accuracy results, which makes it nearly impossible to compare the quality of different techniques. We propose a benchmark suite to improve the situation, together with a community effort to contribute to, and evolve, the suite. We also propose fine-grained metrics for assessing the accuracy of the various approaches on the benchmark suite. This allows detection techniques to be compared and helps improve the accuracy of detecting design pattern occurrences.