Automated wrist fracture recognition has become a crucial research area because accurate X-ray interpretation is difficult in clinical settings that lack specialized expertise. With the development of neural networks, YOLO models have recently been applied extensively to fracture detection. However, detection models can struggle when trained on small datasets, which is often the case in medical scenarios. In this study, we use an extremely small multi-region fracture dataset and hypothesize that the structural similarities between surface cracks and bone fractures allow YOLOv9, with its generalized efficient layer aggregation network (GELAN) and programmable gradient information (PGI), to transfer knowledge effectively. We show that pre-training YOLOv9 on surface cracks, rather than on COCO as YOLO models typically are, and then fine-tuning it on the fracture dataset yields substantial performance improvements. We also show that the model pre-trained on surface cracks requires fewer epochs to converge and overfits less. We achieve state-of-the-art (SOTA) performance on the recently released FracAtlas dataset, surpassing the previously established benchmark: our approach improves the mean average precision (mAP) score by 7% and sensitivity by 13%.