Abstract: Scene graph generation has become increasingly important because it bridges the gap between the linguistic and visual information of a scene, facilitating a high-dimensional understanding of scenes. In ...