Overview of OBI Decipherment Using a Dual-Condition Stable Diffusion Model
This study presents a two-stage approach for deciphering Oracle Bone Inscriptions (OBI) into modern Chinese characters, built on a text-image dual-condition guided diffusion mechanism based on Stable Diffusion. During training, the model is conditioned on both visual features and semantic information, which keeps character structure and meaning coherent throughout the denoising process. The pre-trained model is fine-tuned with LoRA (low-rank adaptation), which updates only the cross-attention layers and leaves the remaining weights, and with them the model's core generative capabilities, unchanged.
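The LoRA fine-tuning described above can be illustrated with a minimal, dependency-free sketch. It shows only the general low-rank update, W' = W + (alpha / r) * B A, applied to a single frozen projection matrix. The class name `LoRALinear`, the rank, the scaling factor, and the 8x8 identity weight are all illustrative assumptions and are not taken from the study.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


class LoRALinear:
    """Hypothetical LoRA-wrapped projection: frozen W plus a trainable low-rank update."""

    def __init__(self, weight, rank=2, alpha=4.0, seed=0):
        rng = random.Random(seed)
        out_dim, in_dim = len(weight), len(weight[0])
        self.weight = weight  # frozen pre-trained weight, never modified here
        # A starts with small random values and B with zeros, so the
        # low-rank update B @ A is exactly zero before any training step.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(in_dim)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(out_dim)]
        self.scale = alpha / rank

    def effective_weight(self):
        delta = matmul(self.B, self.A)  # shape: out_dim x in_dim
        return [[w + self.scale * d for w, d in zip(w_row, d_row)]
                for w_row, d_row in zip(self.weight, delta)]

    def forward(self, x):
        return [sum(w * xi for w, xi in zip(row, x)) for row in self.effective_weight()]

    def trainable_param_count(self):
        rank = len(self.A)
        return rank * len(self.A[0]) + len(self.B) * rank


# Assumed toy setup: an 8x8 identity matrix stands in for one cross-attention projection.
W = [[1.0 if i == j else 0.0 for j in range(8)] for i in range(8)]
layer = LoRALinear(W, rank=2)
x = [float(i) for i in range(8)]
baseline = layer.forward(x)               # equals the frozen layer's output at initialization
full_params = len(W) * len(W[0])          # parameters a full fine-tune would update
lora_params = layer.trainable_param_count()
```

Because B starts at zero, the wrapped layer initially reproduces the pre-trained output, and only the two small matrices A and B would receive gradient updates. In this toy setup that is 32 trainable values against 64 in the full matrix; the saving grows with layer size.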
To improve Chinese text understanding, the original CLIP model is replaced with Chinese-CLIP, which is better suited to Chinese-language semantics. This improves the accuracy with which the model generates aligned character images and helps bridge the structural and semantic gap between OBI and modern characters. At inference, the model takes only the OBI image as input and generates the modern equivalent, relying on what it learned during training. The study reports high-quality transformation results and presents the method as an advance in OBI decipherment.
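The difference between dual-condition training and image-only inference can be sketched as follows. The summary does not say how the two conditions are combined or how the text condition is dropped at inference, so this sketch makes two assumptions: the conditions are concatenated along the token axis, and a fixed placeholder ("null") embedding replaces the text tokens when no text is available. The function name `build_condition` and all token values are hypothetical.

```python
def build_condition(image_tokens, text_tokens=None, null_text_tokens=None):
    """Assemble the conditioning sequence fed to the cross-attention layers.

    Assumption: image and text conditions are concatenated along the token axis.
    When no text is supplied (inference), a placeholder embedding fills the text slot.
    """
    if text_tokens is None:
        if null_text_tokens is None:
            raise ValueError("inference needs a placeholder for the missing text condition")
        text_tokens = null_text_tokens
    dim = len(image_tokens[0])
    if any(len(tok) != dim for tok in image_tokens + text_tokens):
        raise ValueError("all conditioning tokens must share one embedding dimension")
    return image_tokens + text_tokens


# Toy 2-dimensional embeddings; real encoder outputs would be far larger.
obi_image_tokens = [[1.0, 0.0], [0.0, 1.0]]   # stand-in for OBI image features
modern_text_tokens = [[0.5, 0.5]]             # stand-in for Chinese-CLIP text features
null_tokens = [[0.0, 0.0]]                    # assumed placeholder for "no text"

train_condition = build_condition(obi_image_tokens, modern_text_tokens)
infer_condition = build_condition(obi_image_tokens, null_text_tokens=null_tokens)
```

Under these assumptions the conditioning sequence has the same shape in both stages, so the same denoising network can run unchanged whether or not the text condition is present.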