Apple has unveiled Pico-Banana-400K, a curated dataset of 400,000 images aimed at advancing text-guided image editing research. The dataset, built using Google's Gemini-2.5 models, is released under a non-commercial research license that permits academic use only. Apple's study points to a shortage of high-quality, large-scale editing datasets, which often suffer from inconsistencies and limited diversity. Pico-Banana-400K addresses these problems by drawing on real photographs from the OpenImages dataset and covering key categories such as human-centric edits and scene adjustments. The researchers defined 35 distinct edit types, enabling models to learn effective editing from both single-turn and multi-turn prompts. Despite acknowledged limitations in spatial editing and typography, Apple hopes the dataset will become a pivotal resource for developing and benchmarking text-guided image editing models. The study is available on arXiv, and the dataset can be accessed on GitHub.
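For readers who want a sense of how such a dataset might be explored once downloaded from GitHub, the sketch below tallies examples by edit type and by single- versus multi-turn sessions. It is illustrative only: the metadata file name and the field names (`edit_type`, `turns`) are hypothetical placeholders, not Pico-Banana-400K's actual schema.

```python
# Illustrative sketch only: the metadata path and field names below are
# hypothetical placeholders, not the dataset's documented schema.
import json
from collections import Counter
from pathlib import Path

METADATA = Path("pico_banana_400k/metadata.jsonl")  # hypothetical location


def load_examples(path):
    """Yield one editing example per JSON line in the metadata file."""
    with path.open(encoding="utf-8") as fh:
        for line in fh:
            if line.strip():
                yield json.loads(line)


def summarize(path):
    """Count examples per edit type and single- vs. multi-turn sessions."""
    edit_types = Counter()
    turn_kinds = Counter()
    for ex in load_examples(path):
        edit_types[ex.get("edit_type", "unknown")] += 1
        is_multi = len(ex.get("turns", [])) > 1
        turn_kinds["multi-turn" if is_multi else "single-turn"] += 1
    return edit_types, turn_kinds


if __name__ == "__main__":
    types, turns = summarize(METADATA)
    print(f"{sum(types.values())} examples across {len(types)} edit types")
    for name, count in types.most_common(10):
        print(f"{name:30s} {count}")
    print(dict(turns))
```

A summary like this would let a researcher check how evenly the 35 edit types are represented before training or benchmarking on the data.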