PosterO: Structuring Layout Trees to Enable Language Models in Generalized Content-Aware Layout Generation

Wangxuan Institute of Computer Technology, Peking University
CVPR 2025

Generalized settings in content-aware layout generation from (a) data and (b) approach perspectives.

Abstract

In poster design, content-aware layout generation is crucial for automatically arranging visual-textual elements on the given image. With limited training data, existing work has focused on image-centric enhancement. However, this neglects the diversity of layouts and fails to cope with shape-variant elements or diverse design intents in generalized settings. To this end, we propose a layout-centric approach that leverages the layout knowledge implicit in large language models (LLMs) to create posters for omnifarious purposes, hence the name PosterO. Specifically, it structures layouts from datasets as trees in the SVG language through universal shape, design intent vectorization, and hierarchical node representation. Then, during inference, it applies LLMs to predict new layout trees via in-context learning with intent-aligned example selection. Once layout trees are generated, they can be seamlessly realized as poster designs by editing the chat with LLMs. Extensive experiments demonstrate that PosterO generates visually appealing layouts for given images, achieving new state-of-the-art performance across various benchmarks. To further explore PosterO's abilities under the generalized settings, we built PStylish7, the first dataset with multi-purpose posters and various-shaped elements, which offers a challenging test for advanced research.

The Proposed Approach: PosterO

An overview of PosterO.

First, (a) layout tree construction takes image-layout pairs (I, L) from datasets as input for data preparation, jointly modeling various-shaped elements E and design intents D into layout trees T and building up the latent space of D. Then, (b) layout tree generation takes test images I_t as input and retrieves examples based on the predicted design intents f_t, which are used to apply the LLM M through in-context learning. After the generated layout trees are obtained, (c) poster design realization continues the conversation with M to create poster designs seamlessly.
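
To make stages (a) and (b) more concrete, the following is a minimal Python sketch and not the PosterO implementation. It assumes that each element is a plain axis-aligned box, that a design intent is already available as a short numeric vector, and that examples are ranked by cosine similarity; the paper's universal shapes, hierarchical nodes, and intent predictor are not reproduced here. The names Element, layout_to_svg, select_examples, and build_prompt are hypothetical and introduced only for illustration.

```python
import math
from dataclasses import dataclass
from xml.etree import ElementTree as ET


@dataclass
class Element:
    """One layout element as an axis-aligned box (hypothetical schema)."""
    category: str  # e.g. "text", "logo", "underlay"
    x: int
    y: int
    width: int
    height: int


def layout_to_svg(elements, canvas_w, canvas_h):
    """Serialize a flat layout as an SVG string, one <rect> per element.

    Assumption: rectangles stand in for the paper's universal shape.
    """
    root = ET.Element("svg", {"width": str(canvas_w), "height": str(canvas_h)})
    for el in elements:
        ET.SubElement(root, "rect", {
            "class": el.category,
            "x": str(el.x), "y": str(el.y),
            "width": str(el.width), "height": str(el.height),
        })
    return ET.tostring(root, encoding="unicode")


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def select_examples(query_intent, pool, k=2):
    """Return the k pool entries whose intent vectors best match the query.

    Assumption: cosine similarity is used as the intent-alignment measure.
    """
    ranked = sorted(pool, key=lambda item: cosine(query_intent, item["intent"]),
                    reverse=True)
    return ranked[:k]


def build_prompt(examples, canvas_w, canvas_h):
    """Concatenate example SVGs and an open <svg> tag for the LLM to complete."""
    parts = ["Complete the final SVG layout in the same style as the examples."]
    parts.extend(ex["svg"] for ex in examples)
    parts.append(f'<svg width="{canvas_w}" height="{canvas_h}">')
    return "\n".join(parts)
```

In this sketch, a pool entry is a dictionary with an "intent" vector and an "svg" string produced by layout_to_svg. The prompt returned by build_prompt would then be sent to whichever LLM is in use, and stage (c) would continue that same conversation.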

BibTeX


@inproceedings{Hsu-CVPR2025-postero,
  title={PosterO: Structuring Layout Trees to Enable Language Models in Generalized Content-Aware Layout Generation},
  author={Hsu, HsiaoYuan and Peng, Yuxin},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year={2025}
}