3D city generation is a desirable yet challenging task, since humans are more sensitive to structural distortions in urban environments. Additionally, generating 3D cities is more complex than 3D natural scenes since buildings, as objects of the same class, exhibit a wider range of appearances compared to the relatively consistent appearance of objects like trees in natural scenes. To address these challenges, we propose **CityDreamer**, a compositional generative model designed specifically for unbounded 3D cities. Our key insight is that 3D city generation should be a composition of different types of neural fields: 1) various building instances, and 2) background stuff, such as roads and green lands. Specifically, we adopt the bird's eye view scene representation and employ a volumetric render for both instance-oriented and stuff-oriented neural fields. The generative hash grid and periodic positional embedding are tailored as scene parameterization to suit the distinct characteristics of building instances and background stuff. Furthermore, we contribute a suite of CityGen Datasets, including OSM and GoogleEarth, which comprises a vast amount of real-world city imagery to enhance the realism of the generated 3D cities both in their layouts and appearances. CityDreamer achieves state-of-the-art performance not only in generating realistic 3D cities but also in localized editing within the generated cities.

## Overview

- 3D city generation is a challenging task due to human sensitivity to structural distortions in urban environments and the wider range of building appearances compared to natural scenes.
- To address these challenges, the researchers propose [CityDreamer](https://aimodels.fyi/papers/arxiv/urban-architect-steerable-3d-urban-scene-generation), a compositional generative model designed specifically for 3D city generation.
- The key insight is that 3D city generation should be a composition of different types of neural fields: building instances and background stuff like roads and green spaces.

## Plain English Explanation

The researchers developed a system called [CityDreamer](https://aimodels.fyi/papers/arxiv/urban-architect-steerable-3d-urban-scene-generation) to generate realistic 3D cities. Generating 3D cities is more complex than generating natural 3D scenes because buildings can have a wide variety of appearances, while objects in nature tend to look more similar.

The researchers' approach involves breaking down the 3D city into two main components: the individual buildings and the background elements like roads and parks. They use specialized techniques to model each of these components, which allows the system to create more believable and diverse 3D cities.

The researchers also assembled a large collection of real-world city data, the [CityGen Datasets](https://aimodels.fyi/papers/arxiv/grounded-compositional-diverse-text-to-3d-pretrained), so that the generated cities look more realistic in both their layouts and their appearance.

## Technical Explanation

The researchers propose [CityDreamer](https://aimodels.fyi/papers/arxiv/urban-architect-steerable-3d-urban-scene-generation), a compositional generative model for 3D city generation. The key insight is that 3D city generation should be a composition of different types of neural fields: 1) building instances and 2) background stuff, such as roads and green lands.
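The compositional idea can be illustrated with a small sketch. It is not the paper's implementation: the label constants, the toy `background_field` and `make_building_field` functions, and the `query_scene` helper are all hypothetical names introduced here, and the "fields" are hand-written stand-ins rather than learned networks. The sketch only shows how a bird's eye view layout could route each 3D query point either to a per-building field or to a shared background field.

```python
from typing import Callable, Dict, List, Tuple

Color = Tuple[float, float, float]
Field = Callable[[float, float, float], Tuple[float, Color]]

# Hypothetical semantic labels for cells of a bird's-eye-view (BEV) layout.
ROAD, GREEN, BUILDING = 0, 1, 2


def background_field(x: float, y: float, z: float) -> Tuple[float, Color]:
    """Toy stand-in for a stuff-oriented field: a thin, gray ground slab."""
    density = 5.0 if z < 0.1 else 0.0
    return density, (0.3, 0.3, 0.3)


def make_building_field(height: float, color: Color) -> Field:
    """Toy stand-in for an instance-oriented field: a solid box of given height."""
    def field(x: float, y: float, z: float) -> Tuple[float, Color]:
        density = 10.0 if z < height else 0.0
        return density, color
    return field


def query_scene(
    x: float,
    y: float,
    z: float,
    semantic: List[List[int]],
    instance_ids: List[List[int]],
    building_fields: Dict[int, Field],
) -> Tuple[float, Color]:
    """Look up the BEV cell under (x, y) and dispatch to the matching field."""
    row, col = int(y), int(x)
    if semantic[row][col] == BUILDING:
        # Each building instance has its own field, keyed by instance id.
        return building_fields[instance_ids[row][col]](x, y, z)
    # Roads, green land, and other "stuff" share one background field.
    return background_field(x, y, z)


# A 2x2 layout: one road cell, one green cell, and two distinct buildings.
semantic = [[ROAD, BUILDING], [GREEN, BUILDING]]
instance_ids = [[0, 1], [0, 2]]
building_fields = {
    1: make_building_field(3.0, (0.8, 0.2, 0.2)),
    2: make_building_field(1.0, (0.2, 0.2, 0.8)),
}
```

Because every building is addressed by its own instance id, swapping the entry for one id changes only that building, which is one plausible reading of how a compositional design supports localized edits.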

Specifically, the system uses a bird's eye view scene representation and employs a volumetric rendering approach for both the instance-oriented and stuff-oriented neural fields. The researchers tailor the generative hash grid and periodic positional embedding techniques to suit the distinct characteristics of building instances and background stuff.
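Two of the ingredients named above can be sketched in a few lines. The code below is an illustration under assumptions, not the paper's formulation: `periodic_embedding` uses a standard sinusoidal encoding as one common way to obtain a periodic embedding, and `composite_ray` applies the usual alpha-compositing rule for volumetric rendering along a single ray. The function names, the number of frequencies, and the fixed step size are hypothetical choices; the generative hash grid is not shown.

```python
import math
from typing import List, Sequence, Tuple

Color = Tuple[float, float, float]


def periodic_embedding(p: Sequence[float], num_freqs: int = 4) -> List[float]:
    """Sinusoidal embedding of a point; every feature repeats with period 2."""
    features: List[float] = []
    for k in range(num_freqs):
        freq = (2.0 ** k) * math.pi  # assumed frequency schedule
        for v in p:
            features.append(math.sin(freq * v))
            features.append(math.cos(freq * v))
    return features


def composite_ray(
    densities: Sequence[float],
    colors: Sequence[Color],
    step: float,
) -> Tuple[Color, float]:
    """Alpha-composite samples ordered front to back along one ray.

    Returns the accumulated pixel color and the remaining transmittance.
    """
    transmittance = 1.0
    pixel = [0.0, 0.0, 0.0]
    for sigma, rgb in zip(densities, colors):
        # Opacity contributed by a segment of length `step` with density sigma.
        alpha = 1.0 - math.exp(-sigma * step)
        weight = transmittance * alpha
        for c in range(3):
            pixel[c] += weight * rgb[c]
        # Light that survives this segment continues to the next sample.
        transmittance *= 1.0 - alpha
    return (pixel[0], pixel[1], pixel[2]), transmittance
```

In a full pipeline, the densities and colors passed to `composite_ray` would come from the instance-oriented and stuff-oriented fields queried at sample points along each camera ray; here they are plain lists so the compositing rule can be checked in isolation.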

Additionally, the researchers contribute the [CityGen Datasets](https://aimodels.fyi/papers/arxiv/grounded-compositional-diverse-text-to-3d-pretrained), which include a vast amount of real-world city imagery from sources like OpenStreetMap and Google Earth. These datasets help the system generate 3D cities that are more realistic in terms of both layout and appearance.

## Critical Analysis

The researchers acknowledge that generating realistic 3D cities is a challenging task, as humans are highly sensitive to structural distortions in urban environments. They also note that 3D city generation is more complex than 3D natural scene generation due to the wider range of building appearances.

While the [CityDreamer](https://aimodels.fyi/papers/arxiv/urban-architect-steerable-3d-urban-scene-generation) model and the [CityGen Datasets](https://aimodels.fyi/papers/arxiv/grounded-compositional-diverse-text-to-3d-pretrained) represent significant advancements in the field, the researchers do not discuss potential limitations or areas for further research in detail. For example, it would be interesting to explore how the system might handle the generation of cities with unique architectural styles or cultural influences.

Additionally, the researchers could have compared their approach to other recent developments in 3D scene and city generation, such as [RealmDreamer](https://aimodels.fyi/papers/arxiv/realmdreamer-text-driven-3d-scene-generation-inpainting), [DreamScene](https://aimodels.fyi/papers/arxiv/dreamscene-3d-gaussian-based-text-to-3d), or [StyleCity](https://aimodels.fyi/papers/arxiv/stylecity-large-scale-3d-urban-scenes-stylization), to provide a more comprehensive understanding of the state of the art in this field.

## Conclusion

The researchers have developed [CityDreamer](https://aimodels.fyi/papers/arxiv/urban-architect-steerable-3d-urban-scene-generation), a compositional generative model that addresses the challenges of 3D city generation. By breaking down the task into building instances and background stuff, the system is able to generate more realistic and diverse 3D cities.

The [CityGen Datasets](https://aimodels.fyi/papers/arxiv/grounded-compositional-diverse-text-to-3d-pretrained), which include a vast amount of real-world city imagery, are also a valuable contribution that can help advance the field of 3D city generation. While the researchers have made significant progress, there are still opportunities for further exploration and improvement, such as generating cities with unique architectural styles or cultural influences.