There is no complete text2scene pipeline.

by nikitavovch - opened Nov 4, 2025

You cannot generate a scene from just the text description + GT layout, as claimed. You'll always need at least RGB images to run inference with the text2scene pipeline. I don't get why you haven't put everything into one script, I mean generating the RGB, depth, normal, etc. images so that the full text-to-scene pipeline can be run in one click. To run inference on your own generation from only a GT layout, you first need to run the preprocessing script, then generate the RGB images yourself with the flux wireframe model, and so on for the remaining inputs.
