latents

Hierarchical text-conditional image generation with CLIP latents

Contrastive models like CLIP have been shown to learn robust representations of images that capture each semantics and magnificence. To leverage these representations for image generation, we propose a two-stage model: a previous that...

Recent posts

Popular categories

ASK ANA