SHOW-O: A Single Transformer Uniting Multimodal Understanding and Generation

Significant advancements in large language models (LLMs) have inspired the development of multimodal large language models (MLLMs). Early MLLM efforts, such as LLaVA, MiniGPT-4, and InstructBLIP, show notable multimodal understanding capabilities. To integrate LLMs...
