Multimodal Foundation Models

Book

Format
Book, paperback
English
230 pages

Part of series
Foundations and Trends® in Computer Graphics and Vision

Regular price: DKK 1,129.95

Member price: DKK 984.95. To buy the book at the member price, you need a membership with Shopping benefits. You can try the membership free for 7 days. The membership renews automatically and can be cancelled at any time.

Delivery time: 7-9 business days (shipped from a remote warehouse)
Expected delivery: 14 November 2024
Can be gift-wrapped and sent as a gift

Description

This monograph presents a comprehensive survey of the taxonomy and evolution of multimodal foundation models that demonstrate vision and vision-language capabilities, focusing on the transition from specialist models to general-purpose assistants.

The survey covers five core topics in two classes. (i) Well-established research areas: multimodal foundation models pre-trained for specific purposes, with two topics, namely methods of learning vision backbones for visual understanding, and text-to-image generation. (ii) Recent advances in exploratory, open research areas: multimodal foundation models that aim to act as general-purpose assistants, with three topics, namely unified vision models inspired by large language models (LLMs), end-to-end training of multimodal LLMs, and chaining multimodal tools with LLMs.

The target audience of the monograph is researchers, graduate students, and professionals in computer vision and vision-language multimodal communities who are eager to learn the basics and recent advances in multimodal foundation models.


Details

Language: English
Page count: 230
Publication date: 6 May 2024
ISBN-13: 9781638283362
Publisher: Now Publishers Inc
Format: Paperback

Size and weight

Weight: 357 g

Depth: 1.3 cm

Width: 15.6 cm

Height: 23.4 cm
