Multimodal learning

Multimodal learning is a type of deep learning that integrates and processes multiple types of data, referred to as modalities, such as text, audio, images, or video. This integration allows for a more holistic understanding of complex data, improving model performance in tasks like visual question answering, cross-modal retrieval,[1] text-to-image generation,[2] aesthetic ranking,[3] and image captioning.[4]

Large multimodal models, such as Google Gemini and GPT-4o, have become increasingly popular since 2023, enabling increased versatility and a broader understanding of real-world phenomena.[5]

Motivation

Data usually comes in different modalities, each carrying different information. For example, it is common to caption an image to convey information not present in the image itself. Similarly, it is sometimes more straightforward to use an image to convey information that would be hard to express in text. As a result, if different words appear in similar images, these words likely describe the same thing; conversely, if a word is used to describe seemingly dissimilar images, those images may represent the same object. In cases dealing with multimodal data, it is therefore important to use a model that can jointly represent the information, so that it captures the combined information from the different modalities.

Multimodal transformers

Transformers can also be adapted to modalities (input or output) beyond text, usually by finding a way to "tokenize" the modality.

Multimodal models can either be trained from scratch or by finetuning. A 2022 study found that Transformers pretrained only on natural language can be finetuned on only 0.03% of their parameters and become competitive with LSTMs on a variety of logical and visual tasks, demonstrating transfer learning.[6] LLaVA is a vision-language model composed of a language model (Vicuna-13B)[7] and a vision model (ViT-L/14), connected by a linear layer; only the linear layer is finetuned.[8]

Vision transformers[9] adapt the transformer to computer vision by breaking down input images as a series of patches, turning them into vectors, and treating them like tokens in a standard transformer.
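The patch-tokenization step can be sketched in a few lines of NumPy. This is a minimal illustration, not the ViT implementation: the 16-pixel patch size and 768-dimensional embedding match the ViT-Base configuration, but the projection matrix here is random rather than learned.

```python
import numpy as np

def patchify(image, patch_size):
    """Split an (H, W, C) image into flattened non-overlapping patches."""
    h, w, c = image.shape
    p = patch_size
    patches = image.reshape(h // p, p, w // p, p, c)
    # Reorder so each row is one patch, flattened to a vector.
    return patches.transpose(0, 2, 1, 3, 4).reshape(-1, p * p * c)

rng = np.random.default_rng(0)
image = rng.random((224, 224, 3))         # dummy RGB image
patches = patchify(image, 16)             # 14 x 14 = 196 patches of length 768
W_embed = rng.random((16 * 16 * 3, 768))  # stand-in for the learned projection
tokens = patches @ W_embed                # (196, 768) "image tokens"
```

The resulting rows are then treated exactly like text-token embeddings by a standard transformer.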

Conformer[10] and later Whisper[11] follow the same pattern for speech recognition, first turning the speech signal into a spectrogram, which is then treated like an image, i.e. broken down into a series of patches, turned into vectors and treated like tokens in a standard transformer.

Perceivers[12][13] are a variant of Transformers designed for multimodality.

For image generation, notable architectures are DALL-E 1 (2021), Parti (2022),[14] Phenaki (2023),[15] and Muse (2023).[16] Unlike later models, DALL-E is not a diffusion model. Instead, it uses a decoder-only Transformer that autoregressively generates a text, followed by the token representation of an image, which is then converted by a variational autoencoder to an image.[17] Parti is an encoder-decoder Transformer, where the encoder processes a text prompt, and the decoder generates a token representation of an image.[18] Muse is an encoder-only Transformer that is trained to predict masked image tokens from unmasked image tokens. During generation, all input tokens are masked, and the highest-confidence predictions are included for the next iteration, until all tokens are predicted.[16] Phenaki is a text-to-video model. It is a bidirectional masked transformer conditioned on pre-computed text tokens. The generated tokens are then decoded to a video.[15]
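Muse's iterative decoding described above can be sketched as a loop. The `predict` function is a hypothetical stand-in for the masked transformer (a real model derives tokens and confidences from its logits), and the linear unmasking schedule is a simplification of the schedules used in practice.

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB, N_TOKENS, STEPS = 1024, 64, 8
MASK = -1

def predict(tokens):
    """Stand-in for the masked transformer: a (token, confidence) pair
    per position. Real models derive both from output logits."""
    return rng.integers(0, VOCAB, N_TOKENS), rng.random(N_TOKENS)

tokens = np.full(N_TOKENS, MASK)          # start with every token masked
for step in range(1, STEPS + 1):
    preds, conf = predict(tokens)
    conf[tokens != MASK] = np.inf         # already-fixed tokens stay fixed
    keep = int(N_TOKENS * step / STEPS)   # fix more tokens at each step
    top = np.argsort(-conf)[:keep]        # highest-confidence positions
    newly_fixed = top[tokens[top] == MASK]
    tokens[newly_fixed] = preds[newly_fixed]
# After the final step, all positions hold predicted image tokens.
```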

Multimodal large language models

Multimodality means "having several modalities", and a "modality" refers to a type of input or output, such as video, image, audio, text, proprioception, etc.[19] There have been many AI models trained specifically to ingest one modality and output another modality, such as AlexNet for image to label,[20] visual question answering for image-text to text,[21] and speech recognition for speech to text.

A common method to create multimodal models out of an LLM is to "tokenize" the output of a trained encoder. Concretely, one can construct an LLM that can understand images as follows: take a trained LLM, and take a trained image encoder E. Make a small multilayer perceptron f, so that for any image y, the post-processed vector f(E(y)) has the same dimensions as an encoded token. That is an "image token". Then, one can interleave text tokens and image tokens. The compound model is then fine-tuned on an image-text dataset. This basic construction can be applied with more sophistication to improve the model. The image encoder may be frozen to improve stability.[22]
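A minimal sketch of this construction, assuming hypothetical dimensions (a 1024-dimensional image encoder and 4096-dimensional token embeddings) and random stand-in weights for both the encoder and the projection:

```python
import numpy as np

rng = np.random.default_rng(0)
D_IMG, D_TOK = 1024, 4096    # hypothetical encoder / LLM widths

def image_encoder(image):
    """Stand-in for a frozen pretrained encoder E (e.g. a ViT)."""
    return rng.random(D_IMG)

# Small trainable projection f mapping encoder output to token space
# (a single linear layer here; LLaVA-style models use exactly this).
W = rng.random((D_IMG, D_TOK)) * 0.01

def image_token(image):
    return image_encoder(image) @ W  # same width as a text embedding

text_embeddings = rng.random((5, D_TOK))  # five dummy text-token embeddings
img_tok = image_token(None)
# Interleave: three text tokens, one image token, two text tokens.
sequence = np.vstack([text_embeddings[:3], img_tok, text_embeddings[3:]])
```

The interleaved `sequence` is then fed to the language model like any other token sequence during fine-tuning.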

Flamingo demonstrated the effectiveness of the tokenization method, finetuning a pair of pretrained language model and image encoder to perform better on visual question answering than models trained from scratch.[23] Google PaLM model was fine-tuned into a multimodal model PaLM-E using the tokenization method, and applied to robotic control.[24] LLaMA models have also been turned multimodal using the tokenization method, to allow image inputs,[25] and video inputs.[26]

GPT-4 can use both text and image as inputs[27] (although the vision component was not released to the public until GPT-4V[28]); Google DeepMind's Gemini is also multimodal.[29] Mistral introduced its own multimodal Pixtral 12B model in September 2024.[30]

Multimodal deep Boltzmann machines

A Boltzmann machine is a type of stochastic neural network invented by Geoffrey Hinton and Terry Sejnowski in 1985. Boltzmann machines can be seen as the stochastic, generative counterpart of Hopfield nets. They are named after the Boltzmann distribution in statistical mechanics. The units in Boltzmann machines are divided into two groups: visible units and hidden units. Each unit is like a neuron with a binary output that represents whether it is activated or not.[31] General Boltzmann machines allow connections between any units. However, learning with general Boltzmann machines is impractical because the training time grows exponentially with the size of the machine[citation needed]. A more efficient architecture, the restricted Boltzmann machine, allows connections only between a hidden unit and a visible unit.

Multimodal deep Boltzmann machines can process and learn from different types of information, such as images and text, simultaneously. This can notably be done by having a separate deep Boltzmann machine for each modality, for example one for images and one for text, joined at an additional top hidden layer.[32]
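The layout can be sketched with a simplified bottom-up pass: modality-specific hidden layers feeding a shared top layer. This illustrates only the architecture, not DBM training or inference, which involve undirected connections and sampling; all sizes and weights here are arbitrary stand-ins.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
D_IMG, D_TXT, D_HID, D_JOINT = 64, 32, 16, 8  # hypothetical layer sizes

# One weight matrix per modality pathway, plus a shared joint layer on top.
W_img = rng.normal(size=(D_IMG, D_HID))
W_txt = rng.normal(size=(D_TXT, D_HID))
W_joint = rng.normal(size=(2 * D_HID, D_JOINT))

def joint_representation(image, text):
    h_img = sigmoid(image @ W_img)   # image-specific hidden units
    h_txt = sigmoid(text @ W_txt)    # text-specific hidden units
    # The two pathways meet at the additional top hidden layer.
    return sigmoid(np.concatenate([h_img, h_txt]) @ W_joint)

z = joint_representation(rng.random(D_IMG), rng.random(D_TXT))
```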

Applications

Multimodal machine learning has numerous applications across various domains:

Cross-Modal Retrieval

Cross-modal retrieval allows users to search for data across different modalities (e.g., retrieving images based on text descriptions), improving multimedia search engines and content recommendation systems. Models like CLIP facilitate efficient, accurate retrieval by embedding data in a shared space, demonstrating strong performance even in zero-shot settings.[33]
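Retrieval in a shared embedding space reduces to nearest-neighbour search under cosine similarity. The sketch below uses random vectors in place of real CLIP embeddings; the 512-dimensional size matches CLIP ViT-B/32 output, but everything else is synthetic, including the query, which is image 42's embedding plus a little noise.

```python
import numpy as np

def normalize(x):
    """Scale vectors to unit length so dot products are cosine similarities."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

rng = np.random.default_rng(0)
# Stand-ins for pre-computed embeddings from a shared text/image space.
image_embs = normalize(rng.normal(size=(100, 512)))  # 100 indexed images
query_emb = normalize(image_embs[42] + 0.01 * rng.normal(size=512))

scores = image_embs @ query_emb  # cosine similarity against every image
best = int(np.argmax(scores))    # index of the nearest image to the query
```

In a real system the query embedding would come from the text encoder (e.g. a caption) while the index holds image embeddings, which is what makes the retrieval cross-modal.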

Classification and Missing Data Retrieval

Multimodal Deep Boltzmann Machines outperform traditional models like support vector machines and latent Dirichlet allocation in classification tasks and can predict missing data in multimodal datasets, such as images and text.

Healthcare Diagnostics

Multimodal models integrate medical imaging, genomic data, and patient records to improve diagnostic accuracy and early disease detection, especially in cancer screening.[34][35][36]

Content Generation

Models like DALL·E generate images from textual descriptions, benefiting creative industries, while cross-modal retrieval enables dynamic multimedia searches.[37]

Robotics and HCI

Multimodal learning improves interaction in robotics and AI by integrating sensory inputs like speech, vision, and touch, aiding autonomous systems and human-computer interaction.

Emotion Recognition

Combining visual, audio, and text data, multimodal systems enhance sentiment analysis and emotion recognition, applied in customer service, social media, and marketing.

References

  1. ^ Hendriksen, Mariya; Bleeker, Maurits; Vakulenko, Svitlana; van Noord, Nanne; Kuiper, Ernst; de Rijke, Maarten (2021). "Extending CLIP for Category-to-image Retrieval in E-commerce". arXiv:2112.11294 [cs.CV].
  2. ^ "Stable Diffusion Repository on GitHub". CompVis - Machine Vision and Learning Research Group, LMU Munich. 17 September 2022. Archived from the original on January 18, 2023. Retrieved 17 September 2022.
  3. ^ LAION-AI/aesthetic-predictor, LAION AI, 2024-09-06, retrieved 2024-09-08
  4. ^ Mokady, Ron; Hertz, Amir; Bermano, Amit H. (2021). "ClipCap: CLIP Prefix for Image Captioning". arXiv:2111.09734 [cs.CV].
  5. ^ Zia, Tehseen (January 8, 2024). "Unveiling of Large Multimodal Models: Shaping the Landscape of Language Models in 2024". Unite.ai. Retrieved 2024-06-01.
  6. ^ Lu, Kevin; Grover, Aditya; Abbeel, Pieter; Mordatch, Igor (2022-06-28). "Frozen Pretrained Transformers as Universal Computation Engines". Proceedings of the AAAI Conference on Artificial Intelligence. 36 (7): 7628–7636. doi:10.1609/aaai.v36i7.20729. ISSN 2374-3468.
  7. ^ "Vicuna: An Open-Source Chatbot Impressing GPT-4 with 90%* ChatGPT Quality | LMSYS Org". lmsys.org. Retrieved 2024-08-11.
  8. ^ Liu, Haotian; Li, Chunyuan; Wu, Qingyang; Lee, Yong Jae (2023-12-15). "Visual Instruction Tuning". Advances in Neural Information Processing Systems. 36: 34892–34916.
  9. ^ Dosovitskiy, Alexey; Beyer, Lucas; Kolesnikov, Alexander; Weissenborn, Dirk; Zhai, Xiaohua; Unterthiner, Thomas; Dehghani, Mostafa; Minderer, Matthias; Heigold, Georg; Gelly, Sylvain; Uszkoreit, Jakob (2021-06-03). "An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale". arXiv:2010.11929 [cs.CV].
  10. ^ Gulati, Anmol; Qin, James; Chiu, Chung-Cheng; Parmar, Niki; Zhang, Yu; Yu, Jiahui; Han, Wei; Wang, Shibo; Zhang, Zhengdong; Wu, Yonghui; Pang, Ruoming (2020). "Conformer: Convolution-augmented Transformer for Speech Recognition". arXiv:2005.08100 [eess.AS].
  11. ^ Radford, Alec; Kim, Jong Wook; Xu, Tao; Brockman, Greg; McLeavey, Christine; Sutskever, Ilya (2022). "Robust Speech Recognition via Large-Scale Weak Supervision". arXiv:2212.04356 [eess.AS].
  12. ^ Jaegle, Andrew; Gimeno, Felix; Brock, Andrew; Zisserman, Andrew; Vinyals, Oriol; Carreira, Joao (2021-06-22). "Perceiver: General Perception with Iterative Attention". arXiv:2103.03206 [cs.CV].
  13. ^ Jaegle, Andrew; Borgeaud, Sebastian; Alayrac, Jean-Baptiste; Doersch, Carl; Ionescu, Catalin; Ding, David; Koppula, Skanda; Zoran, Daniel; Brock, Andrew; Shelhamer, Evan; Hénaff, Olivier (2021-08-02). "Perceiver IO: A General Architecture for Structured Inputs & Outputs". arXiv:2107.14795 [cs.LG].
  14. ^ "Parti: Pathways Autoregressive Text-to-Image Model". sites.research.google. Retrieved 2024-08-09.
  15. ^ a b Villegas, Ruben; Babaeizadeh, Mohammad; Kindermans, Pieter-Jan; Moraldo, Hernan; Zhang, Han; Saffar, Mohammad Taghi; Castro, Santiago; Kunze, Julius; Erhan, Dumitru (2022-09-29). "Phenaki: Variable Length Video Generation from Open Domain Textual Descriptions". {{cite journal}}: Cite journal requires |journal= (help)
  16. ^ a b Chang, Huiwen; Zhang, Han; Barber, Jarred; Maschinot, A. J.; Lezama, Jose; Jiang, Lu; Yang, Ming-Hsuan; Murphy, Kevin; Freeman, William T. (2023-01-02). "Muse: Text-To-Image Generation via Masked Generative Transformers". arXiv:2301.00704 [cs.CV].
  17. ^ Ramesh, Aditya; Pavlov, Mikhail; Goh, Gabriel; Gray, Scott; Voss, Chelsea; Radford, Alec; Chen, Mark; Sutskever, Ilya (2021-02-26), Zero-Shot Text-to-Image Generation, arXiv:2102.12092
  18. ^ Yu, Jiahui; Xu, Yuanzhong; Koh, Jing Yu; Luong, Thang; Baid, Gunjan; Wang, Zirui; Vasudevan, Vijay; Ku, Alexander; Yang, Yinfei (2022-06-21), Scaling Autoregressive Models for Content-Rich Text-to-Image Generation, arXiv:2206.10789
  19. ^ Kiros, Ryan; Salakhutdinov, Ruslan; Zemel, Rich (2014-06-18). "Multimodal Neural Language Models". Proceedings of the 31st International Conference on Machine Learning. PMLR: 595–603. Archived from the original on 2023-07-02. Retrieved 2023-07-02.
  20. ^ Krizhevsky, Alex; Sutskever, Ilya; Hinton, Geoffrey E (2012). "ImageNet Classification with Deep Convolutional Neural Networks". Advances in Neural Information Processing Systems. 25. Curran Associates, Inc. Archived from the original on 2023-07-02. Retrieved 2023-07-02.
  21. ^ Antol, Stanislaw; Agrawal, Aishwarya; Lu, Jiasen; Mitchell, Margaret; Batra, Dhruv; Zitnick, C. Lawrence; Parikh, Devi (2015). "VQA: Visual Question Answering". ICCV: 2425–2433. Archived from the original on 2023-07-02. Retrieved 2023-07-02.
  22. ^ Li, Junnan; Li, Dongxu; Savarese, Silvio; Hoi, Steven (2023-01-01). "BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models". arXiv:2301.12597 [cs.CV].
  23. ^ Alayrac, Jean-Baptiste; Donahue, Jeff; Luc, Pauline; Miech, Antoine; Barr, Iain; Hasson, Yana; Lenc, Karel; Mensch, Arthur; Millican, Katherine; Reynolds, Malcolm; Ring, Roman; Rutherford, Eliza; Cabi, Serkan; Han, Tengda; Gong, Zhitao (2022-12-06). "Flamingo: a Visual Language Model for Few-Shot Learning". Advances in Neural Information Processing Systems. 35: 23716–23736. arXiv:2204.14198. Archived from the original on 2023-07-02. Retrieved 2023-07-02.
  24. ^ Driess, Danny; Xia, Fei; Sajjadi, Mehdi S. M.; Lynch, Corey; Chowdhery, Aakanksha; Ichter, Brian; Wahid, Ayzaan; Tompson, Jonathan; Vuong, Quan; Yu, Tianhe; Huang, Wenlong; Chebotar, Yevgen; Sermanet, Pierre; Duckworth, Daniel; Levine, Sergey (2023-03-01). "PaLM-E: An Embodied Multimodal Language Model". arXiv:2303.03378 [cs.LG].
  25. ^ Liu, Haotian; Li, Chunyuan; Wu, Qingyang; Lee, Yong Jae (2023-04-01). "Visual Instruction Tuning". arXiv:2304.08485 [cs.CV].
  26. ^ Zhang, Hang; Li, Xin; Bing, Lidong (2023-06-01). "Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding". arXiv:2306.02858 [cs.CL].
  27. ^ OpenAI (2023-03-27). "GPT-4 Technical Report". arXiv:2303.08774 [cs.CL].
  28. ^ OpenAI (September 25, 2023). "GPT-4V(ision) System Card" (PDF).
  29. ^ Pichai, Sundar (10 May 2023), Google Keynote (Google I/O '23), timestamp 15:31, retrieved 2023-07-02
  30. ^ Wiggers, Kyle (11 September 2024). "Mistral releases Pixtral 12B, its first multimodal model". TechCrunch. Retrieved 14 September 2024.
  31. ^ Dey, Victor (2021-09-03). "Beginners Guide to Boltzmann Machine". Analytics India Magazine. Retrieved 2024-03-02.
  32. ^ "Multimodal Learning with Deep Boltzmann Machine" (PDF). 2014. Archived (PDF) from the original on 2015-06-21. Retrieved 2015-06-14.
  33. ^ Hendriksen, Mariya; Vakulenko, Svitlana; Kuiper, Ernst; de Rijke, Maarten (2023). "Scene-centric vs. Object-centric Image-Text Cross-modal Retrieval: A Reproducibility Study". arXiv:2301.05174 [cs.CV].
  34. ^ Quach, Katyanna. "Harvard boffins build multimodal AI system to predict cancer". The Register. Archived from the original on 20 September 2022. Retrieved 16 September 2022.
  35. ^ Chen, Richard J.; Lu, Ming Y.; Williamson, Drew F. K.; Chen, Tiffany Y.; Lipkova, Jana; Noor, Zahra; Shaban, Muhammad; Shady, Maha; Williams, Mane; Joo, Bumjin; Mahmood, Faisal (8 August 2022). "Pan-cancer integrative histology-genomic analysis via multimodal deep learning". Cancer Cell. 40 (8): 865–878.e6. doi:10.1016/j.ccell.2022.07.004. ISSN 1535-6108. PMC 10397370. PMID 35944502. S2CID 251456162.
  36. ^ Shi, Yuge; Siddharth, N.; Paige, Brooks; Torr, Philip HS (2019). "Variational Mixture-of-Experts Autoencoders for Multi-Modal Deep Generative Models". arXiv:1911.03393 [cs.LG].
  37. ^ Shi, Yuge; Siddharth, N.; Paige, Brooks; Torr, Philip HS (2019). "Variational Mixture-of-Experts Autoencoders for Multi-Modal Deep Generative Models". arXiv:1911.03393 [cs.LG].
