Explained: Multitask Unified Model



TLDR: MUM can search the internet to access information across different languages, text, images and media to help users find answers to detailed, multi-part questions. It's not just search – it's research.


They call it the mother of all search algorithms, so the name MUM, or Multitask Unified Model, seems appropriate.

MUM follows Hummingbird, which was first introduced in 2013 when it replaced the Caffeine algorithm. While it was a new engine, it kept parts of older components like Panda and Penguin. Both were significant: the Panda update acted as a filter designed to stop poor-quality content from ranking well, and Penguin penalised websites with bad link-building techniques and keyword stuffing. Later, Google introduced RankBrain, which used machine learning to determine the most relevant results for queries. With RankBrain, the query goes through an interpretation model that factors in other considerations like location, personalisation, and true intent based on vocabulary, in addition to the actual query.

Until now, BERT, or Bidirectional Encoder Representations from Transformers, has been the most sophisticated algorithm, since it can fill in gaps for context, using the text before and after the search query to better understand the sentence. For this reason it is called a transformer: it can rearrange the sentence without losing the meaning. BERT changed a lot in the field of NLP and has been applied in thousands of diverse applications and industries. It helps humans and machines understand each other better.
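As a rough illustration of how a BERT-style model uses the words on both sides of a gap, here is a minimal sketch (not Google's production system) using the open-source Hugging Face transformers library and the publicly released bert-base-uncased checkpoint:

```python
# Minimal sketch: masked-language-model "fill in the gap" with a public BERT checkpoint.
# This illustrates the idea behind BERT, not Google's production Search models.
from transformers import pipeline

# bert-base-uncased is the publicly released English BERT model on the Hugging Face Hub.
fill_mask = pipeline("fill-mask", model="bert-base-uncased")

# BERT looks at the words before AND after the [MASK] token to predict what is missing.
for prediction in fill_mask("Planning a trip to [MASK] to see the Eiffel Tower."):
    print(prediction["token_str"], round(prediction["score"], 3))
```

The model ranks candidate words for the masked position using the full surrounding context, which is the same fill-in-the-gaps idea described above.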

So, what's so special about MUM? This algorithm can do what BERT does, but on a more complex level. The way developers put it: it's not just search, it's research. MUM can search the internet to access information across different languages, text, images and media to help users find answers to detailed, multi-part questions.

The fact that MUM is able to cross barriers like language and media format is a big win for Google to promote search across its products. Take Google Lens as an example: a user searching for a certain graphic pattern could use Lens to submit an image containing that pattern as the query.
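MUM itself is not publicly available, so as a hedged stand-in for the multimodal idea, the sketch below uses the open-source CLIP model via the Hugging Face transformers library to score how well different text queries describe an image; the file name pattern.jpg is hypothetical:

```python
# Minimal sketch of text-image matching with an open vision-language model (CLIP).
# This is an illustrative stand-in for the multimodal idea, not MUM or Google Lens itself.
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("pattern.jpg")  # hypothetical local photo of a fabric pattern
candidate_queries = ["floral shirt pattern", "striped shirt pattern", "polka dot socks"]

# Encode the image and the candidate text queries into the same embedding space.
inputs = processor(text=candidate_queries, images=image, return_tensors="pt", padding=True)
outputs = model(**inputs)

# Higher probability = the text description matches the image more closely.
probs = outputs.logits_per_image.softmax(dim=1)
for query, prob in zip(candidate_queries, probs[0].tolist()):
    print(f"{query}: {prob:.2f}")
```

The design point is simply that images and text are encoded into a shared space, so a picture can be compared directly against words.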

According to a blog by a Google developer, MUM is much more capable of identifying relationships between entities and hence works better for more complex questions and long-tail keywords. Marketers would be thrilled to make better use of long-tail keywords, search queries that are more specific than their "head" counterparts. It is for this reason specifically that long-tail keywords often have a higher conversion rate.

MUM uses the T5 text-to-text framework and is 1,000 times more powerful than BERT. Transfer learning's effectiveness comes from pre-training a model on abundantly available unlabelled text data with a self-supervised task, such as language modelling or filling in missing words. After that, the model can be fine-tuned on smaller labelled datasets, often resulting in (far) better performance than training on the labelled data alone. The recent success of transfer learning was ignited in 2018 by GPT, ULMFiT, ELMo, and BERT, and 2019 saw the development of a huge diversity of new methods like XLNet, RoBERTa, ALBERT, Reformer, and MT-DNN. The rate of progress in the field has made it difficult to evaluate which improvements are most meaningful and how effective they are when combined. An important ingredient for transfer learning is the unlabelled dataset used for pre-training: to accurately measure the effect of scaling up pre-training, one needs a dataset that is not only high quality and diverse, but also massive.
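To make the text-to-text framing concrete, here is a minimal sketch using the small, publicly released t5-small checkpoint from the Hugging Face transformers library (an open-source relative of the framework mentioned above, not MUM itself). Every task is phrased as text in, text out, with a task prefix telling the model what to do:

```python
# Minimal sketch of the text-to-text framing with the public t5-small checkpoint.
# T5 handles different tasks through task prefixes; everything is text in, text out.
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# The same model translates or summarises depending only on the textual prefix.
prompts = [
    "translate English to German: Where can I hike next autumn?",
    "summarize: MUM is trained across many languages and tasks at once, "
    "which helps it build a broader understanding of information.",
]

for prompt in prompts:
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids
    output_ids = model.generate(input_ids, max_new_tokens=40)
    print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

The same weights handle both prompts; only the textual prefix changes, which is the essence of the text-to-text framework.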

MUM not only understands language, but also generates it. It's trained across 75 different languages and many different tasks at once, allowing it to develop a more comprehensive understanding of information and world knowledge than previous models. And MUM is multimodal, so it understands information across text and images and, in the future, can expand to more modalities like video and audio.
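Training one model across many languages means a query in one language can be matched against information written in another. MUM is not open source, so as an illustrative stand-in the sketch below uses the open-source sentence-transformers library with a public multilingual checkpoint, showing a Spanish sentence about the same topic scoring far higher against an English query than an unrelated French one:

```python
# Minimal sketch of cross-lingual matching with an open multilingual embedding model.
# MUM itself is not public; this only illustrates the "cross-language" idea.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

query = "What should I know before hiking Mount Fuji in autumn?"
documents = [
    "El monte Fuji es la montaña más alta de Japón y una ruta de senderismo popular en otoño.",  # Spanish, on topic
    "Cette recette de gâteau nécessite de la farine, du sucre et des œufs.",  # French, unrelated
]

query_embedding = model.encode(query, convert_to_tensor=True)
doc_embeddings = model.encode(documents, convert_to_tensor=True)

# The Spanish hiking sentence should score much higher than the unrelated French one.
scores = util.cos_sim(query_embedding, doc_embeddings)
for doc, score in zip(documents, scores[0].tolist()):
    print(f"{score:.2f}  {doc}")
```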

Did you know that whenever Google announces a new model for search, the new technology must undergo a rigorous evaluation process to ensure the results are more meaningful? The company uses human raters, who follow its published Search Quality Rater Guidelines to evaluate the results. MUM, announced in May 2021, is undergoing the same testing.

It's exciting to see how search, shopping and streaming technologies are consolidating to offer customers a more streamlined experience. Imagine how Google's MUM could make shopping an easier task for the user: Google plans to introduce a new button that will let users instantly shop the image they are looking at on the screen.

If you liked reading this, you might like our other stories:

Get Your Arabic Website Rank on Google
Arabic SEO Guide: Drive Organic Traffic to Your Website

