Making AI Lighter, Not Heavier: Only One Network and Many Masks
Background
To understand lighter AI, imagine that you're packing for a vacation. You want to bring guidebooks for all the places you'll visit, but your suitcase cannot fit all of them. This is much like a challenge we face in the world of AI, particularly with language models. These models are like digital brains that help computers understand and generate human language. Teaching these AI brains to do new things usually means packing in more and more information, making them heavy and hard to handle. But what if we could bring just one guidebook that magically contains all the information we need? That's what we've done with ProPETL!
ProPETL
What is ProPETL?
ProPETL introduces a novel approach to teaching AI systems new skills without redundant copies of the model and heavy extra parameters. Traditionally, to teach an AI a new skill, we first copy its parameters and then train the copy on task-specific data. This process is akin to buying a separate guidebook for each new destination on a journey. ProPETL changes this approach by providing a single adaptable guidebook whose visible contents change depending on which pair of decryption glasses you wear.
How to build ProPETL?
This method uses “binary masks” as the decryption tool. With these glasses on, the AI model sees only the pages of the guidebook relevant to the current task. Consequently, this single guidebook can flexibly instruct the model on a wide range of tasks, from understanding Shakespearean texts to writing emails in Mandarin, all by simply changing its visible content with the “glasses”. ProPETL significantly improves the efficiency of large language model training, reducing parameter storage to merely 10% of what previous adapter methods required. This makes it possible to learn advanced skills such as coding, language translation, and content summarization using only a fraction of the parameters. As models grow larger and larger, bringing challenges for their practical application in everyday scenarios, ProPETL offers a promising step toward integrating large-scale AI models into our daily lives more easily.
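To make the “one guidebook, many glasses” idea concrete, here is a minimal NumPy sketch of sharing a single weight matrix across tasks via per-task binary masks. All names, shapes, and the random masks below are illustrative assumptions for this post, not the paper's actual architecture or training procedure (in practice the masks are learned, not random):

```python
import numpy as np

rng = np.random.default_rng(0)

# One shared "guidebook": a single adapter weight matrix, stored once.
d = 8
shared_adapter = rng.standard_normal((d, d))

# One binary mask per task -- the "decryption glasses".
# A 0/1 mask costs about 1 bit per weight instead of 32 bits,
# which is where the parameter-storage savings come from.
masks = {
    "translation": rng.random((d, d)) < 0.5,
    "summarization": rng.random((d, d)) < 0.5,
}

def task_adapter(task):
    """Reveal only the shared weights relevant to `task`."""
    return shared_adapter * masks[task]

# The same shared network behaves differently per task:
x = rng.standard_normal(d)
out_translation = task_adapter("translation") @ x
out_summarization = task_adapter("summarization") @ x
```

Swapping tasks here costs only one small binary mask, while the full-precision weights are shared, which is the intuition behind storing many skills in one network.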
Conclusion
In simple words, ProPETL is a great way to make language models both smart and lightweight. It's like giving our AI a magical, all-in-one guidebook for any skill we want it to learn. This is exciting not just for tech experts but for everyone, because it means we can have lighter AI helpers in our lives without them becoming too complex or expensive.
Check out our paper if you are interested!