Clip-Example

img is used to extract image features.
text is used to extract text features.

The code is written for efficiency and is intended to make full use of the GPU.
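
The scripts themselves are not shown here, so the following is only a minimal sketch of the batching pattern that feature extraction of this kind commonly relies on to keep a GPU busy: inputs are grouped into fixed-size batches, each batch is encoded in one call, and the resulting vectors are L2-normalized so that image and text features can be compared by dot product. The names `batched`, `l2_normalize`, `extract_features`, and `toy_encoder`, and the default `batch_size`, are hypothetical and not taken from this repository. `toy_encoder` is a stand-in; in practice it would be replaced by a CLIP image or text encoder running on the GPU.

```python
import math
from typing import Callable, Iterator, List, Sequence

Vector = List[float]


def batched(items: Sequence, batch_size: int) -> Iterator[Sequence]:
    """Yield consecutive slices of `items` holding at most `batch_size` elements."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def l2_normalize(vec: Vector) -> Vector:
    """Scale `vec` to unit length; an all-zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def extract_features(
    items: Sequence,
    encode_batch: Callable[[Sequence], List[Vector]],
    batch_size: int = 256,
) -> List[Vector]:
    """Encode `items` batch by batch and return one unit-length vector per item."""
    features: List[Vector] = []
    for batch in batched(items, batch_size):
        # One encoder call per batch: with a real model this is a single
        # forward pass, which is what keeps the accelerator occupied.
        features.extend(l2_normalize(v) for v in encode_batch(batch))
    return features


def toy_encoder(batch: Sequence[str]) -> List[Vector]:
    """Hypothetical stand-in for a CLIP encoder: maps each string to a 2-D vector."""
    return [[float(len(s)), float(sum(map(ord, s)) % 7 + 1)] for s in batch]


texts = ["a photo of a cat", "a photo of a dog", "a diagram"]
feats = extract_features(texts, toy_encoder, batch_size=2)
```

Passing the encoder in as a callable lets the same loop serve both the image side and the text side; only `encode_batch` and the input items change. The batch size is the main knob for GPU utilization and is usually raised until memory becomes the limit.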