Stars
[ECCV 2024] API: Attention Prompting on Image for Large Vision-Language Models
[CVPR 2024 Highlight] Mitigating Object Hallucinations in Large Vision-Language Models through Visual Contrastive Decoding
Unofficial re-implementation of "Trusting Your Evidence: Hallucinate Less with Context-aware Decoding"
[ICLR 2025] This is the official repository of our paper "MedTrinity-25M: A Large-scale Multimodal Dataset with Multigranular Annotations for Medicine"
Repository containing all relevant steps to create visuals and summary statistics for the research project "Prompt Injection Attacks on Large Language Models in Oncology" by Clusmann et al.
Medical o1, Towards medical complex reasoning with LLMs
GMAI-MMBench: A Comprehensive Multimodal Evaluation Benchmark Towards General Medical AI.
This repository implements teleoperation of the Unitree humanoid robot using XR Devices.
This repository compiles a list of papers related to the application of video technology in the field of robotics! Star⭐ the repo and follow me if you like what you see🤩.
The Large-scale Manipulation Platform for Scalable and Intelligent Embodied Systems
Code for the paper "SafeAgentBench: A Benchmark for Safe Task Planning of Embodied LLM Agents"
This is the official repository for the ICLR 2025 accepted paper Badrobot: Manipulating Embodied LLMs in the Physical World.
Official repo of Exploring the Adversarial Vulnerabilities of Vision-Language-Action Models in Robotics
Large Language-and-Vision Assistant for Biomedicine, built towards multimodal GPT-4 level capabilities.
Visualizing the attention of vision-language models
Official code for the paper "Probing the Decision Boundaries of In-context Learning in Large Language Models". https://arxiv.org/abs/2406.11233 [NeurIPS 2024]
[ICLR'25] MMed-RAG: Versatile Multimodal RAG System for Medical Vision Language Models
Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks [ICLR 2025]
Chatbot Arena meets multi-modality! Multi-Modality Arena allows you to benchmark vision-language models side-by-side while providing images as inputs. Supports MiniGPT-4, LLaMA-Adapter V2, LLaVA, B…
[NAACL'24] Self-data filtering of LLM instruction-tuning data using a novel perplexity-based difficulty score, without using any other models
Code for the arXiv paper "Double Descent Demystified: Identifying, Interpreting & Ablating the Sources of a Deep Learning Puzzle"
Official PyTorch Implementation for Meaning Representations from Trajectories in Autoregressive Models (ICLR 2024)