Our GitHub page: https://github.com/Q-Future

Please use the Hugging Face (HF) versions of the Q-Future benchmark datasets.

from datasets import load_dataset

ds = load_dataset("q-future/Q-Bench-HF") # or A-Bench-HF, Q-Bench2-HF
ds["dev"][0] # Containing images (in PIL.ImageFile), questions, and answers
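If you switch between the three benchmarks often, a small wrapper can keep the repository ids in one place. The following is only a sketch: `BENCHMARK_REPOS`, `repo_id`, and `load_benchmark` are hypothetical names, not part of any Q-Future package. The repository ids are inferred from the comment in the snippet above, and only the `dev` split of Q-Bench-HF is confirmed there, so the default split is an assumption for the other two.

```python
# Repository ids inferred from the snippet above (assumption for A-Bench and Q-Bench2).
BENCHMARK_REPOS = {
    "Q-Bench": "q-future/Q-Bench-HF",
    "A-Bench": "q-future/A-Bench-HF",
    "Q-Bench2": "q-future/Q-Bench2-HF",
}


def repo_id(benchmark: str) -> str:
    """Return the Hub repository id for a benchmark name (hypothetical helper)."""
    try:
        return BENCHMARK_REPOS[benchmark]
    except KeyError:
        known = ", ".join(sorted(BENCHMARK_REPOS))
        raise ValueError(
            f"Unknown benchmark {benchmark!r}; expected one of: {known}"
        ) from None


def load_benchmark(benchmark: str, split: str = "dev"):
    """Download one split of a benchmark; needs `datasets` and network access."""
    # Imported lazily so the helpers above can be used without `datasets` installed.
    from datasets import load_dataset

    # Assumption: the requested split exists in the chosen repository.
    return load_dataset(repo_id(benchmark), split=split)
```

With this in place, `load_benchmark("Q-Bench")[0]` should return the same record as `ds["dev"][0]` above. Check the available splits on each dataset page before relying on the default.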

Our Spaces

Many thanks for the research GPU grants!

Q-Align (Most Powerful Visual Scorer):
Q-Instruct (Low-level Vision-Language Assistant/Chatbot, supports 1-4 images):
Q-Bench (Benchmark for General Purpose MLLMs):

Our Mainstream Models

q-future/one-align: AutoModel for Visual Scoring. Trained on a mixture of existing datasets; see GitHub for details.
q-future/co-instruct: AutoModel for Low-level Visual Dialog (Description, Comparison, Question Answering). Trained on the scaled Co-Instruct-562K dataset (which will also be released soon!).
q-future/q-instruct-mplug-owl2-1031: Older version of Q-Instruct, as reported in the paper. Trained on the released Q-Instruct-200K dataset.

Although we have released other model variants so that the community can replicate our results, please use the models listed above, as they have proven to be more stable.