DataGen 4
- Guiding Cross-Modal Representations with MLLM Priors via Preference Alignment
- Negative Matters: Multi-Granularity Hard-Negative Synthesis and Anchor-Token-Aware Pooling for Enhanced Text Embeddings
- Fixing Data That Hurts Performance: Cascading LLMs to Relabel Hard Negatives for Robust Information Retrieval
- SyNeg: LLM-Driven Synthetic Hard-Negatives for Dense Retrieval