Curiosity-driven Red-teaming for Large Language Models. Zhang-wei Hong, Idan Shenfeld, et al. (2024). ICLR 2024. Conference paper.
Beyond Uniform Sampling: Offline Reinforcement Learning with Imbalanced Datasets. Zhang-wei Hong, Aviral Kumar, et al. (2023). NeurIPS 2023. Conference paper.