Release v1.6.4

This release updates the benchmark data and refines the code for MMAR, a benchmark for evaluating deep reasoning over speech, audio, music, and their combinations. It also includes improved performance metrics and additional examples to make the benchmark easier to understand and apply. Feedback is welcome and will help us improve the toolkit further.