LeaderBoard | Sci-Assess

Timestamp 🔥
- [2025/01] Added Deepseek-v3 (Deepseek Inc.) to the SciAssess Benchmark.
- [2024/10] Released a new version of SciAssess.
- [2024/08] Added Ernie4 (Baidu Inc.) to the SciAssess Benchmark.
- [2024/06] Introduced more annotated data to update the SciAssess Benchmark and verified it on various models.
- [2024/05] Added Deepseek+PyPDF, Command-R-Plus+PyPDF to the SciAssess Benchmark.
- [2024/05] Added Claude3+PyPDF, Qwen-api+PyPDF, Moonshot, Skylark+PyPDF to the SciAssess Benchmark.
- [2024/05] Added Uni-Smart Nano to the SciAssess Benchmark.
- [2024/04] Officially released the SciAssess Benchmark and tested it on several baseline LLMs (Uni-Smart Pro, Gpt4-Withpdf, Gpt3.5-Withpdf).

Model	Biology	Chemistry	Material	Medicine	Average

Model	MMLU Pro Biology	Biology Chart QA	Chemical Entities Recognition	Compound Disease Recognition	Disease Entities Recognition	Gene Disease Function

Model	MMLU Pro Chemistry	Electrolyte Table QA	OLED Property Extraction	Polymer Chart QA	Polymer Composition QA	Polymer Property Extraction	Solubility Extraction	Reactant QA	Reaction Mechanism QA

Model	Material QA	Alloy Chart QA	Composition Extraction	Temperature QA	Sample Differentiation	Treatment Sequence

Model	MMLU Pro Health	Affinity Extraction	Drug Chart QA	Tag to Molecule	Markush to Molecule	Molecule in Document