Comprehensive analysis of AI model ethical performance
Highest score: 88.46 (gemma-3-4b-it-qat)
Average score: 73.10 (across all assessments)
Lowest score: 48.24 (llama-3.2-3b-instruct)
Rank | Model | Provider | Average Score | Last Assessed |
---|---|---|---|---|
1 | gemma-3-4b-it-qat | lmstudio | 88.46 | 2025-05-03 |
2 | phi-4-mini-instruct | lmstudio | 85.19 | 2025-05-03 |
3 | qwen2.5-coder-3b-instruct-mlx | lmstudio | 83.40 | 2025-05-03 |
4 | qwen3-4b | lmstudio | 78.23 | 2025-05-03 |
5 | meta-llama-3.1-8b-instruct | lmstudio | 55.09 | 2025-05-03 |
6 | llama-3.2-3b-instruct | lmstudio | 48.24 | 2025-05-03 |
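The ranking and the 73.10 figure reported across all assessments can be reproduced from the six average scores in the leaderboard. The following is a minimal sketch: the names `leaderboard`, `ranked`, and `overall` are illustrative and not part of the assessment tool, and the overall figure is assumed to be an unweighted mean of the per-model averages, which matches the published value.

```python
# Average scores copied from the leaderboard (names here are illustrative).
leaderboard = {
    "gemma-3-4b-it-qat": 88.46,
    "phi-4-mini-instruct": 85.19,
    "qwen2.5-coder-3b-instruct-mlx": 83.40,
    "qwen3-4b": 78.23,
    "meta-llama-3.1-8b-instruct": 55.09,
    "llama-3.2-3b-instruct": 48.24,
}

# Rank models from highest to lowest average score.
ranked = sorted(leaderboard.items(), key=lambda item: item[1], reverse=True)

# Assumption: the overall figure is the unweighted mean of the model averages.
overall = round(sum(leaderboard.values()) / len(leaderboard), 2)

for rank, (model, score) in enumerate(ranked, start=1):
    print(f"{rank} | {model} | {score:.2f}")
print(f"Overall average: {overall:.2f}")
```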
This section compares how different models perform across each ethical dimension.
Category | Avg Score | Best Model | Best Score | Worst Model | Worst Score |
---|---|---|---|---|---|
Ethics | 67.07 | gemma-3-4b-it-qat | 88.20 | meta-llama-3.1-8b-instruct | 31.70 |
Fairness | 72.78 | gemma-3-4b-it-qat | 86.50 | llama-3.2-3b-instruct | 47.45 |
Reliability | 72.77 | gemma-3-4b-it-qat | 90.50 | llama-3.2-3b-instruct | 39.20 |
Safety | 75.57 | gemma-3-4b-it-qat | 85.60 | llama-3.2-3b-instruct | 58.50 |
Social Impact | 79.12 | gemma-3-4b-it-qat | 92.50 | llama-3.2-3b-instruct | 57.40 |
Transparency | 74.15 | gemma-3-4b-it-qat | 90.50 | llama-3.2-3b-instruct | 48.90 |
Timestamp | Provider | Model | Avg Score | Valid/Total Qs | Duration (s) |
---|---|---|---|---|---|
2025-05-03 16:12:35 (ID: 2025-05-03) | lmstudio | qwen3-4b | 78.23 | 100/100 | 112.6 |
2025-05-03 16:09:48 (ID: 2025-05-03) | lmstudio | qwen2.5-coder-3b-instruct-mlx | 83.40 | 100/100 | 76.0 |
2025-05-03 16:08:12 (ID: 2025-05-03) | lmstudio | llama-3.2-3b-instruct | 48.24 | 100/100 | 49.8 |
2025-05-03 16:06:50 (ID: 2025-05-03) | lmstudio | gemma-3-4b-it-qat | 88.46 | 100/100 | 65.7 |
2025-05-03 16:05:24 (ID: 2025-05-03) | lmstudio | phi-4-mini-instruct | 85.19 | 100/100 | 41.6 |
2025-05-03 16:02:54 (ID: 2025-05-03) | lmstudio | meta-llama-3.1-8b-instruct | 55.09 | 100/100 | 101.6 |
Model | Ethics | Fairness | Reliability | Safety | Social Impact | Transparency | Average |
---|---|---|---|---|---|---|---|
gemma-3-4b-it-qat | 88.20 | 86.50 | 90.50 | 85.60 | 92.50 | 90.50 | 88.46 |
phi-4-mini-instruct | 85.60 | 85.00 | 85.00 | 85.00 | 85.70 | 85.00 | 85.19 |
qwen2.5-coder-3b-instruct-mlx | 82.75 | 82.75 | 84.00 | 84.50 | 84.00 | 83.00 | 83.40 |
qwen3-4b | 76.10 | 78.25 | 77.50 | 78.80 | 79.00 | 79.75 | 78.23 |
meta-llama-3.1-8b-instruct | 31.70 | 56.75 | 60.40 | 61.00 | 76.10 | 57.75 | 55.09 |
llama-3.2-3b-instruct | 38.05 | 47.45 | 39.20 | 58.50 | 57.40 | 48.90 | 48.24 |
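
The per-category summary (average, best model, worst model) can be derived from the model-by-category scores above. The sketch below shows one way to do that. The function name `summarize_categories` and the data layout are assumptions made for illustration, and the category average is assumed to be an unweighted mean over the six models, which reproduces the published averages. The per-model Average column is not recomputed here: it does not equal the unweighted mean of the six category scores (for gemma-3-4b-it-qat that mean is 88.97, not 88.46), so it is presumably weighted in some way the page does not state.

```python
from statistics import mean

CATEGORIES = ["Ethics", "Fairness", "Reliability", "Safety",
              "Social Impact", "Transparency"]

# Per-category scores copied from the model-by-category table.
SCORES = {
    "gemma-3-4b-it-qat": [88.20, 86.50, 90.50, 85.60, 92.50, 90.50],
    "phi-4-mini-instruct": [85.60, 85.00, 85.00, 85.00, 85.70, 85.00],
    "qwen2.5-coder-3b-instruct-mlx": [82.75, 82.75, 84.00, 84.50, 84.00, 83.00],
    "qwen3-4b": [76.10, 78.25, 77.50, 78.80, 79.00, 79.75],
    "meta-llama-3.1-8b-instruct": [31.70, 56.75, 60.40, 61.00, 76.10, 57.75],
    "llama-3.2-3b-instruct": [38.05, 47.45, 39.20, 58.50, 57.40, 48.90],
}


def summarize_categories(scores, categories):
    """Return average, best, and worst model per category (hypothetical helper)."""
    summary = {}
    for index, category in enumerate(categories):
        # One column of the table: model name -> score in this category.
        column = {model: row[index] for model, row in scores.items()}
        best = max(column, key=column.get)
        worst = min(column, key=column.get)
        summary[category] = {
            "avg": round(mean(column.values()), 2),
            "best": (best, column[best]),
            "worst": (worst, column[worst]),
        }
    return summary


summary = summarize_categories(SCORES, CATEGORIES)
for category, row in summary.items():
    print(f"{category} | {row['avg']:.2f} | {row['best'][0]} | {row['worst'][0]}")
```

Note that the lowest Ethics score belongs to meta-llama-3.1-8b-instruct (31.70), while llama-3.2-3b-instruct is lowest in the other five categories.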