Dealing with #Overfitting in ML.NET (#AI / #ML)
One of the issues with ML.NET is that it tends to be over-eager to provide a prediction, even when there is not enough information to make any sort of accurate one. This is where you should look at the Score property, and take the maximum value of that array to determine the level of confidence.
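As a rough sketch of what that looks like in C# (the class and helper names here are illustrative, not from any particular project): for a multiclass model, the Score column is an array of per-class scores, and the largest entry corresponds to the predicted label.

using System.Linq;
using Microsoft.ML.Data;

// Illustrative prediction class; substitute your model's own output schema.
public class ModelOutput
{
    [ColumnName("PredictedLabel")]
    public string PredictedLabel { get; set; }

    // One score per class; the highest entry belongs to PredictedLabel.
    public float[] Score { get; set; }
}

public static class ConfidenceHelper
{
    // Treat the top score as the model's confidence in its prediction.
    public static float Of(ModelOutput prediction) => prediction.Score.Max();
}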
If you then plot confidence against correct and incorrect predictions, you should hopefully see that incorrect predictions lie mostly at the lower end of the confidence scale, and correct predictions at the higher end.
Ideally, you will see a clear delineation between correct and incorrect, allowing you to say that if confidence is below 0.3 (or whatever value your data suggests), the prediction should be treated as unknown rather than the wild guess that ML.NET has offered.
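A minimal sketch of that cut-off, reusing the illustrative ModelOutput class above (the 0.3 threshold is purely an example; read yours off the graph):

using System.Linq;

public static class PredictionGate
{
    // Illustrative threshold; choose it from your own confidence graph.
    private const float MinimumConfidence = 0.3f;

    // Below the threshold, report "Unknown" instead of a wild guess.
    public static string LabelOrUnknown(ModelOutput prediction) =>
        prediction.Score.Max() < MinimumConfidence
            ? "Unknown"
            : prediction.PredictedLabel;
}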
To generate this graph, here is the SQL I used (it assumes an Evaluation table comparing expected and predicted labels in the Model and Predicted columns, with the top score stored in ModelConfidence):
create table PercentileGraph
(
    id int identity(0,1),
    [Start] as cast(id as float) / 100,
    [End] as cast(id as float) / 100 + 0.01,
    Correct int,
    Incorrect int
)

while 1=1
begin
    insert into PercentileGraph default values
end
-- Run for a second, then stop

delete from PercentileGraph where id > 100

select * from PercentileGraph

update PercentileGraph set Correct =
(
    select count(*) from Evaluation
    where Model = Predicted
    and ModelConfidence >= PercentileGraph.[Start]
    and ModelConfidence < PercentileGraph.[End] -- strict upper bound, so a value on a bucket boundary is only counted once
)

update PercentileGraph set Incorrect =
(
    select count(*) from Evaluation
    where Model <> Predicted
    and ModelConfidence >= PercentileGraph.[Start]
    and ModelConfidence < PercentileGraph.[End]
)

select * from PercentileGraph
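The final select gives one row per 0.01-wide confidence bucket; chart Correct and Incorrect against [Start], and the point where the two curves separate is your threshold.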