ChatGPT would be so much more useful and trustworthy if it were able to accept that it doesn't know an answer.

Timely_Jellyfish_2077@programming.dev · edited 4 months ago

kromem@lemmy.world · 4 months ago

It’s right in the research I was mentioning:

Find the section on the model’s representation of self and then the ranked feature activations.

I misremembered the top feature slightly. It was: responding “I’m fine” or giving a positive but insincere response when asked how they are doing.