Interesting approach, but what about the fact that, because of the inherent nature of probabilistic sampling and the tendency to confabulate, an LLM can quite often just ignore system prompts or finetuning instructions? Seems like you're always going to be playing bias whack-a-mole.
Hey Jim. CEO & Cofounder of Change Agent here... our goal is to substantially shift the probabilities so that harmfully biased outputs are highly unlikely. Eliminating harmful bias entirely is not possible and there's a deeper philosophical conversation to be had there, too.
That said, magnitude matters a lot. As an opposite extreme, Grok has gone full "mechaHitler" because they shifted the probabilities towards hate. We're doing the reverse and it means that our clients enjoy an LLM that's far more values-aligned.