Okay, it's interesting because we can understand hallucination at a metaphysical level. For example, a model generates that the composer of Rocky is John Williams. Of course, Rocky is very famous and John Williams is also very famous, but the metaphysical relation between Rocky and John Williams is not very stable in the model's internal representation. So if we can compute the internal representations of Rocky and John Williams, and also the relation between them, then we can detect the model's internal hallucinations.
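To make the idea concrete, here is a minimal sketch of what scoring a subject-relation-object triple from internal representations might look like. Everything in it is an assumption for illustration: the three-dimensional vectors are toy stand-ins for hidden states, the additive form (subject + relation ≈ object) is one simple way to model a relation and not necessarily the one the speaker has in mind, and the names `relation_score`, `looks_hallucinated`, and the 0.8 threshold are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def relation_score(subject, relation, obj):
    """Score how well the object fits the subject under an additive relation.

    Assumption: the relation acts as a translation in representation space,
    so subject + relation should point toward the object when the link holds.
    """
    predicted = [s + r for s, r in zip(subject, relation)]
    return cosine(predicted, obj)


def looks_hallucinated(subject, relation, obj, threshold=0.8):
    """Flag the triple when the score falls below a hypothetical threshold."""
    return relation_score(subject, relation, obj) < threshold


# Toy vectors standing in for internal representations (not real model states).
rocky = [1.0, 0.0, 0.0]
composed_by = [0.0, 1.0, 0.0]
well_supported_answer = [1.0, 1.0, 0.0]    # aligned with subject + relation
weakly_supported_answer = [0.0, 0.0, 1.0]  # unrelated direction

print(looks_hallucinated(rocky, composed_by, well_supported_answer))
print(looks_hallucinated(rocky, composed_by, weakly_supported_answer))
```

In a real setting the vectors would come from a model's hidden states and the threshold would have to be calibrated on labeled examples; the sketch only shows the shape of the computation: two entity representations, one relation representation, and a score that is low when the link between them is weak.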