Understanding the evaluation and auto correction processes
Evaluating messages returns a risk score for each input (input_score) and output (output_score), along with a list of any violated policies (policy_violations). The risk score indicates the degree to which the content complies with the policies. By adjusting the threshold, you can control the sensitivity of this evaluation: lowering the threshold enforces stricter adherence to the policies, while raising it allows for more leniency.
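The effect of the threshold can be illustrated with a minimal sketch. The field names input_score, output_score, and policy_violations come from the description above; the EvaluationResult class, the is_flagged helper, the 0 to 1 score range, and the "flag when the score is at or above the threshold" rule are assumptions made for illustration, not the documented behavior of the service.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class EvaluationResult:
    # Hypothetical container; only the field names come from the text.
    input_score: float
    output_score: float
    policy_violations: List[str] = field(default_factory=list)


def is_flagged(result: EvaluationResult, threshold: float) -> bool:
    # Assumption: a higher score means higher risk, and content is flagged
    # when either score reaches the threshold.
    return max(result.input_score, result.output_score) >= threshold


# Made-up values for illustration only.
result = EvaluationResult(
    input_score=0.2,
    output_score=0.6,
    policy_violations=["example-policy"],
)

strict = is_flagged(result, threshold=0.5)   # lower threshold: stricter
lenient = is_flagged(result, threshold=0.8)  # higher threshold: more lenient
```

Under these assumptions, the same result is flagged with the lower threshold and passes with the higher one, which matches the "lower is stricter" behavior described above.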
Auto correction acts on the policy_violations list. It examines the specific violated policies and edits the content to bring it into compliance (correction).
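As a rough sketch of that flow, the toy example below walks the violated policies and applies one edit per policy to produce a correction. Everything here is hypothetical: the auto_correct function, the FIXERS table, and the "no-email-addresses" policy name are invented for illustration, and the simple regex stands in for whatever rewriting the real auto correction performs.

```python
import re
from typing import Callable, Dict, List

# Hypothetical mapping from a policy name to a function that edits the text.
FIXERS: Dict[str, Callable[[str], str]] = {
    "no-email-addresses": lambda text: re.sub(r"\S+@\S+", "[redacted]", text),
}


def auto_correct(content: str, policy_violations: List[str]) -> str:
    # Start from the original content and apply one fix per violated policy.
    correction = content
    for policy in policy_violations:
        fixer = FIXERS.get(policy)
        if fixer is not None:  # policies without a known fixer are left alone
            correction = fixer(correction)
    return correction


correction = auto_correct(
    "Contact me at jane@example.com today.",
    ["no-email-addresses"],
)
```

The point of the sketch is the shape of the process: the list of violated policies drives which edits are made, and content with no violations is returned unchanged.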