Top-p (nucleus) sampling is a sampling parameter separate from temperature. Before the model emits each output token, it computes a probability for every candidate token. In top-p sampling mode, the candidate list is dynamic: tokens are ranked by probability, and the smallest set whose cumulative probability reaches the threshold p is kept. The next token is then sampled from that set. Top-p therefore introduces randomness into token selection, giving other high-scoring tokens a chance of being selected instead of always choosing the highest-scoring one.
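The selection step can be illustrated with a minimal sketch in Python. It assumes the model's output is already available as a dictionary of token probabilities. The function names `top_p_filter` and `sample_top_p` and the example vocabulary are hypothetical, chosen only for illustration, and do not come from any particular library.

```python
import random


def top_p_filter(probs, p):
    """Keep the smallest set of top-ranked tokens whose cumulative
    probability reaches p, then renormalize so the kept set sums to 1.

    `probs` is a hypothetical {token: probability} mapping.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in the range (0, 1]")

    # Rank tokens from most to least probable.
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)

    nucleus = []
    cumulative = 0.0
    for token, prob in ranked:
        nucleus.append((token, prob))
        cumulative += prob
        if cumulative >= p:
            break  # the candidate list is complete

    total = sum(prob for _, prob in nucleus)
    return {token: prob / total for token, prob in nucleus}


def sample_top_p(probs, p, rng=random):
    """Draw one token at random from the top-p candidate list."""
    nucleus = top_p_filter(probs, p)
    tokens = list(nucleus)
    weights = list(nucleus.values())
    return rng.choices(tokens, weights=weights, k=1)[0]


# Illustrative probabilities for the next token (made-up values).
next_token_probs = {"the": 0.5, "a": 0.3, "cat": 0.15, "dog": 0.05}
candidates = top_p_filter(next_token_probs, p=0.9)
chosen = sample_top_p(next_token_probs, p=0.9)
```

With these made-up values and p = 0.9, the candidate list is "the", "a", and "cat" (cumulative probability 0.95), so "dog" can never be chosen. Lowering p to 0.4 shrinks the list to "the" alone, which makes the choice deterministic.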
Note:
Top-p has an effect similar to temperature, the randomness parameter: both control how random the output is. In general, changing both at the same time is not recommended.