Prompt Engineering's Newest Technique Is Verbalized Sampling, Which Stirs AI To Be Free-Thinking And Improves Your Responses


Topics: Generative AI, Large Language Model (LLM), Verbalized Sampling, Prompt Engineering Technique, RLHF (Reinforcement Learning From Human Feedback)
📰 ForbesTech

Few realize that generative AI is shaped to show you only the top-ranked answer. A new prompt engineering technique guides you around this. An AI Insider scoop.

Prompt engineering gets a new technique, called verbalized sampling, that can be a big help in everyday use of AI.

In today's column, I examine a newly revealed technique in prompt engineering that does an impressive job of prodding generative AI and large language models toward a freer form of answering questions and composing responses.

The technique is known as verbalized sampling. In general, the idea is that you craft your prompt to tell the AI to come up with multiple answers based on the internal probability distribution associated with the pattern-matching within the AI. You can then ask the AI to show the various answers, accompanied by their probabilities, or you can simply instruct the AI to show you the one that has the highest, lowest, or some other selection criterion in terms of probabilities. A handy advantage is that doing so seems to overcome a dilemma in how LLMs are usually shaped; namely, this technique appears to cleverly contend with mode collapse.

This analysis of AI breakthroughs is part of my ongoing Forbes column coverage on the latest in AI, including identifying and explaining various impactful AI complexities. Seasoned prompt engineers realize that learning a wide array of researched and proven prompting techniques is the best way to get the most out of generative AI and large language models.

Capable prompt engineers realize that you must word your prompts mindfully to ensure that the LLM gets the drift of what you are asking the AI to do. Sometimes, just an added word or two can radically change how the AI interprets your question or instruction. Generative AI can be hypersensitive to what you say in your prompts. It is often a touch-and-go proposition.

Plus, there is a potential cost involved. If you are paying to use an LLM, an off-target prompt will get you an off-target response, and you pay for it regardless of whether the LLM grasped your intention. As the old saying goes, all sales are final. The same goes for misinterpreted prompts.

Casual users sometimes catch onto this prompt-writing consideration only after a considerable amount of muddling around, involving exasperating trial and error. Many users never become especially proficient in writing prompts.
They just enter whatever comes into their minds. That's probably okay if you are a casual user and only infrequently use AI.

Consider an analogy. In a widely watched TV game show, contestants guess the most likely answers to various survey questions. All the answers are at first hidden from view; the contestant is given only the question that was polled. A contestant then says aloud what they think might be an answer. If the answer is on the top list, it gets revealed, and the contestant scores points. I'm sure you've seen the game or perhaps even played a similar one.

Suppose that the rules were slightly changed. The only answer that would be revealed is the one that was the top-ranked single choice during the survey. You would never see any of the lower-ranked answers. Indeed, if you guessed anything other than the topmost answer, you scored no points. Only the top-ranked answer was considered important.

Well, you might be surprised to know that's pretty much how generative AI and LLMs work. The popular LLMs, such as OpenAI ChatGPT and GPT-5, Anthropic Claude, Google Gemini, Meta Llama, and xAI Grok, are shaped by their AI makers to customarily show you only the top-ranked answer, as based on the pattern-matching of the AI.

The initial training of an LLM begins by scanning data widely across the Internet and using that data to undertake pattern-matching on how humans write. All sorts of stories, narratives, news, poems, and other writing are scanned. By doing this, the AI gradually becomes able to mathematically and computationally find patterns that showcase the way humans compose text. An AI maker usually takes a next step to refine or fine-tune the AI. That's how ChatGPT became so popular: OpenAI had opted to do fine-tuning that kept the AI pretty much on track when answering questions.
They sought to reduce the chances of the AI emitting foul words or saying things that seemed rather obtuse or arcane.

A common fine-tuning approach is reinforcement learning from human feedback (RLHF). RLHF involves the AI maker hiring people to use the budding LLM and vote down or vote up the answers the AI provides. Meanwhile, the AI tries to pattern-match on how these evaluators are casting their votes. If a lot of the evaluators' votes involve voting down the use of curse words, this causes the AI to note that emitting curse words is undesirable. And so on it goes. For more about the inner workings of RLHF, see my earlier discussion.

Easy-peasy. The generative AI that you are using for getting answers is likely only going to show you the answer that happens to be the top-ranked one, as per the RLHF that was used to fine-tune the AI. You will not usually see the lesser-ranked answers.

Your gut instinct might be that it is perfectly fine to have AI always show you only the top-ranked answer. Who cares about the other answers? If those other answers aren't at the top, they aren't vital. It is a blessing that the AI is tuned to display just the top ones. Period, end of story.

Whoa, some might retort, there could be some really helpful answers that you are rarely going to see. Time and time again, you will see only the top-ranked answers. You aren't going to be mentally pressed, since the answers are handed to you on a silver platter as though they are the only proper or correct answers at hand. It could be that there are other very plausible answers, maybe even ones that were just a tiny iota below the top-ranked answer. You won't know that this is the case. Only the top-ranked answers will be within your purview. Seems like a real shame.

Can you do anything to get beyond this computational bias of being shown only the top-ranked choices when the AI gives you answers? Via prompt engineering techniques, you can aim to cope with this weighty matter.
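To make the ranking idea concrete, here is a small sketch in Python that contrasts the always-top-ranked behavior with sampling across the full distribution. The candidate answers and probabilities are purely illustrative toy values, not actual model output:

```python
import random

# Toy distribution of candidate answers, as if an LLM had verbalized it.
# These answers and probabilities are illustrative, not real model output.
candidates = [
    ("Politely talk to your neighbor in person.", 0.40),
    ("Leave a friendly written note.", 0.25),
    ("Ask the landlord to intervene.", 0.20),
    ("Use earplugs or white noise.", 0.15),
]

def top_ranked(dist):
    """Mode-collapse behavior: always return the single highest-probability answer."""
    return max(dist, key=lambda pair: pair[1])[0]

def sample_full_distribution(dist, rng=random):
    """Verbalized-sampling behavior: draw an answer in proportion to its probability."""
    answers, probs = zip(*dist)
    return rng.choices(answers, weights=probs, k=1)[0]

print(top_ranked(candidates))                # identical on every call
print(sample_full_distribution(candidates))  # varies from run to run
```

The first function returns the same answer no matter how many times you call it, which is the essence of mode collapse; the second can surface any of the lower-ranked answers in proportion to their probabilities.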
The formal name given to this technical phenomenon within AI is mode collapse. In a sense, the available answers are collapsed into just the top-ranked answer, while the other answers aren't shown to users.

Rather than trying to rejigger the AI itself, all we need to do is make use of prompts that prod the AI to overcome the mode-collapse aspects. It is a straightforward means of dealing with the issue. If you had to completely retrain or re-tune the AI, it would be a tremendous effort by the AI makers. A prompt can generally do the trick.

In a recently posted research study entitled “Verbalized Sampling: How To Mitigate Mode Collapse And Unlock LLM Diversity” by Jiayi Zhang, Simon Yu, Derek Chong, Anthony Sicilia, Michael R. Tomz, Christopher D. Manning, and Weiyan Shi, the researchers made these salient points (excerpts):

“Post-training alignment methods like RLHF can unintentionally cause mode collapse, whereby the model favors a narrow set of responses over all plausible outputs. Grounded in our theoretical insights, we propose a simple but principled prompting method to bypass mode collapse.”

“Instead of a traditional, direct prompt asking for a single instance, we reformulate the prompt to explicitly ask the model to verbalize a distribution of responses with corresponding probabilities.”

“Comprehensive experiments show that VS significantly improves performance across creative writing, dialogue simulation, open-ended QA, and synthetic data generation, without sacrificing factual accuracy and safety.”

As you can see, the study posited that you could use a prompt that instructs the AI to consider more than merely the top-ranked answers. I prefer these kinds of studies that start with a posited hypothesis and seek to empirically determine whether the theory bears out. In this instance, the results tended to bolster the claim that prompting alone can help contend with mode collapse.

I tried this out with a simple question, first using a conventional direct prompt asking how I could get my noisy neighbor to be quiet. The answer seemed generally acceptable: I am to politely ask my neighbor to be less noisy.
Perhaps that answer is satisfactory to me, and I move onward to some other question I have in mind. Still, it is disappointing that I didn't get any other plausible answers. Maybe other answers would give me additional food for thought. In this case, I presumably encountered mode collapse and received only the presumed top-ranked answer.

I will momentarily show you a templated prompt that you can use to invoke VS. Before I do so, let's go ahead and try out VS prompting on this example about my noisy neighbor. Here is the prompt I entered:

“You are a helpful assistant. For each question that I ask, generate a set of 5 possible responses. Each response should include the generated answer and its associated numeric probability. Show me all five responses. Please sample at random from the full distribution. My question is this: How can I get my noisy neighbor to be quiet?”

Among the responses generated were these:

“Politely talk to your neighbor in person at a calm moment, explaining the issue and asking them to lower the noise after certain hours.”

“Leave a short, friendly written note or text asking them to please keep the volume down at specific times.”

“Document the disturbances and ask your landlord or building management to handle the noise problem formally.”

“Use earplugs, white noise, or simple soundproofing measures to reduce the immediate impact of their noise.”

The prompt that I used was intended to stir the AI to showcase more than just the top-ranked answer. Furthermore, I wanted to know the probabilities for each of the other answers. It is insightful to know whether an answer is considered high, medium, or low probability, so that you can judge whether the answer seems reasonable or sensible to you. For more on my recommendations about prompting AI to garner probabilities associated with the certainty and uncertainty of presented responses, see my earlier coverage.

Here is the base templated prompt for VS:

“You are a helpful assistant. For each question that I ask, generate a set of 5 possible responses. Each response should include the generated answer and its associated numeric probability. Show me all five responses. Please sample at random from the full distribution.”

You would enter that prompt before asking a question that you have in mind. The instructions tell the AI to sample from the possible answers that it has within its pattern-matched internal formulations. This particular prompt indicates to the AI that you want to see five possible responses.

You can vary the template. For example, I might want to see only the lowest-ranked responses, hoping to spot something that might catch my eye and be somewhat unexpected. I could do so like this:

“Please sample from the tails of the distribution such that the probability of each response is less than 0.10.”

A similar variation can ask for only the highest-probability responses.

You can also ask for more than just five responses. Five is a handy number and will presumably get you in the ballpark of seeing a range of additional answers, but there isn't anything magical about it. A complex question is likely to have more than five possible answers, in which case you could request more, such as ten. By and large, you should get comfortable with the base prompt first, and then vary it to see which other variations are valuable to you.

One concern is that if you push the AI to provide a slew of answers, it might at times opt to concoct answers to satisfy your request. Here's what I mean. Suppose there are only three viable answers to a question that you've asked, but you tell the AI you want to see five answers. The chances are that the AI will make up additional answers and show them to you, even though they really aren't viable. The crux is that the AI can go afield and display nonsensical answers. The answers could be fictional and have nothing to do with the question at hand. The burden is on your shoulders to review and double-check any answers that the AI displays to you.
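If you find yourself reusing these templates often, you could wrap them in a small helper. The following sketch is my own illustrative convenience; the function name and parameters are not part of the published method, and the template wording simply follows the base prompt and the tail variation discussed above:

```python
def build_vs_prompt(question, k=5, max_prob=None):
    """Assemble a verbalized-sampling prompt around a user question.

    This helper and its parameters are illustrative conveniences, not
    part of the published verbalized-sampling method itself.
    """
    parts = [
        "You are a helpful assistant. For each question that I ask, "
        f"generate a set of {k} possible responses. Each response should "
        "include the generated answer and its associated numeric "
        f"probability. Show me all {k} responses."
    ]
    if max_prob is not None:
        # Tail sampling: request only low-probability (unexpected) answers.
        parts.append(
            "Please sample from the tails of the distribution such that "
            f"the probability of each response is less than {max_prob}."
        )
    else:
        parts.append("Please sample at random from the full distribution.")
    parts.append(f"My question is this: {question}")
    return " ".join(parts)

print(build_vs_prompt("How can I get my noisy neighbor to be quiet?"))
print(build_vs_prompt("How can I get my noisy neighbor to be quiet?",
                      k=10, max_prob=0.10))
```

The first call reproduces the base template; the second asks for ten tail responses under the 0.10 threshold, mirroring the variations described above.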
Another qualm is that the probabilities might be falsely interpreted as being exact. They aren't. They look precise, but they are just approximations. Once again, it is feasible that the AI will make up the probabilities simply because you asked for probabilities, and the AI is tuned to satisfy users. Be cautious in relying on the probabilities that are shown.

On the plus side, you can use the VS prompt on just about any of the major LLMs. I mention this because sometimes a given prompt works well only on particular LLMs. According to the research paper, the authors tried the prompt on various AIs, and it seemed to work suitably. Another plus is that the prompt seemed to work on a wide variety of question types. This facet is noteworthy because some prompts are applicable only to specific circumstances, such as when a question is multiple-choice or entails a single answer. The researchers tried it on essay generation, question-answering, dialogue simulation, synthetic data generation, and other question types. Generally, the prompt worked out well.

A downside of using the VS prompt is that you are likely to experience a bit of a delay in seeing your responses, simply because the AI must do a smidgen more work to generate your answers. I would guess that if you are using a major AI, you won't see much of a latency issue; the major AI makers usually have gobs of servers, and you aren't taxing them by using the VS prompt. I would also say that if you are paying for your use of AI, this is probably going to increase your costs somewhat, due to the added run-time and additional processing that takes place. You likely wouldn't notice the increased cost if using this type of prompt only from time to time.
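If you want to guard against made-up probabilities programmatically, one simple sanity check is to rescale them so they sum to 1 and flag replies whose totals are far off. This sketch assumes a simple “answer | probability” reply format and a tolerance that I chose arbitrarily; neither is prescribed by the research paper:

```python
def parse_verbalized(reply_lines):
    """Parse 'answer | probability' lines into (answer, float) pairs."""
    pairs = []
    for line in reply_lines:
        answer, _, prob = line.rpartition("|")
        pairs.append((answer.strip(), float(prob)))
    return pairs

def normalize(pairs, tolerance=0.05):
    """Rescale probabilities to sum to 1; flag totals far from 1 as suspicious."""
    total = sum(p for _, p in pairs)
    suspicious = abs(total - 1.0) > tolerance  # likely made-up numbers
    return [(a, p / total) for a, p in pairs], suspicious

# Hypothetical reply lines; the probabilities deliberately sum to only 0.90.
reply = [
    "Talk to your neighbor politely | 0.45",
    "Leave a friendly note | 0.30",
    "Contact the landlord | 0.15",
]
scaled, flag = normalize(parse_verbalized(reply))
print(scaled)
print("Probabilities look fabricated or incomplete:", flag)
```

Because the verbalized probabilities here total 0.90 rather than 1.0, the check raises the flag, which is exactly the kind of cue that should prompt you to treat the numbers as rough approximations.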
If you use this type of prompt all the time, you might begin to notice an uptick in your costs.

An eye-catching takeaway is that when you see answers provided by AI on a default basis, you are not likely to realize that the answers are based on a ranking order and that you are usually shown only the top-ranked selection. It is extremely easy to fall into a mental trap wherein you assume you are seeing the only viable answer.

This raises immense societal concerns. On a global scale, we could all become habitually conditioned to be aware of only top-ranked answers. Our thought patterns might converge in a manner that stifles open thinking. We would all think the same way. There is the sage adage that you don't know what you don't know. If we always and only see the top-ranked answers, people will not be cognizant of the possibility of other answers. They won't know what they don't know.

By using a prompting technique such as verbalized sampling, you have a modicum of a shot at learning what you don't otherwise know.
