Orca 2 enhances smaller language models' reasoning by teaching them diverse solution strategies for different tasks, enabling them to outperform models up to 10x larger on complex benchmarks.
Authors: Arindam Mitra; Luciano Del Corro (work done while at Microsoft); Shweti Mahajan (work done while at Microsoft); Andres Codas (equal contribution); Clarisse Simoes (equal contribution); Sahaj Agarwal; Xuxi Chen (work done while at Microsoft); Anastasia Razdaibiedina (work done while at Microsoft); Erik Jones (work done while at Microsoft); Kriti Aggarwal (work done while at Microsoft); Hamid Palangi; Guoqing Zheng; Corby Rosset; Hamed Khanpour; Ahmed Awadallah.
This section describes the training data, covering the annotations, the Orca 1 dataset and the Orca 2 dataset, as well as the details of progressive learning.

4.1 Dataset Construction

The Orca 2 dataset has four main sources. The first is the Flan-v2 Collection, which consists of five sub-collections, namely CoT, NiV2, T0, Flan 2021 and Dialogue. Each sub-collection contains multiple tasks. Following Orca 1, we consider tasks from only the CoT, NiV2, T0 and Flan 2021 sub-collections, which contain a total of 1913 tasks. Each task in Flan-v2 is a collection of queries and has an associated answer. Some of the 1913 tasks in the Flan-v2 dataset contain both zero-shot and few-shot problems.
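To make the selection step concrete, here is a minimal sketch of filtering tasks by sub-collection. The in-memory task representation (dicts with `sub_collection` and `queries` fields) and the `select_tasks` helper are hypothetical illustrations, not the published Orca 2 pipeline.

```python
# The five Flan-v2 sub-collections; following Orca 1, only four are kept.
FLAN_V2_SUBCOLLECTIONS = {"CoT", "NiV2", "T0", "Flan 2021", "Dialogue"}
SELECTED_SUBCOLLECTIONS = {"CoT", "NiV2", "T0", "Flan 2021"}

def select_tasks(tasks):
    """Keep only tasks from the four selected sub-collections.

    `tasks` is assumed to be a list of dicts, each with a 'sub_collection'
    label and a 'queries' list of (query, answer) pairs.
    """
    return [t for t in tasks if t["sub_collection"] in SELECTED_SUBCOLLECTIONS]

# Toy example: the Dialogue task is dropped, the other two are kept.
toy_tasks = [
    {"sub_collection": "CoT", "queries": [("Q1", "A1")]},
    {"sub_collection": "Dialogue", "queries": [("Q2", "A2")]},
    {"sub_collection": "T0", "queries": [("Q3", "A3")]},
]
print(len(select_tasks(toy_tasks)))  # -> 2
```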
Progressive Learning: We start from a LLaMA-2 checkpoint and fine-tune it on the train split of the Flan-v2 dataset for one epoch. We then train on 5 million ChatGPT data from Orca 1 for 3 epochs. Then we train on the combination of 1 million GPT-4 data from Orca 1 and Orca 2's 817K data for 4 epochs.
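A minimal sketch of this three-stage schedule is shown below. The `finetune` function, dataset handles, and checkpoint name are placeholders; the paper does not publish this training code.

```python
def finetune(model, dataset, epochs):
    """Placeholder for one supervised fine-tuning stage."""
    print(f"training on {dataset} for {epochs} epoch(s)")
    return model

# (dataset, epochs), in the order the stages are applied.
STAGES = [
    ("Flan-v2 train split", 1),
    ("5M ChatGPT responses from Orca 1", 3),
    ("1M GPT-4 responses from Orca 1 + 817K Orca 2 data", 4),
]

model = "llama-2-checkpoint"  # hypothetical starting checkpoint
for dataset, epochs in STAGES:
    model = finetune(model, dataset, epochs)
```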
Tokenization: We utilize the LLaMA Byte Pair Encoding (BPE) tokenizer for processing the input examples. Notably, the LLaMA tokenizer splits all numbers into individual digits and falls back to bytes to decompose unknown UTF-8 characters.
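Both behaviors can be observed directly with the Hugging Face `transformers` tokenizer for LLaMA, as in the sketch below. The checkpoint name is one public (gated) example, and the token sequences in the comments are indicative rather than guaranteed.

```python
from transformers import AutoTokenizer

# Requires access to the gated LLaMA-2 repository on the Hugging Face Hub.
tok = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")

# Numbers are split into individual digits rather than multi-digit tokens.
print(tok.tokenize("x = 12345"))
# e.g. ['▁x', '▁=', '▁', '1', '2', '3', '4', '5']

# Characters outside the vocabulary fall back to raw UTF-8 bytes
# (U+1F433 encodes as the four bytes F0 9F 90 B3).
print(tok.tokenize("🐳"))
# e.g. ['▁', '<0xF0>', '<0x9F>', '<0x90>', '<0xB3>']
```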