A preview of o1 was released by OpenAI on September 12, 2024. o1 spends time "thinking" before it answers, making it better at complex reasoning tasks, science and programming than GPT-4o.[3]
The codename "Q*" first surfaced in November 2023, around the time of Sam Altman's ousting and subsequent reinstatement, with rumors suggesting that this experimental model had shown promising results on mathematical benchmarks.[16]
OpenAI noted that o1's reasoning capabilities make it better at adhering to safety rules provided in the prompt's context window.[19]
o1 usually requires more computing time and power than other GPT models by OpenAI, because it generates long chains of thought before making the final response.[20]
OpenAI forbids users from trying to reveal o1's chain of thought, which is hidden by design and not trained to comply with the company's policies.
OpenAI cites AI safety and competitive advantage as reasons for the restriction, which has been described as a loss of transparency by developers who work with large language models (LLMs).[21]
In October 2024, researchers at Apple submitted a preprint reporting that LLMs such as o1 may be replicating reasoning steps from the models' own training data.
Adding extraneous but logically inconsequential information to the problems caused a large drop in performance, ranging from 17.5% for o1-preview and 29.1% for o1-mini to 65.7% for the worst model tested.
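The kind of perturbation described above can be illustrated with a minimal Python sketch. It is not the researchers' code: the word problem, the distractor sentence, the accuracy figures, and the helper names `add_distractor` and `accuracy_change` are all hypothetical, chosen only to show how an irrelevant clause can be inserted without changing the correct answer.

```python
def add_distractor(problem: str, distractor: str) -> str:
    """Insert an irrelevant sentence just before the final question.

    Hypothetical helper: assumes sentences are separated by ". " and
    that the last sentence is the question being asked.
    """
    body, sep, question = problem.rpartition(". ")
    if not sep:
        # Single-sentence problem: put the distractor in front.
        return f"{distractor} {problem}"
    return f"{body}. {distractor} {question}"


def accuracy_change(baseline: float, perturbed: float) -> float:
    """Change in accuracy, in percentage points (negative means a drop)."""
    return perturbed - baseline


# Hypothetical word problem; the correct answer is 12 + 8 = 20.
problem = (
    "Ana picks 12 apples on Monday and 8 apples on Tuesday. "
    "How many apples does Ana have?"
)
# The distractor mentions a number but does not affect the total.
distractor = "Three of the apples are slightly smaller than average."

perturbed = add_distractor(problem, distractor)
print(perturbed)

# Hypothetical accuracies (in percent) before and after perturbation.
print(accuracy_change(90.0, 70.0))
```

A model that reasons about the problem should still answer 20 for the perturbed version; a model that pattern-matches on surface features might subtract the three "smaller" apples.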