This is incredible. In April I used the standard GPT-4 model via ChatGPT to help me reverse engineer the binary Bluetooth protocol used by my kitchen fan to integrate it into Home Assistant.
It was helpful in a rubber-duck way, but could not determine the pattern used to transmit the remaining runtime of the fan in a certain mode. Initial prompt here [0].
I pasted the same prompt into o1-preview and o1-mini and both correctly understood and decoded the pattern using a slightly different method than I devised in April. Asking the models to determine if my code is equivalent to what they reverse engineered resulted in a nuanced and thorough examination, and eventual conclusion that it is equivalent. [1]
Testing the same prompt with GPT-4o leads to the same result as April's GPT-4 (via ChatGPT) model. Amazing progress.
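For the curious, here's a rough sketch of the kind of integration involved (not my actual code; the address and UUID are placeholders), assuming the remaining runtime is carried little-endian in the last two bytes of each notification, as discussed further down the thread:

```python
# Rough sketch, not my actual code: subscribe to the fan's notify
# characteristic with bleak and decode the remaining runtime, assuming it
# sits little-endian in the last two bytes of each frame.
import asyncio
from bleak import BleakClient

FAN_ADDRESS = "AA:BB:CC:DD:EE:FF"                      # placeholder MAC address
NOTIFY_CHAR = "0000fff1-0000-1000-8000-00805f9b34fb"   # placeholder characteristic UUID

def handle_notification(_sender, data: bytearray) -> None:
    # 256 * last_byte + second_to_last_byte, i.e. low byte first (little-endian)
    remaining = data[-1] * 256 + data[-2]
    print(f"raw={data.hex()} remaining_runtime={remaining}")

async def main() -> None:
    async with BleakClient(FAN_ADDRESS) as client:
        await client.start_notify(NOTIFY_CHAR, handle_notification)
        await asyncio.sleep(60)  # listen to status frames for a minute

asyncio.run(main())
```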
FYI, there's a "Save ChatGPT as PDF" Chrome extension [1].
I wouldn't use it on a ChatGPT for Business subscription (it may be against your company's policies to export anything), but it's very convenient for personal use.
Wow, that is impressive! How were you able to use o1-preview? I pay for ChatGPT, but on chatgpt.com in the model selector I only see 4o, 4o-mini, and 4. Is o1 in that list for you, or is it somewhere else?
Haha, I wish. Although I saw the other one (I forget its name) which makes music for you; now you can ask it for a soundtrack and it gives it back to you in your voice, or something like that. Interesting times are ahead for sure!
I heard on X that suno.com has this feature but couldn't find it; maybe it's coming soon? I don't know, but there are ways you can do it, or maybe it was a different service. Suno is pretty cool though.
It's my understanding that paying subscribers aren't actually paying enough to cover costs, that $20/month isn't nearly enough - in that context, a gradual roll-out seems fair. Though maybe they could introduce a couple more higher-priced tiers to give people the option to pay for early access.
The linked release mentions trusted users and links to the usage tier limits. Looking at the pricing, o1-preview only appears at tier 5, which requires $1,000+ spent on the API and a first successful payment made 30+ days ago.
I often click on those links and get an error that they are unavailable. I'm not sure if it's OpenAI trying to prevent people from sharing evidence of the model behaving badly, or something more innocuous, like the links being temporary.
The link also breaks if the original user deletes the chat it points to, whether on purpose or without realizing that doing so would break the link.
Even for regular users, the Share button is not always available or functional. It works sometimes, and other times it disappears. For example, as of today I have no Share button at all for chats.
I'm impressed. I had two modified logic puzzles where ChatGPT-4 fails but o1 succeeds. The training data had too many instances of the unmodified puzzle, so 4 wouldn't get it right. o1 manages to not get tripped up by them.
The screenshot [1] is not readable for me (Chrome, Android). It's so blurry that I can't recognize a single character. How do other people read it? The resolution is 84x800.
It didn't work until I switched to "Desktop Site" in the browser menu, as a sibling comment suggested. Then the page reloads with various buttons, etc. Until then, it was just the preview image, not reacting to clicks.
What if you copy the whole reasoning-process example provided by OpenAI, use it as a system prompt (to teach the model how to reason), and then use that system prompt with Claude, GPT-4o, etc.?
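Something like this, perhaps, assuming the official OpenAI Python SDK; whether the reasoning style actually transfers to the other model is the open question:

```python
# Sketch only: paste OpenAI's published reasoning transcript into the system
# prompt of a non-o1 model and see whether it imitates the step-by-step style.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Hypothetical file containing the copied chain-of-thought example
reasoning_example = open("o1_reasoning_example.txt").read()

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "system",
            "content": "Reason step by step like this example before answering:\n\n"
            + reasoning_example,
        },
        {"role": "user", "content": "Decode the pattern in these byte frames: ..."},
    ],
)
print(response.choices[0].message.content)
```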
Very cool. It gets the conclusion right, but it did confuse itself briefly after interpreting `256 * last_byte + second_to_last_byte` as big-endian. It's neat that it corrected the confusion, but a little unsatisfying that it doesn't explicitly identify the mistake the way a human would.
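To spell out the byte-order point, here's a quick illustrative check (made-up frame, not from the actual capture):

```python
# Illustrative only: 256 * last_byte + second_to_last_byte treats the
# second-to-last byte as the low byte, i.e. the field is little-endian.
frame = bytes.fromhex("0a0b2c01")   # made-up frame; runtime in the last two bytes
second_to_last, last = frame[-2], frame[-1]

little_endian = 256 * last + second_to_last   # 256 * 0x01 + 0x2c = 300
big_endian = 256 * second_to_last + last      # 256 * 0x2c + 0x01 = 11265 (the mistaken reading)

assert little_endian == int.from_bytes(frame[-2:], "little")
assert big_endian == int.from_bytes(frame[-2:], "big")
```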
Also, if you actually read it, the "chain of thought" contains several embarrassing contradictions and incoherent sentences. If a junior developer wrote this analysis, I'd send them back to reread the fundamentals.
Well, it doesn't "correct" itself later. It just says wrong things and gets the right answer anyways, because this encoding is so simple that many college freshmen could figure it out in their heads.
Read the transcript with a critical eye instead of just skimming it, and you'll see what I mean.
> Asking the models to determine if my code is equivalent to what they reverse engineered resulted in a nuanced and thorough examination, and eventual conclusion that it is equivalent.
Did you actually implement it to see if it works out of the box?
Also, if you're a free user, or have accepted that your chats may be used for training, then maybe o1 was just trained on your previous chats and now knows how to reason about that particular type of problem.
[0]: https://pastebin.com/XZixQEM6
[1]: https://i.postimg.cc/VN1d2vRb/SCR-20240912-sdko.png (sorry about the screenshot – sharing ChatGPT chats is not easy)