Hacker News
7 days ago by tkgally

There's been a rush of releases of reasoning models in the past couple of weeks. This one looks interesting, too.

I found the following video from Sam Witteveen to be a useful introduction to a few of those models:

https://youtu.be/vN8jBxEKkVo

7 days ago by CGamesPlay

In what way did they "release" this? I can't find it on Hugging Face or Ollama, and they only seem to have a "try online" link in the article. "Self-sovereign intelligence", indeed.

7 days ago by wongarsu

They released it in the same sense OpenAI released GPT-4. There is an online demo you can chat with, and a form to get in touch with sales to get API access.

7 days ago by underlines

they didn't

7 days ago by tanakai24

Legally, you cannot name Llama 3-based models like that. You have to use "Llama" in the name.

https://huggingface.co/meta-llama/Llama-3.1-70B-Instruct/blo...

7 days ago by alexvitkov

Too bad :)

Facebook trained the model on an Internet's worth of copyrighted material without any regard for licenses whatsoever - even if model weights are copyrightable, which is an open question, you're doing the exact same thing they did. Probably not a bulletproof legal defense though.

6 days ago by tourmalinetaco

At least Zuck had the decency to release model weights, unlike these worthless clowns.

7 days ago by littlestymaar

Can't wait until Meta sues them so we can have a judgment on whether or not model weights are subject to copyright.

7 days ago by euroderf

Model weights are (abstractly speaking) a very intensive, concentrated form of website scraping, yes?

What does the (USA) law say about scraping? Does "fair use" play a role?

7 days ago by ranger_danger

Yes, and there have already been court cases that ruled AI training on copyrighted data to be fair use, because it's technically no different from any other form of art: everything is based on seeing other ideas elsewhere, and there are no new ideas anymore.

7 days ago by jb_briant

Am I wrong to think that "reasoning model" is a misleading marketing term?

Isn't it an LLM with an algo wrapper?

7 days ago by viraptor

Whether you bake the behaviour in or wrap it in an external loop, you need to train/tune the expected behaviour. Generic models can do chain of thought if asked for, but will be worse than the specialised one.

7 days ago by benchmarkist

They're not baking anything in. Reasoning, as it is defined by AI marketing departments, is just beam search.

7 days ago by jb_briant

Could you educate me on what beam search is? Or link a good resource?

EDIT: https://www.width.ai/post/what-is-beam-search

So the wider the beam, the better the outcome?

Yep, no reasoning, just a marketing term to say "more accurate probabilities"
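
For anyone else who hasn't met the term, here is a minimal sketch of beam search over a toy next-token table. The table `TOY_MODEL`, the function `beam_search`, and all the probabilities are made up for illustration; this is not a claim about how DeepThought-8B actually works.

```python
import math

# Hypothetical toy "language model": maps a prefix (tuple of tokens) to
# next-token probabilities. All numbers are invented for illustration.
TOY_MODEL = {
    (): {"a": 0.6, "b": 0.4},
    ("a",): {"x": 0.5, "y": 0.5},
    ("b",): {"x": 0.9, "y": 0.1},
}


def beam_search(model, width, steps):
    """Keep only the `width` highest-scoring prefixes after each step."""
    beams = [((), 0.0)]  # (prefix, cumulative log-probability)
    for _ in range(steps):
        candidates = []
        for prefix, score in beams:
            # Extend every surviving prefix by every possible next token.
            for token, prob in model.get(prefix, {}).items():
                candidates.append((prefix + (token,), score + math.log(prob)))
        if not candidates:
            break
        # Prune: sort by score, best first, and keep the top `width`.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:width]
    return beams


greedy = beam_search(TOY_MODEL, width=1, steps=2)[0]  # width 1 is greedy decoding
wide = beam_search(TOY_MODEL, width=2, steps=2)[0]
print(greedy[0], round(math.exp(greedy[1]), 2))
print(wide[0], round(math.exp(wide[1]), 2))
```

In this toy table, width 1 commits to "a" early and ends on a sequence with probability 0.30, while width 2 keeps "b" alive and finds ("b", "x") at 0.36. That only shows the wider beam finding a more probable sequence here; a more probable sequence is not necessarily a better answer.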

7 days ago by benchmarkist

AI marketing departments are fond of anthropomorphic language but it's actually just regular beam search.

7 days ago by JTyQZSnP3cQGa8B

The same way they now call "open-source" a completely closed-source binary blob full of copyright infringement.

7 days ago by Kiro

"reasoning model" means nothing so I don't think it's misleading.

7 days ago by astrobe_

Reasoning means "inference" or "deduction" to me, or at least some process related to first-order logic.

7 days ago by nyrikki

The known upper bound for transformers' on-the-fly computation abilities is a complexity class called DLOGTIME-uniform TC^0.

There is a lot to unpack there but if you take FO as being closed under conjunction (∧), negation (¬) and universal quantification (∀); you will find that DLOGTIME-uniform TC^0 is equal to FO+Majority Gates.

So be careful about that distinction.

To help break the above down:

DLOGTIME = Constructible by a RAM or TM in logarithmic time.

uniform = Only one circuit for all input sizes, when circuit families are the default convention.

TC^0 = Constant-depth threshold circuits.

Even NP == SO-E, the second-order queries where the second-order quantifiers are only existentials.

DLOGTIME-uniform TC^0 is a WAY smaller group than most people realize, but anything that is an algorithm or a program basically is logic, with P being FO + transitive closure or half a dozen other known mappings.

Transformers can figure out syntax, but if you dig into that DLOGTIME part, you will see that semantic correctness isn't really an option; thus the need to leverage the pattern matching and finding of pre-training as much as possible.
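
To make "majority gates" a bit more concrete: a threshold gate outputs 1 when at least k of its inputs are 1, and majority is the special case where k is just over half the inputs. A minimal sketch, with function names of my own choosing (`threshold_gate`, `majority_gate`); it only illustrates a single gate, not a constant-depth circuit family:

```python
def threshold_gate(bits, k):
    """Output 1 if at least k of the input bits are 1, else 0."""
    return int(sum(bits) >= k)


def majority_gate(bits):
    """Output 1 if strictly more than half of the input bits are 1."""
    return threshold_gate(bits, len(bits) // 2 + 1)


print(majority_gate([1, 0, 1]))     # two of three inputs are set
print(majority_gate([1, 0, 1, 0]))  # a tie is not a strict majority
```

AND and OR fall out as special cases: AND is a threshold gate with k equal to the number of inputs, and OR is one with k = 1.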

7 days ago by codetrotter

Given the name they gave it, someone with access should ask it for the “Answer to the Ultimate Question of Life, The Universe, and Everything”

If the answer is anything other than a simple “42”, I will be thoroughly disappointed. (The answer has to be just “42”, not a bunch of text about The Hitchhiker's Guide to the Galaxy and all that.)

7 days ago by vintermann

Deep Thought didn't answer right away either.

7 days ago by lowbloodsugar

“Right away”. lol.

7 days ago by asah

"what is the population of manhattan below central park"

ChatGPT-o1-preview: 647,000 (based on 2023 data, breaking it down by community board area): https://chatgpt.com/share/674b3f5b-29c4-8007-b1b6-5e0a4aeaf0... (this appears to be the most correct, judging from census data)

DeepThought-8B: 200,000 (based on 2020 census data)

Claude: 300-350,000

Gemini: 2.7M during peak times (strange definition of population!)

I followed up with DeepThought-8B: "what is the population of all of manhattan, and how does that square with only having 200,000 below CP" and it cut off its answer, but in the reasoning box it updated its guess to 400,000 by estimating as a fraction of land area.

7 days ago by igleria

I asked it "Describe how a device for transportation of living beings would be able to fly while looking like a sphere" and it just never returned an output

7 days ago by Timwi

I asked it to just count letters in a long word and it never returned an output (been waiting for 30 minutes now)

7 days ago by m3kw9

It isn’t pleased you ask it such questions

7 days ago by ConspiracyFact

Blaine is a pain

7 days ago by nyoomboom

The reasoning steps look reasonable and the interface is simple and beautiful, though DeepThought-8B fails to distinguish "the ruliad", the technical concept from Wolfram physics, from the company's name, Ruliad. Maybe that isn't in the training data, because it misunderstood the problem when asked "what is the simplest rule of the ruliad?" and went on to reason about the company's core principles. Cool release, waiting for the next update.

7 days ago by segalord

Xd, gotta love how your first question to test a model is about a “ruliad”. It’s not even in my iOS dictionary.
