At the Databricks summit there was a nice presentation [0] by the CEO of V7 Labs, who showed a demo of their LLM + spreadsheet product.
The kneejerk reaction of "ugh, LLM and spreadsheet?!" is understandable, but I encourage you to watch that demo. It makes clear some obvious potential of LLMs in spreadsheets. They can basically be an advanced autofill. If you've used CoPilot in VSCode, you understand the satisfaction of feeling like an LLM is thinking one step ahead of you. This should be achievable in spreadsheets as well.
[0] https://youtube.com/watch?v=0SVilfbn-HY&t=1251 (queued to demo at 20:51)
> If you've used CoPilot in VSCode, you understand the satisfaction of feeling like an LLM is thinking one step ahead of you
That "satisfaction" vanished pretty damn quickly, once I realised that I have often more work correcting the stuff so generated than I would have had writing it myself in the first place.
LLMs in programming absolutely have their uses, lots of them actually, and I don't wanna miss them. But they are not "thinking ahead" of the code I write, not by a long shot.
I really don't know what the detractors of Copilot are writing, the next Stuxnet? Whether I'm doing stupid EDA or writing some fairly original frameworks, Copilot has always been useful to me, both for writing boilerplate code and for completing more esoteric logic. There's definitely a slight modification I have made in how I type (making variable names obvious, stopping at the right moment knowing Copilot will complete the next bit, etc.), but if anything it has made me a cleaner programmer who types at least 50% fewer characters.
While it could be that you and them work on different kinds of code, I believe it's just as likely that you're just different people with different experience and expectations.
A "wow, that's a great start" to one could be a "damn there's an issue I need to fix with this" to another. To some, that great start really makes them more productive. To others that 80% solution slows them down.
For some reason, programmers just love to be zealots and run flamewars to promote their tool of choice. Probably because their genuine experience is that their tool is fantastic and the other guy's wasn't, and they want them to see the light, too.
I prefer to judge people on the quality of their output, not the tools they use to produce it. There's evidently great code being written with uEmacs (Linux, Git), and I assume that, all the way on the other end of the spectrum, there's probably great code being written with VSCode and Copilot.
In my experience using LLMs like CoPilot:
Web server work in Go and Python, and front-end work in JavaScript - it's pretty good. It's only when I try to do something truly application-specific that it starts to get tripped up.
Multi threading python work - not bad, but occasionally makes some mistakes in understanding scope or appropriate safe memory access, which can be show stopping.
Deep learning, computer vision work - it gets some common pytorch patterns down pat, and basic computer vision work that you'd typically find tutorials for but struggles on any unique task.
Reinforcement learning for simulated robotics environments - it really struggles to keep up.
ROS2 - fantastic for most of the framework code for robotics projects, really great and recommended for someone getting used to ROS.
C++ work - REALLY struggles with anything beyond basic stuff. I was working with threading the other day and turned it off as all of its suggestions would never compile let alone do anything sensible.
They are with me, and with many other people. Perhaps it's the quality of your code that is preventing better completions. (Or the language you use?)
There are a few things that really help the AI to understand what you want to do, otherwise it might struggle and come up with not so good code.
Not to say it gets it right every time, but definitely often enough for me not to even consider turning it off. The time saved has been tremendous.
I can see vague blaming the person becoming more the norm when LLMs are responsible for precrime and restrictions etc.
"Oh, you couldn't take a train to work? Must have been something you did, the Palantir is usually great and helps our society. It always works great for me and my friends."
That's because you were using CoPilot. Try a much better option such as Supermaven. I unsubscribed from CoPilot for similar reasons but after using Supermaven for 3-4 months I will never cancel this subscription unless something better comes along. It's way more accurate and way faster.
That's not my experience at all. I very seldom need to correct anything Copilot outputs.
> If you've used CoPilot in VSCode, you understand the satisfaction of feeling like an LLM is thinking one step ahead of you
I did not get that feeling from CoPilot. I usually got the feeling that it was interrupting me to complete my thought but getting it wrong. It was incredibly annoying and distracting. Instead of helping me to think it was making it harder to think. Pair programming with an LLM has been great. Better than with most humans. But autocomplete sucks for me.
Seems like a reasonably-cromulent use-case -- or at least, it fits in with my own uses of LLMs.
I suck at spreadsheets. I know they can do both useful and amazing things, but my daily life does not revolve around spreadsheets and I simply do not understand most of the syntax and operations required to make even fairly basic things work. It requires a lot of time and effort for me to get simple things done with a spreadsheet on the rare occasion that I need to manipulate one.
There are things in life that I am very good at; spreadsheets are simply not amongst them.
But I do know what I want, and I generally even have a ballpark idea of what the results should look like, and how to calculate it by hand [horror]. I just don't always know how to articulate it in a way that LibreOffice or Google Sheets or whatever can understand.
LLMs have helped to bridge that gap for me, but it's a pain in the ass: I have to be very careful with the context that I give the LLM (because garbage in is garbage out).
But in the demo, the LLM has the context already. This skips a ton of preamble setup steps to get the LLM ready to provide potentially-useful work, and moves closer to just making a request and getting the desired output.
Having one unified interface saves even more steps.
(And no, this isn't for everyone.)
I don't think I understand that demo. It shows him using some built-in workflow thing (which isn't generally considered a core part of a spreadsheet) and then asking some LLM about the total price (I guess asking it to do math, which LLMs are notoriously bad at), but instead it looks like he gets some responses telling him what the term "total price" means, in prose that doesn't fit in the cells.
What was I supposed to take away from that demo?
The LLM doesn't do the math. It outputs something the app then interprets into a cell configuration with the sums filled in. This is an area where LLMs can be quite good: you type out how you want to report the data, like "give me subtotals of column F at every month of the date column E and a grand total of F at the bottom".
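A minimal sketch of that division of labor, assuming a hypothetical JSON response format (real products define their own): the model names cells and formulas, and the spreadsheet engine, not the LLM, does the arithmetic.

```python
import json

# Hypothetical structured reply from an LLM asked to lay out a report.
llm_response = json.dumps({
    "cells": [
        {"ref": "A1", "value": "Month"},
        {"ref": "B1", "value": "Subtotal"},
        {"ref": "B2", "formula": "=SUMIFS(F:F, E:E, \">=2024-01-01\", E:E, \"<2024-02-01\")"},
        {"ref": "B10", "formula": "=SUM(F:F)"},
    ]
})

def apply_layout(response: str) -> dict:
    """Translate the LLM's cell list into a {ref: content} grid.

    Formulas stay as strings starting with '='; the host spreadsheet
    evaluates them, so the LLM never has to compute a number.
    """
    grid = {}
    for cell in json.loads(response)["cells"]:
        grid[cell["ref"]] = cell.get("formula", cell.get("value"))
    return grid

grid = apply_layout(llm_response)
print(grid["B10"])  # prints: =SUM(F:F)
```

The point of the intermediate format is that a malformed or hallucinated formula fails visibly in the sheet instead of silently producing a wrong total.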
Except sometimes you can't seem to stop the prose.
I've found that all the top foundation models already understand spreadsheets very well, along with all the functions and all the common spreadsheet problems people run into. The Internet is chock full of spreadsheet support forums and tutorials, and the foundation models have all been trained on this data.
With not very much effort, one can explain to an LLM "here is a spreadsheet, formatted as...", which takes about 150 tokens, and then with not much more effort write code in your favorite language to translate an arbitrary spreadsheet into that format. The result is a very capable LLM interface that can help explain complex arbitrary spreadsheets as well as generate them on request.
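A rough sketch of that translation step; the "A1: value" line format here is my own assumption, any compact, cell-addressed layout would do:

```python
def to_prompt_text(rows):
    """Serialize a grid (list of rows) into compact 'A1: value' lines,
    skipping empty cells so sparse sheets stay small in the prompt."""
    lines = []
    for r, row in enumerate(rows, start=1):
        for c, value in enumerate(row):
            if value in (None, ""):
                continue
            # Convert a 0-based column index to a letter: 0 -> A, 26 -> AA ...
            col, n = "", c
            while True:
                col = chr(ord("A") + n % 26) + col
                n = n // 26 - 1
                if n < 0:
                    break
            lines.append(f"{col}{r}: {value}")
    return "\n".join(lines)

sheet = [
    ["Item", "Price"],
    ["Apples", 3],
    ["", ""],
    ["Total", "=SUM(B2:B2)"],
]
print(to_prompt_text(sheet))
```

Feeding the resulting text to a model along with a question ("why does B4 show an error?") is usually all the setup needed.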
I've got finance professionals and attorneys using a tool I wrote doing this to help them understand and debug complex spreadsheets given to them by peers and clients.
The issue was that before, large spreadsheets would overflow the context, so this "compression" technique helps the LLM do more from the same data.
Which strikes me as an ingenious method of locking in their customers with a proprietary compressed format only their finetuned LLMs can parse.
I love the deep technical discussions on HN, and I'm disappointed to see anything AI related start to just resemble Reddit threads of people with knee jerk reactions to the title.
This is interesting; it's about how you can represent spreadsheets to LLMs.
Yes, for some reason we really have an established hate club around here. And the comments are usually the same thing every time.
How will it work?
I open an Excel spreadsheet and also the AI Copilot. Then whenever I want to do something with Excel, like "Show me which cells have formulas", Copilot will interact with Excel and issue some command I can't remember, to do that for me?
Menus are good but often hard to navigate and find things in. So Copilot can give me a whole new (prompt-based) user interface to any MS application? Is that how it works?
I'm now so reliant on ChatGPT for gSheets, that I'd be almost unable to maintain my sheets' absurd formulas without it.
It's also really accelerated my knowledge of, and skill with, the specifics of the Excel formula language.
Having an LLM being able to directly read/write at the sheet level, instead of just generating formulas for one cell, would be amazing.
The real trick would be for LLMs, which currently do math very poorly, to simply send "math to be done" into a spreadsheet, and retrieve the results... (If anyone is aware of an LLM that's great at math and physics, pls lmk!!)
Spreadsheets can fill the gap between ad-hoc prompting/prompt workbooks and custom software for special business tasks.
By using a prompt function like LABS.GENERATIVEAI in Excel, you can create solutions that combine calculations, data, and generative AI. In my experience, transforming data to and from CSV works best for prompting in spreadsheets, and getting data to and from CSV format can be done with other spreadsheet functions.
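The CSV round-trip is easy to sketch outside the sheet as well (stdlib only; the prompt wording here is illustrative, not a real product's API):

```python
import csv
import io

def rows_to_csv(rows):
    """Pack rows into a CSV string suitable for embedding in a prompt."""
    buf = io.StringIO()
    csv.writer(buf).writerows(rows)
    return buf.getvalue()

def csv_to_rows(text):
    """Unpack a CSV string (e.g. a model's reply) back into rows."""
    return [row for row in csv.reader(io.StringIO(text)) if row]

data = [["name", "country"], ["V7", "UK"], ["Databricks", "US"]]

# Embed the table in a plain-text request; the model replies in CSV,
# which parses straight back into cells.
prompt = "Add a 'continent' column to this table:\n" + rows_to_csv(data)
round_trip = csv_to_rows(rows_to_csv(data))
```

CSV works well here because models have seen enormous amounts of it, and because a malformed reply fails loudly at parse time rather than silently corrupting the sheet.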
I've created a book and course (https://mitjamartini.com/resources/ai-engineering/ebooks/han...) that teaches how to do this (both more beginner level). Just working through the examples or the examples provided by Anthropic for Claude for Sheets should be enough to get going.