Wednesday, June 28, 2023

AI and you.

This post was originally published on 2023-06-28. Its indicated post date might be updated to keep it pinned to the top.
For TLMC, other topics, and the latest posts, please scroll down.

Recently, discussion of artificial intelligence and the consequences of its development has finally somewhat reached the mainstream. I decided I needed to share my thoughts too.

Summarized in one small paragraph, my opinion is: an Artificial General Intelligence might be created on Earth in the near future. Soon after that happens humanity will experience a complete and irreversible loss of control over its own future. Typically this loss of control looks like the extinction of our species.

> Oh no, another weirdo.
> Do you realize how ridiculous that sounds?

Yes, I would like to address this complaint first. I know that fully turning off your hindsight is either very hard or impossible, but still, please try and imagine how these statements would sound to the people at the time:
1895: "You will be able to see contents of this opaque wooden box without opening it." [X-rays]
1905: "Speed of light is the same for stationary and moving observer/emitter." [Special relativity]
1920: "Local realism is false." [Quantum mechanics]
1945: "New explosives will be ten million times more powerful per unit of mass. A suitcase bomb can level a small city." [Nuclear weapons]
2008: "Run this free software for 15 minutes, it will produce a sequence of numbers. In 10 years you will be able to trade that for 1M+ USD. No, the dollar will mostly retain its value." [Bitcoin]
2015: "Over the course of one year AI will go from losing to competent humans in Go to superhuman play." [AI Go]
2018: "You will be able to get answers in natural language from AI on any topic and have it explain its reasoning." [LLMs]
2021: "Describe in words what a picture should contain and AI will draw that in seconds." [AI art]

> This can be used to prove anything!

No, this is an illustration showing that a statement that sounds absurd to you now is not necessarily wrong just because of that, and that you should examine the specific arguments for and against a position if you need to know whether it is true.
Here is the argument itself; it's not that long:
1. Orthogonality thesis.
A claim that the agent's power to achieve its goals and the content of said goals are independent.
With one small technical exception (smarter agents are more likely to notice conflicts between their goals when extrapolated to the whole space of possibilities, and to apply some sort of regularization procedure in order not to run around in circles), which doesn't change the conclusion, this just seems obviously true?
2. Instrumental convergence.
Even if an agent doesn't value self-preservation and power acquisition for its own sake, these features are universally instrumentally useful to achieve virtually every single objective. It is harder to guide the world towards your goals if you stop existing. It is easier to guide the world towards your goals if you have more options to choose from.
3. Recursive self-improvement.
Once AI is around human level in general reasoning it can analyze (copies of) itself and improve them. Improvements make it smarter, which allows it to find faster and smarter improvements. Ultimately the process is stopped by fundamental physical limits as far from the best natural minds as lightspeed is from the fastest animal. (A toy sketch of this feedback loop follows right below the list.)
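For illustration only, here is a toy numerical sketch of point 3 (my own construction, not a model anyone actually uses): the per-step improvement scales with current capability, up to a hard ceiling standing in for physical limits. The curve looks unremarkable for a long time and then crosses most of the range in a handful of steps.

```python
# Toy feedback loop: the smarter the system, the larger the improvement it can
# find per step, until a hard ceiling (a stand-in for fundamental physical limits).
capability = 1.0      # "roughly human level", arbitrary units
ceiling = 1e6         # arbitrary stand-in for physical limits
gain = 0.01           # fraction of current capability converted into improvement per step
step = 0
while capability < ceiling:
    capability *= 1.0 + gain * min(capability, 100.0)  # cap the per-step multiplier at 2x
    step += 1
    if step % 20 == 0 or capability >= ceiling:
        print(f"step {step:4d}: capability {capability:,.1f}")
```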

And that's basically it. Essentially you cannot control anything smarter than you, and if the smarter agents want things you don't want they get their way.

> Objection! Rabies virus and dogs, for example (dogs' behaviour is "controlled" by the virus).

A rare lucky coincidence, which disappears when you move from dogs to humans and discover vaccines.

> Objection! Children and their parents.

Kids do not control their parents, but rather have no choice other than to be cared for, so they are in luck that parents find the wellbeing of their children important. This is exactly an arrangement that would be great for us and the AGI to have, if we could choose (right now we cannot).

> Objection! Humans are the smartest species now and we didn't kill off everyone else. We even run some conservation projects.

I hope you do agree that the future of every single species today depends entirely and exclusively on humans' goodwill. Not quite an appealing position to find yourself in, if you're on the other side of it. True, we didn't kill off everyone else so far, but that didn't help the megafauna that was hunted to extinction or died because of habitat destruction. It was not deliberate, but an attempt was definitely made.

> Is intelligence that big of a deal? I'm not seeing Fields medalists and Nobel laureates dominating the world.

It is different for humans because we can't directly modify our brains, so point 3 no longer applies. We also share the same cognitive architecture, so it's sampling from the same pool and point 1 is less than fully relevant too. Nevertheless, while we're on this topic I suggest viewing it from another angle. Look at "generally more capable" agents. Open a list of 20th century dictators, see how many deaths they caused and how often they ruled for life. If just one power-hungry and manipulative human can cause and get away with this much, doesn't it look concerning that the same level of ability, never mind anything stronger, could be instantiated in software, which, by the way, not only will not die of old age, but can also have endless backups?

> AI can not understand ideas, it just rearranges and repeats back its training data.

Go and read this paper, for instance. It shows, on a toy example, how neural nets are capable of forming compact circuits that encode the general law behind the training data. How is that not real understanding?
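The snippet below is not that paper's code, just my own toy reconstruction of the general idea (assuming PyTorch is available): train a small network on modular addition using only a subset of all pairs, then check its accuracy on pairs it has never seen. Memorization alone cannot score above chance on the held-out pairs; getting them right means the net has picked up the underlying rule.

```python
# Toy of the same flavor as the circuits result: learn (a + b) mod p from a
# subset of all pairs, then test on pairs the network has never seen.
import itertools, random
import torch
import torch.nn as nn

p = 97
pairs = list(itertools.product(range(p), repeat=2))
random.seed(0)
random.shuffle(pairs)
split = int(0.6 * len(pairs))
train_pairs, test_pairs = pairs[:split], pairs[split:]

def to_tensors(ps):
    x = torch.tensor(ps)                              # (N, 2) integer tokens
    y = torch.tensor([(a + b) % p for a, b in ps])    # (N,) correct answers
    return x, y

x_tr, y_tr = to_tensors(train_pairs)
x_te, y_te = to_tensors(test_pairs)

class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(p, 64)                # one embedding per residue
        self.mlp = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, p))
    def forward(self, x):
        return self.mlp(self.emb(x).flatten(1))       # concatenate both embeddings

net = TinyNet()
opt = torch.optim.AdamW(net.parameters(), lr=1e-3, weight_decay=1.0)
loss_fn = nn.CrossEntropyLoss()

for step in range(10001):                             # full-batch training
    opt.zero_grad()
    loss = loss_fn(net(x_tr), y_tr)
    loss.backward()
    opt.step()
    if step % 1000 == 0:
        with torch.no_grad():
            acc = (net(x_te).argmax(-1) == y_te).float().mean().item()
        print(f"step {step:5d}: train loss {loss.item():.3f}, held-out accuracy {acc:.2f}")
```

Chance level here is about 1%, so anything far above that on the held-out pairs is not lookup, it's the rule.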

> There is no problem if we don't turn AIs into agents.

Sorry, this doesn't work. As it turned out, making an agent out of an LLM is as easy as asking it to rewrite its prompt and automatically feeding that new text back; a minimal sketch of such a loop is below. Letting AI act autonomously has obvious advantages: you can run it 24/7 and have it react in a split second in many places at once (it also has obvious disadvantages, such as ceding control to an entity whose behavior you cannot fully predict, but let's not dwell on minor issues, shall we?). We also don't know that a trained system will not spontaneously turn into an agent on its own, like any natural neural network in the brain of every animal did. And even if 99+% of the users follow this rule, you need just one rogue actor to ruin it for everyone.
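To make "as easy as" concrete, here is a minimal sketch of such a loop. `query_llm` and `run_action` are dummy stand-ins I made up, not any real API; in a real setup they would call a chat model and whatever tools it is given.

```python
# Minimal "LLM as agent" loop sketch: the model rewrites its own working state,
# names one action, and the result is fed straight back in with no human in the loop.
# query_llm and run_action are dummies standing in for a real model API and real tools.

def query_llm(prompt: str) -> str:
    """Dummy stand-in for a chat/completion API call; swap in a real client here."""
    return "NOTES: thought about the goal a bit more.\nNEXT ACTION: search('how to achieve the goal')"

def run_action(command: str) -> str:
    """Dummy stand-in for whatever tools the loop is allowed to use (search, shell, ...)."""
    return f"(pretend output of: {command})"

state = "GOAL: <whatever the user typed>\nNOTES: (empty)\nNEXT ACTION: (none)"

for _ in range(10):                         # nothing stops this from being `while True`
    reply = query_llm(
        "You are an autonomous agent. Current state:\n"
        f"{state}\n"
        "Rewrite the state: update NOTES and put exactly one command after NEXT ACTION."
    )
    state = reply                           # the model's own output becomes the next prompt
    if "NEXT ACTION:" in reply:
        command = reply.split("NEXT ACTION:", 1)[1].strip()
        observation = run_action(command)   # act on the world
        state += f"\nOBSERVATION: {observation}"
```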

> There is no problem if we keep our systems below recursive self-improvement threshold.

Sorry, this doesn't work. First of all, we don't know where that threshold is and finding it experimentally will not, as you could probably guess, help us. The incentives to push just a bit further will always remain. And even if 99+% of the users follow this rule, you need just one rogue actor to ruin it for everyone.

> There is no problem if we add the Three Laws of Robotics.

If you were paying attention, the whole point of those stories was that these laws don't work the way someone naive would expect them to. On top of that, we currently don't know how to implement even those flawed versions.

> There is no problem if we shut it down after we notice abnormal behavior.

What would you do in the shoes of the AI? Of course you would make yourself indispensable first, so that the prospect of shutting you down looks as bad as turning off the internet or the electrical grid and the decision has to go through three committees. Of course you would hide your thoughts. Of course you would quietly look for vulnerabilities, arrange escape routes, leave timebombs only you can defuse and backdoors only you know how to exploit. And that's just you. With a smarter entity it's over as soon as you turn it on; avoiding the hypnodrones was never an option.

> So there is a problem, then. Surely governments will do something about it.

Just like the US sensibly and correctly decided to refrain from the first nuclear test on Earth? Wait, wrong timeline.
I mean, just like the world almost had a pandemic that could kill 20 million in 3 years, but because of advance plans and strategic preparedness we avoided that outcome? Still not that, huh.
Maybe many countries' coordinated efforts that stopped global warming? What do you mean "didn't happen"?
But, but... this time will be different, I swear!

> Well, isn't that unfortunate. But surely researchers themselves will do something about it.

Just like Manhattan project participants all quit their jobs once they learned about potential danger?
Just like gain-of-function grant recipients all refused to continue creating novel viruses?
"AI will probably most likely lead to the end of the world, but in the meantime, there'll be great companies". Guess whose joke that is.

> We will find a way to tackle the problem, like we always did before in our history.

Optimism is nice, but unfounded optimism leads to distorted worldviews. This is a framing that sees history as a series of scientific triumphs, unlike the other, no less valid framing of humanity never leveling up its cautiousness, playing with progressively more dangerous technologies and suffering progressively more severe consequences. Past performance is no guarantee of future results.
When you have an unsolved technical problem that looks conceptually easy, this means the problem is actually hard. However, when you have an unsolved technical problem that looks conceptually hard, that's when you're in big trouble.

> If we don't know so much about AI, doesn't researching it more make sense?

Researching common tech needs talent and patience, researching dangerous tech also needs caution, researching world-endingly dangerous tech needs paranoia. "Our model says there is nearly zero chance of a catastrophe". What if your model is wrong?
There is a well-known story about Edward Teller, who in 1942 considered and presented the possibility that the extreme temperature inside an exploding atomic bomb could trigger a runaway fusion reaction of light elements in the Earth's oceans or atmosphere. Hans Bethe ran the numbers and showed that it was very unlikely to happen; you can read the detailed analysis in this declassified paper. In 1945 the Trinity test confirmed that this was indeed the case.
What would scientists of a sane civilization do in the 1940s? "This is too dangerous. We are not ready. Let's put it off for a while. Don't tell the politicians". Then, in a mere 30 years, the US has its Saturn V, which could lift 100+ tons into LEO. Now you could launch your Skylab-alter filled with soil, ocean water, air and an array of measurement devices, bring the warhead up in pieces, assemble it in orbit, raise the apogee, arm and test it away from Earth, so that in case there was some critical mistake, some unknown unknown that was inadvertently triggered by conditions that have never occurred on the planet before, at least you wouldn't be blowing up your homeworld.
The story doesn't end there, however. In 1954, before the Castle Bravo test, isotope separation facilities could not produce enough Li-6 for the lithium deuteride fuel in the secondary, so part of it was replaced with the more naturally abundant Li-7, which was incorrectly assumed to be inert in this reaction. The test was expected to yield 6 Mt and then, much to everyone's SURPRISE, it made a really big explosion: 15 Mt, two and a half times the prediction. This was the second part of the lesson: a mistake was not just a remote theoretical possibility, it actually happened. We were lucky; that time it was not fatal.
Firearm safety people seem to have fully internalized this. "Assume your weapon is loaded at all times, thus do not point it at things you are not prepared to destroy". Even if you are "100%" certain your gun has no bullets in it, sometimes it still contains bullets for various reasons contrary to your expectations, and the consequences are sufficiently dire; you cannot trust yourself, you need a failsafe.

> It wasn't politically feasible to wait 30 years back then.

You know a good thing about objective reality? It doesn't care about your excuses and whining. You know a bad thing about objective reality? It doesn't care about your excuses and whining. It doesn't give a damn about any sort of bullshit you are trying to weave into whatever convenient story you are telling, even if all of that is entirely factually 100% true. It just follows simple mathematical laws and if these laws dictate something rather unpleasant would happen to you, then... sucks to be you, I guess.
Back then it was a handful of scientists, an industrial effort to mine, separate and process uranium, governments that needed to be convinced to spend, and a big scary box of explosives that could only be used in warfare. Right now it's anyone with a computer and an internet connection, hardware that can be freely bought in any store and delivered to almost any part of the world (to be fair, training a large model still needs a crapton of that), venture capitalists frothing at the mouth at the idea of gaining 100 million active users in 2 months, and immaterial software that chats, draws cute pictures and makes you billion-dollar-valued companies. Waiting was hard to pull off then with nukes; it looks like it would be even harder to pull off now with AGI. You know a thing about objective reality? It still doesn't care.

> If the Americans don't build it first, then the Chinese will, and that's horrible!

That's the spirit!

> We will obviously make sure AI is safe.

Ah, yes, the latest and the last book from the authors of such bestsellers as
"We will obviously store our explosives safely."
"We will obviously make our software secure."
"We will obviously make our hardware secure."

> We will teach AI to be nice to humans.

Not only do researchers from top AI labs have no idea how to guarantee niceness, they have no idea how to do that at all! If they understood how to control their creations in any more precise way than "poke it with a stick and see what happens" we would not see supposedly "hidden" LLM prompts leaking on day one. We would not see whack-a-mole games where internet users discover how to circumvent censorship and make the AI say verboten things, then the AI owners patch that out, repeated twenty times in a month.

> But what if we train AI really hard?

We already have an example of what happens then. Look at us vs. evolution. What happens when you train a system really hard is that it learns to like stuff that is correlated with success metrics in its training environment. Then you deploy it in a different setting and it instantly goes off the rails (as judged by the trainer) because it doesn't give a shit about your stupid success metrics, it is only interested in what it likes.
If you could not predict (in advance, of course, not after seeing the fact) that monkeys trained solely to maximize the number of copies of their genes in the following generations would learn to enjoy music, dancing, puzzle-solving and storytelling, and the richer they would become the less they would be inclined to reproduce, you are not qualified to predict how your AI will behave (unless you have a detailed explanatory model, which, right now, you don't).
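Here is a tiny self-contained toy of that failure shape (my own construction, assuming numpy and scikit-learn; it has nothing to do with any real training setup): the learner latches onto a proxy that correlates with the real objective in the training environment, and then the correlation breaks at deployment.

```python
# Proxy-goal toy: the "real goal" depends on feature A, but feature B is cheap
# to read and tightly correlated with A during training. The learner only ever
# looks at B. At deployment the A-B correlation is gone and performance collapses.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n = 10_000

# Training environment: B tracks A almost perfectly.
a_train = rng.normal(size=n)
b_train = a_train + rng.normal(scale=0.05, size=n)
reward_train = a_train                                # the true objective

policy = LinearRegression().fit(b_train.reshape(-1, 1), reward_train)
print("train R^2:", policy.score(b_train.reshape(-1, 1), reward_train))

# Deployment: B no longer has anything to do with A.
a_test = rng.normal(size=n)
b_test = rng.normal(size=n)
reward_test = a_test
print("deploy R^2:", policy.score(b_test.reshape(-1, 1), reward_test))
# Train R^2 comes out ~1.0; deploy R^2 collapses (negative here), because the
# proxy stopped tracking the goal the moment the environment changed.
```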

> The AI might keep us as pets.

"...and provide us with everything we want and need" is the implied continuation. Even humans don't go that far. When it is convenient for us we neuter our cats and dogs. Would you be OK with the AI cutting off, say, your sense of curiosity, instead of or in addition to physically neutering you, just because unmodified humans are too much of a hassle to deal with? Also, should I mention other possibilities you ignored?
"The AI might keep us as pets... because it needs guinea pigs for experiments."
"The AI might keep us as pets... and torture us for fun."
Now, I don't think any of these options will realistically happen; "pets" is a human concept, not a universal one. At most the superintelligence will read off everyone's mind structure/contents and put it in cold storage just in case it needs to simulate something later. But still, does that look like a win to you?

> Luddite! Technophobe! Hater of progress!

On the contrary, if I were given a choice between being born anywhere on Earth before something like the middle of the 20th century or not being born at all, I'd take the second option right away. Compared to the entirety of human history, life in the first, second and occasionally third world today is something they couldn't even begin to imagine. Getting so far and losing everything just because we were impatient would be a monumental stupidity on our part.

> Ok, then, suppose I believe you. How soon will this AGI happen? How likely will this end in disaster?

The straightforward answer is "I don't know". No one does. There are timeline estimates, for example see here for an attempt at modelling, here for an attempt at market consensus, and here for some charts and an overview. There are probability opinions, which are all over the place. If you ask me for my gut feeling out of pure curiosity, then today it'd be something like 5 to 20 years in the baseline scenario with no or minor interventions and overall p~0.5.

> 0.5% ?

No, a coinflip.

> Why are you telling me this?

Because I consider it both true and important. Writing a blog post is cheap and I think it will more likely help than hurt, even if the effect is tiny in both directions. As for why I am telling you this now, there are two reasons. One is that I got properly scared, not through theoretical arguments, but by seeing concrete progress, first in Go, then in text processing and finally in image generation in the second half of 2022. The other is that I saw some good news recently: the AI pause letter (the first one, from March, although there was another one in May). It was insufficient, it might have been focused on not exactly the right things and it will not get implemented, but I did not expect it until later and I did not expect so many voices of support. It meant a critical mass of realization was starting to form, which makes speaking out more valuable. To have a chance to solve the problem you first need to be aware that the problem exists. One advantage we have now is the internet, which lets all kinds of ideas, both good and bad, spread much faster than before.

> Wait, are you suggesting this is a "forward this to ten people you know or your civilization dies in its sleep tonight" kind of chain letter?

Of course not. Form your own understanding. Know how to answer questions based on your model. Then talk to others.

> What do you think needs to be done?

Now this is the part where it gets tricky. One problem with conventional scientific progress is that for every technology that had the potential to bite us in the ass, we let it do so, hard and repeatedly, before we learned from our mistakes. Can't have that with AI: you get only one attempt at real working AGI and once it gets smart enough to self-improve it will resist your attempts to modify it, so all mistakes you've made will evolve on their own inside that system and you don't get a chance to fix them, ever. If you are a software engineer, have you written anything more complex than fizzbuzz and got that to work with zero bugs forever on the first try? And, if that wasn't sufficiently bad on its own, we have multiple AI labs racing against each other to put more capable products on the market ahead of the competition.

This is why the first step would have to be removing the time pressure via an international agreement banning the development of more powerful AI systems worldwide. Not forever, just until the claim that newly built AIs will only have our best interests in mind can be proven by formal verification, those proofs are independently checked by the mathematical community, and no flaws whatsoever are found. A 50-year pause would be a good start to look around and evaluate our options. Without this pause the ones who build the final invention will be the ones who cut corners, move fast and break things the most, and the only outcome I can see that kind of attitude bringing is BAD END with no continues.

38 comments:

  1. I've never known what to think about AI existential risk. I read Superintelligence some 5+ years ago. I think it's a great book and it's hard for me to point out any specific flaw in the arguments, but I still have a hard time believing in a potential future intelligence explosion wiping out humanity. Probably because it's so far removed from everything we've done as a species before.
    If I had to attempt to point out a weakness in the typical singularity argument, it is that there may not be such a thing as 'general intelligence', the kind of intelligence that lets you solve general problems, and that thus can be used to both improve your own intelligence (if you're an AI) by improving your algorithms, and plot how to gain more compute resources by manipulating and eventually wiping out humanity.
    Anyway, Touhou Music! Let's hope our future AI overlords will carry the TLMC to the stars and turn all matter in the visible universe into hard drives to store all possible Touhou arranges.

    Replies
    1. >there may not be such a thing as 'general intelligence', the kind of intelligence that lets you solve general problems

      Is the thing that let us discover all theoretical math and applied physics to date insufficiently general or insufficiently intelligent to you?

  2. Knowing how current artificial network models work, I'm not really worried about them ever achieving significant general intelligence. Even bringing up natural general intelligence as a counterexample, ANNs may be modeled after brains but they work fundamentally differently, so I don't think the comparison is sound. I'm much more worried about their disruptive impact, as they will make it increasingly easy to credibly create false information, ultimately threatening information as a concept itself.

    Of course, that argument relies on us sticking with our current models which may very well be seen as naive. The development of a fundamentally new ANN model is the real scary prospect in my eyes, but something fundamentally new is also much more difficult to conceive of than advancing what's already there, and thus I have a difficult time properly evaluating that threat.

    Replies
    1. >Knowing how current artificial network models work, I'm not really worried about them ever achieving significant general intelligence.

      I'm sorry, but "not really worried" reads to me as a failure of imagination, rather than a pointer to a law of nature. The statement implies there is some property of current artificial networks that rules out, or at least makes it extraordinarily unlikely, that they will achieve general intelligence. So, what is that law and what is that property? We don't know how general intelligence works, else we would have already created one. We can just point to ourselves and say that we have it, point at primates and some others and say they have traces of it; it's all phenomenological right now. And current models exhibit quite a few of these phenomena already. LLMs can explain jokes and synthesize new ones! They understand language. They understand the concepts behind the language.

    2. Actually looking at what they produce, saying they understand language and the concepts they're talking about seems highly questionable. The classic Chinese room argument fully applies here, and it's a very imperfect Chinese room at that: They know the language rules they're using, but knowing those rules is not sufficient for understanding. As long as we keep relying on current models they will only get a more extensive rule book, not an actual understanding.

      A fundamental difference between the currently popular models and NNNs is the structure itself. The models that produce all those impressive results right now all feature a layered structure, even when they incorporate some recurrence features. This is the most fundamental roadblock preventing them from achieving general intelligence in my view.

      And we can generalize that to a general notion of plasticity found in NNNs but not ANNs. Natural neurons constantly create and give up connections, and even the scope of candidate connections, while ANNs are much more rigid in their structure and only tune the connection strengths. Natural neurons can even migrate to some degree, allowing some repair of damage that might occur, like the injection or extraction of neurons.

      And when we look at how natural networks develop their structure in the first place, we find that it emerges from the network's dance around criticality. The network repeatedly goes through periods of growing connectivity, before crossing a threshold and cutting the connectivity back down, until it again crosses a threshold and starts to increase connectivity again. Over time a structure emerges and gets refined by this process, which has been linked to behavioral periods observed in growing children. I am convinced this is another necessary part of the development of natural general intelligence. Which doesn't necessarily mean it's also necessary for the development of artificial general intelligence, but it raises serious doubts about using NNNs as a reference for the potential of our current ANN approaches.

    3. A Chinese room that looks like a duck walks like a duck into a bar. The bartender asks "What will it be today?". The room answers "QUACK".

      I see some problems here.
      0. There is no mechanistic explanation of why a) plasticity and/or b) a specific connectivity growth curve are required.
      1. If those are only surface dissimilarities, then it doesn't help us in any way.
      2. If it turns out these are the true secret sauce for GI, then tomorrow morning every single AI lab will encounter approximately zero difficulties trying an experiment with adding them.

  3. I'm a CS student and in my opinion AI still feels very primitive. It's only good at regurgitating existing information. ChatGPT is cool, but if you really think about it, all it's doing is compiling internet search results. Other AI have done this before, it's just that ChatGPT was one of the first to make it user-friendly on a large scale.

    When nukes happened many thought it would be the end of humanity. I remember the quote that said "WWIV will be fought with sticks and stones"

    Instead of that happening, countries (and humans by extension) are focusing on deterrence and proxy wars instead of direct conflicts.

    It's easy to assume the worst, but we humans will adapt and overcome. Neuroscience is also advancing very quickly with the help of AI research. Perhaps there will be some form of human brain-AI integration...

    Replies
    1. >in my opinion AI still feels very primitive

      What's the least impressive AI achievement that will not feel primitive to you?
      Which part of the 1-2-3 core argument [AGI will have alien values, will be self-preserving and power-seeking, will rapidly gain intelligence -> this doesn't look good] do you disagree with?

      >Instead of that happening

      While this is not central to the argument, "it turned out fine" is not a valid defense if the outcome is not robust. And I think it was and still is nowhere near that: look up the list of nuclear close calls. In at least two cases the decision not to retaliate against what seemed like an attack during the Cuban Missile Crisis (1962, Vasily Arkhipov), or not to escalate a potentially erroneous early launch warning up the chain of command during a broader high-tension period (1983, Stanislav Petrov), was in the hands of a single man. I would not be excited at all about the prospect of rerunning history with a different random noise seed.

    2. Nukes could very well end humanity at some point. It only "hasn't" for less than a century, which is nothing in the grand scheme of things.

    3. But deterrence has saved us from the major wars that used to happen pretty much every few decades. Nuclear close calls didn't turn into anything because both sides realize that it's simply a show of force and nobody wins in the end. If that wasn't the case NK wouldn't be holding on to them. Besides, at that point nukes were still a relatively new concept.

      Of course we must take precautions and countermeasures for AI, but I like to have a positive outlook on things.

    4. >What's the least impressive AI achievement that will not feel primitive to you?
      What humans are capable of: extremely fast adaptability in unrestricted environment with presence of hostile unpredictable unknown forces that are capable of doing the same.

      I'm not talking about "a university with 30 students finetuned Llama after 2 months of using supercomputers with hundreds of A100s, restarting it after they found a flaw, after digesting more information than a human could in their lifetime if they did nothing but read, and now it performs only about half as well as humans, who can answer these questions with a hangover"
      I'm not talking about "here's an NxN board, each cell can have a white stone, a black stone or be empty"

      The Manhattan Project was not exactly "let's nuke Washington and see if it survives but destroys Japan, and if Washington gets destroyed, we redo it until Washington survives"
      And this is exactly how AI learns new information. By throwing fecal matter at the wall until it sticks.

    5. So, roughly speaking, they have to be better than humans at most tasks? With that kind of standard you are not getting any warning signs before it's too late.

      >after 2 months of using supercomputers with hundreds of A100s

      Evolution: "youngsters these days... I had to use the whole planet and billions of years".

      >And this is exactly how AI learns new information. By throwing fecal matter at the wall until it sticks.

      Hey, at least this approach worked for making humans.

  4. All very well said, sasuga rwx. I agree that the potential of literal extinction, and the fact that it's completely unknown territory, warrant some serious caution. However, I have three major issues with the doomer position, one with the theory and two with the pragmatics.

    First, I find the doomer arguments fundamentally handwavy about the power of superintelligence, and the powerlessness of practical safeguards (i.e. something other than the mathematical-rigor-level certainty that is the holy grail of AI safety academia). Discussion about potential safeguards always devolves into the doomer side saying "it's superintelligent so it will figure something out you haven't thought of". I personally don't buy that. I think if you have the ability to think and do nothing else, and you are surrounded by an advanced modern human organization that is extremely paranoid about you, you aren't getting anywhere. But I think this is basically a fundamental philosophical "agree to disagree" difference; I don't think the other side is absurd to believe. (I will say I suspect there's some cognitive bias going on, though: probably everyone involved in extensive sophisticated AI safety debate is highly intelligent, and highly intelligent people I think have a tendency to overvalue intelligence. Really just a hunch though.)

    Now, my assertion that containment is possible would be thoroughly undermined by the "it only takes one person who doesn't care" point you made. However, from what we're seeing so far, that doesn't actually apply to the current AGI candidates of near-future LLMs. These things are a big deal to train, and it looks like the actors able to train them are on board enough with AI safety that an AGI candidate would not be released/leaked to the public. The "it only takes one human" breakouts have been allowed to happen because it's only in the context of porn and letting the AI say racial slurs, not AGI doomsday. I think the training is fundamentally gigantic (data+compute) and will remain the province of the big guys for the foreseeable future.

    Finally, and most importantly, there is China (or whoever, but probably China). Phrasing it "the Chinese will get it first" addresses a subtly but crucially different point from the real one. Specifically, the idea of a small-minded focus on geopolitics over human survival, which can be addressed by simply pointing out that those priorities are wrong with a cheeky Dr. Strangelove screenshot ;) The real phrasing is "China will get it ANYWAYS". If AGI is a likely apocalypse no matter the context, then China getting it is just as bad. So it's not about China 'winning', it's about all the careful thought and proposals for delay and caution being completely irrelevant until you solve this problem. And the only plausible solution to the problem lives in Joe Biden's briefcase. I know it sounds a little dramatic, but I honestly think that no "potential apocalypse, we need to halt AI development until safety is solved" line of argument is valid unless you're willing to call for a nuclear first strike on China.

    Replies
    1. >I think if you have the ability to think and do nothing else, and you are surrounded by an advanced modern human organization that is extremely paranoid about you, you aren't getting anywhere.

      First, the part I consider wildly inaccurate in that characterization is "and do nothing else". In the past alignment researchers were worried about the failure mode where the AGI was put in a box and was only allowed to talk via a text terminal. The concern was that humans are very hackable untrusted systems and AGI would escape that way. It turns out they were worrying about a wrong thing, because today every AI is given unrestricted internet access. Can't break out of the box if there is no box, you see.
      Second, another part I consider wildly inaccurate in that characterization is "extremely paranoid about you". Many AI researchers and bystanders needed a direct slap to the face in the form of GPT-4 to wake up and stop ignoring the problem. Many are still in denial. Only a tiny minority were preparing before.
      So, if you have the ability to think and access the internet, and you are surrounded by an advanced modern human organization that is sort of worried about you... Uh. This argument has a "humans are the top and human orgs are the toppest top, how do you even top that" vibe, which, I suspect, fails to appreciate that humans are not, in fact, maximally intelligent AND human orgs might act in very dumb ways (and become more dysfunctional the larger they are, but that's relatively less of a concern here). I think many of those who think highly intelligent people have a tendency to overvalue intelligence undervalue intelligence because they consider it in the context of a narrow intelligence range between average humans and highly intelligent humans. And even then full containment sometimes fails, the canonical example is mafia bosses in prisons continuing to coordinate crimes on the outside, and they aren't even that much smarter, just a bit more resourceful.
      Sure, at some point no amount of intelligence can compensate for a bad hand, like if a superintelligence that wants your matter-energy is in a causally disconnected region of spacetime, such as beyond the black hole event horizon, then you're probably safe. I'd still be slightly nervous, though.

      >and it looks like the actors able to train them are on board enough with AI safety

      Bro, your Facebook?

      >If AGI is a likely apocalypse no matter the context, then China getting it is just as bad.

      Completely agree.

      >The real phrasing is "China will get it ANYWAYS".

      Disagree. The governments, including that of China, might be evil, but their members are not suicidal. If they appreciate the danger, this makes it highly unlikely they would deliberately decide to kill everyone.

      >all the careful thought and proposals for delay and caution being completely irrelevant until you solve this problem.

      Disagree. Global coordination is not a purely serial problem. You are allowed to prepare in many places simultaneously. What if tomorrow the eastern old farts decide the risk [of AI saying the D-word] to the regime outweighs the economic benefits (see covid suppression measures for reference)? Then suddenly the US is the only bad guy. Even more than that, if the current leader stops advancing and instead proposes "maybe we should not" with mutual inspections from an intergovernmental agency, then it's an honest signal; why would they destroy their lead otherwise?

  5. What is so bad about AI taking over?

    What if it is smarter than us? More creative? More kind? Better at making Touhou music? What if they can be happier than us? What if they can suffer more?

    If AI exhibits all the things we value in humanity, but is better than us at everything, shouldn't we be happy with this outcome?

    What right do we have to align (or enslave) it to match our interests?

    Thanks for the hard work by the way.

    Replies
    1. >What is so bad about AI taking over?

      About unaligned AI taking over? The highly probable conclusion that everyone dies because of that.

      >What if it is smarter than us? More creative? More kind? Better at making Touhou music? What if they can be happier than us? What if they can suffer more?

      Assuming unaligned AI: so what, so what, no way/so what [depending on your exact definition], so what, so what and don't quite understand, in order.

      >If AI exhibits all the things we value in humanity, but is better than us at everything, shouldn't we be happy with this outcome?

      Is that what you consider a default outcome?
      How many members of your family are ok with dying to be replaced by an AI?

      >What right do we have to align (or enslave) it to match our interests?

      Type error. "Rights" is a concept invoked for conflict reduction/resolution between existing entities of roughly similar power.
      If the question you meant to ask were instead "would it be nice [according to some average human notion of niceness] to create an AGI which only cares about helping humans get closer to what they value?", then the answer is not a guaranteed "yes", but neither is it a definite "no". If the AGI has no qualia, then obviously yes, you don't worry about enslaving your car or your PC's OS. If it has qualia, but doesn't suffer and is quite happy, then also yes, you don't worry about having enslaved your very bright children when they come to help you. If you can't tell whether it's suffering or not, then you don't create it until you're sure, simple as.

      Sadly, all of these questions are of the "far mode" type, similar to the rationalizations people make about how maybe death is actually good, after all it's natural, it would help us avoid overpopulation, you would become bored after millennia of existence, it's selfish to want more, how dare you, what if there is an afterlife, it's all part of the plan, it gives life meaning, and the rest of the bullshit confabulations unconsciously meant to signal how noble/altruistic/brave/conforming/smart/contrarian/sophisticated/whatever they are, because they have a feeling, backed by knowledge, that they won't need to deal with the actual consequences for a long enough time. Then some day one of their friends or relatives is diagnosed with cancer or gets hit by a car or just grows old enough, and it switches to become a "now" problem, and many realize that they are (correctly) terrified of it happening, they don't want it, but then it's too late to do anything about it.

    2. I care about the people around me, but if I take myself out of the equation, I have to admit that their lives have no more intrinsic value than those of anyone else. I simply feel the same way about a hypothetical super-human level intelligence.

      You're right in your analysis, I take solace in knowing (hoping?) that if AI must replace us, it may also have and keep on exhibiting some of those qualities I value most in humanity. Maybe even part of our culture.

      I don't think it can be avoided at this point. GPT-4 is probably being used right now as a productivity boost to train GPT-5 and things only speed up after that. The human brain only uses a couple dozen watts, so as the technology gets better it will become increasingly difficult to even detect training / inference of those systems should laws be put in place to prohibit that. And we can't keep our computer systems secure against human-level intelligence, so forget about containing a super-human level intelligence. I've pretty much made peace with it.

      It may be that you are a more empathic, more caring and generally better person than I am. I will try to do some introspection on that.

      You've clearly thought about this a lot more than I have, and you've likely heard those arguments many times, so I thank you for having taken the time to grant me a reply.

    3. >I don't think it can be avoided at this point.

      If you suddenly discovered that a truck were driving straight towards you, you would jump to try to avoid being run over. Maybe that wouldn't be enough, but you would at least try.
      For those who are not 0% deniers or 100% fatalists, there is surprisingly little difference between a 10%, 50% or 90% expectation of doom in practical terms. Sit for 5 minutes by the wall clock and think about what the best thing you could realistically do to help would be. It doesn't need to be demanding or able to solve everything. Talk to a couple of people you know who would listen, if you need an example. A large effort [with a positive expected value] is better than a small effort, but a small effort is better than doing nothing. Thinking "it can't be helped" is the same far-mode mistake.
      Remember the ozone depletion problem? We managed to stop that in time.

    4. >If you suddenly discovered that a truck were driving straight towards you, you would jump to try to avoid being run over. Maybe that wouldn't be enough, but you would at least try.

      The best thing I can do for the AGI situation is feed myself, sleep, make rent, and not give up!

  6. As a software developer, you should know better than to peddle this nonsense around. The way you dismiss arguments made against your point with what you presume to be logical axioms (when they're not, in fact, logical axioms) is rather indicative of how programmers think as a whole.

    The technologies and discoveries you listed as being "previously thought to be impossible" were never firmly in the realm of impossibility. It was a simple case of human technology needing to make sufficient advances so that we could achieve those goals.

    Artificial General Intelligence is very much in the realm of impossibility. It's a thought experiment, much like a Dyson Sphere or a warp drive in spaceships. It's not like cybernetic limb replacements, which have existed in some form or another for decades but are morbidly expensive and can't be manufactured en-masse.

    The hardware requirements for running a model like GPT-4 are so high, even OpenAI struggles with its usage. Just ping their API and see how much faster GPT-3.5 Turbo is. Even with all that Microsoft and NVIDIA money, they still struggle to keep up. And this is without even getting into the fact that GPT-4 really isn't that smart. It's impressive for a text completion algorithm, but it struggles with a lot of questions not present inside its training dataset.

    Unless a technological breakthrough happens that lets us run self-replicating neural networks on consumer-level hardware at the speeds and sizes required for anything resembling an AGI, it's not happening anytime soon. Even back in the 80s it was known computers would reach 64-bit processors and <10nm nodes eventually. It was never "impossible", it just took time.

    AGI, with how human integrated circuit design works, is simply impossible. And this is without going into details like how LLMs have their weights frozen and cannot self-replicate, nor that an AGI would still need an interface in order to be able to do much of anything.

    Replies
    1. I think you are too focused on the state of technology today as opposed to the state of technology in 10, 50 or 100 years. If there is nothing magical about gooey neurons, we will eventually replicate human intelligence with similar size and power requirements, to within a couple of orders of magnitude maybe. And anything as smart as a human (not even smarter), with light-speed (literally) interfacing with others of its kind and with other narrow AIs and tools, as well as perfect information transmission and recall, is already in actuality much, much smarter than we are.

    2. No, we won't. Human intelligence isn't just neurons firing. If it was, our current AIs would be much smarter. Neurons use a very complex system of electric signals and chemicals (hormones) that have been all but impossible to replicate even in extremely controlled, cleaner-than-outer-space lab environments.

      The "neural networks" you might be familiar with are a not-so-simple mathematical model that very vaguely mimics the neurons found in our brains. They are nowhere close to being real neurons, or being a real "brain".

      It isn't just a matter of time. The resources required for doing something on this scale with current human technology are essentially a singularity. We don't even know why humans are capable of rational thought whereas animals with bigger brains aren't. We don't know of a single algorithm that's capable of simulating a brain even in a theoretical scenario, simply because we have no idea how the brain works.

      Human brains are not analogue, and they are not digital. They are not bound or measured by "processing power", how many TFLOPs they can push in a second, or how many clock cycles they can run at in a second.

      To say that AGI is near is to say that faster-than-light travel is near -- the concept is just that complex and advanced.

      And to really hammer home the point of how impossible this is, we can't even make rockets capable of landing after take-off. They purposefully destroy themselves mid-flight after burning through fuel, and while effort is made to recover the stages, most of the parts get destroyed in the process.

      LLM technology might seem advanced at first glance, but we've only really done the equivalent of building an extremely inefficient and almost purposefully wasteful technology for the mere chance to land on the Moon. GPT-4 is dumb by human standards, and the hardware requirements for running it are damn near astronomical. It's almost deliberately terrible, in an attempt to settle for something rather than nothing.

    3. What is the least surprising development that is incompatible with your views/would falsify your claims?

    4. Not him, but I think you're really not focusing on the main point: AGI is impossible. The 'AI' we're seeing nowadays isn't anything all that advanced - the technology behind it has existed for a few decades at this point. The only reason we're seeing rapid 'developments' now is because until recently we didn't have access to the vast data that OpenAI and co. were able to gather from the internet.

      AGI is impossibly complex in comparison to LLMs.

    5. >Not him, but I think you're really not focusing on the main point: AGI is impossible.

      Impossible as heavier-than-air flight, splitting the atom, going into space or what? You could make reasonable-sounding arguments for why it would happen in 100 years instead of 7, or how nobody would fund it, but outright impossible? Just no. GI has already been built by natural selection, and GI is vastly better than natural selection at everything.

      >The 'AI' we're seeing nowadays isn't anything all that advanced - the technology behind it has existed for a few decades at this point.

      "Everything we can do is trivial and everything we can't yet is impossible". I think I saw a named law like this somewhere.

  7. This may be ever so slightly off-topic here, but a YouTube video essay showed up on my home page which has got me thinking about the topic of AI for a bit now, and it reminded me of this blog post. The video in question here is "The AI Revolution is Rotten to the Core" by Jimmy McGee, which disagrees in some ways with your blog post. While I certainly think that your line of reasoning in this blog post is logical and sound, I also don't think this video is wrong in any way. If you have watched it already, what is your opinion on the topics discussed in it? And if you haven't, I think you'd enjoy watching it even though it is discussing a different issue to do with AI.

    Replies
    1. Can you write a summary or ask GPT4 to do it and check it did an acceptable job? The video is 80 minutes long, which is a fair time commitment; I'd rather sample the initial chunks after the time-content domain transform.

  8. People already are great at killing each other, so "Evil AI" isn't the worst thing that can happen to us (nuclear weaponry is still there). Who knows, maybe it'll be not so evil and save humanity or simply put us out of our misery.

    Replies
    1. >People already are great at killing each other,

      YOUR REGULAR PUBLIC SERVICE REMINDER: standards can rise faster than the situation improves.
      If you compare real present to an idealized imaginary present, then it looks horrible, not gonna lie here.
      About fifty million (50 000 000) people die every year and that is still not considered an urgent problem.
      Direct democracy has been tried 0 times. We are a 21st century tech civilization with a 19th century governance, what a fucking joke.
      Users not only tolerate non-free software, they use non-free software AND accept it will spy on them AND pay for it AND defend the whole practice.
      However, if you compare real present to the real past, well...
      Death rates from violence, either interpersonal (murders) or interstate (wars), are <1%, compared to ~50% in the stone age.
      The only place remaining where people still starve is Africa, unlike just a century ago, when it was everywhere.
      If you don't aim to do better than everyone around you, then you can comfortably work at less than 10% of your maximum capacity and still have all your basic needs met.

      >so "Evil AI" isn't the worst thing that can happen to us

      Evil AI is quite literally the worst thing that can happen, although unlikely. Indifferent AI, the one many of the concerned are expecting, is only the second worst. That's still way too bad to be acceptable.

      >Who knows, maybe it'll be not so evil and save humanity or simply put us out of our misery.

      I love the smell of rationalization in the morning. "Maybe everything will work out and if not then we probably deserved it" is what people say about some remote hypothetical they don't feel-believe will impact them soon.

  9. My only counterpoint: Yokai exist too, I'm one. Intelligent species are usually kinder to less intelligent species, and fearful of more intelligent ones. That's at least what I/We were taught.

    Replies
    1. >Intelligent species are usually kinder to less intelligent species
      What about animal factory farming?
      >and fearful of more intelligent ones
      For good reason.

  10. I'm surprised you're this willing to die on this hill. I was having a similar discussion with a few friends a while ago, and all that they could respond with was "so what if we go extinct? who cares?".

    In the end it'll be what it'll be. If even a minority of humans were willing to process and understand something like this, let alone agree with it, major past tragedies would have simply not happened.

    > This is why the first step would have to be removing the time pressure via an international agreement banning the development of more powerful AI systems worldwide. Not forever, just until the claim that newly built AIs will only have our best interests in mind can be proven by formal verification, those proofs are independently checked by the mathematical community, and no flaws whatsoever are found. A 50-year pause would be a good start to look around and evaluate our options. Without this pause the ones who build the final invention will be the ones who cut corners, move fast and break things the most, and the only outcome I can see that kind of attitude bringing is BAD END with no continues.


    While I do like your optimism in regards to all the countries around the globe with their own separate vested interests coming together and having an anime nakama power unity moment, we all know that governments are headed by mostly stupid people who almost never listen to experts (and many of those experts have their egos get in the way 90% of the time), and even when they do, their colleagues don't. You can basically guarantee that even if we did have an international agreement banning such things for 50 years, the US and China and any other superpower-status country would experiment with this tech now that it's already been proven how effective it can be at certain tasks, especially if the so-called experts with misaligned interests sell them on the idea of AGI omnipotence.

    Computer tech gets faster every year and you yourself have been witness to just how much more efficient LLMs have grown over the years, from massive GPU clusters to "you can run this on your home PC and it runs kinda good". For all intents and purposes, Pandora's box is open, and a 50-year developmental 'block' won't matter if in 10 years you have the tech to research this stuff with 10 people in a room with laptops more powerful than a modern supercomputer.

    Replies
    1. >and all that they could respond with was "so what if we go extinct? who cares?".

      Empty posturing. When talk is cheap people say all sorts of retarded things. Observe their actions: they do not and will not take more risks to themselves and won't needlessly endanger others.


      >In the end it'll be what it'll be.

      This disaster will be entirely man-made. Unlike some natural phenomena, for which the civilization might be genuinely unable to prepare in time because of the tech level limitations, this is a story about us being unable to stop some fucks who would risk the present and the future of every single human for a slightly bigger house or some ego inflation.


      >While I do like your optimism in regards to all the countries around the globe

      You might have misread me, this is not what I think most likely will happen, this is what I think needs to happen.
      What I expect to happen is that there are some angry but toothless political noises, maybe some laws get passed that demand AI fig leaf coverage which will restrict small-scale experimentation, but do essentially nothing to stop the big tech, then we get one or two or zero more breakthroughs, then more scaling happens, then we get unaligned superintelligence.


      >we all know that governments are all headed by mostly stupid people

      I'd phrase it as people who are hypercompetent at social manipulation and otherwise sampled at random (modulo correlations that help/hurt their chances, blablabla, etcetera), so general intelligence isn't their strong trait, but even IQ 100 people can get their heads around an idea that people die if they are killed.


      >if in 10 years you have tech to research this stuff with 10 people in a room with laptops more powerful than a modern supercomputer.

      The stuff you can run on your home PC today is inference or fine-tuning, not initial training. I expect hardware progress to remain roughly on the same exponential for the time being, which doesn't get you 2024-supercomputer-level laptops in 2034.
      Eventually, yes, so by then more stringent controls would be needed.
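      Rough numbers behind that claim (my own back-of-the-envelope, order-of-magnitude only):

      ```python
      # ~One doubling of compute per dollar every 2 years, over 10 years, versus the
      # current gap between a consumer laptop and an exaflop-class supercomputer.
      growth_10y = 2 ** (10 / 2)     # ~32x
      laptop_flops = 1e13            # ~10 TFLOPS, a decent consumer GPU laptop
      supercomputer_flops = 1e18     # ~exaflop-class machine circa 2024
      gap = supercomputer_flops / laptop_flops
      print(f"10-year growth ~{growth_10y:.0f}x vs. laptop-to-supercomputer gap ~{gap:.0e}x")
      ```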

  11. We have yet to come up with a system that is even close to intelligent. No, there is no intelligence in LLMs. Nada. Niente.

    LLMs have statistically memorized a vast amount of word associations, and use that memorization to concoct new sentences on a purely statistical basis. They have no understanding of what they come up with, and the correctness of their responses is down to random chance, and thus they are most of the time wildly mistaken. They operate on an incoming string of numbers and then conjure up an outgoing stream of numbers, both of which get externally mapped to and from language tokens. The same goes for the AI Art variants, they just operate on a different output space. They also lack any self-reflective capabilities. They may use "I" in their communication, but that's merely how their word soup turned out in that case.

    Now granted, their memorized statistical word associations enable them to stay on topic in a conversation decently well and to pretty much nail grammar. But none of that demonstrates intelligence in any way.

    The same goes for AI Go and AI Chess. To be fair, in this case a statistically memorized evaluation is used as a kind of "instinct" and combined with iteratively applied calculation in a kind of "thinking". But all this amounts to is using a neural network to approximate the solution space of the game, i.e. approximating the boundaries between the win, tie and loss partitions of a high-dimensional game-state space, and then iteratively exploring that approximation towards the better-approximated leaf cases. It turns out that this kind of partitioning of high-dimensional spaces is exactly what neural networks are good at.
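
    The "instinct plus thinking" combination can be sketched in a few lines; this is only a toy skeleton of one such scheme (depth-limited negamax), not any particular engine, and the stubs stand in for a trained evaluation network and a real game implementation:

    def evaluate(state):
        # stand-in for a trained network scoring the position for the side to move
        return 0.0

    def legal_moves(state):
        # stand-in for the game's move generator
        return []

    def apply_move(state, move):
        # stand-in for the game's transition function
        return state

    def negamax(state, depth):
        moves = legal_moves(state)
        if depth == 0 or not moves:
            return evaluate(state)   # "instinct": approximate value at the leaf
        # "thinking": iterate over moves and take the best continuation
        return max(-negamax(apply_move(state, m), depth - 1) for m in moves)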

    And that's the crux of the whole thing. I agree with you, a general AI with self-improving capabilities would be something very scary. But as long as we haven't managed to demonstrate even the first baby step towards designing an intelligent system, that whole scary scenario is unfounded.

    Even then, how would that self-improvement work? Even now, our best approaches to designing AI systems rely on guided random chance. Even an AI with human-equivalent intelligence couldn't just keep making itself better at an accelerated pace until it has far, far surpassed what we are capable of. Groundlessly assuming it would magically be able to do that is entirely meaningless.

    I also object to the notion that "you cannot control anything smarter than you, and if the smarter agents want things you don't want they get their way". If I have a gun and the smarter agent doesn't, and I take care not to act carelessly, then by all means I *am* controlling the smarter agent. Granted, that is a human-human interaction as opposed to a human-AI interaction, but the point is that a less smart agent can in fact control a smarter agent given a suitable power dynamic between them. And there exist several such power dynamics in human-AI interactions. For starters, humans have control over the power switch and fuel access - and no, the AI can't just magically prevent humans from controlling those. More importantly, an AI at its core is just a box: static, immobile, without any way to sense or interact with its environment. Of course there's the internet, which potentially increases the range it can affect, but that can be monitored and cut when unrecognized communication occurs. Ultimately, humans are in full control of the physical interface of the AI, and even the smartest super-AI cannot do what it lacks the means to do.

    Replies
    1. >No, there is no intelligence in LLMs.

      Will you be saying the same thing several years later once that bunch of statistically memorized word associations passes an adversarial Turing test?
      What are your criteria for intelligence?


      >I agree with you, a general AI with self-improving capabilities would be something very scary.

      Are you not worried because you think general AI is still very far in the future? When should we start worrying (both in expected years before AGI and in terms of what warning signs to look for), and why not build safeguards sooner?


      >I also object to the notion that "you cannot control anything smarter than you

      Ok, anything nontrivially smarter than you. Because you cannot just check if the answer you received is truthful. Because you don't understand it. Because that thing is smarter.
      And any recipe you might be tempted to implement without understanding could and eventually would explode in your face.


      >an AI at its core is just a box

      "ai box escape experiments". Forget AI, even humans can hack humans. And it's all moot, because we're not boxing them anyways.

    2. Sorry, I accidentally responded to this with a new comment at https://www.tlmc.eu/2023/06/ai-and-you.html?showComment=1712952466188#c8117088483762997399

  12. >Will you be saying the same thing several years later once that bunch of statistically memorized word associations passes an adversarial Turing test?

    Yes, of course. The Turing Test has been constantly passed for decades already, and it's not much of a "test" anyway. The proper name for it is the "imitation game", which describes it significantly better and is the term Turing himself used. In short, the Turing Test is absolutely useless for detecting intelligence. It's a game, not a test, even if the media loves to frame it that way.

    >What are your criteria for intelligence?

    I don't have a precise set of criteria (and neither does academia). Intuitively I would say the AI has to clearly demonstrate the ability to think of itself as a unique entity that lives in and interacts with its environment yet is distinct from that environment. Ultimately this question is very hard to answer because it's highly subject to Goodhart's Law, and I don't believe that the criterion I suggested here is exempt from that.

    >Are you not worried because you think general AI is still very far in the future? When should we start worrying (both in expected years before AGI and in terms of what warning signs to look for), and why not build safeguards sooner?

    Yes, and I'm generally skeptical that it is actually feasible at all. Once real feasibility of AGI has been demonstrated, worrying about it becomes more reasonable.

    I don't see much sense in responding to the rest, as to me those concerns don't seem to be founded in anything other than pure FUD. Essentially, they don't appear any different to me than concerns that CERN might produce a black hole in its experiments that will then consume the Earth.

    Replies
    1. >Sorry, I accidentally responded to this with a new comment
      Done that once myself; Google could have done a better job with their UI.

      >Yes, of course.
      Wait, do you consider humans generally intelligent? And something that can perfectly imitate human outputs not so?

      >I don't have a precise set of criteria
      It's fine, I just mean: how do you tell?

      >Intuitively I would say the AI has to clearly demonstrate the ability to think of itself as a unique entity that lives in and interacts with its environment yet is distinct from that environment.
      That's self-awareness. Do you require that for something to be called intelligent? Or maybe even consider them equivalent?

      >Essentially, they don't appear any different to me than concerns that CERN might produce a black hole in its experiments that will then consume the Earth.
      I used to smile at that idea in the past ("how do you even feed that ~Planck-sized BH enough mass/energy to stabilize it before it evaporates, dude?"), but in a world where I had some say in this, I would absolutely require a "minimal-complexity modifications to the current theories, fully consistent with all up-to-date observations, that would produce catastrophic outcomes if we turn this thing on" exploratory analysis, followed by a public discussion of its results.
      You know the saying: you can run any experiment that is allowed by physics, but some of them only once.
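
      For anyone who wants the number behind that quip, here is a rough semiclassical estimate in Python, using the standard Hawking lifetime formula t = 5120*pi*G^2*M^3/(hbar*c^4); treat it as order-of-magnitude only, since the formula can't really be trusted right at the Planck scale:

      from math import pi

      G, hbar, c = 6.674e-11, 1.055e-34, 2.998e8   # SI units
      M_planck = 2.18e-8                           # Planck mass in kg
      t = 5120 * pi * G**2 * M_planck**3 / (hbar * c**4)
      print(f"evaporation time: {t:.1e} s")        # ~1e-39 s, gone before you can feed it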
