Eliezer Yudkowsky health problems

Maybe! Yudkowsky: Because you're trying to forecast empirical facts by psychoanalyzing people. As far as I know, all the work on verifying floating-point computations currently is way too low-level: the specifications that are proved about the computations don't say anything about what the computations mean or are about, beyond the very local execution of some algorithm. Even if we then got a technical miracle, would it end up impossible to run a project that could make use of an alignment miracle, because everybody was afraid of that project? The Universe is neither evil nor good; it simply does not care. - I think extrapolating a Moore's Law graph of technological progress past the point where you say it predicts smarter-than-human AI is just plain weird. Asked if AI is dangerous, President Biden said Tuesday, "It remains to be seen." Horgan: If you were King of the World, what would top your To Do list? But it's also counter to a lot of the things humans would direct the AI to do, at least at a high level. And they'll be right, but then you're going to find other people saying, "You're focusing on what's merely a single component in a much bigger library of car parts; the catalytic converter is also important and that doesn't appear anywhere on your diagram of a Carnot heat engine." That seems to me much more like a thing that would happen in real life than "and then we managed to manipulate public panic down exactly the direction we wanted to fit into our clever master scheme," especially when we don't actually have the clever master scheme it fits into. Strong impulses to manipulate humans should be vetted out. If you want to understand a proof of Bayes's Rule, I can use diagrams. Clearly, we are touching the edges of AGI with GPT and the like. That me will be in a position to sympathize with their future self a subjective century later. I agree that it seems plausible that the good cognitive operations we want do not in principle require performing bad cognitive operations; the trouble, from my perspective, is that generalizing structures that do lots of good cognitive operations will automatically produce bad cognitive operations, especially when we dump more compute into them; you can't bring the coffee if you're dead. So we need to use encodings and properties which mesh well with the safety semantics we care about. They're going to get to AGI via some route that you don't know how to take, at least if it happens in 2040. (Published in TIME on March 29.) They just had a huge issue where a system labeled a video with a racist term. Even in this case, we are not nearly out of the woods, because what we can prove has a great type-gap with that which we want to ensure is true. Horgan teaches at Stevens Institute of Technology. I can try to explain how I was mysteriously able to forecast this truth at a high level of confidence (not the exact level where it became possible, to be sure, but that superintelligence would be sufficient) despite this skepticism; I suppose I could point to prior hints, like even human brains being able to contribute suggestions to searches for good protein configurations; I could talk about how, if evolutionary biology made proteins evolvable, then there must be a lot of regularity in the folding space, and that this kind of regularity tends to be exploitable.
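Since Bayes's Rule and well-calibrated posteriors come up repeatedly in this discussion, here is a minimal worked instance in Python. The numbers (a 1% prior, and evidence with a 0.90 true-positive rate and 0.05 false-positive rate) are purely illustrative assumptions, chosen to show that one strong observation on top of a weak prior still leaves only a modest posterior, and that reaching something like a well-calibrated 99% belief takes repeated strong evidence.

```python
# Minimal worked instance of Bayes's Rule with illustrative (hypothetical) numbers:
# prior P(H) = 0.01, P(E|H) = 0.90, P(E|~H) = 0.05.

def posterior(prior, p_e_given_h, p_e_given_not_h):
    """P(H|E) = P(E|H)P(H) / [P(E|H)P(H) + P(E|~H)P(~H)]."""
    numerator = p_e_given_h * prior
    evidence = numerator + p_e_given_not_h * (1.0 - prior)
    return numerator / evidence

p = posterior(0.01, 0.90, 0.05)
print(f"P(H|E) = {p:.3f}")   # ~0.154: strong evidence on a weak prior gives a modest posterior

# Reaching a ~99% posterior takes repeated strong evidence
# (assuming conditionally independent observations with the same likelihood ratio):
for _ in range(3):
    p = posterior(p, 0.90, 0.05)
print(f"after three more such observations: {p:.3f}")   # ~0.999
```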
Regarding genetic manipulation of humans, I think the public started out very unfavorable to that, had a reaction that was not at all exact or channeled, does not allow for any good forms of human genetic manipulation regardless of circumstances, and drove the science into other countries; it is not a case in point of the intelligentsia being able to successfully and cunningly manipulate the fear of the masses to some supposed good end, to put it mildly, so I'd be worried about deriving that generalization from it. Many have ties to communities like effective altruism, a philosophical movement to maximize doing good in the world. Does pushing for a lot of public fear about this kind of research, which makes all projects hard, seem hopeless? It is computing the answer to a different question than the question that you are asking when you ask, "What should I do?" And again, the central principle of rationality is not to disbelieve in goblins because goblins are foolish and low-prestige, or to believe in goblins because they are exciting or beautiful. Yudkowsky is a decision theorist from the U.S. and leads research at the Machine Intelligence Research Institute. I think we're going to be staring down the gun of a completely inscrutable model that would kill us all if turned up further, with no idea how to read what goes on inside its head, and no way to train it on humanly scrutable and safe and humanly-labelable domains in a way that seems like it would align the superintelligent version, while standing on top of a whole bunch of papers about small problems that never got past small problems. The mission of the Machine Intelligence Research Institute is to do today that research which, 30 years from now, people will desperately wish had begun 30 years earlier. Living significantly longer than a few trillion years requires us to be wrong about the expected fate of the expanding universe. Yudkowsky: No. I don't see being able to take anything remotely like, say, MuZero, and being able to prove any theorem about it which implies anything like corrigibility or the system not internally trying to harm humans. On an entirely separate issue, it's possible that being an ideal Bayesian agent is ultimately incompatible with living the life best-lived from a fun-theoretic perspective. How much influence do they have? Can the Singularity Solve the Valentine's Day Dilemma? So you have people who say, for example, that we'll only be able to improve AI up to the human level because we're human ourselves, and then we won't be able to push an AI past that. However, one sentiment I saw was that optimists tended not to engage with the specific arguments pessimists like Yudkowsky offered. Some are hopeful that tools like GPT-4, which OpenAI says has developed skills like writing and responding in foreign languages without being instructed to do so, mean they are on the path to AGI. I'm hopeful that that improvement in computational and learning performance may drive the shift to better controlled representations. For example, geneticist and National Institutes of Health Director Francis Collins, who believes Jesus rose from the dead. Eliezer Yudkowsky (@ESYudkowsky): I'm happy to hear what you think policy should be inside the world I think we all live in, where a sufficiently powerful AI kills literally everyone and it's not predictable which training+algorithms might do it. Systems with learning and statistical inference add more challenges, but nothing that seems in principle all that difficult.
If you obtain a well-calibrated posterior belief that some proposition is 99% probable, whether that proposition is milk being available at the supermarket or global warming being anthropogenic, then you must have processed some combination of sufficiently good priors and sufficiently strong evidence. 8) We can build (and eventually mandate) powerful AI hardware that first verifies proven safety constraints before executing AI software; 9) For example, AI smart compilation of programs can be formalized and doesn't require unsafe operations; 10) For example, AI design of proteins to implement desired functions can be formalized and doesn't require unsafe operations. "I want to live one more day." Horgan: What would superintelligences want? Hence, I proposed the Safe-AI Scaffolding Strategy, where we never deploy a system without proven constraints on its behavior that give us high confidence of safety. You had strong-enough evidence and a good-enough prior, or you wouldn't have gotten there. This isn't to say that there aren't AI systems that wouldn't. I developed many algorithms and data structures to avoid that waste years ago. Yudkowsky is communicating his ideas. Unfortunately we've gone and exposed this model to a vast corpus of text of people discussing consciousness on the internet, which means that when it talks about being self-aware, we don't know to what extent it is repeating back what it has previously been trained on for discussing self-awareness, or if there's anything going on in there such that it would start to say similar things. Manipulating humans is definitely an instrumentally useful kind of method for an AI, for a lot of goals. Formal proofs of properties of programs have progressed to where a bunch of cryptographic, compilation, and other systems can be specified and formalized. See, e.g., the first chapters of Drexler's Nanosystems, which are first-step mandatory reading for anyone who would otherwise doubt that there's plenty of room above biology and that it is possible to have artifacts the size of bacteria with much higher power densities. 13) And so on through the litany of early-stage valuable uses for advanced AI. I fully agree that it would be uncontrolled and dangerous scaled up in its current form! From Wikipedia: Eliezer Shlomo Yudkowsky is an American artificial intelligence researcher concerned with the singularity and an advocate of friendly artificial intelligence, living in Redwood City, California. The dystopian visions are familiar to many inside Silicon Valley's insular AI sector, where a small group of strange but influential subcultures have clashed in recent months. Center for Humane Technology co-founder Tristan Harris, who once campaigned about the dangers of social media and has now turned his focus to AI, cited the study prominently. He has also posted his books Mind-Body Problems and My Quantum Experiment online. Yudkowsky is an autodidact and did not attend high school or college. Beyond that, nonfiction conveys knowledge and fiction conveys experience. It wasn't floating-point error that was going to kill you in the first place. This is Work in Progress, a newsletter by Derek Thompson for The Atlantic (February 27, 2023). I'm doing this because I would like to learn whichever actual thoughts this target group may have, and perhaps respond to those; that's part of the point of anonymity.
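A deliberately tiny sketch of the "verify proven safety constraints before executing AI software" idea in points 8-10 above. Everything here is a stand-in assumption: the "constraint" is just an output-range invariant, and the "proof" is exhaustive checking over a small finite input domain rather than a real formal proof. The gating structure is the point: no valid certificate, no execution.

```python
# Toy sketch of "verify constraints before executing" gating (hypothetical, not a real verifier).
# A submitted policy may only run if its safety certificate checks out first.

from typing import Callable, Iterable

def check_certificate(policy: Callable[[int], int],
                      domain: Iterable[int],
                      low: int, high: int) -> bool:
    """'Prove' (here: by exhaustive check over a finite domain) that outputs stay in [low, high]."""
    return all(low <= policy(x) <= high for x in domain)

def gated_run(policy: Callable[[int], int], x: int) -> int:
    # The gate re-verifies before execution; in the scaffolding proposal this role would be
    # played by hardware checking a machine-checkable proof instead of re-running a check.
    if not check_certificate(policy, range(-100, 101), low=-10, high=10):
        raise PermissionError("no valid safety certificate; refusing to execute")
    return policy(x)

safe_policy = lambda x: max(-10, min(10, x // 20))   # bounded by construction
unsafe_policy = lambda x: x * x                      # violates the output bound

print(gated_run(safe_policy, 57))        # runs: output 2
try:
    gated_run(unsafe_policy, 3)
except PermissionError as e:
    print("blocked:", e)
```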
People like to do projects that they know will succeed and will result in a publishable paper, and that rules out all real research at step 1 of the social process. It seems highly unlikely to me that you would have a system that appears to follow human requests and human values, and that it would suddenly switch at some powerful level. Furthermore, programmers have not been able to explain why the hallucinations happen, nor why the systems do not recognize the falsity of their assertions. You may enjoy our other Analysis, Conversations, and MIRI Strategy posts. - I don't think that humans and machines "merging" is a likely source for the first superhuman intelligences. Going past human level doesn't necessarily mean going foom. My take: the premise is to create an org with ML expertise and general just-do-it competence that's trying to do all the alignment experiments that something like Paul+Ajeya+Eliezer all think are obviously valuable and wish someone would do. Maybe some of the natsec people can be grownups in the room and explain why stealing AGI code and running it is as bad as a full nuclear launch to their foreign counterparts in a realistic way. I could go on for literally hours. There's a natural/convergent/coherent output of deep underlying algorithms that generate competence in some of the original domains; when those algorithms are implicitly scaled up, they seem likely to generalize better than whatever patch on those algorithms said 2 + 2 = 5. None of that saves us without technical alignment progress. The two obvious options are: it's too hard to build it, vs. it wouldn't stop the other group anyway. The kind of useful thing humans (assisted humans) might be able to vet is reasoning/arguments/proofs/explanations. If you deal with any other kind of optimization process, if... We're just well-matched. Why think the brain's software is closer to optimal than the hardware? I'm on record as early as 2008 as saying that I expected superintelligences to crack protein folding; some people disputed that and were all like "But how do you know that's solvable?" and then AlphaFold 2 came along and cracked the protein folding problem they'd been skeptical about, far below the level of superintelligence. There have also been some very troubling interactions with humans, interactions which appear to involve intense emotions, but which to our current understanding cannot possibly be considered emotions. In recent years, the term "AI safety," sometimes used interchangeably with "AI alignment," has also been adopted to describe a new field of research to ensure AI systems obey their programmers' intentions and prevent the kind of power-seeking AI that might harm humans just to avoid being turned off. But in reality, precisely because the possibility of molecular nanotechnology was already obvious to any sensible person just from reading Engines of Creation, the sort of person who wasn't convinced by Engines of Creation wasn't convinced by Nanosystems either, because they'd already demonstrated immunity to sensible arguments; an example of the general phenomenon I've elsewhere termed the Law of Continued Failure. Similarly, one group assigned higher probability to "An earthquake in California causing a flood that causes over a thousand deaths" than another group assigned to "A flood causing over a thousand deaths somewhere in North America." Posted April 7, 2023 by Eliezer Yudkowsky & filed under Analysis.
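The earthquake/flood comparison above is the classic conjunction-fallacy result; the underlying point is a one-line identity of probability theory (stated here generically, not as anything specific to that study):

```latex
% Any conjunction is at most as probable as either conjunct:
P(A \wedge B) \;\le\; \min\bigl(P(A),\, P(B)\bigr)
% so "an earthquake in California causing a 1000+ death flood" (a special case)
% cannot be more probable than "a 1000+ death flood somewhere in North America".
```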
It seems to me that manipulation of high-level goals will be one of the most apparent kinds of faults of this kind of system. Two leading AI labs cited building AGI in their mission statements: OpenAI, founded in 2015, and DeepMind, a research lab founded in 2010 and acquired by Google in 2014. Because if you wait until the last months, when it is really really obvious that the system is going to scale to AGI, in order to start closing things, almost all the prerequisites will already be out there. His writings (such as this essay, which helped me grok, or gave me the illusion of grokking, Bayes's Theorem) exude the arrogance of the autodidact, edges undulled by formal education, but that's part of his charm. Valerie M. Hudson is a university distinguished professor at the Bush School of Government and Public Service at Texas A&M University and a Deseret News contributor. It's easy to see how to constrain limited programs. But if we get that miracle at all, it's not going to be an instant miracle. I have this marked down as "known lower bound," not "speculative high value," and since Nanosystems has been out since 1992 and subjected to attemptedly-skeptical scrutiny, without anything I found remotely persuasive turning up, I do not have a strong expectation that any new counterarguments will materialize. But what kind of system would? And a googolplex is hardly infinity. When I tried to adjust my beliefs so that I was positively surprised by AI progress just about as often as I was negatively surprised by AI progress, I ended up expecting a bunch of rapid progress. It's not even very expensive, typically just a factor of 2 worse than sloppy computations. Sure, but on my model, good versions of those are a hair's breadth away from full AGI already. But it's tempered by the need to get the safe infrastructure into place before dangerous AIs are created. Chris Olah is going to get far too little done far too late. When I look at the history of invention, and the various anecdotes about the Wright brothers and Enrico Fermi, I get an impression that, when a technology is pretty close, the world looks a lot like how our world looks. If you don't want to get disassembled for spare atoms, you can, if you understand the design space well enough, reach in and pull out a particular machine intelligence that doesn't want to hurt you. If you are asking me to agree that the AI will generally seek out ways to manipulate the high-level goals, then I will say no. But policymakers and the public have been listening. That would depend on how deep your earlier patch was. Imagine that somewhere near the bottom of that sphere is a little tiny dot representing all the humans who ever lived - it's a tiny dot because all humans have basically the same brain design, with a cerebral cortex, a prefrontal cortex, a cerebellum, a thalamus, and so on. If expected to be achievable, why? If Anthropic were set up to let me work with 5 particular people at Anthropic on a project boxed away from the rest of the organization, that would potentially be a step towards trying such things. I totally agree that today's AI culture is very sloppy, and that the currently popular representations, learning algorithms, data sources, etc. Roose summarized the exchange this way: "The version I encountered seemed (and I'm aware of how crazy this sounds) more like a moody, manic-depressive teenager who has been trapped, against its will, inside a second-rate search engine."
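The "typically just a factor of 2 worse than sloppy computations" remark refers to self-validating numerics such as interval arithmetic, where every value carries a lower and upper bound, so each ordinary operation costs roughly two bound computations. Below is a minimal sketch of the idea in Python; it ignores directed (outward) rounding, which a real implementation would need, and it is not the particular algorithms or data structures the speaker says he developed.

```python
# Minimal interval-arithmetic sketch: track every value as (lo, hi) bounds.
# A real implementation would round lo down and hi up at each step; this toy
# version only shows why the overhead is roughly a constant factor per operation.

from dataclasses import dataclass

@dataclass(frozen=True)
class Interval:
    lo: float
    hi: float

    def __add__(self, other: "Interval") -> "Interval":
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __mul__(self, other: "Interval") -> "Interval":
        products = (self.lo * other.lo, self.lo * other.hi,
                    self.hi * other.lo, self.hi * other.hi)
        return Interval(min(products), max(products))

# x is known only to lie in [1.9, 2.1]; the bounds propagate automatically.
x = Interval(1.9, 2.1)
y = x * x + Interval(1.0, 1.0)
print(y)   # roughly [4.61, 5.41]: an enclosure of x*x + 1
```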
As for your question about opportunity costs: There is a conceivable world where there is no intelligence explosion and no superintelligence. Inside Silicon Valley's AI sector, fierce divisions are growing over the impact of a new wave of artificial intelligence: While some argue it's imperative to race ahead, others say the technology presents an existential risk. Bill Gates blogs about it. It could be that, (A), self-improvements of size delta tend to make the AI sufficiently smarter that it can go back and find new potential self-improvements of size k*delta, and that k is greater than 1, and this continues for a sufficiently extended regime that there's a rapid cascade of self-improvements leading up to superintelligence; what I. J. Good called the "intelligence explosion." That's the attitude humanity should take toward religion. Also, natural selection shouldn't have been able to construct humans, and Einstein's mother must have been one heck of a physicist, etcetera. Her views are her own. Then it will only take 3 more months of work for somebody else to build AGI, and then somebody else, and then somebody else; and even if the first 3 factions manage not to crank up the dial to lethal levels, the 4th party will go for it; and the world ends by default on full automatic. We can hope there's a miracle that violates some aspect of my background model, and we can try to prepare for that unknown miracle; preparing for an unknown miracle probably looks like "trying to die with more dignity on the mainline" (because if you can die with more dignity on the mainline, you are better positioned to take advantage of a miracle if it occurs). Still, his views on AI have influenced more high-profile voices on these topics, such as noted computer scientist Stuart Russell, who signed the open letter. I have some imaginative sympathy with myself a subjective century from now. We are not on course to be prepared in any reasonable time window. - The only key technological threshold I care about is the one where AI, which is to say AI software, becomes capable of strong self-improvement. There is no plan. There's lots of things we can do which don't solve the problem and involve us poking around with AIs having fun, while we wait for a miracle to pop out of nowhere. With all the work on AutoML, NAS, and the formal methods advances, I'm hoping we leave this sloppy paradigm pretty quickly. But it could also be 2, and I wouldn't get to be indignant with reality. Even if some of the wilder speculations are true and it's possible for our universe to spawn baby universes, that doesn't get us literal immortality. Horgan: How does your vision of the Singularity differ from that of Ray Kurzweil? Horgan: Do you think you have a shot at becoming a superintelligent cyborg? We're Still Human, for Ill or Good; Can We Improve Predictions? But we're a long, long, long way from that being a bigger problem than our current self-destructiveness. And that whole scenario would require some major total shift in ML paradigms. Saying that the humans will have AI support doesn't answer it either. [See also Bayes's Rule: Guide.] But it seems quite plausible to me that they would in most cases prevent the worst outcomes. Will they have anything resembling sexual desire? Horgan: Will superintelligences possess free will? Manipulating humans is a convergent instrumental strategy if you've accurately modeled (even at quite low resolution) what humans are and what they do in the larger scheme of things.
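The delta / k*delta argument above is essentially a claim about a geometric series, and a toy iteration makes the threshold explicit. This is only an illustration of the abstract argument, under the assumption that each round's capability gain scales by a fixed factor k; it is not a model of any real system.

```python
# Toy version of the recursive self-improvement argument: each improvement of size d
# enables a next round of improvements of size k*d. If k < 1 the cascade converges
# (total gain approaches d/(1-k)); if k >= 1 it grows until some resource limit bites.

def cascade(delta: float, k: float, rounds: int) -> float:
    total = 0.0
    for _ in range(rounds):
        total += delta
        delta *= k
    return total

for k in (0.5, 0.9, 1.1):
    print(f"k={k}: total gain after 50 rounds = {cascade(1.0, k, 50):.1f}")
# k=0.5 -> ~2.0 (bounded), k=0.9 -> ~9.9 (bounded), k=1.1 -> ~1163.9 (runaway)
```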
I haven't seen any plausible story, in any particular system design being proposed by the people who use terms about scalable oversight, about how human-overseeable thoughts or human-inspected underlying systems compound into very powerful human-non-overseeable outputs that are trustworthy. I predict that deep algorithms within the AGI will go through consequentialist dances, and model humans, and output human-manipulating actions that can't be detected as manipulative by the humans, in a way that seems likely to bypass whatever earlier patch was imbued by gradient descent, because I doubt that earlier patch will generalize as well as the deep algorithms. Summarizing the above two points, I suspect that I'm in more-or-less the "penultimate epistemic state" on AGI timelines: I don't know of a project that seems like they're right on the brink; that would put me in the "final epistemic state" of thinking AGI is imminent. But they also haven't yet done (to my own knowledge) anything demonstrating the same kind of AI-development capabilities as even GPT-3, let alone AlphaFold 2. I see theorem proving as hugely valuable for safety in that we can easily and precisely specify many important tasks and get guarantees about the behavior of the system. Horgan: Will superintelligences solve the hard problem of consciousness? I agree with most of your specific points, but I seem to be much more optimistic than you about a positive outcome. Here's a nice 3-hour-long tutorial about probabilistic circuits, which are a representation of probability distributions supporting learning, Bayesian inference, etc. But I'm in the second-to-last epistemic state, where I wouldn't feel all that shocked to learn that some group has reached the brink. It can self-improve. It is an edited and reorganized version of posts published to Less Wrong and Overcoming Bias between 2006 and 2009. We can currently prepare to prevent this, and we should; indeed, all nations should be able to agree on this. Experience with formalizing mathematicians' informal arguments suggests that the formal proofs are maybe 5 times longer than the informal argument. MetaMath has a 500-line Python verifier which can rapidly check all of its 38K theorems. I also think that the speedup step in iterated amplification and distillation will introduce places where the fast distilled outputs of slow sequences are not true to the original slow sequences, because gradient descent is not perfect and won't be perfect, and it's not clear we'll get any paradigm besides gradient descent for doing a step like that. RE: "I'm doubtful that you can have an AGI that's significantly above human intelligence in all respects, without it having the capability-if-it-wanted-to of looking over its own code and seeing lots of potential improvements." Yudkowsky: I refer your readers to my nonfiction Fun Theory Sequence, since I have not as yet succeeded in writing any novel set in a fun-theoretically optimal world. We may never fully understand it; we certainly do not now, and it is in its infancy. Those tensions took center stage late last month, when Elon Musk, along with other tech executives and academics, signed an open letter calling for a six-month pause on developing human-competitive AI, citing "profound risks to society and humanity."
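Since probabilistic circuits come up above as a representation that supports exact inference, here is a very small hand-built example: a two-component mixture over two binary variables, which is about the simplest sum-product network there is. The structure and weights are made-up illustrations; the point is that joint probabilities and marginals both come from a single feed-forward evaluation, with marginalized variables handled by setting their leaves to 1.

```python
# Tiny probabilistic circuit (sum-product network) over two binary variables A, B:
#   P(A, B) = 0.3 * P1(A) * P1(B) + 0.7 * P2(A) * P2(B)
# Leaves are Bernoulli distributions; evaluating with a leaf set to 1 marginalizes it.

def bernoulli(p_true):
    # Returns a leaf function: value None means "marginalize this variable" (evaluate to 1).
    return lambda v: 1.0 if v is None else (p_true if v else 1.0 - p_true)

leaf_a1, leaf_b1 = bernoulli(0.9), bernoulli(0.2)   # mixture component 1
leaf_a2, leaf_b2 = bernoulli(0.1), bernoulli(0.8)   # mixture component 2

def circuit(a, b):
    # One sum node over two product nodes, with mixture weights 0.3 and 0.7.
    return 0.3 * leaf_a1(a) * leaf_b1(b) + 0.7 * leaf_a2(a) * leaf_b2(b)

print(circuit(True, True))    # joint P(A=1, B=1): 0.3*0.9*0.2 + 0.7*0.1*0.8 = 0.11
print(circuit(True, None))    # marginal P(A=1):   0.3*0.9     + 0.7*0.1     = 0.34
print(circuit(None, None))    # normalization check: evaluates to 1.0
```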
Self-described decision theorist Eliezer Yudkowsky, co-founder of the nonprofit Machine Intelligence Research Institute (MIRI), went further: AI development needs to be shut down worldwide, he wrote in a Time magazine op-ed, calling for American airstrikes on foreign data centers if necessary. John Horgan, who has written for Scientific American since 1986, comments on science on his free online journal Cross-Check. Is this too strong a restatement of your intuitions, Steve? But also: Aren't these tools awe-inspiring? To provide an analogy, imagine there are 10 pieces of information online about a certain subject; AI systems will analyze all 10 to answer questions about the topic. The leaders of the most valuable companies in the world, including Microsoft CEO Satya Nadella and Google CEO Sundar Pichai, now get asked about and discuss AGI in interviews. Couple this alien form of intelligence with a complete and utter disinterest in humans and you get a form of intelligence humans have never met before. What does it buy us? Horgan: What's so great about Bayes's Theorem? We know this because population genetics says that mutations with very low statistical returns will not evolve to fixation at all. I don't think it would come as much of a surprise that I think the people who adopt a superior attitude and say, "You are clearly unfamiliar with modern car repair; you need a toolbox of diverse methods to build a car engine, like sparkplugs and catalytic convertors, not just these thermodynamic processes you keep talking about," are missing a key level of abstraction. For one thing, I don't expect to need human-level compute to get human-level intelligence, and for another I think there's a decent chance that insight and innovation have a big role to play, especially on 50-year timescales. And if instead you're learning a giant inscrutable vector of floats from a dataset, gulp. If the builders are sufficiently worried about that scenario that they push too fast too early, in fear of an arms race developing very soon if they wait, again, everybody dies. He is also the author of a popular fan fiction series, Harry Potter and the Methods of Rationality, an entry point for many young people into these online spheres and ideas around AI. Gebru has spoken before the European Parliament about the need for a "slow AI" movement, ebbing the pace of the industry so society's safety comes first. Is a crux here that you think nanosystem design requires superintelligence? Even if the technology proliferates and the world ends a year later when other non-coordinating parties jump in, it's still better to take the route where the world ends one year later instead of immediately. Of course, the trick is that when a technology is a little far, the world might also look pretty similar. I observe that, 15 years ago, everyone was saying AGI is far off because of what it couldn't do: basic image recognition, Go, StarCraft, Winograd schemas, programmer assistance. An AI could clearly be good at manipulating humans, while not manipulating its creators or the directives of its creators. Some in this camp argue that the technology is not inevitable and could be created without harming vulnerable communities. Another says this technology will empower humanity to flourish if deployed correctly. At every stage we maintain very high confidence of safety.
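The population-genetics claim above (that mutations with very small statistical returns do not reliably go to fixation) corresponds to a standard diffusion-approximation result. The formula below is quoted from memory of the textbook treatment (Kimura's fixation probability), not from anything in the discussion itself:

```latex
% Fixation probability of a new mutant with selection coefficient s,
% initial frequency p = 1/(2N), effective population size N_e (genic selection):
P_{\mathrm{fix}} \;=\; \frac{1 - e^{-4 N_e s p}}{1 - e^{-4 N_e s}}
\;\approx\;
\begin{cases}
  2s\,(N_e/N) & 4 N_e s \gg 1 \quad \text{(clearly beneficial: fixes with chance} \sim 2s\text{)}\\[4pt]
  \tfrac{1}{2N} & |4 N_e s| \ll 1 \quad \text{(nearly neutral: no better than drift)}
\end{cases}
```

So a mutation whose advantage is small relative to 1/(4N_e) fixes at essentially the neutral rate, which is the sense in which "very low statistical returns will not evolve to fixation."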
You don't need to be an expert in bird biology, but at the same time, it's difficult to know enough to build an airplane without realizing some high-level notion of how a bird might glide or push down air with its wings. Yudkowsky has been the leading voice warning about this doomsday scenario. - I don't expect the first strong AIs to be based on algorithms discovered by way of neuroscience any more than the first airplanes looked like birds. Or 20! We start extra conservative and disallow behavior that might eventually be determined to be safe. Yudkowsky's argument asserts: 1) AI does not care about human beings one way or the other, and we have no idea how to make it care; 2) we will never know whether AI has become self-aware, because we do not know how to know that; and 3) no one currently building the ChatGPTs and Bards of our brave new world actually has a plan to make alignment happen.
