European politics

This is for discussions about news, politics, sports, other games, culture, philosophy etc.
User avatar
Argentina Jotunir
Howdah
Posts: 1367
Joined: Mar 31, 2020
Location: Argentina

Re: European politics

Post by Jotunir »

occamslightsaber wrote:
Dolan wrote:Also, how did people make such decisions before maths was invented/discovered (depending on your position on that question)?
Let's imagine a situation in ancient Sumer, before numbers were invented for the purpose of keeping track of property and making contracts.
How did they reason about their decisions without having any concept of numbers or maths? How did peasants, who were illiterate until just a century ago, reason without even knowing how to read or write?
Image
Best post ever.
User avatar
Nauru Dolan
Ninja
Posts: 13064
Joined: Sep 17, 2015

Re: European politics

Post by Dolan »

Goodspeed wrote:No it's not. Monte carlo search uses random playouts as a way to more efficiently discard entire nodes in the tree (so it still explores every move, just very efficiently) whereas AG's choices about which moves to explore are based on previous experience, and most of the possible moves are not explored at all. It's not just an optimization of an existing algorithm. It's a fundamentally different approach.
AG uses a combination of many techniques, including MCTS. From the research paper (https://www.nature.com/articles/nature16961) of the authors who participated in the project:
These deep neural networks are trained by a novel combination of supervised learning from human expert games, and reinforcement learning from games of self-play. Without any lookahead search, the neural networks play Go at the level of state-of-the-art Monte Carlo tree search programs that simulate thousands of random games of self-play. We also introduce a new search algorithm that combines Monte Carlo simulation with value and policy networks.
So in the end it's all based on search algorithms that are applied on patterns collected and stored from previously simulated games. Meaning the AG software doesn't "think" live, it searches for optimal moves based on previously "learned" (I'd say stored) patterns. Or it might also do a quick live simulation of potential outcomes, but also based on trying out a selection of optimal moves from previously stored patterns.
The thing that developers have tried and failed to do for decades, but Deepmind succeeded at, is mimicking the human ability to quickly determine which of the 300 possible moves are worth exploring. Without that, no amount of efficient searching, Monte Carlo or otherwise, got anywhere close to the level of the best human Go players.
They just came up with new kinds of heuristics that work probabilistically, from what I've read. So instead of just branching out all possible moves and simulating each, they tap into previously tested/simulated moves and use search trees that ranked the effectiveness of those moves.
What we call creativity is simply applying past experiences in new ways. That's exactly what programs like AG do. It learned from its past experience and applied that knowledge in ways that revolutionized the game. Stamped
If creativity is just applying past experiences in new ways, then a car is just applying physical laws in a mechanical ensemble, nothing else. Anyone could have invented the car, by just applying past experience with physical laws.
What this definition is missing is exactly what differentiates creators from non-creators. Everyone can apply past experiences in new ways, but not everyone ends up being a great inventor, thinker, musician, scientist. What exactly makes this qualitative difference? Just applying past experiences in new ways?
My knowledge of Go is based on the ~2000 games I've played. Sure, I don't know them all by heart, but neither does the AlphaGo agent that sits down to play a game of Go remember all of the games it played.
Wait, but isn't it the AG software that decides the moves? And that software has instant access to its database which stores the patterns it collected during its training phase. That gives it a huge advantage over any human player, whose brain is contained in a volume of just about 1,500 cm³, compared to the tens of thousands of CPUs and GPUs that power the AG software (or the big server racks in which Google put its tensor processing units).
Like you said it created a collection of rules based on those games, which it then uses in future games. When I think about a move, I apply rules I've learned from playing the game in the past. There's no need for me to go back and actually remember where I learned them, not to mention it would be highly inefficient to have to do so. AG works the same way.
I'm not sure if that's true. Their description of how this software works implies it cannot make any move without tapping into its stored body of knowledge, of "learned" patterns. How else could it even run its game tree searches, which are mentioned in the research paper written by some of its authors?
I think you underestimate the amount of processing power in our brains. In the end, they too are just doing a shit ton of calculations.
I have the feeling your position ultimately stems from your belief that our brains are somehow magical. I see no evidence for this. Creativity and consciousness are not magic.
We don't really know if brains are performing "calculations". That implies using numbers, maths. Whereas the brain does not appear to work with bits of information that have a discrete meaning.
The biggest surprise was simply that it was better than humans at the game in general, which no one saw coming. At the time it was believed (and for good reason) that computers beating humans at Go was still decades away, if possible at all. The second biggest surprise, which played out in the years afterwards, was how much these neural network-based programs ended up changing the meta.
Idk, to me it's not a surprise that a machine with an industrial capacity for computation and simulation can beat a human at a game whose moves are mathematisable. It will obviously excel at tasks that involve a lot of quantitative operations. Just like the invention of the engine made a car a lot more powerful than a carriage pulled by horses.
What does it mean to "actually think" and why does it matter in this context whether or not the program is "actually thinking"? Am I not using my past experiences when I "think" about a move?
Thinking involves an effort made to create an abstraction that projects something that may or may not exist. Gods, songs and stories are not necessarily based on anything real, they're not just mirror reflections of actual things you can find in your environment. They are products of a mind that goes through a process of creation, much like how you make wine from grapes, bringing into the world something that didn't exist before and for which there was no model, nothing to emulate or copy. They were the result of playing with the material, whether physical or mental, and shaping it into something that was entirely new, that only a human could produce.
User avatar
Netherlands Goodspeed
Retired Contributor
Posts: 13002
Joined: Feb 27, 2015

Re: European politics

Post by Goodspeed »

Dolan wrote:
Goodspeed wrote:No it's not. Monte carlo search uses random playouts as a way to more efficiently discard entire nodes in the tree (so it still explores every move, just very efficiently) whereas AG's choices about which moves to explore are based on previous experience, and most of the possible moves are not explored at all. It's not just an optimization of an existing algorithm. It's a fundamentally different approach.
AG uses a combination of many techniques, including MCTS. From the research paper (https://www.nature.com/articles/nature16961) of the authors who participated in the project:
These deep neural networks are trained by a novel combination of supervised learning from human expert games, and reinforcement learning from games of self-play. Without any lookahead search, the neural networks play Go at the level of state-of-the-art Monte Carlo tree search programs that simulate thousands of random games of self-play. We also introduce a new search algorithm that combines Monte Carlo simulation with value and policy networks.
So in the end it's all based on search algorithms that are applied on patterns collected and stored from previously simulated games. Meaning the AG software doesn't "think" live, it searches for optimal moves based on previously "learned" (I'd say stored) patterns. Or it might also do a quick live simulation of potential outcomes, but also based on trying out a selection of optimal moves from previously stored patterns.
You misunderstand. I'm not saying AG doesn't use monte carlo. That's the old school search algorithm I kept referring to. I'm saying that the addition of the policy network is not an optimization of the existing monte carlo algorithm, but a fundamentally different thing. The innovation is the addition of the policy network, so I am focusing on that part.

Since you brought no arguments to support that MC search "intuitively" discards nodes from its tree, which is what you claimed earlier, I'll assume you changed your mind on that. For the record I'm referring to this false claim:
Dolan wrote:
When a Go player looks at a position, while there may be 300 possible moves, there are really less than 10 that are worth exploring. And how do you describe which are worth exploring? You can't. Developers have tried and failed miserably. There's too much "it just feels right". Intuition. Anyway you take those 10 candidates, explore them by again finding moves your opponent would likely respond with (using intuition), explore those, etc, until you land on a move that's most promising. Anyone who has played any turn-based strategy game before knows this process.
But that's how the Monte Carlo Tree Search works.
So I'm assuming you now understand in what way MC search is a different thing than the AG policy network.
The thing that developers have tried and failed to do for decades, but Deepmind succeeded at, is mimicking the human ability to quickly determine which of the 300 possible moves are worth exploring. Without that, no amount of efficient searching, Monte Carlo or otherwise, got anywhere close to the level of the best human Go players.
They just came up with new kinds of heuristics that work probabilistically, from what I've read. So instead of just branching out all possible moves and simulating each, they tap into previously tested/simulated moves and use search trees that ranked the effectiveness of those moves.
You can argue that they indirectly do, but I can just as easily argue that human players do the same thing.
Do you actually know what an artificial neural network is? https://en.wikipedia.org/wiki/Artificial_neural_network.
What we call creativity is simply applying past experiences in new ways. That's exactly what programs like AG do. It learned from its past experience and applied that knowledge in ways that revolutionized the game. Stamped
If creativity is just applying past experiences in new ways, then a car is just applying physical laws in a mechanical ensemble, nothing else. Anyone could have invented the car, by just applying past experience with physical laws.
What this definition is missing is exactly what differentiates creators from non-creators. Everyone can apply past experiences in new ways, but not everyone ends up being a great inventor, thinker, musician, scientist. What exactly makes this qualitative difference? Just applying past experiences in new ways?
We are all creators. We constantly apply previous experiences in new ways. Your post that I'm quoting was creative, this one of mine is, etc. And yes, most people could have invented the wheel, and all of the other components in a car, given the right genes, the right past experiences and the right input at the right time.
Where do you think the idea of the wheel came from? Magic?
My knowledge of Go is based on the ~2000 games I've played. Sure, I don't know them all by heart, but neither does the AlphaGo agent that sits down to play a game of Go remember all of the games it played.
Wait, but isn't it the AG software that decides the moves? And that software has instant access to its database which stores the patterns it collected during its training phase. That gives it a huge advantage over any human player, whose brain is contained in a volume of just about 1,500 cm³, compared to the tens of thousands of CPUs and GPUs that power the AG software (or the big server racks in which Google put its tensor processing units).
I can run a Go program on my phone that would beat the top professionals.
What do you think human players use when they think about a move? Yup, it's a database of stored patterns that they collected during their training phase... No, the AG agent that actually plays the games does not have access to the database of games that it learned from. It only has access to the neural network that was trained by this database, which makes "intuitive" judgments to significantly limit the number of nodes that need to be searched. This is functionally identical to how it works in human brains. I've made this point many times before and really don't know what to say at this point to get it through to you.
Like you said it created a collection of rules based on those games, which it then uses in future games. When I think about a move, I apply rules I've learned from playing the game in the past. There's no need for me to go back and actually remember where I learned them, not to mention it would be highly inefficient to have to do so. AG works the same way.
I'm not sure if that's true. Their description of how this software works implies it cannot make any move without tapping into its stored body of knowledge, of "learned" patterns. How else could it even run its game tree searches, which are mentioned in the research paper written by some of its authors?
Its stored body of knowledge is a neural network which is trained by past experiences. That's how it works in our brains too. Its stored body of knowledge is not the entire database of games that it learned from. Of course it can't make any move without tapping into its neural network. Neither can I.
I think you underestimate the amount of processing power in our brains. In the end, they too are just doing a shit ton of calculations.
I have the feeling your position ultimately stems from your belief that our brains are somehow magical. I see no evidence for this. Creativity and consciousness are not magic.
We don't really know if brains are performing "calculations". That implies using numbers, maths. Whereas the brain does not appear to work with bits of information that have a discrete meaning.
I don't mean to imply our brains use decimal numbers. What I mean is there's an insane amount of computing power in our brains. Of course, due to the differences between computers and our brains (and they are significant and fundamental differences, obviously), there are specific kinds of tasks that our brains are very inefficient at compared to computers, and others that we are much more efficient at. But my point is that our brains are extremely resource-efficient and computationally powerful machines. Perhaps more so than you thought. https://en.wikipedia.org/wiki/Computer_ ... _magnitude See under "Petascale computing", "36.8×10^15: Estimated computational power required to simulate a human brain in real time".
The biggest surprise was simply that it was better than humans at the game in general, which no one saw coming. At the time it was believed (and for good reason) that computers beating humans at Go was still decades away, if possible at all. The second biggest surprise, which played out in the years afterwards, was how much these neural network-based programs ended up changing the meta.
Idk, to me it's not a surprise that a machine with an industrial capacity for computation and simulation can beat a human at a game whose moves are mathematisable. It will obviously excel at tasks that involve a lot of quantitative operations. Just like the invention of the engine made a car a lot more powerful than a carriage pulled by horses.
It's not surprising to me that it happened either. But that it happened so soon was not expected by anyone, and surprised both the field of AI and the Go world alike. I've explained the reasons why it's considered a big step. But if the contrarian in you just doesn't want to admit that, that's your prerogative.
What does it mean to "actually think" and why does it matter in this context whether or not the program is "actually thinking"? Am I not using my past experiences when I "think" about a move?
Thinking involves an effort made to create an abstraction that projects something that may or may not exist. Gods, songs and stories are not necessarily based on anything real, they're not just mirror reflections of actual things you can find in your environment. They are products of a mind that goes through a process of creation, much like how you make wine from grapes, bringing into the world something that didn't exist before and for which there was no model, nothing to emulate or copy. They were the result of playing with the material, whether physical or mental, and shaping it into something that was entirely new, that only a human could produce.
Game 2 move 37 was new. No human would have played it. And so were the numerous new opening concepts and ways to play that Go AI came up with post-AG. And new ideas, human or otherwise, are never really new in that there is always a source for them. In other words they are always applications of previously learned patterns. How else does the idea exist? Randomly? Magically?
User avatar
Nauru Dolan
Ninja
Posts: 13064
Joined: Sep 17, 2015

Re: European politics

Post by Dolan »

Goodspeed wrote:You misunderstand. I'm not saying AG doesn't use monte carlo. That's the old school search algorithm I kept referring to. I'm saying that the addition of the policy network is not an optimization of the existing monte carlo algorithm, but a fundamentally different thing. The innovation is the addition of the policy network, so I am focusing on that part.
Policy network is a method of truncating results from the Monte Carlo tree search at a certain game state and replacing the subtree below with a function that approximates the implications of picking a certain move. It's a method used to narrow down the search results, in order to speed up the software's performance. But deep down, the software relies on performing searches in search trees that store possible game moves, collected from previous game simulations.
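Roughly, the idea is that instead of playing a subtree out, the search stops at a node and asks a learned function for an estimate. A minimal sketch in Python, where value_net and the game-state methods are hypothetical placeholders, not AlphaGo's actual code:

```python
import random

def evaluate_leaf(state, value_net=None):
    """Estimate the chance that the side to move at `state` wins.

    Plain Monte Carlo plays the position out to the end with random moves;
    the value-network variant truncates the search here and lets a trained
    function stand in for the whole subtree below. `value_net` and the
    state methods (player_to_move/is_terminal/legal_moves/play/result_for)
    are hypothetical placeholders, not AlphaGo's actual interface.
    """
    root_player = state.player_to_move()
    if value_net is not None:
        return value_net(state)  # learned approximation of the subtree's outcome
    # fallback: one random playout to the end of the game
    while not state.is_terminal():
        state = state.play(random.choice(state.legal_moves()))
    return state.result_for(root_player)  # 1.0 for a win, 0.0 for a loss
```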
most people could have invented the wheel, and all of the other components in a car, given the right genes, the right past experiences and the right input at the right time.
Where do you think the idea of the wheel came from? Magic?
But that's like saying anyone could fly if they had wings. If you lack wings, you won't fly, at least not naturally.
So not everyone can fly and not everyone could have invented the wheel, since the genes/experience/personality that led to such an invention were missing in everyone else.
And maybe there's more to this than just a passive, mechanistic understanding of how creativity comes about, maybe it takes some unusual efforts made by someone to reach such a level of inventiveness. I really dislike how mechanistic and passive these bland, formulaic explanations of every human behaviour have become: it's all just genes and environment. No, it's not. There are no genes and environments that make you strive for something, when you could chill and be a complacent normie. It takes will.
Even twins reared in the same environment end up having different interests and achievements, despite being genetically similar.
What do you think human players use when they think about a move? Yup, it's a database of stored patterns that they collected during their training phase... No, the AG agent that actually plays the games does not have access to the database of games that it learned from. It only has access to the neural network that was trained by this database, which makes "intuitive" judgments to significantly limit the number of nodes that need to be searched. This is functionally identical to how it works in human brains. I've made this point many times before and really don't know what to say at this point to get it through to you.
Then how does the Monte Carlo algorithm work, if it needs to perform searches in search trees? What does it even search, if there's no actual stored past game experience? There must be something in those search trees...
Its stored body of knowledge is a neural network which is trained by past experiences. That's how it works in our brains too. Its stored body of knowledge is not the entire database of games that it learned from. Of course it can't make any move without tapping into its neural network. Neither can I.
And such a neural network has hardware capabilities that far exceed those of a human brain: massive parallel processing and way larger storage capabilities that operate on electronic substrates, meaning they are unlikely to be as fallible and limited in scope as human memory. It's through sheer order of quantitative magnitude of computational capabilities that this software beats humans at Go, not thanks to some unusual qualitative reasoning abilities.
It's not surprising to me that it happened either. But that it happened so soon was not expected by anyone, and surprised both the field of AI and the Go world alike. I've explained the reasons why it's considered a big step. But if the contrarian in you just doesn't want to admit that, that's your prerogative.
I don't think my position is to simply dismiss AI just to be contrarian, though it's possible it may seem like that, simply because the way I think is not influenced by hype and groupthink. I just think I'm realistic and am trying to make a neutral, cool-headed assessment. There's just too much hype built around these buzzwords like AI, blockchain, neural networks, most of it due to commercial interests as well as companies' interest in projecting an image of being innovation drivers. I'm just trying to dispel some of this undeserved hype built around these subjects and bring people back to earth, to realise they have fallen for media hype. Scientists and academics in general tend to oscillate between these two poles: on one hand they know they can't exaggerate the importance of their work for fear of losing credibility, but on the other hand, the temptation to overblow the importance of their work in the media is too big. And the media is more than happy to sensationalise any new technology, just to keep that readership growing and coming back.

We don't even know how a neuron actually works but we have artificial neural networks that emulate its behaviour, rofl. You don't hear about this in the media, right? There's a reason why you don't, because it wouldn't generate clicks, it deflates enthusiasm and actually punctures the previous balloons of hype that you, as a media outlet, have created.
Game 2 move 37 was new. No human would have played it. And so were the numerous new opening concepts and ways to play that Go AI came up with post-AG. And new ideas, human or otherwise, are never really new in that there is always a source for them. In other words they are always applications of previously learned patterns. How else does the idea exist? Randomly? Magically?
No human would have played those moves because they were considered bad. Why? Because humans play games differently; as I mentioned before, researchers noticed that human players tend to focus on specific areas of the board, rather than take a holistic, statistical approach, like a machine. A machine doesn't react, it doesn't think, there's no psyche inside the circuits; it just works by inertia, outputting new moves as its algorithms estimate which ones are most likely to increase the game state value. Whereas humans approach a game like a cogitating animal: they attack, their focus takes the shape of a pin directed at a problem, they can be deceptive. A human could make a bad move in a game just to lure the opponent into a trap, something that a piece of software running on electronic circuits, aka "artificial intelligence", is incapable of, since it's neither intelligent nor capable of artifice.
No Flag RefluxSemantic
Gendarme
Posts: 5996
Joined: Jun 4, 2019

Re: European politics

Post by RefluxSemantic »

Goodspeed wrote:I'm only disagreeing with the statement that AI could never get there, and am trying to explain why machine learning is considered a big step forward in the field.

What AG was proof of concept of is that computers can mimic human intuition through machine learning. Of course it's not proof that we may one day have general purpose AI. I never claimed that.

I specifically responded to your point about mathematical solvability because while Go is indeed solvable in theory, in this context it is practically unsolvable and its theoretical solvability in no way helped Deepmind. Making that point, I thought you were saying AG is just another brute force algorithm and were therefore misunderstanding how it works. For all intents and purposes, Go is as solvable as human interaction.

And yeah, machine learning will need to get a whole lot more efficient to be applied generally. We are far from that point.
I'd rather phrase it as: AlphaGo shows that machine learning can achieve things that are completely impossible for conventional AI. By making an AI that could play Go at a high level, they showed that machine learning has a clear usage/niche.

We already have some cool applications like DeepL Translate. I look forward to seeing more functional applications of machine learning.
User avatar
Netherlands Goodspeed
Retired Contributor
Posts: 13002
Joined: Feb 27, 2015

Re: European politics

Post by Goodspeed »

Dolan wrote:Policy network is a method of truncating results from the Monte Carlo tree search at a certain game state and replacing the subtree below with a function that approximates the implications of picking a certain move. It's a method used to narrow down the search results, in order to speed up the software's performance. But deep down, the software relies on performing searches in search trees that store possible game moves, collected from previous game simulations.
It's not narrowing down results. It's telling the (Monte Carlo) search algorithm which moves need to be explored in the first place. The innovation was being able to immediately pinpoint the ~1% of moves that are promising, meaning that the search algorithm doesn't even have to look at the other 99%. That's what allowed it to compete with humans, who use this same method (exploring only the moves that feel intuitively promising) to limit the amount of time they have to spend looking at variations.
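Concretely, the filtering step amounts to something like this sketch, where policy_net is a hypothetical function returning a prior probability per legal move, not the real AG interface:

```python
def candidate_moves(board, policy_net, keep_fraction=0.01):
    """Keep only the moves the policy network considers promising.

    `policy_net(board)` is assumed to return a {move: prior_probability}
    dict; everything outside the top slice is never handed to the
    search algorithm at all.
    """
    priors = policy_net(board)
    ranked = sorted(priors, key=priors.get, reverse=True)
    keep = max(1, int(len(ranked) * keep_fraction))  # roughly the "~1%" mentioned above
    return ranked[:keep]
```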

I'm still not sure if you actually know how it works so I'll make one more attempt at explaining why the policy network is much more than just an optimization. Maybe it helps to visualize:

Image

This is a Go program much like AG running on my PC, analyzing a position. I've let it run for about 10 seconds. The colored (blue, green and transparent red) circles are moves that it explored or is currently exploring. The 3 numbers in the blue and green moves are, from top to bottom: estimated win percentage, amount of playouts (!), and estimated score difference.

First of all, the moves without a colored circle have not been explored by the monte carlo search algorithm at all because they weren't deemed promising enough by the policy network.

Also notably, in 10 seconds, the main move that it's looking at has 261 playouts. That means it explored 261 following variations. The moves without numbers in them all have less than 10 playouts (these were not considered as promising by the policy network). You probably know that in software, these are very low numbers, and it's in stark contrast to conventional Chess engines, which explore millions or even tens of millions of variations per second depending on your hardware.

The reason I point this out is that, considering there are so few playouts, Go AI are not as reliant on the Monte Carlo search algorithm as you may think. Even the Chess computer that beat Kasparov in the 90s was exploring almost 10 orders of magnitude more variations per second than modern Go programs. The reason they can get away with this and still play at a high level is that the policy network is able to reliably filter the best moves. And it's hard to overstate how impressive that is in a game like Go, which relies heavily on intuition and where even the best human players with tens of thousands of hours of experience are still unable to do this (reliably filter the best moves). Impressively, if you replaced the policy network with a strong human pro (say Elo 3500) and gave that player the ability to use Monte Carlo search to explore variations, they would still lose to the program. This is to illustrate that AG and its successors were successful in mimicking human intuition.

Another fun fact is that I, and even players much stronger than I, would lose to this program even if it was only allowed to do 1 playout. This means that it would always pick the most promising move from the policy network and wouldn't use monte carlo at all. Source
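In other words, the 1-playout case reduces to taking the policy network's top-ranked move and playing it, with no search at all. A tiny sketch, with the same hypothetical policy_net as above:

```python
def play_without_search(board, policy_net):
    # no tree search at all: just play the move the policy network ranks highest
    priors = policy_net(board)  # assumed to return {move: prior_probability}
    return max(priors, key=priors.get)
```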
And maybe there's more to this than just a passive, mechanistic understanding of how creativity comes about, maybe it takes some unusual efforts made by someone to reach such a level of inventiveness. I really dislike how mechanistic and passive these bland, formulaic explanations of every human behaviour have become: it's all just genes and environment. No, it's not. There are no genes and environments that make you strive for something, when you could chill and be a complacent normie. It takes will.
This comes across as you deluding yourself to feed some kind of superiority complex. It's just not a position that makes sense. Where does this "will" come from? Why do you think it's not simply the result of genes and environment? And why do you think you dislike that idea so much?
Then how does the Monte Carlo algorithm work, if it needs to perform searches in search trees? What does it even search, if there's no actual stored past game experience? There must be something in those search trees...
It's called a search algorithm because it searches for the best move out of a collection of possible moves you feed it.
The gist is that it plays the game out all the way up to the end, picking random moves (yes, actually random). It then weights the nodes in the search tree based on how many of these playouts lead to victory. It sounds inefficient, but it beats other algorithms because making random moves takes very little processing power compared to making decisions based on a bunch of if statements every time you make a move in the search tree (which expands very quickly of course). And for reasons I honestly don't really understand because I don't know enough about it, even by picking random moves it comes up with relatively reliable weights on the search nodes and can somewhat accurately evaluate positions.
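For what it's worth, the random-playout core of that idea fits in a few lines. This is only the flat, random-playout part (real MCTS also grows a tree and balances exploration against exploitation, e.g. with UCB, which is left out here), and the game-state methods are placeholders:

```python
import random

def playout_win_rates(state, n_playouts=100):
    """Weight each candidate move by the fraction of purely random
    playouts starting from it that end in a win for the side to move.
    The state methods (player_to_move/legal_moves/play/is_terminal/
    result_for) are hypothetical placeholders, not a real library.
    """
    me = state.player_to_move()
    scores = {}
    for move in state.legal_moves():
        wins = 0.0
        for _ in range(n_playouts):
            s = state.play(move)
            while not s.is_terminal():  # play random moves to the very end
                s = s.play(random.choice(s.legal_moves()))
            wins += s.result_for(me)    # 1.0 for a win, 0.0 for a loss
        scores[move] = wins / n_playouts
    return scores
```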

This is of course very different from the policy network, which evaluates the viability of a move not by analyzing positions that follow from it, but by inputting the board position and the move it wants to evaluate into a neural network, which then, based on its internal weights (which are the result of its training) comes up with a number estimating the likelihood of this move leading to victory.
I don't think my position is to simply dismiss AI just to be contrarian, though it's possible it may seem like that, simply because the way I think is not influenced by hype and groupthink. I just think I'm realistic and am trying to make a neutral, cool-headed assessment. There's just too much hype built around these buzzwords like AI, blockchain, neural networks, most of it due to commercial interests as well as companies' interest in projecting an image of being innovation drivers. I'm just trying to dispel some of this undeserved hype built around these subjects and bring people back to earth, to realise they have fallen for media hype. Scientists and academics in general tend to oscillate between these two poles: on one hand they know they can't exaggerate the importance of their work for fear of losing credibility, but on the other hand, the temptation to overblow the importance of their work in the media is too big. And the media is more than happy to sensationalise any new technology, just to keep that readership growing and coming back.
I agree that AI is generally overhyped. Of course it is. That's what generates the clicks, as you said. It's obviously nowhere near where some people seem to think it is. But frankly, I don't give a shit what the media is saying. I only care what you're saying, and you're denying that Deepmind's innovation was a significant one. That is a position on the other extreme: In denial about the significant progress that is being made in the field.
Ironically, the media actually seems to influence you a lot in that whatever they say, you tend to immediately adopt the opposite position. It's that cheeky contrarian in you.
Why? Because humans play games differently; as I mentioned before, researchers noticed that human players tend to focus on specific areas of the board, rather than take a holistic, statistical approach, like a machine.
It's hard to explain why to someone who doesn't play the game, but this take is pretty rude to people who are actually competent at Go. Looking at the whole board without getting tunnel-visioned into one little area is one of the key concepts of the game and you learn it as a beginner. It's not some elusive, limited-to-machines way of playing the game. Idk where you read this, but it's obviously just a bad attempt at an accessible explanation about why AG is better at the game than humans. The actual ways in which it outplays the pros are impossible to explain to non-players.
User avatar
Nauru Dolan
Ninja
Posts: 13064
Joined: Sep 17, 2015

Re: European politics

Post by Dolan »

Goodspeed wrote:It's not narrowing down results. It's telling the (Monte Carlo) search algorithm which moves need to be explored in the first place. The innovation was being able to immediately pinpoint the ~1% of moves that are promising, meaning that the search algorithm doesn't even have to look at the other 99%. That's what allowed it to compete with humans, who use this same method (exploring only the moves that feel intuitively promising) to limit the amount of time they have to spend looking at variations.
And how does it do that? It's an algorithm that has multiple layers; it's basically sifting data after the value network evaluates the current board. And after it gets an evaluation of the current board (the current game state), it grades each move according to an estimated probability of winning (which was established by the neural network being "trained", so basically by storing some patterns based on which it can analyse moves).
Source for this explanation:
Generally, two main kinds of neural networks inside AlphaGo are trained: policy network and value network. Both types of networks take the current game state as input and grade each possible next move through different formulas and output the probability of a win. On one side, the value network provides an estimate of the value of the current state of the game: what is the probability of the black player to ultimately win the game, given the current state? The output of the value network is the probability of a win. On the other side, the policy networks provide guidance regarding which action to choose, given the current state of the game. The output is a probability value for each possible legal move (the output of the network is as large as the board). Actions (moves) with higher probability values correspond to actions that have a higher chance of leading to a win.
https://illumin.usc.edu/ai-behind-alpha ... l-network/
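As a shape-level sketch of the two interfaces described there (the network internals below are stand-ins; only the inputs and outputs follow the quoted description): the policy side maps the current position to one probability per board point, the value side maps it to a single win probability.

```python
import numpy as np

BOARD_POINTS = 19 * 19  # 361 intersections on a Go board

def policy_interface(board_planes, weights):
    """Current position in, one probability per board point out
    (how likely each move is to be the best next action)."""
    logits = weights @ board_planes.ravel()     # stand-in for the real deep network
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()                      # shape (361,), sums to 1

def value_interface(board_planes, weights):
    """Current position in, one number out: the estimated probability
    that the player to move ultimately wins from here."""
    score = float(weights @ board_planes.ravel())
    return 1.0 / (1.0 + np.exp(-score))         # squashed into (0, 1)

# usage sketch with random stand-in weights and illustrative input planes
planes = np.random.rand(4, 19, 19)
move_probs = policy_interface(planes, np.random.randn(BOARD_POINTS, planes.size))
win_prob = value_interface(planes, np.random.randn(planes.size))
```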
The reason I point this out is that, considering there are so few playouts, Go AI are not as reliant on the Monte Carlo search algorithm as you may think. Even the Chess computer that beat Kasparov in the 90s was exploring almost 10 orders of magnitude more variations per second than modern Go programs. The reason they can get away with this and still play at a high level is that the policy network is able to reliably filter the best moves. And it's hard to overstate how impressive that is in a game like Go, which relies heavily on intuition and where even the best human players with tens of thousands of hours of experience are still unable to do this (reliably filter the best moves). Impressively, if you replaced the policy network with a strong human pro (say Elo 3500) and gave that player the ability to use Monte Carlo search to explore variations, they would still lose to the program. This is to illustrate that AG and its successors were successful in mimicking human intuition.
There is no intuition there, there is no thinking entity there in the circuits; it's a filtering algorithm, which ranks the best moves based on probabilities estimated from the value network that has already scanned the board. Even the authors of AlphaGo state very clearly that the software still does searches, but it filters them so that move selection can be done faster:
During the match against Fan Hui, AlphaGo evaluated thousands of times fewer positions than Deep Blue did in its chess match against Kasparov; compensating by selecting those positions more intelligently, using the policy network, and evaluating them more precisely, using the value network
Source: https://storage.googleapis.com/deepmind ... ePaper.pdf
This comes across as you deluding yourself to feed some kind of superiority complex. It's just not a position that makes sense. Where does this "will" come from? Why do you think it's not simply the result of genes and environment? And why do you think you dislike that idea so much?
Because as I said, if fraternal twins, who are genetically similar, do not necessarily develop the same interests or reach the same level of achievement, even when they're reared in the same environment, then how a person develops is not just a mechanistic passive result of some factors. You become you in an active way, and people with the same genes and environment may actually make completely different choices when faced with the same problem. Why is that? Because consciousness is not just an organic mechanism that is pushed around by genes and stimuli, it's also a factor in itself that pushes back against them. You're not just a reification of a theory, otherwise how could a theory even emerge? Did you have some special genes which programmed you to develop a specific theory? Where would the impetus to understand things in a new light even come from? Where does novelty in general even come from?

If it's all just the mechanistic, empty functioning of pre-defined behaviours, why would anyone ever do anything new? After all, you're just programmed to (insert here basic bullshit Darwinian crap) and nothing else. So why aren't you doing nothing but that? Why play a video game? It doesn't bring you more mates or more resources, according to Darwinian theory. It's a fucking waste of time. It doesn't make you any more evolutionarily fit, you just play it for fun. How does that fit in any mechanistic explanation of the world? Watching a movie does not increase your standing in the world in any way, unless your movie knowledge somehow impresses someone, which could bring some practical use. But that's not the case for most of everyone watching a movie. It's all just a waste of time, a way to make time pass in a more pleasant way, that's all.

Not every action that you do is predicted and already accounted for by all these evolutionary theories. These are just theories, creations of the mind, abstractions. You don't live to reify them. Consciousness reacts, so I'd expect people might actually make some life choices in such a way as to disprove theories that claim to predict their behaviour. And then the theory might become less useful, if not even useless, since consciousness plays an active role in how people shape their lives. They don't just live to verify theories.
Then how does the Monte Carlo algorithm work, if it needs to perform searches in search trees? What does it even search, if there's no actual stored past game experience? There must be something in those search trees...
It's called a search algorithm because it searches for the best move out of a collection of possible moves you feed it.
The gist is that it plays the game out all the way up to the end, picking random moves (yes, actually random). It then weights the nodes in the search tree based on how many of these playouts lead to victory. It sounds inefficient, but it beats other algorithms because making random moves takes very little processing power compared to making decisions based on a bunch of if statements every time you make a move in the search tree (which expands very quickly of course). And for reasons I honestly don't really understand because I don't know enough about it, even by picking random moves it comes up with relatively reliable weights on the search nodes and can somewhat accurately evaluate positions.
It does probabilistic evaluations, based on those grades ascribed to each possible move by the value network, which first scans the board. And all that is based on that big collection of moves provided by the Monte Carlo search trees.
I only care what you're saying, and you're denying that Deepmind's innovation was a significant one. That is a position on the other extreme: In denial about the significant progress that is being made in the field.
Ironically, the media actually seems to influence you a lot in that whatever they say, you tend to immediately adopt the opposite position. It's that cheeky contrarian in you.
Wtf do I have to gain from this, if my motive was to simply be a contrarian? I'm wasting my time on this forum anyways, this section is read by like 3 people. So, why would I want to simply adopt a position just because it would oppose someone else's? I'm really just speaking my mind, there's no secret motivation behind me doing this work of demystification.
User avatar
Latvia harcha
Gendarme
Posts: 5136
Joined: Jul 2, 2015
ESO: hatamoto_samurai

Re: European politics

Post by harcha »

Dolan wrote:Wtf do I have to gain from this, if my motive was to simply be a contrarian?
Don't you get tired of doing this? I know I have, after spending the last 15 years in various gaming forums and group chats...
Dolan wrote:I'm wasting my time on this forum anyways, this section is read by like 3 people.
Sorry for posting here, I haven't actually read any single post in full.
POC wrote:Also I most likely know a whole lot more than you.
POC wrote:Also as an objective third party, and near 100% accuracy of giving correct information, I would say my opinions are more reliable than yours.
User avatar
Nauru Dolan
Ninja
Posts: 13064
Joined: Sep 17, 2015

Re: European politics

Post by Dolan »

harcha wrote:
Dolan wrote:Wtf do I have to gain from this, if my motive was to simply be a contrarian?
Don't you get tired of doing this? I know I have, after spending the last 15 years in various gaming forums and group chats...
Dolan wrote:I'm wasting my time on this forum anyways, this section is read by like 3 people.
Sorry for posting here, I haven't actually read any single post in full.
It wasn't my intention to insult anyone, but if my motivation for holding an opinion were just to be a contrarian, what would be the practical use of doing that on a very small forum? I didn't say this to belittle those few forumers who read this, I was just saying that there's not much to be gained from holding contrarian views when you're debating, like, one or two people. If you want to have an impact or create a stir, for some reason, you'd go to a bigger arena. I'm really just speaking my mind. Maybe I just mistrust anything that is hyped and suspect the reality of that thing is a lot more ordinary than how people present it.
User avatar
Latvia harcha
Gendarme
Posts: 5136
Joined: Jul 2, 2015
ESO: hatamoto_samurai

Re: European politics

Post by harcha »

oh sry i misunderstood. as i said havent read the above in full
POC wrote:Also I most likely know a whole lot more than you.
POC wrote:Also as an objective third party, and near 100% accuracy of giving correct information, I would say my opinions are more reliable than yours.
User avatar
Netherlands Goodspeed
Retired Contributor
Posts: 13002
Joined: Feb 27, 2015

Re: European politics

Post by Goodspeed »

Dolan wrote:There is no intuition there, there is no thinking entity there in the circuits; it's a filtering algorithm, which ranks the best moves based on probabilities estimated from the value network that has already scanned the board. Even the authors of AlphaGo state very clearly that the software still does searches, but it filters them so that move selection can be done faster:
I know the software still does searches:
I wrote:This layer of the program, like human intuition, comes up with promising candidates and is then combined with an "old school" search algorithm to evaluate them.
...
I'm not saying AG doesn't use monte carlo.
...
It's telling the (Monte Carlo) search algorithm which moves need to be explored in the first place.
I'm glad other people are reading because you clearly aren't.

But I suppose I'll point out again that modern trained neural network-based Go software doesn't actually need to do tree searches to beat strong players.
I wrote:Another fun fact is that I, and even players much stronger than I, would lose to this program even if it was only allowed to do 1 playout. This means that it would always pick the most promising move from the policy network and wouldn't use monte carlo at all.
It wouldn't use monte carlo or any other search algorithm to explore the move tree. It would be basing its moves purely on the initial judgments of the policy network. This is equivalent to a human looking at a Go board, intuitively coming up with the most promising move without exploring the variations it might lead to, and playing that move. This is unthinkable for human players because our intuition is just not good enough. We usually need to do some "reading" (looking ahead) to verify if the move is really as good as it looked.

And of course there's no conscious entity there, but that doesn't matter. The point is that the software was able to develop a highly effective decision tree to find promising moves without having to analyze a position by exploring variations, and that it did this by only playing against itself.
I base my initial judgments of a position on stored patterns in my brain that I learned from previous experience, and so does the policy network. The difference is that I, to have any chance of making the correct move, must then explore some variations to verify if my intuition was right, whereas the program is almost always right initially. And this is, again, in stark contrast with if not directly opposite to old school Chess engines, which relied heavily on brute force move tree searches to gain an edge.

Intuition: The ability to understand something instinctively, without the need for conscious reasoning.

The reason I keep saying it's similar to human intuition is that there is no way to translate the policy network's initial judgments to a statement like "this move is good because x, y, and z". The decision tree it created based on its training is too complex to decipher. The same is true for an intuitive judgment by a human. If I look at a Go board and pinpoint the move that looks most promising to me, it's not going to be a result of conscious reasoning (there will be some reasoning, but a large part of it is "it feels right"). Of course the reason it feels right is still stored somewhere in my brain, but this decision tree, which was created based on many games of training, is too complex to decipher.
Because as I said, if fraternal twins, who are genetically similar, do not necessarily develop the same interests or reach the same level of achievement, even when they're reared in the same environment, then how a person develops is not just a mechanistic passive result of some factors.
No environment is exactly the same.
people with the same genes and environment may actually make completely different choices when faced with the same problem.
They may, but you have no evidence for this.
It does probabilistic evaluations, based on those grades ascribed to each possible move by the value network, which first scans the board. And all that is based on that big collection of moves provided by the Monte Carlo search trees.
What is "it" here? Not the policy network, which doesn't rely on the value network's judgments and certainly doesn't rely on "big collections of moves provided by the MC search". If you don't understand this by now, this is a waste of time.
Wtf do I have to gain from this, if my motive was to simply be a contrarian? I'm wasting my time on this forum anyways, this section is read by like 3 people. So, why would I want to simply adopt a position just because it would oppose someone else's? I'm really just speaking my mind, there's no secret motivation behind me doing this work of demystification.
You don't need a clear motive to take a particular contrarian position. It's a subconscious reflex. As for where that comes from, it's not really my place to speculate.
User avatar
Nauru Dolan
Ninja
Posts: 13064
Joined: Sep 17, 2015

Re: European politics

Post by Dolan »

@Goodspeed
Since you brought up artificial neural networks, I suppose you already know how they work, right? If not, I'll just write a short description, so that there's a clearer picture of what they actually do and what the actual result of training them is.

First of all, an artificial neuron is an abstraction of a neuron; it's not actually mapped 100% to hardware (like in an actual human brain), it's agnostic about hardware support. It's a software abstraction of how people currently think neurons work (and some recent research has cast doubt on the notion that we actually understand how neurons work: some studies showed they behave in a non-linear way that is currently unexplained). An artificial neuron is basically a mathematical function: a set of input values and associated weights. The function sums the weighted inputs and maps the result to an output. The input layer is composed of the values from a data record, which constitute the inputs to the next layer of neurons. The final layer is the output layer, and in between there may be multiple hidden layers. Data which passes through the network eventually assigns a value to each output node.

It needs to be emphasised that the whole process works by passing in data from previous records (like images, in the case of networks trained for image recognition), processing each record one by one and comparing the network's initial classification of the record (which could be garbage at first) with the actual known classification. In the case of game moves, they would feed the network known, played-out game moves (either from human players or from itself) and build a classification of each move according to its contribution to the game state (more or less likely to be conducive to a victory). The whole process uses feedback to correct previous errors and stores the patterns that are classified as more accurate (when these networks are used for pattern recognition) or as having a higher game state value, in the case of a game.
The network "learns" by adjusting the weights of values so as to be able to predict the correct class of input samples (ie, to properly identify whatever its purpose is: a recognised pattern or optimal game move). To sum up, the final state of the artificial neural network is a stored record of patterns that can be used for fast detection of a certain class of objects.

The AlphaGo policy network is the result of two phases:
- training a supervised learning (SL) policy network and a rollout policy network, which are designed to predict human expert moves in a data set of positions;
- training a reinforcement learning (RL) policy network, which improves the SL policy network by optimizing the final outcome of games of self-play.

Now I'm going to quote from a research paper that describes the process in brief (https://ieeexplore.ieee.org/document/9195476/), because that will give a clearer idea of what it actually involves:

Image

What this means is that it takes a record of human game moves and uses a statistical algorithm to approximate what a typical human move would be.
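In code, one such supervised step might look like the sketch below: nudge the weights so the probability assigned to the move the human expert actually played goes up (this is the gradient of a softmax/cross-entropy loss on a linear model; the real networks are deep and convolutional, so only the shape of the update is meant to be accurate here).

```python
import numpy as np

def sl_update(weights, board_vec, expert_move, lr=0.01):
    """One supervised-learning step: raise the predicted probability of
    the move the human expert played in this position. `weights` maps a
    flattened board (361 values) to one score per board point."""
    logits = weights @ board_vec
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    grad = np.outer(probs, board_vec)   # softmax cross-entropy gradient...
    grad[expert_move] -= board_vec      # ...with the expert's move as the target class
    return weights - lr * grad

# usage sketch: a 19x19 board flattened to 361 inputs, 361 possible moves
w = np.zeros((361, 361))
w = sl_update(w, np.random.rand(361), expert_move=72)
```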

This is an overview of the hardware design:
Policy network is designed to sample actions in the game of Go. Reinforcement learning is combined with supervised learning to adjust the policy network towards the correct goal of winning games. It takes a representation of the board position as its input, passes it through many convolutional layers with parameters (SL Policy Network or RL Policy Network), and outputs a probability distribution over moves a, which is represented by a probability map over the board. As we can see, Fig. 2 shows a high-level block diagram of the proposed architecture. It consists of a signal preprocessing module, 13 convolutional layers, a scale layer and a softmax layer that is equivalent in size to the number of class labels.

Image
It gets more interesting when the paper describes the input convolution module that sifts the data from the input:
The input convolution module, including one layer, is shown in Fig.3, where 192 convolution kernels with a shape of 5×5 are chosen based on the differences in data and in the complexity of models. It includes a weight serial-parallel conversion module, a weight memory (WMEM), and a parallel multiply-sum operation module. In the first step, the weight serial-parallel conversion module converts the input serial weight data to parallel data and stores them in the weight memory WMEM. Then, the parallel data and feature map data which have a size of 4×19×19 are sent into the parallel multiply-sum operation module for the convolution operations. Finally, the sum of the operation results and bias are sent to the intermediate convolution module.
So again, this process involves taking data and storing it on the chip, to be later filtered and processed by subsequent operations.
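A quick sanity check on the numbers in that quote (192 kernels of 5×5 over a 4×19×19 input). The padding is an assumption on my part, since the quoted text doesn't state it, but AlphaGo-style nets pad so the 19×19 board size is preserved through the stack:

```python
def conv_output_size(in_size, kernel, padding, stride=1):
    # standard convolution output-size formula
    return (in_size + 2 * padding - kernel) // stride + 1

# 192 kernels of 5x5 over 4x19x19 input planes:
# with 2 cells of zero padding, each of the 192 output feature maps stays 19x19
assert conv_output_size(19, kernel=5, padding=2) == 19
# without padding the maps would shrink to 15x15 instead
assert conv_output_size(19, kernel=5, padding=0) == 15
```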

A second convolution layer is even more complex and it includes lots of memory chips (because the whole process is data selection and storage):

Image

The final phase involves a softmax module, which applies a softmax (a generalisation of the logistic function) to normalize the output of the neural network into a probability distribution over predicted game moves.
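That softmax step is just the following normalization (shown in numpy): raw per-move scores in, a probability distribution over predicted moves out.

```python
import numpy as np

def softmax(scores):
    exp = np.exp(scores - np.max(scores))  # subtracting the max is for numerical stability
    return exp / exp.sum()

move_scores = np.array([2.0, 0.5, -1.0, 0.0])  # raw network outputs for 4 candidate moves
move_probs = softmax(move_scores)              # non-negative, sums to 1
```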

To wrap it up, in layman's terms, the whole process of training a policy network involves taking data input, comparing it to known results, assigning weights, and going through a feedback process to eliminate errors. At the hardware level, policy network training involves passing the data through various modules that apply statistical/maths algorithms to approximate the most likely move a human player would make, in order to counter it with another move that has a higher value (as established by the stored records/patterns in the artificial neural network).

That's it, there's no magic at any point in this process; it's all just algorithms that use statistics to arrive at the best estimated probabilities for a game move that would increase the game state value (i.e., the likelihood of winning).
User avatar
Netherlands Goodspeed
Retired Contributor
Posts: 13002
Joined: Feb 27, 2015

Re: European politics

Post by Goodspeed »

That is not in any way a response to me. You keep arguing against this made up position that the policy network is somehow magical, or a conscious thinking entity. That obviously isn't my position. Also, it's surprising to me that you apparently think human intuition is "magic".
I wrote:I'm aware there's nothing magical
...
Of course it's all maths
...
And of course there's no conscious entity there
And yes, I'm perfectly aware and have stated many times that the policy network's decisions are based on stored patterns that it recognized during its training. My point has been that this is both fundamentally different from and much more effective than pre-AG Go programs. Therefore it's generally considered a significant innovation in the field. (Also because it has potential real world applications)
User avatar
Nauru Dolan
Ninja
Posts: 13064
Joined: Sep 17, 2015

Re: European politics

Post by Dolan »

The reason I wrote the reply like that, instead of just addressing each point from your previous reply, was that it was necessary to establish what AG and the policy network actually do, in the most concrete terms, so that there's no room for vague arguments or for discussing only the surface of things.
I'm glad you admitted the whole process relies on stored patterns and not on some magical human-like intuition.
User avatar
Netherlands Goodspeed
Retired Contributor
Posts: 13002
Joined: Feb 27, 2015

Re: European politics

Post by Goodspeed »

I've explained why I compare it to human intuition. Our disagreement there is obviously not about how the program works, but about how human intuition works. You think it's magical, whereas I think it's a process that relies on [the recognition of] stored patterns, hence the metaphor. So because I kept comparing it to human intuition, and you think human intuition is magic, you assumed I was saying the program is magic. See where you went wrong there? It's frustrating because I repeatedly stated that I'm fully aware it's not magic.
This all comes from your silly opinion that human intuition is some magical, indeterministic, logic-defying process or something.
User avatar
Nauru Dolan
Ninja
Posts: 13064
Joined: Sep 17, 2015

Re: European politics

Post by Dolan »

I don't think it's magical; it's just that there's a lot we don't get about how human cognition works. I think it's more honest to admit that we don't yet really understand many things about it.
We have a vague notion that intuition is a type of fast decision-making that doesn't rely on fully reasoning out something.
But the thing is that intuition is not just a passive estimation based on past patterns, because human consciousness, by making decisions, also reacts to some external event or challenge.
So a decision made by intuition is not just a matter of quickly tapping into memories; it also involves a quick conscious phase of reacting to a stimulus with a picked pattern. It's an intuitive decision made for something, directed at something, which involves a reaction, not just a quick retrieval of a memory.
It also involves reviewing those past patterns in order to react quickly with a decision.
User avatar
Netherlands Goodspeed
Retired Contributor
Posts: 13002
Joined: Feb 27, 2015

Re: European politics

Post by Goodspeed »

Then why use that word, "magic"?
I wrote:The reason I keep saying it's similar to human intuition is that there is no way to translate the policy network's initial judgments to a statement like "this move is good because x, y, and z". The decision tree it created based on its training is too complex to decipher. The same is true for an intuitive judgment by a human. If I look at a Go board and pinpoint the move that looks most promising to me, it's not going to be a result of conscious reasoning (there will be some reasoning, but a large part of it is "it feels right"). Of course the reason it feels right is still stored somewhere in my brain, but this decision tree, which was created based on many games of training, is too complex to decipher.
What the comparison boils down to is that our subconscious and trained neural networks are similarly enigmatic yet effective decision makers, and similarly based on recognizing learned patterns.

Anyway I don't really care whether or not you agree that it's similar to human intuition. That's ultimately a semantic discussion. I just hope that by now you understand that it's nothing like MCTS, and why it's considered a significant innovation in the field of AI.
User avatar
Nauru Dolan
Ninja
Posts: 13064
Joined: Sep 17, 2015

Re: European politics

Post by Dolan »

Goodspeed wrote:Then why use that word, "magic"?
I don't think I brought it up first, but using that word is a way to mock the typical reaction of those nerds who read science media and buy into the hype with which they present any news about AI. I don't deny that these new developments are significant progress, which has actually made AI more practically usable, or at least brought it closer to that aim, but people need to calm their tits, because they're treating all these new advancements with an almost religious attitude. It's what you usually see among the transhumanist crowd.
Maybe what the comparison boils down to is that our subconscious and trained neural networks are similarly enigmatic yet effective decision makers.
Subconscious processing in humans is basically preparatory work for decision making. It's accumulated patterns of cognition that become a habit and are activated automatically whenever a decision is made. So they're past conscious decisions that get pushed somewhere in the back of the mind, so to speak, so that they're activated really fast next time someone makes a decision. It's not like there's some kind of pre-programmed or inborn cognition that is stored in something called subconscious. There's none of that.
User avatar
Netherlands Goodspeed
Retired Contributor
Posts: 13002
Joined: Feb 27, 2015

Re: European politics

Post by Goodspeed »

Dolan wrote:
Goodspeed wrote:Then why use that word, "magic"?
I don't think I brought it up first,
You definitely did. I would never have described either AG or our intuition as "magic", nor would I describe anything else as such.
Subconscious processing in humans is basically preparatory work for decision making. It's accumulated patterns of cognition that become a habit and are activated automatically whenever a decision is made. So they're past conscious decisions that get pushed somewhere in the back of the mind, so to speak, so that they're activated really fast next time someone makes a decision. It's not like there's some kind of pre-programmed or inborn cognition that is stored in something called subconscious. There's none of that.
I think our intuition is mostly based on subconscious pattern recognition. But that's another discussion
User avatar
Nauru Dolan
Ninja
Posts: 13064
Joined: Sep 17, 2015

Re: European politics

Post by Dolan »

Goodspeed wrote:
Dolan wrote:
Goodspeed wrote:Then why use that word, "magic"?
I don't think I brought it up first,
You definitely did. I would never have described either AG or our intuition as "magic", nor would I describe anything else as such.
I used that word because it's hilarious to see people filled with so much awe when they talk about AI or other hyped-up memes. It's just software built on some abstractions of how human cognition works, abstractions that aren't even very accurate.
User avatar
Netherlands Goodspeed
Retired Contributor
Posts: 13002
Joined: Feb 27, 2015

Re: European politics

Post by Goodspeed »

Sure. It would be nice though if you left your preconceptions at the door next time and responded to what I'm actually saying. You often seem to argue against positions I never expressed in our discussions
User avatar
Kiribati princeofcarthage
Retired Contributor
Posts: 8861
Joined: Aug 28, 2015
Location: Milky Way!

Re: European politics

Post by princeofcarthage »

Goodspeed wrote:You often seem to argue against positions I never expressed in our discussions
He does this very often when there's nothing he can refute. He always seems to want the last word in an argument, and it has to be right. If you confront him with irrefutable proof, he'll jump topics. It's happened way too many times. The contrarian in him.
Fine line to something great is a strange change.
User avatar
Nauru Dolan
Ninja
Posts: 13064
Joined: Sep 17, 2015

Re: European politics

Post by Dolan »

@Goodspeed
But the technical reply I wrote clarifies both how the policy network works and that artificial neural networks do store some patterns created based on processing previous games.
So I thought there was no need to reply to that post. I could do that but I think I'd be repeating things we've already agreed on.

@princeofcarthage Idk what you're talking about.
User avatar
Netherlands Goodspeed
Retired Contributor
Posts: 13002
Joined: Feb 27, 2015

Re: European politics

Post by Goodspeed »

Dolan wrote:@Goodspeed
But the technical reply I wrote clarifies both how the policy network works and that artificial neural networks do store some patterns created based on processing previous games.
Yes the disagreement seems to be about whether or not human intuition is similarly based on the recognition of stored patterns.
So I thought there was no need to reply to that post. I could do that but I think I'd be repeating things we've already agreed on.
I'm not really talking about specifically that post, more about your apparent assumption that I was claiming the algorithm is more than just maths.
User avatar
United States of America occamslightsaber
Retired Contributor
Posts: 1326
Joined: May 31, 2019
ESO: L1BERTYPR1ME

Re: European politics

Post by occamslightsaber »

Fucking nerds
The scientific term for China creating free units is Mitoe-sis.

I intend all my puns.
