Hiya!

I started this AI course with my friends a while ago, but we never ended up finishing it. I’m interested in AI these days, so I thought I’d try it on my own. Week 1 is some fluff that’s not worth going over. I’m doing weeks 2 and 3 together because there is one project for both combined. The first couple sections are just notes I took on the videos and concepts. The stuff for the project is at the bottom.

Week 2

They first go over some general stuff about intelligent agents, like the PEAS (performance, environment, actuators, sensors) acronym. Also, types of environments: fully observable (chess, where you can see everything happening) vs partially observable (poker, where there’s hidden info), deterministic (you know what will happen given an action) vs stochastic (where there’s some randomness built in), and a few others.

Then they differentiate between two types of agents: reflex agents, which simply map a given input to a fixed output (the reflex), and goal-based agents, which do higher-level things like planning and problem solving. Goal-based agents are obviously closer to what we consider higher intelligence.

Then we start getting a little technical. Many, many problems can be modeled using a tree, where each node branches for each choice that can be taken for a given decision. Ultimately, solving most of these problems means finding the location in that tree that optimizes some value you want, but the tree is enormous, so we have to come up with clever ways of searching it or eliminating parts we don’t need to look at.

First they present DFS and BFS, which I won’t reiterate here. They are presenting the different search methods in the context of finding the shortest route between cities on a map, where you have the distance between adjacent nodes on the map. So, an interesting thing to look at and keep in mind is the “frontier” of the search tree, both in the tree itself and on the map.

They mention 4 aspects of search strategies that should always be considered: 1) Completeness: does it always find *some* solution if one exists? 2) Time complexity: how long does it take, as a function of problem size? 3) Space complexity: how much does it have to keep in memory while solving the problem, again as a function of problem size? 4) Optimality: does it definitely find the *best* solution?

Lastly, they mention uninformed vs informed search. Uninformed is when you don’t have much information about the domain you’re searching, so it’s kind of “dumb”. Informed is where you do, so you can more intelligently set up the problem to search the space.

The first search methods they show that are new to me are Depth Limited Search (DLS) and Iterative Deepening (ID). DLS is basically DFS, but you limit the depth it will go to some value less than the total depth of the tree. ID is DLS run repeatedly, increasing that depth limit each time. Therefore, ID ends up being a kind of mix between DFS and BFS. In plain DFS, you go straight down to the bottom of the tree, and only then work your way back up. With a depth limit (let’s say, L = 4), you still do DFS within that iteration, but when that iteration is done (assuming you didn’t find the solution in it) and you go to L = 5 (looking one level deeper), you’ve already looked at shallower nodes (in the L = 4 iteration) than you would have had you gone directly to L = 5. So it has the effect of kind of mixing DFS and BFS, which is probably good for some problems.
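To make sure I understood it, here’s a tiny sketch of DLS and ID in Python. The tree and node names are my own toy example, not from the course:

```python
# Depth-limited search + iterative deepening on a small dict-based tree.

def dls(tree, node, goal, limit):
    """Depth-limited DFS: True if goal is reachable within `limit` edges."""
    if node == goal:
        return True
    if limit == 0:
        return False
    return any(dls(tree, child, goal, limit - 1)
               for child in tree.get(node, []))

def iddfs(tree, start, goal, max_depth=10):
    """Run DLS with limit 0, 1, 2, ... until the goal turns up."""
    for limit in range(max_depth + 1):
        if dls(tree, start, goal, limit):
            return limit  # depth at which the goal was first found
    return None

# A tiny tree: A -> B, C; B -> D; C -> E
tree = {'A': ['B', 'C'], 'B': ['D'], 'C': ['E']}
print(iddfs(tree, 'A', 'E'))  # E is 2 edges deep -> 2
```

Each deepening pass redoes the shallow work, but since most of a tree’s nodes are at the bottom anyway, the repeated work is a small fraction of the total.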

The next new uninformed search they show is Uniform Cost Search (UCS). Again, this one is an alternative to pure DFS or BFS that ends up searching the tree in a different pattern. It’s easiest to explain UCS in the context of the map with distances, like they do. At each node (city) you’re currently at, you make its neighbors (neighbor cities) its child nodes, each of which has a distance traveled from the starting city to the city represented by that node (the “path cost” g(n)). Then, the next node you expand (generating its children) is the frontier node with the smallest accumulated distance so far. So, this has the added complication that you need some kind of priority queue. What’s cool (and what they explain more fully in the lectures) is that it *is* optimal, which kind of makes sense intuitively: since it always expands the lowest-cost node, the first path that reaches the goal has to be the cheapest one.
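A quick sketch of UCS using Python’s heapq as the priority queue. The graph and distances below are made up, just to see the g(n) ordering in action:

```python
import heapq

def ucs(graph, start, goal):
    """Return (cost, path) of the cheapest path, expanding nodes in order of g(n)."""
    frontier = [(0, start, [start])]   # (path cost g, node, path so far)
    visited = set()
    while frontier:
        g, node, path = heapq.heappop(frontier)
        if node == goal:
            return g, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, step in graph.get(node, []):
            if neighbor not in visited:
                heapq.heappush(frontier, (g + step, neighbor, path + [neighbor]))
    return None

# Toy map: edges are (neighbor, driving distance)
graph = {
    'A': [('B', 1), ('C', 5)],
    'B': [('C', 1), ('D', 6)],
    'C': [('D', 2)],
}
print(ucs(graph, 'A', 'D'))  # -> (4, ['A', 'B', 'C', 'D'])
```

Note it doesn’t return A→C→D (cost 7) or A→B→D (cost 7); the cheapest route wins because the goal can only be popped once every cheaper frontier entry has been expanded first.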

Week 3

Week 3 starts to get really interesting!

Week 3 gets into “heuristics”: a heuristic is a measure of how close a state is to a goal. This is really cool because it’s something everyone uses in real life, whether they know it or not, when trying to solve a problem. This is really the first glimpse of what feels like “intelligence”.

A necessary part of a heuristic is knowledge of the domain that you use to judge how close to your goal a state is. It’s important to note that it doesn’t have to be perfect to be very useful.

They use the map and the shortest city-to-city route again here, because it’s very good for illustrating these techniques. Here it’s assumed that your map includes not only the distances between neighboring cities, but also the straight-line distance (as the crow flies) between every pair of cities. Note, though, that the distance between neighboring cities is the distance actually traveled (like, that you can drive), while the distance between two non-neighboring cities is just the Euclidean distance; the task is still to find the shortest route that you can actually *drive*.

In this problem, we’ll call the heuristic h(n), for a given city/node n.

The first informed search we look at is the famous greedy search, which greedily just follows the path (i.e., expands the node and searches the tree in the order) that has the lowest h(n). You can intuitively see what will happen here. At the starting city node, you expand its neighbors and choose the one with the smallest h(n). Then, you expand *that* one’s neighbors/children, look at h(n) for all the cities now on the “frontier” of available nodes (the starting city’s other children plus this neighbor’s children), choose the smallest h(n) among them, and repeat. Of course, because you chose the starting-city neighbor that was closest to the end city, chances are at least one of its children is even closer to the end city than any of the starting city’s other children. So greedy search tends to find a solution quickly, because it keeps “greedily” grabbing the city closest to the goal.
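In code, greedy search is the same frontier idea as UCS, just ordered by h(n) instead of g(n). The graph and h values below are my own made-up numbers:

```python
import heapq

def greedy_search(graph, h, start, goal):
    """Best-first search ordered purely by the heuristic h(n)."""
    frontier = [(h[start], start, [start])]
    visited = set()
    while frontier:
        _, node, path = heapq.heappop(frontier)
        if node == goal:
            return path
        if node in visited:
            continue
        visited.add(node)
        for neighbor in graph.get(node, []):
            if neighbor not in visited:
                heapq.heappush(frontier, (h[neighbor], neighbor, path + [neighbor]))
    return None

graph = {'A': ['B', 'C'], 'B': ['D'], 'C': ['D'], 'D': []}
h = {'A': 3, 'B': 2, 'C': 1, 'D': 0}   # C *looks* closer, so it gets expanded first
print(greedy_search(graph, h, 'A', 'D'))  # -> ['A', 'C', 'D']
```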

However, you can quickly think of examples where this would fail. If your end city was on the other side of a curved mountain range from your start city, then greedy search might first grab a neighbor of the start city that is closer to the end city (and to the mountain range), but from which you can’t actually get through the mountains. The correct choice would’ve been to first take what appeared (to the greedy search) to be a worse path, actually moving away from the end city.

The next search method is really cool and famous: the A* (A star) search. While UCS was searching by the minimum cost g(n) *to* a city n from the starting city, and the greedy search above searches by the minimum Euclidean distance h(n) from a city n to the end city, A* uses the sum, f(n) = g(n) + h(n). This intuitively makes sense, right? Using only g(n) was good because it took into account the distance you’ve *actually* accumulated getting to a given city, but it had no sense of where to search because it expanded the tree “blindly”; so though it is optimal, it’s very inefficient. On the other hand, using only h(n) means the search has a good sense of what to aim for in general, but it’s not taking into account the distance it took to get to where it currently is (because it doesn’t use g(n)), so it ends up taking suboptimal paths.

A* combines these two, effectively using both the “history” and the “future” in searching. What’s *really* cool is the following. If the heuristic h(n) is “admissible”, meaning it never *overestimates* the actual cost of the cheapest path from city n to the goal, then you can prove that A* will definitely find the optimal solution. So one view of A* could be that it has the benefit of optimality (like UCS), but with the added efficiency/intelligence of a greedy search. Its main downsides are that it can still take exponential time in the worst case, and that it has to keep every generated node in memory.
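A sketch of A* itself: the same shape as the searches above, just ordering the frontier by f(n) = g(n) + h(n). Again, the graph and heuristic values are my own toy numbers, with h chosen so that it never overestimates:

```python
import heapq

def a_star(graph, h, start, goal):
    """Best-first search ordered by f(n) = g(n) + h(n)."""
    frontier = [(h[start], 0, start, [start])]  # (f, g, node, path)
    visited = set()
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return g, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, step in graph.get(node, []):
            if neighbor not in visited:
                g2 = g + step
                heapq.heappush(frontier,
                               (g2 + h[neighbor], g2, neighbor, path + [neighbor]))
    return None

graph = {
    'A': [('B', 1), ('C', 2)],
    'B': [('D', 5)],
    'C': [('D', 2)],
}
h = {'A': 3, 'B': 4, 'C': 2, 'D': 0}   # admissible: each h is <= the true cost to D
print(a_star(graph, h, 'A', 'D'))  # -> (4, ['A', 'C', 'D'])
```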

Then, they show it in the context of something with a less obvious heuristic than distance on a map. The example is one of those sliding puzzle toys where there’s a missing square and you have to sort the numbers or unscramble the picture or whatever. The heuristic here is just “how far is each tile from where it’s supposed to be?” (as if you could move tiles without regard to the other tiles in the way). This obviously underestimates the number of moves actually needed, and is therefore an admissible heuristic.
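This tile heuristic is usually called the Manhattan distance. Here’s a sketch for a 3x3 board, assuming a representation I made up (a flat tuple read left-to-right, top-to-bottom, 0 for the blank, goal state 0,1,2,…,8):

```python
def manhattan(board, size=3):
    """h(n): total Manhattan distance of all tiles from their goal squares."""
    total = 0
    for index, tile in enumerate(board):
        if tile == 0:          # don't count the blank
            continue
        row, col = divmod(index, size)
        goal_row, goal_col = divmod(tile, size)   # goal is 0,1,2,...,8 in order
        total += abs(row - goal_row) + abs(col - goal_col)
    return total

solved = (0, 1, 2, 3, 4, 5, 6, 7, 8)
print(manhattan(solved))                        # -> 0
print(manhattan((1, 0, 2, 3, 4, 5, 6, 7, 8)))  # tile 1 is one square off -> 1
```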

Finally, they get to the concept of local search. When the search space is deterministic and known, you can use systematic searches. But when it’s not (like many real world problems), you can’t use those same approaches. Instead, you do a very general technique known as local search where you have a current state, and try and iteratively improve it (to get closer to your goal) by only going to “neighbor” states. It’s essentially looking at the places right next to you that you could go to, and taking the best one that’s better than what you currently have. So, it’s only searching for better states…locally.

Obviously this has a few downsides. It can get stuck in local minima/maxima easily (though there are modifications that try to get around this). Upsides are that it’s often very easy to implement, and you don’t need to maintain a search tree. It also doesn’t need much memory, because it’s really just keeping track of one or a few states. They then mention a few variants and sub-techniques.

Greedy local search is what you’d naively expect: just look at the immediate neighbors, and take the best option. If they’re all worse than what you currently have, you’re at a locally best state. Sideways moves: if you’re at a plateau, where a range of adjacent states all have the same value (so you wouldn’t normally move to them), you take them anyway to try to get off the plateau. Another is random restart, where you rerun the algorithm several times from random starting positions. Different starts may climb to different maxima, so you keep track of the best one you’ve found.
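Here’s a sketch of hill climbing with random restarts on a made-up 1-D landscape with two peaks, so the plain greedy climb can get stuck on the smaller one (the function and ranges are just my toy example):

```python
import random

def value(x):
    # two peaks: a local one at x = 2 (value 4) and the global one at x = 8 (value 9)
    return -(x - 2) ** 2 + 4 if x < 5 else -(x - 8) ** 2 + 9

def hill_climb(start, step=1):
    """Greedy local search: move to the best neighbor until none improves."""
    x = start
    while True:
        best = max([x - step, x + step], key=value)
        if value(best) <= value(x):   # no neighbor improves: local maximum
            return x
        x = best

def random_restart(restarts=20, seed=0):
    """Rerun hill_climb from random starts, keeping the best result."""
    rng = random.Random(seed)
    best = None
    for _ in range(restarts):
        result = hill_climb(rng.randint(0, 10))
        if best is None or value(result) > value(best):
            best = result
    return best

print(hill_climb(0))       # starts on the left slope -> stuck at the local peak, 2
print(random_restart())    # restarts should find the global peak at 8
```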

Another variant is local-beam search, where instead of having one state searching locally, you have k states. They’re not fully independent, though: all k states generate their successors, and then the k best of the whole combined pool are kept. That’s the “communication” — promising regions end up attracting more of the beam. A variant of *this* is stochastic beam search, which is basically beam search but with a random element thrown into the selection, to prevent the states from all agglomerating in one place.

Next is Genetic Algorithms. They’re…dun dun dun, pretty analogous to evolution and genetics, and actually a variant of stochastic beam search algorithms. Basically, you start with a “population” of k states, each one of which is called an “individual” (get it?), which have some set of characteristics (maybe just a string of 0’s and 1’s). You also have a “fitness function” (GET IT?) that basically tells you how good a state is. Lastly, you create “offspring” states from making pairs (or more?) of “parent” states “reproduce”, by combining the characteristics of the two parents (in some random ratio). You also allow the small possibility of a mutation for the offspring, to introduce variations. It’s…pretty much evolution.
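Here’s a minimal version of that loop. This is my own toy setup, not the course’s code: bit-string individuals, fitness = number of 1s (so the best individual is all 1s), one-point crossover, and a small per-bit mutation rate:

```python
import random

def fitness(individual):
    return sum(individual)

def reproduce(parent_a, parent_b, rng):
    """One-point crossover: prefix from one parent, suffix from the other."""
    cut = rng.randint(1, len(parent_a) - 1)
    return parent_a[:cut] + parent_b[cut:]

def mutate(individual, rng, rate=0.05):
    """Flip each bit with small probability `rate`."""
    return [bit ^ 1 if rng.random() < rate else bit for bit in individual]

def genetic_algorithm(length=12, pop_size=20, generations=60, seed=1):
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # fitter individuals are more likely to be chosen as parents
        weights = [fitness(ind) + 1 for ind in population]
        population = [
            mutate(reproduce(*rng.choices(population, weights=weights, k=2), rng), rng)
            for _ in range(pop_size)
        ]
    return max(population, key=fitness)

best = genetic_algorithm()
print(fitness(best))  # should be at or near the maximum of 12
```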

The project

The project is actually just solving the puzzle game using a few different search methods! Pretty cool. The three methods they have us use are DFS, BFS, and A*.

BFS and DFS are very similar, both being uninformed searches; the difference is the data structure that determines the traversal order. BFS uses a queue (first in, first out), because it searches in order of increasing depth, while DFS uses a stack (last in, first out) because it “goes deep” into the tree before returning to shallower nodes. Otherwise, they’re pretty much the same. Again, they’re uninformed, so they don’t use a heuristic.
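You can see how similar they are if you write them as one function and only switch which end of the frontier you pop from (the graph here is my own toy example):

```python
from collections import deque

def search(graph, start, goal, method):
    """BFS or DFS depending on `method`; returns the node-expansion order."""
    frontier = deque([start])
    visited = {start}
    order = []                      # order in which nodes get expanded
    while frontier:
        # queue behavior (popleft) for BFS, stack behavior (pop) for DFS
        node = frontier.popleft() if method == 'bfs' else frontier.pop()
        order.append(node)
        if node == goal:
            return order
        for child in graph.get(node, []):
            if child not in visited:
                visited.add(child)
                frontier.append(child)
    return None

graph = {'A': ['B', 'C'], 'B': ['D', 'E'], 'C': ['F'], 'D': [], 'E': [], 'F': []}
print(search(graph, 'A', 'F', 'bfs'))  # level by level -> ['A','B','C','D','E','F']
print(search(graph, 'A', 'F', 'dfs'))  # dives down one branch first -> ['A','C','F']
```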

A* is different, since it’s an informed search and therefore uses a heuristic. It uses the heuristic mentioned above: the distance from where each tile currently is to where it should be. Since that heuristic is admissible, A* is optimal here, but it also needs a more complicated data structure: a heap. A heap implements a priority queue, where each item in the queue carries a number that sorts it and determines what next leaves the heap when you pop something off. In this case, the number that determines the sorting is the f(n) = g(n) + h(n) mentioned above (g being the moves taken so far, h being the heuristic at that state).
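The heap bit in isolation, using Python’s heapq (the states here are just placeholder strings standing in for board configurations):

```python
import heapq

# heapq keeps the list ordered so pop always returns the smallest f.
frontier = []
heapq.heappush(frontier, (7, 'state_a'))   # f = g + h = 7
heapq.heappush(frontier, (3, 'state_b'))
heapq.heappush(frontier, (5, 'state_c'))

print(heapq.heappop(frontier))  # -> (3, 'state_b'): smallest f leaves first
print(heapq.heappop(frontier))  # -> (5, 'state_c')
```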

When I implement it naively, BFS and A* go pretty easily. But DFS is taking a tonnnn of time. What’s up?

I’m guessing it has to do with searching through the states already visited. Yep: checking membership in a list takes O(n) time, whereas a Python set takes O(1) time (on average) to check if something’s in it.
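A quick timing check of that difference (exact numbers will vary by machine, but the gap is dramatic):

```python
import timeit

n = 100_000
visited_list = list(range(n))
visited_set = set(visited_list)

# worst case for the list: the element we look for is at the very end
slow = timeit.timeit(lambda: (n - 1) in visited_list, number=100)
fast = timeit.timeit(lambda: (n - 1) in visited_set, number=100)
print(f"list: {slow:.4f}s  set: {fast:.4f}s")  # the set lookup is far faster
```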

That’s a speed problem. The other common stumbling block is that you need the actual path (that is, ‘Left’, ‘Up’, ‘Up’, …) when you finally find the goal state. Naively, this isn’t too bad: you keep a list of the moves taken (moveList) alongside the board state (you can even package them together into a tuple so the frontier pops them out together), and when you perform a move, you append it to moveList and push that into the frontier along with the new state.

However, if you try this (I noticed it with DFS, but it could maybe happen with the others), you’ll quickly notice a problem. Take a simple scrambled board whose solution is only 3 moves.

However, due to how DFS works, it ends up going extraordinarily deep in the tree before finding an answer.

You have to keep the moveList for *each state*. If a path it explores goes to depth 66,125, each of those moveLists can be ~66,000 moves long, and with branching there are an enormous number of them to keep around. Of course, that 66,125-deep branch of the tree is probably very sparse, but whatever the exact count, it ends up being huge. So you basically can’t keep the whole moveList for each node. What you *can* do is keep track of every single move you take throughout the whole search, in one global list. Then, as long as you know the current depth (N) of the board you’re at (so you do have to store this for each board, but it’s a single number), you can just look at the last N moves taken in that total moveList.
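For what it’s worth, there’s another common trick that avoids per-node moveLists entirely (not what the course suggests, just an alternative I’ve seen): record, for each state you visit, which state and move produced it, then walk backwards from the goal once you find it. The states below are placeholder strings standing in for board configurations:

```python
def reconstruct(came_from, goal):
    """Walk parent pointers back from the goal and reverse into a move list."""
    moves = []
    state = goal
    while came_from[state] is not None:
        parent, move = came_from[state]
        moves.append(move)
        state = parent
    return moves[::-1]

# Hypothetical trace: start -> s1 via 'Up', s1 -> s2 via 'Left', s2 -> goal via 'Up'
came_from = {
    'start': None,
    's1': ('start', 'Up'),
    's2': ('s1', 'Left'),
    'goal': ('s2', 'Up'),
}
print(reconstruct(came_from, 'goal'))  # -> ['Up', 'Left', 'Up']
```

Memory-wise, this is one small (parent, move) entry per *visited* state instead of a full move list per frontier node.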

Welp, pretty straightforward. I get all the right solutions, but unfortunately there’s some weird quirk where my BFS gets a very different value for nodes_expanded (even though the answer is correct). The thing is, nodes_expanded can be defined in a few ways, and they’re not totally clear about it. They say somewhere that it’s the number of nodes that *are popped off your queue*, but that definition doesn’t reproduce their number for me. Looking at the discussion questions, it seems like a bunch of other people have the same confusion, even getting the exact same “wrong” nodes_expanded as me. Weird!