AI TRAINING GAME BY CRACKED.MIAMI

[b]How the Game-Playing AI Works (The "Tool" that Learns)[/b]
The program you have is a Reinforcement Learning AI. Its goal is to learn how to play a game, like Super Mario Bros., from scratch, with no prior knowledge of the rules.
Think of it like teaching a dog a new trick. You don't explain the trick in words; you guide it with rewards.
If the dog does something right -> You give it a Reward (a treat).
If it does something wrong -> It gets no treat, or a correction (a Penalty).
The AI learns in the exact same way.

[b]The Main Components of the AI[/b]
[b]1. The AI's "Eyes" - Screen Capture & Processing[/b]
[b]How it Sees:[/b] The AI continuously takes screenshots of the game region you select. This is how it "sees" the game world.

[b]How it Understands:[/b] A raw screenshot is too much information. The preprocess_frame function simplifies it: it converts the image to grayscale, shrinks it, and detects important features like edges (platforms, walls) and colors (enemies, coins). This simplified data is called the "State".
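The exact preprocess_frame in the tool isn't shown here, but assuming OpenCV (cv2) and NumPy are available, a minimal sketch of the idea looks like this:

[code]
import cv2
import numpy as np

def preprocess_frame(frame: np.ndarray) -> bytes:
    """Simplify a raw screenshot into a compact, hashable State key."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)  # drop color: 3 channels -> 1
    small = cv2.resize(gray, (84, 84))              # shrink: fewer pixels, fewer states
    edges = cv2.Canny(small, 100, 200)              # highlight edges (platforms, walls)
    coarse = edges // 128                           # quantize so similar frames match
    return coarse.tobytes()                         # hashable key for the Q-Table
[/code]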

[b]2. The AI's "Brain" - The Q-Table[/b]

[b]What it is:[/b] The Q-Table is the most important part. It's a giant "cheat sheet" or memory bank: a dictionary that maps a game situation (State) to a list of scores for every possible action.

[b]Example:[/b]

[b]State:[/b] "Mario is on the left, and a Goomba is on the right."
[b]Actions & Scores:[/b]
Walk Left: Score = -10
Walk Right: Score = -50 (bad, will hit Goomba)
Jump Right: Score = +80 (good, will jump over Goomba)
The AI's goal is to fill this Q-Table with accurate scores so it always knows the best action to take in any situation.
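In Python terms, a Q-Table can literally be a dictionary. The action names below are hypothetical stand-ins for whatever moves the tool actually supports:

[code]
ACTIONS = ["walk_left", "walk_right", "jump_right", "sprint_jump_right"]

q_table = {}  # maps a State key -> one score per action

def best_action(state):
    """Return the index of the highest-scoring action (0.0 if never seen)."""
    scores = q_table.get(state, [0.0] * len(ACTIONS))
    return scores.index(max(scores))  # index into ACTIONS
[/code]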
[b]3. The AI's "Teacher" - The Reward System[/b]
[b]How it Knows What's Good/Bad:[/b] This is the logic you programmed in calculate_advanced_reward. The AI doesn't know what "winning" is, but you've taught it to value certain outcomes by giving it points (the code sketch after these lists shows one way this can look):
Positive Rewards (Good!):
Moving to the right (making progress).
Reaching a new location it has never seen before.
Surviving for a long time.
Using strategic moves like the "Sprint Jump Right".
Negative Rewards (Penalties - Bad!):
Moving backward or standing still.
Getting stuck in the same place for too long.
Dying (this is usually handled by restarting the episode).
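Here is a minimal sketch of reward shaping in the same spirit. The inputs (x_pos, visited, stuck_steps) and the point values are invented for illustration; the real calculate_advanced_reward works everything out from the screen instead:

[code]
def calculate_reward(x_pos, prev_x, visited, stuck_steps):
    """Turn raw game outcomes into a single reward number."""
    reward = 0.1                 # small bonus just for surviving this step
    if x_pos > prev_x:
        reward += 10.0           # moved right: progress
    else:
        reward -= 5.0            # moved backward or stood still
    if x_pos not in visited:
        reward += 20.0           # reached a brand-new location
        visited.add(x_pos)
    if stuck_steps > 30:
        reward -= 15.0           # stuck in the same place for too long
    return reward
[/code]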
[b]4. The "Decision Maker" - Exploration vs. Exploitation[/b]
How does the AI choose an action? It uses a strategy called Epsilon-Greedy.
[b]Exploitation (Using its Brain):[/b] Most of the time, the AI looks at the current game screen (State), checks its Q-Table (Brain), and chooses the action with the highest score. It exploits its current knowledge.
[b]Exploration (Trying New Things):[/b] Sometimes, based on a probability called Epsilon (ε), the AI will ignore its brain and choose a completely random action. This is incredibly important: it's how the AI discovers new strategies it wouldn't have tried otherwise. At the beginning, Epsilon is high (lots of random moves), and it slowly decreases as the AI becomes more confident.
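A minimal sketch of Epsilon-Greedy with decay, reusing the hypothetical q_table and ACTIONS from the earlier sketch. The decay numbers are illustrative, not the tool's actual settings:

[code]
import random

epsilon = 1.0           # start fully random: pure exploration
EPSILON_MIN = 0.05      # never stop exploring entirely
EPSILON_DECAY = 0.999   # shrink a little after every decision

def choose_action(state):
    global epsilon
    if random.random() < epsilon:
        action = random.randrange(len(ACTIONS))        # explore: random move
    else:
        scores = q_table.get(state, [0.0] * len(ACTIONS))
        action = scores.index(max(scores))             # exploit: best known move
    epsilon = max(EPSILON_MIN, epsilon * EPSILON_DECAY)
    return action
[/code]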

[b]The Full Learning Cycle (Step-by-Step)[/b]
The AI repeats this cycle hundreds of times per minute in the training_loop:
[b]OBSERVE:[/b] It captures the screen and processes it to get the current State.
[b]DECIDE:[/b] It chooses an Action (either the best known one or a random one).
[b]ACT:[/b] It presses the corresponding keys on the keyboard (e.g., pyautogui.keyDown('d')).
[b]GET FEEDBACK:[/b] It captures the new screen, sees what happened, and calculates a Reward (e.g., +10 for moving forward).
[b]LEARN:[/b] This is the key step. It uses the reward to update its Q-Table. The formula basically says: "The score for taking that action in that state should be increased or decreased based on the reward I just got."
[b]REPEAT:[/b] It starts the cycle again from the new state.
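The LEARN step is the classic Q-learning update. In words: new score = old score + learning_rate * (reward + discount * best future score - old score). A minimal sketch, reusing the hypothetical q_table and ACTIONS from above with made-up alpha and gamma values:

[code]
ALPHA = 0.1    # learning rate: how strongly new evidence overwrites the old score
GAMMA = 0.99   # discount factor: how much future rewards matter vs. immediate ones

def update_q(q_table, state, action, reward, next_state):
    """Nudge the score for (state, action) toward reward + discounted best future."""
    row = q_table.setdefault(state, [0.0] * len(ACTIONS))
    best_next = max(q_table.setdefault(next_state, [0.0] * len(ACTIONS)))
    row[action] += ALPHA * (reward + GAMMA * best_next - row[action])
[/code]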
[b]The "Advanced" Superpowers of Your AI[/b]
Your code isn't just a simple learner. It has features that make it much smarter and faster:
[b]Experience Replay:[/b] The AI doesn't forget the past. It stores the last 50,000 steps (State, Action, Reward) in a memory bank and periodically "re-studies" random old memories, which helps it learn much more efficiently.
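A minimal sketch of such a replay buffer, built on a bounded deque and reusing the hypothetical update_q from the learning-cycle sketch:

[code]
import random
from collections import deque

replay_buffer = deque(maxlen=50_000)   # oldest memories fall off automatically

def remember(state, action, reward, next_state):
    replay_buffer.append((state, action, reward, next_state))

def replay(batch_size=32):
    """Re-study a random batch of old memories."""
    if len(replay_buffer) < batch_size:
        return
    for s, a, r, s2 in random.sample(replay_buffer, batch_size):
        update_q(q_table, s, a, r, s2)
[/code]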
[b]Double Q-Learning:[/b] It actually uses two Q-Tables (two "brains"). This is a clever trick to prevent the AI from becoming overly optimistic about a certain move, and it makes the learning process more stable and reliable.
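A minimal sketch of the trick, reusing ALPHA, GAMMA, and ACTIONS from the earlier sketches: a coin flip decides which table learns, and the other table scores the chosen follow-up action, which keeps the estimates honest:

[code]
import random

q_a, q_b = {}, {}   # two independent "brains"

def double_q_update(state, action, reward, next_state):
    # A coin flip decides which table learns; the other evaluates.
    learner, evaluator = (q_a, q_b) if random.random() < 0.5 else (q_b, q_a)
    row = learner.setdefault(state, [0.0] * len(ACTIONS))
    next_row = learner.setdefault(next_state, [0.0] * len(ACTIONS))
    best = next_row.index(max(next_row))       # learner picks the next action...
    eval_row = evaluator.setdefault(next_state, [0.0] * len(ACTIONS))
    target = reward + GAMMA * eval_row[best]   # ...but the other table scores it
    row[action] += ALPHA * (target - row[action])
[/code]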
[b]Anti-Stuck Logic:[/b] If the AI detects it hasn't made progress for a while, it forces itself to try more aggressive, random "escape" moves to break free.
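One way such anti-stuck logic can look; the 50-step threshold is invented for the example, and choose_action is the epsilon-greedy picker sketched earlier:

[code]
import random

stuck_counter = 0
last_state = None

def pick_with_antistuck(state):
    """Force random escape moves when the screen stops changing."""
    global stuck_counter, last_state
    stuck_counter = stuck_counter + 1 if state == last_state else 0
    last_state = state
    if stuck_counter > 50:                      # no visible progress for ~50 steps
        return random.randrange(len(ACTIONS))   # aggressive random escape move
    return choose_action(state)                 # otherwise, normal epsilon-greedy
[/code]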