Code Refactoring: How I Fixed the Problem of Running Out of Tokens in 2 Hours

After adding several mini-games in a row, the workflow started visibly degrading. The most direct signal was token exhaustion. When building a new game, I had to pass existing game files to the AI as reference, but since each game had a different structure, there was a lot to explain, and I kept hitting the context limit within 2 hours of starting. Rebuilding the context from scratch later the same day was a hassle, and the amount I could get done each day quietly shrank. The code itself was a problem too. UI structure and state management varied from game to game, so fixing something in one game meant making similar fixes everywhere else. Left like this, things would only get worse as more games were added.

Tests Before Refactoring

I wanted to refactor, but actually starting felt a bit scary. Changing structure without tests means something can quietly break without you noticing. Frontend code has many bugs you only discover after deploying and actually using it, which makes it far more nerve-wracking than backend work. So I reversed the order and asked the AI to write tests for the current codebase first: automated checks for whether game data structures are correct, required fields are present, and every multilingual translation exists. With tests in place before touching the code, you know immediately when something breaks.
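As a rough illustration of the kind of check described above, here is a minimal sketch in TypeScript. The field names (id, titleKey, translations), the locale list, and the validateGame helper are all assumptions made for this example; the real game data format is not shown in this post.

```typescript
// Hypothetical required fields and locales; adjust to the real data format.
const REQUIRED_FIELDS = ["id", "titleKey", "translations"] as const;
const LOCALES = ["en", "ko"] as const;

type TranslationTables = Record<string, Record<string, string> | undefined>;

// Returns a list of human-readable problems; an empty list means the game passes.
function validateGame(game: Record<string, unknown>): string[] {
  const problems: string[] = [];

  // 1. Every required field must be present.
  for (const field of REQUIRED_FIELDS) {
    if (game[field] === undefined || game[field] === null) {
      problems.push(`missing field: ${field}`);
    }
  }

  // Without a translations object there is nothing more to check.
  const translations = game["translations"];
  if (typeof translations !== "object" || translations === null) {
    return problems;
  }
  const tables = translations as TranslationTables;
  const titleKey = game["titleKey"];

  // 2. Every locale must exist and must translate the title key.
  for (const locale of LOCALES) {
    const table = tables[locale];
    if (table === undefined) {
      problems.push(`missing locale: ${locale}`);
      continue;
    }
    if (typeof titleKey === "string" && !(titleKey in table)) {
      problems.push(`missing translation: ${locale}.${titleKey}`);
    }
  }
  return problems;
}
```

A test runner would call something like this for every game and fail when the returned list is not empty, so a field dropped during refactoring shows up right away.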

Finding the Common Parts First — Like Backend Development

Once the tests were ready, I got into the refactoring proper. The approach was similar to how I work on backend code. Just like extracting shared logic into a service layer when writing server code, I had the AI first identify what was duplicated across the games. I had it scan through the entire codebase and catalog the common patterns before touching anything. Based on what it found, I went in three directions. First, unifying UI components — the result cards and ranking dashboards that each game had implemented differently were extracted into shared components. Second, pulling shared functionality into hooks — sound handling code that was scattered across multiple games was consolidated into a single hook. Third, separating UI and business logic within files — cases where screen rendering code and game logic were tangled together got cleaned up.
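To make the first and third directions concrete, here is a framework-free sketch in TypeScript. It is not the actual code from the project: GameResult, gradeReaction, renderResultCard, and the grade thresholds are hypothetical names and values chosen only to show the shape of the split.

```typescript
// Hypothetical shared result shape that every game's logic would produce.
interface GameResult {
  gameId: string;
  score: number;
  grade: string;
}

// Business logic: a pure function with no rendering, so it is easy to test.
// The 200 ms / 300 ms thresholds are made up for this example.
function gradeReaction(averageMs: number): GameResult {
  const grade = averageMs < 200 ? "A" : averageMs < 300 ? "B" : "C";
  return { gameId: "reaction", score: Math.round(averageMs), grade };
}

// Shared presentation: one function instead of a separate result card per game.
// A real version would be a UI component; a string keeps the sketch self-contained.
function renderResultCard(result: GameResult): string {
  return `[${result.gameId}] score ${result.score}, grade ${result.grade}`;
}

console.log(renderResultCard(gradeReaction(250)));
// prints "[reaction] score 250, grade B"
```

Once every game returns the same result shape, adding a new game means writing only its logic function, and the AI needs far less existing code as reference.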

Fix, Test, Fix, Test

The work itself was a simple repeating cycle. Have the AI modify a file, run the full test suite, and if it fails, figure out the cause and fix it again. The frustrating part was the AI occasionally not following instructions I had already given. For example, I had already said 'always use the shared component for result cards,' but when it was editing the next file, it would quietly slip back to the old approach. Whether it's the AI forgetting earlier instructions because there are too many files, or some internal priority shift, I don't know — but there were quite a few moments where I had to say the same thing multiple times.

UI verification still needs human eyes. Even after all tests pass, opening the actual screen sometimes showed layout changes or something looking off. I'm still not sure if automating UI review is really feasible. I know tools like visual regression testing exist, but I've never tried them in practice so I have no idea how effective they actually are. Honestly, I'm curious how frontend developers handle this.

Code Shrunk 20%, Working Time Extended by an Hour

The results were better than expected. Individual file sizes shrank noticeably, and overall code volume dropped by about 20%. With less unnecessary content when referencing files for the AI, token consumption went down, and working time per session extended from 2 hours to around 3. That sounds small, but the day-to-day difference was real — re-setting the context twice a day dropped to once. I had already been maintaining AI guidelines before the refactoring, but what made the difference this time was adding content about the refactored structure and when to use which components. The results started coming out more consistently than before, and I sent fewer correction requests.

What If I Had Planned the Architecture from the Start?

After finishing the refactoring, my conclusion was that doing it at this point was not too late. Sure, if I had defined shared components and hook structures from the beginning, I could have saved that time. But honestly, it's hard to know what you'll need before you've built anything. Which patterns would be shared and which parts would become complex only became visible after actually building a few games. If I had tried to design the perfect architecture from the start, I probably would have created a bunch of useless abstractions and delayed getting started.

I think that's just how building a service goes. You build, discover what's missing, fix it, and something else becomes visible. Almost nobody starts with a perfect blueprint, and I don't think that's always the right approach anyway. If there's a lesson from this refactoring, it's that it's better to make time to clean things up before they pile up too much. The sooner you pay down technical debt, the less interest you accrue.
