osu! Beatmap Atlas: an embedding visualization tool for beatmaps

Topic Starter
Ameo


Hey all! I wanted to share a project I've been working on for the past month or so with the tentative name "osu! Beatmap Atlas":

https://osu-map.ameo.design/

It's a web-based tool for exploring the world of osu! beatmaps via an embedding visualization. I'm hoping it will be useful for finding maps to play which are similar to your favorites, comparing your playstyle and progression to other players, and just exploring the world of osu! and seeing how the meta has evolved over the years.

If you enter your osu! username, it will highlight scores that have been in your top 100 as well as find your best play on other maps when you click on them. I've also added some other features like pp simulation for FCs with different accs.

----

Anyway, it's definitely not "done" yet, but I think it's finally in a very usable state! I'd love to hear feedback from people on what they think of it, if any parts are confusing, etc. and ofc I'd greatly appreciate any bug reports.

So yeah - please give it a try and let me know what you think!

Observations


While building the tool and playing with it myself, I've observed some pretty cool patterns and interesting things about the embedding.

One of my favorites is "DTEZ Island" way out on its own on the far left side of the viz which consists of only DTEZ scores. It's very isolated from other scores due to how niche the DTEZ playstyle is.

Another cool spot I've found is that there seem to be two distinct "paths" from ~500pp -> 1000pp: one speed, and one aim.

If you look at the top right of the viz where all the elite plays are and set the color mode to "aim/speed ratio", it's clear that the right side has a lot of speed/stream maps like Sidetracked Day [Daydream] +DT while the left side has mostly aim-heavy maps like PADORU / PADORU [Gift] +DT.

The fact that those two paths are separated from each other seems to indicate that elite players end up specializing in either speed or aim once they reach that level - although there are certainly some players that break that mold.

The whole atlas also tends to arrange from low difficulty to high difficulty across the horizontal axis, which makes a lot of sense given how it was created. It's notable that all of these patterns are emergent; they arise naturally from the data itself rather than being designed or engineered.

Besides difficulty, I'd say the other attribute that has the biggest impact at a global level is release year. At a local level, mod combos (DT vs nomod vs HR, etc.) tend to be very impactful.

Technical Details


First off, the whole project is entirely open source: https://github.com/ameobea/osu-embeddings

I used the decade's worth of data I've collected from my osu!track to create a big correlation matrix between beatmaps - encoding within it data about relationships between maps, mods, and more.
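
For anyone curious what a map-to-map relationship matrix can look like, here's a minimal sketch. It assumes the raw data is a list of each user's top plays as (beatmap ID, mods) pairs; that layout, the sample IDs, and the `build_cooccurrence` name are made up for illustration and are not the actual osu!track schema or the project's code. It uses plain co-occurrence counts as a simple stand-in for the correlation measure.

```python
from collections import Counter
from itertools import combinations

def build_cooccurrence(top_plays_by_user):
    """Count how often two beatmap+mod combos appear in the same user's top plays.

    `top_plays_by_user` maps a user ID to a list of (beatmap_id, mods) tuples
    (a hypothetical layout). Returns a Counter keyed by an ordered pair of combos.
    """
    counts = Counter()
    for plays in top_plays_by_user.values():
        # Deduplicate and sort so each unordered pair has one canonical key.
        unique = sorted(set(plays))
        for a, b in combinations(unique, 2):
            counts[(a, b)] += 1
    return counts

# Hypothetical sample data: user ID -> [(beatmap_id, mods), ...]
sample = {
    1: [(111, "DT"), (222, ""), (333, "HR")],
    2: [(111, "DT"), (222, "")],
    3: [(222, ""), (333, "HR")],
}
cooc = build_cooccurrence(sample)
```

Treating each beatmap+mod combo as its own key is what lets a layout separate, say, a DT play from a nomod play of the same map.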

I'm happy to share any of the data I used to create this - just let me know.

After some pre-processing, I turn that correlation graph into an embedding using some Python libraries and then project it down into 2D with UMAP. The result is a big 2D map where each beatmap+mod is assigned a 2D coordinate. More similar beatmaps end up close to each other and vice versa.
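
Since nearby points are meant to be similar maps, "find maps like my favorite" boils down to a nearest-neighbor lookup in the 2D coordinates. Here's a rough sketch of that idea; the coordinates, IDs, and the `nearest_maps` helper are all hypothetical and not taken from the Atlas code.

```python
import math

def nearest_maps(embedding, query_key, k=3):
    """Return the k entries closest to `query_key` by Euclidean distance in 2D."""
    query_pos = embedding[query_key]
    others = (
        (math.dist(query_pos, pos), key)
        for key, pos in embedding.items()
        if key != query_key
    )
    # Sort by distance (ties fall back to comparing the keys) and keep the top k.
    return [key for _, key in sorted(others)[:k]]

# Hypothetical coordinates: (beatmap_id, mods) -> (x, y)
embedding = {
    (111, "DT"): (0.0, 0.0),
    (222, ""): (1.0, 0.0),
    (333, "HR"): (0.0, 3.0),
    (444, ""): (5.0, 5.0),
}
closest = nearest_maps(embedding, (111, "DT"), k=2)
```

A linear scan like this is fine for a sketch; with tens of thousands of points, a spatial index would be the more usual choice.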

I spent a good bit of effort tuning the various parameters of the embedding process to get an output that looks good, is easy to interpret (doesn't have 300 circles all stacked on top of each other), and still conveys the core information about the relationships between the beatmaps.

I then dump the whole embedding along with beatmap metadata into a binary file which is downloaded by the frontend.
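
As an illustration of what a compact binary dump like that could look like, here's a small sketch using Python's `struct` module. The record layout (a uint32 count header, then little-endian records of beatmap ID, x, y, and star rating) is purely an assumption for the example and not the actual file format the Atlas uses.

```python
import struct

# Hypothetical record layout (little-endian): uint32 beatmap ID,
# float32 x, float32 y, float32 star rating.
RECORD = struct.Struct("<Ifff")

def pack_embedding(rows):
    """Serialize rows into a uint32 count header followed by fixed-size records."""
    out = bytearray(struct.pack("<I", len(rows)))
    for beatmap_id, x, y, stars in rows:
        out += RECORD.pack(beatmap_id, x, y, stars)
    return bytes(out)

def unpack_embedding(blob):
    """Inverse of pack_embedding; mirrors what a frontend parser would do."""
    (count,) = struct.unpack_from("<I", blob, 0)
    return [RECORD.unpack_from(blob, 4 + i * RECORD.size) for i in range(count)]

# Values chosen to be exactly representable as float32, so they round-trip.
rows = [(111, 0.5, -1.25, 5.5), (222, 2.0, 3.0, 6.75)]
blob = pack_embedding(rows)
decoded = unpack_embedding(blob)
```

Fixed-size records keep the file small and let the browser read it straight out of an `ArrayBuffer` without parsing JSON.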

This whole process is handled via a series of Python notebooks.

Speaking of the frontend, that's built with SvelteKit. The embedding visualization itself is built in a pretty low-level manner with hand-written WebGL shaders for the circles, manual input handlers, hit testing, coloring, and everything else.

I use the awesome Rust `rosu` libraries by /u/badewanne3 for several features like pp simulation, computing aim/speed ratio, star difficulties with mods, and stuff like that. They were indispensable for this project.

I'm happy to answer any questions about any of these pieces as well.
Kurboh
ooo, really cool!
Stoneybeans
Very cool. I'd really like to be able to type in the values for pp/stars/length filters instead of the current setup. Since the max/min are so far from the norm of each setting, trying to be precise to find stuff can be a pain. I simply cannot filter for 5.40-5.50 star rating maps of 1:40 length, for example.
McEndu
Any thoughts on other modes?
niat0004
The default-settings map looks a bit like Denmark and its surrounding area.
Topic Starter
Ameo
I've released a new version of the atlas with the latest data up to today.

There are about 3000 more maps included in the visualization, and a significantly higher number of data points used to create it thanks to an integration between osu!track and the popular Bathbot for Discord.

If you preferred the way the old one looked, there's a button at the bottom of the left menu in the Atlas that you can click to bring it back.

That being said, I'm personally finding that the new one seems to be a bit better!
Eyeonized
This looks absolutely fantastic. keep up the good work <3
Unfortunately my brain seems to be a little smooth. While the X axis dictates star rating (on default settings), What does the Y axis show?
Topic Starter
Ameo

Eyeonized wrote:

This looks absolutely fantastic. keep up the good work <3
Unfortunately my brain seems to be a little smooth. While the X axis dictates star rating (on default settings), What does the Y axis show?
The axes don't actually correspond to anything exactly. It's true that the X axis largely aligns with star rating/pp, but it wasn't designed to do that. The embedding algorithms used to create the atlas end up doing that because it's an efficient way to represent the relationships in the data.

So yeah, there isn't really any defined global meaning of a map's position on the X axis. I have noticed that map release year is a pretty important feature though and explains at least some of the variance there.
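
One way to see why the axes carry no fixed meaning: layouts like this are generally only meaningful in terms of which points are near which, and rotating the whole map leaves every distance unchanged. The sketch below checks that on a few made-up points; it's only an illustration and not part of the Atlas code.

```python
import math

def rotate(points, degrees):
    """Rotate 2D points about the origin by the given angle."""
    t = math.radians(degrees)
    c, s = math.cos(t), math.sin(t)
    return [(x * c - y * s, x * s + y * c) for x, y in points]

def pairwise_distances(points):
    """All pairwise Euclidean distances, in a fixed order."""
    return [
        math.dist(points[i], points[j])
        for i in range(len(points))
        for j in range(i + 1, len(points))
    ]

# Made-up coordinates standing in for a handful of maps.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 3.0), (5.0, 5.0)]
rotated = rotate(points, 37)

before = pairwise_distances(points)
after = pairwise_distances(rotated)
# Every coordinate changed, but the distances (up to float error) did not.
same_layout = all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(before, after))
```

So a rotated copy of the atlas would encode the same neighborhoods, just with difficulty running along some other direction.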