
Good beatmaps to train my Osu!Automapper?

Topic Starter
mefuri
Hello fellas, this is the first time I've actually posted something on the forums lol

As the title says, you might already be aware of what I'm going to talk about and what it has to do with Osu!.
tl;dr,
I'm making an AI that learns how to map beatmaps and will vomit out raw beatmaps if I force-feed it .mp3 songs. I need help from you guys to find proper beatmaps to train the AI on, so it does its job correctly and no longer gets spanked by me.

First of all, I have read this user's guide to this part of the forum, but I don't think my post fits in the Gameplay & Rankings forum, or even the Beatmaps section, because the nature of this post is really "suggestions for training data" and "data science". But it's still oddly Osu!-related, so it doesn't fit in Off-topic either. Please let me know if there's a more appropriate Osu! forum section.

So, basically, I've made an Artificial Intelligence that learns how to make an Osu!Standard beatmap from songs. I've written a program to convert both a beatmap and its song into the data format I need, and also to reverse that data back into a beatmap. I already have some songs and beatmaps to start 'teaching' the AI how to make new ones, but since this is also the first time I've developed something this complex, my AI is very limited in producing a good beatmap.
...or in nerd language:
I've made a Neural Network using TensorFlow in Python (Anaconda, though) that is designed to map a beatmap for a given song. The data wrapper I've made uses FFmpeg to turn one song into training-sized pieces of a grayscale frequency spectrogram covering some duration interval (with resolution down to one millisecond / 0.001 s), while the beatmap is cut into pieces over the same duration intervals and used as labels synced to the song. The data fed into and spat out of the machine are NumPy vectors (one-dimensional ndarrays, but stacked lol). And since I know TensorFlow accepts an actual one-dimensional ndarray, I feel relieved.

example of one training sample:
Input (song): [0, 0, 0, 10, 68, 210, 256, 256, ... ]
>>> size = height of the spectrogram image * the chosen duration interval
Output/label (beatmap): [1, 0, 0]
>>> size = 3, representing the presence of a circle, slider, or spinner respectively during the current duration interval
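If that's hard to picture, here's a rough sketch of how one (input, label) pair could be built. This is not my actual wrapper code; the interval size, the spectrogram layout, and the hit object list here are simplified assumptions.

import numpy as np

# Sketch only: assumes the spectrogram has one column per millisecond and
# hit_objects is a list of (time_ms, kind) tuples, kind in KINDS below.
INTERVAL_MS = 10                      # the "duration interval" the song is cut into
KINDS = ["circle", "slider", "spinner"]

def make_training_pair(spectrogram, hit_objects, start_ms):
    # Input: the spectrogram columns covering this interval, flattened to 1-D.
    window = spectrogram[:, start_ms:start_ms + INTERVAL_MS]
    x = window.astype(np.float32).ravel()      # size = spectrogram height * interval

    # Label: which object types are present during this interval.
    y = np.zeros(3, dtype=np.float32)
    for time_ms, kind in hit_objects:
        if start_ms <= time_ms < start_ms + INTERVAL_MS:
            y[KINDS.index(kind)] = 1.0
    return x, y

# e.g. spectrogram = np.random.rand(128, 60000)       # 128 freq bins, 60 s of audio
#      make_training_pair(spectrogram, [(35, "circle")], 30)  # -> label [1, 0, 0]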

Despite the fact that I've made it this far, this is one of my first experiences wandering the wilderness of Machine Learning after finishing an Introduction to Machine Learning course last semester in college. So I don't expect much from the AI, but at least I want to try something I've learned before and actually use it in a real-life application.

Here is how it works in a nutshell:
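In code terms, the network itself is nothing exotic: a flattened spectrogram slice goes in, and three independent "is there a circle / slider / spinner here" outputs come out. Something like this, where the layer sizes are placeholders rather than my real ones:

import tensorflow as tf

# Placeholder architecture: flattened spectrogram slice in, three independent
# "presence" outputs (circle, slider, spinner) out. Sizes are made up.
SPEC_HEIGHT = 128
INTERVAL_MS = 10

model = tf.keras.Sequential([
    tf.keras.layers.Dense(256, activation="relu",
                          input_shape=(SPEC_HEIGHT * INTERVAL_MS,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(3, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")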


The AI's limitations prevent me from feeding it almost every beatmap in my Osu! folder.
Those limitations are:

  1. Timing points. Most maps have complex timing point setups that my stupid AI cannot handle. Since I cut maps into pieces of a fixed number of milliseconds, it's possible for a combination of at least two of circle, slider, and spinner to occur during one interval. The problem is caused by either a new inherited timing point (the tp that defines its BPM with an exact number instead of as a % of the previous inherited point) having a different BPM from the previous section, or one being placed at a timestamp that breaks the previous 'beat' cycle. Different BPMs bother me, because my AI can only produce a map with one uniform BPM.
  2. Most maps aren't long enough to properly teach my AI. Marathon maps look like a good solution at first glance, but it turns out not all of them are. Half of marathon songs are compilations; Sotarks' Fhána compilation is a good example to play, but because a compilation consists of songs with different BPMs, I heavy-heartedly threw this beatmap out of the list. This nano.RIPE compilation traumatized my AI even more, and MADs and meme remixes are the worst for it. Sometimes the way to make your AI better is simply to feed it more data.
  3. Vocals are not instruments, you twat. But, y'know, my AI cannot yet separate vocals from instruments. Some vocals can be wrongly interpreted as instruments, and that affects how my AI learns.
  4. Songs that are too loud, have a ridiculously high BPM, sound chaotic, or contain movie SFX (like boom or weeeeeeee bang bang bang skreeee, unless it's used in a rhythmical manner) will affect how the AI learns when it should place hit objects. It could end up putting an unnecessary insane stream on Peer Gynt's "Morning".

But even with those limitations present, it doesn't mean things will stay like that forever; I'll try to improve so that I solve one limitation at a time and hopefully don't generate new ones. Anyway, given the limitations I have, I need suggestions for beatmaps that are compatible with overcoming them. Here are my preferences, if possible (there's a rough filtering sketch right after the list):
  1. Has a small ratio of the number of timing points to the duration of the beatmap, or at least a consistent BPM. The fewer timing points it has, the better.
  2. Preferably has a long duration, >6 minutes if possible.
  3. The fewer vocals, the better; instrumental-only is best.
  4. The song is not too loud and has no other auditory harms, and the beatmap doesn't feature too many streams or other close-proximity objects (in terms of time), only if possible.
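To make the first two preferences concrete, a scan like the one below is roughly how they could be checked automatically. This is an untested sketch; the field positions come from my reading of the .osu format (time,beatLength,... for timing points; the time is the third field of a hit object), so treat those as assumptions rather than gospel.

import glob

# Untested sketch: keep only .osu files with a single uniform BPM and a long
# enough map. Field layout is my reading of the .osu format, so double-check it.
def looks_trainable(path, min_length_ms=6 * 60 * 1000):
    with open(path, encoding="utf-8", errors="ignore") as f:
        lines = [line.strip() for line in f]

    def section(name):
        if f"[{name}]" not in lines:
            return []
        start = lines.index(f"[{name}]") + 1
        body = []
        for line in lines[start:]:
            if line.startswith("["):
                break
            if line:
                body.append(line)
        return body

    # Red (uninherited) timing points have a positive beatLength (ms per beat).
    bpms = set()
    for tp in section("TimingPoints"):
        beat_length = float(tp.split(",")[1])
        if beat_length > 0:
            bpms.add(round(60000 / beat_length, 2))

    # Rough map length: time of the last hit object (third field of each line).
    hit_objects = section("HitObjects")
    length_ms = int(hit_objects[-1].split(",")[2]) if hit_objects else 0

    return len(bpms) == 1 and length_ms >= min_length_ms

# candidates = [p for p in glob.glob("Songs/**/*.osu", recursive=True) if looks_trainable(p)]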

And thus this post concludes ^_^

Thank you guys for even just taking a peek, or for contributing. I have no intention of explicitly advertising my project, but I have to elaborate on some parts of it so my problems are easier to solve. Also, this project was never intended for public release, because it's just a fun project, not really an endeavour to improve science at any point lol. But if you're interested in seeing the code and stuff, take a visit to the github page. I warned you: it's messy; it's my first time making a solo project and using git.

Again, thanks!

Bonus: screenshot
The Osu!Automapper debug program, trying to interpret the objects appearing at the current time, synced with the actual map playing
Sosteneshion
I was using these (some tech maps)


I mean, there are a bunch of different maps, and the meta now, I think, is 1-2 jumps, so you should have it learn from 1-2 jump maps
Topic Starter
mefuri

sosteneshion wrote:

I was using these (some tech maps)

Thanks for the list, I'll check out how they look in-game and start compiling them into training data soon if they're compatible.

I mean, there are a bunch of different maps, and the meta now, I think, is 1-2 jumps, so you should have it learn from 1-2 jump maps

Hmm, thanks for pointing this out. I'll keep in mind to include maps that feature 1-2 jumps, or maybe any other features that go unmentioned.

I've started thinking about whether I should keep a separate Neural Net profile (the weights and biases) for each feature group of maps, like one profile for streams, one profile for jumps, etc., or just mash them all together. I'll get to this later.
mulraf
first of all: wow. this really sounds awesome. i really hope you can pull this off :D

second: https://osu.ppy.sh/community/forums/2 this would probably be the forum you've been searching for

third: do you mean "timing points" as in just timing points, or do you also want as few 'inherited points' as possible?

some suggestions with few timing points, >6 minutes, good audio quality and partly not too many vocals:
https://osu.ppy.sh/beatmapsets/21678#osu/153246
https://osu.ppy.sh/beatmapsets/111386#osu/289752
https://osu.ppy.sh/beatmapsets/603755#osu/1649874
https://osu.ppy.sh/beatmapsets/765801#osu/1610046
https://osu.ppy.sh/beatmapsets/514601#osu/1093078
https://osu.ppy.sh/beatmapsets/650738#osu/1378892 (don't know if duplicate songs are any help or if you don't need two of the same song)
https://osu.ppy.sh/beatmapsets/346853#osu/765525
https://osu.ppy.sh/beatmapsets/404844#osu/880206
https://osu.ppy.sh/beatmapsets/847996#osu/1773372
https://osu.ppy.sh/beatmapsets/82635#osu/228607 (actually very few inherited points too, if that matters)
https://osu.ppy.sh/beatmapsets/725851#osu/1532489
https://osu.ppy.sh/beatmapsets/297110#osu/667021 (this actually has 1 timing point but fewer inherited points than all the other maps, and it's otherwise fitting, in case that's what you're looking for)

okay, so i have more, but i don't think it makes sense to keep looking a) before you make the part i'm unsure about more precise and b) if that was what you're looking for, then it's probably enough :D
Topic Starter
mefuri
I recently ran into a problem with my uni over my 4th semester administration and spent days resolving it, so I've only just got time to reply now.

Reply to mulraf

mulraf wrote:

first of all: wow. this really sounds awesome. i really hope you can pull this off :D

second: https://osu.ppy.sh/community/forums/2 this would probably be the forum you've been searching for

Thank you for these! This might help me for future improvements.

third: do you mean "timing points" as in just timing points, or do you also want as few 'inherited points' as possible?

I mean 'timing points' in general, both the ones with an exact BPM value and the ones that use a percentage value. However (please forgive me, I am actually confused by the term 'inherited'), if that isn't possible, I prefer the ones with the minimal number of timing points that use an exact BPM value (those are the ones called inherited, aren't they?). I think you suggested one of them, and thanks bruh.

https://osu.ppy.sh/beatmapsets/650738#osu/1378892 (don't know if duplicate songs are any help or if you don't need two of the same song)

I haven't decided about duplicate songs. I speculated that using duplicated songs is like teaching a child mathematics with formula A, then with an unfamiliar formula B for the same problem, so the AI might mess up the output if I ask it which formula solves the problem. But nobody knows until we try, so it's OK for now.

okay, so i have more, but i don't think it makes sense to keep looking a) before you make the part i'm unsure about more precise and b) if that was what you're looking for, then it's probably enough :D

I have a lot of beatmaps untested, and I am still redesigning how to feed the AI properly using multiple beatmaps in one batch, and I haven't decided whether I should separate maps whose features are distinct from each other (like, I won't teach the AI streams while I am teaching it jumps). Now that the problem is, I hope, described more precisely, feel free to add more, but don't rush it since I already have an answer from you. I didn't expect that many; it's enough for now ^^


Just a few updates.
I paused the training progress since I think there is a flaw in my method of teaching the AI, and my lack of expertise with the Neural Network framework, TensorFlow, makes me want to study it more before I train my AI, to prevent it from making a Centipede-like beatmap. I want it to be playable.

Also, I've run into another problem. The website I used to do machine learning stuff, Colaboratory, has some problem that prevents me from signing in and uploading the beatmaps for training. The site lets us use their monstrous GPUs and RAM for insane compute stuff for free, as long as it is a machine learning thing. There is a similar site called Kaggle, and I think I am going to give it a try and do the training over there. The migration will take a while, but don't worry, it is just a matter of copy-and-paste.

Thank you for following along and helping me! <3

Edit: Oh, I just noticed my forum thread has been moved to the more appropriate forum section, thank you!
mulraf
the difference between inherited points and timing points is that timing points are usually used when the song's bpm changes. inherited points are only used to make sliders go faster or slower. they are the ones 'with percent values'.

if you go to this beatmap: https://osu.ppy.sh/beatmapsets/800640#osu/1680868 for example, then go to the editor, 02:47:506 these are timing points. when you click on the timing tab you will see that the metronome goes slower and sloower and slooower because the song also gets slower.
01:07:662 these are called inherited points. again on the timing panel you will see that the metronome sounds exactly the same all the time even though there are tons of points. that is because the song actually isn't faster or slower, it's just that the mapper decided that the sliders should be faster or slower in these parts.
you can see the differentiation when you click on "Timing Setup Panel":


edit: damn it, messed the "e.g." arrow up xD just the wrong one. but i think you get what i meant.
Vuelo Eluko
Ambitious, hope to see you in https://osu.ppy.sh/forum/2
abraker
Just make sure it does not learn pp mapping or it's going to become the next devil's spawn in mapping
Topic Starter
mefuri

mulraf wrote:

the difference between inherited points and timing points is that timing points are usually used when the song's bpm changes. inherited points are only used to make sliders go faster or slower. they are the ones 'with percent values'.

Aaaah i seee, thanks for the enlightenment uwu. My previous confusion was caused by the ambiguous explanation of timing points in the Osu!Wiki's page about the .osu file format. There is a parameter called 'Inherited', as explained in the wiki:
  1. Inherited (Boolean: 0 or 1) tells if the timing point can be inherited from.

So timing points get the value True (1) and inherited points get the value False (0), which implicitly tells us an inherited point is not an inherited point lol. Just another programming documentation mistake. I used to think I should raise this disambiguation problem, but so far I've never seen anyone troubled by it, so ok then.
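For anyone else who hits the same confusion: you can ignore that boolean column entirely and just look at the sign of the second value (beatLength). As far as I can tell, positive means a red/uninherited point (it sets the BPM directly) and negative means a green/inherited one (a percentage slider-velocity change), so a tiny check like this is unambiguous:

def classify_timing_point(line):
    # A [TimingPoints] line looks like "time,beatLength,meter,...".
    # beatLength > 0 -> uninherited (red) point, defines the BPM directly.
    # beatLength < 0 -> inherited (green) point, a % slider-velocity change.
    fields = line.split(",")
    time_ms, beat_length = int(float(fields[0])), float(fields[1])
    if beat_length > 0:
        return time_ms, "uninherited", 60000 / beat_length   # BPM
    return time_ms, "inherited", -100 / beat_length          # SV multiplier

# classify_timing_point("0,375,4,2,0,60,1,0")     -> (0, 'uninherited', 160.0)
# classify_timing_point("25000,-50,4,2,0,60,0,0") -> (25000, 'inherited', 2.0)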

Vuelo Eluko wrote:

Ambitious, hope to see you in https://osu.ppy.sh/forum/2

Thank you for the encouragement! Although I don't really aim for this to be a real program everyone can use, I will post it there anyway for everyone to use or to continue developing once I've finished my fun experiments.

abraker wrote:

Just make sure it does not learn pp mapping or it's going to become the next devil's spawn in mapping

lmao, the first beatmap the AI tried to produce was literally a map consisting of circles every 1 millisecond, but it is getting more 'normal' with every training run. I'll try my best to make the AI produce a moderate map.

---

There are a few more things if you want to know more about this project:
The actual aim for the AI is to produce a beatmap that is human-playable. However, this does not mean the hit objects in the beatmap are going to be placed tidily or smoothly, at least for now. This is because the AI's output 'holes' consist of only three channels, which only tell us about the presence of circles, sliders, and spinners. Therefore, the AI does not say 'this circle should be put at this coordinate' or anything like that.

The current condition of the output forces us as human mappers to refine the AI-made beatmap. The AI decides when the hit objects appear, and we decide where the hit objects appear.

To compensate for this, I'm making an algorithm to 'refine' the output. After the AI has done the timing placement of the hit objects, this program eases our job of positioning the objects by 'spreading' them out prior to manual editing. It hasn't been committed to GitHub yet because it is basically not even halfway done.
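To give an idea of what I mean by 'spreading' (this is just a toy illustration, not the actual refiner): take the times the AI produced and walk the objects around the playfield so that objects close in time stay close in space.

import math

# Toy illustration of "spreading": bigger time gaps get bigger jumps across
# the playfield, tiny gaps keep objects close together. Not the real refiner.
PLAYFIELD_W, PLAYFIELD_H = 512, 384    # osu! playfield size in osu!pixels

def spread(times_ms, step_px=80):
    cx, cy = PLAYFIELD_W / 2, PLAYFIELD_H / 2
    placed, angle = [], 0.0
    for prev, t in zip([times_ms[0]] + times_ms, times_ms):
        dist = min(step_px * (t - prev) / 200.0, 150.0)
        angle += 2.3                   # keep turning so positions don't overlap
        cx = min(max(cx + dist * math.cos(angle), 0), PLAYFIELD_W)
        cy = min(max(cy + dist * math.sin(angle), 0), PLAYFIELD_H)
        placed.append((int(cx), int(cy), t))
    return placed

# spread([1000, 1200, 1400, 2000]) -> [(x, y, time_ms), ...] ready for manual touch-up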


Thanks for coming!
AstralKun
Now I'm curious: since there are different types of beatmap styles, are you going to specialise per beatmap style, for example 4 different AIs each specialising in a certain style? Then again, there are many possibilities, so good luck!
Topic Starter
mefuri

AstralKun wrote:

Now I'm curious: since there are different types of beatmap styles, are you going to specialise per beatmap style, for example 4 different AIs each specialising in a certain style? Then again, there are many possibilities, so good luck!

Right now, I am just feeding the AI 'Normal' maps. By 'Normal' I mean maps with a lower frequency of special features/styles, like jumps or streams for example. I also plan to separate maps with different playstyles in the future, so different AIs can be trained on different playstyles.

I previously scrapped this idea: instead of playstyles, I once thought it would be better if the AI were trained per mapper, so for example one AI could mimic Sotarks' style while mapping.

Thank you for coming uwu

And if you're interested in some nerd terms:
Since the AI model I am using is a Neural Network, it uses weights and biases stored in two lists/arrays, consisting of weight matrices and bias vectors respectively. So when I finish training the AI on one playstyle, I can just export those somewhere else, like to a .txt, and give it new, random weights and biases when I want to train the AI on a different playstyle.
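Concretely, the export/reset step could look something like this (just a sketch; I'm assuming a tf.keras-style model whose get_weights()/set_weights() give and take a list of ndarrays):

import numpy as np

# Sketch of exporting a trained "profile" to plain text files and resetting
# the model to fresh random weights for the next playstyle.
def export_profile(model, prefix):
    for i, arr in enumerate(model.get_weights()):
        np.savetxt(f"{prefix}_layer{i}.txt", arr.reshape(arr.shape[0], -1))

def import_profile(model, prefix):
    loaded = []
    for i, arr in enumerate(model.get_weights()):
        flat = np.loadtxt(f"{prefix}_layer{i}.txt").reshape(arr.shape)
        loaded.append(flat.astype(arr.dtype))
    model.set_weights(loaded)

def reset_profile(model):
    model.set_weights([np.random.randn(*arr.shape).astype(arr.dtype) * 0.05
                       for arr in model.get_weights()])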