Osu! beatmaps are text data, so theoretically GPT should be able to learn to generate them pretty quickly. I doubt GPT-3 or GPT-4 have been trained on beatmaps specifically, but if someone can fine-tune a lower resource model like Alpaca 7B on a multimodal dataset containing all previously mapped songs and their corresponding beatmaps, it's possible that the model could learn to associate audio files with the beatmap outputs right? Of course you would have to turn songs into machine-readable features like spectrograms or something. But beatmap files are basically code files and we know GPT is really good at generating code.
I also asked ChatGPT what information a hitobject string represents and it gave an almost correct answer!
Can you explain what each section of this hitobject represents (separated by commas)? 256,192,540,70,0,L|257:217,1,25.000000372529,2|2,0:0|0:0,0:0:0:0:
-----
Sure, I can explain each section of the hitobject string you provided:
256,192,540,70,0,L|257:217,1,25.000000372529,2|2,0:0|0:0,0:0:0:0:
Each section is separated by a comma, and they represent the following information:
X-coordinate: 256
Y-coordinate: 192
Time: 540
Hit object type: 70 (this represents a circle)
Hit sound: 0 (this means no sound will be played when the object is hit)
Hit object extras: L|257:217 (this represents a slider, with the slider's endpoint at x-coordinate 257 and y-coordinate 217, forming a straight line between the circle and endpoint)
Edge hitsounds and timing points: 1,25.000000372529,2|2 (this represents two timing points - the first one is at time 540 with a 1/1 beat length and no hitsound, and the second one is at time 565 with a 1/1 beat length and a hitsound of 2)
Slider path: 0:0|0:0 (this represents the path the slider follows, in this case it's just a straight line from the circle to the endpoint)
Slider edge hitsounds: 0:0:0:0: (this means no hitsounds will be played when the slider is hit)
I hope that helps! Let me know if you have any other questions.
I also asked ChatGPT what information a hitobject string represents and it gave an almost correct answer!
Can you explain what each section of this hitobject represents (separated by commas)? 256,192,540,70,0,L|257:217,1,25.000000372529,2|2,0:0|0:0,0:0:0:0:
-----
Sure, I can explain each section of the hitobject string you provided:
256,192,540,70,0,L|257:217,1,25.000000372529,2|2,0:0|0:0,0:0:0:0:
Each section is separated by a comma, and they represent the following information:
X-coordinate: 256
Y-coordinate: 192
Time: 540
Hit object type: 70 (this represents a circle)
Hit sound: 0 (this means no sound will be played when the object is hit)
Hit object extras: L|257:217 (this represents a slider, with the slider's endpoint at x-coordinate 257 and y-coordinate 217, forming a straight line between the circle and endpoint)
Edge hitsounds and timing points: 1,25.000000372529,2|2 (this represents two timing points - the first one is at time 540 with a 1/1 beat length and no hitsound, and the second one is at time 565 with a 1/1 beat length and a hitsound of 2)
Slider path: 0:0|0:0 (this represents the path the slider follows, in this case it's just a straight line from the circle to the endpoint)
Slider edge hitsounds: 0:0:0:0: (this means no hitsounds will be played when the slider is hit)
I hope that helps! Let me know if you have any other questions.