Automatically extract information from http://minecraft.gamepedia.com/ ? #229

rom1504 · 2015-03-22T18:13:00Z

http://minecraft.gamepedia.com/ is a very complete reference on many aspects of Minecraft.

There are already some scripts (https://github.com/andrewrk/mineflayer/blob/master/bin/transform1_recipes.js for example, currently broken though) to extract the recipes from that wiki.
I think we could extract more, for example everything in the infobox (compare http://minecraft.gamepedia.com/Rabbit%27s_Foot with https://github.com/andrewrk/mineflayer/blob/master/lib/enums/items.json#L950 ).

I'm not sure this can really be applied here, but http://dbpedia.org/ has a really good framework for extracting information from Wikipedia infoboxes. The infoboxes on http://minecraft.gamepedia.com/ look just like the ones on Wikipedia, so that might be worth looking into.

@Kupferhirn has extracted the items manually from that wiki (see #227) and that's nice, but doing the same thing automatically would be really nice.

Edit: well, I think the DBpedia extraction framework is probably way too big for this; writing some simple scripts would be easier.

thejoshwolfe · 2015-03-22T21:01:43Z

An alternative to scraping a wiki is to install debug statements into the game itself. That would be guaranteed to be 100% correct and complete (at least for the mechanical data like id numbers), but it relies on the Minecraft Coder Pack project being caught up to the latest version of Minecraft. I can't really find any authoritative information on MCP anymore; I wonder if that project is still alive.

rom1504 · 2015-03-22T21:19:43Z

I think the official MCP site is here: http://www.modcoderpack.com/website/releases
Yeah I agree there are many ways to do it.
For example @deathcap is working on upgrading burger (TkTech/Burger#12)

So I think any way we can extract this information automatically is fine.

roblabla · 2015-03-22T21:48:28Z

Relying on MCP is a bad idea. The project seems very volatile, sadly. A bukkit or forge plugin could also extract information and would seem more stable.

thejoshwolfe · 2015-03-22T21:59:31Z

I thought Forge was built on MCP. Maybe it used to be? If Forge works with 1.8.3, then that seems like the way to go.

What seems so attractive about a mod/plugin is that all the heavy data comes straight from Mojang. The only thing the community provides in this case is a scraping tool. The wiki is community maintained, and might be wrong. Bukkit is community maintained and might be wrong.

The downside of scraping the minecraft binary itself is that you don't always get very good string names and descriptions. Perhaps scraping would only be appropriate for recipes and a sanity check list of id numbers.

roblabla · 2015-03-22T22:04:27Z

Forge is built on MCP, but public builds of MCP take longer and longer to get released.

Bukkit is based on Mojang's Minecraft server, so it can hardly be wrong. They use a technique similar to MCP's, but do it themselves.

thejoshwolfe · 2015-03-22T22:17:05Z

> Bukkit is based on mojang's minecraft server, it can hardly be wrong.

Bukkit currently doesn't know about Granite: https://github.com/Bukkit/Bukkit/search?utf8=%E2%9C%93&q=granite (contrast with: https://github.com/Bukkit/Bukkit/search?utf8=%E2%9C%93&q=acacia )

Bukkit, like the wiki, is supposed to be kept up to date by the community. This makes it inherently less trustworthy than the actual data in the notchian game itself, which we know must be right at all times by definition.

A Forge plugin still seems like the most reliable solution to me at this point.

roblabla · 2015-03-22T22:30:48Z

This is the wrong repo. The Bukkit repo's last commit was in August 2014. Spigot is still up to date and does know about granite, prismarine, etc.

thejoshwolfe · 2015-03-22T22:32:46Z

> This is the wrong repo.

Oh ok. Where do we get the current source? Or are you proposing we write a Bukkit plugin to dump the data from the Bukkit runtime binary?

roblabla · 2015-03-22T22:33:12Z

Yes, that's what I was proposing. A Forge plugin works too, though.

The current source is closed due to the DMCA stuff.

rom1504 · 2015-03-22T22:57:39Z

I started fixing the recipes extractor.
As expected, not all of the block info is correct. For example, the "Trapdoor" at https://github.com/andrewrk/mineflayer/blob/master/lib/enums/blocks.json#L1096 is now named "Wooden Trapdoor" (http://minecraft.gamepedia.com/Trapdoor#Crafting).

I think there are many other such errors, which is why some kind of automatic extractor is needed.

I will still update the recipes, but they won't be perfect until we have an extractor for the blocks and the items (the recipes extractor depends on having correct items.json and blocks.json).

…ecipes with that. Also put the output file in the arguments of the file instead of printing to stdout. I used merge_recipes.js so recipes aren't changed, just added. blocks.json and items.json aren't fully updated (see #229) so some recipes are probably still missing.

rom1504 · 2015-03-23T00:17:36Z

I'm currently extracting from the HTML of http://minecraft.gamepedia.com/Crafting#Complete_recipe_list but that's neither reliable nor easy.
Getting the wiki source might be useful. I didn't find out how to do that for the complete list, but it's possible for a single item (for example http://minecraft.gamepedia.com/index.php?title=Andesite&action=edit&section=3), which might be easier to parse.
Using the individual pages would require fetching them all, and that should be integrated into the script.
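A minimal sketch of how the script could build the URL for one page's wiki source. It assumes the wiki is a standard MediaWiki install where `action=raw` on `index.php` returns the unrendered wikitext (rather than the `action=edit` form linked above). `rawSourceUrl` and `WIKI_BASE` are made-up names for illustration, not existing mineflayer code.

```javascript
// Assumption: the wiki exposes the standard MediaWiki index.php entry point.
const WIKI_BASE = 'http://minecraft.gamepedia.com';

// Hypothetical helper: URL that should return the raw wikitext of one page.
function rawSourceUrl(title) {
  const url = new URL('/index.php', WIKI_BASE);
  url.searchParams.set('title', title);
  url.searchParams.set('action', 'raw');
  return url.toString();
}

console.log(rawSourceUrl('Andesite'));
// prints "http://minecraft.gamepedia.com/index.php?title=Andesite&action=raw"
```

Fetching each URL and caching the result on disk would keep the number of requests to the wiki low while the parser is being developed.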

The wiki source is generally much easier to parse than the HTML, and it might be possible to parse the item and block information from it (see the source of the infobox at http://minecraft.gamepedia.com/index.php?title=Andesite&action=edit).
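To illustrate why the wiki source is easier to work with, here is a rough sketch of reading `|key=value` parameters out of an infobox template. The sample wikitext is hand-written for this example and is not copied from the Andesite page. `parseInfobox` is a hypothetical helper. It assumes the infobox is the first template on the page and has one parameter per line, which will not hold for every page.

```javascript
// Hand-written sample in the general shape of a MediaWiki infobox; not real wiki content.
const sampleWikitext = [
  '{{Block',
  '|image=Andesite.png',
  '| transparency = No',
  '|stackable=Yes (64)',
  '}}',
  'Some article text.'
].join('\n');

// Hypothetical helper: returns the template name and its |key=value parameters.
// Assumes one parameter per line and no nested templates.
function parseInfobox(wikitext) {
  const result = { template: null, fields: {} };
  let inside = false;
  for (const line of wikitext.split('\n')) {
    const trimmed = line.trim();
    if (!inside) {
      if (trimmed.startsWith('{{')) {
        inside = true;
        result.template = trimmed.slice(2).trim();
      }
      continue;
    }
    if (trimmed.startsWith('}}')) break; // end of the first template
    const match = /^\|\s*([^=|]+?)\s*=\s*(.*)$/.exec(trimmed);
    if (match) result.fields[match[1]] = match[2].trim();
  }
  return result;
}

console.log(parseInfobox(sampleWikitext));
```

Nested templates and multi-line values would need a real wikitext parser, so this is only a starting point for the simple cases.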

Edit: apparently the complete list is generated by a script like this one: http://minecraft.gamepedia.com/Module:Recipe_list . This might be useful.

Edit2: some of the recipes carry a "Pocket Edition only" or "Console edition only" note. The script should check for that (and remove any recipes that shouldn't have been added).

Kupferhirn · 2015-03-23T08:08:07Z

"trapdoor" is the unlocalized name from the notchian client. I have checked all blocks that could have changed.

rom1504 · 2015-03-23T08:55:42Z

@Kupferhirn "name": "trapdoor" is OK; the problem is "displayName": "Trapdoor",
and other similar cases (I think most blocks/items with a qualifier like this have a problem, at least in the displayName).
I need the display names to be consistent for my recipe extraction script.

I don't have it right now, but I'll post a list of the blocks/items with problems here tonight if that would be useful.
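A list like that could be produced with a small check along these lines. This is only a sketch: the two block entries and the set of wiki names are made-up samples using the "name" and "displayName" fields mentioned above, and `findUnknownDisplayNames` is a hypothetical helper.

```javascript
// Made-up excerpt in the shape of blocks.json (only the two fields discussed here).
const blocks = [
  { name: 'trapdoor', displayName: 'Trapdoor' },
  { name: 'stone', displayName: 'Stone' }
];

// Made-up set of names as they would appear in the wiki recipes.
const wikiNames = new Set(['Wooden Trapdoor', 'Stone']);

// Hypothetical helper: entries whose displayName the wiki does not use.
function findUnknownDisplayNames(entries, knownNames) {
  return entries.filter((entry) => !knownNames.has(entry.displayName));
}

const problems = findUnknownDisplayNames(blocks, wikiNames);
console.log(problems.map((entry) => entry.name)); // prints [ 'trapdoor' ]
```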

rom1504 · 2015-03-23T09:38:59Z

So I found out a bit more about these recipe-related scripts:

rom1504 · 2015-03-23T09:51:28Z

See this http://minecraft.gamepedia.com/Talk:Crafting#Wiki_source_of_the_recipes

rom1504 · 2015-03-23T09:57:04Z

The furnace recipes are here: http://minecraft.gamepedia.com/Smelting

For the brewing stand: http://minecraft.gamepedia.com/Brewing

See http://minecraft.gamepedia.com/Template:Grid#Other_templates for various grid-related pages.

rom1504 · 2015-03-27T12:25:57Z

This should somehow go into https://github.com/PrismarineJS/minecraft-data

rom1504 · 2015-03-27T14:34:03Z

I think I might just start by making a script to get the wiki source of everything on the wiki, because there is a lot of information on it, not just recipes.
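As a starting point, here is a sketch of how such a script could enumerate page titles. It assumes the wiki exposes the standard MediaWiki API at `/api.php` (not verified here) and uses the `allpages` list with `apcontinue` for paging. `allPagesUrl` and `titlesFromResponse` are hypothetical helper names, and the response object is hand-written in the documented shape, not real wiki output.

```javascript
// Assumption: the standard MediaWiki API endpoint is available at this path.
const API_URL = 'http://minecraft.gamepedia.com/api.php';

// Hypothetical helper: URL for one batch of page titles, optionally resuming
// from the continuation value returned by the previous batch.
function allPagesUrl(continueFrom) {
  const url = new URL(API_URL);
  url.searchParams.set('action', 'query');
  url.searchParams.set('list', 'allpages');
  url.searchParams.set('aplimit', 'max');
  url.searchParams.set('format', 'json');
  if (continueFrom) url.searchParams.set('apcontinue', continueFrom);
  return url.toString();
}

// Hypothetical helper: pull the titles out of a parsed JSON response.
function titlesFromResponse(body) {
  const pages = (body.query && body.query.allpages) || [];
  return pages.map((page) => page.title);
}

// Hand-written response for illustration only.
const fakeResponse = {
  query: {
    allpages: [
      { pageid: 1, ns: 0, title: 'Andesite' },
      { pageid: 2, ns: 0, title: 'Brewing' }
    ]
  }
};

console.log(allPagesUrl());
console.log(titlesFromResponse(fakeResponse));
```

Each title could then be fetched as wiki source and stored locally, so the recipe, block, and item extractors all work from the same dump.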

pokeball99 · 2015-03-27T16:27:35Z

Or have said info hosted in a new repo, and get it to draw info from it.

rom1504 · 2015-03-27T17:43:06Z

@pokeball99 that's already done there but we still need to extract minecraft info to put it in minecraft-data ;)

rom1504 · 2015-03-27T17:52:11Z

OK, the issue PrismarineJS/minecraft-data#8 tracks the progress of the wiki extraction.
If someone wants to work on extraction from Burger, MCP, or anything else, they can open an issue on the minecraft-data repo.

Closing this issue.

rom1504 mentioned this issue Mar 22, 2015

Update block enum for 1.8 #230

Merged

deathcap mentioned this issue Mar 22, 2015

Missing Some 1.7 Blocks/Items #189

Closed

rom1504 added the data-related label Mar 27, 2015

rom1504 mentioned this issue Mar 27, 2015

Automatically extract information from http://minecraft.gamepedia.com/ PrismarineJS/minecraft-data#8

Closed

rom1504 closed this as completed Mar 27, 2015
