User:Keller999/projects/gamepageupdate: Difference between revisions

Updated script to v1.02, please see Version 1.02 Updated
m (Itty bitty bugfix for when wikipedia pages are inconsistent (NEVAR!))
(Updated script to v1.02, please see Version 1.02 Updated)
Line 17: Line 17:


--[[User:Keller999|Keller999]] 13:15, 18 August 2011 (CEST)
--[[User:Keller999|Keller999]] 13:15, 18 August 2011 (CEST)
----
==== Version 1.02 Updated ====
I started out regenerating every page with the latest Wikipedia information, but quickly realized that would be really slow, and it didn't seem right to replace all the hard work that's been done on the game pages.  So if a page already has good information, I'm running it through the script without inputting Wikipedia information.  Fortunately, the script handles that just fine and still cleans up the article and adds new categories as expected.
I've updated the script twice, so it's now at v1.02.  Most of the changes are a result of bugs I've hit as I've been processing.  At some point, I may see if I can script this process JUST for article cleanup, not for new Wikipedia info.  I'd have to be extremely careful and go through a lot of testing, but this is definitely an automatable process.
Did several game page updates tonight, working through [[:Category:Pages using the outdated Testing template]].  Once I'm through that, I'll see how the other "Pages with..." categories are looking.  This one should take me at LEAST several nights, though. =P
--[[User:Keller999|Keller999]] 12:48, 20 August 2011 (CEST)




Line 32: Line 43:
* Better supports not providing information (for example, if you don't supply a Wikipedia entry, the description from the Dolphin page will be reused)
* Better supports not providing information (for example, if you don't supply a Wikipedia entry, the description from the Dolphin page will be reused)
* Added shortcut to just generate a generic template
* Added shortcut to just generate a generic template
'''''1.02'''''
* input and platforms are now preferred from the original article, if they exist
* added some more regex-magic to clean up formatting I found in Wikipedia articles
* whenever the script reads in Infobox params, it now picks them apart and recompiles them so that they always look the same.  About the only things NOT being recompiled now are the Problems and Video sections.
* instead of using the existing article's Infobox as-is, we now treat it like it came from Wikipedia so that it gets the same processing
* The "Automatic Categories" note now only shows up if automatic categories are, in fact, generated
* Virtual Console handling in place.  Unfortunately, the script strips out the platform being emulated -- need to work on this.
* Categories are now always capitalized
* Mention of any of our supported systems in description text is now automatically Wiki-fied
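
A minimal sketch of two of the 1.02 changes listed above: recompiling Infobox parameters so they always look the same, and building capitalized categories from a cleaned-up genre line. The helper names <code>normalize_param</code> and <code>genre_categories</code> are hypothetical and do not appear in the script; the regexes are simplified from the ones in the listing below, so treat this as an illustration rather than the script's actual code.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the script): rewrite "|name=value"
# so the spacing always comes out as "|name = value".
sub normalize_param {
    my ($line) = @_;
    $line =~ s/^\|\s*(\S+?)\s*=\s*(.*)$/|$1 = $2/;
    return $line;
}

# Hypothetical helper (not part of the script): turn the value of a
# genre parameter into a list of capitalized category links.
sub genre_categories {
    my ($value) = @_;
    $value =~ s/<.+?>|\(|\)/,/g;            # tags and parentheses become separators
    my @categories;
    for my $genre (split /,/, $value) {
        $genre =~ s/^\s+|\s+$//g;           # trim leading and trailing whitespace
        $genre =~ s/\s*\bgames?\b\s*//i;    # drop the word "game" or "games"
        next if $genre eq '';               # skip empty entries
        push @categories, '[[Category:' . ucfirst($genre) . ' games]]';
    }
    return @categories;
}

print normalize_param('|developer=Nintendo'), "\n";
print "$_\n" for genre_categories('action-adventure<br />platform game');
```

Skipping empty entries keeps stray separators (for example from a trailing <code>&lt;br /&gt;</code>) from producing an empty "[[Category: games]]" link.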


<pre>
<pre>
Line 42: Line 62:
$imageSize = 300; # Assuming Wii by default
$imageSize = 300; # Assuming Wii by default


my ($image, $savedSizeLine);
my ($image, $savedSizeLine, $savedInputLine, $savedPlatformsLine);


#######
#######
Line 50: Line 70:


print "*********************************\n";
print "*********************************\n";
print "* Dolphin Wiki Page Update v1.01 *\n";
print "* Dolphin Wiki Page Update v1.02 *\n";
print "*********************************\n\n";
print "*********************************\n\n";


Line 106: Line 126:
elsif ($newLine =~ /^\ *\={1,4}\ *Version Compatibility\ *\=*$/i) { $currentSection = "versionCompat"; }
elsif ($newLine =~ /^\ *\={1,4}\ *Version Compatibility\ *\=*$/i) { $currentSection = "versionCompat"; }
elsif ($newLine =~ /^\ *\={1,4}\ *Testing\ *\=*$/i) { $currentSection = "testing"; }
elsif ($newLine =~ /^\ *\={1,4}\ *Testing\ *\=*$/i) { $currentSection = "testing"; }
elsif ($newLine =~ /^\ *\={1,4}\ *.*Videos\ *\=*$/i) { $currentSection = "videos"; }
elsif ($newLine =~ /^\ *\={1,4}\ *Gameplay\ {1,}Videos\ *\=*$/i) { $currentSection = "videos"; }
elsif ($newLine =~ /^\ *\[\[Category\:.*\]\]$/i) { $currentSection = "categories"; }
elsif ($newLine =~ /^\ *\[\[Category\:.*\]\]$/i) { $currentSection = "categories"; }
elsif ($newLine =~ /^\ *\{\{Navigation\ .*\}\}$/i) { $currentSection = "categories"; }
elsif ($newLine =~ /^\ *\{\{Navigation\ .*\}\}$/i) { $currentSection = "categories"; }
Line 116: Line 136:
$currentSection = "description";
$currentSection = "description";
} else {  # We want to keep anything else in the infobox.  This is in case the user didn't give us a wikipedia article to generate a new one from
} else {  # We want to keep anything else in the infobox.  This is in case the user didn't give us a wikipedia article to generate a new one from
$newLine =~ s/^\|(\S*)\ *=\ *(.*)/\|$1 \= $2/i;
push (@infoboxSection, $newLine);
push (@infoboxSection, $newLine);


Line 122: Line 143:
$image = $newLine;
$image = $newLine;
# just keep the filename, and format  
# just keep the filename, and format  
$image =~ s/\|image\ *\=\ *\[\[(?:File|Image)\ *\:\ *(.*?)(?:\||\]).*/$1/i; #
$image =~ s/\|image\ *\=\ *\[\[(?:File|Image)\ *\:\ *(.*?)(?:\||\]){1,2}.*/$1/i;
} elsif ($newLine =~ /^\|\ *size\ *=\ *.+/gi) {
} elsif ($newLine =~ /^\|\ *size\ *=\ *.+/gi) {
$savedSizeLine = $newLine; # saved for later
$savedSizeLine = $newLine; # saved for later
$savedSizeLine =~ s/^\|size\ *=\ *(.*)/\|size \=\ $1/i;
} elsif ($newLine =~ /^\|\ *input\ *=\ *.+/gi) {
$savedInputLine = $newLine; # saved for later
$savedInputLine =~ s/^\|input\ *=\ *(.*)/\|input \=\ $1/i;
} elsif ($newLine =~ /^\|\ *platforms{0,1}\ *=\ *.+/gi) {
$savedPlatformsLine = $newLine; # saved for later
$savedPlatformsLine =~ s/^\|platforms{0,1}\ *=\ *(.*)/\|platforms \=\ $1/i;
}
}
}
}
Line 181: Line 209:


system('clear');
system('clear');
if (not(@wikipedia)) {  #if we didn't get anything from Wikipedia, use the existing info from the article
push (@wikipedia, '{{Infobox VG');
push (@wikipedia, @infoboxSection);
push (@wikipedia, '}}');
push (@wikipedia, @descriptionSection);
}


foreach (@wikipedia) {
foreach (@wikipedia) {
Line 200: Line 235:
#Platforms
#Platforms
if ($newLine =~ /\|\ *platforms/gi) {
if ($newLine =~ /\|\ *platforms/gi) {
# If the previous article had a platforms list, we trust it over the Wikipedia one
if ($savedPlatformsLine) { $newLine = $savedPlatformsLine; }
#Wii
#Wii
if ($newLine =~ /.*Wii.*/i) {
if ($newLine =~ /.*Wii.*/i) {
$platforms .= '[[Wii]] ';
$platforms .= '[[Wii]] ';
if (@autoCategory eq 0) { push (@autoCategory, '<!-- Categories below this line were automatically generated -->'); }
push (@autoCategory, "[[Category:Wii games]]");
push (@autoCategory, "[[Category:Wii games]]");
$platformAltered = 1;
$platformAltered = 1;
Line 211: Line 251:
                 if ($newLine =~ /GameCube/i) {
                 if ($newLine =~ /GameCube/i) {
$platforms .= '[[GameCube]] ';
$platforms .= '[[GameCube]] ';
if (@autoCategory eq 0) { push (@autoCategory, '<!-- Categories below this line were automatically generated -->'); }
push (@autoCategory, "[[Category:GameCube games]]");
push (@autoCategory, "[[Category:GameCube games]]");
$platformAltered = 1;
$platformAltered = 1;
Line 219: Line 260:
                 if ($newLine =~ /WiiWare/i) {
                 if ($newLine =~ /WiiWare/i) {
               $platforms .= '[[WiiWare]] ';
               $platforms .= '[[WiiWare]] ';
if (@autoCategory eq 0) { push (@autoCategory, '<!-- Categories below this line were automatically generated -->'); }
push (@autoCategory, "[[Category:WiiWare games]]");
push (@autoCategory, "[[Category:WiiWare games]]");
$platformAltered = 1;
$platformAltered = 1;
$imageSize = 175;
$imageSize = 175;
       }
       }
#Virtual Console
#TODO: If Virtual Console is found, we need to include the WHOLE list of platforms without filtering
if ($newLine =~ /.*Virtual\ *Console.*/i) {
$platforms .= '[[Virtual Console]] ';
if (@autoCategory eq 0) { push (@autoCategory, '<!-- Categories below this line were automatically generated -->'); }
push (@autoCategory, "[[Category:Virtual Console games]]");
$platformAltered = 1;
$imageSize = 300;
}


#TriForce
#TriForce
                 if ($newLine =~ /TriForce/i) {
                 if ($newLine =~ /TriForce/i) {
               $platforms .= '[[Triforce]] ';
               $platforms .= '[[Triforce]] ';
if (@autoCategory eq 0) { push (@autoCategory, '<!-- Categories below this line were automatically generated -->'); }
push (@autoCategory, "[[Category:Triforce games]]");
push (@autoCategory, "[[Category:Triforce games]]");
$platformAltered = 1;
$platformAltered = 1;
Line 244: Line 297:
}
}


if ($newLine =~ /^\|\ *(title|developer|publisher|distributor|director|producer|designer|programmer|artist|composer|license|series|engine|resolution|released|genre|mode|ratings|input|size|fps|dspcode|dtkadpcm|channeltype|mode|modes)\ *=\ *.+/gi) {
# If the parameter is not in our list, it's ignored
if ($newLine =~ /^\|\ *(title|developer|publisher|distributor|director|producer|designer|programmer|artist|composer|license|series|engine|resolution|released|genre|mode|ratings|size|fps|dspcode|dtkadpcm|channeltype|mode|modes)\ *=\ *.{2,}/gi) {
$skip = 0;
$skip = 0;
$newLine =~ s/^\|(\S*)\ *=\ *(.*)/\|$1 \= $2/i;
} else {
} else {
$skip = 1;
$skip = 1;
Line 264: Line 319:
$genreLine = $newLine;
$genreLine = $newLine;
$genreLine =~ s/^\|\ *genre\ *\=\ *//i;
$genreLine =~ s/^\|\ *genre\ *\=\ *//i;
$genreLine =~ s/(\<.+?\>)|\(|\)/\,/gi; # try to clean this line up a bit
@genres = split (/\,|\<br.*?\>/,$genreLine);
@genres = split (/\,|\<br.*?\>/,$genreLine);
foreach (@genres) {
foreach (@genres) {
if (@autoCategory eq 0) {
if (@autoCategory eq 0) { push (@autoCategory, '<!-- Categories below this line were automatically generated -->'); }
push (@autoCategory, '<!-- Categories below this line were automatically generated -->');
$line = $_;
 
if ($line ne "") {
$line =~ s/^\ *//;
$line =~ s/\ +$//;
$line =~ s/\ *(game|games)\ *//i;
$line = ucfirst $line;
push (@autoCategory, "\[\[Category:" . $line . " games\]\]");
}
}
$line = $_;
$line =~ s/^\ *//;
$line =~ s/\ +$//;
push (@autoCategory, "\[\[Category:" . $line . " games\]\]");
}
}
}
}
Line 280: Line 339:
$modeLine = $newLine;
$modeLine = $newLine;
$modeLine =~ s/^\|\ *modes{0,1}\ *\=\ *//i;
$modeLine =~ s/^\|\ *modes{0,1}\ *\=\ *//i;
$modeLine =~ s/(\<.+?\>)|\(|\)/\,/gi; # try to clean this line up a bit
@modes = split (/\,|\<br.*?\>/,$modeLine);
@modes = split (/\,|\<br.*?\>/,$modeLine);
foreach (@modes) {
foreach (@modes) {
if (@autoCategory eq 0) {
if (@autoCategory eq 0) { push (@autoCategory, '<!-- Categories below this line were automatically generated -->'); }
push (@autoCategory, '<!-- Categories below this line were automatically generated -->');
$line = $_;
 
 
if ($line ne "") {
$line =~ s/(\<.+?\>)|\(|\)/\,/gi;
$line =~ s/^\ *//;
$line =~ s/\ +$//;
$line = ucfirst $line;
push (@autoCategory, "\[\[Category:" . $line . " games\]\]");
}
}
$line = $_;
$line =~ s/^\ *//;
$line =~ s/\ +$//;
$line =~ s/\ *(game|games)\ *//i;
push (@autoCategory, "\[\[Category:" . $line . " games\]\]");
}
}
}
}
Line 296: Line 359:
if ($newLine =~ /^\}\}$/) {
if ($newLine =~ /^\}\}$/) {
if ($savedSizeLine ne "") { push (@finalResult, $savedSizeLine); }  # Saved size
if ($savedSizeLine ne "") { push (@finalResult, $savedSizeLine); }  # Saved size
if ($savedInputLine ne "") { push (@finalResult, $savedInputLine); }  # Saved input
if ($image ne "") { # Saved image line
if ($image ne "") { # Saved image line
push (@finalResult, '|image = [[File:' . $image . '|' . $imageSize . 'px]]');  
push (@finalResult, '|image = [[File:' . $image . '|' . $imageSize . 'px]]');  
Line 309: Line 373:


if (($foundWikipediaInfobox eq 1) and ($skip eq 0)) {
if (($foundWikipediaInfobox eq 1) and ($skip eq 0)) {
if ($insideInfobox eq 0) {
$newLine =~ s/(?:Nintendo\ )?GameCube/[[GameCube]]/i; # one pass, so "Nintendo GameCube" is replaced as a whole
$newLine =~ s/\ Wii\ / [[Wii]] /i; # keep the surrounding spaces
$newLine =~ s/WiiWare/[[WiiWare]]/i;
$newLine =~ s/Virtual Console/[[Virtual Console]]/i;
$newLine =~ s/Triforce/[[Triforce]]/i;
}
push (@finalResult,$newLine);
push (@finalResult,$newLine);
}
}
Line 352: Line 425:
# If a Wikipedia Infobox was received, it was already pushed into @finalResult.
# If a Wikipedia Infobox was received, it was already pushed into @finalResult.
# TODO: Wikipedia parsing should push into its own array, which is then added in this section
# TODO: Wikipedia parsing should push into its own array, which is then added in this section
} elsif (($foundWikipediaInfobox eq 0) and ($foundDolphinInfobox eq 1)) {  # We didn't get a Wikipedia Infobox, so use the one from Dolphin
push (@finalResult, '{{Infobox VG');
push (@finalResult, @infoboxSection);
push (@finalResult, '}}');
} elsif (($foundWikipediaInfobox eq 0) and ($foundDolphinInfobox eq 0)) { # We didn't get ANY infoboxes, create a generic one
} elsif (($foundWikipediaInfobox eq 0) and ($foundDolphinInfobox eq 0)) { # We didn't get ANY infoboxes, create a generic one
push (@finalResult, '{{Infobox VG');
push (@finalResult, '{{Infobox VG');
Line 412: Line 481:


print "\n\n";
print "\n\n";
</pre>
</pre>