User:Keller999/projects/gamepageupdate

From Dolphin Emulator Wiki
Revision as of 09:48, 20 August 2011 by Keller999 (talk | contribs) (Updated script to v1.02, please see Version 1.02 Updated)

I am working on doing several things for all existing game pages:

  • If it's not already in place, copy/paste a cleaned-up Infobox from Wikipedia into the game page
  • Update templates to use the most recent versions, rather than redirected versions
  • Ensure that all sections that need user updates have a template to copy-paste, or a link to further documentation
  • Update pages to conform to the standard game page -- Problems, Configuration, Version Compatibility, Testing, Gameplay Videos
  • Remove unused config variables if they are present in the page
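
As a rough illustration of the "conform to the standard game page" goal, here is a minimal sketch that reports which of the standard headings a page is missing. The helper name `missing_sections` and the sample page text are made up for this example and are not part of the actual script.

```perl
use strict;
use warnings;

# The five standard sections named in the list above.
my @standard_sections = (
    'Problems', 'Configuration', 'Version Compatibility',
    'Testing', 'Gameplay Videos',
);

# Hypothetical helper: return the standard headings that do not
# appear as a wiki heading (== Name ==) anywhere in the page text.
sub missing_sections {
    my ($wikitext) = @_;
    my @missing;
    for my $name (@standard_sections) {
        # \Q...\E quotes the name so it is matched literally
        push @missing, $name
            unless $wikitext =~ /^\s*=+\s*\Q$name\E\s*=+\s*$/mi;
    }
    return @missing;
}

# Made-up sample page with only two of the five sections.
my $page = "== Problems ==\nSome text\n== Testing ==\n";
print "Missing: $_\n" for missing_sections($page);
```

A check like this only looks at headings; it says nothing about whether the section contents follow the templates.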

Scripts

Hooray for a programming mini-project! I put together a script that takes the Infobox and summary from Wikipedia, then the existing Dolphin Wiki article, and creates a standards-compliant merge of the two. For an example of what this script does, please see Call of Duty: Black Ops/sandbox. Revision 23655 is before, Revision 23677 is after. I did NOT do any additional touchup between these revs, as I wanted to show exactly what the script was doing.

I have updated the page to match what I believe to be the current standard. I intend to start running this script for many pages starting tomorrow evening, so PLEASE review the newest revision of Call of Duty: Black Ops/sandbox to ensure that I'm not missing anything! I would hate to do a bunch of mass changes and then have to go repair them all.

Also, expect this script to have a bug or two until I can get it working and find out the issues with it.

Let me know your thoughts!

--Keller999 13:15, 18 August 2011 (CEST)


Version 1.02 Updated

I started out regenerating every page with the latest Wikipedia information, but quickly realized that this was not only going to be really slow, it also didn't feel right to replace all the hard work that's been done on the game pages. So if a page already has good information, I'm running it through the script without inputting Wikipedia information. Fortunately, the script handles that just fine and still cleans up the article and adds new categories as expected.

I updated the script twice, to v1.02 now. Most of the changes are a result of bugs I've hit as I've been processing. At some point, I may see if I can script this process JUST for article cleanup, not for new Wikipedia info. Would have to be extremely careful and go through a lot of testing, but this is definitely an automate-able process.

Did several game page updates tonight, and I'm still working through them. Once I'm through that, I'll see how the other "Pages with..." categories are looking. This one should take me at LEAST several nights, though. =P

--Keller999 12:48, 20 August 2011 (CEST)


dolphinpageupdate.pl

Runs great on my Linux box, uses no special modules. I know for a fact that system('clear') does NOT work in Windows, but you could probably replace it with system('cls') and get the same effect.
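
One possible way to handle the clear/cls difference without editing the script per machine is to pick the command from Perl's built-in `$^O` variable. This is only a sketch: `clear_command` and `clear_screen` are names invented here, and I have not tested it on Windows.

```perl
use strict;
use warnings;

# Pick the screen-clearing command for a given OS name.
# $^O is Perl's built-in OS identifier; native Windows builds report 'MSWin32'.
sub clear_command {
    my ($os) = @_;
    return $os eq 'MSWin32' ? 'cls' : 'clear';
}

# Hypothetical helper: clear the terminal on whichever platform we are on.
sub clear_screen {
    system(clear_command($^O));
}

# Only clear when output is actually going to a terminal.
clear_screen() if -t STDOUT;
```

Each `system('clear')` call in the script could then become `clear_screen()`.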

Version History

1.0

  • Initial release

1.01

  • Image is now always parsed, and will set the size based on platforms detected. Defaults to Wii (300px)
  • If no Wikipedia Infobox is supplied, re-use the one from the Dolphin wiki page
  • If no Infobox is supplied at all, generate a generic one
  • Better supports not providing information (for example, if you don't supply a Wikipedia entry, the description from the Dolphin page will be reused)
  • Added shortcut to just generate a generic template
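
To make the image-size rule from 1.01 concrete, here is a small sketch of the idea. The sizes (300px for everything except WiiWare at 175px) come from the script itself; the table layout and the `image_size_for` name are just for this example. As in the script, every platform check runs, so a later match overrides an earlier one.

```perl
use strict;
use warnings;

# Platform patterns and sizes, in the order the script checks them.
my @platform_sizes = (
    [ qr/Wii/i,               300 ],
    [ qr/GameCube/i,          300 ],
    [ qr/WiiWare/i,           175 ],
    [ qr/Virtual\s*Console/i, 300 ],
    [ qr/Triforce/i,          300 ],
);

# Hypothetical helper: return the image width for a platforms line.
sub image_size_for {
    my ($platforms_line) = @_;
    my $size = 300;    # default: assume Wii
    for my $entry (@platform_sizes) {
        my ($pattern, $px) = @$entry;
        # later matches override earlier ones
        $size = $px if $platforms_line =~ $pattern;
    }
    return $size;
}

print image_size_for('|platforms = WiiWare'), "\n";
print image_size_for('|platforms = Nintendo GameCube'), "\n";
```

Note that "WiiWare" also matches the Wii pattern, which is why the order of the checks matters.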

1.02

  • input and platforms are now preferred from the original article, if they exist
  • added some more regex-magic to clean up formatting I found in Wikipedia articles
  • whenever the script reads in Infobox params, it now picks them apart and recompiles them so that they always look the same. About the only things NOT being recompiled now are the Problems and Video sections.
  • instead of using the existing article's Infobox as-is, we now treat it like it came from Wikipedia so that it gets the same processing
  • The "Automatic Categories" note now only shows up if automatic categories are, in fact, generated
  • Virtual Console handling in place. Unfortunately, the script strips out the platform being emulated -- need to work on this.
  • Categories are now always capitalized
  • Mention of any of our supported systems in description text is now automatically Wiki-fied
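
The "pick apart and recompile" step from 1.02 boils down to something like the sketch below. `normalize_param` is a name invented for this example, and the pattern is slightly stricter than the one in the script (it also trims trailing whitespace), so treat it as an approximation rather than the script's exact behavior.

```perl
use strict;
use warnings;

# Hypothetical helper: rewrite "|key=value" lines as "|key = value"
# so every Infobox parameter ends up formatted the same way.
sub normalize_param {
    my ($line) = @_;
    if ($line =~ /^\|\s*([^=\s]+)\s*=\s*(.*?)\s*$/) {
        return "|$1 = $2";
    }
    return $line;    # not a parameter line, leave it alone
}

print normalize_param('|developer=Treyarch'), "\n";
print normalize_param('|  genre   =  First-person shooter  '), "\n";
```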
#!/usr/bin/perl

my (@wikipedia, @originalArticle);
my (@infoboxSection, @descriptionSection, @problemSection, @configurationSection, @versionCompatSection, @testingSection, @videoSection, @categorySection);
my @finalResult;

$imageSize = 300; # Assuming Wii by default

my ($image, $savedSizeLine, $savedInputLine, $savedPlatformsLine);

#######
#INPUT#
#######
system('clear');

print "**********************************\n";
print "* Dolphin Wiki Page Update v1.02 *\n";
print "**********************************\n\n";

print "First, copy and paste the game's Wikipedia article from the top to wherever you'd like to end the new description.  Zero input is fine.  Enter \'-1\' to indicate the end, or -2 to just get a blank template.\n\n";

# Wikipedia input
while (($line ne "-1") and ($line ne "-2")) {
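        $line = <STDIN>;
        if (not defined $line) { $line = "-1"; last; }   # end of input (e.g. Ctrl-D): treat it like -1 instead of looping forever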
        chomp($line);

        if (($line ne "-1") and ($line ne "") and ($line ne "-2")) {
                push (@wikipedia,$line);
        }
}

if ($line eq "-2") {
	$line = "-1";
} else {
	$line = "";
}

system('clear');

print "**********************************\n";
print "* Dolphin Wiki Page Update v1.02 *\n";
print "**********************************\n\n";

print "Now, copy and paste the existing Dolphin wiki article to import existing information.  Zero input is fine.  Enter \'-1\' to indicate the end.\n\n";

# Dolphin page input
while ($line ne "-1") {
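        $line = <STDIN>;
        if (not defined $line) { $line = "-1"; last; }   # end of input (e.g. Ctrl-D): treat it like -1 instead of looping forever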
        chomp($line);

        if (($line ne "-1") and ($line ne "")) {
                push (@originalArticle,$line);
        }
}

#############################
#EXISTING ARTICLE PROCESSING#
#############################

# (@infoboxSection, @descriptionSection, @problemSection, @configurationSection, @versionCompatSection, @testingSection, @videoSection, @categorySection);
$currentSection = "none";
$foundDolphinInfobox = 0;

foreach (@originalArticle) {
	$newLine = $_;
	$checkLine = $_;

	if    ($newLine =~ /^\{\{\ *Infobox.*/i) { $currentSection = "infobox"; $foundDolphinInfobox = 1; }	
	elsif ($newLine =~ /^\ *\={1,4}\ *Problems\ *\=*$/i) { $currentSection = "problems"; }
	elsif ($newLine =~ /^\ *\={1,4}\ *Configuration\ *\=*$/i) { $currentSection = "configuration"; }
	elsif ($newLine =~ /^\ *\={1,4}\ *Version Compatibility\ *\=*$/i) { $currentSection = "versionCompat"; }
	elsif ($newLine =~ /^\ *\={1,4}\ *Testing\ *\=*$/i) { $currentSection = "testing"; }
	elsif ($newLine =~ /^\ *\={1,4}\ *Gameplay\ {1,}Videos\ *\=*$/i) { $currentSection = "videos"; }
	elsif ($newLine =~ /^\ *\[\[Category\:.*\]\]$/i) { $currentSection = "categories"; }
	elsif ($newLine =~ /^\ *\{\{Navigation\ .*\}\}$/i) { $currentSection = "categories"; }

	if ($currentSection eq "infobox") {
		if ($newLine =~ /^\{\{\ *Infobox.*/i) {   # This is the start of the Infobox, ignore
			# ignore
		} elsif ($newLine =~ /^\}\}$/) {   # This is the end of the Infobox, change section and ignore
			$currentSection = "description";
		} else {   # We want to keep anything else in the infobox.  This is in case the user didn't give us a wikipedia article to generate a new one from
			$newLine =~ s/^\|(\S*)\ *=\ *(.*)/\|$1 \= $2/i;
			push (@infoboxSection, $newLine);

			# Now we save the image filename and size to be added onto Wikipedia's infobox, if it's provided
			if ($newLine =~ /^\|\ *image\ *=\ *.+/gi) {
				$image = $newLine;
				# just keep the filename, and format 
				$image =~ s/\|image\ *\=\ *\[\[(?:File|Image)\ *\:\ *(.*?)(?:\||\]){1,2}.*/$1/i;
			} elsif ($newLine =~ /^\|\ *size\ *=\ *.+/gi) {
				$savedSizeLine = $newLine; # saved for later
				$savedSizeLine =~ s/^\|size\ *=\ *(.*)/\|size \=\ $1/i;
			} elsif ($newLine =~ /^\|\ *input\ *=\ *.+/gi) {
				$savedInputLine = $newLine; # saved for later
				$savedInputLine =~ s/^\|input\ *=\ *(.*)/\|input \=\ $1/i;
			} elsif ($newLine =~ /^\|\ *platforms{0,1}\ *=\ *.+/gi) {
				$savedPlatformsLine = $newLine; # saved for later
				$savedPlatformsLine =~ s/^\|platforms{0,1}\ *=\ *(.*)/\|platforms \=\ $1/i;
			}
		}
	} elsif ($currentSection eq "description") {
		# We're pretty much going to assume that if it's in this section, we want to keep it all as-is
		push (@descriptionSection, $newLine);
	} elsif ($currentSection eq "problems") {
		if ($newLine =~ /^\ *\={1,4}\ *Problems\ *\=*$/i) {   # Matches the section heading, ignore
			# ignore
		} elsif ($newLine =~ /^\ *\={1,4}\ *(.*?)\ *\=*\ *$/gm) {   # This is a sub-heading.  Reformat it
			push (@problemSection, "\=\=\= $1 \=\=\=");
		} else {   # This is something else in the problem section, like user input.  Keep it
			push (@problemSection, $newLine);
		}
	} elsif ($currentSection eq "configuration") {
		if ($newLine =~ /\|\ *.*\=\ *\S+.*$/gi) {   # This is a config entry that has been filled out
			push (@configurationSection, $newLine);
		} # Anything besides filled-out config params are not needed, the rest will be regenerated
	} elsif ($currentSection eq "versionCompat") {
		if ($newLine =~ /^\{\{VersionCompatibilityVersion\|\s*([^|]+?)\s*\|\s*([^|]+?)\s*(?:\|\s*(.+?)\s*)?\}\}$/i) {   # Version compat report that's been filled out
			$versionCompatEntry = "\{\{VersionCompatibilityVersion\|$1\|$2".((defined $3)?"\|$3":"")."\}\}";   # $3 is the optional third field, without its leading pipe
			push (@versionCompatSection, $versionCompatEntry);
		} # Anything besides filled-out compat reports are not needed, the rest will be regenerated
	} elsif ($currentSection eq "testing") {
		if ($newLine =~ /^\{\{.+?\|revision\=\s*(.*?)\s*\|os\=\s*(.*?)\s*\|cpu\=\s*(.*?)\s*\|gpu\=\s*(.*?)\s*\|result\=\s*(.*?)\s*(?:\|tester\=\s*(.*?)\s*)?\}\}/i) {
			# Matches test reports with all variables set (tester is optional) and dissects for reassembly (muahahaha!)
			# The optional tester group is non-capturing, so $6 is just the tester name (no "|tester=" prefix)
			$testResult = "\{\{testing\/entry\|revision\=$1\|OS\=$2\|CPU\=$3\|GPU\=$4\|result\=$5\|tester\=".((defined $6)?"$6":"")."\}\}";
			push (@testingSection, $testResult);
		}
	} elsif ($currentSection eq "videos") {
		if ($newLine =~ /^\ *\={1,4}\ *.*Videos\ *\=*$/i) {   # Matches the section heading, ignore
			# ignore
		} else {   # Keep everything else
			push (@videoSection, $newLine);
		}
	} elsif ($currentSection eq "categories") {   # We keep all existing categories, and add some new auto-generated ones!
		if ($newLine =~ /^\[\[Category\:\ *(.*)\ *\]\]/i) {   # This is a category entry
			push (@categorySection, "\[\[Category:" . $1 . "\]\]");
		} elsif ($newLine =~ /^\{\{Navigation\ *(.*)\ *\}\}/i) {   # This is a navigation entry
			push (@categorySection, "\{\{Navigation " . $1 . "\}\}");
		}
	}	
}




######################
#WIKIPEDIA PROCESSING#
######################

$foundWikipediaInfobox = 0;
$insideInfobox = 0;

my @autoCategory;

system('clear');

if (not(@wikipedia)) {  #if we didn't get anything from Wikipedia, use the existing info from the article
	push (@wikipedia, '{{Infobox VG');
	push (@wikipedia, @infoboxSection);
	push (@wikipedia, '}}');
	push (@wikipedia, @descriptionSection);
}

foreach (@wikipedia) {
	$newLine = $_;

	$platforms  = "\|platforms \= ";
	$platformAltered = 0;
	$skip = 0;
	$skipUnWiki = 0;

	if ($newLine =~ /^\{\{\ *Infobox.*/i) {
		push (@finalResult, '{{Infobox VG');
		$foundWikipediaInfobox = 1;
		$insideInfobox = 1;
		$skip = 1;
		
	}

	#Platforms
	if ($newLine =~ /\|\ *platforms/gi) {

		# If the previous article had a platforms list, we trust it over the Wikipedia one
		if ($savedPlatformsLine) { $newLine = $savedPlatformsLine; }

		#Wii
		if ($newLine =~ /.*Wii.*/i) {
			$platforms .= '[[Wii]] ';
			if (@autoCategory eq 0) { push (@autoCategory, '<!-- Categories below this line were automatically generated -->'); }			
			push (@autoCategory, "[[Category:Wii games]]");
			$platformAltered = 1;
			$imageSize = 300;
		}			

		#GameCube
		if ($newLine =~ /GameCube/i) {
			$platforms .= '[[GameCube]] ';
			if (@autoCategory eq 0) { push (@autoCategory, '<!-- Categories below this line were automatically generated -->'); }
			push (@autoCategory, "[[Category:GameCube games]]");
			$platformAltered = 1;
			$imageSize = 300;
		}

		#WiiWare
		if ($newLine =~ /WiiWare/i) {
			$platforms .= '[[WiiWare]] ';
			if (@autoCategory eq 0) { push (@autoCategory, '<!-- Categories below this line were automatically generated -->'); }
			push (@autoCategory, "[[Category:WiiWare games]]");
			$platformAltered = 1;
			$imageSize = 175;
		}

		#Virtual Console
		#TODO: If Virtual Console is found, we need to include the WHOLE list of platforms without filtering
		if ($newLine =~ /.*Virtual\ *Console.*/i) {
			$platforms .= '[[Virtual Console]] ';
			if (@autoCategory eq 0) { push (@autoCategory, '<!-- Categories below this line were automatically generated -->'); }
			push (@autoCategory, "[[Category:Virtual Console games]]");
			$platformAltered = 1;
			$imageSize = 300;
		}		

		#TriForce
		if ($newLine =~ /TriForce/i) {
			$platforms .= '[[Triforce]] ';
			if (@autoCategory eq 0) { push (@autoCategory, '<!-- Categories below this line were automatically generated -->'); }
			push (@autoCategory, "[[Category:Triforce games]]");
			$platformAltered = 1;
			$imageSize = 300;
		}

		push (@finalResult,$platforms);
		$skip = 1;
		$skipUnWiki = 1;
	}

	#Purge un-used parameters
	if ($insideInfobox) {
		#Replace whatever title Wikipedia is using with our own
		if ($newLine =~ /^\|\ *title\ *=\ *.+/gi) {
			$newLine = '|title = {{SUBPAGENAME}}';
		}

		# If the parameter is not in our list, it's ignored
		if ($newLine =~ /^\|\ *(title|developer|publisher|distributor|director|producer|designer|programmer|artist|composer|license|series|engine|resolution|released|genre|mode|ratings|size|fps|dspcode|dtkadpcm|channeltype|mode|modes)\ *=\ *.{2,}/gi) {
			$skip = 0;
			$newLine =~ s/^\|(\S*)\ *=\ *(.*)/\|$1 \= $2/i;
		} else {
			$skip = 1;
		}
	}

	#Un-wiki-fy everything
        if ($skipUnWiki eq 0) {
		$newLine =~ s/\<ref.+?\/.{0,3}\>//gi; # remove citations references
		$newLine =~ s/\{\{vgy\|([0-9]{4})\}\}/$1/gi; # we don't do Template:vgy, removing
		$newLine =~ s/\{\{cite.*?\}\}//gi; # remove references
		$newLine =~ s/\[\[(([.]|[^\|])+?)\]\]/$1/g; # un-wiki-fy wiki links in the format [[link]]
                $newLine =~ s/\[\[.+?\|(.+?)\]\]/$1/g; # un-wiki-fy wiki links in the format [[link|name]]
        }

	#Set genre categories
	if ($newLine =~ /^\|\ *genre/ ne "") {
		$genreLine = $newLine;
		$genreLine =~ s/^\|\ *genre\ *\=\ *//i;
		$genreLine =~ s/(\<.+?\>)|\(|\)/\,/gi; # try to clean this line up a bit
		@genres = split (/\,|\<br.*?\>/,$genreLine);
		foreach (@genres) {
			if (@autoCategory eq 0) { push (@autoCategory, '<!-- Categories below this line were automatically generated -->'); }
			$line = $_;

			if ($line ne "") {
				$line =~ s/^\ *//;
				$line =~ s/\ +$//;
				$line =~ s/\ *games?\ *//i;   # "games?" removes "games" whole; (game|games) matched "game" first and left a stray "s"
				$line = ucfirst $line;
				push (@autoCategory, "\[\[Category:" . $line . " games\]\]");
			}
		}
	}
	
	#Set mode categories
	if ($newLine =~ /^\|\ *mode/ ne "") {
		$modeLine = $newLine;
		$modeLine =~ s/^\|\ *modes{0,1}\ *\=\ *//i;
		$modeLine =~ s/(\<.+?\>)|\(|\)/\,/gi; # try to clean this line up a bit
		@modes = split (/\,|\<br.*?\>/,$modeLine);
		foreach (@modes) {
			if (@autoCategory eq 0) { push (@autoCategory, '<!-- Categories below this line were automatically generated -->'); }
			$line = $_;


			if ($line ne "") {
				$line =~ s/(\<.+?\>)|\(|\)/\,/gi;
				$line =~ s/^\ *//;
				$line =~ s/\ +$//;
				$line = ucfirst $line;
				push (@autoCategory, "\[\[Category:" . $line . " games\]\]");
			}
		}
	}		

	#New-line if this is the end of the Infobox
	if ($newLine =~ /^\}\}$/) {
		if ($savedSizeLine ne "") { push (@finalResult, $savedSizeLine); }  # Saved size
		if ($savedInputLine ne "") { push (@finalResult, $savedInputLine); }  # Saved input
		if ($image ne "") { # Saved image line
			push (@finalResult, '|image = [[File:' . $image . '|' . $imageSize . 'px]]'); 
		} elsif ($image eq "") {
			push (@finalResult, '|image = [[File:{{SUBPAGENAME}}.png|' . $imageSize . 'px]]');
		}

		push (@finalResult, $newLine);
		push (@finalResult, " ");
		$skip = 1;
		$insideInfobox = 0;
	}

	if (($foundWikipediaInfobox eq 1) and ($skip eq 0)) {
		if ($insideInfobox eq 0) {
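			$newLine =~ s/(?:Nintendo\ )?GameCube/[[GameCube]]/i;   # both names in one pass, so "Nintendo GameCube" is fully replaced and never wrapped twice
			$newLine =~ s/\bWii\b/[[Wii]]/i;   # word boundaries keep the surrounding spaces and do not touch "WiiWare"
			$newLine =~ s/WiiWare/[[WiiWare]]/i;
			$newLine =~ s/Virtual Console/[[Virtual Console]]/i;
			$newLine =~ s/Triforce/[[Triforce]]/i;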
		}

		push (@finalResult,$newLine);
	}
}

######################
#COMPILE FINAL RESULT#
######################

# At this point, the Infobox and the summary are in place.  Now, time to add our checked and re-formatted content sections.

# CATEGORY PROCESSING
# Need to combine our categories and make sure there are no duplicates
foreach (@autoCategory) {
	$fullCat = $_;
	my $shortCat;
	$dupe = 0;

	if (($fullCat =~ /^\[\[Category\:\ *(.*)\ *\]\]/i) or ($fullCat =~ /^\{\{Navigation\ *(.*)\ *\}\}/i)) {
		$shortCat = $1;
	}

	foreach (@categorySection) {
		$compareCatLong = $_;
		my $compareCatShort;

		if (($compareCatLong =~ /^\[\[Category\:\ *(.*)\ *\]\]/i) or ($compareCatLong =~ /^\{\{Navigation\ *(.*)\ *\}\}/i)) {
			$compareCatShort = $1;
		}

		if ($compareCatShort eq $shortCat) {
			$dupe = 1;
		}		
	}

	if ($dupe eq 0) {
		push (@categorySection, $_);
	}	
}

#INFOBOX PROCESSING
if (($foundWikipediaInfobox eq 1) and ($foundDolphinInfobox eq 1)) {
	# If a Wikipedia Infobox was received, it was already pushed into @finalResult.
	# TODO: Wikipedia parsing should push into its own array, which is then added in this section
} elsif (($foundWikipediaInfobox eq 0) and ($foundDolphinInfobox eq 0)) { # We didn't get ANY infoboxes, create a generic one
	push (@finalResult, '{{Infobox VG');
	push (@finalResult, '|title = {{SUBPAGENAME}}');
	push (@finalResult, '|image = [[File:{{SUBPAGENAME}}.png|' . $imageSize . 'px]]');
	push (@finalResult, '}}');	
}

#DESCRIPTION PROCESSING
if ((@descriptionSection) and ($foundWikipediaInfobox eq 0)) {   # Only re-use the original description if there was one and we didn't get Wikipedia data
	push (@finalResult, "\n");
	push (@finalResult, @descriptionSection);
}

push (@finalResult, "\n\=\= Problems \=\=");
push (@finalResult, @problemSection);

push (@finalResult, "\n\=\= Configuration \=\=");
push (@finalResult, '<!--A full list of options is available at Template:Config/doc-->');
push (@finalResult, "\{\{Config");
push (@finalResult, @configurationSection);
push (@finalResult, "\}\}");

push (@finalResult, "\n\=\= Version Compatibility \=\=");
push (@finalResult, "\{\{VersionCompatibility\}\}");
push (@finalResult, '<!--Use this template for compatibility entries: {{VersionCompatibilityVersion|7617|****}}-->');
push (@finalResult, @versionCompatSection);
push (@finalResult, "\{\{VersionCompatibilityClose\}\}");

push (@finalResult, "\n\=\= Testing \=\=");
push (@finalResult, "\{\{testing\/start\}\}");
push (@finalResult, '<!--Use this template for test entries: {{Test Entry|revision=|OS=|CPU=|GPU=|result=|tester=}}-->');
push (@finalResult, @testingSection);
push (@finalResult, "\{\{testing\/end\}\}");

push (@finalResult, "\n\=\= Gameplay Videos \=\=");
push (@finalResult, @videoSection);

push (@finalResult, "\n");
push (@finalResult, @categorySection);


##############
#FINAL OUTPUT#
##############

system ('clear');

print "****************\n";
print "* FINAL OUTPUT *\n";
print "****************\n";

foreach (@finalResult) {
	print $_ . "\n";
}

print "\n\n";