User:Alexeina/Guides/Pywikibot

How to edit FFXI Wiki using the Pywikibot Framework



I'm writing this as I learn and find references. Very much a work in progress.
Pywikibot is a Python framework and collection of bots that automate work on MediaWiki sites. Originally designed for Wikipedia, it is now used throughout the Wikimedia Foundation’s projects and on many other MediaWiki wikis.

Essential References

Pywikibot

Pywikibot Homepage: https://www.mediawiki.org/wiki/Manual:Pywikibot
Developer Documentation: https://doc.wikimedia.org/pywikibot/stable/
Compatibility Matrix: https://www.mediawiki.org/wiki/Manual:Pywikibot/Compatibility
Official Installation Guide: https://www.mediawiki.org/wiki/Manual:Pywikibot/Installation
Alternate Installation Guide: https://support.wiki.gg/wiki/PyWikiBot

Wiki Parsers

MWParserFromHell (Powerful and Stable): https://mwparserfromhell.readthedocs.io/en/latest/
WikiTextParser (Easier to Use and Unstable): https://github.com/5j9/wikitextparser

Prerequisites

API access for FFXI Wiki is blocked by a firewall. You must give the admin your IP address so it can be whitelisted.
Always configure a wait of at least 20 seconds between edits to avoid degrading bgwiki server performance. See the put_throttle setting in user-config.py below.

Config Files

user-config.py

family = 'bgwiki'
mylang = 'en'
usernames['bgwiki']['en'] = 'Your_Username_HERE'  # Put your own username here
# With no password file defined, you must enter credentials at script startup.
password_file = None


# ############# SETTINGS TO AVOID SERVER OVERLOAD ##############

# Slow down the robot such that it never requests a second page within
# 'minthrottle' seconds. This can be lengthened if the server is slow,
# but never more than 'maxthrottle' seconds. However - if you are running
# more than one bot in parallel the times are lengthened.
# By default, the get_throttle is turned off, and 'maxlag' is used to
# control the rate of server access.  Set minthrottle to non-zero to use a
# throttle on read access.
minthrottle = 1  # Impacts Read access requests
maxthrottle = 60

# Slow down the robot such that it never makes a second page edit within
# 'put_throttle' seconds.
put_throttle = 20  # bgwiki admins require a minimum of 20 seconds between edits

# Sometimes you want to know when a delay is inserted. If a delay is larger
# than 'noisysleep' seconds, it is logged on the screen.
noisysleep = 3.0

# Defer bot edits during periods of database server lag.  For details, see
# https://www.mediawiki.org/wiki/Maxlag_parameter
# You can set this variable to a number of seconds, or to None (or 0) to
# disable this behavior. Higher values are more aggressive in seeking
# access to the wiki.
# Non-Wikimedia wikis may or may not support this feature; for families
# that do not use it, it is recommended to set minthrottle (above) to
# at least 1 second.
maxlag = 5

# Maximum number of times to retry an API request before quitting.
max_retries = 2
# Minimum time to wait before resubmitting a failed API request.
retry_wait = 5
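
The 20-second minimum is easy to lose when a config is copied or edited, so a quick check before running a bot may be worthwhile. The following is only a minimal sketch and not part of Pywikibot: `read_simple_settings` and `put_throttle_ok` are hypothetical helper names, and the code assumes the config uses plain `name = value` lines like the listing above. It reads the text with Python's standard `ast` module rather than executing it.

```python
import ast

# Minimum seconds between edits that this guide says bgwiki admins require.
MIN_PUT_THROTTLE = 20


def read_simple_settings(config_text):
    """Collect top-level `name = literal` assignments from config source text.

    Lines such as usernames['bgwiki']['en'] = '...' are skipped because
    their target is not a plain name.
    """
    settings = {}
    for node in ast.parse(config_text).body:
        if (isinstance(node, ast.Assign)
                and len(node.targets) == 1
                and isinstance(node.targets[0], ast.Name)):
            try:
                settings[node.targets[0].id] = ast.literal_eval(node.value)
            except (ValueError, TypeError):
                pass  # value is not a plain literal; ignore it
    return settings


def put_throttle_ok(config_text):
    """Return True when put_throttle is set and is at least the minimum."""
    value = read_simple_settings(config_text).get("put_throttle")
    return isinstance(value, (int, float)) and value >= MIN_PUT_THROTTLE
```

To use it, read your own user-config.py into a string (for example with `pathlib.Path("user-config.py").read_text()`) and pass it to `put_throttle_ok`. A missing `put_throttle` is treated as a failure on purpose, so the check does not depend on whatever default Pywikibot would apply.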

bgwiki_family.py

from pywikibot import family

# Family file required for accessing BG-wiki

class Family(family.Family):
    name = 'bgwiki'
    langs = {
        'en': None,
    }

    def hostname(self, code):
        return 'www.bg-wiki.com'  # bgwiki hostname

    def scriptpath(self, code):
        return ''  # The relative path of index.php, api.php

    def articlepath(self, code):
        return '/ffxi'  # bg-wiki uses a non-default article path

    def protocol(self, code):
        return 'https'  # lowercase scheme name; required for using HTTPS connections
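
To make the family file values more concrete, here is a rough sketch of the URLs they are meant to describe. It does not call Pywikibot. It assumes the API endpoint is formed from the protocol, hostname, and script path followed by `/api.php`, and that articles are served under the article path. The function names `api_url` and `article_url` are hypothetical and exist only for this illustration.

```python
from urllib.parse import quote

# Values taken from bgwiki_family.py (scheme written in lowercase here).
PROTOCOL = "https"
HOSTNAME = "www.bg-wiki.com"
SCRIPT_PATH = ""        # api.php and index.php sit at the site root
ARTICLE_PATH = "/ffxi"  # bg-wiki's non-default article path


def api_url():
    """Assumed API endpoint: protocol://hostname + scriptpath + /api.php."""
    return f"{PROTOCOL}://{HOSTNAME}{SCRIPT_PATH}/api.php"


def article_url(title):
    """Assumed article address: spaces become underscores, the rest is URL-quoted."""
    slug = quote(title.replace(" ", "_"))
    return f"{PROTOCOL}://{HOSTNAME}{ARTICLE_PATH}/{slug}"


print(api_url())
print(article_url("Main Page"))
```

If the printed API address does not respond from your machine, the likely causes are a wrong value in the family file or an IP address that has not been whitelisted yet.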