"""
    MoinMoin - SortBy Macro
    Copyright (c) 2001 by Fred Bremmer <Fred.Bremmer@ubc.ca>
    Modified (c) 2007 by Lane Rettig <Lane.Rettig@deshaw.com>

    Based on an earlier SortBy Macro
    Copyright (c) 2001 by Tim Bird <tbird@lineo.com>
    All rights reserved, see COPYING for details.
    
    Updated June 24, 2008 for compatibility with Moin 1.7.
    (code commented below).

DESCRIPTION: Sort a table by one or more columns and sort types/orders

SortBy(page,num_headers,column,sort_type[,othercolumn,sort_type]...)

Examples:

    [[SortBy(TablePage,1,3,alpha)]]

This would return the first table on TablePage sorted by column 3
(column 1 is the leftmost column).  The rows would be sorted using a
lexicographic (alpha) sort.  It would preserve 1 header row of the
table, without sorting it.

    [[SortBy(TablePage,0,2,reversenumber,1,alpha)]]

This would return the first table on TablePage sorted using a reverse
numeric sort on column 2, but rows with identical values in column 2
would be sorted in alpha order of their values in column 1.  It would
not preserve the first row of the table.  The first row of the table 
would be sorted also (zero header rows left unsorted).

Supported sort types are: alpha, number, nocase, nosort, reversealpha,
reversenumber, reversenocase, reversenosort

'nosort' can be used to make the SortBy macro behave like an Include
that only loads the first table from the page, not the whole page.
Thus you can have comments on the table page that are not shown in the
SortBy rendering on the including page.

See SortByTest below for a more comprehensive list of examples.

Utility functions:
    error()        - convert an error message to HTML
    read_table()   - read a file and return its first table's data

Functions to convert a string into an appropriate sortable value:
    strip_fmt()    - remove wiki formatting and commas
    to_number()    - convert to a number if possible
    to_nocase()    - convert everything to the same case
    to_nosort()    - return ascending numbers regardless of arg

Main loop functions:
    process_args() - parse args string and do some error checking
    sort_table()   - do the actual sorting of table rows
    format()       - generate the HTML output
    execute()      - the macro's main loop, called by MoinMoin

"""

# After installing this macro, you can test it by creating a wiki page
# named "SortByTest" which contains the following text:

SortByTest = """
[[TableOfContents]]

== Original Table ==
||'''First'''	||'''Second'''	||'''Third'''	||
||A		||2,000,000	||zebra		||
||C		||1		||123		||
||b		||2		||word		||
||d		||30		||{{{word}}}	||
||A		||2		||Wordy		||
||B		||4.2		||23		||
||A		||2		||'''word'''	||

== Column 1 No Sort ==
{{{[[SortBy(SortByTest, 1, 1, nosort)]]}}}

[[SortBy(SortByTest, 1, 1, nosort)]]

This simply includes the first table on the source page at this point in the
page containing the macro. Rows are in the same order as the original table.

== Column 1 Reverse No Sort ==
{{{[[SortBy(SortByTest, 1, 1, reversenosort)]]}}}

[[SortBy(SortByTest, 1, 1, reversenosort)]]

Similar to the previous example, but now all rows except the one header row
are in reverse order.

== Column 1 Alphabetical ==
{{{[[SortBy(SortByTest, 1, 1, alpha)]]}}}

[[SortBy(SortByTest, 1, 1, alpha)]]

In alphabetical order all uppercase letters are sorted before lowercase
letters.

== Column 1 Reverse Alphabetical ==
{{{[[SortBy(SortByTest, 1, 1, reversealpha)]]}}}

[[SortBy(SortByTest, 1, 1, reversealpha)]]

Prepending 'reverse' to the sort-type changes the order from ascending to
descending.

== Column 1 Case-Insensitive ==
{{{[[SortBy(SortByTest, 1, 1, nocase)]]}}}

[[SortBy(SortByTest, 1, 1, nocase)]]

Now 'B' and 'b' are considered equal, and are presented in the order they
appear in the original table.

== Column 2 Numerical ==
{{{[[SortBy(SortByTest, 1, 2, number)]]}}}

[[SortBy(SortByTest, 1, 2, number)]]

As you would expect, 30 comes before 2,000,000. Commas are ignored for
sorting. Integers and floating point can both be used, and they sort
correctly.

== Column 2 Alphabetical ==
{{{[[SortBy(SortByTest, 1, 2, alpha)]]}}}

[[SortBy(SortByTest, 1, 2, alpha)]]

In 'alphabetical' order, 2 comes before 3, so 2,000,000 sorts before 30.

== Column 2 Reverse Numerical ==
{{{[[SortBy(SortByTest, 1, 2, reversenumber)]]}}}

[[SortBy(SortByTest, 1, 2, reversenumber)]]

Again, prepending 'reverse' to the sort-type changes the order from ascending
to descending.

== Column 3 Alphabetical ==
{{{[[SortBy(SortByTest, 1, 3, alpha)]]}}}

[[SortBy(SortByTest, 1, 3, alpha)]]

Wiki formatting is ignored, and the sort is stable. The variations of 'word'
are simply listed in the same order as they appear in the original table.

== Column 3 Numerical ==
{{{[[SortBy(SortByTest, 1, 3, number)]]}}}

[[SortBy(SortByTest, 1, 3, number)]]

Now 23 comes before 123, and non-numeric values are sorted in alphabetical
order as above.

== Column 1 Alphabetical, Column 2 Numerical, Column 3 Alphabetical ==
{{{[[SortBy(SortByTest, 1, 1, alpha, 2, number, 3, alpha)]]}}}

[[SortBy(SortByTest, 1, 1, alpha, 2, number, 3, alpha)]]

The main sort is on column 1, but within the duplicate 'A's the 2's and 
2,000,000 are sorted, and within the 2's the values in column 3 are in 
case-sensitive alphabetical order.

== Column 1 Case-Insensitive, Column 2 Numerical, Column 3 Case-Insensitive ==
{{{[[SortBy(SortByTest, 1, 1, nocase, 2, number, 3, nocase)]]}}}

[[SortBy(SortByTest, 1, 1, nocase, 2, number, 3, nocase)]]

The main sort is still on column 1, but within the case-insensitive duplicate
'b's the values in column 2 are in numerical order. Also, within the duplicate
A-2 rows, column 3 is case-insensitive, so now Wordy follows word.

== Zero Header Rows, Case-Insensitive Sort on Column 1 ==
{{{[[SortBy(SortByTest, 0, 1, nocase)]]}}}

[[SortBy(SortByTest, 0, 1, nocase)]]

The first row is sorted along with all the other rows.

== Three Header Rows, Case-Insensitive Sort on Column 1 ==
{{{[[SortBy(SortByTest, 3, 1, nocase)]]}}}

[[SortBy(SortByTest, 3, 1, nocase)]]

The first three rows are not sorted along with all the other rows.

== Invalid Argument Examples ==

[[SortBy(NonExistentPage, 1, 1, nocase)]]

[[SortBy(SortByTest, 1, 1)]]

[[SortBy(SortByTest, x, 1, alpha)]]

[[SortBy(SortByTest, 1, x, alpha)]]

[[SortBy(SortByTest, -1, 1, alpha)]]

[[SortBy(SortByTest, 10, 1, alpha)]]

[[SortBy(SortByTest, 1, 0, alpha)]]

[[SortBy(SortByTest, 1, 10, alpha)]]

[[SortBy(SortByTest, 1, 1, x)]]

[[SortBy(SortByTest, 1, 1, alpha, 1)]]

"""

import sys, re, StringIO, cgi
from MoinMoin.Page import Page


class SortByError(Exception):
    """Raised anywhere in the macro, caught outside the main loop."""

    def __init__(self, msg=''):
        """msg -- a string to be displayed to the user"""
        self.msg = msg


def error(msg, args):
    """Return a message followed by usage information, all in HTML.

    msg  -- a string describing the error
    args -- the macro's original argument string

    """
    html = """
<tt class="wiki">[[SortBy(%s)]]</tt><br>
<strong class="error">SortBy: %s</strong><br>
<small>%s<br>
Valid sort types are: %s</small>"""
    msg = cgi.escape(msg)
    usage = cgi.escape("""Usage: SortBy(TablePage, <number of header rows>,
            <primary sort column number>, <sort type>
            [, <secondary sort column>, <sort type>] ... )""")
    sorts = ' '.join(valid_sorts)
    return html % (args, msg, usage, sorts)


def read_table(page_file):
    """Read page_file and convert its first table's lines into row data"""
    file = open(page_file)
    intable = 0
    table_rows = []
    for line in file.xreadlines():
        if len(line) < 2 or line[0:2] != '||':
            if intable: break # We're done
            else: continue    # Skip non-table lines until we hit the table
        else:
            intable = 1
            table_rows.append(line[:-1]) # strip the \n while we're at it
    return table_rows


def strip_fmt(arg):
    """Remove formatting characters and leading/trailing whitespace.

    Commas (such as in 1,000,000) are considered formatting chars."""
    for fmt_string in ["'''", "''", '{{{', '}}}', ',']:
        arg = arg.replace(fmt_string, '')
    return arg.strip()


def to_number(arg):
    """Convert arg to int or float if possible, else return a string."""
    arg = strip_fmt(arg)
    try: return int(arg)
    except ValueError:
        try: return float(arg)
        except ValueError: return arg


def to_nocase(arg):
    """Return arg in lowercase with no formatting characters."""
    return strip_fmt(arg).lower()


def to_nosort(arg, count=[0]):
    """Return a higher integer each time so rows don't move when sorted."""
    count[0] += 1   # count is a default arg, so references the same list and
    return count[0] # incr's the same [0] every time the function is called.


decorate_functions = {'alpha': strip_fmt,
                      'number': to_number,          
                      'nocase': to_nocase,
                      'nosort': to_nosort}
valid_sorts = decorate_functions.keys()
valid_sorts.sort()
valid_sorts.extend(['reverse'+sort_name for sort_name in valid_sorts])


def process_args(args, request):
    """Parse args string, return (sort_page, table, num_headers, sort_list)."""
    arglist = re.split(r'\s*,\s*', args.strip())
    if len(arglist) < 4:
        msg = """Not enough arguments (%s).
                 At least 4 are required.""" % len(arglist)
        raise SortByError, msg
    table_page, num_headers = arglist[0:2]
    try: num_headers = int(num_headers)
    except ValueError:
        msg = """Number of header rows (%s)
                 must be an integer""" % num_headers
        raise SortByError, msg
    if num_headers < 0:
        msg = """Number of header rows (%s)
                 must be a positive integer""" % num_headers
        raise SortByError, msg
    arglist[:2] = []  # Make arglist contain only column & sort_type pairs
    if len(arglist)%2 != 0:
        raise SortByError, 'Odd number of arguments (%s)' % (len(arglist)+2)
    sort_list = []
    while arglist:

        # Pop the sort_type and column from the end of the arglist.
        # They get stored in the sort_list in the opposite order
        # they were requested so the primary sort is done last.

        sort_type = arglist.pop().lower()
        sort_column = arglist.pop()
        try: sort_column = int(sort_column)
        except ValueError:
            msg = 'Column number (%s) must be an integer' % sort_column
            raise SortByError, msg
        if sort_column < 1:
            msg = """Column number (%s) must be 1 or higher.
                     1 = leftmost column""" % sort_column
            raise SortByError, msg
        sort_column -= 1  # Use zero-based column indexing internally
        if sort_type[:7] == 'reverse': # If the sort_type begins with 'reverse'
            reverse = 1                # set the reverse flag  
            sort_type = sort_type[7:]  # and strip 'reverse' from sort_type
        else: reverse = 0
        sort_list.append((sort_column, sort_type, reverse))  # Append a 3-tuple
    sort_page = Page(request, table_page)
    try: table_rows = read_table(sort_page._text_filename())
    except IOError:
        raise SortByError, 'Unable to open the table page "%s"' % table_page
    table = [row.split('||')[1:-1] for row in table_rows]
    if num_headers > len(table):
        msg = """Number of header rows (%s) is more than
                 the table length (%s)""" % (num_headers, len(table))
        raise SortByError, msg
    return (sort_page, table, num_headers, sort_list)


def sort_table(table, num_headers, sort_column, sort_type, reverse):
    """Sort of the table (in-place), preserving num_headers rows at the top.

    Arguments:
    table       -- a list of lists representing rows of column entries
    num_headers -- an integer number of rows to keep at the top of the table
    sort_column -- the column number (zero-based) containing the sort values
    sort_type   -- which kind of sort to perform on the values being sorted
    reverse     -- 0 or 1, meaning an ascending or descending (reverse) sort

    """
    header = table[:num_headers]
    table[:num_headers] = []

    if table and sort_column > min([len(row) for row in table]):
        msg = """Column number (%s) is higher than
                 the length of one or more rows"""  % (sort_column+1)
        raise SortByError, msg
    if sort_type not in valid_sorts:
        raise SortByError, 'Invalid sort type "%s"' % sort_type
    decorate = decorate_functions[sort_type]

    # Use the 'decorate, sort, undecorate' pattern with ascending or
    # descending indices to ensure that the sort is stable.

    decorations = [decorate(row[sort_column]) for row in table]
    if reverse: indices = xrange(len(table), 0, -1)
    else: indices = xrange(len(table))
    decorated = zip(decorations, indices, table)
    decorated.sort()
    table[:] = [row for (decoration, index, row) in decorated]

    if reverse: table.reverse()
    table[:0] = header
    return


def format(sort_page, macro, table):
    """Format the sorted table and return it as an HTML fragment.

    Arguments:
    sort_page -- a MoinMoin Page object representing the table page
    macro     -- the macro argument that was provided by MoinMoin
    table     -- a list of lists representing rows of column entries

    """
    ret = ''
    table_rows = ['||%s||' % '||'.join(row) for row in table]
    sort_page.set_raw_body('\n'.join(table_rows), 1)  # format the table

    # Here's some tricky stuff copied from Richard Jones' Include macro.
    stdout = sys.stdout
    sys.stdout = StringIO.StringIO()
    sort_page.send_page(content_only=1)  ## Removed macro.request argument for compatibility with Moin 1.7 (no longer needed).
    ret += sys.stdout.getvalue()
    sys.stdout = stdout

    # Output a helper link to get to the page with the table.
    name = sort_page.page_name
    ret += '<small>' + macro.formatter.url(1, name) + \
           macro.formatter.text('[goto %s]' % name) + macro.formatter.url(0) + \
           '</small>'
    return ret    


def execute(macro, args):
    """Parse args, sort_table() repeatedly, return formatted table as HTML.

    Sort the table once for each column & sort_type in the sort_list
    beginning with the last (and least significant) sort in args
    because they were 'popped' from the end of the args. Since
    sort_table() uses a stable sort algorithm, the ordering of the
    earlier sorts is preserved in the later sorts when there are
    duplicates of the sort key.

    """
    try:
        sort_page, table, num_headers, sort_list = process_args(args, macro.request)
        for sort_column, sort_type, reverse in sort_list:
            sort_table(table, num_headers, sort_column, sort_type, reverse)
        return format(sort_page, macro, table)
    except SortByError, e: return error(e.msg, args)
