Thursday, September 5, 2013

Siamese Dream

A magic square of odd order can be efficiently constructed with a simple method brought from the travels of the French mathematician Simon de la Loubère to what was then Thailand (Siam), in the 17th century. I like to think of a time when you brought things like that from overseas trips, and not only a gazillion photos and some cheap souvenirs.

The method is easier to undertand with a visual example:

A Python implementation is also very easy, especially when using Numpy, with which the pos pointer variable, being a numpy.array, can be both updated with one-liner arithmetic operators, and used as a two-dimensional index into the matrix (when cast as a tuple):


import numpy as np

def de_la_loubere(n):
    m = np.zeros((n, n), dtype=int)
    pos = np.asarray([0, n/2]) # initial
    m[tuple(pos)] = 1
    i = 2
    while i <= n ** 2:
        if m[tuple((pos + [-1, 1]) % n)] != 0:
            pos += [1, 0]  # blocked: go down
        else:
            pos += [-1, 1] # go up/right
        pos %= n # wrap
        m[tuple(pos)] = i
        i += 1
    return m

n = 15

m = de_la_loubere(n)

# verify that it's really magic
magic = n * (n ** 2 + 1) / 2 # 65
assert (np.sum(m, axis=0) == np.repeat(magic, n)).all()
assert (np.sum(m, axis=1) == np.repeat(magic, n)).all()
assert m.trace() == magic
assert np.fliplr(m).trace() == magic

print m

# [[122 139 156 173 190 207 224   1  18  35  52  69  86 103 120]
#  [138 155 172 189 206 223  15  17  34  51  68  85 102 119 121]
#  [154 171 188 205 222  14  16  33  50  67  84 101 118 135 137]
#  [170 187 204 221  13  30  32  49  66  83 100 117 134 136 153]
#  [186 203 220  12  29  31  48  65  82  99 116 133 150 152 169]
#  [202 219  11  28  45  47  64  81  98 115 132 149 151 168 185]
#  [218  10  27  44  46  63  80  97 114 131 148 165 167 184 201]
#  [  9  26  43  60  62  79  96 113 130 147 164 166 183 200 217]
#  [ 25  42  59  61  78  95 112 129 146 163 180 182 199 216   8]
#  [ 41  58  75  77  94 111 128 145 162 179 181 198 215   7  24]
#  [ 57  74  76  93 110 127 144 161 178 195 197 214   6  23  40]
#  [ 73  90  92 109 126 143 160 177 194 196 213   5  22  39  56]
#  [ 89  91 108 125 142 159 176 193 210 212   4  21  38  55  72]
#  [105 107 124 141 158 175 192 209 211   3  20  37  54  71  88]
#  [106 123 140 157 174 191 208 225   2  19  36  53  70  87 104]]

Wednesday, May 15, 2013

Impossibly Lean Audit System for Postgres with hstore

I really like Postgres! I recently discovered a new trick that plays especially well with the access control pattern I recently described: a simple (but rather powerful) audit/logging pattern to track the changes made to a database (i.e. inserts, updates and deletes), and most importantly, allow to query them in a flexible way.

It's easy to set up a trigger to record changes. However, if you follow this basic pattern, you end up with your update data stored as unstructured text, which is not the ideal format to query. You could modify that scheme to use as many audit tables as you want to log (thus each reproducing the structure of its target table), but there's a much nicer solution, using PG's hstore data type.

hstore is PG's answer to NoSQL: it's a string-based key-value data type, which you can embed inside your relational schema, within which it interoperates with other data types and SQL constructs seamlessly. There's also the newer JSON type, which plays a similar role, but since its API is not yet as mature as hstore's, we're not going to use it here.

So here's the audit table:


create extension hstore;

create table audit (
    audit_id serial primary key,
    table_name text not null,
    user_name text not null,
    action_timestamp timestamp not null default current_timestamp,
    action text not null check (action in ('i','d','u')),
    old_values hstore,
    new_values hstore,
    updated_cols text[],
    query text
);

And here's the audit trigger that goes with it:


create or replace function if_modified_func() returns trigger as $body$
begin
    if tg_op = 'UPDATE' then
        insert into audit (table_name, user_name, action, old_values, new_values, updated_cols, query)
        values (tg_table_name::text, current_user::text, 'u', hstore(old.*), hstore(new.*),
               akeys(hstore(new.*) - hstore(old.*)), current_query());
        return new;
    elsif tg_op = 'DELETE' then
        insert into audit (table_name, user_name, action, old_values, query)
        values (tg_table_name::text, current_user::text, 'd', hstore(old.*), current_query());
        return old;
    elsif tg_op = 'INSERT' then
        insert into audit (table_name, user_name, action, new_values, query)
        values (tg_table_name::text, current_user::text, 'i', hstore(new.*), current_query());
        return new;
    end if;
end;
$body$
language plpgsql;

Notice how the rows get converted (hstore(old.*) and hstore(new.*)). Now suppose that we have a book table that we'd like to audit:


create table book (
    book_id serial primary key,
    title text,
    author text,
    n_pages int
);

create trigger book_audit after insert or update or delete on book for each row execute procedure if_modified_func();

If we insert a book in it, we can see the audit mechanism in action:


insert into book (title, n_pages) values ('PG is Great', 250);

select action, table_name, new_values -> 'title' as title from audit;

 action | table_name |    title    
--------+------------+-------------
 i      | book       | PG is Great

The third retrieved value (title) demontrates hstore's syntax for value retrieval from a key (using the -> operator), which is just an example among the bunch of operators and functions offered.

If we perform an update on the book:


update book set author = 'Christian Jauvin', n_pages = 300 where book_id = 1;

select action, table_name, updated_cols from audit;

 action | table_name |   updated_cols   
--------+------------+------------------
 i      | book       | 
 u      | book       | {author,n_pages}

The third column now sports the columns that have been updated, using the updated_cols mechanism, implemented as a PG array (another very nice data structure) along with the akeys function, with which we collect the keys of the hstore structure resulting from the hstore(new.*) "minus" hstore(old.*) operation performed in the "update" part of the trigger function.

It's also easy to perform "before and after" type queries:


select old_values -> 'n_pages' as before, new_values -> 'n_pages' as after from audit where audit_id = 2;

 before | after 
--------+-------
 250    | 300

Don't forget however that hstore values are stored as strings, so this for instance wouldn't work:


select old_values -> 'n_pages' + new_values -> 'n_pages' from audit where audit_id = 2;

HINT:  No operator matches the given name and argument type(s). You might need to add explicit type casts.

whereas this would:


select (old_values -> 'n_pages')::int + (new_values -> 'n_pages')::int from audit where audit_id = 2;

One final aspect to note: I find that this pattern works well with my access control pattern, where every application user has its own PG role, on behalf of which the database operations are performed. In this scenario, the user/role gets automatically logged by the trigger (using the current_user variable). In contrast, an access control system implemented at the level of the application (i.e. with some kind of user table and all database operations performed on behalf of a single PG role) wouldn't enjoy that much simplicity I believe.