LogMiner: purging old accesses

I’ve just committed a function to easily purge old data from LogMiner’s database, useful to prevent it from growing too much.

It’ll appear in the next version, but since its release might still be far from now, here’s the code for those who need it:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
CREATE OR REPLACE FUNCTION cleanup( _site int8, _upToDate DATE )
RETURNS INT AS $body$
    BEGIN
   
        DELETE FROM accesses
    WHERE req_time < _upToDate
    AND site = _site;

    ALTER TABLE accesses DROP CONSTRAINT accesses_request_fkey;
    ALTER TABLE accesses DROP CONSTRAINT accesses_search_fkey;
    ALTER TABLE accesses DROP CONSTRAINT accesses_referrer_fkey;

    DELETE FROM requests
    WHERE id NOT IN ( SELECT DISTINCT request FROM accesses );

    DELETE FROM search_referrals
    WHERE id NOT IN ( SELECT DISTINCT SEARCH FROM accesses );

    DELETE FROM referrers
    WHERE id NOT IN ( SELECT DISTINCT referrer FROM accesses );

    ALTER TABLE accesses
    ADD CONSTRAINT "accesses_request_fkey"
    FOREIGN KEY ( request )
    REFERENCES requests( id ) ON DELETE CASCADE;

    ALTER TABLE accesses
    ADD CONSTRAINT "accesses_search_fkey"
    FOREIGN KEY ( SEARCH )
    REFERENCES search_referrals( id ) ON DELETE SET NULL;

    ALTER TABLE accesses
    ADD CONSTRAINT "accesses_referrer_fkey"
    FOREIGN KEY ( referrer )
    REFERENCES referrers( id ) ON DELETE SET NULL;

        RETURN 0;
    END;
$body$ LANGUAGE plpgsql;

For instance, if you want to keep only the last six month of data, you can set up a cron job which runs at the first day of every month executing the command:

1
2
# supposing 1 is the id of your site
echo "SELECT * FROM cleanup( 1, CAST( date_trunc( 'month', now() ) - interval '5 months' AS date ));" | psql -U logminer logminer

One comment

  1. Version 1.23 of LogMiner, my Apache/IIS log analysis package, is now available. This release mainly fixes some build problem on modern distributions. It also finally include the cleanup function I was talking about some days ago…

Leave a Reply

Your email address will not be published. Required fields are marked *