sync/merge - Dokuwiki file structure
Wolle #1
Member since Oct 2010 · 8 posts

Hi all!

I want to sync my DokuWiki with Unison. Just for the record: I've got a local DokuWiki instance at home (Linux only, web server nginx) as a personal wiki and another instance on my USB stick (used with different OSes, web server nginx). They won't be altered at the same time (unfortunately I can't clone myself), so I don't have to be too careful about conflicting versions...

What I intend to do is sync the wiki at home with Unison when I plug the USB stick in, triggered by a udev rule. For removing the stick, I intend to have a simple desktop shortcut that triggers the sync script and then umount. In this scenario, the sync plugin isn't what I want, because I'm usually not logged in to the wiki as administrator and the whole thing seems a little cumbersome to me... Anyway, I have had a look at the sync plugin.
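
The "sync, then umount" shortcut for removing the stick could be sketched roughly as follows. This is only an illustration: SYNC_CMD, UMOUNT_CMD and the script path /usr/local/bin/dokuwiki-sync.sh are made-up names, not something from my actual setup. The point is that the stick only gets unmounted if the sync succeeded.

```shell
#!/bin/bash
# Sketch of a "safe remove" helper: sync first, unmount only if the sync worked.
# SYNC_CMD and UMOUNT_CMD are hypothetical hooks; the sync script path is an assumption.
SYNC_CMD=${SYNC_CMD:-/usr/local/bin/dokuwiki-sync.sh}
UMOUNT_CMD=${UMOUNT_CMD:-/bin/umount}

detach_stick() {
    # $1 ... mount point of the USB stick
    local mountpoint=$1

    if ! "$SYNC_CMD"; then
        echo "Sync failed, leaving $mountpoint mounted." >&2
        return 1
    fi
    "$UMOUNT_CMD" "$mountpoint"
}
```

The desktop shortcut would then call something like `detach_stick /media/mountpoint`.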

So far, the script that is triggered by udev works fine and looks as follows:
(By the way: I included some cleanup work as described in the DokuWiki wiki...)


#!/bin/bash

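# When called by udev, almost no environment variables are present! Commands
# must be called by their full path, or PATH has to be set explicitly (find,
# xargs, chown and chmod below are called without a path):
PATH=/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
export PATH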


cleanup() {
    # $1 ... full path to data directory of wiki
    # $2 ... number of days after which old files are to be removed

    # purge files older than $2 days from the attic (old revisions)
#    find "$1"/attic/ -type f -mtime +$2 -print0 | xargs -0r rm -f
    # remove stale lock files (files which are 1-2 days old)
    find "$1"/locks/ -name '*.lock' -type f -mtime +1 -print0 | xargs -0r rm -f
    # remove empty directories
    find "$1"/pages/ -depth -type d -empty -print0 | xargs -0r rmdir
    # remove files older than $2 days from the cache
    find "$1"/cache/?/ -type f -mtime +$2 -print0 | xargs -0r rm -f
}


# set up variables...
HOME_DIR=/home/username
DEVICE_PATH=/media/mountpoint
SYNC_DIR_HOME=${HOME_DIR}/DokuWiki
SYNC_DIR_DEVICE=${DEVICE_PATH}/wwwroot
ATTACH_LOG=${HOME_DIR}/dokuwiki-attach.log
UNISON_STDOUT=${HOME_DIR}/dokuwiki-attach-unison.stdout
UNISON_STDERR=${HOME_DIR}/dokuwiki-attach-unison.stderr
USER=username
GROUP=username


# back up old logfile (the move also removes the old one)
if [ -f "$ATTACH_LOG" ]
then
  /bin/mv -f "$ATTACH_LOG" "${ATTACH_LOG}.bak"
fi
# create new logfile
/bin/date > "$ATTACH_LOG"
echo >> "$ATTACH_LOG"


# mount USB device => IMPORTANT TO DO THAT NOW!!!
/bin/mount $DEVICE_PATH
# Apparently udev mounts the volume AFTER this script has run, no matter what
# order the number prefixes of the rule files specify... :-?
# This could, however, also be done via another udev rule... :-/


# cleanup DokuWiki installations: (path to datadir, number of days)
echo -ne "Clean up local DokuWiki installation..." >> $ATTACH_LOG
cleanup ${SYNC_DIR_HOME}/data 14 && echo " done." >> $ATTACH_LOG || echo " FAILED." >> $ATTACH_LOG
echo -ne "Clean up USB DokuWiki installation..." >> $ATTACH_LOG
cleanup ${SYNC_DIR_DEVICE}/data 14 && echo " done." >> $ATTACH_LOG || echo " FAILED." >> $ATTACH_LOG


# set environment variable $HOME => NEEDED BY UNISON!!!
export HOME=$HOME_DIR
# Unison run
echo -ne "Synchronizing USB and local wiki..." >> $ATTACH_LOG
/usr/local/bin/unison $SYNC_DIR_HOME $SYNC_DIR_DEVICE \
  -auto -batch -silent -log -fat -ui 'text' \
  -ignore 'Path data/cache' \
  -ignore 'Path data/locks' \
  -backuploc 'local' -maxbackups '5' -backup 'Name *' \
  -merge 'Name *.changes -> sort -u CURRENT1 CURRENT2 > NEW' \
  > $UNISON_STDOUT 2> $UNISON_STDERR
case "$?" in
    "0") echo " done." >> $ATTACH_LOG ;;
    *) echo " FAILED!" >> $ATTACH_LOG ;;
esac
echo >> $ATTACH_LOG

# Ownership & Permissions
echo -ne "Changing ownership of .unison-files (probably been changed by udev-invoked unison)..." >>  $ATTACH_LOG
# abort if the directory is missing, otherwise find would run in the wrong place
cd "${HOME_DIR}/.unison" || exit 1
find . -type f -print0 | xargs -0 chown $USER:$GROUP && echo " done." >> $ATTACH_LOG || echo " FAILED." >> $ATTACH_LOG
echo >> $ATTACH_LOG

# abort if the directory is missing, otherwise chmod/chown would hit the wrong tree
cd "$SYNC_DIR_HOME" || exit 1
echo -e "Setting up local permissions..." >> $ATTACH_LOG
echo -ne "\tFiles..." >> $ATTACH_LOG
find . -type f -print0 | xargs -0 chmod 0660 && echo " done." >> $ATTACH_LOG || echo " FAILED." >> $ATTACH_LOG
echo -ne "\tDirectories..." >> $ATTACH_LOG
find . -type d -print0 | xargs -0 chmod 0770 && echo " done." >> $ATTACH_LOG || echo " FAILED." >> $ATTACH_LOG

echo -e "Setting up local ownership..." >> $ATTACH_LOG
echo -ne "\tFiles..." >> $ATTACH_LOG
find . -type f -print0 | xargs -0 chown $USER:www-data && echo " done." >> $ATTACH_LOG || echo " FAILED." >> $ATTACH_LOG
echo -ne "\tDirectories..." >> $ATTACH_LOG
find . -type d -print0 | xargs -0 chown $USER:www-data && echo " done." >> $ATTACH_LOG || echo " FAILED." >> $ATTACH_LOG

cd $HOME_DIR
chown $USER:$GROUP $ATTACH_LOG ${ATTACH_LOG}.bak $UNISON_STDOUT $UNISON_STDERR

exit 0

----

The script works as I expected. I have to admit that I mount my USB stick via /etc/fstab, which isn't good style (udev would be better), but that's not the focus here...
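
The repeated `&& echo " done." || echo " FAILED."` pattern in the script could be factored into a small helper. A minimal sketch, assuming a helper name log_step that does not exist in the script above:

```shell
#!/bin/bash
# Hypothetical helper: run a command and append " done." or " FAILED." to the log.
ATTACH_LOG=${ATTACH_LOG:-/tmp/dokuwiki-attach.log}

log_step() {
    # $1 ... description written to the log
    # remaining arguments ... command (or shell function) to run
    local description=$1
    shift
    printf '%s...' "$description" >> "$ATTACH_LOG"
    if "$@"; then
        echo " done." >> "$ATTACH_LOG"
    else
        echo " FAILED." >> "$ATTACH_LOG"
        return 1
    fi
}
```

A call would then look like `log_step "Clean up local DokuWiki installation" cleanup "${SYNC_DIR_HOME}/data" 14`.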

Currently, I "merge" only the *.changes files, because they are the only ones I seem to understand. But there are other files that may need to be merged, so here are my notes on the files:

*.changes
  - meaning: changelog of pages
  - lines with identifier (probably depending on time and unique) at the beginning => sortable!
  - merge command: (merge, sort, remove duplicates)
    -merge 'Name *.changes -> sort -u CURRENT1 CURRENT2 > NEW' \
    (sort -u has to write into NEW itself; running "sort -u NEW" after a cat only prints to stdout and leaves NEW unsorted)
  - nice thing about that: it only uses standard Unix command-line tools
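
The merge step can be tried outside Unison with a small shell function. This is a sketch under the assumption stated above, namely that every changelog line starts with a unique, sortable identifier. The function name merge_changes is made up for illustration.

```shell
#!/bin/bash
merge_changes() {
    # $1, $2 ... the two versions of a *.changes file
    # $3 ...... merged output (must be a different file than $1 and $2)
    # sort -u reads both inputs, sorts them and drops duplicate lines in one go.
    # LC_ALL=C gives a byte-wise ordering that does not depend on the locale.
    LC_ALL=C sort -u -- "$1" "$2" > "$3"
}
```

For example, `merge_changes home/start.changes usb/start.changes merged.changes` leaves both inputs untouched and writes the combined changelog to merged.changes.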

*.idx
  - meaning: No idea.
  - Do they have to be merged? Or is "prefer newer" enough?
  - Should I ignore the files and better launch some index building script after the sync?
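
If the index files are left out of the sync (for example with an additional `-ignore 'Path data/index'`), the "index building script" option might look like the sketch below. It relies on the command-line indexer that DokuWiki ships as bin/indexer.php. PHP_BIN and the function name rebuild_index are assumptions for illustration, and I have not checked whether a full rebuild is actually required after every sync.

```shell
#!/bin/bash
# PHP_BIN is a hypothetical override so the PHP binary can be given by full path.
PHP_BIN=${PHP_BIN:-/usr/bin/php}

rebuild_index() {
    # $1 ... DokuWiki installation root (the directory containing bin/ and data/)
    local indexer=$1/bin/indexer.php
    if [ ! -f "$indexer" ]; then
        echo "No indexer found at $indexer" >&2
        return 1
    fi
    # -c clears the existing index before updating, -q suppresses output
    "$PHP_BIN" "$indexer" -c -q
}
```

It would be called once per installation after the Unison run, e.g. `rebuild_index "$SYNC_DIR_HOME"`, preferably as the user the web server runs as, so the index files keep usable ownership.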

*.changes.trimmed
  - meaning: No idea.
  - So far I only have a few of them (not one for every page!), and they are all empty...?

*.indexed
  - meaning: No idea.
  - All files that I examined contained just the number "2"...?
  - Is "prefer newer" sufficient?

*.txt
  - meaning: current page files
  - Is there already an identical version in the attic? Does DokuWiki create an attic archive file of the current page state each time a page is saved?
    => yes => "prefer newer"; older versions get imported anyway as attic archive files and via the corresponding *.changes files.
    => no => prefer newer, but manually store the older version in the attic (Uaaahhh...!!)
      - convention of names?
      - modify page changelog?
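
Whether the current page state is already in the attic can be checked empirically. The sketch below assumes attic revisions are stored as <page>.<timestamp>.txt.gz next to each other in the matching attic directory. Installations configured for another compression (or none) would need a different suffix. The function name page_matches_attic is made up.

```shell
#!/bin/bash
page_matches_attic() {
    # $1 ... page file, e.g. data/pages/wiki/start.txt
    # $2 ... matching attic directory, e.g. data/attic/wiki
    # returns 0 if the newest attic revision is identical to the page,
    #         1 if it differs, 2 if no attic revision was found
    local page=$1 attic_dir=$2 name newest
    name=$(basename "$page" .txt)
    # newest revision = highest timestamp in the file name
    newest=$(ls "$attic_dir"/"$name".*.txt.gz 2>/dev/null | sort | tail -n 1)
    [ -n "$newest" ] || return 2
    gzip -dc "$newest" | cmp -s - "$page"
}
```

Running this for a few pages right after saving them should show whether "prefer newer" is safe for the *.txt files.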

*.meta
  - I've read the documentation on metadata, but it's not that clear to me...
  - If for the *.txt files one should just do "prefer newer", then this would apply here, too (wouldn't it?)?
  - What if not?

I think this is a good start for a sync script. As you can see, I only need to understand the file structure of DokuWiki a little better: which file serves which purpose and how they are built. I couldn't find anything related to this in the FAQ/manual/wiki, so can anybody explain the files, their meaning and content to me, or maybe just point me in the right direction?

Thanks a lot in advance
    Wolle
Albert25 #2
Member for a month · 2 posts
Over 8 years later, I'm also looking for answers to mostly the same questions...

Wolle, if you read this, are you still using Unison, and how did you end up configuring it?

I'm syncing the local DokuWiki on my notebook with the one on my web server. My main problem is what to do with all these data/index/*.idx files.

If anybody knows some answers to Wolle's questions, please share them.
Current time: 2019-10-16, 07:33:15 (UTC +02:00)