Archive for the month: April 2014

Copy a complete filesystem containing hardlinks, symlinks, different user names etc. via TCP/IP

I had to move a LOT of data from an old Linux server to a new one. As both systems had four disks configured as a software RAID-5, I couldn’t connect all drives to one computer. I was simply out of ports to plug the drives into. 🙂

I wanted to use rsync so that all the permissions etc. would stay intact. This was important to me, as part of the data came from the BackupPC directory, and BackupPC uses a special user and a LOT of hardlinks.

Simply copying everything with rsync wasn’t a success. The underlying SSH worked well, but the speed was far too slow (~20MB/s). The „old“ machine simply wasn’t powerful enough to handle RAID-5, SSH and rsync at full speed at the same time.

Using a different cipher (-c arcfour) helped a bit, but the speed was still too slow (~40MB/s). As both computers were in the same secured LAN, I figured I didn’t need the additional protection and could use rsync’s own daemon mode instead.
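For reference, this is roughly how a cipher choice gets passed through rsync to SSH. It is only a minimal sketch: the function name, the host and the paths are made-up placeholders, and arcfour has been removed from newer OpenSSH releases, so `ssh -Q cipher` is the place to check what your version still offers.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original setup): pull a directory
# from a remote host over SSH, selecting the SSH cipher explicitly.
# Usage: rsync_via_ssh CIPHER HOST REMOTE_PATH LOCAL_PATH
rsync_via_ssh() {
    local cipher="$1" host="$2" src="$3" dest="$4"
    # -e replaces the remote shell command; -c is ssh's cipher option.
    # -a keeps permissions and owners, -H keeps hardlinks.
    rsync -aH -e "ssh -c ${cipher}" "${host}:${src}" "${dest}"
}
```

Called as `rsync_via_ssh arcfour oldserver /data/ /data/` (with „oldserver“ standing in for your own host name), this corresponds to the SSH-based attempt described above.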

In the config file and the commands below you must change „hostname“ to the name of the „server“ and the „sharename“ accordingly. If you’re using a different IP range, change this as well.

On the server (here: the old machine) I created a file „rsyncd.conf“ with the following contents:

use chroot = true
hosts allow =
log file = /var/log/rsyncd.log
log format = %h %o %f %l %b
[sharename]
comment = sharename
path = /
read only = no
list = yes
uid = root
gid = root

And run this command on the server:

rsync --config=rsyncd.conf --daemon

On the client (here: the new, receiving machine) run this command:

rsync --delete -avPH hostname::sharename
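As printed, the client command names the module but no destination directory, and without a destination rsync only lists what the module contains. Here is a minimal sketch of a full pull into a directory of your choice, plus a small check to run on both machines afterwards to see whether hardlinked files still arrive as hardlinks. Both function names are made up for illustration, and the destination is whatever you pass in.

```shell
#!/bin/bash
# Hypothetical wrapper: pull a whole rsync daemon module into a local directory.
# Usage: pull_module HOST MODULE DEST_DIR
pull_module() {
    local host="$1" module="$2" dest="$3"
    # Trailing slashes: copy the module's contents into DEST_DIR itself.
    rsync --delete -avPH "${host}::${module}/" "${dest}/"
}

# Count regular files below DIR that have more than one hardlink.
# Comparing the number on the old and the new machine is a rough sanity check.
count_hardlinked_files() {
    find "$1" -type f -links +1 | wc -l
}
```

Keep in mind that --delete removes everything in the destination that doesn’t exist on the server, so double-check the destination path before running it.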

Extract Images from a website and produce a video out of them

I had a big website with a lot of images embedded inside, which were dynamically added by users. The pictures were all of the same size, so I thought it would be easy to create a video out of them. Here are the steps to reproduce it:

Step 0 – Prerequisites

  • You’ll need Python and an upload script if you want to upload the video directly from the server. If you’d rather upload it yourself, ignore this item and step 5.
  • md5sum
  • avconv – on Ubuntu, this is available through sudo apt-get install libav-tools

Step 1 – Extract the images

Save the website you’re about to process to a local file; you only need the HTML part. (This only works with embedded pictures. For static pictures, try downloading the website with wget -m.) Call the following script with the filename of the saved HTML file.

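#!/bin/bash
file="$1"   # the saved HTML file
count=0
mkdir -p result
echo "Processing ${file}."
# Put every tag on its own line, keep the embedded JPEGs, strip everything but the base64 data.
cat "${file}" | sed -e 's/>/>\n/g; s/<img/\n<img/g; s/^$//g' | grep 'data:image/jpeg;base64' | sed -e 's/<img src="data:image\/jpeg;base64,//; s/" .*//' >out
while read -r line
do
  let "count+=1"
  # Decode each payload into result/00001.jpg, result/00002.jpg, ...
  echo "${line}" | base64 -d >"$(printf "result/%05d.jpg" "${count}")"
done <out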
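If the img tags on your page are laid out differently (other attributes before src, for example), the sed expressions may not match. As an alternative sketch, not the original script, grep -o can pull out the base64 payloads regardless of attribute order. The function names are made up, and the sketch assumes that each payload sits on a single line and that your base64 understands -d, as the GNU coreutils one does.

```shell
#!/bin/bash
# Read HTML on stdin, print one base64 JPEG payload per line.
extract_jpeg_payloads() {
    grep -o 'data:image/jpeg;base64,[A-Za-z0-9+/=]*' \
        | sed -e 's/^data:image\/jpeg;base64,//'
}

# Decode every payload found in HTML_FILE into OUT_DIR/00001.jpg, 00002.jpg, ...
# Usage: decode_embedded_jpegs HTML_FILE OUT_DIR
decode_embedded_jpegs() {
    local html="$1" outdir="$2"
    mkdir -p "$outdir"
    extract_jpeg_payloads <"$html" | {
        count=0
        while read -r payload; do
            count=$((count + 1))
            printf '%s\n' "$payload" | base64 -d >"$(printf '%s/%05d.jpg' "$outdir" "$count")"
        done
    }
}
```

`decode_embedded_jpegs saved.html result` would then fill the result directory in the same way as the script above, with „saved.html“ being a placeholder for your own file.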

Step 2 – Images should be unique

We must delete all files that appear more than once in our results. The script below is meant to be run from inside the result directory.

declare -a list=( ./* )
declare -a sums
cnt=${#list[@]}   # number of files
echo "creating md5sum list"
for ((x = 0; x < cnt; x++))
do
    sums[$x]=$(md5sum "${list[$x]}" | cut -d ' ' -f 1)
done
echo "doing compare"
for ((x = 0; x < cnt - 1; x++))
do
  for ((y = x+1; y < cnt; y++))
  do
    if [ "${sums[$x]}" == "${sums[$y]}" ]; then
      if [ "${list[$x]}" != "${list[$y]}" ]; then
        #remove '#' in next line to enable
        echo "Delete file ${list[$y]}" # && rm -f "${list[$y]}"
      fi
    fi
  done
done
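Comparing every file with every other file gets slow once there are thousands of images. A shorter sketch of the same idea lets awk remember which checksums it has already seen, so each file is only looked at once. The function name is made up, and the sketch assumes file names without newlines or backslashes, since md5sum escapes those in its output.

```shell
#!/bin/bash
# Print the names of all *.jpg files in DIR whose content already
# appeared in an earlier file (earlier = alphabetical order).
# Usage: list_duplicates DIR
list_duplicates() {
    local dir="$1"
    # md5sum prints 32 hex digits, two separator characters, then the name,
    # so the file name starts at column 35.
    ( cd "$dir" && md5sum -- *.jpg ) | awk 'seen[$1]++ { print substr($0, 35) }'
}
```

Like the script above, this only prints the candidates. To really delete them, something like `list_duplicates result | while read -r f; do rm -f "result/$f"; done` would do.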

Step 3 – Renumber

avconv can assemble all the images, but their file names must start with 00000.jpg and continue with strictly increasing numbers. Therefore, we renumber all the JPG files accordingly.

seq=0
find . -type f -iname '*.jpg' | sort -n >filelist
while read -r line
do
        mv "${line}" "$(printf "%05d.jpg" "${seq}")"
        let "seq+=1"
done <filelist

Step 4 – Create the video

We want to output a video with 25 fps while the images themselves change at a rate of 3 per second. If you want the images to change more slowly, decrease the „-r 3“, for example down to „-r 1“. If you want them to change more quickly, increase it.

avconv -f image2 -r 3 -i %05d.jpg -r 25 out.mpg
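To get a feeling for the numbers: the first -r decides how many images are shown per second, so N images at an input rate of r give a clip of about N / r seconds, whatever the output rate is. A tiny helper (hypothetical name) to estimate the length before encoding:

```shell
#!/bin/bash
# Print the approximate clip length in seconds for COUNT images
# read at INPUT_RATE images per second (the first -r on the avconv line).
# Usage: clip_seconds COUNT INPUT_RATE
clip_seconds() {
    LC_ALL=C awk -v n="$1" -v r="$2" 'BEGIN { printf "%.1f\n", n / r }'
}
```

With 90 images, `clip_seconds 90 3` prints 30.0 and `clip_seconds 90 1` prints 90.0.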

Step 5 – Upload to YouTube

Make sure that you have the correct username and that a channel has been created. This part is quite tricky: if you keep getting the 403 error, make sure that you’ve logged in to YouTube and that you don’t get any Captchas.

python --email=USERLOGIN --password=PASSWORD --title="The title" --description="The description" --category=Comedy --keywords="keywords" out.mpg