shell and list of files

October 27th, 2011

How do you loop through a list of files?

For instance, say you want to archive and then delete all PDF documents in the current directory:

Bad practice :


tar cvf f.tar *.pdf
rm *.pdf

There are multiple issues with the commands above.

1) New files could arrive while tar is running, so rm would delete files that were never archived.


filelist=$(ls *.pdf)
tar cvf f.tar $filelist
rm $filelist

2) If there are no matching files, tar and rm will return an error.


filelist=$(ls|grep '\.pdf$')
if [ -n "$filelist" ]
then
  tar cvf f.tar $filelist
  rm $filelist
fi

3) This will not work for a long list (above roughly 100k documents).


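filelist=/tmp/filelist.$(date "+%Y%m%d%H%M%S").$$.$RANDOM
# write the names to a file instead of a shell variable
ls|grep '\.pdf$' > "$filelist"
if [ -s "$filelist" ]
then
  # L makes this tar read the file names from the list file
  tar cvfL f.tar "$filelist"
  # delete the listed files one at a time
  for f in $(<"$filelist")
  do
    rm $f
  done
fi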

As you can see, this requires special handling. tar, for instance, uses the -L option to read the list of files from a file (GNU tar uses -T or --files-from for this), and rm can delete the files one by one (or in batches with xargs -L).
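
The same idea can be sketched as a small function. This is only an illustration under stated assumptions: `archive_then_delete` is a hypothetical name, the tar is assumed to accept `-T` for a list file (GNU and BSD tar do; other tars may need a different option), and the delete step is skipped when tar reports a failure.

```shell
# archive_then_delete LISTFILE ARCHIVE   (hypothetical helper)
# Archives every file named in LISTFILE (one name per line),
# then deletes those files only if tar succeeded.
archive_then_delete() {
    list=$1
    archive=$2
    # nothing to do when the list is missing or empty
    [ -s "$list" ] || return 0
    # assumption: this tar reads the names from a file with -T
    tar -cf "$archive" -T "$list" || return 1
    # read the list line by line, so no long argument list is built
    # and names containing spaces stay in one piece
    while IFS= read -r f
    do
        rm -- "$f"
    done < "$list"
}
```

It would be called as `archive_then_delete "$filelist" f.tar` once the list file has been written.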

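This limit of about 100,000 files is something that often gets forgotten. It is usually a limit on the total size of the argument list rather than on the number of files, so the exact figure varies with your shell and OS.

Typical errors that can occur are: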


ksh: no space
bash: Arg list too long
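
The limit behind these errors is usually the operating system's cap on the combined size of the arguments and environment passed to a new process. A minimal sketch to display it, assuming a POSIX system where `getconf` is available (`arg_max` is just a name chosen for illustration):

```shell
# ask the system for its limit on arguments plus environment, in bytes
arg_max=$(getconf ARG_MAX)
echo "ARG_MAX is $arg_max bytes"
```

Since the value is in bytes, the number of files that fits depends on how long the file names are.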

  1. Eric Grancher
    October 27th, 2011 at 22:05 | #1

    good evening,

    This could be one way…

    F=/tmp/x
    find . -name \*.pdf > $F
    tar cvfz /tmp/t.tgz --files-from $F
    xargs --arg-file=$F --max-lines=1 rm

    cheers
    Eric

  2. October 28th, 2011 at 09:42 | #2

    please replace
    ls|grep '\.pdf' > filelist
    by

    ls|grep '\.pdf' > $filelist

    in the third script

  3. October 28th, 2011 at 09:53 | #3

    @Eric Grancher : sounds like GNU to me. I do not have those tar and xargs to test, but thanks for sharing

    @donat callens : Thank you !!!

  4. November 2nd, 2011 at 14:12 | #4

    Please, before calling rm, check the outcome of the tar!
    What if the tar fails (for example: disk full)? I assume you don't want to call rm then, ok?
    So don't forget to test the exit status of the tar command!

    tar cvf … && rm …

    is the idea

  5. Eric Grancher
    November 2nd, 2011 at 15:01 | #5

    @Sokrates very good point, thank you
    eric

  6. November 2nd, 2011 at 20:53 | #6

    “…
    There are multiple issues with the command above

    1) new files could come during the tar, so the rm will delete files that have not been archived
    …”

    or:

    1b) a file already added to the tar-archive could have been overwritten by another file with the same name while tar is still running, so the rm will delete files that have not been archived

    File lists don't protect you from that.

  7. November 2nd, 2011 at 21:49 | #7

    Return code of tar: indeed.

    Another way to go would be to move the files to a directory, tar the directory, then remove it.
