How To Know The Files Inside The Tar Parser

I am developing a Visual C++ application. I need to know the types of the files present inside a tar file (that is, whether it contains a .png file, an .html file, or a .txt file) (just by c++ prg

Solution 1:

A tar file is always a sequence of header block, data block, header block, data block, and so on. Each header block is 512 bytes long and contains all the information about one file (filename, size, permissions, ...), and the data block that follows it holds the contents of that file. Each data block is padded to the next multiple of 512 bytes, based on the file size recorded in the header block. So once you have read one header block and want to skip to the next one, calculate

size_t skip = filesize % 512 ? filesize + 512 - (filesize % 512) : filesize;

or, without a branch (this works because 512 is a power of two)

size_t skip = (filesize + 511) & ~(size_t)511;

and seek skip bytes forward.
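The rounding rule can be checked with a small helper. What follows is only a minimal sketch: the names padded_size and parse_octal are hypothetical and not part of any library. It also assumes the standard tar header layout, in which the file size is stored as an octal ASCII string in the 12 bytes at offset 124 of the header block, so it has to be converted before it can be rounded.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Size of every tar block (header blocks and data padding alike).
constexpr std::uint64_t kBlockSize = 512;

// Round a file size up to the next multiple of 512.
// Hypothetical helper name; relies on 512 being a power of two.
constexpr std::uint64_t padded_size(std::uint64_t filesize) {
    return (filesize + kBlockSize - 1) & ~(kBlockSize - 1);
}

// Convert an octal ASCII field (such as the 12-byte size field of a
// tar header) to a number. Parsing stops at the first byte that is
// not an octal digit, which covers the usual NUL or space terminator.
inline std::uint64_t parse_octal(const char* field, std::size_t len) {
    assert(field != nullptr);
    std::size_t i = 0;
    while (i < len && field[i] == ' ') ++i;  // tolerate leading spaces
    std::uint64_t value = 0;
    for (; i < len && field[i] >= '0' && field[i] <= '7'; ++i) {
        value = value * 8 + static_cast<std::uint64_t>(field[i] - '0');
    }
    return value;
}

// Compile-time checks of the rounding rule.
static_assert(padded_size(12345) == 12800, "12345 rounds up to 12800");
static_assert(padded_size(123) == 512, "123 rounds up to 512");
static_assert(padded_size(512) == 512, "exact multiples are unchanged");
```

A file of size 0 has no data block at all, and padded_size(0) returns 0 accordingly.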

For example, if your tar file contains two files, a.bin of size 12345 (next multiple of 512 is 12800) and b.txt of size 123 (next multiple of 512 is, obviously, 512), then you would have:

  1. header containing information about a.bin starting at Pos. 0
  2. data of a.bin starting at Pos. 512
  3. header containing information about b.txt starting at Pos. 512 + 12800 = 13312
  4. data of b.txt starting at Pos. 13312 + 512 = 13824
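  5. the file size of the tar file will be at least 13824 + 512 = 14336. In practice, you will generally find the tar file to be larger: the end of the archive is marked by zero-filled 512-byte blocks (the format calls for two of them), so the 512 bytes at Pos. 14336 will be \0, and tools often pad the archive further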
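
The walk through the archive described above could be sketched in C++ roughly as follows. This is an illustration under stated assumptions, not a complete tar reader: TarEntry, text_field, octal_field, extension_of, list_entries and list_entries_from_memory are hypothetical names, and the field offsets (name in 100 bytes at offset 0, size as octal text in 12 bytes at offset 124, type flag at offset 156) assume the standard tar header layout. Names longer than 100 bytes, which use the ustar prefix field or GNU/pax extension records, are not handled. Only the extension in the stored name is reported; the file contents are not inspected.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

constexpr std::size_t kBlock = 512;  // size of every tar block

struct TarEntry {
    std::string name;       // path as stored in the header
    std::uint64_t size;     // file size in bytes
    std::string extension;  // text after the last '.', may be empty
};

// Read a text field that is NUL-terminated unless it fills its full width.
inline std::string text_field(const char* block, std::size_t off, std::size_t len) {
    assert(off + len <= kBlock);
    std::size_t n = 0;
    while (n < len && block[off + n] != '\0') ++n;
    return std::string(block + off, n);
}

// Read a number stored as octal ASCII digits.
inline std::uint64_t octal_field(const char* block, std::size_t off, std::size_t len) {
    assert(off + len <= kBlock);
    std::size_t i = 0;
    while (i < len && block[off + i] == ' ') ++i;  // tolerate leading spaces
    std::uint64_t value = 0;
    for (; i < len && block[off + i] >= '0' && block[off + i] <= '7'; ++i) {
        value = value * 8 + static_cast<std::uint64_t>(block[off + i] - '0');
    }
    return value;
}

// Extension of the last path component, without the dot ("" if none).
inline std::string extension_of(const std::string& name) {
    const std::size_t dot = name.find_last_of('.');
    const std::size_t slash = name.find_last_of('/');
    if (dot == std::string::npos) return "";
    if (slash != std::string::npos && dot < slash) return "";
    return name.substr(dot + 1);
}

// Walk header block, data block, header block, ... and collect regular files.
inline std::vector<TarEntry> list_entries(std::istream& in) {
    std::vector<TarEntry> entries;
    std::array<char, kBlock> header{};
    while (in.read(header.data(), static_cast<std::streamsize>(kBlock))) {
        if (header[0] == '\0') break;  // zero-filled block: end of archive
        TarEntry e;
        e.name = text_field(header.data(), 0, 100);
        e.size = octal_field(header.data(), 124, 12);
        e.extension = extension_of(e.name);
        const char type = header[156];
        if (type == '0' || type == '\0') entries.push_back(e);  // regular file
        // Skip the data block: file size rounded up to a multiple of 512.
        const std::uint64_t skip =
            (e.size + kBlock - 1) & ~static_cast<std::uint64_t>(kBlock - 1);
        in.seekg(static_cast<std::streamoff>(skip), std::ios_base::cur);
    }
    return entries;
}

// Convenience wrapper for an archive that is already in memory.
inline std::vector<TarEntry> list_entries_from_memory(const std::string& bytes) {
    std::istringstream in(bytes, std::ios_base::in | std::ios_base::binary);
    return list_entries(in);
}
```

To read an archive from disk, open a std::ifstream with std::ios::binary and pass it to list_entries. Binary mode matters on Windows, where text mode would translate line endings and corrupt the offsets.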
