
CFileInfoArray: A class for gathering file information recursively through directories
By Antonio Tejada Lacaci

 
 VC++ 5.0-6.0, NT 4.0, Win95/98
 Posted 24 Nov 1999

  • Download source - 9 Kb
  • Download demo project - 35 Kb
  • Download demo exe - 19 Kb
  • FCompare screenshot

    Abstract

    I've been involved in several projects that required gathering files across directories, and this class does just that: it gathers file information recursively by directory. As a bonus, it also calculates a 32-bit file checksum (note that this is not the checksum of NT executables calculated with MapFileAndCheckSum) and a 32-bit file CRC (using borrowed code; I didn't feel like reinventing the wheel, and the other option was to review my Codification Theory notes, and I'm a bit allergic to dust).

    The second part of this article presents FCompare, a sample application that uses CFileInfo and CFileInfoArray. The application performs:

    1. A recursive search for the source and target files to compare, given a directory and a filemask.
    2. A binary comparison of source and target files by size, partial/total content, partial/total checksum or partial/total CRC.
    3. The filling of a listview with the matched filenames and paths.

    Building environment

    VC++ 6.0, with warning level 4.
    Tested on Windows NT 4.0 and Windows 95.
    Although I have not tested it, I expect CFileInfo, CFileInfoArray and FCompare can be safely recompiled for Unicode.

    CFileInfo and CFileInfoArray

    /**
     * @class Stores information about a file in a way like <c CFileFind> does
     */
    class CFileInfo {  
    public:
       /** @access Public members */
       CFileInfo();
       /**
        * @cmember Copy constructor
        * @parm CFileInfo to copy member variables from.
        */
       CFileInfo(const CFileInfo& finf);
    
       /**
        * @cmember Destructor
        */
       ~CFileInfo();
    
       /**
        * @cmember Initializes CFileInfo member variables.
        * @parm Values to init member variables.
        * @parm Path of the file the CFileInfo refers to.
        * @parmopt User defined parameter.
        */
       void Create(const WIN32_FIND_DATA* pwfd, const CString strPath, LPARAM lParam=NULL);
    
       /**
        * @cmember Initializes CFileInfo member variables.
        * @parm Absolute path for file or directory
        * @parmopt User defined parameter.
        */
       void Create(const CString strFilePath, LPARAM lParam = NULL);
       
       /**
        * @cmember Calcs 32bit checksum of file (i.e. sum of all the DWORDS of the file, 
        * truncated to 32bit).
        * @parmopt Number of maximum bytes read for checksum calculation. This number is 
        * up-rounded to a multiple of 4 bytes (DWORD). If 0 or bigger than uhFileSize, checksum
        * for all the file is calculated.
        * @parmopt Force recalculation of checksum (otherwise if checksum has already
        * been calculated, it isn't calculated again and previous calculated value is returned).
        * @parmopt Flag to allow calling application to abort the calculation of 
        * checksum (for multithreaded applications).
        * @parmopt Pointer to counter of bytes whose checksum has been calculated. 
        * This value is updated while checksum is being calculated, so calling application
        * can view the progress of checksum calc (for multithreaded applications).
        * Maximum value for pulCount is uhFileSize.
        */
       DWORD GetChecksum(const ULONGLONG uhUpto=0, const BOOL bRecalc = FALSE, 
          const volatile BOOL* pbAbort=NULL, volatile ULONG* pulCount = NULL);
    
       /**
        * @cmember Calcs 32bit CRC of file contents (i.e. CRC of all the DWORDS of the file).
        * @parmopt Number of maximum bytes read for CRC calculation. This number is 
        * up-rounded to a multiple of 4 bytes (DWORD). If 0 or bigger than uhFileSize, CRC
        * for all the file is calculated.
        * @parmopt Force recalculation of CRC (otherwise if CRC has already
        * been calculated, it isn't calculated again and previous calculated value is returned).
        * @parmopt pbAbort Flag to allow calling application to abort the calculation of 
        * CRC (for multithreaded applications).
        * @parmopt Pointer to counter of bytes whose CRC has been calculated. 
        * This value is updated while CRC is being calculated, so calling application
        * can view the progress of CRC calc (for multithreaded applications).
        * Maximum value for pulCount is uhFileSize.
        */
       DWORD GetCRC(const ULONGLONG uhUpto=0, const BOOL bRecalc = FALSE,
          const volatile BOOL* pbAbort=NULL, volatile ULONG* pulCount = NULL);
    
       /** @cmember File size in bytes as a DWORD value. */
       DWORD GetLength(void) const { return (DWORD) m_uhFileSize; };
       /** @cmember File size in bytes as an ULONGLONG value. */
       ULONGLONG GetLength64(void) const { return m_uhFileSize; };
       
       /** Get File split info (equivalent to CFileFind members) */
    
       /** 
        * @cmember Gets the file drive 
        * @rdesc Returns C: for C:\WINDOWS\WIN.INI 
        */
       CString GetFileDrive(void) const;
       /** 
        * @cmember Gets the file dir 
        * @rdesc Returns \WINDOWS\ for C:\WINDOWS\WIN.INI 
        */
       CString GetFileDir(void) const;
       /** @cmember returns WIN for C:\WINDOWS\WIN.INI */
       CString GetFileTitle(void) const;
       /** @cmember returns INI for C:\WINDOWS\WIN.INI */
       CString GetFileExt(void) const;
       /** @cmember returns C:\WINDOWS\ for C:\WINDOWS\WIN.INI */
       CString GetFileRoot(void) const { return GetFileDrive() + GetFileDir(); };
       /** @cmember returns WIN.INI for C:\WINDOWS\WIN.INI */
       CString GetFileName(void) const { return GetFileTitle() + GetFileExt(); };
       /** @cmember returns C:\WINDOWS\WIN.INI for C:\WINDOWS\WIN.INI */
       const CString& GetFilePath(void) const { return m_strFilePath; }
    
       /* Get File times info (equivalent to CFileFind members) */
       /** @cmember returns creation time */
       const CTime& GetCreationTime(void) const { return m_timCreation; };
       /** @cmember returns last access time */
       const CTime& GetLastAccessTime(void) const { return m_timLastAccess; };
       /** @cmember returns last write time */
       const CTime& GetLastWriteTime(void) const { return m_timLastWrite; };
    
       /* Get File attributes info (equivalent to CFileFind members) */
       /** @cmember returns file attributes */
       DWORD GetAttributes(void) const { return m_dwAttributes; };
       /** @cmember returns TRUE if the file is a directory */
       BOOL IsDirectory(void) const { return m_dwAttributes & FILE_ATTRIBUTE_DIRECTORY; };
       /** @cmember Returns TRUE if the file has archive bit set */
       BOOL IsArchived(void) const { return m_dwAttributes & FILE_ATTRIBUTE_ARCHIVE; };
       /** @cmember Returns TRUE if the file is read-only */
       BOOL IsReadOnly(void) const { return m_dwAttributes & FILE_ATTRIBUTE_READONLY; };
       /** @cmember Returns TRUE if the file is compressed */
       BOOL IsCompressed(void) const { return m_dwAttributes & FILE_ATTRIBUTE_COMPRESSED; };
       /** @cmember Returns TRUE if the file is a system file */
       BOOL IsSystem(void) const { return m_dwAttributes & FILE_ATTRIBUTE_SYSTEM; };
       /** @cmember Returns TRUE if the file is hidden */
       BOOL IsHidden(void) const { return m_dwAttributes & FILE_ATTRIBUTE_HIDDEN; };
       /** @cmember Returns TRUE if the file is temporary */
       BOOL IsTemporary(void) const { return m_dwAttributes & FILE_ATTRIBUTE_TEMPORARY; };
       /** @cmember Returns TRUE if the file is a normal file */
       BOOL IsNormal(void) const { return m_dwAttributes == 0; };
       
       LPARAM m_lParam;        /** User-defined parameter */
    private:
       /** @access Private members */
    
       CString m_strFilePath;  /** @cmember Full filepath of file (directory+filename) */
       DWORD m_dwAttributes;   /** @cmember File attributes of file (as returned by FindFile()) */
       ULONGLONG m_uhFileSize; /** @cmember Size of file. (COM states LONGLONG as hyper, so "uh" means 
                              unsigned hyper) */
       CTime m_timCreation;    /** @cmember Creation time */
       CTime m_timLastAccess;  /** @cmember Last Access time */
       CTime m_timLastWrite;   /** @cmember Last write time */
    
       DWORD m_dwChecksum;     /** @cmember Checksum calculated for the first m_uhChecksumBytes bytes */
       DWORD m_dwCRC;          /** @cmember CRC calculated for the first m_uhCRCBytes bytes */
       DWORD m_uhCRCBytes;     /** @cmember Number of file bytes with CRC calc'ed (4 multiple or filesize ) */
       DWORD m_uhChecksumBytes;/** @cmember Number of file bytes with Checksum calc'ed (4 multiple or filesize) */
    }; 
    
    /**
     * @class Allows to retrieve <c CFileInfo>s from files/directories in a directory
     */
    class CFileInfoArray : public CArray<CFileInfo, CFileInfo&> {
    public:
       /** @access Public members */
    
       /**
        * @cmember Default constructor
        */
       CFileInfoArray();
    
    
       /** 
        * @cmember,menum Default values for <md CFileInfoArray.lAddParam>
        */   
       enum { 
          AP_NOSORT=0,         /** @@emem Insert <c CFileInfo>s in an unordered manner */
          AP_SORTASCENDING=0,  /** @@emem Insert <c CFileInfo>s in ascending order */
          AP_SORTDESCENDING=1, /** @@emem Insert <c CFileInfo>s in descending order */
          AP_SORTBYSIZE=2,     /** @@emem AP_SORTBYSIZE | Insert <c CFileInfo>s ordered by uhFileSize (presumes array is 
                                   previously ordered by uhFileSize). */
          AP_SORTBYNAME=4      /** @@emem AP_SORTBYNAME | Insert <c CFileInfo>s ordered by strFilePath (presumes array is 
                                     previously ordered by strFilePath) */
       };
    
       /**
        * @cmember Adds a file or all contained in a directory to the CFileInfoArray
        * Only "static" data for CFileInfo is filled (by default CRC and checksum are NOT 
        * calculated when inserting CFileInfos).
        * Returns the number of <c CFileInfo>s added to the array
        * @parm Name of the directory, ended in backslash.
        * @parm Mask of files to add in case that strDirName is a directory
        * @parm Whether to recurse or not subdirectories
        * @parmopt Parameter to pass to protected member function AddFileInfo
        * @parmopt Whether to add or not CFileInfos for directories
        * @parmopt Pointer to a variable to signal abort of directory retrieval 
        * (multithreaded apps).
        * @parmopt pulCount Pointer to a variable incremented each time a CFileInfo is added to the
        * array (multithreaded apps).
        * @xref <mf CFileInfoArray.AddFile> <mf CFileInfoArray.AddFileInfo> <md CFileInfoArray.AP_NOSORT>
        */
       int AddDir(const CString strDirName, const CString strMask, const BOOL bRecurse, 
          LPARAM lAddParam=AP_NOSORT, const BOOL bIncludeDirs=FALSE, 
          const volatile BOOL* pbAbort = NULL, volatile ULONG* pulCount = NULL);
    
       /**
        * @cmember Adds a single file or directory to the CFileInfoArray. In case of directory, files
        * contained in the directory are NOT added to the array.
        * Returns the position in the array where the <c CFileInfo> was added (-1 if <c CFileInfo>
        * wasn't added)
        * @parm Name of the file or directory to add. NOT ended with backslash.
        * @parm Parameter to pass to protected member function AddFileInfo.
        * @xref <mf CFileInfoArray.AddDir> <mf CFileInfoArray.AddFileInfo>
        */
       int AddFile(const CString strFilePath, LPARAM lAddParam);
    
    
    protected:
       /** @access Protected Members */
    
       /**
        * @cmember Called by AddXXXX to add a CFileInfo to the array. Can be overridden to:
        * 1. Add only desired CFileInfos (filter)
        * 2. Fill user param lParam
        * 3. Change sort order/criteria
        * Returns the position in the array where the CFileInfo was added or -1 if the CFileInfo 
        * wasn't added to the array.
        * Default implementation sorts by lAddParam values and adds all CFileInfos 
        * (no filtering)
        * @parm CFileInfo to insert in the array.
        * @parm Parameter passed from AddDir function.
        * @xref <mf CFileInfoArray.AddDir>
        */
       virtual int AddFileInfo(CFileInfo& finf, LPARAM lAddParam);
    };
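
    The GetChecksum comments above describe a sum of DWORDs truncated to 32 bits, a byte limit rounded up to a multiple of 4, an abort flag and a progress counter. A minimal sketch of those rules follows. It is not the class's implementation: ChecksumUpTo is a hypothetical helper that works on a buffer already in memory, uses std::atomic in place of the volatile pointers, reads DWORDs as little-endian, and treats the bytes missing from a trailing partial DWORD as zero (an assumption; the header does not say how that case is handled).

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of CFileInfo.
// Sums the little-endian DWORDs of `data`, wrapping at 32 bits.
// `upTo` limits the bytes read; it is rounded up to a multiple of 4.
// 0, or a value at or past the end of the data, means "use everything".
std::uint32_t ChecksumUpTo(const std::vector<std::uint8_t>& data,
                           std::size_t upTo = 0,
                           const std::atomic<bool>* abort = nullptr,
                           std::atomic<std::size_t>* count = nullptr)
{
    std::size_t limit = data.size();
    if (upTo != 0 && upTo < limit) {
        const std::size_t rounded = (upTo + 3) / 4 * 4;  // round up to a DWORD
        if (rounded < limit) limit = rounded;
    }

    std::uint32_t sum = 0;
    for (std::size_t pos = 0; pos < limit; pos += 4) {
        if (abort != nullptr && abort->load()) break;    // caller asked to stop

        // Assemble one DWORD; bytes past `limit` count as zero (assumption).
        std::uint32_t dword = 0;
        for (std::size_t i = 0; i < 4 && pos + i < limit; ++i)
            dword |= static_cast<std::uint32_t>(data[pos + i]) << (8 * i);

        sum += dword;  // unsigned arithmetic wraps, i.e. truncates to 32 bits

        // Publish progress; never report more than the bytes actually covered.
        if (count != nullptr) count->store(pos + 4 < limit ? pos + 4 : limit);
    }
    return sum;
}
```

    A worker thread would call ChecksumUpTo while the UI thread polls the counter and sets the abort flag when the user cancels.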
    

    How to use it

    I recommend reading the class header above thoroughly to get an overall view of the classes and their methods. For further reference, you can inspect FCompare's source code (see the second half of the article).

    Here is some sample code:

    This code adds all files in the root directory and its subdirectories (but not the directories themselves) to the array and TRACEs them:

    CFileInfoArray fia;
    
    fia.AddDir(
       "C:\\",                                     // Directory
       "*.*",                                      // Filemask (all files)
       TRUE,                                       // Recurse subdirs
       CFileInfoArray::AP_SORTBYNAME | CFileInfoArray::AP_SORTASCENDING, // Sort by name, ascending
       FALSE                                       // Do not add array entries for directories (only for files)
    );
    TRACE("Dumping directory contents\n");
    // Pass the path as an argument so a '%' in a filename is not read as a format specifier
    for (int i=0;i<fia.GetSize();i++) TRACE("%s\n", (LPCTSTR) fia[i].GetFilePath());
    

    You can also call AddDir multiple times. This example lists the files in the root directories (but not the subdirectories) of C:\ and D:\:

    CFileInfoArray fia;
    
    // Note both AddDir use the same sorting order and direction
    fia.AddDir("C:\\", "*.*", FALSE, CFileInfoArray::AP_SORTBYNAME | CFileInfoArray::AP_SORTASCENDING, FALSE );
    fia.AddDir("D:\\", "*.*", FALSE, CFileInfoArray::AP_SORTBYNAME | CFileInfoArray::AP_SORTASCENDING, FALSE );
    TRACE("Dumping directory contents for C:\\ and D:\\ \n");
    for (int i=0;i<fia.GetSize();i++) TRACE("%s\n", (LPCTSTR) fia[i].GetFilePath());
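
    The reason every call has to use the same order and direction is that a sorted insert presumes the array is already ordered the same way (see the AP_SORTBYSIZE and AP_SORTBYNAME comments in the header). The following is only a sketch of that idea with standard containers: InsertSorted is a hypothetical function, a std::vector of strings stands in for the array, and the plain byte-wise string comparison is an assumption, since the article does not say how the class compares paths.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the array: a vector of paths kept in order.
// Inserts `path` at the position found by binary search and returns its index.
// The search is only valid if `paths` is already sorted in the same direction.
std::size_t InsertSorted(std::vector<std::string>& paths,
                         const std::string& path, bool descending)
{
    // Strict ordering for the requested direction.
    auto before = [descending](const std::string& a, const std::string& b) {
        return descending ? b < a : a < b;
    };

    // Precondition: earlier inserts used the same order and direction.
    assert(std::is_sorted(paths.begin(), paths.end(), before));

    // First position whose element must come after `path`.
    auto pos = std::upper_bound(paths.begin(), paths.end(), path, before);
    const std::size_t index = static_cast<std::size_t>(pos - paths.begin());
    paths.insert(pos, path);
    return index;
}
```

    Mixing directions between calls would trip the precondition: the binary search would land on an arbitrary position and the array would no longer be ordered.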
    

    Or you can add individual files:

    CFileInfoArray fia;
    
    // Note both AddDir and AddFile must use the same sorting order and direction
    fia.AddDir("C:\\WINDOWS\\", "*.*", FALSE, CFileInfoArray::AP_SORTBYNAME | CFileInfoArray::AP_SORTDESCENDING, FALSE );
    fia.AddFile("C:\\AUTOEXEC.BAT", CFileInfoArray::AP_SORTBYNAME | CFileInfoArray::AP_SORTDESCENDING);
    TRACE("Dumping directory contents for C:\\WINDOWS\\ and file C:\\AUTOEXEC.BAT\n");
    for (int i=0;i<fia.GetSize();i++) TRACE("%s\n", (LPCTSTR) fia[i].GetFilePath());
    

    And mix directories with individual files:

    CFileInfoArray fia;
    
    // Note both AddDir and AddFile must use the same sorting order and direction
    // Note also the list of filemasks *.EXE and *.COM
    fia.AddDir("C:\\WINDOWS\\", "*.EXE;*.COM", FALSE, CFileInfoArray::AP_SORTBYNAME | CFileInfoArray::AP_SORTDESCENDING, FALSE );
    fia.AddFile("C:\\AUTOEXEC.BAT", CFileInfoArray::AP_SORTBYNAME | CFileInfoArray::AP_SORTDESCENDING);
    // Note no trailing backslash for the next AddFile (we want to insert an entry for the
    // directory itself, not for the files inside the directory)
    fia.AddFile("C:\\PROGRAM FILES", CFileInfoArray::AP_SORTBYNAME | CFileInfoArray::AP_SORTDESCENDING);
    TRACE("Dumping directory contents for C:\\WINDOWS\\, file C:\\AUTOEXEC.BAT and "
    "directory \"C:\\PROGRAM FILES\" \n");
    for (int i=0;i<fia.GetSize();i++) TRACE("%s\n", (LPCTSTR) fia[i].GetFilePath());
    

    Implementation details and rationale

    Sample application: FCompare

    Download source - 35 KB

    FCompare (Binary File Compare) is an application that performs a binary comparison of a group of files, selected recursively from a given directory and filemask.
    Binary comparison can be done by comparing files' size, CRC, checksum or contents. When comparing by CRC, checksum and contents you can limit the number of bytes the comparison will take into account.
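
    As an illustration of comparing by size plus a CRC over a limited number of bytes, here is a minimal sketch. It is not FCompare's code: Crc32UpTo and MatchByCrc are hypothetical names, the CRC is the common reflected CRC-32 (polynomial 0xEDB88320), which may differ from the variant in the borrowed code, and the limit is applied byte by byte rather than rounded up to a DWORD as GetCRC documents.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Bitwise CRC-32 (reflected polynomial 0xEDB88320, as used by zip and PNG).
// Assumption: the CRC code borrowed for CFileInfo may use another variant.
// `upTo` == 0, or larger than the data, means "use every byte".
std::uint32_t Crc32UpTo(const std::string& bytes, std::size_t upTo = 0)
{
    const std::size_t limit =
        (upTo == 0 || upTo > bytes.size()) ? bytes.size() : upTo;

    std::uint32_t crc = 0xFFFFFFFFu;
    for (std::size_t i = 0; i < limit; ++i) {
        crc ^= static_cast<std::uint8_t>(bytes[i]);
        for (int bit = 0; bit < 8; ++bit)
            // Shift right; XOR in the polynomial when the dropped bit was 1.
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return crc ^ 0xFFFFFFFFu;
}

// Two buffers "match" when their sizes and (partial) CRCs agree.
// Equal CRCs do not prove equal contents, so a byte-by-byte comparison
// remains the only definitive check.
bool MatchByCrc(const std::string& a, const std::string& b,
                std::size_t upTo = 0)
{
    return a.size() == b.size() && Crc32UpTo(a, upTo) == Crc32UpTo(b, upTo);
}
```

    Limiting the byte count speeds things up at the cost of precision: two files that differ only after the limit are reported as matching.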

    Technical Features

    How to use it

    It is fairly straightforward to use. The normal procedure is:
    1. Fill the Directory editbox either by typing a directory or by selecting one through the browse directory dialog that appears when you press the ... button. If you want to recurse subdirectories, check the Recurse dirs checkbox.
    2. Fill File masks editbox with a semicolon separated list of filemasks, for example *.htm;*.html;*.shtml;*.asp to find all HTML-related files.
    3. Press Add to Source button. The files in the selected directory will be gathered and the Source files listview filled.
    4. Select another (or the same) directory and filemasks.
    5. Press Add to Target button. The files in the selected directory will be gathered and the Target files listview filled.
    6. Select a comparison method: For checksum, CRC and contents you can enter in UpTo editbox the number of bytes of the file that will be used to calc the value (thus speeding up the calculation). Enter 0 to use all the bytes of the file for calculation.
    7. If you want to suppress duplicated files (files that appear in both the target and source listviews) from appearing in the matched listview, uncheck Compare duplicates.
    8. Press Compare button.
    9. Matched files will appear in Compare tab. You can export the three lists to a file by pressing Export... button and selecting a file.
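
    The semicolon-separated filemask list from step 2 can be pictured with the sketch below. It is not taken from FCompare: MatchesMask and MatchesAnyMask are hypothetical helpers, and the matching rules (case-insensitive, '*' for any run of characters, '?' for exactly one) are assumptions. In particular, this simple matcher requires a literal dot for "*.*", so it does not reproduce every quirk of the Windows file search functions.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: case-insensitive wildcard match of one mask.
// '*' matches any run of characters, '?' matches exactly one.
bool MatchesMask(const std::string& name, const std::string& mask)
{
    std::size_t n = 0, m = 0;
    std::size_t starMask = std::string::npos, starName = 0;
    while (n < name.size()) {
        if (m < mask.size() && mask[m] == '*') {
            starMask = m++;      // remember the star so we can backtrack to it
            starName = n;
        } else if (m < mask.size() && (mask[m] == '?' ||
                   std::toupper(static_cast<unsigned char>(mask[m])) ==
                   std::toupper(static_cast<unsigned char>(name[n])))) {
            ++n; ++m;            // one character matched
        } else if (starMask != std::string::npos) {
            m = starMask + 1;    // let the last star absorb one more character
            n = ++starName;
        } else {
            return false;
        }
    }
    while (m < mask.size() && mask[m] == '*') ++m;  // trailing stars match ""
    return m == mask.size();
}

// Hypothetical helper: true if `name` matches any mask in a list such as
// "*.htm;*.html;*.shtml;*.asp". Empty entries are ignored.
bool MatchesAnyMask(const std::string& name, const std::string& masks)
{
    std::size_t start = 0;
    while (start <= masks.size()) {
        std::size_t end = masks.find(';', start);
        if (end == std::string::npos) end = masks.size();
        if (end > start && MatchesMask(name, masks.substr(start, end - start)))
            return true;
        start = end + 1;
    }
    return false;
}
```

    With these helpers, a file gathered from the directory is kept when MatchesAnyMask returns true for its name.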

    Implementation details and rationale

     

    History:

    1999-9-23 ATL (v1.4)
    1999-9-16 ATL (v1.3)
    1999-9-2 ATL (v1.2)
    1999-4-30 ATL (v1.1, Internal Release)
    1999-4-7 ATL (v1.1, Internal Release)

    Recycling bits

    For FCompare I've borrowed:
    last updated 24 Nov 1999
    Article content copyright Antonio Tejada Lacaci, 1999
    everything else © CodeProject, 1999-2001.