Signature analysis and hashes

This lab covers searching for files using hashes, and file carving. File hashes provide a fast way to search for and identify known good and known bad files. A database of hashes for the files of interest can be used to identify those files quickly on a system, even when their names have been changed in an attempt to disguise their true type.
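As a minimal sketch of why a hash lookup survives renaming: an MD5 digest depends only on a file's content, so it does not change when the name or extension changes. The file names and the temporary directory below are made up for illustration and are not part of the lab image.

```shell
# Hypothetical demo file in a throwaway directory (not part of the lab image).
workdir=$(mktemp -d)
printf 'example evidence content\n' > "$workdir/report.doc"

# Hash the file under its original name.
hash_before=$(md5sum "$workdir/report.doc" | cut -d' ' -f1)

# Rename it to disguise its type, then hash it again.
mv "$workdir/report.doc" "$workdir/holiday.jpg"
hash_after=$(md5sum "$workdir/holiday.jpg" | cut -d' ' -f1)

# The two digests are identical because only the name changed.
echo "before: $hash_before"
echo "after:  $hash_after"
```

Because the digest is the same before and after the rename, a database keyed on the hash still finds the file.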

File carving is a file extraction method for recovering files from a partition or disk image that may be corrupt. It can also be used to recover deleted files.

KnownGoodFiles is a hash database of files which should be excluded from further analysis. These could be things like system files. The files have been hashed using a file source which has been validated (e.g. the files were downloaded from the manufacturer).

KnownBadFiles is a hash database which a forensics colleague of yours has created. It contains hashes of files you are specifically interested in finding and examining. These could be questionable JPEG images which were discovered on another computer, or, for instance, rootkit executables which you think may have been used as part of a crime.

Question 4: Sorter and filtering

It is possible to filter out known good files and flag known bad files while using sorter. This helps cut down on the number of files in the sorter output. Run sorter on the search.dd image, using the appropriate sorter options to filter with each of the indexes built from KnownGoodFiles and KnownBadFiles.

Hint: Make sure that you create a new directory for the output of sorter. Make this directory /home/caine/sorter2.

Hint: just as with hfind, you need to use the .hdb name of the database file and not the .idx name. The sorter man page calls the good file database hash_exclude and the bad file database hash_alert.

Question 5: Linking the Techniques

In this question you will put all you have learned so far together to search the image for known bad files that have been obfuscated, using the output from sorter saved in the /home/caine/sorter2 directory.

The search.dd image has a number of files on it that can be found in the KnownBadFiles.hdb file. However, some of these files may have been tampered with and obfuscated by a user, perhaps by changing file extensions or using compression or zip archives.

The following table should be filled with the details of bad files found on the image.

The file information for each category identified in the sorter.sum file is saved into its own separate file. For example, "Hash Database Alerts" are saved in "alert.txt" and "Extension Mismatches" are saved in "mismatch.txt".

To complete this exercise you have to open the alert.txt file and mismatch.txt file and analyse the information shown. There are 11 files in total to consider.

Example: the first file in alert.txt is Anjie.docx. From alert.txt use the inode number and hash signature of Anjie.docx to analyse the file. The alert.txt file shows Anjie.docx has a hash "37b42ccf126a804620d706ebd6b19ae8". Search using the hash in KnownBadFiles:

$ hfind KnownBadFiles.hdb  37b42ccf126a804620d706ebd6b19ae8

37b42ccf126a804620d706ebd6b19ae8        File9.gz

Now use icat on the file's inode number from alert.txt and analyse the head and tail information. You can do this manually with xxd:

$ icat /images/siglab/search.dd 312 | xxd | head -2
0000000: 1f8b 0808 7963 c94e 0003 4669 6c65 392e  ....yc.N..File9.
0000010: 6a70 6700 ec5b 0b3c 545b db5f 7b66 8c71  jpg..[.<T[._{f.q

$ icat /images/siglab/search.dd 312 | xxd | tail -2
0001840: 499e 2de4 1b83 f2c6 1df4 d7ce 2f8f dfad  I.-........./...
0001850: 0e6f 71d7 f00f d8bb fcb9 0070 0000       .oq........p..

If you search online for the leading bytes 1f8b 0808 you will find that 1f8b is the signature of a gz file (a file compressed using gzip). Alternatively, you can use "file" to reach the same conclusion:

$ icat /images/siglab/search.dd 312 | file -
/dev/stdin: gzip compressed data, was "File9.jpg", from Unix, ...
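The manual signature check can also be scripted. The following is only a sketch under stated assumptions: the demo file notes.docx is made up, and od is used in place of xxd simply so that the example runs where xxd is not installed.

```shell
# Hypothetical demo: build a gzip file and give it a misleading extension.
workdir=$(mktemp -d)
printf 'some text\n' | gzip -c > "$workdir/notes.docx"

# Read the first two bytes as hex; od prints them separated by spaces,
# so strip the spaces and the trailing newline.
magic=$(od -An -tx1 -N2 "$workdir/notes.docx" | tr -d ' \n')

# 1f8b is the gzip signature, whatever the extension claims.
if [ "$magic" = "1f8b" ]; then
    echo "notes.docx is gzip data"
else
    echo "notes.docx is not gzip data (signature: $magic)"
fi
```

In the lab itself the same bytes would come from the icat output rather than from a file on disk.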

So from the information gathered so far, Anjie.docx is not a "docx" file at all, and is instead a gzipped file which used to be called File9.jpg. But what sort of file was it before it was compressed? To look at a compressed file you need to use a command like zcat, which is the same as cat but uncompresses files before printing them.

$ icat /images/siglab/search.dd 312 | zcat | xxd | head -2
0000000: d0cf 11e0 a1b1 1ae1 0000 0000 0000 0000  ................
0000010: 0000 0000 0000 0000 3e00 0300 feff 0900  ........>.......

Again, you can manually recognise that d0cf 11e0 a1b1 1ae1 is the signature of a legacy Microsoft Office compound document (such as a Word .doc file), or you could use file:

$ icat /images/siglab/search.dd 312 | zcat | file -
/dev/stdin: CDF V2 Document, ... , Name of Creating Application: Microsoft Office Word, ...

As this file was compressed, it is worth checking the uncompressed version against known bad hashes...

$ icat /images/siglab/search.dd 312 | zcat | md5sum
4b7e00728187f79aefc74a48a15c7681  -
$ hfind KnownBadFiles.hdb  4b7e00728187f79aefc74a48a15c7681
4b7e00728187f79aefc74a48a15c7681  File9.jpg

So Anjie.docx is a gzip-compressed version of File9.jpg, and both are bad files. It appears that someone started with a Word document, renamed it to File9.jpg, compressed it using gzip, and then renamed the result to Anjie.docx. As both the compressed and uncompressed versions exist in the bad hash database, both files need to be listed in the table below.
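Why both layers need to be hashed can be shown with a small sketch. The file names here are hypothetical stand-ins for an extracted file; gzip -dc is used as an equivalent of zcat.

```shell
# Hypothetical demo data standing in for an extracted file.
workdir=$(mktemp -d)
printf 'original document content\n' > "$workdir/original"
gzip -c "$workdir/original" > "$workdir/disguised.docx"

# Hash of the file as stored (the compressed bytes).
outer_hash=$(md5sum < "$workdir/disguised.docx" | cut -d' ' -f1)

# Hash of the content after decompression.
inner_hash=$(gzip -dc "$workdir/disguised.docx" | md5sum | cut -d' ' -f1)

# Hash of the original file, for comparison.
original_hash=$(md5sum < "$workdir/original" | cut -d' ' -f1)

echo "outer:    $outer_hash"
echo "inner:    $inner_hash"
echo "original: $original_hash"
```

The outer hash differs from the inner one, while the inner hash equals that of the original file. A database lookup on the compressed file alone would therefore miss a match on the content inside it.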

Repeat this example for all the files identified as bad hash files.

Name in search.dd	md5 hash	Name in HashDatabase	Observation
Anjie.docx	37b42ccf126a804620d706ebd6b19ae8	File9.gz	gzipped and wrong extension
File9.jpg	4b7e00728187f79aefc74a48a15c7681	File9.jpg	Office document and wrong extension

List the names of the files found and extracted from carve.dd by scalpel by comparing the hashes of the files in the scalpel output directory with the KnownBadFiles.hdb.

Hint: use the md5deep tool to recursively analyse (using the correct flag) all the files in the scalpel output directory. Ensure you use the -b flag. Save this output in /home/caine/out1.

Hint 2: you will need to clean up the md5deep output file so that hfind will work with it. Use the following regular expression, and save the output into something like /home/caine/out2:

sed 's/\s*[0-9a-z]*\.\(doc\|jpg\|txt\|gif\)//' FILE_CREATED_BY_MD5DEEP.hdb > /home/caine/out2
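To see what this clean-up does, here is a sketch run on made-up input. The two sample lines imitate md5deep -b style output (hash, two spaces, bare file name); the carved file names are assumptions, and the sed expression is the one from the hint with the dot before the extension escaped so that it matches only a literal dot. It relies on GNU sed for \s and \|.

```shell
# Hypothetical md5deep -b style output: hash, two spaces, bare file name.
workdir=$(mktemp -d)
cat > "$workdir/out1" <<'EOF'
37b42ccf126a804620d706ebd6b19ae8  00000000.doc
4b7e00728187f79aefc74a48a15c7681  00000001.jpg
EOF

# Strip the whitespace, file name and extension, leaving only the hash.
sed 's/\s*[0-9a-z]*\.\(doc\|jpg\|txt\|gif\)//' "$workdir/out1" > "$workdir/out2"

# Sanity check: count lines that are not exactly 32 lowercase hex characters.
bad_lines=$(awk '!/^[0-9a-f]+$/ || length($0) != 32 { n++ } END { print n + 0 }' "$workdir/out2")
echo "lines that are not a bare MD5 hash: $bad_lines"
cat "$workdir/out2"
```

A non-zero count would mean some file names survived, for example because of an extension the expression does not list.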

Hint 3: The hfind command can take a file which contains a list of md5 hashes and look up each line of that file (e.g. out2) in its hash database. You need a new flag to do this. Make sure the output file (e.g. out2) is just a list of hashes, without any filenames or other text (the sed command should have sorted this for you already but it pays to be sure).
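The idea of looking up a whole list of hashes can be illustrated without hfind. In this sketch grep is only a stand-in for the lookup, the database is a made-up two-line file in md5sum format, and the all-zero hash is invented so that one lookup fails.

```shell
# Hypothetical stand-in for a hash database in md5sum format (hash, name).
workdir=$(mktemp -d)
cat > "$workdir/known_bad.txt" <<'EOF'
37b42ccf126a804620d706ebd6b19ae8  File9.gz
4b7e00728187f79aefc74a48a15c7681  File9.jpg
EOF

# A cleaned-up hash list, one MD5 per line; the second hash is made up
# and does not appear in the stand-in database.
cat > "$workdir/out2" <<'EOF'
4b7e00728187f79aefc74a48a15c7681
00000000000000000000000000000000
EOF

# Print every database entry whose line contains one of the listed hashes.
matches=$(grep -F -f "$workdir/out2" "$workdir/known_bad.txt")
echo "$matches"
```

Any stray text left in the hash list would be treated as another pattern to look up, which is why the list should contain nothing but hashes.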

Filenames carved from image which match a hash in KnownBadFiles

Linuxzoo created by Gordon Russell.
@ Copyright 2004-2025 Edinburgh Napier University
