OpenSource For You

Faster File Search with Python

This article presents a file search utility created by using the power of the versatile Python programming language. Read on to discover how it works and how it can be used in Windows systems.


Computer users often forget the location or path of a file, and although Windows provides a file search utility, it is slow: the Explorer in Windows 7 takes around two to three minutes to search for a file. In this article, I present a Python program that searches for a file on your computer’s hard disk in under a second.

Let us first understand the program’s logic, which Figure 1 explains. The first step is indexing or, in Python terms, constructing a dictionary in which the file name is the key and the path of the folder containing the file is the value. The dictionary is then dumped into a pickle file. From then on, every search looks up the file in the dictionary loaded from that pickle file instead of scanning the disk.
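The index-then-look-up idea can be sketched in a few lines. This is only a minimal illustration written in Python 3 syntax (so pickle stands in for cPickle); the names build_index, save_index and load_index are hypothetical and are not part of finder.py.

```python
import os
import pickle


def build_index(top):
    """Map each lower-cased file name under `top` to the folder that holds it."""
    index = {}
    for root, _dirs, files in os.walk(top):
        for name in files:
            # A later file with the same name overwrites the earlier entry
            index[name.lower()] = root
    return index


def save_index(index, path):
    # Python 3 pickles must be written in binary mode
    with open(path, "wb") as fh:
        pickle.dump(index, fh)


def load_index(path):
    with open(path, "rb") as fh:
        return pickle.load(fh)
```

Building the index is the slow step because it walks the whole tree; loading the pickle and doing a dictionary look-up is what makes later searches fast.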

Now that you have understood the logic of the program, let us look at the program in detail. I have broken it into different functions. Let’s see what each function does.

# program created by mohit
# official website L4wisdom.com
# email-id mohitraj.cs@gmail.com

The block of code below imports the essential modules:

import os
import re
import sys
from threading import Thread
from datetime import datetime
import subprocess
import cPickle

dict1 = {}

Next, let’s write a function to get the drives. This function returns all the drives on your Windows machine, including any external USB pen drive or hard disk that is plugged in.

def get_drives():
    # Ask WMI for the caption (drive letter) of every logical disk
    response = os.popen("wmic logicaldisk get caption")
    list1 = []
    for line in response.readlines():
        line = line.strip("\n")
        line = line.strip("\r")
        line = line.strip(" ")
        # Skip the header row and blank lines
        if (line == "Caption" or line == ""):
            continue
        # A bare "C:" means the current directory on C:, so add the root backslash
        list1.append(line + "\\")
    return list1
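The parsing step can be tried without running wmic at all. The sketch below, in Python 3 syntax, takes the command’s text output as a string; the function name parse_drive_captions and the sample text are assumptions made for illustration, and real output depends on the machine.

```python
def parse_drive_captions(wmic_output):
    """Return drive roots such as 'C:\\' from `wmic logicaldisk get caption` text."""
    drives = []
    for line in wmic_output.splitlines():
        line = line.strip()  # removes spaces, '\r' and '\n' in one call
        if not line or line == "Caption":
            continue  # skip blank lines and the header row
        # A bare "C:" is the current directory on that drive, so point at the root
        drives.append(line + "\\")
    return drives


# Illustrative text only; the drive letters on a real machine will differ
sample = "Caption  \r\nC:       \r\nD:       \r\n\r\n"
print(parse_drive_captions(sample))
```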

The next function, search1, walks a drive and fills the dictionary, with the file name as the key and the path of its folder as the value.

def search1(drive):
    for root, dir, files in os.walk(drive, topdown = True):
        for file in files:
            file = file.lower()
            if file in dict1:
                # Name already indexed: keep this copy under a "_1" key
                file = file+"_1"
                dict1[file] = root
            else:
                dict1[file] = root
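With the "_1" suffix, search1 can remember at most two folders for any one file name; a third copy overwrites the "_1" entry. One possible alternative, shown here only as a sketch in Python 3 syntax and not as the article’s method, is to keep a list of folders per name. The name index_tree is hypothetical, and the search code would have to be adapted to values that are lists.

```python
import os
from collections import defaultdict


def index_tree(top):
    """Map each lower-cased file name to a list of every folder containing it."""
    index = defaultdict(list)
    for root, _dirs, files in os.walk(top):
        for name in files:
            index[name.lower()].append(root)
    return index
```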

The create function starts one thread per drive, and each thread calls the search1 function.

def create():
    t1 = datetime.now()
    list2 = []  # empty list is created
    list1 = get_drives()
    print list1
    for each in list1:
        process1 = Thread(target=search1, args=(each,))
        process1.start()
        list2.append(process1)

    for t in list2:
        t.join()  # Wait for each thread to finish

Once the dictionary has been built, the rest of the create function dumps it to the hard disk as a pickle file.

    pickle_file = open("finder_data","w")
    cPickle.dump(dict1,pickle_file)
    pickle_file.close()
    t2 = datetime.now()
    total = t2-t1
    print "Time taken to create ", total
    print "Thanks for using L4wisdom.com"
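The one-thread-per-drive pattern used by create can be sketched on its own. In finder.py every thread writes straight into the shared dict1; the version below, in Python 3 syntax, instead lets each thread fill a private dictionary and merges it under a lock. That is a conservative variation and not something the original program does, and the name index_roots is hypothetical.

```python
import os
from threading import Lock, Thread


def index_roots(roots):
    """Walk every root in its own thread and merge the results into one dict."""
    index = {}
    lock = Lock()

    def walk(top):
        local = {}
        for root, _dirs, files in os.walk(top):
            for name in files:
                local[name.lower()] = root
        with lock:  # merge under a lock so updates from different threads never interleave
            index.update(local)

    threads = [Thread(target=walk, args=(top,)) for top in roots]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # block until this thread has finished walking
    return index
```

Because walking a disk is mostly waiting on I/O, threads can overlap the work on separate drives even though only one thread runs Python code at a time.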

The next time you search for a file, the program looks it up in the dumped dictionary, as follows:

if len(sys.argv) < 2 or len(sys.argv) > 2:
    print "Please use proper format"
    print "Use <finder -c > to create database file"
    print "Use <finder file-name> to search file"
    print "Thanks for using L4wisdom.com"

elif sys.argv[1] == '-c':
    create()

else:
    t1 = datetime.now()
    try:
        pickle_file = open("finder_data", "r")
        file_dict = cPickle.load(pickle_file)
        pickle_file.close()
    except IOError:
        # No database yet: build it now and search the fresh dictionary
        create()
        file_dict = dict1
    except Exception as e:
        print e
        sys.exit()
    file_to_be_searched = sys.argv[1].lower()
    list1 = []
    print "Path \t\t: File-name"

Here, the search method of the re module is used, so the file name given on the command line is treated as a regular expression.

    for key in file_dict:
        if re.search(file_to_be_searched, key):
            str1 = file_dict[key]+" : "+key
            list1.append(str1)

    list1.sort()
    for each in list1:
        print each
        print "-----------------------"
    t2 = datetime.now()
    total = t2-t1
    print "Total files are", len(list1)
    print "Time taken to search ", total
    print "Thanks for using L4wisdom.com"
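The search loop can be isolated into a small function, sketched here in Python 3 syntax. The name find_files and the sample dictionary are hypothetical. One deliberate difference from finder.py: the program lower-cases the pattern with sys.argv[1].lower(), which would also turn a class such as \D into \d, so this sketch uses re.IGNORECASE instead.

```python
import re


def find_files(pattern, file_dict):
    """Return sorted 'path : name' lines for every key that matches `pattern`."""
    regex = re.compile(pattern, re.IGNORECASE)  # compile once, reuse for every key
    hits = [folder + " : " + name
            for name, folder in file_dict.items()
            if regex.search(name)]
    return sorted(hits)


# Hypothetical index contents, for illustration only
sample_index = {"waada raha.mp3": "D:\\songs", "notes.txt": "C:\\docs"}
print(find_files("WAADA", sample_index))
```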

The rest of the code is very easy to understand. Let us save the complete code as finder.py (you can also download it from http://opensourceforu.com/article_source_code/sept16/finder.zip) and make it a Windows executable (exe) file using the PyInstaller module. You can also download it from http://l4wisdom.com/finder_go.php. Run the command shown in Figure 2. After it has run successfully, you will find finder.exe in the folder C:\PyInstaller2.1\finder\dist.

You can put the finder.exe file in the Windows folder; if you place it in a different folder, you will have to add that folder to the PATH environment variable. Let us run the program. You can see from Figure 3 that just 33 seconds are required to create the database. Now search for a file and see the power of the program.

I am going to search for songs whose names contain the string waada. Look at Figure 4. You can see that the two searches have taken approximately half a second. The program is case insensitive, so using upper case or lower case doesn’t matter. The program also has the power of regular expressions. Let’s assume that you want to find files whose names contain either wada or waada (see Figure 5). In the regular expression wa+da, ‘a+’ means the letter ‘a’ can appear once or many times. Again, you get the result in less than one second. Let us consider one more example of a regular expression search. Let’s assume that you want to find the files whose names contain wa+da along with digits (see the first search in Figure 6), or the files whose names start with wa+da (see the second search in Figure 6).
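These patterns can be tried directly against a handful of names. The file names below are made up for illustration, and the exact patterns typed in Figure 6 are not reproduced in the text, so the anchored and digit patterns are assumptions about what such searches could look like.

```python
import re

# Hypothetical file names, already lower-cased as they would be in the index
names = ["waada raha.mp3", "wada karo.mp3", "01 waada.mp3", "waada2.mp3", "notes.txt"]


def matching(pattern):
    """Return the names in which `pattern` is found anywhere."""
    return [n for n in names if re.search(pattern, n)]


print(matching("wa+da"))     # 'a' one or more times: finds both wada and waada
print(matching("^wa+da"))    # anchored: the name must start with wada or waada
print(matching(r"wa+da\d"))  # wada or waada immediately followed by a digit
```

On the Windows command prompt the caret is an escape character, so a pattern that starts with ^ will probably need to be wrapped in double quotes when passed to finder.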

The program is indeed very useful. Suppose, for instance, you have forgotten the path of a file but have a vague idea of its name. You can find the file within one second by using regular expressions. The best part is the speed of the search: you can search repeatedly with different variations of the file name, whereas with Windows Explorer each search will take around two to four minutes.

Figure 1: Program logic
Figure 2: Creating exe file of the Python program
Figure 3: Creating a database of all files
Figure 4: File searching
Figure 5: File searching using regular expressions
Figure 6: Searching using the power of regular expressions
