C9. Interacting with the Operating System and the user#

Sometimes a Python program needs information about the computer it runs on, or needs to perform some basic operations on it, such as handling files and directories. At other times, a program needs information that the user has to provide interactively. This chapter explains how to code those tasks.

A note of caution: you should write code in order to learn. But if you are a newbie, to be on the safe side, you should not write any Python code that erases a directory (folder) or a file

How do we interact with the Operating System?#

Imagine that our Python program needs to interact with the hardware of our computer. For instance, it might need to create, move or delete directories, create or delete files, run external programs and retrieve data from them, etc. All this functionality is controlled by the Operating System (OS), and there is a variety of them: Linux, macOS, MS-Windows, Android, FreeBSD, Solaris, and so on. As users, we interact with our OS, and ultimately with our computer’s hardware, through the applications installed in our system.

Now we understand what we are talking about! We want our Python program to interact with and control the hardware of our computer, but the challenge is that Python should be able to do it independently of the OS we are using (or with minimal dependence on it). This is important. Imagine that we use MS-Windows on our personal laptop. Is it enough to stick with it? Clearly not: the professional world is quite different, and the machine where the biological data need to be analyzed is likely to be a high-end Unix machine. We have to be ready for it

Some needs we can have are:

  • Access to some information of our OS/system

  • Create, rename, copy, erase directories

  • Move to a particular directory

  • Create, rename, copy, erase files within a directory

  • List all the files within a dir, or even all the subdirs within a dir in a recursive way

  • Run another computer program and retrieve the results

  • …and so on

Is Python able to do this?#

The good news is that the answer is yes. We can interact with our OS from our Python programs. Moreover, many of the Python statements we use are independent of the OS. We still need to take into account some basic information about our OS, but that information can also be retrieved from our Python programs

For instance:

  • The path of a file looks quite different in a Unix-like system than in MS-Windows. Note how MS-Windows uses a backslash while Linux/macOS use a forward slash:

    • In a Unix-like system (Linux, macOS): /home/user/sequences/my_seq.fa

    • In MS-Windows: C:\user\sequences\my_seq.fa

  • File permissions work differently in different OSs:

    • In Unix-like systems each individual file or directory can have its own permissions. The permissions are r, w, x (read, write, execute) for the user, the members of the user’s group, and others.

      For example: “-rw-rw-r--”. Here, the user can read and write; members of the user’s group can read and write, but others can only read.

    • In MS-Windows the permissions are inherited from the parent directory and are configured in a different way:

      Full Control, Modify, Read & Execute, List Folder Contents, Read, and Write.
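
The path difference described above can be sketched in a few lines. This is only an illustration: it uses the standard-library modules posixpath and ntpath (the Unix and Windows flavours behind os.path) so that both styles can be shown on any machine, and the variable names are made up for the example. In a real program we would normally just call os.path.join() and let Python pick the right flavour.

```python
import ntpath      # Windows flavour of os.path
import os
import posixpath   # Unix flavour of os.path

parts = ["user", "sequences", "my_seq.fa"]

# Build the same path in both styles, without typing the separators by hand
unix_path = posixpath.join("/home", *parts)
windows_path = ntpath.join("C:\\", *parts)
print(unix_path)      # /home/user/sequences/my_seq.fa
print(windows_path)   # C:\user\sequences\my_seq.fa

# os.path.join() uses the separator of the OS we are running on
portable_path = os.path.join("sequences", "my_seq.fa")
print(portable_path, "uses the separator", repr(os.sep))
```

Building paths this way, instead of concatenating strings with "/" or "\", is one simple way to keep a script independent of the OS.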

Now we can introduce two very well-known Python modules: os and shutil

  • os: a portable way of using operating system dependent functionality.

    import os
    
  • shutil: high-level operations on files and collections of files.

    import shutil 
    

The os module#

Information about our system/OS

import os # First of all, it is necessary to import the module

os.uname()#

os_info = list(os.uname()) # This provides info about our computer/OS
os_info
['Linux',
 'arcturus',
 '5.19.0-46-generic',
 '#47~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Wed Jun 21 15:35:31 UTC 2',
 'x86_64']

The output of os.uname() has five fields (here converted to a list):

  • Sysname: operating system name

  • Nodename: name of machine on network (implementation-defined)

  • Release: operating system release

  • Version: operating system version

  • Machine: hardware identifier

For instance, your Python program may need to know which OS/machine it is running on and, depending on that, perform different tasks. Then you can code the following:

if os_info[0] == 'Linux':
    pass # do something in Linux
elif os_info[0] == 'Darwin':
    pass # do something in Darwin (macOS)
elif os_info[0] == 'Windows':
    pass # do something in MS-Windows
else:
    pass # do something in other OS

Note: os.uname() is not available on MS-Windows. If you are a Windows user, use the platform module instead, which works on all systems:

>>> import platform
>>> platform.uname()
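
As a minimal sketch, the OS check can also be written with platform.system(), which is available on every OS. The function name describe_os() is just an illustrative choice, not something defined elsewhere in this chapter.

```python
import platform

def describe_os():
    """Return a short description of the OS family we are running on."""
    system = platform.system()  # for example 'Linux', 'Darwin' or 'Windows'
    if system == "Linux":
        return "Linux"
    elif system == "Darwin":
        return "macOS"
    elif system == "Windows":
        return "MS-Windows"
    else:
        return "another OS"

print("This program is running on", describe_os())
```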

os.getlogin()#

Return the name of the user logged in on the controlling terminal

os.getlogin() # our login in the machine
'emuro'

os.stat()#

Perform a stat system call on the given path.
What is this? It provides information about the state of a file. Let’s see an example:

os.stat('./c9_interaction_with_OS__running_programs__user_input.py') # for more advanced users
os.stat_result(st_mode=33204, st_ino=6430637, st_dev=2051, st_nlink=1, st_uid=1000, st_gid=1000, st_size=65, st_atime=1689260191, st_mtime=1683622397, st_ctime=1683622397)
# The next command is run directly by my OS.
# Note that the "!" prefix only works in Jupyter, not in the regular Python console.
# "stat" is a Linux command (my current OS)
!stat c9_interaction_with_OS__running_programs__user_input.py
  File: c9_interaction_with_OS__running_programs__user_input.py
  Size: 65        	Blocks: 8          IO Block: 4096   regular file
Device: 803h/2051d	Inode: 6430637     Links: 1
Access: (0664/-rw-rw-r--)  Uid: ( 1000/   emuro)   Gid: ( 1000/   emuro)
Access: 2023-07-13 16:56:31.182536500 +0200
Modify: 2023-05-09 10:53:17.500328361 +0200
Change: 2023-05-09 10:53:17.500328361 +0200
 Birth: 2023-05-09 10:53:17.500328361 +0200

Now you can understand why Python is so popular among hackers and system administrators. It allows them to retrieve this kind of information “easily” and, combined with other tools, even from remote systems
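
The numbers returned by os.stat() are not very readable as they are. The following sketch, which assumes a small temporary file with a made-up name, shows one way to turn some of them into something friendlier using the standard stat and datetime modules.

```python
import datetime
import os
import stat
import tempfile

# Create a small temporary file, so the example does not depend on existing files
with tempfile.TemporaryDirectory() as tmp_dir:
    my_file = os.path.join(tmp_dir, "my_seq.fa")   # hypothetical file name
    with open(my_file, "w") as handle:
        handle.write(">seq1\nACGT\n")

    info = os.stat(my_file)
    size_in_bytes = info.st_size
    permissions = stat.filemode(info.st_mode)      # a string like '-rw-rw-r--'
    modified = datetime.datetime.fromtimestamp(info.st_mtime)

print("Size (bytes):", size_in_bytes)
print("Permissions:", permissions)
print("Last modified:", modified)

# The st_mode value shown in the os.stat() output above (33204) means:
print(stat.filemode(33204))   # -rw-rw-r--
```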

Managing directories#

os.getcwd()#

Return a unicode string representing the current working directory.
Note: Unicode is the universal character encoding.

os.getcwd() is an extremely useful function

original_cwd = os.getcwd() # Current working directory
print(original_cwd)
/home/emuro/Desktop/goingOn/teaching/p4b__jupyter_book_SoSe23/p4b__jb_built/p4b-web_book/c9_interaction_with_OS__running_programs__user_input

os.pardir#

Note: Do not use parentheses here. It is a constant (a string), not a function

# Parent directory
print(os.pardir) # prints ".." by default
..

os.chdir()#

Change the current working directory to the specified path

# Just in case: Change the working directory to the original dir, I was working on
if os.getcwd() != original_cwd: # call the function and compare the strings
    os.chdir(original_cwd) # <- change the directory
# Then,

# Current working directory
cwd1 = os.getcwd()
print("Current dir:\n\t", cwd1)

# Change the current working directory to the parent dir
os.chdir(os.pardir)  # <- or os.chdir('..')
cwd2 = os.getcwd()
print("Parent dir:\n\t", cwd2)

# Change the current working directory to the child-dir again
os.chdir(cwd1)
cwd3 = os.getcwd()
print("Back to the dir:\n\t", cwd3)
Current dir:
	 /home/emuro/Desktop/goingOn/teaching/p4b__jupyter_book_SoSe23/p4b__jb_built/p4b-web_book/c9_interaction_with_OS__running_programs__user_input
Parent dir:
	 /home/emuro/Desktop/goingOn/teaching/p4b__jupyter_book_SoSe23/p4b__jb_built/p4b-web_book
Back to the dir:
	 /home/emuro/Desktop/goingOn/teaching/p4b__jupyter_book_SoSe23/p4b__jb_built/p4b-web_book/c9_interaction_with_OS__running_programs__user_input
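
When a program changes the working directory it is easy to forget to go back. A possible pattern, sketched here with a throw-away temporary directory, is to wrap the work in try/finally so that the original directory is always restored, even if something fails in between.

```python
import os
import tempfile

start_dir = os.getcwd()
with tempfile.TemporaryDirectory() as tmp_dir:   # a throw-away directory
    try:
        os.chdir(tmp_dir)
        inside_dir = os.getcwd()
        print("Working in:", inside_dir)
    finally:
        os.chdir(start_dir)   # always executed, even after an error

print("Back in:", os.getcwd())
```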

os.path.exists()#

Test whether a path exists. Returns False for broken symbolic links

path_of_bin = "/usr/bin/" # directory every Linux user knows
if os.path.exists(path_of_bin):
    print("Great!", path_of_bin, "exists")
else:
    print(path_of_bin, "does not exist")
Great! /usr/bin/ exists

It also works with files

path_of_zip = "/usr/bin/zip" # zip is an executable compressor file
if os.path.exists(path_of_zip):
    print("Great!", path_of_zip, "exists")
else:
    print(path_of_zip, "does not exist")
Great! /usr/bin/zip exists
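
os.path.exists() does not tell us whether a path is a file or a directory. As a small sketch, os.path.isfile() and os.path.isdir() can be used for that. The file names below are invented for the example and live in a temporary directory.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    file_path = os.path.join(tmp_dir, "my_seq.fa")       # hypothetical file
    with open(file_path, "w") as handle:
        handle.write(">seq1\nACGT\n")
    missing_path = os.path.join(tmp_dir, "not_here.fa")  # never created

    dir_is_dir = os.path.isdir(tmp_dir)
    file_is_file = os.path.isfile(file_path)
    file_is_dir = os.path.isdir(file_path)
    missing_exists = os.path.exists(missing_path)

print("The dir is a dir:", dir_is_dir)
print("The file is a file:", file_is_file)
print("The file is a dir:", file_is_dir)
print("The missing file exists:", missing_exists)
```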

os.listdir()#

This function returns a list containing the names of the entries (files and directories) in the directory we provide, or in the current working directory by default. It is very useful

# Just in case I changed the current working dir
if os.getcwd() != original_cwd:
    os.chdir(original_cwd)
    
my_list_of_files = os.listdir() # <- In the current dir: files and dirs
print(my_list_of_files)
['.ipynb_checkpoints', 'dir4testing_files_and_dirs', 'exercises', 'c9_interaction_with_OS__running_programs__user_input.html', 'c9_interaction_with_OS__running_programs__user_input.py', 'c9_interaction_with_OS__running_programs__user_input.ipynb']

We can also list the files and dirs within another dir:

path = "/tmp"  # A Unix path; on MS-Windows use an existing directory instead
files_in = os.listdir(path) # <-
print("In", path, "there are", len(files_in), "files and/or dirs:")
for i, f in enumerate(files_in):
    print(f"{i:d}: {f:.50s}") # format: only shows the first 50 characters
In /tmp there are 26 files and/or dirs:
0: systemd-private-1458c1f8fadc4cf3babe497a2a67a3c6-M
1: .X1025-lock
2: .XIM-unix
3: systemd-private-1458c1f8fadc4cf3babe497a2a67a3c6-b
4: systemd-private-1458c1f8fadc4cf3babe497a2a67a3c6-f
5: .ICE-unix
6: .font-unix
7: systemd-private-1458c1f8fadc4cf3babe497a2a67a3c6-s
8: systemd-private-1458c1f8fadc4cf3babe497a2a67a3c6-c
9: .com.google.Chrome.2ig6s2
10: .X11-unix
11: systemd-private-1458c1f8fadc4cf3babe497a2a67a3c6-u
12: .X1024-lock
13: tracker-extract-3-files.127
14: gs_Wwtzwe
15: systemd-private-1458c1f8fadc4cf3babe497a2a67a3c6-s
16: systemd-private-1458c1f8fadc4cf3babe497a2a67a3c6-s
17: .Test-unix
18: .X0-lock
19: systemd-private-1458c1f8fadc4cf3babe497a2a67a3c6-p
20: systemd-private-1458c1f8fadc4cf3babe497a2a67a3c6-s
21: .X1-lock
22: tmpfbr49c7d.json
23: snap-private-tmp
24: tracker-extract-3-files.1000
25: systemd-private-1458c1f8fadc4cf3babe497a2a67a3c6-s

We can also change the working directory and list the files contained there

# just in case I changed anything...
# move back to the original working dir, I was working on
if os.getcwd() != original_cwd:
    os.chdir(original_cwd)
# Then,

# Current working directory
cwd1 = os.getcwd()
print("Current dir:\n\t", cwd1)

# Change the current working directory 
testing_dir = 'dir4testing_files_and_dirs'
os.chdir(testing_dir) # Here we change to another dir
new_cwd = os.getcwd()
print("New working dir:\n\t", new_cwd)

# List the files after changing the dir
my_list_of_files = os.listdir() # <- testing_dir: files and dirs
print(my_list_of_files)
Current dir:
	 /home/emuro/Desktop/goingOn/teaching/p4b__jupyter_book_SoSe23/p4b__jb_built/p4b-web_book/c9_interaction_with_OS__running_programs__user_input
New working dir:
	 /home/emuro/Desktop/goingOn/teaching/p4b__jupyter_book_SoSe23/p4b__jb_built/p4b-web_book/c9_interaction_with_OS__running_programs__user_input/dir4testing_files_and_dirs
['my_program.py', 'my_dir', 'John_Doe.txt']
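
Very often we are interested only in some of the entries returned by os.listdir(). The sketch below, which uses a temporary directory with invented file names, sorts the listing (os.listdir() gives no guaranteed order) and then keeps only the FASTA files or only the directories.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    # Create some empty example entries (hypothetical names)
    for name in ("b_seq.fa", "a_seq.fa", "notes.txt"):
        open(os.path.join(tmp_dir, name), "w").close()
    os.mkdir(os.path.join(tmp_dir, "my_dir"))

    entries = sorted(os.listdir(tmp_dir))
    fasta_files = [name for name in entries if name.endswith(".fa")]
    only_dirs = [name for name in entries
                 if os.path.isdir(os.path.join(tmp_dir, name))]

print("All entries:", entries)
print("FASTA files:", fasta_files)
print("Directories:", only_dirs)
```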

More on managing directories: create and erase#

os.mkdir()#

Create a new directory

os.mkdir("my_brand_new_dir") # We are in the testing dir
# List the files after creating the dir
my_list_of_files = os.listdir() # In the current dir: files and dirs
my_list_of_files
['my_program.py', 'my_brand_new_dir', 'my_dir', 'John_Doe.txt']

Note: you can check that ‘my_brand_new_dir’ has been created with your usual file manager

os.makedirs()#

Note: not to be confused with os.mkdir().
The main difference: nested dirs can be created with os.makedirs()

# Note that os.makedirs() is very different from os.mkdir()
os.makedirs(r"my_not_so_brand_new_dir/subdir1/sub_subdir1/sub_sub_subdir1")
# List the files after creating the nested dirs
my_list_of_files = os.listdir() # In the current dir: files and dirs
my_list_of_files
['my_program.py',
 'my_brand_new_dir',
 'my_not_so_brand_new_dir',
 'my_dir',
 'John_Doe.txt']
path = os.path.join(os.getcwd(), "my_not_so_brand_new_dir") # the dir path
# os.walk() is already recursive; it yields (dirpath, dirnames, filenames)
for dirpath, dirnames, filenames in os.walk(path):
    print(dirpath)
print("This was the dir structure of", "my_not_so_brand_new_dir")
/home/emuro/Desktop/goingOn/teaching/p4b__jupyter_book_SoSe23/p4b__jb_built/p4b-web_book/c9_interaction_with_OS__running_programs__user_input/dir4testing_files_and_dirs/my_not_so_brand_new_dir
/home/emuro/Desktop/goingOn/teaching/p4b__jupyter_book_SoSe23/p4b__jb_built/p4b-web_book/c9_interaction_with_OS__running_programs__user_input/dir4testing_files_and_dirs/my_not_so_brand_new_dir/subdir1
/home/emuro/Desktop/goingOn/teaching/p4b__jupyter_book_SoSe23/p4b__jb_built/p4b-web_book/c9_interaction_with_OS__running_programs__user_input/dir4testing_files_and_dirs/my_not_so_brand_new_dir/subdir1/sub_subdir1
/home/emuro/Desktop/goingOn/teaching/p4b__jupyter_book_SoSe23/p4b__jb_built/p4b-web_book/c9_interaction_with_OS__running_programs__user_input/dir4testing_files_and_dirs/my_not_so_brand_new_dir/subdir1/sub_subdir1/sub_sub_subdir1
This was the dir structure of my_not_so_brand_new_dir
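
Two details of these functions are worth a short sketch: os.makedirs() accepts exist_ok=True so that it does not fail when the directory is already there, and os.walk() yields one (dirpath, dirnames, filenames) tuple per directory. The directory names below are invented and created inside a temporary directory.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    nested = os.path.join(tmp_dir, "project", "data", "raw")  # hypothetical names
    os.makedirs(nested)
    os.makedirs(nested, exist_ok=True)   # second call: no error thanks to exist_ok

    try:
        os.mkdir(nested)                 # os.mkdir() complains if the dir exists
        mkdir_failed = False
    except FileExistsError:
        mkdir_failed = True

    # Collect every directory found by os.walk(), relative to the starting point
    relative_dirs = []
    for dirpath, dirnames, filenames in os.walk(tmp_dir):
        relative_dirs.append(os.path.relpath(dirpath, tmp_dir))

print("os.mkdir() failed on an existing dir:", mkdir_failed)
print(relative_dirs)
```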

os.rmdir()#

Delete an empty directory

A note of caution: if you are a newbie, you should not run the next Python code, because a mistake with the path can delete the wrong directory. os.rmdir() only removes empty directories, but it is still better to be careful.

os.rmdir("my_brand_new_dir")
# List the files after removing the dir
my_list_of_files = os.listdir() # In the current dir: files and dirs
print(my_list_of_files)
['my_program.py', 'my_not_so_brand_new_dir', 'my_dir', 'John_Doe.txt']

shutil.rmtree()#

Delete a directory and all its nested subdirs; in other words, recursively delete a directory tree.
Be very, very, very careful!
This is extremely dangerous. We can easily lose all our data or erase our home directory.
A note of caution: if you are a newbie, you should not run the next Python code, because you can end up with your whole home dir wiped

import shutil # We need to import this new module

shutil.rmtree("my_not_so_brand_new_dir") # Recursively delete "my_not_so_brand_new_dir" dir tree
my_list_of_files = os.listdir() # In the current dir: files and dirs
my_list_of_files
['my_program.py', 'my_dir', 'John_Doe.txt']
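
Because shutil.rmtree() is so dangerous, one possible precaution is to wrap it in a function that refuses to delete anything outside a directory we explicitly trust. The following is only a sketch of that idea, not a standard recipe: the function name remove_tree_inside() and the directory names are hypothetical, and everything happens inside a temporary directory.

```python
import os
import shutil
import tempfile

def remove_tree_inside(sandbox, target):
    """Delete target recursively, but only if it lies strictly inside sandbox."""
    sandbox = os.path.realpath(sandbox)
    target = os.path.realpath(target)
    inside = os.path.commonpath([sandbox, target]) == sandbox
    if not inside or target == sandbox:
        raise ValueError(f"Refusing to delete {target}: it is not inside {sandbox}")
    shutil.rmtree(target)

with tempfile.TemporaryDirectory() as sandbox_dir:
    doomed = os.path.join(sandbox_dir, "old_results")
    os.makedirs(os.path.join(doomed, "subdir1", "sub_subdir1"))
    remove_tree_inside(sandbox_dir, doomed)      # allowed: inside the sandbox
    still_there = os.path.exists(doomed)

    try:
        # Not allowed: the parent of the sandbox is outside the sandbox
        remove_tree_inside(sandbox_dir, os.path.dirname(sandbox_dir))
        refused = False
    except ValueError:
        refused = True

print("Dir still there:", still_there)
print("Deletion outside the sandbox refused:", refused)
```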

Managing files/directories#

os.rename()#

Rename a file or directory

# Show
print("Before:\n", os.listdir())

# Rename and show
os.rename('John_Doe.txt', 'Sara_Doe.txt') # change the name of a file
print("Renamed:\n", os.listdir())

# Rename-back again and show
os.rename('Sara_Doe.txt', 'John_Doe.txt')
print("As before:\n", os.listdir())
Before:
 ['my_program.py', 'my_dir', 'John_Doe.txt']
Renamed:
 ['my_program.py', 'Sara_Doe.txt', 'my_dir']
As before:
 ['my_program.py', 'my_dir', 'John_Doe.txt']

Note: If the file does not exist, os.rename() raises an error (FileNotFoundError)!
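
A short sketch of how that error could be handled with try/except instead of letting the program crash. The file names are made up, and the source file is deliberately never created.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    old_name = os.path.join(tmp_dir, "Jane_Doe.txt")   # this file does not exist
    new_name = os.path.join(tmp_dir, "Sara_Doe.txt")
    try:
        os.rename(old_name, new_name)
        message = "Renamed"
    except FileNotFoundError:
        message = "Nothing renamed: the source file does not exist"

print(message)
```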

A file in any path can be renamed

# Show
my_path = os.getcwd() # I can use any path. But, be careful
print("Before:\n", os.listdir())

# Rename and show
os.rename(my_path + '/John_Doe.txt',  my_path + '/Duke_Doe.txt')
print("Renamed:\n", os.listdir())

# Rename-back again and show
os.rename(my_path + '/Duke_Doe.txt', my_path + '/John_Doe.txt')
print("As before:\n", os.listdir())
Before:
 ['my_program.py', 'my_dir', 'John_Doe.txt']
Renamed:
 ['Duke_Doe.txt', 'my_program.py', 'my_dir']
As before:
 ['my_program.py', 'my_dir', 'John_Doe.txt']

Rename dirs
Directories can also be renamed using the same function

A note of caution: if you are a newbie, you should not run the next Python code, because it deletes a directory and you could end up with your whole home dir wiped

# Show, create-dir, rename-dir and erase-dir: 

# Show
print("At the beginning:\n\t", os.listdir()) 

# Create a dir and show
os.mkdir("my_brand_new_dir") # here we create the dir
print("With new dir:\n\t", os.listdir()) 

# Rename the dir and show
my_path = os.getcwd() + "/"
os.rename(my_path + "my_brand_new_dir", 
          my_path + "my_brand_new_dir_with_new_name")
print("dir renamed:\n\t", os.listdir())

# Erase the just-created-dir and show
os.rmdir(my_path + "my_brand_new_dir_with_new_name") # Erase it!
print("After erasing the renamed-dir:\n\t", os.listdir()) 
At the beginning:
	 ['my_program.py', 'my_dir', 'John_Doe.txt']
With new dir:
	 ['my_program.py', 'my_brand_new_dir', 'my_dir', 'John_Doe.txt']
dir renamed:
	 ['my_program.py', 'my_brand_new_dir_with_new_name', 'my_dir', 'John_Doe.txt']
After erasing the renamed-dir:
	 ['my_program.py', 'my_dir', 'John_Doe.txt']

The shutil module#

The shutil module offers a number of high-level operations on files and collections of files

import shutil # as always, import the module first

Files#

shutil.copy()#

Copy the source file to a destination file or directory.
It returns the path of the newly created file.

# Before: list the files of my current dir
cwd = os.getcwd()
my_list_of_files = os.listdir(cwd) # In the current dir: files and dirs
print("Before the copy:", my_list_of_files)

# cp file
my_file = os.path.join(cwd, "John_Doe.txt")
cp_to_file = os.path.join(cwd, "copied_John_Doe.txt")
destination = shutil.copy(my_file, cp_to_file) # <- cp
print("Copied in ", destination)

# After: list again the files
print("After the copy:", os.listdir(cwd)) # It should have the new cp-file
Before the copy: ['my_program.py', 'my_dir', 'John_Doe.txt']
Copied in  /home/emuro/Desktop/goingOn/teaching/p4b__jupyter_book_SoSe23/p4b__jb_built/p4b-web_book/c9_interaction_with_OS__running_programs__user_input/dir4testing_files_and_dirs/copied_John_Doe.txt
After the copy: ['copied_John_Doe.txt', 'my_program.py', 'my_dir', 'John_Doe.txt']
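
The destination given to shutil.copy() can also be a directory. In that case the copy keeps the base name of the source file. A minimal sketch, using a temporary directory and an invented "backup" dir:

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    source = os.path.join(tmp_dir, "John_Doe.txt")
    with open(source, "w") as handle:
        handle.write("Hello, John\n")
    backup_dir = os.path.join(tmp_dir, "backup")   # hypothetical dir name
    os.mkdir(backup_dir)

    # The destination is a directory: the file keeps its base name
    destination = shutil.copy(source, backup_dir)
    copied_name = os.path.basename(destination)
    with open(destination) as handle:
        copied_text = handle.read()

print("Copied to a file named:", copied_name)
```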

os.remove()#

Remove a file

A note of caution: if you are a newbie, you should not run the next Python code, because it deletes a file

# Erase the copied file
os.remove(cp_to_file)
# After removing the copied file: list the files again
print("After removing the copied file:", os.listdir(cwd)) # The copied file should be gone
After removing the copied file: ['my_program.py', 'my_dir', 'John_Doe.txt']

Homework: check that you cannot remove a directory with os.remove(). It will give you an error message

Directories#

shutil.copytree()#

Recursively copy a directory tree and return the destination directory

A note of caution: if you are a newbie, you should not run the next Python code, because it deletes a directory

print("At the beginning:", os.listdir()) # list

# Create a bunch of nested dirs
os.makedirs(r"my_dir_with_nested_subdirs/subdir1/sub_subdir1/sub_sub_subdir1/") # Create
print("Create-dir:", os.listdir()) # list

# Copy the dir (that contains nested subdirs)
shutil.copytree("my_dir_with_nested_subdirs", "thisIsAcopyOf__my_dir_with_nested_subdirs") # <- Copy
print("Also copied-dir:", os.listdir()) # list

# Erase the copied-dirs (that contains nested subdirs)
shutil.rmtree("my_dir_with_nested_subdirs")                # Erase the 2 nested-main-dirs
shutil.rmtree("thisIsAcopyOf__my_dir_with_nested_subdirs") 
print("Erased dir and copied-dir:", os.listdir()) # list
At the beginning: ['my_program.py', 'my_dir', 'John_Doe.txt']
Create-dir: ['my_program.py', 'my_dir_with_nested_subdirs', 'my_dir', 'John_Doe.txt']
Also copied-dir: ['thisIsAcopyOf__my_dir_with_nested_subdirs', 'my_program.py', 'my_dir_with_nested_subdirs', 'my_dir', 'John_Doe.txt']
Erased dir and copied-dir: ['my_program.py', 'my_dir', 'John_Doe.txt']

Running an external program from our code#

The module subprocess#

subprocess.run()#

Run a command with arguments and return a “CompletedProcess” instance

import subprocess
output = subprocess.run(["ls", "-l"], capture_output=True) # ls is a Linux command, -l a parameter
print(output)
CompletedProcess(args=['ls', '-l'], returncode=0, stdout=b'total 12\n-rw-rw-r-- 1 emuro emuro   25 Mai  9 10:53 John_Doe.txt\ndrwxrwxr-x 3 emuro emuro 4096 Mai  9 10:53 my_dir\n-rw-rw-r-- 1 emuro emuro  232 Mai  9 10:53 my_program.py\n', stderr=b'')
print(output.stdout) # here you see the raw output, as bytes (note the b prefix)
b'total 12\n-rw-rw-r-- 1 emuro emuro   25 Mai  9 10:53 John_Doe.txt\ndrwxrwxr-x 3 emuro emuro 4096 Mai  9 10:53 my_dir\n-rw-rw-r-- 1 emuro emuro  232 Mai  9 10:53 my_program.py\n'
print(output.stdout.decode())
total 12
-rw-rw-r-- 1 emuro emuro   25 Mai  9 10:53 John_Doe.txt
drwxrwxr-x 3 emuro emuro 4096 Mai  9 10:53 my_dir
-rw-rw-r-- 1 emuro emuro  232 Mai  9 10:53 my_program.py

Then the output of the external program can be captured and parsed
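
Two optional arguments of subprocess.run() make this more comfortable: text=True returns the output already decoded as a string, and check=True raises an exception when the external program fails. In the sketch below the "external program" is simply the Python interpreter itself (sys.executable), chosen only so that the example runs on any OS.

```python
import subprocess
import sys

# sys.executable is the Python interpreter that is running this code
command = [sys.executable, "-c", "print('Hello from another program')"]
result = subprocess.run(command, capture_output=True, text=True, check=True)

print("Return code:", result.returncode)
print("Output:", result.stdout.strip())   # already a str, no decode() needed

# With check=True a failing command raises CalledProcessError
try:
    subprocess.run([sys.executable, "-c", "import sys; sys.exit(3)"], check=True)
    failed = False
except subprocess.CalledProcessError as error:
    failed = True
    failed_code = error.returncode

print("The second command failed:", failed, "with code", failed_code)
```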

A nice example: a UCSC MySQL request
The next program runs a query against a remote MySQL database and brings back the results.
Obviously, it needs a MySQL client installed on your computer

import subprocess 

command_with_parameters = ["mysql", "--no-defaults", "-h", "genome-mysql.soe.ucsc.edu", "-u", "genome", "-A", "-e", "select * from knownGene limit 10",  "hg38"] 
output = subprocess.run(command_with_parameters, capture_output=True, timeout=5) 
print(output.stdout.decode())
name	chrom	strand	txStart	txEnd	cdsStart	cdsEnd	exonCount	exonStarts	exonEnds	proteinID	alignID
ENST00000456328.2	chr1	+	11868	14409	11868	11868	3	11868,12612,13220,	12227,12721,14409,		uc286dmu.1
ENST00000619216.1	chr1	-	17368	17436	17368	17368	1	17368,	17436,		uc031tla.1
ENST00000473358.1	chr1	+	29553	31097	29553	29553	3	29553,30563,30975,	30039,30667,31097,		uc057aty.1
ENST00000469289.1	chr1	+	30266	31109	30266	30266	2	30266,30975,	30667,31109,		uc057atz.1
ENST00000607096.1	chr1	+	30365	30503	30365	30365	1	30365,	30503,		uc031tlb.1
ENST00000417324.1	chr1	-	34553	36081	34553	34553	3	34553,35276,35720,	35174,35481,36081,		uc001aak.4
ENST00000461467.1	chr1	-	35244	36073	35244	35244	2	35244,35720,	35481,36073,		uc057aua.1
ENST00000642116.1	chr1	+	57597	64116	57597	57597	3	57597,58699,62915,	57653,58856,64116,		uc286dmy.1
ENST00000641515.2	chr1	+	65418	71585	65564	70008	3	65418,65519,69036,	65433,65573,71585,	A0A2U3U0J3	uc001aal.2
ENST00000466430.5	chr1	-	89294	120932	89294	89294	4	89294,92090,112699,120774,	91629,92240,112804,120932,		uc057aub.1

How do we run the next query?#

mysql  --user=genome --host=genome-mysql.cse.ucsc.edu -A -D hg19 -e 'select chrom,size from chromInfo limit 23' 

This query retrieves the length of the human chromosomes from UCSC

import subprocess 

command_with_parameters = ["mysql", "--no-defaults", "-h", "genome-mysql.soe.ucsc.edu", "-u", "genome", "-A", "-e", 'select chrom,size from chromInfo limit 24',  "hg19"] 
output = subprocess.run(command_with_parameters, capture_output=True, timeout=5) 
print(output.stdout.decode())
chrom	size
chr1	249250621
chr2	243199373
chr3	198022430
chr4	191154276
chr5	180915260
chr6	171115067
chr7	159138663
chrX	155270560
chr8	146364022
chr9	141213431
chr10	135534747
chr11	135006516
chr12	133851895
chr13	115169878
chr14	107349540
chr15	102531392
chr16	90354753
chr17	81195210
chr18	78077248
chr20	63025520
chrY	59373566
chr19	59128983
chr22	51304566
chr21	48129895
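
Once the output has been captured, it is just text and can be parsed. As a sketch, here a few lines copied from the output above are stored in a string (so that no MySQL client is needed) and converted into a dictionary of chromosome sizes. The variable names are illustrative.

```python
# A few lines copied from the output above (tab-separated, first line is the header)
text = "chrom\tsize\nchr1\t249250621\nchr2\t243199373\nchrX\t155270560\n"

lines = text.strip().split("\n")
header = lines[0].split("\t")

chrom_sizes = {}
for line in lines[1:]:
    chrom, size = line.split("\t")
    chrom_sizes[chrom] = int(size)   # cast the size to an integer

largest = max(chrom_sizes, key=chrom_sizes.get)
print(header)
print("Largest:", largest, chrom_sizes[largest])
```

With the real query, text would simply be output.stdout.decode().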

Concatenate external programs (pipe). This is more advanced but powerful#

Access NCBI online from the command line (with ncbi-entrez-direct) and retrieve the FASTA sequence of a given protein.
We will provide the UniProtKB accession of Homo sapiens’ PTEN: “P60484”.
Note that the ncbi-entrez-direct scripts need to be installed on your computer.

echo "P60484" | epost -db protein | efetch -db protein -format fasta

If we pass this command to subprocess.run() as a list of arguments, as we did above, it does not work, because the pipe “|” that connects the commands is a feature of the shell, not an argument of a program.
This is typical while programming: you have to find a solution to carry out a task. Therefore, you need to establish a protocol that optimizes your learning curve. Here, I provide one:

  1. Try to solve the problem by yourself

  2. Use help() or ?

  3. Google the problem

  4. Step away, take a break and try to solve it again

  5. Write it again from scratch

  6. Ask a peer

  7. Ask for help online

Solution for our current task: there is another tool in the module that solves the pipe problem (subprocess.Popen(), used here with shell=True).

# NCBI: retrieve a fasta sequence
cmd = 'echo "P60484" | epost -db protein | efetch -db protein -format fasta' # the command includes all the different pipes
ps = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)  
fasta = ps.communicate()[0].decode()
print(fasta)
WARNING: Redundant -db 'protein' argument
>sp|P60484.1|PTEN_HUMAN RecName: Full=Phosphatidylinositol 3,4,5-trisphosphate 3-phosphatase and dual-specificity protein phosphatase PTEN; AltName: Full=Mutated in multiple advanced cancers 1; AltName: Full=Phosphatase and tensin homolog
MTAIIKEIVSRNKRRYQEDGFDLDLTYIYPNIIAMGFPAERLEGVYRNNIDDVVRFLDSKHKNHYKIYNL
CAERHYDTAKFNCRVAQYPFEDHNPPQLELIKPFCEDLDQWLSEDDNHVAAIHCKAGKGRTGVMICAYLL
HRGKFLKAQEALDFYGEVRTRDKKGVTIPSQRRYVYYYSYLLKNHLDYRPVALLFHKMMFETIPMFSGGT
CNPQFVVCQLKVKIYSSNSGPTRREDKFMYFEFPQPLPVCGDIKVEFFHKQNKMLKKDKMFHFWVNTFFI
PGPEETSEKVENGSLCDQEIDSICSIERADNDKEYLVLTLTKNDLDKANKDKANRYFSPNFKVKLYFTKT
VEEPSNPEASSSTSVTPDVSDNEPDHYRYSDTTDSDPENEPFDEDQHTQITKV

Now we are ready to use the fasta sequence in our code.
Note that we can also retrieve several proteins in a single query.
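
A pipe can also be built without shell=True, by connecting the output of one Popen to the input of the next. The sketch below shows the idea with two tiny stand-in "programs" written as Python one-liners (they are hypothetical placeholders, not the NCBI scripts), so that it runs without installing anything.

```python
import subprocess
import sys

# Two tiny stand-in "programs" (hypothetical, written as Python one-liners)
first_cmd = [sys.executable, "-c", "print('P60484')"]
second_cmd = [sys.executable, "-c",
              "import sys; print('Received: ' + sys.stdin.read().strip())"]

# The stdout of the first process becomes the stdin of the second one
first = subprocess.Popen(first_cmd, stdout=subprocess.PIPE)
second = subprocess.Popen(second_cmd, stdin=first.stdout, stdout=subprocess.PIPE)
first.stdout.close()   # so that "first" is notified if "second" exits early
piped_output = second.communicate()[0].decode().strip()
first.wait()

print(piped_output)
```

Avoiding shell=True matters mostly when part of the command comes from the user, because the shell would interpret any special characters in it.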

# Now a simple reminder of subprocess.run()
# Note that the next will work only in a Linux-like OS:
output = subprocess.run("/usr/bin/uname", capture_output=True, timeout=5)  # uname (Linux)
print(output.stdout.decode())
Linux

Obtaining interactive info from the user: input()#

Ask the user for information in an interactive way

input()#

gene_name = input("Provide a valid gene name:") # ie. PTEN
print(gene_name)
print(type(gene_name))
  1. input() always returns a string. If the input is a number, it will be necessary to cast it.

  2. It is convenient to validate the input in order to avoid errors.

# Calculate the factorial of an integer in the range: [0, 10]
import math

# input validation
try:
    num = int(input("Find the factorial of the next number (n!, 0<=n<=10):")) # ie. 4
    if 0 <= num <= 10: # the number is within the valid range
        factorial = math.factorial(num)
        print(factorial)
    else:
        print("Your input-number is out of range")
except ValueError: # int() could not convert the input
    print("Your input does not look like a number")
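
The validation can also be moved into a small function that receives the text typed by the user. This makes it easy to try it with different values without typing them each time. The function name parse_bounded_int() is only a suggestion for this sketch.

```python
def parse_bounded_int(text, low=0, high=10):
    """Return text as an int in [low, high], or None if it is not valid."""
    try:
        number = int(text)
    except ValueError:   # the text does not look like an integer
        return None
    if low <= number <= high:
        return number
    return None          # an integer, but out of range

for answer in ("4", "11", "four"):
    print(repr(answer), "->", parse_bounded_int(answer))
```

In an interactive program we would call it as parse_bounded_int(input("Provide a number:")).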

Providing command line arguments to your Python program#

This gives a lot of flexibility to your script

The sys module#

The sys module in Python provides various functions and variables that are used to manipulate different parts of the Python runtime environment

import sys # you need to import the sys module
sys.argv   # this returns a list of the input arguments to be used by our program

sys.argv is a list with the name of your script followed by the input arguments:

["my_program.py", "first_input_argument", "second_input_argument", "third_input_argument", ... ]

If our program needs exactly 3 input arguments, the command line should be:

python3 "my_program.py" "first_input_argument" "second_input_argument" "third_input_argument"

and the program should report an error when a different number of arguments is used
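
A possible way to check the number of arguments is sketched below. The function check_arguments() is a hypothetical helper, and it is called here with simulated lists instead of the real sys.argv, so that it can be tried from a notebook.

```python
def check_arguments(argv, expected=3):
    """Return the input arguments, or raise an error if their number is wrong."""
    arguments = argv[1:]   # argv[0] is the name of the script
    if len(arguments) != expected:
        raise ValueError(f"{argv[0]} needs exactly {expected} input arguments, "
                         f"but got {len(arguments)}")
    return arguments

# Simulated command lines (in a real script we would pass sys.argv)
simulated_argv = ["my_program.py", "first_input_argument",
                  "second_input_argument", "third_input_argument"]
print(check_arguments(simulated_argv))

try:
    check_arguments(["my_program.py", "only_one"])
except ValueError as error:
    error_message = str(error)
    print(error_message)
```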

Exit the code#

sys.exit()#

It is a clean way to exit the program. Your program exits printing the provided message, which is very useful for testing while developing your code

# The last statement of this course
import sys
sys.exit("Ich hoffe, Sie haben viel gelernt")
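
Behind the scenes, sys.exit() works by raising the SystemExit exception. The small sketch below catches that exception only to show what happens; in normal code we simply let it end the program. When the argument is a string, it is printed as an error message and the exit status of the program is 1; with no argument, or with 0, the program ends successfully.

```python
import sys

try:
    sys.exit("Something went wrong")   # raises SystemExit
except SystemExit as exit_request:
    exit_code = exit_request.code      # the message (or number) given to sys.exit()

print("sys.exit() was called with:", exit_code)
```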

Summary#

  • Interaction with the OS:

    • import os

    • import shutil

    • The module os
      - os.uname()
      - os.getlogin()
      - os.stat()

      • Directories

        • os.getcwd()

        • os.pardir

        • os.path.exists()

        • os.listdir()

      • More on directories

        • os.mkdir()

        • os.makedirs()

        • os.rmdir()

        • shutil.rmtree()

      • Files/Directories

        • os.rename()

    • The module shutil

      • Files

        • shutil.copy()

        • os.remove()

      • Directories

        • shutil.copytree()

  • Running an external program from our code

    • import subprocess

      • subprocess.call()

      • subprocess.run()

      • subprocess.Popen()

  • Request info to the user: input()
    - input()

  • Providing arguments to your python program (command line)
    - import sys
    - sys.argv

  • Exit the code
    - import sys
    - sys.exit("Message")


Exercises#