Python 101 Team
A module is a file containing Python definitions and statements. The file name is the module name with the suffix .py appended.
A module can contain statements as well as function definitions.
Python ships with a standard library of modules, which are present in any complete Python installation, such as re, sys or os. Consult the Python Module Index in the official documentation for a complete list.
To use the contents of a specific module, you must use the import statement, which takes one of the following forms:
import «module»
import «module» as «other_name»
from «module» import «something»
from «module» import «something» as «other_name»
from «module» import *
After importing a module, we can use its contents in different ways depending on how the module was imported:
import sys
print(sys.argv[0])
from sys import argv
print(argv[0])
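The aliased forms listed earlier work the same way; only the name bound in your program changes. A small sketch, using the standard math and os.path modules purely as illustrations (the alias names osp and square_root are arbitrary choices, not conventions from this course):

```python
import os.path as osp                  # the module is bound to a shorter alias
from math import sqrt as square_root   # a single name is imported under an alias

# The alias replaces the full module name.
print(osp.basename("/tmp/example.txt"))

# The alias replaces the original function name.
print(square_root(16))
```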
To write a module, you only need to write a script, which can be as simple as a few variable declarations.
Afterwards, import it into your main program and its variables and functions will be ready for use.
However, it is good practice to add the following if statement to your code:
if __name__ == "__main__":
«Program»
This ensures that any code inside the conditional is run only when the script is executed as a standalone program, and not when it is imported as a module.
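As a minimal sketch of such a module, assume a hypothetical file named greetings.py with the contents below. The file name, the GREETING variable and the greet function are all invented for illustration.

```python
# greetings.py (hypothetical module)
GREETING = "Hello"   # a module-level variable, available after import


def greet(name):
    """Return a greeting for the given name."""
    return "{}, {}!".format(GREETING, name)


if __name__ == "__main__":
    # Runs only when the file is executed directly (python greetings.py),
    # not when another program does "import greetings".
    print(greet("world"))
```

Another script could then do import greetings and call greetings.greet("Ana") without triggering the print call inside the conditional.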
Regular expressions (RE) are a very large topic. A whole course could be devoted to them.
They are very useful when our code starts filling up with endswith() and startswith() calls and lots of conditionals.
RE can make our lives a lot easier, but they take a lot of getting used to, and even then they produce hard-to-read code. Despite these shortcomings, they are awesome!
If you look at the re module's section of Python's documentation, you can see that it is as large as, say, the whole control flow section. But fear not: the docs are there to help you, and you don't have to learn everything about RE today. The goal here is to let you know what can be done.
Later you may want to read the Regular Expression HOWTO in Python's documentation to learn more. It is an introductory tutorial to RE.
A RE specifies a set of strings that match it; the functions in the re module let you check if a particular string matches a given regular expression.
This often requires the use of special characters - AKA metacharacters.
. -> Matches any character (except a newline, by default)
^ -> Matches the beginning of a string (not a character)
$ -> Matches the end of a string (also, not a character)
* -> Matches the preceding character 0 or more times
+ -> Matches the preceding character 1 or more times
? -> Matches the preceding character 0 or 1 times
{x} -> Matches exactly _x_ copies of the preceding character
{x,y} -> Matches _x_ to _y_ copies of the preceding character
\ -> Escapes the following character (for matching things like *)
[XYZ] -> Indicates a set of characters - in this case X, Y or Z
| -> Separates 2 or more REs, and matches either of them
There are, however, many more, all listed in the re module documentation.
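To see the metacharacters from the list in action, here is a small sketch that tries one pattern per metacharacter. The patterns and test strings are invented for illustration; each comment states what the pattern is meant to show.

```python
import re

# Each entry is (pattern, text, whether a match is expected).
examples = [
    (r"c.t", "cat", True),          # . stands for any single character
    (r"^py", "python", True),       # ^ anchors the match at the start
    (r"^py", "happy", False),       # "py" is present, but not at the start
    (r"py$", "happy", True),        # $ anchors the match at the end
    (r"ab*c", "ac", True),          # * accepts zero copies of "b"
    (r"ab+c", "ac", False),         # + needs at least one "b"
    (r"colou?r", "color", True),    # ? makes the "u" optional
    (r"^a{3}$", "aaa", True),       # {3} means exactly three copies
    (r"^a{2,3}$", "aaaa", False),   # {2,3} allows two or three, not four
    (r"\*", "2*3", True),           # \ turns * into a literal asterisk
    (r"[XYZ]", "boXes", True),      # [XYZ] matches X, Y or Z
    (r"cat|dog", "hotdog", True),   # | matches either alternative
]

for pattern, text, expected in examples:
    found = re.search(pattern, text) is not None
    print(pattern, text, found)
```

Raw strings (the r prefix) are used so that backslashes reach the re module unchanged.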
Using these metacharacters we can use the re module to perform useful operations, using re.search, re.sub and re.compile to name a few.
We will use re.search() as an example.
This function will look for an expression in a string and is invoked like this:
re.search(pattern, string, flags=0)
re.search scans a given string for the first location where the pattern matches and returns a match object describing it. If the pattern is not found, it returns None.
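A minimal sketch of such a call follows; the file names are made up. When the pattern is found, re.search returns a match object, whose group() method gives the matched text and whose start() and end() methods give its position. Otherwise the result is None.

```python
import re

# Look for a run of digits in a (hypothetical) file name.
match = re.search(r"[0-9]+", "sample_042.fasta")
if match:
    print(match.group())               # the text that matched
    print(match.start(), match.end())  # where the match begins and ends

# The same pattern on a string without digits finds nothing.
no_match = re.search(r"[0-9]+", "sample.fasta")
print(no_match)
```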
You must test this code in IDLE or equivalent.
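The other two functions named earlier, re.sub and re.compile, can be sketched in the same spirit. The example string is invented, and findall is just one of several methods a compiled pattern offers.

```python
import re

# re.sub replaces every non-overlapping match of the pattern.
# Here, each run of whitespace becomes a single space.
cleaned = re.sub(r"\s+", " ", "ATG   CCA \t GGT")
print(cleaned)

# re.compile builds a reusable pattern object; its methods mirror
# the module-level functions (search, sub, findall, ...).
codon = re.compile(r"[ACGT]{3}")
codons = codon.findall(cleaned)
print(codons)
```

Compiling is mostly a convenience when the same pattern is used many times in a program.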
The os module provides a portable way of using operating system dependent functionality.
os.chdir(path)
Change the current working directory to path.
Availability: Unix, Windows.
os.getcwd()
Return a string representing the current working directory.
Availability: Unix, Windows.
os.listdir(path)
Return a list containing the names of the entries in the directory given by path. The list is in arbitrary order.
Availability: Unix, Windows.
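The three functions above can be tried together in a short sketch. It works inside a scratch directory created with tempfile.mkdtemp() so that no real files are touched; the tempfile and shutil modules and the file names are additions made for this example only.

```python
import os
import shutil
import tempfile

start = os.getcwd()            # remember the original working directory
workdir = tempfile.mkdtemp()   # scratch directory for the experiment

os.chdir(workdir)              # from now on, relative paths resolve here
print(os.getcwd())

# Create two empty files so that there is something to list.
for name in ("a.txt", "b.txt"):
    with open(name, "w"):
        pass

# listdir returns names in arbitrary order, so sort them for display.
entries = sorted(os.listdir("."))
print(entries)

os.chdir(start)                # go back before cleaning up
shutil.rmtree(workdir)         # remove the scratch directory and its files
```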
os.mkdir(path[, mode])
Create a directory named path. If the directory already exists, an error is raised.
Availability: Unix, Windows.
os.makedirs(path[, mode])
Recursive directory creation function. Like mkdir(), but makes all intermediate-level directories needed to contain the leaf directory. Raises an error if the leaf directory already exists or cannot be created.
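The difference between mkdir() and makedirs() can be sketched as follows. The directory names are hypothetical, and everything is created under a scratch directory from tempfile.mkdtemp(), which is an addition for safety rather than part of the os API described here.

```python
import os
import shutil
import tempfile

base = tempfile.mkdtemp()      # scratch area for the experiment

# mkdir creates exactly one directory...
single = os.path.join(base, "results")
os.mkdir(single)

# ...and raises an error (an OSError) if it already exists.
mkdir_failed = False
try:
    os.mkdir(single)
except OSError as error:
    mkdir_failed = True
    print("mkdir failed:", error)

# makedirs also creates the missing intermediate directories.
nested = os.path.join(base, "run1", "plots", "png")
os.makedirs(nested)
nested_created = os.path.isdir(nested)
print(nested_created)

shutil.rmtree(base)            # clean up the scratch area
```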
os.remove(path)
Remove (delete) the file path. If path is a directory, an error is raised. For directories, use rmdir() instead.
Availability: Unix, Windows.
os.rmdir(path)
Remove (delete) the directory path. Only works when the directory is empty, otherwise, an error is raised.
Availability: Unix, Windows.
os.removedirs(path)
Remove directories recursively. Works like rmdir() except that, if the leaf directory is successfully removed, removedirs() tries to successively remove every parent directory mentioned in path until an error is raised.
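A short sketch of the three removal functions is given below. All names are invented, the work happens in a scratch directory from tempfile.mkdtemp(), and a file called keep.txt is deliberately left in that directory so that removedirs() has a non-empty parent at which to stop.

```python
import os
import shutil
import tempfile

base = tempfile.mkdtemp()                   # scratch area
with open(os.path.join(base, "keep.txt"), "w"):
    pass                                    # keeps base non-empty

leaf = os.path.join(base, "a", "b")
os.makedirs(leaf)
file_path = os.path.join(leaf, "data.txt")
with open(file_path, "w") as handle:
    handle.write("temporary\n")

# rmdir refuses to delete a directory that still has content.
rmdir_failed = False
try:
    os.rmdir(leaf)
except OSError:
    rmdir_failed = True

os.remove(file_path)    # delete the file first
os.removedirs(leaf)     # removes b, then a, and stops at the non-empty base

a_removed = not os.path.exists(os.path.join(base, "a"))
base_survived = os.path.isdir(base)
print(rmdir_failed, a_removed, base_survived)

shutil.rmtree(base)     # clean up what is left
```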
os.rename(src, dst)
Rename the file or directory src to dst. If dst is a directory, an error will be raised.
Unix: if dst exists and is a file, it will be replaced silently if the user has permission.
Windows: if dst already exists, an error will be raised even if it is a file.
Availability: Unix, Windows.
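Renaming a file can be sketched like this. The file names draft.txt and final.txt are hypothetical, and the scratch directory from tempfile.mkdtemp() is only there to keep the example harmless.

```python
import os
import shutil
import tempfile

base = tempfile.mkdtemp()               # scratch area
src = os.path.join(base, "draft.txt")
dst = os.path.join(base, "final.txt")

with open(src, "w") as handle:
    handle.write("content\n")

os.rename(src, dst)                     # dst does not exist yet, so this is safe everywhere

src_exists = os.path.exists(src)
dst_exists = os.path.exists(dst)
print(src_exists, dst_exists)

shutil.rmtree(base)                     # clean up
```

Because the behaviour differs between Unix and Windows when dst already exists, checking os.path.exists(dst) before renaming is one way to keep a script portable.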
os.popen(command[, mode[, bufsize]])
Deprecated since version 2.6: This function is obsolete. Use the subprocess module. Check especially the Replacing Older Functions with the subprocess Module section.
subprocess.check_output(args, *, stdin=None, stderr=None, shell=False, universal_newlines=False)
Run command with arguments and return its output as a byte string.
[bruno@laptop ~]$ ls -l | grep py
-rw-r--r-- 1 bruno cobig2 0 May 31 19:21 script1.py
-rw-r--r-- 1 bruno cobig2 0 May 31 19:21 script2.py
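The idea behind the shell session above (run a command, then keep only the lines containing "py") can be sketched with check_output. To stay portable, this example does not call ls: the child command is the Python interpreter itself printing three made-up file names, and the filtering that grep did is performed in Python. On a Unix system, subprocess.check_output(["ls", "-l"]) would capture the real listing in the same way.

```python
import subprocess
import sys

# Child command: the running Python interpreter printing three invented
# file names. It stands in for "ls" so the sketch works on any platform.
command = [
    sys.executable,
    "-c",
    "print('notes.txt'); print('script1.py'); print('script2.py')",
]

raw = subprocess.check_output(command)   # the output comes back as bytes
text = raw.decode()                      # turn the bytes into a string

# Keep only the lines containing "py", as grep did in the shell.
matching = [line for line in text.splitlines() if "py" in line]
print(matching)
```

Passing the command as a list of arguments avoids the need for shell=True.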
Biopython is a collection of tools and modules developed for bioinformatics and computational biology problems. It offers numerous features, some of which are described below.
The SeqRecord class allows identifiers and features to be associated with a sequence, creating sequence records that are much richer in information.
These features can be created manually, or imported directly from a database record (GenBank).
The SeqIO.write() function can write a set of SeqRecord objects into a new file in a format specified by the user. You only need (i) one or more SeqRecord objects, (ii) a filename to write to, and (iii) a sequence format.