{article Dive into Python}{title} {text} {/article}

Introducing sys.modules

Python 3.4.1 (v3.4.1:c0e311e010fc, May 18 2014, 10:45:13) [MSC v.1600 64 bit (AMD64)] on win32
Type "copyright", "credits" or "license()" for more information.
>>> import sys
>>> print ('\n'.join(sys.modules.keys()))
ast
idlelib.WidgetRedirector
pickle
idlelib.rpc
_warnings
time
idlelib.HyperParser
tkinter._fix
copyreg
tkinter.constants
marshal
tarfile
threading
_collections
token

#---------------------------------------------------#

The sys module contains system−level information, such as the version of Python you're running (sys.version or sys.version_info), and system−level options such as the maximum allowed recursion depth (sys.getrecursionlimit() and sys.setrecursionlimit()).

sys.modules is a dictionary containing all the modules that have ever been imported since Python was started; the key is the module name, the value is the module object. Note that this is more than just the modules your program has imported. Python preloads some modules on startup, and if you're using a Python IDE, sys.modules contains all the modules imported by all the programs you've run within the IDE.

{source}
<!-- You can place html anywhere within the source tags -->
<pre class="brush:py;">

Python 3.4.1 (v3.4.1:c0e311e010fc, May 18 2014, 10:45:13) [MSC v.1600 64 bit (AMD64)] on win32
Type "copyright", "credits" or "license()" for more information.
>>> import sys
>>> print ('\n'.join(sys.modules.keys()))
ast
idlelib.WidgetRedirector
pickle
idlelib.rpc
_warnings
time
idlelib.HyperParser
tkinter._fix
copyreg
tkinter.constants
marshal
tarfile
threading
_collections
token

</pre>

<script language="javascript" type="text/javascript">
    // You can place JavaScript like this

</script>
<?php
    // You can place PHP like this

?>
{/source}

Using sys.modules

Python 3.4.1 (v3.4.1:c0e311e010fc, May 18 2014, 10:45:13) [MSC v.1600 64 bit (AMD64)] on win32
Type "copyright", "credits" or "license()" for more information.
>>> import sys
>>> import fileinfo
>>> print ('\n'.join(sys.modules.keys()))
array
idlelib.Percolator
tokenize
idlelib.idlever
_bootlocale
functools
shlex
codecs
dis
idlelib.Bindings
_stat
encodings.aliases
copy
tkinter._fix
idlelib.StackViewer
code

>>> fileinfo
<module 'fileinfo' from 'C:\\Python34\\fileinfo.py'>
>>> sys.modules["fileinfo"]
<module 'fileinfo' from 'C:\\Python34\\fileinfo.py'>
>>>

As new modules are imported, they are added to sys.modules. This explains why importing the same module twice is very fast: Python has already loaded and cached the module in sys.modules, so importing the second time is simply a dictionary lookup.

Given the name (as a string) of any previously−imported module, you can get a reference to the module itself through the sys.modules dictionary.

{source}
<!-- You can place html anywhere within the source tags -->
<pre class="brush:py;">

Python 3.4.1 (v3.4.1:c0e311e010fc, May 18 2014, 10:45:13) [MSC v.1600 64 bit (AMD64)] on win32
Type "copyright", "credits" or "license()" for more information.
>>> import sys
>>> import fileinfo
>>> print ('\n'.join(sys.modules.keys()))
array
idlelib.Percolator
tokenize
idlelib.idlever
_bootlocale
functools
shlex
codecs
dis
idlelib.Bindings
_stat
encodings.aliases
copy
tkinter._fix
idlelib.StackViewer
code
>>> fileinfo
<module 'fileinfo' from 'C:\\Python34\\fileinfo.py'>
>>> sys.modules["fileinfo"]
<module 'fileinfo' from 'C:\\Python34\\fileinfo.py'>
>>>
</pre>

<script language="javascript" type="text/javascript">
    // You can place JavaScript like this

</script>
<?php
    // You can place PHP like this

?>
{/source}

The __module__ Class Attribute

Python 3.4.1 (v3.4.1:c0e311e010fc, May 18 2014, 10:45:13) [MSC v.1600 64 bit (AMD64)] on win32
Type "copyright", "credits" or "license()" for more information.
>>> import sys
>>> import fileinfo
>>> from fileinfo import MP3FileInfo
>>> MP3FileInfo.__module__
'fileinfo'
>>> sys.modules[MP3FileInfo.__module__]
<module 'fileinfo' from 'C:\\Python34\\fileinfo.py'>
>>>

#----------------------------------------------------------#

Every Python class has a built−in class attribute __module__, which is the name of the module in which the class is defined.

Combining this with the sys.modules dictionary, you can get a reference to the module in which a class is defined.

{source}
<!-- You can place html anywhere within the source tags -->
<pre class="brush:py;">

Python 3.4.1 (v3.4.1:c0e311e010fc, May 18 2014, 10:45:13) [MSC v.1600 64 bit (AMD64)] on win32
Type "copyright", "credits" or "license()" for more information.
>>> import sys
>>> import fileinfo
>>> from fileinfo import MP3FileInfo
>>> MP3FileInfo.__module__
'fileinfo'
>>> sys.modules[MP3FileInfo.__module__]
<module 'fileinfo' from 'C:\\Python34\\fileinfo.py'>
>>>


</pre>

<script language="javascript" type="text/javascript">
    // You can place JavaScript like this

</script>
<?php
    // You can place PHP like this

?>
{/source}

sys.modules in fileinfo.py

fileinfo.py

"""Framework for getting filetype-specific metadata.

Instantiate appropriate class with filename. Returned object acts like a
dictionary, with key-value pairs for each piece of metadata.
import fileinfo
info = fileinfo.MP3FileInfo("/music/ap/mahadeva.mp3")
print "\\n".join(["%s=%s" % (k, v) for k, v in info.items()])

Or use listDirectory function to get info on all files in a directory.
for info in fileinfo.listDirectory("/music/ap/", [".mp3"]):
...

Framework can be extended by adding classes for particular file types, e.g.
HTMLFileInfo, MPGFileInfo, DOCFileInfo. Each class is completely responsible for
parsing its files appropriately; see MP3FileInfo for example.

This program is part of "Dive Into Python", a free Python book for
experienced programmers. Visit http://diveintopython.org/ for the
latest version.
"""

__author__ = "Mark Pilgrim (This email address is being protected from spambots. You need JavaScript enabled to view it.)"
__version__ = "$Revision: 1.3 $"
__date__ = "$Date: 2004/05/05 21:57:19 $"
__copyright__ = "Copyright (c) 2001 Mark Pilgrim"
__license__ = "Python"

import os
import sys
from UserDict import UserDict

def stripnulls(data):
"strip whitespace and nulls"
return data.replace("\00", " ").strip()

class FileInfo(UserDict):
"store file metadata"
def __init__(self, filename=None):
UserDict.__init__(self)
self["name"] = filename

class MP3FileInfo(FileInfo):
"store ID3v1.0 MP3 tags"
tagDataMap = {"title" : ( 3, 33, stripnulls),
"artist" : ( 33, 63, stripnulls),
"album" : ( 63, 93, stripnulls),
"year" : ( 93, 97, stripnulls),
"comment" : ( 97, 126, stripnulls),
"genre" : (127, 128, ord)}

def __parse(self, filename):
"parse ID3v1.0 tags from MP3 file"
self.clear()
try:
fsock = open(filename, "rb", 0)
try:
fsock.seek(-128, 2)
tagdata = fsock.read(128)
finally:
fsock.close()
if tagdata[:3] == 'TAG':
for tag, (start, end, parseFunc) in self.tagDataMap.items():
self[tag] = parseFunc(tagdata[start:end])
except IOError:
pass

def __setitem__(self, key, item):
if key == "name" and item:
self.__parse(item)
FileInfo.__setitem__(self, key, item)

def listDirectory(directory, fileExtList):
"get list of file info objects for files of particular extensions"
fileList = [os.path.normcase(f) for f in os.listdir(directory)]
fileList = [os.path.join(directory, f) for f in fileList \
if os.path.splitext(f)[1] in fileExtList]
def getFileInfoClass(filename, module=sys.modules[FileInfo.__module__]):
"get file info class from filename extension"
subclass = "%sFileInfo" % os.path.splitext(filename)[1].upper()[1:]
return hasattr(module, subclass) and getattr(module, subclass) or FileInfo
return [getFileInfoClass(f)(f) for f in fileList]

if __name__ == "__main__":
for info in listDirectory("/music/_singles/", [".mp3"]):
print ("\n".join(["%s=%s" % (k, v) for k, v in info.items()]))
print


This is a function with two arguments; filename is required, but module is optional and defaults to the module that contains the FileInfo class. This looks inefficient, because you might expect Python to evaluate the sys.modules expression every time the function is called. In fact, Python evaluates default expressions only once, the first time the module is imported. As you'll see later, you never call this function with a module argument, so module serves as a function−level constant.

You'll plow through this line later, after you dive into the os module. For now, take it on faith that subclass ends up as the name of a class, like MP3FileInfo.

You already know about getattr, which gets a reference to an object by name. hasattr is a complementary function that checks whether an object has a particular attribute; in this case, whether a module has a particular class (although it works for any object and any attribute, just like getattr). In English, this line of code says, "If this module has the class named by subclass then return it, otherwise return the base class FileInfo."

{source}
<!-- You can place html anywhere within the source tags -->
<pre class="brush:py;">

"""Framework for getting filetype-specific metadata.

Instantiate appropriate class with filename. Returned object acts like a
dictionary, with key-value pairs for each piece of metadata.
    import fileinfo
    info = fileinfo.MP3FileInfo("/music/ap/mahadeva.mp3")
    print "\\n".join(["%s=%s" % (k, v) for k, v in info.items()])

Or use listDirectory function to get info on all files in a directory.
    for info in fileinfo.listDirectory("/music/ap/", [".mp3"]):
        ...

Framework can be extended by adding classes for particular file types, e.g.
HTMLFileInfo, MPGFileInfo, DOCFileInfo. Each class is completely responsible for
parsing its files appropriately; see MP3FileInfo for example.

This program is part of "Dive Into Python", a free Python book for
experienced programmers. Visit http://diveintopython.org/ for the
latest version.
"""

__author__ = "Mark Pilgrim (This email address is being protected from spambots. You need JavaScript enabled to view it.)"
__version__ = "$Revision: 1.3 $"
__date__ = "$Date: 2004/05/05 21:57:19 $"
__copyright__ = "Copyright (c) 2001 Mark Pilgrim"
__license__ = "Python"

import os
import sys
from UserDict import UserDict

def stripnulls(data):
    "strip whitespace and nulls"
    return data.replace("\00", " ").strip()

class FileInfo(UserDict):
    "store file metadata"
    def __init__(self, filename=None):
        UserDict.__init__(self)
        self["name"] = filename

class MP3FileInfo(FileInfo):
    "store ID3v1.0 MP3 tags"
    tagDataMap = {"title" : ( 3, 33, stripnulls),
                "artist" : ( 33, 63, stripnulls),
                "album" : ( 63, 93, stripnulls),
                "year" : ( 93, 97, stripnulls),
                "comment" : ( 97, 126, stripnulls),
                "genre" : (127, 128, ord)}

    def __parse(self, filename):
        "parse ID3v1.0 tags from MP3 file"
        self.clear()
        try:
            fsock = open(filename, "rb", 0)
            try:
                fsock.seek(-128, 2)
                tagdata = fsock.read(128)
            finally:
                fsock.close()
            if tagdata[:3] == 'TAG':
                for tag, (start, end, parseFunc) in self.tagDataMap.items():
                    self[tag] = parseFunc(tagdata[start:end])
        except IOError:
            pass

    def __setitem__(self, key, item):
        if key == "name" and item:
            self.__parse(item)
        FileInfo.__setitem__(self, key, item)

def listDirectory(directory, fileExtList):
    "get list of file info objects for files of particular extensions"
    fileList = [os.path.normcase(f) for f in os.listdir(directory)]
    fileList = [os.path.join(directory, f) for f in fileList \
                if os.path.splitext(f)[1] in fileExtList]
    def getFileInfoClass(filename, module=sys.modules[FileInfo.__module__]):
        "get file info class from filename extension"
        subclass = "%sFileInfo" % os.path.splitext(filename)[1].upper()[1:]
        return hasattr(module, subclass) and getattr(module, subclass) or FileInfo
    return [getFileInfoClass(f)(f) for f in fileList]

if __name__ == "__main__":
    for info in listDirectory("/music/_singles/", [".mp3"]):
        print ("\n".join(["%s=%s" % (k, v) for k, v in info.items()]))
        print



</pre>

<script language="javascript" type="text/javascript">
    // You can place JavaScript like this

</script>
<?php
    // You can place PHP like this

?>
{/source}