Bioinformatics tools I have recently learned about.

A while ago I wrote a script to scrape gene data; this post records the resources I used.

KEGG:

Kyoto Encyclopedia of Genes and Genomes

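KEGG is a database of genes and genomes. The GenomeNet site (currently maintained by the Bioinformatics Center at Kyoto University) continues to maintain it and to provide analysis services.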

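One very nice thing about KEGG is that it provides the KEGG API, which makes it easy for researchers to write their own scripts and pull customized data. PS. KEGG offers both SOAP and REST interfaces.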
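
As a rough illustration of scripting against the REST flavour of the KEGG API, here is a minimal Python 3 sketch. It assumes the REST service lives at https://rest.kegg.jp and uses a `/<operation>/<argument>` URL layout. The helper names `kegg_url`, `parse_kegg_list` and `kegg_fetch` are made up for this example, and the sample text is hand-written in the tab-separated shape of a `list` response, not real output.

```python
from urllib.parse import quote
from urllib.request import urlopen

KEGG_REST = "https://rest.kegg.jp"  # assumed base URL of the KEGG REST service


def kegg_url(operation, *args):
    """Build a KEGG REST URL such as https://rest.kegg.jp/get/hsa:10458."""
    parts = [KEGG_REST, operation] + [quote(a, safe=":+") for a in args]
    return "/".join(parts)


def parse_kegg_list(text):
    """Split tab-separated 'id<TAB>description' lines into (id, description) pairs."""
    pairs = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        entry_id, _, description = line.partition("\t")
        pairs.append((entry_id, description))
    return pairs


def kegg_fetch(operation, *args):
    """Fetch one KEGG REST resource as text (needs network access)."""
    with urlopen(kegg_url(operation, *args)) as resp:
        return resp.read().decode("utf-8")


# Hand-written sample in the shape of a `list` response (illustrative only)
sample = "map00010\tGlycolysis / Gluconeogenesis\nmap00020\tCitrate cycle (TCA cycle)\n"

print(kegg_url("get", "hsa:10458"))
print(parse_kegg_list(sample))
```

Calling `kegg_fetch("list", "pathway")` would then return text that `parse_kegg_list` can split, provided the service is reachable and still uses this layout.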

TogoWS

Integration of the bioinformatics web services

I took a few detours while exploring earlier and ended up needing fairly complex relational queries, which is how I found TogoWS.

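It brings together data from most of the major bioinformatics centers, including NCBI, EBI, DDBJ, KEGG, PDBj, and CBRC, behind a single unified query interface (API). It is convenient to use, offers both SOAP and REST calls, and can return results in several data formats.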
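
To make the "one interface, several formats" idea concrete, here is a small Python 3 sketch that only builds TogoWS-style entry URLs. Everything specific in it is an assumption to check against the TogoWS documentation: the base URL `http://togows.org`, the `/entry/<database>/<ids>[/<field>][.<format>]` layout, and the database and field names in the usage lines. The function name `togows_entry_url` is hypothetical.

```python
TOGOWS = "http://togows.org"  # assumed base URL of the TogoWS REST service


def togows_entry_url(database, ids, field=None, fmt=None):
    """Build an entry URL: /entry/<database>/<id>[,<id>...][/<field>][.<format>]."""
    if isinstance(ids, str):
        ids = [ids]  # accept a single id as well as a list of ids
    url = "/".join([TOGOWS, "entry", database, ",".join(ids)])
    if field:
        url += "/" + field  # restrict the response to one field of the entry
    if fmt:
        url += "." + fmt    # ask for a specific output format
    return url


# Database, field and format names here are illustrative placeholders
print(togows_entry_url("ncbi-pubmed", "16381885", field="authors", fmt="json"))
print(togows_entry_url("kegg-genes", ["hsa:10458", "hsa:10459"]))
```

The same entry can be requested in a different format just by changing the `fmt` suffix, which is what makes a unified interface like this convenient for scripts.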

 

Below are bioinformatics programming libraries written in Perl, Python, and Ruby respectively. See the official websites for details.

BioPerl

BioPerl is a toolkit of perl modules useful in building bioinformatics solutions in Perl.

BioPython

Biopython is a set of freely available tools for biological computation written in Python by an international team of developers.

The KEGG module in this library is still quite incomplete: it has no API client and no parser for the KEGG data formats.

BioRuby

Open source bioinformatics library for Ruby.

triple_des (3DES) algorithm - PHP and Python implementations

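I had to call an interface on the telecom carrier's side that requires the triple_des encryption algorithm, so I put together implementations in both Python and PHP.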

The iv and key variables both hold hex strings.

php:

 

#!/usr/bin/php
<?php

function PaddingPKCS7($data) {
	$block_size = mcrypt_get_block_size("tripledes", "cbc");
	$padding_char = $block_size - (strlen($data) % $block_size);
	$data .= str_repeat(chr($padding_char),$padding_char);
	return $data;
}

function fmt3DESEx($data, $key, $iv) {
	$td = mcrypt_module_open(MCRYPT_3DES,"", MCRYPT_MODE_CBC, "");
	$key = pack("H48",$key);
	$iv = pack("H16",$iv);
	mcrypt_generic_init($td, $key, $iv);
	$data = PaddingPKCS7($data);
	$desResult = mcrypt_generic($td, $data);
	mcrypt_generic_deinit($td);
	mcrypt_module_close($td);
	return base64_encode($desResult);     
}



$data = "asdf";
$key = "313233343536373839303132333435363738393031323334";
$iv = "3132333435363738";
$code = fmt3DESEx(mhash(MHASH_SHA1,$data), $key, $iv);
echo $code;
?>

python:

 

#!/usr/bin/env python
# -*- coding:utf-8 -*-

import binascii
import hashlib
import base64
import pyDes

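iv = '3132333435363738'
key = '313233343536373839303132333435363738393031323334'
data = "asdf"

# iv and key are hex strings: decode them to raw bytes (8-byte IV, 24-byte key)
iv = binascii.unhexlify(iv)
key = binascii.unhexlify(key)
# the raw SHA-1 digest of the data is what gets encrypted, as in the PHP version
digest = hashlib.sha1(data.encode('utf-8')).digest()

# with an 8-byte block, PAD_PKCS5 pads the same way as PaddingPKCS7 in the PHP version
k = pyDes.triple_des(key, pyDes.CBC, iv, pad=None, padmode=pyDes.PAD_PKCS5)
d = k.encrypt(digest)
# b64encode adds no trailing newline, matching PHP's base64_encode
print(base64.b64encode(d).decode('ascii'))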
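
Both versions rely on the same two preparation steps before the cipher runs: the hex strings are decoded to raw bytes, and the plaintext is padded to a multiple of the 8-byte block size. The Python 3 sketch below shows just those steps with the standard library, so it makes no claim about how mcrypt or pyDes implement them internally. The function names `pkcs7_pad` and `pkcs7_unpad` are chosen for this example.

```python
import binascii

BLOCK_SIZE = 8  # DES / 3DES block size in bytes


def pkcs7_pad(data, block_size=BLOCK_SIZE):
    """Append n bytes of value n so the length becomes a multiple of block_size."""
    n = block_size - (len(data) % block_size)
    return data + bytes([n]) * n


def pkcs7_unpad(padded, block_size=BLOCK_SIZE):
    """Strip PKCS#7 padding, raising ValueError if the padding is malformed."""
    if not padded or len(padded) % block_size:
        raise ValueError("padded data must be a non-empty multiple of the block size")
    n = padded[-1]
    if n < 1 or n > block_size or padded[-n:] != bytes([n]) * n:
        raise ValueError("invalid PKCS#7 padding")
    return padded[:-n]


# The hex strings from the post decode to a 24-byte key and an 8-byte IV
key = binascii.unhexlify("313233343536373839303132333435363738393031323334")
iv = binascii.unhexlify("3132333435363738")
print(len(key), len(iv))

# A SHA-1 digest is 20 bytes long, so 4 padding bytes of value 0x04 are added
padded = pkcs7_pad(b"x" * 20)
print(len(padded), padded[-4:])
```

Note that input already aligned to the block size still gains a full block of padding, which is what lets the receiver remove the padding unambiguously.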

 

A small Python drag-and-drop desktop app. :)

Description: drag two xls files onto this app; it compares them automatically and writes the comparison result to a new xls file.

 

# -*- coding: utf-8 -*-

# Import wx.Python
import wx
import xlrd,xlwt
import os.path

# Declare GUI Constants
DRAG_SOURCE    = wx.NewId()

#=====================================
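def encoding(s,coding=None):
    """ Guess the encoding of byte string s; return None if no candidate fits """
    cl = ['utf8', 'gb2312','gb18030']
    if coding:
        cl.append(coding)
    for a in cl:
        try:
            s.decode(a)
            return a
        except UnicodeDecodeError:
            # decoding failures raise UnicodeDecodeError; try the next candidate
            pass
    return None

def toUnicode(s,coding=None):
    if isinstance(s,unicode):
        return s
    # fall back to utf8 with replacement characters when detection fails
    return s.decode(encoding(s,coding) or 'utf8', 'replace')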
#=================================================

# Define File Drop Target class
class FileDropTarget(wx.FileDropTarget):
    """ This object implements Drop Target functionality for Files """
    def __init__(self, obj):
        """ Initialize the Drop Target, passing in the Object Reference to
          indicate what should receive the dropped files """
        # Initialize the wxFileDropTarget Object
        wx.FileDropTarget.__init__(self)
        # Store the Object Reference for dropped files
        self.obj = obj
        self._initContent()

    def _initContent(self):
        self.content = {'same':[],'diff':[]}

    def _log(self,string):
        self.obj.WriteText(toUnicode(string))

    def _analyze(self,xls={}):

        def combine_rows(sheet):
            # Keep each row as a tuple of unicode cells: tuples are hashable for the
            # set operations below, and cells that contain "-" stay intact
            rows = []
            for rownum in range(sheet.nrows):
                rows.append(tuple([toUnicode(cell if isinstance(cell,basestring) else str(cell)) \
                        for cell in sheet.row_values(rownum)]))
            return rows

        #FIXME optimize
        name = xls.keys()
        self._log(u"正在比对 "+name[0]+u" , "+name[1]+u" 中...... \n")
        rows = []
        for sheet in xls.values():
            rows.append(combine_rows(sheet))
        same_rows = list(set(rows[0]) & set(rows[1]))
        file1_rows = list(set(rows[0]) - set(rows[1]))
        file2_rows = list(set(rows[1]) - set(rows[0]))
        self.content['same'].extend(same_rows)
        self.content['diff'].extend(\
                [{'name':name[0],'rows':file1_rows},
                {'name':name[1],'rows':file2_rows}])
        self._log("比对完成 \n")
        return 1

    def _dumpToXls(self,path):
        self._log("生成比对结果中...... \n")
        xls = xlwt.Workbook()
        sheet_same = xls.add_sheet(u'相同数据',cell_overwrite_ok=True)
        same_rows = self.content['same']
        def write_rows(sheet,rows):
            for r in range(len(rows)):
                # each row is already a sequence of cells, one per column
                cells = rows[r]
                for c in range(len(cells)):
                    sheet.write(r,c,cells[c])
        write_rows(sheet_same,same_rows)

        sheet_diff = {}
        for diff in self.content['diff']:
            name = diff.get('name','default')
            rows = diff.get('rows',[])
            sheet_diff[name] = xls.add_sheet(name+u"独有",cell_overwrite_ok=True)
            write_rows(sheet_diff[name],rows)
        result = os.path.join(path,u" 和 ".join(sheet_diff.keys())+u'的比对结果.xls')
        xls.save(result)
        self._log(u"生成比对结果如下:\n%s\n" % (result,))
        self._initContent()

    def _compare(self,filenames):
        filenames = filenames if isinstance(filenames,list) else list(filenames)
        #for mutiple files
        xls={}
        try:
            for file in filenames:
                name = os.path.basename(file).split('.')[0]
                xls[name] = xlrd.open_workbook(file).sheets()[0]
        except xlrd.biffh.XLRDError,e:
            self._log("请载入xls格式文件")
            return False
        #filenames.sort(self._analyze)
        self._analyze(xls)
        self._dumpToXls(os.path.dirname(filenames[0]))

    def OnDropFiles(self, x, y, filenames):
        """ Implement File Drop """
        # For Demo purposes, this function appends a list of the files dropped at the end of the widget's text
        # Move Insertion Point to the end of the widget's text
        self.obj.SetInsertionPointEnd()
        # append a list of the file names dropped
        self._log("\n加载了 %d 个文件:\n" % (len(filenames)))
        if len(filenames) != 2:
            self._log("只能比对两个文件\n")
            return False
        self._compare(filenames)

class MainWindow(wx.Frame):
    """ This window displays the GUI Widgets. """
    def __init__(self,parent,id,title):
        #wx.Frame.__init__(self,parent,-4, title, size = (430,270), style=wx.DEFAULT_FRAME_STYLE|wx.NO_FULL_REPAINT_ON_RESIZE)
        wx.Frame.__init__(self,parent,-4, title, size = (430,270), \
                style=wx.MINIMIZE_BOX | wx.MAXIMIZE_BOX | wx.SYSTEM_MENU | wx.CAPTION | wx.CLOSE_BOX | wx.CLIP_CHILDREN )
        self.SetBackgroundColour(wx.WHITE)

        wx.StaticText(self, -1, u"请将要比对的EXCEL文件拖拉进下面区域进行比对",(3,0))
        self.text = wx.TextCtrl(self, -1, "", pos=(2,15),size=(418,220), style = wx.TE_MULTILINE|wx.HSCROLL|wx.TE_READONLY)

        dt = FileDropTarget(self.text)
        self.text.SetDropTarget(dt)

        # Display the Window
        self.Show(True)

    def CloseWindow(self, event):
        """ Close the Window """
        self.Close()

    def OnDragInit(self, event):
        """ Begin a Drag Operation """
        # Create a Text Data Object, which holds the text that is to be dragged
        tdo = wx.PyTextDataObject(self.text.GetStringSelection())
        # Create a Drop Source Object, which enables the Drag operation
        tds = wx.DropSource(self.text)
        # Associate the Data to be dragged with the Drop Source Object
        tds.SetData(tdo)
        # Intiate the Drag Operation
        tds.DoDragDrop(True)

class MyApp(wx.App):
    """ Define the Drag and Drop Example Application """
    def OnInit(self):
        """ Initialize the Application """
        # Declare the Main Application Window
        frame = MainWindow(None, -1, u"Excel文件比对小程序")
        # Show the Application as the top window
        self.SetTopWindow(frame)
        return True

# Declare the Application and start the Main Loop
app = MyApp(0)
app.MainLoop()
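
Stripped of the GUI and the Excel reading and writing, the comparison itself is plain set arithmetic over rows. The Python 3 sketch below isolates that step on in-memory data. The names `unique` and `compare_rows` and the sample rows are invented for this example, and keeping the rows in their original order is an addition of this sketch, since bare sets do not preserve order.

```python
def unique(rows):
    """Return rows as tuples, in first-seen order, with duplicates dropped."""
    seen = set()
    out = []
    for row in map(tuple, rows):
        if row not in seen:
            seen.add(row)
            out.append(row)
    return out


def compare_rows(rows_a, rows_b):
    """Split two row lists into shared rows and rows found in only one of them."""
    a, b = unique(rows_a), unique(rows_b)
    set_a, set_b = set(a), set(b)
    return {
        "same": [r for r in a if r in set_b],
        "only_a": [r for r in a if r not in set_b],
        "only_b": [r for r in b if r not in set_a],
    }


# Invented sample rows; the "-" inside the first cell is handled like any other text
rows_a = [("id-1", "alice"), ("id-2", "bob"), ("id-3", "carol")]
rows_b = [("id-2", "bob"), ("id-4", "dave"), ("id-1", "alice")]

result = compare_rows(rows_a, rows_b)
for label in ("same", "only_a", "only_b"):
    print(label, result[label])
```

Each of the three lists maps onto one output sheet: the shared rows, and the rows unique to each input file.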

 

Python Web Site Process Bus v1.0 - Translation

 Abstract

This document specifies a proposed standard interface between operating system events and web components (including web servers, web applications, and frameworks), to promote web component interoperability and portability.
 
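Abstract (translation)
This document specifies a proposed standard interface between operating system events and web components (such as web servers, web applications, and frameworks). The interface promotes interoperability and portability among web components.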
 
Rationale and Goals
The Python community has produced many useful web application frameworks, including Django, Pylons, Turbogears, Zope, CherryPy, Paste, and many more [1]. Recently, many of these frameworks have attempted to decentralize their architectures and become more component-based, specifically using the WSGI specification [2] to decouple servers from applications, and even framework components from each other.
 
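Rationale and Goals (translation)
The Python community has produced many useful web application frameworks, such as Django, Pylons, Turbogears, Zope, CherryPy, Paste, and so on. Recently, many of these frameworks have tried to decentralize their architectures and become more component-based, in particular by using the WSGI specification to decouple servers from applications, and even framework components from each other.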
 
In general, however, each of these frameworks' and servers' natural assumption is that it alone must be in control of the OS process, generating and responding to process-wide events: startup, shutdown, and restart. As various people have tried to combine components from multiple frameworks, they generally select one framework or server as the primary process controller, and then write ad-hoc adapters to translate the startup/shutdown style of foreign components into the style of the primary controller. In many cases, this adaptation has simply not been possible when frameworks or servers are too tightly coupled to process-wide events; for example, when two frameworks register signal handlers for the same process.
 
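In general, however, each of these frameworks and servers naturally assumes that it alone must control the OS process, generating and responding to process-wide events: startup, shutdown, and restart. When people have tried to combine components from different frameworks, they have usually picked one framework or server as the primary process controller and then written ad-hoc adapters that translate the startup/shutdown style of the foreign components into the style of the primary controller. In many cases, this adapter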