Package BioSQL :: Module Loader :: Class DatabaseLoader

Class DatabaseLoader

Load a database with Biopython objects.

Instance Methods

__init__(self, adaptor, dbid)
Initialize with connection information for the database.

load_seqrecord(self, record)
Load a Biopython SeqRecord into the database.

_get_ontology_id(self, name, definition=None)
Return the identifier for the named ontology.

_get_term_id(self, name, ontology_id=None, definition=None, identifier=None)
Get the id that corresponds to a term.

_add_dbxref(self, dbname, accession, version)
Insert a dbxref and return its id.

_get_taxon_id(self, record)
Get the taxon id for this record.

_load_bioentry_table(self, record)
Fill the bioentry table with sequence information.

_load_bioentry_date(self, record, bioentry_id)
Add the effective date of the entry into the database.

_load_biosequence(self, record, bioentry_id)
Record a SeqRecord's sequence and alphabet in the database.

_load_comment(self, record, bioentry_id)
Record a SeqRecord's annotated comment in the database.

_load_annotations(self, record, bioentry_id)
Record a SeqRecord's misc annotations in the database.

_load_reference(self, reference, rank, bioentry_id)
Record a SeqRecord's annotated references in the database.

_load_seqfeature(self, feature, feature_rank, bioentry_id)
Load a Biopython SeqFeature into the database.

_load_seqfeature_basic(self, feature_type, feature_rank, bioentry_id)
Load the first tables of a seqfeature and return the id.

_load_seqfeature_locations(self, feature, seqfeature_id)
Load all of the locations for a SeqFeature into tables.

_insert_seqfeature_location(self, feature, rank, seqfeature_id)
Add a location of a SeqFeature to the seqfeature_location table.

_load_seqfeature_qualifiers(self, qualifiers, seqfeature_id)
Insert the (key, value) pair qualifiers relating to a feature.

_load_seqfeature_dbxref(self, dbxrefs, seqfeature_id)
Add the database cross-references of a SeqFeature to the database.

_get_dbxref_id(self, db, accession)
Find and return the dbxref_id (Int) for the given database name and accession.

_get_seqfeature_dbxref(self, seqfeature_id, dbxref_id, rank)
Check for a pre-existing seqfeature_dbxref entry with the passed seqfeature_id and dbxref_id.

_add_seqfeature_dbxref(self, seqfeature_id, dbxref_id, rank)
Insert a seqfeature_dbxref row and return the seqfeature_id and dbxref_id.

_load_dbxrefs(self, record, bioentry_id)
Load any sequence level cross references into the database.

_get_bioentry_dbxref(self, bioentry_id, dbxref_id, rank)
Check for a pre-existing bioentry_dbxref entry with the passed bioentry_id and dbxref_id.

_add_bioentry_dbxref(self, bioentry_id, dbxref_id, rank)
Insert a bioentry_dbxref row and return the bioentry_id and dbxref_id.

Method Details

__init__(self, adaptor, dbid)
(Constructor)

Initialize with connection information for the database.

XXX Figure out what I need to load a database and document it.

_get_ontology_id(self, name, definition=None)

Returns the identifier for the named ontology.

This looks through the ontology table for the given entry name. If it is not found, a row is added for this ontology (using the definition if supplied). In either case, the id corresponding to the provided name is returned, so that you can reference it in another table.
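
The look-up-or-insert behaviour described above can be sketched as follows. This is a minimal illustration using Python's sqlite3 module and a simplified, assumed ontology table, not the loader's actual code; the function name get_or_create_ontology_id and the in-memory table are hypothetical.

```python
import sqlite3


def get_or_create_ontology_id(conn, name, definition=None):
    """Return the id of the ontology row called `name`, inserting it if absent."""
    row = conn.execute(
        "SELECT ontology_id FROM ontology WHERE name = ?", (name,)
    ).fetchone()
    if row is not None:
        # The ontology already exists: reuse its id.
        return row[0]
    # Not found: add a row, storing the definition if one was supplied.
    cursor = conn.execute(
        "INSERT INTO ontology (name, definition) VALUES (?, ?)",
        (name, definition),
    )
    return cursor.lastrowid


# Throwaway in-memory table; an assumed stand-in for the real BioSQL schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ontology ("
    "ontology_id INTEGER PRIMARY KEY, name TEXT UNIQUE, definition TEXT)"
)

first = get_or_create_ontology_id(conn, "Annotation Tags")
second = get_or_create_ontology_id(conn, "Annotation Tags")
```

Calling the function twice with the same name returns the same id and leaves a single row in the table, which is what lets other tables reference the ontology safely.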

_get_term_id(self, name, ontology_id=None, definition=None, identifier=None)

Get the id that corresponds to a term.

This looks through the term table for the given term. If it is not found, a new id corresponding to this term is created. In either case, the id corresponding to that term is returned, so that you can reference it in another table.

The ontology_id should be used to disambiguate the term.

_get_taxon_id(self, record)

Get the taxon id for this record.

record - a SeqRecord object

This searches the taxon/taxon_name tables using the NCBI taxon ID, scientific name and common name to find the matching taxon table entry's id.

If the species isn't in the taxon table, and we have at least the NCBI taxon ID, scientific name or common name, a minimal stub entry is created in the table.

If this information is not in the record's annotation, then None is returned.

See also the BioSQL script load_ncbi_taxonomy.pl which will populate and update the taxon/taxon_name tables with the latest information from the NCBI.
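
The search-then-stub behaviour described above might look roughly like this. It is a sketch under stated assumptions: the two tables are cut down to the columns needed here, the search order (NCBI taxon ID, then scientific name, then common name) is assumed from the description, and this get_taxon_id takes the three identifiers directly instead of reading them from a SeqRecord.

```python
import sqlite3

NAME_CLASSES = ("scientific name", "common name")


def get_taxon_id(conn, ncbi_taxon_id=None, scientific_name=None, common_name=None):
    """Return a taxon_id for the given identifiers, creating a stub row if needed."""
    names = dict(zip(NAME_CLASSES, (scientific_name, common_name)))

    # Nothing to search with: return None, as the description states.
    if ncbi_taxon_id is None and not any(names.values()):
        return None

    # 1. Try the NCBI taxon ID first (assumed to be the least ambiguous key).
    if ncbi_taxon_id is not None:
        row = conn.execute(
            "SELECT taxon_id FROM taxon WHERE ncbi_taxon_id = ?", (ncbi_taxon_id,)
        ).fetchone()
        if row:
            return row[0]

    # 2. Fall back to the scientific name, then the common name.
    for name_class, name in names.items():
        if name:
            row = conn.execute(
                "SELECT taxon_id FROM taxon_name WHERE name_class = ? AND name = ?",
                (name_class, name),
            ).fetchone()
            if row:
                return row[0]

    # 3. No match: insert a minimal stub entry and record whatever names we have.
    taxon_id = conn.execute(
        "INSERT INTO taxon (ncbi_taxon_id) VALUES (?)", (ncbi_taxon_id,)
    ).lastrowid
    for name_class, name in names.items():
        if name:
            conn.execute(
                "INSERT INTO taxon_name (taxon_id, name, name_class) VALUES (?, ?, ?)",
                (taxon_id, name, name_class),
            )
    return taxon_id


# Simplified in-memory tables; assumptions, not the full BioSQL definitions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE taxon (taxon_id INTEGER PRIMARY KEY, ncbi_taxon_id INTEGER)")
conn.execute("CREATE TABLE taxon_name (taxon_id INTEGER, name TEXT, name_class TEXT)")

stub_id = get_taxon_id(conn, ncbi_taxon_id=9606, scientific_name="Homo sapiens")
```

Once the stub exists, a later call with only the NCBI taxon ID or only the scientific name resolves to the same row instead of creating a duplicate.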

_load_bioentry_table(self, record)

Fill the bioentry table with sequence information.

record - SeqRecord object to add to the database.

_load_bioentry_date(self, record, bioentry_id)

Add the effective date of the entry into the database.

record - a SeqRecord object with an annotated date
bioentry_id - corresponding database identifier

_load_biosequence(self, record, bioentry_id)

Record a SeqRecord's sequence and alphabet in the database.

record - a SeqRecord object with a seq property
bioentry_id - corresponding database identifier

_load_comment(self, record, bioentry_id)

Record a SeqRecord's annotated comment in the database.

record - a SeqRecord object with an annotated comment
bioentry_id - corresponding database identifier

_load_annotations(self, record, bioentry_id)

Record a SeqRecord's misc annotations in the database.

The annotation strings are recorded in the bioentry_qualifier_value table, except for special cases like the reference, comment and taxonomy which are handled with their own tables.

record - a SeqRecord object with an annotations dictionary
bioentry_id - corresponding database identifier

_load_reference(self, reference, rank, bioentry_id)

Record a SeqRecord's annotated references in the database.

reference - a Reference object taken from the SeqRecord's annotated references
rank - position of this reference among the record's references
bioentry_id - corresponding database identifier

_load_seqfeature_basic(self, feature_type, feature_rank, bioentry_id)

Load the first tables of a seqfeature and return the id.

This loads the "key" of the seqfeature (e.g. CDS, gene) and the basic seqfeature table itself.

_load_seqfeature_locations(self, feature, seqfeature_id)

Load all of the locations for a SeqFeature into tables.

This adds the locations related to the SeqFeature into the
seqfeature_location table. Fuzzies are not handled right now.
For a simple location, e.g. (1..2), we have a single table row
with seq_start = 1, seq_end = 2, location_rank = 1.

For split locations, e.g. (1..2, 3..4, 5..6), we would have three
table rows with:
    start = 1, end = 2, rank = 1
    start = 3, end = 4, rank = 2
    start = 5, end = 6, rank = 3
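
The mapping from a split location to ranked rows can be sketched in a few lines. This is only an illustration of the numbering scheme above: location_rows is a hypothetical helper, and plain (start, end) tuples stand in for Biopython location objects.

```python
def location_rows(parts):
    """Turn [(start, end), ...] into one row per part, ranked from 1."""
    return [
        {"start": start, "end": end, "rank": rank}
        for rank, (start, end) in enumerate(parts, start=1)
    ]


# A simple location is just a split location with a single part.
simple = location_rows([(1, 2)])
split = location_rows([(1, 2), (3, 4), (5, 6)])
```

Treating the simple case as a one-element list keeps a single code path for both kinds of location.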

_load_seqfeature_qualifiers(self, qualifiers, seqfeature_id)

Insert the (key, value) pair qualifiers relating to a feature.

Qualifiers should be a dictionary of the form:
    {key : [value1, value2]}
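
As an illustration of how such a dictionary could be flattened into one row per value, here is a small sketch. The helper qualifier_rows is hypothetical, and numbering the values of each key from 1 is an assumption made for the example, not a statement about the loader's internals.

```python
def qualifier_rows(qualifiers):
    """Flatten {key: [value1, value2]} into (key, rank, value) tuples."""
    rows = []
    for key, values in qualifiers.items():
        # Rank restarts at 1 for every key so the order of values is preserved.
        for rank, value in enumerate(values, start=1):
            rows.append((key, rank, value))
    return rows


rows = qualifier_rows({"gene": ["abc"], "note": ["first note", "second note"]})
```

Each tuple carries enough information to rebuild the original list of values for its key in the original order.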

_load_seqfeature_dbxref(self, dbxrefs, seqfeature_id)

Add the database cross-references of a SeqFeature to the database.

o dbxrefs           List, dbxref data from the source file in the
                    format <database>:<accession>

o seqfeature_id     Int, the identifier for the seqfeature in the
                    seqfeature table

Insert dbxref qualifier data for a seqfeature into the
seqfeature_dbxref and, if required, dbxref tables.
The dbxref_id qualifier/value sets go into the dbxref table
as dbname, accession, version tuples, with dbxref.dbxref_id
being automatically assigned, and into the seqfeature_dbxref
table as seqfeature_id, dbxref_id, and rank tuples.
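
Splitting a dbxref string of the form <database>:<accession> can be sketched as below. The helper split_dbxref is hypothetical, and splitting on the first colon only is a design choice made for this example; the loader's own parsing may differ.

```python
def split_dbxref(value):
    """Split '<database>:<accession>' on the first colon."""
    db, sep, accession = value.partition(":")
    db, accession = db.strip(), accession.strip()
    if not sep or not db or not accession:
        raise ValueError(f"Expected <database>:<accession>, got {value!r}")
    return db, accession


db, accession = split_dbxref("GeneID:1234")
```

Splitting only on the first colon keeps any further colons as part of the accession, and a value with no colon, or with an empty database name or accession, raises ValueError instead of producing a partial row.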

_get_dbxref_id(self, db, accession)

o db          String, the name of the external database containing
              the accession number

o accession   String, the accession of the dbxref data

Finds and returns the dbxref_id for the passed data.  The method
attempts to find an existing record first, and inserts the data
if there is no record.

Returns: Int

_get_seqfeature_dbxref(self, seqfeature_id, dbxref_id, rank)

Check for a pre-existing seqfeature_dbxref entry with the passed seqfeature_id and dbxref_id. If one does not exist, insert new data.

_load_dbxrefs(self, record, bioentry_id)

Load any sequence level cross references into the database.

See table bioentry_dbxref

_get_bioentry_dbxref(self, bioentry_id, dbxref_id, rank)

Check for a pre-existing bioentry_dbxref entry with the passed bioentry_id and dbxref_id. If one does not exist, insert new data.