clj-biosequence.core

->biosequenceIndex

(->biosequenceIndex index path)
Positional factory function for class clj_biosequence.core.biosequenceIndex.

->biosequenceIndexReader

(->biosequenceIndexReader index strm)
Positional factory function for class clj_biosequence.core.biosequenceIndexReader.

->bzipped

(->bzipped file)
Positional factory function for class clj_biosequence.core.bzipped.

->fastaFile

(->fastaFile file alphabet encoding)
Positional factory function for class clj_biosequence.core.fastaFile.

->fastaReader

(->fastaReader strm alphabet)
Positional factory function for class clj_biosequence.core.fastaReader.

->fastaSequence

(->fastaSequence acc description alphabet sequence)
Positional factory function for class clj_biosequence.core.fastaSequence.

->fastaString

(->fastaString str alphabet)
Positional factory function for class clj_biosequence.core.fastaString.

->gzipped

(->gzipped file)
Positional factory function for class clj_biosequence.core.gzipped.

->uncompressed

(->uncompressed file)
Positional factory function for class clj_biosequence.core.uncompressed.

->zipped

(->zipped file)
Positional factory function for class clj_biosequence.core.zipped.

bioReader

protocol

members

make-reader

(make-reader x opts)

bioreader

(bioreader x & opts)

bioseq-proxy

Biosequence

protocol

members

alphabet

(alphabet this)
Returns the alphabet of a biosequence.

bs-seq

(bs-seq this)
Returns the sequence of a biosequence as a vector.

keywords

(keywords this)
Returns a collection of keywords.

moltype

(moltype this)
Returns the moltype of a biosequence.

protein?

(protein? this)
Returns true if a protein and false otherwise.

biosequence->file

(biosequence->file bs file & {:keys [append func], :or {append true, func fasta-string}})
Takes a collection of biosequences and prints them to file. By
default output is appended to an existing file (`:append true`).
The `:func` argument takes a function used to prepare the printed
output; the default, `fasta-string`, prints the biosequences to the
file in fasta format. Returns the file.

biosequenceCitation

protocol

members

abstract

(abstract this)
Returns the abstract.

authors

(authors this)
Returns the authors from a citation object.

crossrefs

(crossrefs this)
Returns crossrefs (DOI, ISBN, etc.).

journal

(journal this)
Returns the journal of a citation object.

pend

(pend this)
Returns the end page of a citation object.

pstart

(pstart this)
Returns the start page of a citation object.

pubmed

(pubmed this)
Returns the pubmed id of a reference if there is one.

title

(title this)
Returns the title of a citation object.

volume

(volume this)
Returns the volume of a citation object.

year

(year this)
Returns the year of a citation object.

biosequenceCitations

protocol

members

citation-key

(citation-key this)
Returns a citation key from a record.

citations

(citations this)
Returns a collection of references in a sequence record.

biosequenceComments

protocol

members

comments

(comments this)
Returns comments.

filter-comments

(filter-comments this value)
Filters comments based on the return value of obj-type.

biosequenceDbRef

protocol

members

database-name

(database-name this)
Returns the name of the database.

db-properties

(db-properties this)
Returns properties of the reference.

object-id

(object-id this)
Returns the ID of a database object.

biosequenceDbRefs

protocol

members

get-db-refs

(get-db-refs this)
Returns db ref records.

biosequenceDescription

protocol

members

description

(description this)
Returns the description of a biosequence object.

biosequenceEvidence

protocol

members

evidence

(evidence this)
Returns evidence records.

biosequenceFeature

protocol

members

operator

(operator this)
Returns an operator for dealing with intervals.

biosequenceFeatures

protocol

members

feature-seq

(feature-seq this)
Returns a lazy list of features in a sequence.

filter-features

(filter-features this name)
Returns a list of features that return 'name' when called using
bs/bs-name from biosequenceName protocol.

biosequenceFile

protocol

members

bs-path

(bs-path this)
Returns the path of the file as string.

biosequenceGene

protocol

members

locus

(locus this)
Returns the gene locus.

locus-tag

(locus-tag this)
Returns a locus tag.

map-location

(map-location this)
Returns the map location.

orf

(orf this)
ORF associated with the gene.

products

(products this)
The products of a gene.

biosequenceGenes

protocol

members

genes

(genes this)
Returns sub-seq gene records.

biosequenceGoTerm

protocol

members

go-component

(go-component this)
The GO component, molecular function etc.

go-id

(go-id this)
The GO id.

go-term

(go-term this)
The GO term.

biosequenceGoTerms

protocol

members

gos

(gos this)
Returns go term records.

biosequenceID

protocol

members

accession

(accession this)
Returns the accession of a biosequence object.

accessions

(accessions this)
Returns a list of accessions for a biosequence object.

creation-date

(creation-date this)
Returns a java date object.

dataset

(dataset this)
Returns the dataset.

discontinue-date

(discontinue-date this)
Returns a date object.

update-date

(update-date this)
Returns a java date object.

version

(version this)
Returns the version of the accession, or nil if none.

biosequenceInterval

protocol

members

comp?

(comp? this)
Returns a boolean indicating whether the interval is complementary
to the biosequence sequence.

end

(end this)
Returns the end position of an interval as an integer.

point

(point this)
Returns a point interval.

start

(start this)
Returns the start position of an interval as an integer.

biosequenceIntervals

protocol

members

intervals

(intervals this)
Returns a list of intervals.

biosequenceIO

protocol

members

bs-reader

(bs-reader this)
Returns a reader for a file containing biosequences. Use with
`with-open`.

biosequenceName

protocol

members

allergen-names

(allergen-names this)
Returns the allergen names.

alternate-names

(alternate-names this)
Returns the alternate names.

biotech-names

(biotech-names this)
Returns the biotech names.

cd-antigen-names

(cd-antigen-names this)
Returns the CD antigen names.

innnames

(innnames this)
Returns the INN names.

names

(names this)
Returns the default names of a record.

submitted-names

(submitted-names this)
Returns the submitted names.

biosequenceNameObject

protocol

members

obj-description

(obj-description this)

obj-heading

(obj-heading this)

obj-id

(obj-id this)

obj-label

(obj-label this)

obj-name

(obj-name this)

obj-text

(obj-text this)

obj-type

(obj-type this)

obj-value

(obj-value this)

biosequenceNotes

protocol

members

notes

(notes this)
Returns notes.

biosequenceParameters

protocol

members

parameters

(parameters this)
Returns parameters from a reader.

biosequenceProtein

protocol

members

activities

(activities this)
Returns a list of activities.

calc-mol-wt

(calc-mol-wt this)
The calculated molecular weight.

ecs

(ecs this)
Returns a list of EC numbers.

processed

(processed this)
Processing of the protein.

biosequenceProteins

protocol

members

proteins

(proteins this)
Returns protein sub-seq records.

biosequenceReader

protocol

members

biosequence-seq

(biosequence-seq this)
Returns a lazy sequence of biosequence objects.

get-biosequence

(get-biosequence this accession)
Returns the biosequence object with the corresponding
accession.

biosequenceStatus

protocol

members

status

(status this)
Status of a biosequence.

biosequenceSubCellLoc

protocol

members

subcell-location

(subcell-location this)

subcell-orient

(subcell-orient this)

subcell-topol

(subcell-topol this)

biosequenceSubCellLocs

protocol

members

subcell-locations

(subcell-locations this)

biosequenceSummary

protocol

members

summary

(summary this)
Returns the summary of a sequence.

biosequenceSynonyms

protocol

members

synonyms

(synonyms this)
Returns a list of synonyms.

biosequenceTaxonomies

protocol

members

tax-refs

(tax-refs src)
Returns taxonomy records.

biosequenceTaxonomy

protocol

members

common-name

(common-name this)
Returns the common name.

lineage

(lineage this)
Returns a lineage string.

mods

(mods this)
Returns a list of modifications to taxonomy.

tax-name

(tax-name this)
Returns the taxonomic name.

biosequenceTranslation

protocol

members

codon-start

(codon-start this)
The start codon.

frame

(frame this)
Returns the frame a sequence should be translated in.

trans-table

(trans-table this)
Returns the translation table code to be used.

translation

(translation this)
Returns the translation of a sequence.

biosequenceUrl

protocol

members

anchor

(anchor this)
Text to show as the highlight.

post-text

(post-text this)
Text after the anchor.

pre-text

(pre-text this)
Text before anchor.

url

(url this)
Returns the url.

biosequenceVariant

protocol

members

original

(original this)
Returns the original.

variant

(variant this)
Returns the variant.

clean-sequence

(clean-sequence s a)
Removes spaces and newlines and checks that all characters are
legal characters for the supplied alphabet. Replaces invalid
characters with \X. Throws an exception if `a` is not a defined
alphabet.
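
The cleaning step can be pictured with a small sketch. This is not the
library's implementation: `toy-alphabets`, `:toy-dna` and
`clean-sequence-sketch` are hypothetical names, and returning a vector of
characters is an assumption made for illustration.

```clojure
(require '[clojure.string :as string])

;; Hypothetical stand-in for the alphabets defined by the library.
(def toy-alphabets
  {:toy-dna #{\A \C \G \T \N}})

(defn clean-sequence-sketch
  "Strips whitespace from s, then replaces every character that is not
  in alphabet a with \\X. Throws if a is not a known alphabet."
  [s a]
  (let [legal (or (toy-alphabets a)
                  (throw (IllegalArgumentException.
                          (str "Undefined alphabet: " a))))]
    (mapv #(if (legal %) % \X)
          (string/replace s #"\s+" ""))))

(clean-sequence-sketch "AC GT\nZZ" :toy-dna)
;; => [\A \C \G \T \X \X]
```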

default-biosequence-biosequence

Default implementation of Biosequence protocol.

default-biosequence-citation

Default implementation of biosequenceCitation protocol.

default-biosequence-citations

Default implementation of biosequenceCitations protocol.

default-biosequence-comments

Default implementation of biosequenceComments protocol.

default-biosequence-dbref

Default implementation of biosequenceDbRef protocol.

default-biosequence-dbrefs

Default implementation of biosequenceDbRefs protocol.

default-biosequence-description

Default implementation of biosequenceDescription protocol.

default-biosequence-evidence

default-biosequence-feature

Default implementation of biosequenceFeature protocol.

default-biosequence-features

Default implementation of biosequenceFeatures protocol.

default-biosequence-file

Default implementation of biosequenceFile protocol.

default-biosequence-gene

Default implementation of biosequenceGene protocol.

default-biosequence-genes

Default implementation of biosequenceGenes protocol.

default-biosequence-goterm

Default implementation of biosequenceGoTerm protocol.

default-biosequence-goterms

Default implementation of biosequenceGoTerms protocol.

default-biosequence-id

Default implementation of biosequenceID protocol.

default-biosequence-interval

Default implementation of biosequenceInterval protocol.

default-biosequence-intervals

Default implementation of biosequenceIntervals protocol.

default-biosequence-name

Default implementation of biosequenceName protocol.

default-biosequence-nameobject

Default implementation of biosequenceNameObject protocol.

default-biosequence-notes

Default implementation of biosequenceNotes protocol.

default-biosequence-protein

Default implementation of biosequenceProtein protocol.

default-biosequence-proteins

Default implementation of biosequenceProteins protocol.

default-biosequence-status

Default implementation of biosequenceStatus protocol.

default-biosequence-subcell

Default implementation of biosequenceSubCellLoc protocol.

default-biosequence-subcells

Default implementation of biosequenceSubCellLocs protocol.

default-biosequence-summary

Default implementation of biosequenceSummary protocol.

default-biosequence-synonyms

Default implementation of biosequenceSynonyms protocol.

default-biosequence-tax

Default implementation of biosequenceTaxonomy protocol.

default-biosequence-taxonomies

Default implementation of biosequenceTaxonomies protocol.

default-biosequence-translation

Default implementation of biosequenceTranslation protocol.

default-biosequence-url

Default implementation of biosequenceUrl protocol.

default-biosequence-variant

Default implementation of biosequenceVariant protocol.

default-reader

delete-indexed-biosequence

(delete-indexed-biosequence index-file)

fasta-string

(fasta-string bioseq)

fastaReduce

protocol

members

fasta-reduce

(fasta-reduce this func fold)
Applies a function to sequence data streamed line-by-line and
reduces the results using the supplied `fold` function. Uses the
core reducers library so the fold function needs to have an
'identity' value that is returned when the function is called with
no arguments.
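
The requirement on the fold function comes from clojure.core.reducers
itself and can be shown without the library. In the sketch below, `lines`
and `max-len` are hypothetical; only `r/fold` and `r/map` are real calls.

```clojure
(require '[clojure.core.reducers :as r])

;; Hypothetical sequence lines standing in for streamed fasta data.
(def lines ["ACGT" "GGCC" "AT"])

;; `+` qualifies as a fold function because (+) returns 0, its identity.
(r/fold + (r/map count lines))
;; => 10

;; A custom fold function must supply the zero-argument arity itself.
(defn max-len
  ([] 0)                ; identity value, used to seed each partition
  ([a b] (max a b)))    ; combines two partial results

(r/fold max-len (r/map count lines))
;; => 4
```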

get-feature-sequence

(get-feature-sequence feature bs)
Returns a fastaSequence object containing the sequence specified in
a feature object from a biosequence.

get-interval-sequence

(get-interval-sequence interval bs)
Returns a fasta sequence corresponding to the provided interval.

get-list

macro

(get-list obj & keys)
Low level macro for retrieving data from xml elements.

get-one

macro

(get-one obj & keys)
Low level macro for retrieving data from xml elements.

get-req

(get-req a param)

get-text

macro

(get-text obj & keys)
Low level macro for retrieving data from xml elements.

id-convert

(id-convert ids from to email)
Takes a list of accessions and returns a hash-map mapping them to
accessions from another database. Only accessions that had a match
are included, so the result is an empty hash-map if nothing is
found. Uses the Uniprot id mapping utility; a list of supported
databases is available at
http://www.uniprot.org/faq/28#id_mapping_examples. Some common
mappings include:

DB Name                   Abbreviation      Direction
UniProtKB AC/ID           ACC+ID            from
UniProtKB AC              ACC               to
EMBL/GenBank/DDBJ         EMBL_ID           both
EMBL/GenBank/DDBJ CDS     EMBL              both
Entrez Gene (GeneID)      P_ENTREZGENEID    both
GI number                 P_GI              both
RefSeq Protein            P_REFSEQ_AC       both
RefSeq Nucleotide         REFSEQ_NT_ID      both
WormBase                  WORMBASE_ID       both

Uniprot imposes a limit of 100,000 accessions per query.

index-biosequence-file

(index-biosequence-file file & {:keys [func], :or {func accession}})

init-fasta-file

(init-fasta-file path alphabet & opts)
Initialises a fasta file. Accession numbers and descriptions are
processed by splitting the string on the first space, the accession
being the first value and the description the second. Encoding can
be specified using the `:encoding` keyword and defaults to UTF-8.

init-fasta-reader

(init-fasta-reader strm alphabet)

init-fasta-sequence

(init-fasta-sequence accession description alphabet sequence)
Returns a new fastaSequence. Currently :iupacNucleicAcids
and :iupacAminoAcids are supported alphabets.

init-fasta-string

(init-fasta-string str alphabet)
Initialises a fasta string. Accession numbers and description are
processed by splitting the string on the first space, the accession
being the first value and description the second.
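
The header rule described above can be sketched as follows. `split-header`
is a hypothetical helper, and passing the header without its leading `>` is
an assumption; the library's own parsing may differ in detail.

```clojure
(require '[clojure.string :as string])

(defn split-header
  "Splits a fasta header on its first space only: everything before it
  is the accession, everything after it is the description."
  [header]
  (let [[acc desc] (string/split header #" " 2)]
    {:accession acc :description desc}))

(split-header "sp|P12345|TEST_HUMAN Example protein OS=Homo sapiens")
;; => {:accession "sp|P12345|TEST_HUMAN",
;;     :description "Example protein OS=Homo sapiens"}
```

With a limit of 2 the description keeps its internal spaces; a header with
no space yields a nil description in this sketch.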

load-biosequence-index

(load-biosequence-index path)

make-date

(make-date str f)

make-date-format

(make-date-format str)

map->biosequenceIndex

(map->biosequenceIndex m__5869__auto__)
Factory function for class clj_biosequence.core.biosequenceIndex, taking a map of keywords to field values.

map->biosequenceIndexReader

(map->biosequenceIndexReader m__5869__auto__)
Factory function for class clj_biosequence.core.biosequenceIndexReader, taking a map of keywords to field values.

map->bzipped

(map->bzipped m__5869__auto__)
Factory function for class clj_biosequence.core.bzipped, taking a map of keywords to field values.

map->fastaFile

(map->fastaFile m__5869__auto__)
Factory function for class clj_biosequence.core.fastaFile, taking a map of keywords to field values.

map->fastaReader

(map->fastaReader m__5869__auto__)
Factory function for class clj_biosequence.core.fastaReader, taking a map of keywords to field values.

map->fastaSequence

(map->fastaSequence m__5869__auto__)
Factory function for class clj_biosequence.core.fastaSequence, taking a map of keywords to field values.

map->fastaString

(map->fastaString m__5869__auto__)
Factory function for class clj_biosequence.core.fastaString, taking a map of keywords to field values.

map->gzipped

(map->gzipped m__5869__auto__)
Factory function for class clj_biosequence.core.gzipped, taking a map of keywords to field values.

map->uncompressed

(map->uncompressed m__5869__auto__)
Factory function for class clj_biosequence.core.uncompressed, taking a map of keywords to field values.

map->zipped

(map->zipped m__5869__auto__)
Factory function for class clj_biosequence.core.zipped, taking a map of keywords to field values.

n50

(n50 reader)
Takes anything that can have `biosequence-seq` called on it and
returns the N50 of the sequences therein.
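
For readers unfamiliar with the statistic, here is a sketch of the usual N50
definition over a plain collection of sequence lengths. `n50-sketch` is a
hypothetical name, and the library's function takes a reader rather than
lengths, so this only illustrates the arithmetic.

```clojure
(defn n50-sketch
  "Returns the length L such that sequences of length >= L together
  cover at least half of the total length. Returns nil for no input."
  [lengths]
  (let [sorted (sort > lengths)
        half   (/ (reduce + sorted) 2)]
    (loop [[l & more] sorted
           acc 0]
      (when l
        (let [acc (+ acc l)]
          (if (>= acc half)
            l
            (recur more acc)))))))

(n50-sketch [2 3 4 5 6 7 8 9 10])
;; => 8   (total 54; 10 + 9 + 8 = 27 reaches half)
```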

object->file

(object->file obj file)
Spits an object to file after making sure *print-length* is
temporarily set to false.
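
The reason for touching *print-length* is that a non-nil value truncates
printed collections, which would corrupt the saved data. The sketch below
shows the effect with core functions only; `object->file-sketch` is a
hypothetical stand-in, and it binds nil where the description above says
false.

```clojure
;; With a limit in place, printed output is truncated.
(binding [*print-length* 3]
  (pr-str (range 10)))
;; => "(0 1 2 ...)"

(defn object->file-sketch
  "Writes the printed form of obj to file with no print-length limit,
  then returns the file."
  [obj file]
  (binding [*print-length* nil]
    (spit file (pr-str obj)))
  file)
```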

p-key

(p-key _)

parse-date

(parse-date d)

post-req

(post-req a param)

protein-charge

(protein-charge p & {:keys [ph disulfides], :or {ph 7, disulfides 0}})
Calculates the theoretical protein charge at the specified
pH (default 7). Uses pKa values set out in the protein alphabets
from clj-biosequence.alphabet. Considers Lys, His, Arg, Glu, Asp, Tyr
and Cys residues only and ignores all other amino acids. The number
of disulfides can be specified and 2 times this figure will be
deducted from the number of Cys residues used in the calculation.
Values used for the pKa of the N-term and C-term are 9.69 and 2.34
respectively.
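
A calculation of this kind is commonly done with the Henderson-Hasselbalch
equation, sketched below. Whether the library uses exactly this formula is
an assumption. The side-chain pKa values are illustrative textbook figures,
not the ones in the library's alphabets; only the terminal values 9.69 and
2.34 come from the description above. All names are hypothetical.

```clojure
;; Illustrative side-chain pKa values (assumed, not from the library).
(def positive-pka {\K 10.5, \R 12.5, \H 6.0})
(def negative-pka {\D 3.9, \E 4.1, \C 8.3, \Y 10.1})

(defn pos-charge
  "Fractional charge of one basic group at the given pH."
  [pka ph]
  (/ 1.0 (+ 1.0 (Math/pow 10.0 (- ph pka)))))

(defn neg-charge
  "Fractional charge of one acidic group at the given pH."
  [pka ph]
  (/ -1.0 (+ 1.0 (Math/pow 10.0 (- pka ph)))))

(defn protein-charge-sketch
  [s & {:keys [ph disulfides] :or {ph 7 disulfides 0}}]
  (let [;; Each disulfide removes two Cys residues from the calculation.
        counts (update (frequencies s) \C
                       #(max 0 (- (or % 0) (* 2 disulfides))))]
    (+ (pos-charge 9.69 ph)              ; N-terminus
       (neg-charge 2.34 ph)              ; C-terminus
       (reduce + (for [[aa pka] positive-pka]
                   (* (counts aa 0) (pos-charge pka ph))))
       (reduce + (for [[aa pka] negative-pka]
                   (* (counts aa 0) (neg-charge pka ph)))))))

(protein-charge-sketch "MKKRDE" :ph 7)
(protein-charge-sketch "ACCA" :disulfides 1)
```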

return-nil

reverse-comp

(reverse-comp this)
Returns a new fastaSequence with the reverse complement sequence.
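
As a reminder of the operation itself, here is a minimal sketch on plain
strings. `dna-complement` and `reverse-comp-sketch` are hypothetical names;
the sketch covers only A, C, G, T and N and maps anything else to N, which
is a simplification and not necessarily how the library treats IUPAC
ambiguity codes.

```clojure
(def dna-complement {\A \T, \T \A, \C \G, \G \C, \N \N})

(defn reverse-comp-sketch
  "Reverses s and complements each base; unknown characters become N."
  [s]
  (apply str (map #(dna-complement % \N) (reverse s))))

(reverse-comp-sketch "ATGCC")
;; => "GGCAT"
```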

reverse-seq

(reverse-seq this)
Returns a new fastaSequence with the reverse sequence.

set-bioseq-proxy!

(set-bioseq-proxy! params)

six-frame-translation

(six-frame-translation nucleotide)
(six-frame-translation nucleotide table)
Returns a lazy list of fastaSequence objects representing
translations of a nucleotide biosequence object in six frames.
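
The six reading frames are the three codon offsets on the forward strand
plus the three on the reverse complement. The sketch below only splits a
string into codons per frame and does not translate them; all names are
hypothetical, and the order in which the library returns its frames is not
stated here.

```clojure
(def base-complement {\A \T, \T \A, \C \G, \G \C})

(defn codons
  "Splits s into complete codons, starting at the given offset."
  [s offset]
  (map #(apply str %) (partition 3 (drop offset s))))

(defn six-frames
  "Returns six codon lists: offsets 0, 1, 2 on s, then on its
  reverse complement."
  [s]
  (let [rc (apply str (map base-complement (reverse s)))]
    (for [strand [s rc]
          offset [0 1 2]]
      (codons strand offset))))

(six-frames "ATGAAATAG")
;; first frame  => ("ATG" "AAA" "TAG")
;; fourth frame => ("CTA" "TTT" "CAT")
```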

sub-bioseq

(sub-bioseq bs beg)
(sub-bioseq bs beg end)
Returns a new fasta sequence object with the sequence corresponding
to 'beg' (inclusive) and 'end' (exclusive) of 'bs'. If no 'end'
argument is supplied, returns from 'beg' to the end of the sequence.
Zero-based index.
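
The index convention described (zero-based, inclusive start, exclusive end)
matches that of `subvec` in clojure.core, which makes a convenient
illustration. The vector `s` is a hypothetical sequence; whether the library
uses `subvec` internally is not stated here.

```clojure
(def s (vec "ATGCGT"))   ; indices: A=0 T=1 G=2 C=3 G=4 T=5

(subvec s 2)
;; => [\G \C \G \T]   from index 2 to the end

(subvec s 2 4)
;; => [\G \C]         index 2 included, index 4 excluded
```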

translate

(translate bs frame & {:keys [table id-alter], :or {table (ala/codon-tables 1), id-alter true}})
Returns a fastaSequence representing the translation of the
specified biosequence in the specified frame.